What gets monitored

Four signal types come out of the SDK. Each is independent — if one fails (e.g. queue depth on an unsupported broker), the others keep working.

Task lifecycle

The SDK hooks four Celery signals and emits one event per signal: task-started, task-succeeded, task-failed, and task-retried.

Every event carries the task ID, task name, worker hostname, retry count, args/kwargs, and a timestamp. task-started additionally carries the queue. task-succeeded carries a runtime in seconds. task-failed carries the exception repr() and the traceback string; task-retried carries the same shape but the exception is rendered with str() on Celery's retry reason.
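As a sketch of that field list, a task-started event could be assembled like this. The function and field names here are illustrative, not CeleryRadar's actual wire format:

```python
import time
import uuid

def make_task_started_event(task_name, queue, hostname, retries, args, kwargs):
    """Assemble the common fields every event carries, plus the
    queue, which only task-started carries (illustrative shape)."""
    return {
        "type": "task-started",
        "task_id": str(uuid.uuid4()),
        "task_name": task_name,
        "hostname": hostname,
        "retries": retries,          # 0 on the first attempt
        "args": args,
        "kwargs": kwargs,
        "timestamp": time.time(),
        "queue": queue,              # task-started only
    }

event = make_task_started_event("app.tasks.send_email", "default",
                                "worker-1", 0, (42,), {"to": "x"})
```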

Celery reuses the same task_id across retry attempts. The SDK leans into that: every event for an attempt is tagged with a retries counter (0 on first attempt, 1 on second, etc.), which is how the task chain view groups events under "Attempt 1 / Attempt 2 / …" headings.
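The grouping that the retries counter enables can be sketched in a few lines (a minimal illustration, not the dashboard's actual code):

```python
from collections import defaultdict

def group_by_attempt(events):
    """Group one task_id's events under 'Attempt N' headings using
    the 0-based retries counter, as the task chain view does."""
    attempts = defaultdict(list)
    for ev in events:
        attempts[ev["retries"]].append(ev)
    # retries=0 -> "Attempt 1", retries=1 -> "Attempt 2", ...
    return {f"Attempt {r + 1}": evs for r, evs in sorted(attempts.items())}

events = [
    {"type": "task-started", "retries": 0},
    {"type": "task-failed", "retries": 0},
    {"type": "task-retried", "retries": 0},
    {"type": "task-started", "retries": 1},
    {"type": "task-succeeded", "retries": 1},
]
grouped = group_by_attempt(events)
```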

What you see in the dashboard

Notes

Worker heartbeats

Celery's worker process emits a heartbeat_sent signal periodically. The SDK listens for it and forwards a worker-heartbeat event upstream, throttled to one every 30 seconds per worker process. The payload is the worker hostname and the list of queues the worker is consuming.
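A minimal sketch of that 30-second throttle (the function signature and event shape are illustrative, not the SDK's internals):

```python
import time

_last_sent = float("-inf")
THROTTLE_SECONDS = 30

def maybe_send_heartbeat(send, hostname, queues, now=None):
    """Forward a worker-heartbeat upstream at most once every 30 s
    per worker process. `send` is whatever ships the event."""
    global _last_sent
    now = time.monotonic() if now is None else now
    if now - _last_sent < THROTTLE_SECONDS:
        return False  # within the throttle window: drop this one
    _last_sent = now
    send({"type": "worker-heartbeat", "hostname": hostname, "queues": queues})
    return True
```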

On the backend, heartbeat writes are an upsert keyed on (api_key, hostname) with GREATEST(existing, incoming) semantics on last_seen — so an out-of-order heartbeat (e.g. one that landed late from the SDK retry queue after a CR-side outage) can never push last_seen backward and fire a phantom worker_offline alert.
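The GREATEST(existing, incoming) semantics boil down to a one-line merge rule. An in-memory sketch (the real backend does this in SQL):

```python
last_seen = {}  # keyed on (api_key, hostname), as the upsert is

def record_heartbeat(api_key, hostname, seen_at):
    """Upsert a heartbeat. max() mirrors GREATEST(existing, incoming):
    a late, out-of-order heartbeat can never move last_seen backward,
    so it can't fire a phantom worker_offline alert."""
    key = (api_key, hostname)
    last_seen[key] = max(last_seen.get(key, seen_at), seen_at)
    return last_seen[key]
```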

Worker name resolution

The hostname sent on heartbeats — and on every task event — is resolved fresh on each call:

  1. CELERYRADAR_WORKER_NAME environment variable, if set and non-empty.
  2. The worker_name= kwarg passed to connect().
  3. socket.gethostname(), as the final fallback.
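The resolution order above is a few lines of Python. A sketch (the function name is ours, not the SDK's):

```python
import os
import socket

def resolve_worker_name(worker_name=None):
    """Resolve the worker hostname in the documented order:
    env var if set and non-empty, then the connect() kwarg,
    then the OS hostname."""
    env = os.environ.get("CELERYRADAR_WORKER_NAME")
    if env:
        return env
    if worker_name:
        return worker_name
    return socket.gethostname()
```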

In Kubernetes, ECS, or anywhere else the hostname rotates on every restart, set CELERYRADAR_WORKER_NAME in your manifest to a stable per-deployment value. Otherwise every restart adds a new "worker" row to your dashboard and the previous one drifts into the offline state.

Beat schedules

If you run Celery beat — either a dedicated beat process or beat embedded in a worker — the SDK monitors your scheduled tasks automatically. No extra configuration.

How it works

The SDK hooks Celery's two beat signals: beat_init, fired when the beat service starts, and beat_embedded_init, fired additionally when beat runs embedded in a worker.

To pick up admin-side changes (a user adding a new entry in django-celery-beat, or changing a crontab in RedBeat) without a beat restart, the SDK wraps the scheduler's tick() method and re-syncs the schedule list every 30 seconds. So adding or deleting a beat entry while the beat process is running propagates within half a minute.
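The tick() wrapping can be sketched as below. This is a minimal illustration, assuming a `resync` callable that re-reads and ships the schedule entries; the interval and wrapping idea are from the docs above, everything else is ours:

```python
import time

RESYNC_INTERVAL = 30  # seconds

def wrap_tick(scheduler, resync, now=time.monotonic):
    """Wrap scheduler.tick() so the schedule list is re-synced at
    most every 30 s, picking up admin-side changes (new or edited
    beat entries) without restarting the beat process."""
    original_tick = scheduler.tick
    last_sync = [float("-inf")]

    def tick(*args, **kwargs):
        if now() - last_sync[0] >= RESYNC_INTERVAL:
            last_sync[0] = now()
            resync(scheduler)
        return original_tick(*args, **kwargs)

    scheduler.tick = tick
    return scheduler
```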

Supported schedulers

Schedule types

Solar and clocked schedules don't fit the "expected next fire" abstraction the dashboard uses to detect missed runs. They'll be supported when the model adapts — for now, beat runs of those entries land in the task log but don't get a dedicated schedule row.

What you see in the dashboard

Queue depth

Queue depth monitoring is the only piece of the SDK that talks to your broker directly. Every 30 seconds it samples the depth of every declared queue with a single Redis pipeline and emits one queue-depth event per poll, batching all queues into a samples array.
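Shaping one poll into a single event with a samples array might look like this (field names are illustrative; the commented redis-py calls show the single-pipeline sampling, one LLEN per declared queue in one round trip):

```python
import time

# With redis-py, one round trip samples every queue:
#   pipe = client.pipeline()
#   for q in queues:
#       pipe.llen(q)          # list-mode broker queues are Redis lists
#   depths = dict(zip(queues, pipe.execute()))

def make_queue_depth_event(depths):
    """Batch one poll's {queue: length} readings into a single
    queue-depth event with a samples array (illustrative shape)."""
    return {
        "type": "queue-depth",
        "timestamp": time.time(),
        "samples": [{"queue": q, "depth": d}
                    for q, d in sorted(depths.items())],
    }
```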

Leader election

If you run multiple worker processes — and you almost certainly do — every one of them imports the SDK and spawns a queue depth poller. Without coordination, each would sample independently and you'd see N copies of every depth sample.

The SDK avoids this with a Redis-backed leader lock at the key celeryradar::queue-poll-lock. Pollers contend for the lock; the winner samples and ships, the losers sleep. The lock has a 60-second TTL and is refreshed every poll interval; if the leader crashes, the next contender takes over within a minute.
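The contend/refresh cycle maps onto Redis's SET NX EX. A sketch under stated assumptions — the key and TTL come from the docs above, the token scheme and the FakeRedis stand-in (mimicking the redis-py calls used) are ours:

```python
import uuid

LOCK_KEY = "celeryradar::queue-poll-lock"
LOCK_TTL = 60  # seconds

def try_lead(client, token=None):
    """Contend for the poll lock. SET NX only succeeds when no one
    holds the key; the current leader instead refreshes the TTL."""
    token = token or str(uuid.uuid4())
    if client.set(LOCK_KEY, token, nx=True, ex=LOCK_TTL):
        return token                       # we lead: sample and ship
    if client.get(LOCK_KEY) == token:
        client.expire(LOCK_KEY, LOCK_TTL)  # still leader: refresh TTL
        return token
    return None                            # a peer leads: sleep this interval

class FakeRedis:
    """Tiny in-memory stand-in for the redis-py calls above (no TTL expiry)."""
    def __init__(self):
        self.store = {}
    def set(self, key, value, nx=False, ex=None):
        if nx and key in self.store:
            return None
        self.store[key] = value
        return True
    def get(self, key):
        return self.store.get(key)
    def expire(self, key, ttl):
        return key in self.store
```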

This means queue depth monitoring only works when at least one process can reach Redis with the broker's credentials — which it always can, because that's how Celery itself talks to the broker.

Broker support

Today: standard Redis list-mode brokers (redis:// or rediss:// URLs). Auto-detected from app.conf.broker_url; pass broker_url= to connect() if you need to override.
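The auto-detection reduces to a scheme check on the broker URL. A minimal sketch (the helper name is ours):

```python
from urllib.parse import urlparse

SUPPORTED_SCHEMES = {"redis", "rediss"}

def queue_depth_supported(broker_url):
    """True only for standard Redis list-mode brokers, i.e.
    redis:// or rediss:// URLs; other brokers fall through to
    tasks/workers/beat monitoring without queue depth."""
    return urlparse(broker_url).scheme in SUPPORTED_SCHEMES
```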

Not yet supported for queue depth (but tasks/workers/beat all still work):

If your broker isn't supported, the queue depth charts will silently stay empty. You'll still see queue names on workers, in task events, and as alert rule targets.

What you see in the dashboard