Weblate Celery Queue Stuck: Translations Not Committing
When Weblate stops committing translations, fires the “Celery queue is too long” admin alert, and the UI feels frozen, the cause is almost always a dead celery-beat or a backlogged worker queue (celery, notify, translate, memory, or backup), not your data. This page shows how to diagnose a stuck queue with celery status and Redis queue length, identify which queue is wedged, and fix the worker layout in docker-compose.override.yml so commits flow again.
Weblate offloads almost everything slow to Celery: committing pending changes to Git, pushing to the remote, running automatic translation, updating translation memory, and nightly backups. If a worker dies or a queue saturates, those jobs pile up in Redis and nothing visible happens: the editor still saves your string to the database, but your Weblate Self-Hosted Setup never commits it to the Git repository. The fix is mechanical once you know which queue is stuck.
Root cause: which component actually stalled
“Stuck queue” is a symptom with four distinct root causes. Weblate runs separate Celery worker processes, each subscribed to a subset of named queues, plus one celery-beat scheduler. The official Docker image runs them as one container (CELERY_* env-driven), but a custom docker-compose often splits them — and that is where a queue silently loses its only worker.
celery-beat is dead. Beat is the cron of Celery. The periodic task weblate.trans.tasks.commit_pending is what flushes translations that have been sitting past their commit delay; daily and cleanup jobs also live here. If beat crashes or was never started, the commit queue is never fed — workers are idle, Redis is near-empty, yet translations never commit. This is the most confusing variant because nothing looks backed up.
A worker for a specific queue is gone. If you scaled the celery service to subscribe to only celery,notify and forgot translate, every automatic-translation job enqueues into translate and stays there forever. LLEN translate climbs without bound. The “Celery queue is too long” alert in Manage → Performance report fires when a queue stays above its threshold; the thresholds differ per queue and between Weblate releases, so check the value for your version rather than relying on a fixed number.
A single task is poisoned and blocking the queue. A repository with a merge conflict, a broken Git remote credential, or a huge memory-rebuild task can occupy the worker indefinitely (or crash-loop on retry), so the queue behind it never drains.
Redis ran out of memory or lost the broker. If Redis hit maxmemory with an eviction policy that drops keys, queued tasks vanish or the broker refuses writes; Weblate then logs broker connection errors and no task is durable.
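To rule the Redis cause in or out, it can help to read the broker's eviction policy directly. The following is a minimal sketch, not part of Weblate: policy_verdict is a hypothetical helper, and the usage comment assumes the Redis service is named cache.

```shell
#!/usr/bin/env bash
# Classify a Redis maxmemory-policy value for use as a Celery broker.
# policy_verdict is a hypothetical helper name, not a Weblate or Redis command.
policy_verdict() {
  case "$1" in
    noeviction) echo "safe" ;;      # writes are rejected, nothing is dropped
    allkeys-*)  echo "unsafe" ;;    # any key, including queue lists, may be evicted
    volatile-*) echo "review" ;;    # only keys with a TTL are eviction candidates
    *)          echo "unknown" ;;
  esac
}

# Usage against a running broker (assumes the service is called "cache"):
#   policy=$(docker compose exec -T cache redis-cli CONFIG GET maxmemory-policy | tail -n 1 | tr -d '\r')
#   policy_verdict "$policy"
```

Anything other than "safe" is worth a closer look before trusting the queue lengths reported by the broker.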
Minimal reproducible example
The smallest way to reproduce the most common case — a queue with no worker — is to start Weblate with the translate queue unsubscribed, then trigger an automatic-translation job:
# docker-compose.override.yml (BROKEN): worker ignores the translate, memory, and backup queues
services:
  celery:
    environment:
      # Only these two queues are drained; translate/memory/backup are orphaned.
      CELERY_MAIN_OPTIONS: "--queues=celery,notify --concurrency=4"
Now run an automatic translation from Tools → Automatic translation on any component. The job is accepted, the UI shows “queued”, and it never finishes. Inspect Redis and you can watch the orphaned queue grow while the worker sits idle:
# Length of each Weblate queue in the broker. translate climbs; celery stays at 0.
docker compose exec cache redis-cli LLEN translate
docker compose exec cache redis-cli LLEN celery
docker compose exec cache redis-cli LLEN memory
docker compose exec cache redis-cli LLEN backup
A non-zero, monotonically rising LLEN translate with idle workers is the signature of an unsubscribed queue.
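One way to turn "monotonically rising" into a check is to compare a few LLEN samples taken some seconds apart. The sketch below is illustrative only: is_rising is a hypothetical helper, and the sampling interval in the usage comment is an arbitrary choice.

```shell
#!/usr/bin/env bash
# Succeed only if every sample is strictly greater than the one before it.
# Arguments: two or more integer LLEN samples, oldest first.
is_rising() {
  [ "$#" -ge 2 ] || return 1
  local prev=$1 cur
  shift
  for cur in "$@"; do
    [ "$cur" -gt "$prev" ] || return 1
    prev=$cur
  done
  return 0
}

# Usage: sample the translate queue three times, 10 seconds apart.
#   s1=$(docker compose exec -T cache redis-cli LLEN translate); sleep 10
#   s2=$(docker compose exec -T cache redis-cli LLEN translate); sleep 10
#   s3=$(docker compose exec -T cache redis-cli LLEN translate)
#   is_rising "$s1" "$s2" "$s3" && echo "translate is growing with no consumer"
```

A queue that is long but shrinking between samples is merely busy, which is a different situation from one that has lost its worker.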
Diagnosis: pinpoint the stuck queue
Run these in order. They take you from “is anything alive” to “exactly which queue”.
# 1. Are workers even responding? Empty/timeout = no live worker at all.
docker compose exec --user weblate weblate celery -A weblate.utils status
# 2. What is each live worker actually subscribed to? Look for missing queues.
docker compose exec --user weblate weblate celery -A weblate.utils inspect active_queues
# 3. Is beat running? No recent "Scheduler: Sending due task" = beat is dead.
docker compose logs --tail=50 weblate | grep -i beat
# 4. Backlog depth per queue, straight from the broker.
for q in celery translate memory backup notify; do
  printf '%s = ' "$q"; docker compose exec -T cache redis-cli LLEN "$q"
done
Map the output to a cause: empty celery status means no worker (fix the service/restart); a worker that is up but whose active_queues omits translate means an unsubscribed queue; a healthy worker subscribed to everything with a high LLEN and no progress means a poisoned task (check inspect active for a long-running task and the component’s repository status). No beat log lines with low LLEN everywhere but uncommitted strings means dead beat.
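The mapping above can be sketched as a small decision function. This is an illustration under stated assumptions: diagnose is a hypothetical helper, its inputs are observations you collect by hand from the four commands, and the backlog limit of 50 is an arbitrary example value rather than a Weblate default. A high backlog is treated as "no progress" here, which you should confirm by re-sampling.

```shell
#!/usr/bin/env bash
# Map four observations to one of the causes described in this section.
#   $1 workers_alive  1 if `celery status` listed a worker, else 0
#   $2 subscribed     1 if active_queues shows a consumer for the queue, else 0
#   $3 backlog        LLEN of the queue
#   $4 beat_alive     1 if recent "Sending due task" lines exist, else 0
#   $5 limit          optional backlog limit (illustrative default: 50)
diagnose() {
  local workers_alive=$1 subscribed=$2 backlog=$3 beat_alive=$4 limit=${5:-50}
  if [ "$workers_alive" -eq 0 ]; then
    echo "no-worker"
  elif [ "$subscribed" -eq 0 ]; then
    echo "unsubscribed-queue"
  elif [ "$backlog" -gt "$limit" ]; then
    echo "poisoned-task"
  elif [ "$beat_alive" -eq 0 ]; then
    echo "dead-beat"
  else
    echo "healthy"
  fi
}

# Example: worker is up and subscribed, queues are empty, beat is silent.
#   diagnose 1 1 0 0
```

The checks run in the same order as the diagnostic commands, so the first failing observation decides the verdict.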
Fix: correct worker layout in docker-compose
The durable fix is to make one worker subscribe to every Weblate queue and to guarantee beat is running. With the official image, do not split queues by hand unless you also account for all of them:
# docker-compose.override.yml (CORRECT)
services:
  weblate:
    environment:
      # The official image runs the workers and beat together in this container,
      # so commit_pending keeps firing as long as the container is up.
      WEBLATE_WORKERS: "4"  # base number of worker processes
      CELERY_MAIN_OPTIONS: "--concurrency=4"
      # Dedicate concurrency to slow queues so one big job can't starve commits.
      CELERY_NOTIFY_OPTIONS: "--concurrency=2"
      CELERY_TRANSLATE_OPTIONS: "--concurrency=2"
      CELERY_MEMORY_OPTIONS: "--concurrency=1"
      CELERY_BACKUP_OPTIONS: "--concurrency=1"
      CELERY_BEAT_OPTIONS: ""  # extra beat options; none are needed here
  cache:
    image: redis:7-alpine
    # Never let Redis evict queued tasks: persist and do not drop keys.
    # noeviction makes the broker fail loudly instead of dropping tasks silently.
    command: >-
      redis-server --save 60 1
      --maxmemory 256mb
      --maxmemory-policy noeviction
    restart: unless-stopped
The load-bearing decisions:
- --maxmemory-policy noeviction: the single most important Redis line. With allkeys-lru (a common copy-paste default), Redis will silently delete queued tasks under pressure and your commits disappear with no error. noeviction makes the broker reject writes loudly so you find out immediately.
- Per-queue CELERY_*_OPTIONS: the official image starts one worker per named queue from these vars. Leaving any of them undefined is fine (the image has defaults), but overriding CELERY_MAIN_OPTIONS to a hand-rolled --queues= list that omits a queue is exactly the bug from the reproduction above. Tune concurrency; do not narrow the queue set.
- Co-locating beat: keeping beat inside the main container (the image’s default) means there is no separate service to forget to start. If you must run beat standalone, it must be exactly one replica, because two beats double-schedule every commit.
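If you do keep a hand-written --queues= list, a small lint can catch an omitted queue before deployment. The sketch below is hypothetical: missing_queues is not a Weblate tool, the required set is the five queue names used on this page, and an option string without --queues= is assumed to fall back to the image's defaults.

```shell
#!/usr/bin/env bash
# Print the queues that a worker option string fails to cover.
# missing_queues is a hypothetical helper; the required set mirrors this page.
missing_queues() {
  local opts=$1 required="celery notify translate memory backup"
  local list q missing=""
  case "$opts" in
    *--queues=*) ;;
    *) return 0 ;;  # no explicit list: assume the image defaults apply
  esac
  list=${opts#*--queues=}  # drop everything up to and including --queues=
  list=${list%% *}         # keep only the comma-separated value
  for q in $required; do
    case ",$list," in
      *",$q,"*) ;;
      *) missing="$missing $q" ;;
    esac
  done
  echo "${missing# }"
}

# Example with the broken value from the reproduction:
#   missing_queues "--queues=celery,notify --concurrency=4"
```

An empty result means the option string either names every queue or does not restrict the queue set at all.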
After editing, recreate and force a flush of anything already pending:
docker compose up -d --force-recreate weblate cache
# Drain the backlog now instead of waiting for the next beat tick:
docker compose exec --user weblate weblate weblate commit_pending --all
Verification
Prove the queues drain and commits flow. The check is: every LLEN returns to ~0, beat logs a due task, and a freshly edited string lands in Git.
# 1. All queues should settle to 0 within a minute of the worker coming up.
for q in celery translate memory backup notify; do
  printf '%s = ' "$q"; docker compose exec -T cache redis-cli LLEN "$q"
done
# 2. Beat is alive and scheduling commit_pending.
docker compose logs --since=2m weblate | grep -i 'Scheduler: Sending due task'
# 3. End-to-end: edit one string in the UI, wait for the commit delay, then:
docker compose exec --user weblate weblate git -C /app/data/vcs/<project>/<component> log -1 --oneline
# Expect a fresh "Translated using Weblate" commit timestamped after your edit.
A green run means every queue has a live consumer, beat is feeding the commit schedule, and the Git working tree reflects the editor again. That is the contract the rest of your Translation Workflows & CI/CD Pipeline Sync depends on. If you also drive jobs from a webhook, confirm the Weblate webhooks for auto-translation enqueue into the now-drained translate queue.
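To make the first check scriptable, the "name = length" lines printed by the queue loop can be piped through a small filter. This is a sketch only: undrained_queues is a hypothetical helper, and the tolerance argument is an assumption for setups where a few in-flight tasks are normal.

```shell
#!/usr/bin/env bash
# Read "name = length" lines on stdin and print queues above a tolerance.
# Returns 0 when every queue is at or below the tolerance, 1 otherwise.
undrained_queues() {
  local tolerance=${1:-0} name sep len status=0
  while read -r name sep len; do
    [ -n "$len" ] || continue  # skip blank or malformed lines
    if [ "$len" -gt "$tolerance" ]; then
      echo "$name"
      status=1
    fi
  done
  return "$status"
}

# Usage with the verification loop from this section:
#   for q in celery translate memory backup notify; do
#     printf '%s = ' "$q"; docker compose exec -T cache redis-cli LLEN "$q"
#   done | undrained_queues || echo "some queues have not drained yet"
```

The non-zero exit status makes it usable as a gate in a deployment or health-check script.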
When to escalate
This fix assumes the queues are healthy once workers and beat are correct. It is insufficient when the blocker is inside a task rather than the queue plumbing. If commit_pending runs but the component’s repository shows a merge conflict or a rejected push, no amount of worker tuning helps — the worker dequeues the task, fails to push, and the change stays “pending” in the editor. Resolve the repository state from Manage → Repository maintenance (or weblate CLI loadpo/commit_pending per component), then re-check the queue. Likewise, a memory queue that never empties usually means a translation-memory rebuild on a very large corpus; that is expected to be slow, so raise the alert threshold rather than treating it as stuck.
If beat genuinely cannot stay up, or celery status is empty even after --force-recreate, the problem has moved below Weblate into Docker/Redis networking, and you should debug it as a broker-connectivity issue against the parent Weblate Self-Hosted Setup rather than as a queue backlog.
FAQ
Why does Weblate save my translation but never commit it to Git?
Saving writes the string to the database immediately; committing to Git is a deferred Celery task scheduled by celery-beat via commit_pending after the component’s commit delay. If beat is dead, that scheduled task never runs, so the string lives in the database but never reaches the repository. Check docker compose logs weblate | grep -i beat for recent “Sending due task” lines; if there are none, beat is not running. Restart the container so beat is co-located, then run weblate commit_pending --all to flush immediately.
What does the “Celery queue is too long” admin alert actually mean?
It means at least one named queue in Redis has held more pending tasks than its threshold, as reported on the Performance report page. The thresholds differ per queue and between Weblate releases, so check the value for your version. That happens when a queue’s only worker died or was never subscribed to it, so tasks enqueue but nothing consumes them. Run celery -A weblate.utils inspect active_queues to see what each live worker is subscribed to, compare against celery, translate, memory, backup, notify, and restore the missing subscription.
How do I check the Celery queue length in Weblate’s Redis?
Use redis-cli LLEN <queue> against the cache container — for example docker compose exec cache redis-cli LLEN translate. Weblate uses Redis lists keyed by queue name, so the list length is the number of pending tasks. Check celery, translate, memory, backup, and notify; a non-zero value that keeps rising while workers are idle pinpoints the queue that has lost its consumer.
Related
- Weblate Self-Hosted Setup: the parent setup whose worker and Redis services this page repairs.
- Configuring Weblate webhooks for auto-translation: webhooks enqueue into the translate queue this page keeps drained.
- Extracting translation keys with i18next-parser: the upstream catalog that feeds the commits Celery is meant to push.
- GitHub Actions i18n CI gates: what consumes the Git commits once Celery starts pushing them again.
- Machine-translation pre-fill workflows: jobs that load the translate queue and can back it up.