Extracting Translation Keys with i18next-parser
i18next-parser walks your source AST and writes a translation catalog, but its defaults silently drop plural and context variants, mangle keys when keySeparator disagrees with runtime, and delete strings a translator is still working on. This page shows a deterministic i18next-parser.config.js that produces stable keys, preserves removed strings, and fails CI when a translation is missing — the exact handoff your Weblate Self-Hosted Setup consumes.
The parser is the first stage of any code-driven Translation Workflows & CI/CD Pipeline Sync: it converts t('key') calls into JSON resource files. Get the separators and merge flags wrong and every downstream stage — Weblate import, machine pre-fill, the CI gate — inherits corrupt keys. The fix is one well-understood config file plus a lint command.
Root cause: where extraction goes wrong
Most extraction bugs trace to four mechanical decisions the parser makes at scan time, not to bad source code.
Separators are structural, not cosmetic. keySeparator (default .) and namespaceSeparator (default :) tell the parser how to split a string like nav:menu.home into a namespace (nav) and a nested path (menu → home). If your code uses flat keys like nav.menu.home as literal strings, the parser still splits on . and writes a deeply nested object, which react-i18next (or any other i18next binding) then fails to resolve because the runtime was initialized with a different separator. The parser config and the i18next init config must declare the same separators; note that i18next init names the namespace option nsSeparator, while the parser calls it namespaceSeparator.
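To make the splitting rule concrete, here is a minimal sketch of how a separator-aware reader interprets a key. The splitKey helper is hypothetical and written only for illustration; it is not an i18next-parser export and does not reproduce the parser's internals.

```typescript
// Illustrative only: `splitKey` is a hypothetical helper, not a parser API.
type Separator = string | false;

interface SplitKey {
  namespace: string;
  path: string[];
}

function splitKey(
  raw: string,
  keySeparator: Separator = '.',
  namespaceSeparator: Separator = ':',
  defaultNamespace = 'translation',
): SplitKey {
  let namespace = defaultNamespace;
  let rest = raw;

  // Peel off the namespace only when a namespace separator is enabled.
  if (namespaceSeparator !== false) {
    const at = raw.indexOf(namespaceSeparator);
    if (at > -1) {
      namespace = raw.slice(0, at);
      rest = raw.slice(at + namespaceSeparator.length);
    }
  }

  // With keySeparator disabled the remainder stays one flat key.
  const path = keySeparator === false ? [rest] : rest.split(keySeparator);
  return { namespace, path };
}

// Default separators: namespace "nav", nested path ["menu", "home"].
console.log(splitKey('nav:menu.home'));

// Both separators disabled: one flat key in the default namespace.
console.log(splitKey('nav.menu.home', false, false));
```

The two calls read the same characters in two incompatible ways, which is the mismatch that appears when the parser and the runtime disagree.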
Plural and context keys are generated, not written. When you call t('item', { count }), the parser must emit item_one and item_other (English) plus every other CLDR category the locale needs. When you call t('greeting', { context: 'formal' }), it emits greeting_formal. Both are controlled by pluralSeparator and contextSeparator; if those don’t match runtime, the generated key never matches the lookup.
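The generated suffixes can be sketched with the standard Intl.PluralRules API, which reports the CLDR cardinal categories for a locale. The helpers pluralKeys and contextKeys are hypothetical names used for illustration; the sketch assumes a runtime with full ICU data and is not how the parser itself computes its output.

```typescript
// Illustrative only: `pluralKeys` and `contextKeys` are hypothetical helpers.

// Build one suffixed key per CLDR plural category of the locale.
function pluralKeys(base: string, locale: string, pluralSeparator = '_'): string[] {
  const categories = new Intl.PluralRules(locale).resolvedOptions().pluralCategories;
  return categories.map((category) => `${base}${pluralSeparator}${category}`).sort();
}

// Build one suffixed key per known context value.
function contextKeys(base: string, contexts: string[], contextSeparator = '_'): string[] {
  return contexts.map((context) => `${base}${contextSeparator}${context}`);
}

console.log(pluralKeys('unread', 'en'));
console.log(pluralKeys('unread', 'ar'));
console.log(contextKeys('greeting', ['formal', 'casual']));
```

Changing either separator argument changes every generated key, which is why the runtime has to use the same values to look them up.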
Merge defaults can delete in-flight work. By default the parser keeps existing translated values and only fills new keys with defaultValue. But if keepRemoved is left off, any key whose t() call you deleted (or that the parser failed to see because of a dynamic expression) vanishes from the catalog — taking the translation with it.
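The merge behaviour described above can be modelled in a few lines. This is a simplified sketch over flat catalogs, with a hypothetical mergeCatalog function; it illustrates the semantics of keepRemoved and defaultValue rather than the parser's actual implementation.

```typescript
// Illustrative model of the merge step; `mergeCatalog` is hypothetical.
type Catalog = Record<string, string>;

interface MergeOptions {
  keepRemoved: boolean;
  defaultValue: string;
}

function mergeCatalog(existing: Catalog, extractedKeys: string[], options: MergeOptions): Catalog {
  const merged: Catalog = {};

  // Keys seen in source keep their translation, or get the placeholder.
  for (const key of extractedKeys) {
    merged[key] = existing[key] ?? options.defaultValue;
  }

  // Keys no longer seen in source survive only when keepRemoved is on.
  if (options.keepRemoved) {
    for (const [key, value] of Object.entries(existing)) {
      if (!(key in merged)) merged[key] = value;
    }
  }
  return merged;
}

const existingDe: Catalog = { title: 'Posteingang', 'tab.sent': 'Gesendet' };
const seenInSource = ['title', 'unread_one'];

// 'tab.sent' is dropped together with its translation.
console.log(mergeCatalog(existingDe, seenInSource, { keepRemoved: false, defaultValue: '__MISSING__' }));

// 'tab.sent' is preserved even though no t() call was found for it.
console.log(mergeCatalog(existingDe, seenInSource, { keepRemoved: true, defaultValue: '__MISSING__' }));
```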
Dynamic keys are invisible to the AST. t(`page.${id}`) cannot be statically resolved, so it is skipped. The string looks translated in code but never reaches the catalog.
Minimal reproducible example
A single component shows all four failure modes at once:
// Inbox.tsx
import { useTranslation } from 'react-i18next';

// Props was not defined in the original snippet; this shape is assumed.
interface Props {
  count: number;
  formal: boolean;
  section: string;
}

export function Inbox({ count, formal, section }: Props) {
  const { t } = useTranslation('inbox');
  return (
    <>
      <h1>{t('title')}</h1>
      <p>{t('unread', { count })}</p> {/* plural */}
      <button>{t('greeting', { context: formal ? 'formal' : 'casual' })}</button>
      <span>{t(`tab.${section}`)}</span> {/* dynamic, skipped by the parser */}
    </>
  );
}
Run with the stock config and you get inbox.json containing title, a single unread (no _one/_other), no greeting_formal/greeting_casual, and nothing for the dynamic tab. Three of four strings are wrong or missing — and nothing errors.
The corrected i18next-parser.config.js
// i18next-parser.config.js
module.exports = {
  // Separators MUST match your i18next init (react-i18next or another binding) exactly.
  keySeparator: '.',        // nav.menu.home -> nested object
  namespaceSeparator: ':',  // inbox:title -> ns "inbox", key "title"
  pluralSeparator: '_',     // unread_one / unread_other
  contextSeparator: '_',    // greeting_formal / greeting_casual
  defaultNamespace: 'translation',

  // Sentinel value, NOT empty string: empty values look "translated" to a TMS.
  defaultValue: '__MISSING__',

  locales: ['en', 'de', 'ar'], // generate per-locale plural forms
  output: 'locales/$LOCALE/$NAMESPACE.json',
  input: ['src/**/*.{js,jsx,ts,tsx}'],

  lexers: {
    js: ['JavascriptLexer'],
    ts: ['JavascriptLexer'],
    jsx: ['JsxLexer'],
    tsx: ['JsxLexer'], // JSX needs JsxLexer or <Trans> is missed
  },

  sort: true,               // canonical ordering -> stable diffs in CI
  keepRemoved: true,        // do NOT drop a key just because its t() moved
  createOldCatalogs: false, // no *_old.json noise; TMS owns deprecation
  failOnWarnings: true,     // dynamic-key / unparseable warnings fail the run
};
The load-bearing lines:
- keepRemoved: true is the single most important flag. A refactor that temporarily comments out a t() call, or a dynamic expression the parser can't read, will not erase the translator's existing value. Stale keys are cleaned up deliberately (see escalation), never as an accidental side effect of a scan.
- defaultValue: '__MISSING__' is a visible sentinel. An empty string is indistinguishable from a real translation in Weblate and slips through review; __MISSING__ is greppable and easy to gate on.
- failOnWarnings: true turns the dynamic-key tab.${section} warning into a non-zero exit, so CI catches the invisible string instead of shipping it.
- locales with de/ar forces generation of every plural category. Arabic needs zero/one/two/few/many/other; without the locale listed, those keys are never scaffolded. The Arabic and Slavic pluralization rules are what determine the exact set.
Handle the dynamic key by making it statically visible:
// Give the parser literals it can see: list every possible key in an
// allow-list comment, which the parser reads like ordinary t() calls.
// t('tab.inbox')
// t('tab.sent')
// t('tab.drafts')
const tab = t(`tab.${section}` as const);
Verification: lint for missing keys
Two checks belong in CI. First, fail the run on any parser warning:
# Non-zero exit if any t() call is dynamic/unparseable (failOnWarnings).
npx i18next-parser --config i18next-parser.config.js
Second, prove extraction is idempotent and complete — re-running it must not change tracked files, and no catalog may still contain the sentinel:
# Re-extract, then fail if the catalog changed (someone forgot to run it)…
npx i18next-parser --config i18next-parser.config.js
git diff --exit-code -- 'locales/**/*.json'
# …and fail if any string is still the missing-value sentinel.
! grep -rq '"__MISSING__"' locales/en/
A green run guarantees every t() key is in the source-locale catalog with a real value, ordering is canonical, and no translator work was dropped. This is the contract that lets the GitHub Actions i18n CI gates block a merge on untranslated keys without false positives.
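When a CI log should name the offending keys rather than only fail, the sentinel check can be sketched as a small recursive scan over a nested catalog. The findMissing function and the sample catalog are hypothetical and shown for illustration; the grep command above remains the simpler gate.

```typescript
// Illustrative only: `findMissing` is a hypothetical helper.
type CatalogNode = string | { [key: string]: CatalogNode };

// Return the dotted path of every value equal to the sentinel.
function findMissing(node: CatalogNode, sentinel = '__MISSING__', path: string[] = []): string[] {
  if (typeof node === 'string') {
    return node === sentinel ? [path.join('.')] : [];
  }
  const hits: string[] = [];
  for (const [key, child] of Object.entries(node)) {
    hits.push(...findMissing(child, sentinel, [...path, key]));
  }
  return hits;
}

const sampleCatalog: CatalogNode = {
  title: 'Inbox',
  unread_one: '__MISSING__',
  tab: { inbox: 'Inbox', sent: '__MISSING__' },
};

// Prints [ 'unread_one', 'tab.sent' ]
console.log(findMissing(sampleCatalog));
```

In a real pipeline the catalog would be read from locales/en/ with JSON.parse, and a non-empty result would set a non-zero exit code.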
Configuration reference
| Option | Type | Description / default |
|---|---|---|
| keySeparator | string \| false | Splits nested key paths. Default '.'. Set false for flat keys; must equal runtime config. |
| namespaceSeparator | string \| false | Splits ns:key. Default ':'. |
| pluralSeparator | string | Joins base key and CLDR category, e.g. item_one. Default '_'. |
| contextSeparator | string | Joins base key and context value, e.g. greeting_formal. Default '_'. |
| keepRemoved | boolean | Keep keys whose t() call is gone. Default false; set true. |
| defaultValue | string \| fn | Placeholder for new keys. Use a greppable sentinel, not ''. |
| failOnWarnings | boolean | Exit non-zero on dynamic/unparseable keys. Default false. |
| sort | boolean | Canonical key ordering for stable diffs. Default false. |
| createOldCatalogs | boolean | Write *_old.json for removed keys. Default true; set false. |
| locales | string[] | Locales to scaffold; drives which plural categories are generated. |
When to escalate
This config covers static t() and <Trans> usage. It is insufficient when keys are genuinely runtime-computed (a key returned by an API, or built from user data) — no AST parser can see those, and forcing them through failOnWarnings only blocks the build. In that case, maintain an explicit allow-list module of every possible key as literal t() calls (or a typed enum) so the parser extracts them, and treat the dynamic call site as a lookup into that known set.
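One possible shape for such an allow-list module is sketched below. The names TAB_SECTIONS, tabLabels, tabLabel, and the tab.unknown fallback key are assumptions made for illustration, and TFunction is a local stand-in for the real i18next type so the sketch stays self-contained.

```typescript
// Hypothetical allow-list module; all names here are illustrative.
type TFunction = (key: string) => string;

const TAB_SECTIONS = ['inbox', 'sent', 'drafts'] as const;
type TabSection = (typeof TAB_SECTIONS)[number];

// Every possible key appears as a literal t() call a static extractor can read.
function tabLabels(t: TFunction): Record<TabSection, string> {
  return {
    inbox: t('tab.inbox'),
    sent: t('tab.sent'),
    drafts: t('tab.drafts'),
  };
}

function isTabSection(value: string): value is TabSection {
  return (TAB_SECTIONS as readonly string[]).indexOf(value) !== -1;
}

// The dynamic call site becomes a lookup into the known set.
function tabLabel(t: TFunction, section: string): string {
  return isTabSection(section) ? tabLabels(t)[section] : t('tab.unknown');
}

// Stand-in translator for demonstration; a real app would pass i18next's t.
const fakeT: TFunction = (key) => `[${key}]`;
console.log(tabLabel(fakeT, 'sent')); // prints "[tab.sent]"
console.log(tabLabel(fakeT, 'spam')); // prints "[tab.unknown]"
```

Because the Record type requires an entry for every TabSection, adding a section without adding its literal key fails type-checking instead of silently skipping extraction.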
Likewise, keepRemoved: true means stale keys accumulate. Schedule a deliberate cleanup: diff the extracted catalog against a fresh keepRemoved: false run on a throwaway branch, review what dropped, and prune in one reviewed commit — never let a routine scan delete strings. Once the catalog is clean it flows into the parent Weblate Self-Hosted Setup, where Weblate becomes the system of record for deprecation. If the catalog format itself is the friction point, the formatjs vs Lingui extraction pipeline comparison covers alternative extractors.
FAQ
Why do my plural keys never appear in the catalog?
The parser only generates _one/_other (and the other CLDR forms) when the t() call passes a count option and the target locale is listed in locales. A call like t('item', { count }) with locales: ['en'] yields item_one/item_other; without the count option you get a single flat item. Confirm pluralSeparator matches your runtime, then add every shipping locale to locales so categories like Arabic’s few/many are scaffolded.
How do I stop i18next-parser from deleting keys I’m still translating?
Set keepRemoved: true. By default the parser removes any key whose t() call it no longer sees — including calls hidden behind dynamic expressions or temporarily commented out during a refactor. With keepRemoved on, existing values survive; you prune stale keys deliberately in a separate reviewed step rather than losing them to an incidental scan.
Should keySeparator be the same in the parser and at runtime?
Yes — they must be identical. The parser uses keySeparator to decide whether menu.home is one flat key or a menu → home nesting, and i18next uses the same setting to resolve lookups. If they disagree, the parser writes nested objects the runtime can’t find (or vice versa), producing missing-key warnings for strings that are actually present.
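One way to keep the two configs from drifting is to declare the separators once and derive both option objects from that single source. This is a sketch under the assumption that your build lets the parser config and the runtime init import a shared module; the object names are hypothetical. It also reflects a naming difference worth checking against your versions: i18next's init option is nsSeparator, while the parser option is namespaceSeparator.

```typescript
// Hypothetical shared module: one source of truth for all four separators.
const separators = {
  keySeparator: '.',
  namespaceSeparator: ':',
  pluralSeparator: '_',
  contextSeparator: '_',
} as const;

// Parser side: these names match the parser config keys used on this page.
const parserOptions = {
  ...separators,
  defaultNamespace: 'translation',
};

// Runtime side: i18next init expects `nsSeparator` for the namespace separator.
const runtimeOptions = {
  keySeparator: separators.keySeparator,
  nsSeparator: separators.namespaceSeparator,
  pluralSeparator: separators.pluralSeparator,
  contextSeparator: separators.contextSeparator,
};

console.log(parserOptions, runtimeOptions);
```

The runtime object would then be spread into the options passed to i18next's init call.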
Related
- Weblate Self-Hosted Setup — the system that imports the catalog this parser produces.
- Configuring Weblate webhooks for auto-translation — fire downstream translation jobs once extracted keys land.
- GitHub Actions i18n CI gates — turn the lint commands here into a merge-blocking gate.
- Handling pluralization in Arabic and Slavic languages — the CLDR categories your plural keys must cover.
- formatjs vs Lingui extraction pipeline — alternative extractors when i18next-parser doesn’t fit.
Part of Weblate Self-Hosted Setup.