Extracting Translation Keys with i18next-parser

i18next-parser walks your source AST and writes a translation catalog, but its defaults silently drop plural and context variants, mangle keys when keySeparator disagrees with runtime, and delete strings a translator is still working on. This page shows a deterministic i18next-parser.config.js that produces stable keys, preserves removed strings, and fails CI when a translation is missing — the exact handoff your Weblate Self-Hosted Setup consumes.

The parser is the first stage of any code-driven Translation Workflows & CI/CD Pipeline Sync: it converts t('key') calls into JSON resource files. Get the separators and merge flags wrong and every downstream stage — Weblate import, machine pre-fill, the CI gate — inherits corrupt keys. The fix is one well-understood config file plus a lint command.

Figure: i18next-parser extraction and merge pipeline. Stages: source scan (*.tsx / *.vue), lexer extract (t() + context), sort + merge (keep existing values), catalog write ({{lng}}/{{ns}}.json), CI lint (fail on missing value via --fail-on-warnings / git diff).
Extraction is a merge, not an overwrite: sort + merge preserves translator values and removed keys, and a lint step turns missing values into a build failure.

Root cause: where extraction goes wrong

Most extraction bugs trace to four mechanical decisions the parser makes at scan time, not to bad source code.

Separators are structural, not cosmetic. keySeparator (default .) and namespaceSeparator (default :) tell the parser how to split a string like nav:menu.home into a namespace (nav) and a nested path (menu → home). If your code uses flat keys like nav.menu.home as literal strings, the parser will still split on . and produce a deeply nested object, which vue-i18n or react-i18next then fails to resolve at runtime because the runtime separator config disagrees. The parser config and the i18next init config must declare the same separators.
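To make the splitting concrete, here is a minimal sketch of how a key string could be divided under a given pair of separators. splitKey is a hypothetical helper written for illustration, not part of i18next-parser's API; it assumes that false disables a separator, as the configuration reference on this page describes.

```javascript
// Hypothetical illustration of separator handling; not i18next-parser's own code.
function splitKey(raw, options = {}) {
  const {
    keySeparator = '.',
    namespaceSeparator = ':',
    defaultNamespace = 'translation',
  } = options;

  let namespace = defaultNamespace;
  let key = raw;

  // Split off the namespace only when a namespace separator is enabled and present.
  if (namespaceSeparator !== false) {
    const idx = raw.indexOf(namespaceSeparator);
    if (idx > -1) {
      namespace = raw.slice(0, idx);
      key = raw.slice(idx + namespaceSeparator.length);
    }
  }

  // keySeparator: false keeps the key flat; otherwise each segment is one nesting level.
  const path = keySeparator === false ? [key] : key.split(keySeparator);
  return { namespace, path };
}

console.log(splitKey('nav:menu.home'));
// { namespace: 'nav', path: [ 'menu', 'home' ] }
console.log(splitKey('nav.menu.home', { keySeparator: false }));
// { namespace: 'translation', path: [ 'nav.menu.home' ] }
```

The second call shows why a mismatch hurts: the same literal string is one flat key under keySeparator: false and a three-level path under the default.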

Plural and context keys are generated, not written. When you call t('item', { count }), the parser must emit item_one and item_other (English) plus every other CLDR category the locale needs. When you call t('greeting', { context: 'formal' }), it emits greeting_formal. Both are controlled by pluralSeparator and contextSeparator; if those don’t match runtime, the generated key never matches the lookup.
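The generated key names can be sketched with the standard Intl.PluralRules API, which exposes the CLDR categories for a locale. The helper names pluralKeys and contextKey are hypothetical, and this only illustrates the naming scheme described above; it is not how the parser is implemented internally.

```javascript
// Sketch: derive the plural and context key names a catalog would need.
// Intl.PluralRules is a standard ECMAScript API; the helpers are hypothetical.
function pluralKeys(baseKey, locale, pluralSeparator = '_') {
  const { pluralCategories } = new Intl.PluralRules(locale).resolvedOptions();
  return pluralCategories.map((category) => `${baseKey}${pluralSeparator}${category}`);
}

function contextKey(baseKey, context, contextSeparator = '_') {
  return `${baseKey}${contextSeparator}${context}`;
}

// English needs two plural keys: unread_one and unread_other.
console.log(pluralKeys('unread', 'en'));

// Other locales may need more categories, depending on the runtime's locale data.
console.log(pluralKeys('unread', 'ar'));

console.log(contextKey('greeting', 'formal')); // greeting_formal
```

Changing either separator argument changes every generated name, which is why the parser and the runtime have to agree on them.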

Merge defaults can delete in-flight work. By default the parser keeps existing translated values and only fills new keys with defaultValue. But if keepRemoved is left off, any key whose t() call you deleted (or that the parser failed to see because of a dynamic expression) vanishes from the catalog — taking the translation with it.
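The merge behavior described above can be modeled on a flat catalog as follows. mergeCatalog is a hypothetical helper that mirrors the described semantics (existing values win, new keys get the placeholder, unseen keys survive only with keepRemoved); the real parser also handles nesting, sorting, and old catalogs.

```javascript
// Sketch of merge semantics on a flat catalog; mergeCatalog is a hypothetical helper.
function mergeCatalog(existing, extractedKeys, options = {}) {
  const { keepRemoved = false, defaultValue = '' } = options;
  const extracted = new Set(extractedKeys);
  const merged = {};

  // Keys still present in source keep their translated value; new keys get the placeholder.
  for (const key of extracted) {
    const hasValue = Object.prototype.hasOwnProperty.call(existing, key);
    merged[key] = hasValue ? existing[key] : defaultValue;
  }

  // Keys no longer seen in source survive only when keepRemoved is on.
  if (keepRemoved) {
    for (const [key, value] of Object.entries(existing)) {
      if (!extracted.has(key)) merged[key] = value;
    }
  }
  return merged;
}

const existing = { title: 'Posteingang', legacy: 'Alt' };
const extracted = ['title', 'unread_one'];

console.log(mergeCatalog(existing, extracted, { defaultValue: '__MISSING__' }));
// { title: 'Posteingang', unread_one: '__MISSING__' }
console.log(mergeCatalog(existing, extracted, { keepRemoved: true, defaultValue: '__MISSING__' }));
// { title: 'Posteingang', unread_one: '__MISSING__', legacy: 'Alt' }
```

In the first call the legacy translation is gone after a single scan; in the second it is retained until someone prunes it on purpose.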

Dynamic keys are invisible to the AST. t(`page.${id}`) cannot be statically resolved, so it is skipped. The string looks translated in code but never reaches the catalog.

Minimal reproducible example

A single component shows all four failure modes at once:

// Inbox.tsx
import { useTranslation } from 'react-i18next';

export function Inbox({ count, formal, section }: Props) {
  const { t } = useTranslation('inbox');
  return (
    <>
      <h1>{t('title')}</h1>
      <p>{t('unread', { count })}</p>                 {/* plural */}
      <button>{t('greeting', { context: formal ? 'formal' : 'casual' })}</button>
      <span>{t(`tab.${section}`)}</span>             {/* dynamic — skipped */}
    </>
  );
}

Run with the stock config and you get inbox.json containing title, a single unread (no _one/_other), no greeting_formal/greeting_casual, and nothing for the dynamic tab. Three of four strings are wrong or missing — and nothing errors.

The corrected i18next-parser.config.js

// i18next-parser.config.js
module.exports = {
  // Separators MUST match your i18next/react-i18next/vue-i18n init exactly.
  keySeparator: '.',          // nav.menu.home -> nested object
  namespaceSeparator: ':',    // inbox:title  -> ns "inbox", key "title"
  pluralSeparator: '_',       // unread_one / unread_other
  contextSeparator: '_',      // greeting_formal / greeting_casual

  defaultNamespace: 'translation',
  // Sentinel value, NOT empty string: empty values look "translated" to a TMS.
  defaultValue: '__MISSING__',

  locales: ['en', 'de', 'ar'],          // generate per-locale plural forms
  output: 'locales/$LOCALE/$NAMESPACE.json',
  input: ['src/**/*.{js,jsx,ts,tsx}'],

  lexers: {
    js: ['JavascriptLexer'],
    ts: ['JavascriptLexer'],
    jsx: ['JsxLexer'],
    tsx: ['JsxLexer'],            // JSX needs JsxLexer or <Trans> is missed
  },

  sort: true,                    // canonical ordering -> stable diffs in CI
  keepRemoved: true,             // do NOT drop a key just because its t() moved
  createOldCatalogs: false,      // no *_old.json noise; TMS owns deprecation
  failOnWarnings: true,          // dynamic-key / unparseable warnings fail the run
};

The load-bearing lines:

  • keepRemoved: true — the single most important flag. A refactor that temporarily comments out a t() call, or a dynamic expression the parser can’t read, will not erase the translator’s existing value. Stale keys are cleaned up deliberately (see escalation), never as an accidental side effect of a scan.
  • defaultValue: '__MISSING__' — a visible sentinel. An empty string is indistinguishable from a real translation in Weblate and slips through review; __MISSING__ is greppable and easy to gate on.
  • failOnWarnings: true — turns the dynamic-key tab.${section} warning into a non-zero exit, so CI catches the invisible string instead of shipping it.
  • locales with de/ar — forces generation of every plural category. Arabic needs zero/one/two/few/many/other; without the locale listed, those keys are never scaffolded. The Arabic and Slavic pluralization rules are what determine the exact set.

Handle the dynamic key by making it statically visible:

// Give the parser literals it can see. i18next-parser also scans comments,
// so an allow-list comment next to the dynamic call is extracted:
// t('tab.inbox'); t('tab.sent'); t('tab.drafts');
const tab = t(`tab.${section}` as const);

Verification: lint for missing keys

Two checks belong in CI. First, fail the run on any parser warning:

# Non-zero exit if any t() call is dynamic/unparseable (failOnWarnings).
npx i18next-parser --config i18next-parser.config.js

Second, prove extraction is idempotent and complete — re-running it must not change tracked files, and no catalog may still contain the sentinel:

# Re-extract, then fail if the catalog changed (someone forgot to run it)…
npx i18next-parser --config i18next-parser.config.js
git diff --exit-code -- 'locales/**/*.json'

# …and fail if any string is still the missing-value sentinel.
! grep -rq '"__MISSING__"' locales/en/

A green run guarantees every t() key is in the source-locale catalog with a real value, ordering is canonical, and no translator work was dropped. This is the contract that lets the GitHub Actions i18n CI gates block a merge on untranslated keys without false positives.
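The grep check reports only that a sentinel exists somewhere. If you want the offending key paths in the CI log, a small walker over the parsed JSON is enough. The following is a sketch under assumptions: findMissing is a hypothetical helper, the sample catalog is invented for illustration, and in a real job you would load each file with fs.readFileSync and JSON.parse and exit non-zero when the list is not empty.

```javascript
// Sketch: list the key paths whose value is still the missing-value sentinel.
// findMissing is a hypothetical helper, not part of i18next-parser.
function findMissing(catalog, sentinel = '__MISSING__', prefix = '') {
  const missing = [];
  for (const [key, value] of Object.entries(catalog)) {
    const path = prefix ? `${prefix}.${key}` : key;
    if (value !== null && typeof value === 'object') {
      // Nested object produced by keySeparator: recurse with the extended path.
      missing.push(...findMissing(value, sentinel, path));
    } else if (value === sentinel) {
      missing.push(path);
    }
  }
  return missing;
}

// Invented sample catalog for illustration.
const sample = {
  title: 'Inbox',
  tab: { inbox: '__MISSING__', sent: 'Sent' },
  unread_one: '__MISSING__',
};

console.log(findMissing(sample));
// [ 'tab.inbox', 'unread_one' ]
```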

Configuration reference

Option               Type              Description / default
keySeparator         string | false    Splits nested key paths. Default ".". Set false for flat keys; must equal runtime config.
namespaceSeparator   string | false    Splits ns:key. Default ":".
pluralSeparator      string            Joins base key and CLDR category, e.g. item_one. Default "_".
contextSeparator     string            Joins base key and context value, e.g. greeting_formal. Default "_".
keepRemoved          boolean           Keep keys whose t() call is gone. Default false; set true.
defaultValue         string | fn       Placeholder for new keys. Use a greppable sentinel, not ''.
failOnWarnings       boolean           Exit non-zero on dynamic/unparseable keys. Default false.
sort                 boolean           Canonical key ordering for stable diffs. Default false.
createOldCatalogs    boolean           Write *_old.json for removed keys. Default true; set false.
locales              string[]          Locales to scaffold; drives which plural categories are generated.

When to escalate

This config covers static t() and <Trans> usage. It is insufficient when keys are genuinely runtime-computed (a key returned by an API, or built from user data) — no AST parser can see those, and forcing them through failOnWarnings only blocks the build. In that case, maintain an explicit allow-list module of every possible key as literal t() calls (or a typed enum) so the parser extracts them, and treat the dynamic call site as a lookup into that known set.
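One way to structure such an allow-list, sketched under assumptions: the module and function name tabLabel are hypothetical, the three keys are the tab keys used earlier on this page, and the sketch assumes the parser matches any call to a function named t, so the literals inside the switch are what it would extract.

```javascript
// Hypothetical allow-list module: every possible key appears as a literal t() call,
// and the runtime value only selects among them.
function tabLabel(t, section) {
  switch (section) {
    case 'inbox':
      return t('tab.inbox');
    case 'sent':
      return t('tab.sent');
    case 'drafts':
      return t('tab.drafts');
    default:
      // An unknown section is a programming error, not a silently untranslated string.
      throw new Error(`Unknown tab section: ${section}`);
  }
}

// Stand-in for the real t() so the sketch runs on its own.
const fakeT = (key) => `[${key}]`;
console.log(tabLabel(fakeT, 'sent')); // [tab.sent]
```

The call site then becomes tabLabel(t, section) instead of a template literal, so there is no dynamic expression left for the parser to warn about.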

Likewise, keepRemoved: true means stale keys accumulate. Schedule a deliberate cleanup: diff the extracted catalog against a fresh keepRemoved: false run on a throwaway branch, review what dropped, and prune in one reviewed commit — never let a routine scan delete strings. Once the catalog is clean it flows into the parent Weblate Self-Hosted Setup, where Weblate becomes the system of record for deprecation. If the catalog format itself is the friction point, the formatjs vs Lingui extraction pipeline comparison covers alternative extractors.
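The cleanup diff can be sketched as a set difference over flattened key paths. flattenKeys and staleKeys are hypothetical helpers and the two catalogs are invented samples; in practice the inputs would be the parsed JSON from the keepRemoved: true catalog and from the throwaway keepRemoved: false run.

```javascript
// Sketch: report keys that exist only because keepRemoved retained them.
// Both helpers are hypothetical, not part of i18next-parser.
function flattenKeys(catalog, prefix = '') {
  const keys = [];
  for (const [key, value] of Object.entries(catalog)) {
    const path = prefix ? `${prefix}.${key}` : key;
    if (value !== null && typeof value === 'object') {
      keys.push(...flattenKeys(value, path));
    } else {
      keys.push(path);
    }
  }
  return keys;
}

function staleKeys(keptCatalog, freshCatalog) {
  const fresh = new Set(flattenKeys(freshCatalog));
  return flattenKeys(keptCatalog).filter((key) => !fresh.has(key));
}

// Invented samples: "kept" came from keepRemoved: true, "fresh" from keepRemoved: false.
const kept = { title: 'Inbox', tab: { inbox: 'Inbox', archive: 'Archive' } };
const fresh = { title: 'Inbox', tab: { inbox: 'Inbox' } };

console.log(staleKeys(kept, fresh));
// [ 'tab.archive' ]
```

The output is the review list for the pruning commit; nothing is deleted by the script itself.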

FAQ

Why do my plural keys never appear in the catalog?

The parser only generates _one/_other (and the other CLDR forms) when the t() call passes a count option and the target locale is listed in locales. A call like t('item', { count }) with locales: ['en'] yields item_one/item_other; without the count option you get a single flat item. Confirm pluralSeparator matches your runtime, then add every shipping locale to locales so categories like Arabic’s few/many are scaffolded.

How do I stop i18next-parser from deleting keys I’m still translating?

Set keepRemoved: true. By default the parser removes any key whose t() call it no longer sees — including calls hidden behind dynamic expressions or temporarily commented out during a refactor. With keepRemoved on, existing values survive; you prune stale keys deliberately in a separate reviewed step rather than losing them to an incidental scan.

Should keySeparator be the same in the parser and at runtime?

Yes, they must be identical. The parser uses keySeparator to decide whether menu.home is one flat key or a menu → home nesting, and i18next uses the same setting to resolve lookups. If they disagree, the parser writes nested objects the runtime can’t find (or vice versa), producing missing-key warnings for strings that are actually present.
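A consistency check between the two configs can be sketched as below. One detail to watch: the i18next init option for the namespace separator is named nsSeparator, while the parser calls it namespaceSeparator. separatorMismatches is a hypothetical helper, and the fallback values are the defaults listed in the configuration reference above.

```javascript
// Parser option name -> i18next init option name, with the shared default.
const SEPARATORS = [
  { parser: 'keySeparator', runtime: 'keySeparator', fallback: '.' },
  { parser: 'namespaceSeparator', runtime: 'nsSeparator', fallback: ':' },
  { parser: 'pluralSeparator', runtime: 'pluralSeparator', fallback: '_' },
  { parser: 'contextSeparator', runtime: 'contextSeparator', fallback: '_' },
];

// Hypothetical helper: returns one message per separator that differs.
function separatorMismatches(parserConfig, runtimeOptions) {
  return SEPARATORS.filter(({ parser, runtime, fallback }) => {
    // ?? keeps an explicit false (separator disabled) and only fills in undefined/null.
    const parserValue = parserConfig[parser] ?? fallback;
    const runtimeValue = runtimeOptions[runtime] ?? fallback;
    return parserValue !== runtimeValue;
  }).map(({ parser, runtime }) => `${parser} (parser) != ${runtime} (runtime)`);
}

console.log(separatorMismatches(
  { keySeparator: false, namespaceSeparator: ':' },
  { keySeparator: '.', nsSeparator: ':' },
));
// [ 'keySeparator (parser) != keySeparator (runtime)' ]
```

Run against the real parser config and the object passed to i18next.init, an empty result means the separators agree.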

Part of Weblate Self-Hosted Setup.