Core i18n Architecture & Locale Negotiation

A production internationalization pipeline is a deterministic, cache-aware sequence of stages — locale resolution, message externalization, resource loading with fallback, runtime formatting, and bidirectional layout — that must produce the same output for the same request every time. This page is the architectural map for that pipeline: it defines the contracts each stage owes the next, the spec references that keep the stages interoperable, and the places where a careless default silently corrupts every downstream stage. The audience is full-stack engineers, UX engineers, product teams, and localization leads who own this surface end to end.

Most i18n bugs are not translation bugs. They are ordering bugs, caching bugs, and fallback bugs that happen to surface as a missing string or a mirrored icon. Treating the pipeline as one deterministic system — rather than a bag of framework features — is what lets a team ship dozens of locales without a combinatorial explosion of edge cases.

[Figure: The deterministic i18n pipeline. An incoming request flows through five stages: (1) locale resolution (URL > cookie > header > geo), (2) externalization (ICU + AST extraction), (3) load + fallback (code-split, degrade gracefully), (4) runtime formatting (Intl date/number/currency), (5) RTL + layout (logical properties, dir), producing the rendered localized UI. Compliance flags and edge-cache headers (Vary, stale-while-revalidate) are cross-cutting and wrap every stage.]
Five sequential pipeline stages; each stage's output is the next stage's input, wrapped by compliance and cache concerns.

Architecture overview

This area sits upstream of everything else in localization. Once the locale is resolved and the message catalog is loaded, two adjacent bodies of work take over. The Framework i18n & Component Routing documentation covers how Next.js, React, Vue, Angular, SvelteKit, and Astro wire that resolved locale into routing, components, and hydration — it consumes the contract this pipeline produces. The Translation Workflows & CI/CD Pipeline Sync documentation covers how the externalized messages travel to translators (Crowdin, Weblate), come back, and get gated in CI before they re-enter the catalog this pipeline loads.

Keep the boundary sharp: this pipeline decides which locale and which strings; the framework area decides how those strings reach a component; the workflow area decides how strings are authored, translated, and validated. When a bug crosses two of those areas — a hydration mismatch after a locale switch, for example — fix it at the boundary contract, not by patching one side.

The contract this pipeline exposes to the rest of the system is small and explicit:

| Contract field | Produced by | Consumed by |
| --- | --- | --- |
| resolvedLocale (BCP 47 tag) | Stage 1 resolution | framework routing, formatting, RTL |
| fallbackChain (ordered tags) | Stage 3 resolver | message lookup, framework loaders |
| messages (compiled catalog) | Stage 2 + 3 | components, runtime formatters |
| isRTL / dir | Stage 5 | document root, layout engine |
| cache key (Vary, locale) | cross-cutting | CDN, edge middleware |

Everything below is an implementation of one of those five contract fields.
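
The contract table above can be written down as a type so that every stage agrees on field names. This is a minimal sketch, not part of the original pipeline: the `LocaleContract` interface, the `cacheKey` field name, and the `validateContract` helper are hypothetical names chosen for illustration.

```typescript
type Direction = 'ltr' | 'rtl';

// Hypothetical shape for the five contract fields listed in the table.
interface LocaleContract {
  resolvedLocale: string;                      // BCP 47 tag from stage 1
  fallbackChain: readonly string[];            // ordered tags from stage 3
  messages: Readonly<Record<string, string>>;  // compiled catalog from stages 2 + 3
  isRTL: boolean;                              // direction flag
  dir: Direction;                              // value for the document root
  cacheKey: string;                            // cross-cutting cache key
}

// Returns a list of violated invariants; an empty list means the contract is coherent.
export function validateContract(c: LocaleContract): string[] {
  const problems: string[] = [];
  if (c.resolvedLocale.trim() === '') problems.push('resolvedLocale is empty');
  if (c.fallbackChain.length === 0) problems.push('fallbackChain is empty');
  if ((c.dir === 'rtl') !== c.isRTL) problems.push('dir and isRTL disagree');
  if (c.cacheKey.trim() === '') problems.push('cacheKey is empty');
  return problems;
}
```

Checking the contract once at the boundary, rather than in each consumer, fits the rule above that cross-area bugs are fixed at the boundary contract.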

Locale resolution: a deterministic order

The first stage answers one question — which locale is this request in? — and it must answer the same way every time. A non-deterministic answer fragments your cache, triggers redirect loops, and makes “works on my machine” bugs unreproducible. The hierarchy is: URL prefix > explicit cookie/header override > Accept-Language parsing (q-weighted, per RFC 4647 lookup matching) > geo-IP last resort. Resolve it at the edge, before application hydration, so the client never flashes unlocalized content. The full decision tree, including domain-based routing and the q-weight tie-breaks, lives in Locale Negotiation Strategies, with a worked backend example in implementing locale negotiation in Express.js.

Implement edge resolution in this order:

  1. Short-circuit when the path already carries a locale. If /de/... is present and supported, return immediately — never re-resolve, or you create redirect loops and double-counted cache entries.
  2. Read an explicit override. A user who picked a language gets a cookie; that beats their browser header. Validate it against the supported set so a stale or tampered cookie can’t route to a non-existent locale.
  3. Parse Accept-Language by q-weight, then strip the region to match against supported base languages. Do not blindly take split(',')[0]; that discards the user’s second preference.
  4. Fall back to geo-IP only as a last resort, and persist the decision in a cookie so the next request is a cheap path match, not a re-resolution.
  5. Set Vary: Accept-Language, Cookie on the response so the CDN keys cached variants correctly.
// middleware.ts — Next.js / Vercel Edge
import { NextRequest, NextResponse } from 'next/server';

const SUPPORTED_LOCALES = ['en', 'de', 'ja', 'es'];
const DEFAULT_LOCALE = 'en';

export function middleware(req: NextRequest) {
  const url = req.nextUrl.clone();

  // 1. Path already localized → never re-resolve (prevents redirect loops).
  const pathLocale = url.pathname.split('/')[1];
  if (SUPPORTED_LOCALES.includes(pathLocale)) return NextResponse.next();

  // 2. Explicit override beats the browser header.
  const cookieLocale = req.cookies.get('i18n_locale')?.value;
  // 3. q-weighted parse, region stripped, first supported match wins.
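  const headerLocale = (req.headers.get('accept-language') ?? '')
    .split(',')
    .map((part) => {
      const [tag, q] = part.trim().split(/\s*;\s*q=/);
      const weight = q === undefined ? 1 : parseFloat(q);
      // Lowercase the base language; treat an unparseable weight as 0.
      return { tag: tag.split('-')[0].toLowerCase(), q: Number.isNaN(weight) ? 0 : weight };
    })
    .filter((c) => c.q > 0) // q=0 means "not acceptable" and must never match
    .sort((a, b) => b.q - a.q)
    .find((c) => SUPPORTED_LOCALES.includes(c.tag))?.tag;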

  let resolvedLocale = DEFAULT_LOCALE; // 4. geo-IP would slot in just above this.
  if (cookieLocale && SUPPORTED_LOCALES.includes(cookieLocale)) resolvedLocale = cookieLocale;
  else if (headerLocale) resolvedLocale = headerLocale;

  url.pathname = `/${resolvedLocale}${url.pathname}`;
  const res = NextResponse.redirect(url);
  res.cookies.set('i18n_locale', resolvedLocale, { path: '/', maxAge: 31536000 });
  res.headers.set('Vary', 'Accept-Language, Cookie'); // 5. correct cache keying
  return res;
}

export const config = {
  matcher: ['/((?!api|_next/static|_next/image|favicon.ico).*)'],
};
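
Step 3 is easier to unit-test when the header parsing lives in a pure function outside the middleware. The sketch below is one possible shape, with hypothetical names (`parseAcceptLanguage`, `lookupLocale`). It approximates RFC 4647 lookup by truncating subtags from the right until a supported tag matches. It is a simplification: the full algorithm also drops a trailing single-character subtag during truncation, which this sketch does not do.

```typescript
interface LanguageRange {
  tag: string;
  q: number;
}

// Parse "fr-CH, fr;q=0.9, en;q=0.8" into ranges sorted by descending weight.
export function parseAcceptLanguage(header: string): LanguageRange[] {
  return header
    .split(',')
    .map((part) => {
      const [rawTag, ...params] = part.trim().split(';');
      const qParam = params.map((p) => p.trim()).find((p) => p.startsWith('q='));
      const q = qParam ? Number.parseFloat(qParam.slice(2)) : 1;
      return { tag: (rawTag ?? '').trim().toLowerCase(), q: Number.isNaN(q) ? 0 : q };
    })
    // Drop empty entries, the wildcard, and anything the user marked unacceptable (q=0).
    .filter((r) => r.tag !== '' && r.tag !== '*' && r.q > 0)
    // Array.prototype.sort is stable, so equal weights keep header order.
    .sort((a, b) => b.q - a.q);
}

// Return the first supported locale, truncating each range: zh-hant-tw → zh-hant → zh.
export function lookupLocale(header: string, supported: readonly string[]): string | undefined {
  const byLower = new Map<string, string>();
  for (const s of supported) byLower.set(s.toLowerCase(), s);
  for (const { tag } of parseAcceptLanguage(header)) {
    const subtags = tag.split('-');
    while (subtags.length > 0) {
      const hit = byLower.get(subtags.join('-'));
      if (hit !== undefined) return hit;
      subtags.pop();
    }
  }
  return undefined;
}
```

With `['en', 'de']` as the supported set, `lookupLocale('fr-CH, fr;q=0.9, en;q=0.8, de;q=0.7', ...)` returns `'en'`: both French ranges fail to match, and `en` outranks `de` by weight.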

The resolved locale is the single source of truth for every downstream stage. Propagate it explicitly (a request header like x-locale, a context provider, or a compiled config) rather than re-deriving it — re-deriving in a component is the classic root cause of hydration mismatches, because the server and client can disagree on the inputs.

// i18n.config.ts — compile the routing matrix at build time, expose it as a contract
export const i18nConfig = {
  locales: ['en', 'de', 'ja', 'es'],
  defaultLocale: 'en',
  domains: {
    en: { host: 'app.global.com', defaultLocale: 'en' },
    de: { host: 'app.de.global.com', defaultLocale: 'de' },
  },
  featureFlags: {
    enableRTL: ['ar', 'he'],
    phasedRollout: { ja: { enabled: true, trafficPercentage: 0.15 } },
  },
} as const;

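export function getLocaleContext(request: Request) {
  const locale = request.headers.get('x-locale') || i18nConfig.defaultLocale;
  // `as const` narrows enableRTL to a tuple of literals, so widen it before testing an arbitrary string.
  const rtlLocales: readonly string[] = i18nConfig.featureFlags.enableRTL;
  return { locale, isRTL: rtlLocales.includes(locale) };
}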

Feature flags on the config let you roll a locale out to 15% of traffic without redeploying core routing — useful when a market is launching but the catalog is only partially translated.
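
A percentage rollout only stays deterministic if the same visitor lands in the same bucket on every request. One way to get that, sketched here under assumptions, is to hash a stable identifier into the range [0, 1) and compare it with `trafficPercentage`. The `isInRollout` helper, the `stableId` parameter (for example an anonymous visitor id), and the FNV-1a-style hash are illustrative choices, not something the config above prescribes.

```typescript
// FNV-1a-style 32-bit hash over UTF-16 code units; illustrative, not cryptographic.
function fnv1a(input: string): number {
  let hash = 0x811c9dc5;
  for (let i = 0; i < input.length; i++) {
    hash ^= input.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0;
  }
  return hash;
}

// trafficPercentage is a fraction, matching the 0.15 used in the config.
export function isInRollout(stableId: string, locale: string, trafficPercentage: number): boolean {
  if (trafficPercentage <= 0) return false;
  if (trafficPercentage >= 1) return true;
  // Salt with the locale so each locale's rollout picks a different slice of visitors.
  const bucket = fnv1a(`${locale}:${stableId}`) / 0x100000000; // in [0, 1)
  return bucket < trafficPercentage;
}
```

Because the bucket depends only on the inputs, the decision can be recomputed at the edge on every request without storing it, and it does not fragment the cache beyond the locale itself.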

Message externalization: ICU and AST extraction

The second stage answers what does each string say in this locale? — and it must never let a hardcoded English string reach production. Externalization means every user-facing string lives in a catalog keyed by a stable id, written in a portable message syntax. The portable syntax is ICU MessageFormat, because it encodes plurals, gender, and nested selects in a way every locale’s grammar can satisfy. The full syntax, including the traps in nested selects and the difference between plural, select, and selectordinal, is in the ICU Message Format Deep Dive; the hardest grammatical cases are worked through in ICU syntax for complex plurals.

Set up the externalization stage in this order:

  1. Author messages in ICU, not string concatenation — concatenation cannot express agreement rules and breaks the moment a translator reorders clauses.
  2. Extract with an AST parser, not a regex. Regex extraction misses template literals and dynamic ids; AST extraction (formatjs, Lingui) walks the real syntax tree.
  3. Hash the message content into the id so identical strings dedupe and a changed source string gets a new id, forcing re-translation.
  4. Compile the catalog to the runtime format your loader expects, failing the build on any ICU syntax error.
  5. Sync with translation memory and glossary enforcement so terminology stays consistent across teams — see how that gate is applied in CI in the Translation Workflows documentation.
{count, plural,
  =0 {No new messages}
  one {# new message}
  other {# new messages}
} from {gender, select,
  female {her}
  male {him}
  other {them}
}

{
  "extract": {
    "outFile": "locales/en/messages.json",
    "format": "simple",
    "idInterpolationPattern": "[sha512:contenthash:base64:6]",
    "preserveWhitespace": true
  }
}

The output of this stage — an en source catalog of {id: ICU string} pairs — is what travels to translators and what comes back as per-locale catalogs. The id is the join key across the entire system, which is why a content hash beats a hand-written key: it is collision-resistant and it makes a stale translation impossible to mistake for a current one.
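
To make the content-hash idea concrete, here is a minimal sketch that derives a six-character id from the source string alone. It only mirrors the shape of the `[sha512:contenthash:base64:6]` pattern. The `messageId` helper is hypothetical, and the extractor's real hashing input may include more than the message text, so ids produced here are not guaranteed to match a tool's output.

```typescript
import { createHash } from 'node:crypto';

// Derive a short, stable id from the message content: same text in, same id out.
export function messageId(source: string, length = 6): string {
  return createHash('sha512').update(source, 'utf8').digest('base64').slice(0, length);
}

const before = messageId('{count} new messages');
const after = messageId('{count} unread messages');
// A reworded source string yields a different id, so the old translation cannot be reused silently.
console.log(before !== after);
```

Six base64 characters carry 36 bits, so the truncation length is a trade-off between short keys and collision risk as the catalog grows.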

Resource loading and fallback: degrade, never break

The third stage answers what string do we show when this locale is missing a key? The answer must never be a blank UI or a raw key. Translation bundles are code-split by locale and dynamically imported for the negotiated route, so a French user never downloads Japanese strings. When a key is missing (a partial translation, a brand-new string, a region-only catalog), resolution walks a fallback chain: en-US → en → default locale. Build that chain explicitly in Fallback Chain Configuration, and see the production-grade version that avoids dead ends in setting up graceful fallback chains for missing strings.

Wire the loading-and-fallback stage like this:

  1. Cache the in-flight load promise, not just the resolved bundle, so concurrent requests for the same locale share one network fetch.
  2. Code-split per locale with dynamic import() keyed on the resolved tag.
  3. Compute the fallback chain once from the requested tag and the supported set, deduped and ending at the default.
  4. Resolve a key against the chain in order, returning the first hit — never an empty string.
  5. Set stale-while-revalidate so a catalog update propagates without a cache stampede against the origin.
// i18nLoader.ts — share one in-flight load per locale, code-split per locale
const localeCache = new Map<string, Promise<Record<string, string>>>();

export async function loadLocaleBundle(locale: string) {
  if (localeCache.has(locale)) return localeCache.get(locale)!;
  const loader = import(`../locales/${locale}/messages.json`)
    .then((mod) => mod.default)
    .catch((err) => {
      localeCache.delete(locale); // don't pin a failed load; let the next request retry
      throw err;
    });
  localeCache.set(locale, loader); // cache the promise, not the result
  return loader;
}

// fallbackResolver.ts — en-US → en → default, deduped, never empty
export function resolveFallbackChain(requested: string, supported: string[]): string[] {
  const chain: string[] = [];
  const [lang, region] = requested.split('-');
  if (supported.includes(requested)) chain.push(requested);
  if (region && lang && supported.includes(lang)) chain.push(lang);
  if (!chain.includes(supported[0])) chain.push(supported[0]); // default last
  return [...new Set(chain)];
}

// CDN headers — propagate updates without a stampede:
// Cache-Control: public, max-age=3600, stale-while-revalidate=86400
// Vary: Accept-Language, Cookie

A region-to-language fallback that hits a 404 is a defect, not graceful degradation — the chain must always terminate at a locale that exists.
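
Step 4, resolving a key against the chain, can be sketched as a pure function over catalogs that are already loaded. The `resolveMessage` name and the in-memory `catalogs` map are assumptions for illustration. The sketch also assumes the default catalog contains every key, so a total miss is treated as a build defect and throws instead of returning an empty string or the raw key.

```typescript
type Catalog = Readonly<Record<string, string>>;

// Walk the chain in order and return the first non-empty translation.
export function resolveMessage(
  key: string,
  chain: readonly string[],
  catalogs: Readonly<Record<string, Catalog | undefined>>,
): string {
  for (const locale of chain) {
    const hit = catalogs[locale]?.[key];
    // Treat whitespace-only values as missing so an empty export never blanks the UI.
    if (hit !== undefined && hit.trim() !== '') return hit;
  }
  throw new Error(`Missing message "${key}" in chain ${chain.join(' -> ')}`);
}

const demoCatalogs: Record<string, Catalog> = {
  en: { greeting: 'Hello', farewell: 'Goodbye' },
  de: { greeting: 'Hallo', farewell: '' },
};
console.log(resolveMessage('greeting', ['de', 'en'], demoCatalogs)); // → Hallo
console.log(resolveMessage('farewell', ['de', 'en'], demoCatalogs)); // → Goodbye
```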

Runtime formatting: Intl, not hand-rolled

The fourth stage answers how is this date, number, or amount written in this locale? Hand-rolled formatting is where i18n quietly breaks: de-DE uses . for thousands and , for decimals, en-IN groups digits in lakhs, and JPY has zero fraction digits. The native Intl APIs, available in browsers and in server runtimes, are backed by CLDR data, so use them and cache the formatter instances (constructing Intl.DateTimeFormat is expensive). The standards, the timezone and DST traps, and the currency-rounding rules are detailed in Date & Number Formatting Standards.

Build the formatting stage in this order:

  1. Memoize formatters by type:locale:options — constructing them per render is a measurable hotspot.
  2. Pass an explicit timeZone for any server-rendered date, or the server and the client will disagree and you get a hydration mismatch.
  3. Use style: 'currency' with a per-locale currency map rather than concatenating a symbol.
  4. Let Intl choose grouping and decimal separators — never substitute them yourself.
  5. Normalize to a canonical instant (UTC) before formatting, and format at the edge of the system, not in the middle.
// utils/formatters.ts — memoized Intl registry
const formatCache = new Map<string, Intl.DateTimeFormat | Intl.NumberFormat>();

export function getFormatter(
  type: 'date' | 'number',
  locale: string,
  options?: Intl.DateTimeFormatOptions | Intl.NumberFormatOptions,
) {
  const key = `${type}:${locale}:${JSON.stringify(options)}`;
  if (!formatCache.has(key)) {
    formatCache.set(
      key,
      type === 'date'
        ? new Intl.DateTimeFormat(locale, options as Intl.DateTimeFormatOptions)
        : new Intl.NumberFormat(locale, options as Intl.NumberFormatOptions),
    );
  }
  return formatCache.get(key)!;
}

// Currency: drive the currency from the locale, let Intl place the symbol and decimals.
const CURRENCY_MAP: Record<string, string> = { 'en-US': 'USD', 'de-DE': 'EUR', 'ja-JP': 'JPY' };
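export function formatTransactionAmount(amount: number, locale: string): string {
  const currency = CURRENCY_MAP[locale] ?? 'USD';
  // Reuse the memoized registry instead of constructing a formatter per call.
  const fmt = getFormatter('number', locale, { style: 'currency', currency }) as Intl.NumberFormat;
  return fmt.format(amount);
}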

// Usage — always pass timeZone for SSR-safe dates:
const dateFmt = getFormatter('date', 'de-DE', { dateStyle: 'long', timeZone: 'Europe/Berlin' });

Formatting is the stage most likely to produce a hydration mismatch, because the server's default timezone (often UTC) can differ from the browser's. The fix is an explicit timeZone on every date formatter, not a fallback to client-only rendering.
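
The timezone effect is easy to reproduce with built-in Intl calls alone. In this sketch the same instant is formatted for two zones and falls on different calendar days, which is the disagreement a server and a browser run into when neither pins `timeZone`. The `formatDay` helper and the sample instant are illustrative.

```typescript
// Format a calendar day for an instant in an explicit time zone.
export function formatDay(date: Date, locale: string, timeZone: string): string {
  return new Intl.DateTimeFormat(locale, { dateStyle: 'medium', timeZone }).format(date);
}

const instant = new Date('2024-01-01T00:30:00Z'); // canonical UTC instant

console.log(formatDay(instant, 'en-US', 'UTC'));                 // → Jan 1, 2024
console.log(formatDay(instant, 'en-US', 'America/Los_Angeles')); // → Dec 31, 2023

// Separators and fraction digits come from locale data, not from string substitution.
console.log(new Intl.NumberFormat('de-DE').format(1234567.89)); // → 1.234.567,89
console.log(new Intl.NumberFormat('en-IN').format(1234567.89)); // → 12,34,567.89
console.log(
  new Intl.NumberFormat('ja-JP', { style: 'currency', currency: 'JPY' })
    .resolvedOptions().maximumFractionDigits,
); // → 0
```

Passing the zone as an argument keeps the formatter a pure function of its inputs, so server and client output match whenever both receive the same locale and zone.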

RTL and bidirectional layout

The fifth stage answers which way does the UI flow? For Arabic, Hebrew, Persian, and Urdu the entire layout mirrors. Engineer this at the CSS layer with logical properties (padding-inline-start instead of padding-left, border-inline-end instead of border-right, text-align: start instead of left) so a single dir="rtl" on the document root mirrors the whole tree. A PostCSS plugin such as postcss-logical then compiles the authored logical properties down to physical equivalents at build time, as a fallback for legacy browsers without native support. The systematic approach to mirroring, and the cases logical properties don’t cover (icons, gradients, shadows), are covered under RTL & bidirectional layout engineering.

// postcss.config.js: emit physical fallbacks for authored logical properties
module.exports = {
  plugins: {
    'postcss-logical': { preserve: true },
    'postcss-dir-pseudo-class': {},
    autoprefixer: {},
  },
};

/* Author with logical properties; one dir flip mirrors the whole layout. */
.card {
  padding-inline-start: 1rem;
  border-inline-end: 2px solid var(--border-color);
  text-align: start;
}

Logical properties mirror box layout automatically, but directional content — a back arrow, a progress chevron, a logo with a gradient — still needs deliberate handling. Mirroring an icon that should not mirror (a clock, a checkmark) is as wrong as failing to mirror one that should (a reply arrow). Set dir from the isRTL flag the resolution stage already computed, so direction is a function of the resolved locale, not an ad-hoc per-component decision.
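
Making direction a function of the resolved locale can be as small as the sketch below. The `documentAttrs` helper is hypothetical, and the RTL set simply mirrors the four languages named in this section. A real deployment would likely drive it from the same config that holds the RTL feature flag.

```typescript
// Base languages treated as right-to-left in this sketch (from the list in this section).
const RTL_LANGUAGES = new Set(['ar', 'he', 'fa', 'ur']);

// Compute the attributes for the document root from the resolved BCP 47 tag.
export function documentAttrs(resolvedLocale: string): { lang: string; dir: 'ltr' | 'rtl' } {
  const language = (resolvedLocale.split('-')[0] ?? '').toLowerCase();
  return { lang: resolvedLocale, dir: RTL_LANGUAGES.has(language) ? 'rtl' : 'ltr' };
}

console.log(documentAttrs('ar-EG')); // → { lang: 'ar-EG', dir: 'rtl' }
console.log(documentAttrs('en-US')); // → { lang: 'en-US', dir: 'ltr' }
```

Components then read `dir` from the root instead of deciding direction themselves, which keeps icon mirroring rules in one place.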

Cross-cutting concerns: compliance, accessibility, cache

Three concerns wrap every stage rather than sitting inside one.

Compliance and data residency. The same geo and locale signals that drive resolution can drive regulatory routing — which data region processes the request, whether a strict-consent banner renders, which CMP locale loads. Inject these as headers at the edge so application code reads a decision rather than re-deriving jurisdiction.

// middleware/complianceRouter.ts — derive residency + consent from locale at the edge
import { NextRequest, NextResponse } from 'next/server';

const DATA_RESIDENCY_RULES: Record<string, string> = {
  de: 'eu-west-1', fr: 'eu-west-1', us: 'us-east-1', default: 'global-cdn',
};

export function applyComplianceRouting(req: NextRequest) {
  const locale = req.headers.get('x-locale') || 'default';
  const region = DATA_RESIDENCY_RULES[locale] ?? DATA_RESIDENCY_RULES['default'];
  const res = NextResponse.next();
  res.headers.set('x-data-region', region);
  res.headers.set('x-cmp-locale', locale);
  res.headers.set('x-gdpr-strict', locale === 'de' || locale === 'fr' ? 'true' : 'false');
  return res;
}

Accessibility. Direction and language are not just visual. Set lang on <html> to the resolved tag so screen readers pick the correct pronunciation engine, and set dir so assistive tech reads in the right order. A resolved locale that never reaches the lang attribute is a half-finished resolution.

Edge-cache headers. Every cached response must Vary on the signals that changed the output (Accept-Language, Cookie), and translation assets should carry stale-while-revalidate so a catalog update never causes a synchronous origin stampede. Cache keying is the difference between an i18n pipeline that scales and one that serves German strings to English users from a poisoned cache.
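
As a small sketch of the header policy, the two helpers below build the response headers for a translation asset and a cache key that includes the negotiated locale. The function names and the key layout are assumptions. The default durations are the ones shown in the loader example earlier on this page.

```typescript
// Headers for a cached catalog response: serve stale while revalidating in the background.
export function catalogCacheHeaders(
  maxAgeSeconds = 3600,
  staleWhileRevalidateSeconds = 86400,
): Record<string, string> {
  return {
    'Cache-Control': `public, max-age=${maxAgeSeconds}, stale-while-revalidate=${staleWhileRevalidateSeconds}`,
    Vary: 'Accept-Language, Cookie',
  };
}

// Key cached variants on the negotiated locale so one language never shadows another.
export function localeCacheKey(pathname: string, resolvedLocale: string): string {
  return `${resolvedLocale.toLowerCase()}:${pathname}`;
}
```

Keying on the resolved locale rather than the raw Accept-Language value keeps the number of cached variants equal to the number of supported locales.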

Five non-negotiable engineering principles

A production-ready i18n architecture must hold these five rules without exception:

  1. Build-time extraction over runtime string scanning. Eliminates client-side parsing overhead and guarantees catalog completeness before deploy.
  2. Edge-level locale resolution before application hydration. Prevents flash-of-unlocalized-content and removes client-side redirect latency.
  3. Immutable, semantically versioned translation catalogs. Enables atomic rollbacks and deterministic cache invalidation; a catalog is an artifact, not a mutable file.
  4. Automated CI gates for syntax validation and missing-key detection. Blocks malformed ICU and placeholder drift from reaching staging — wire these in the Translation Workflows documentation.
  5. Cache-aware fallback chains. Ensure graceful degradation without origin fetch storms or 404 cascades; the chain must always terminate at an existing locale.

Enforcing these boundaries is what lets an organization scale localization across dozens of markets while keeping TTFB low (sub-100ms is a reasonable target), avoiding hydration mismatches, and meeting regulatory requirements.
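
The missing-key gate in principle 4 can be approximated with a plain catalog comparison run in CI. This is a sketch under assumptions: the `auditCatalog` name and the report shape are hypothetical, and it checks only key presence and empty values, not ICU syntax or placeholder drift, which need a real message parser.

```typescript
type FlatCatalog = Record<string, string>;

interface CatalogReport {
  missing: string[];   // in the source catalog but absent from the target
  empty: string[];     // present in the target but blank
  orphaned: string[];  // in the target but no longer in the source
}

// Compare a translated catalog against the source catalog.
export function auditCatalog(source: FlatCatalog, target: FlatCatalog): CatalogReport {
  const missing = Object.keys(source).filter((k) => !(k in target));
  const empty = Object.keys(target).filter((k) => k in source && (target[k] ?? '').trim() === '');
  const orphaned = Object.keys(target).filter((k) => !(k in source));
  return { missing, empty, orphaned };
}

// A CI step could fail the build when anything is missing or empty.
export function passesGate(report: CatalogReport): boolean {
  return report.missing.length === 0 && report.empty.length === 0;
}
```

Orphaned keys are reported but do not fail the gate in this sketch, since a stale translation is wasteful rather than user-visible.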

Troubleshooting & gotchas

| Symptom | Root cause + fix |
| --- | --- |
| Infinite redirects on every page load | Middleware re-resolves a path that already carries a locale. Short-circuit with NextResponse.next() when pathname.split('/')[1] is a supported locale. |
| CDN serves the wrong language to some users | Missing Vary: Accept-Language, Cookie. The edge cached one variant under a shared key. Add the Vary header and key cache on the negotiated locale. |
| Hydration mismatch after a locale switch | A component re-derives the locale or formats a date without an explicit timeZone. Pass the resolved locale as a prop and always set timeZone on Intl.DateTimeFormat. |
| Blank UI or raw keys for a partially translated locale | Fallback chain hits a dead end or returns an empty string. Build the chain to always terminate at the default locale and return the first non-empty hit. |
| Wrong plural form in Polish/Arabic | Hardcoded English n === 1 logic instead of CLDR categories. Use ICU plural and let the runtime apply CLDR rules; see the pluralization rules documentation. |
| Icons point the wrong way in RTL | Logical properties mirror the box but not the icon. Mirror directional icons explicitly via [dir="rtl"] and exempt non-directional ones (clocks, checkmarks). |
| Thousands separator looks wrong | A hand-rolled formatter substituted separators. Delete it; let Intl.NumberFormat(locale) choose grouping per CLDR. |
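
The plural row is the easiest one to verify locally, because the CLDR categories are exposed through the built-in `Intl.PluralRules`. The sketch below shows why an `n === 1` check fails outside English. The `pluralCategory` wrapper is a hypothetical convenience around the standard API.

```typescript
// Ask the runtime which CLDR plural category a number falls into for a locale.
export function pluralCategory(locale: string, n: number): string {
  return new Intl.PluralRules(locale).select(n);
}

console.log(pluralCategory('en', 1)); // → one
console.log(pluralCategory('en', 2)); // → other
console.log(pluralCategory('pl', 2)); // → few
console.log(pluralCategory('pl', 5)); // → many
console.log(pluralCategory('ar', 0)); // → zero
console.log(pluralCategory('ar', 2)); // → two
```

An ICU plural message needs a branch for every category the target locale uses, which is why the source catalog's one/other pair is not enough for Polish or Arabic translations.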

FAQ

In what order should locale signals be evaluated?

URL prefix first, then an explicit cookie or header override, then a q-weighted parse of Accept-Language, then geo-IP as a last resort. Resolve at the edge before hydration and persist the result in a cookie so subsequent requests are a cheap path match rather than a re-resolution. Any other order fragments your cache or creates redirect loops.

Why externalize messages with an AST parser instead of regex?

Regex extraction misses template literals, dynamically built ids, and strings split across expressions, so hardcoded text leaks to production. AST parsers like formatjs and Lingui walk the real syntax tree, catch every call site, and let you hash message content into a stable id that dedupes identical strings and forces re-translation when a source string changes.

How do I stop a missing translation from showing a blank UI?

Build an explicit fallback chain, for example en-US → en → default locale, deduped and guaranteed to terminate at a locale that exists. Resolve each key against the chain in order and return the first non-empty hit. Never return an empty string or a raw key, and never let a region-only catalog dead-end at a 404.

What causes a hydration mismatch after switching locale?

Almost always a component re-deriving the locale or formatting a date without an explicit timeZone, so the server and client disagree on the inputs. Propagate the resolved locale as an explicit prop or context value, pass timeZone to every Intl.DateTimeFormat, and normalize instants to UTC before formatting. The Framework i18n & Component Routing documentation covers the framework-specific cases.

Are logical CSS properties enough for full RTL support?

They handle box mirroring — margins, padding, borders, text alignment — when you set dir="rtl" on the root. They do not handle directional content: icons, gradients, shadows, and any asset with inherent left/right meaning still need deliberate mirroring, and non-directional icons must be exempted. Drive dir from the isRTL flag the resolution stage already computed.

Part of the i18n & l10n Pipelines documentation.