About

Gettext PO files remain the dominant format for managing translations in GNU-based projects, WordPress plugins, and many PHP frameworks. JavaScript i18n libraries such as Jed, i18next, and FormatJS, however, expect JSON. Manual conversion can introduce encoding errors, drop plural forms, and silently corrupt msgctxt scoping. This tool implements a strict line-by-line state machine parser that handles multi-line concatenated strings, plural indices msgstr[n], obsolete entries, and header metadata extraction. It outputs valid JSON in four formats: raw key-value, Jed 1.x, messageformat, and po2json default. All processing runs locally in your browser; no files are uploaded to any server.

Limitation: fuzzy-flagged entries (marked #, fuzzy) are excluded from output by default, matching the default behavior of GNU msgfmt, which skips fuzzy entries unless --use-fuzzy is given. PO files referencing external .mo binaries are not supported. The parser assumes UTF-8 encoding. Pro tip: always verify that your PO file compiles cleanly with msgfmt -c before converting, as syntax errors in the source propagate silently into malformed JSON keys.


Formulas

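The PO parser operates as a finite state machine with states $S_{\text{idle}}$, $S_{\text{msgctxt}}$, $S_{\text{msgid}}$, $S_{\text{msgid\_plural}}$, $S_{\text{msgstr}}$. Transitions occur on keyword detection at line start.

$$
\text{parse}(\text{line}) =
\begin{cases}
S \to S_{\text{msgctxt}} & \text{if line starts with } \texttt{msgctxt} \\
S \to S_{\text{msgid}} & \text{if line starts with } \texttt{msgid} \\
S \to S_{\text{msgstr}} & \text{if line starts with } \texttt{msgstr} \\
\text{append to current buffer} & \text{if line starts with } \texttt{"} \\
\text{emit entry and reset} & \text{if blank line}
\end{cases}
$$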
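The transition rules above can be sketched in a few lines of Python. This is a minimal illustration, not the tool's actual implementation: the function name `parse_po`, the dict layout of an entry, and the choice to skip all `#` lines are assumptions made for the example. Note that `msgid_plural` is listed before `msgid` in the pattern because both share a prefix, and that strings are left escaped here.

```python
import re

# Keyword at line start, optional plural index, then a quoted string.
# msgid_plural comes before msgid because both share the same prefix.
LINE = re.compile(r'^(msgctxt|msgid_plural|msgid|msgstr)(?:\[(\d+)\])?\s+"(.*)"$')


def parse_po(text):
    """Hypothetical helper: return a list of {field: string} dicts."""
    entries = []
    entry = {}     # fields collected for the entry in progress
    state = None   # None plays the role of the idle state

    def emit():
        nonlocal entry, state
        if "msgid" in entry:
            entries.append(entry)
        entry, state = {}, None

    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            emit()                       # blank line: emit entry and reset
        elif line.startswith("#"):
            pass                         # comments, flags, obsolete entries
        elif state is not None and len(line) >= 2 \
                and line.startswith('"') and line.endswith('"'):
            entry[state] += line[1:-1]   # continuation: append to buffer
        else:
            m = LINE.match(line)
            if m is None:
                raise ValueError(f"unexpected line: {raw!r}")
            keyword, index, body = m.groups()
            state = keyword if index is None else f"{keyword}[{index}]"
            entry[state] = body
    emit()                               # flush the last entry at end of input
    return entries
```

Calling `parse_po` on a file's text yields one dict per entry, with plural translations stored under keys such as `msgstr[0]` and `msgstr[1]`.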

For the Jed output format, the compound key is constructed as:

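$$
\text{key} =
\begin{cases}
\text{msgctxt} + \texttt{"\textbackslash u0004"} + \text{msgid} & \text{if msgctxt} \neq \text{NULL} \\
\text{msgid} & \text{otherwise}
\end{cases}
$$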

Where msgctxt is the disambiguation context, msgid is the source string, and "\u0004" is the EOT character used as separator per Jed specification. The value for plural entries becomes an array: [msgid_plural, msgstr[0], msgstr[1], …, msgstr[n]].
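As a small sketch of the key and value construction just described, the two rules could be written as follows. The function names `jed_key` and `plural_value` are hypothetical, chosen only for this example.

```python
EOT = "\u0004"  # separator between context and msgid


def jed_key(msgid, msgctxt=None):
    """Compound key: context + EOT + msgid when a context is present."""
    return msgid if msgctxt is None else msgctxt + EOT + msgid


def plural_value(msgid_plural, msgstrs):
    """Array layout for plural entries: [msgid_plural, msgstr[0], ..., msgstr[n]]."""
    return [msgid_plural, *msgstrs]
```

For example, `jed_key("Open", "menu")` gives the string `menu`, the EOT character, then `Open`, while `jed_key("Open")` returns `Open` unchanged.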

String unescaping applies the mapping: \n → newline, \t → tab, \\ → backslash, \" → double quote.
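One way to apply this mapping is a single left-to-right pass, so that an escaped backslash followed by `n` is not mistaken for a newline. The sketch below is illustrative; the name `unescape` and the decision to leave unknown escapes untouched are assumptions, not documented behavior of the tool.

```python
import re

# The four escapes listed above, keyed by the character after the backslash.
ESCAPES = {"n": "\n", "t": "\t", "\\": "\\", '"': '"'}
_ESCAPE_RE = re.compile(r"\\(.)", re.DOTALL)


def unescape(s):
    """Replace each backslash escape in one pass; unknown escapes are kept as-is."""
    return _ESCAPE_RE.sub(lambda m: ESCAPES.get(m.group(1), m.group(0)), s)
```

Chained `str.replace` calls would get the backslash case wrong depending on their order, which is why a single regex pass is used here.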

Reference Data

| PO Directive | Description | JSON Mapping | Example |
|---|---|---|---|
| msgid | Source string (key) | Object key | msgid "Hello" |
| msgstr | Translated string (value) | Object value (string) | msgstr "Hola" |
| msgctxt | Disambiguation context | Compound key: ctx\u0004msgid | msgctxt "menu" |
| msgid_plural | Plural source form | Array value in Jed format | msgid_plural "%d items" |
| msgstr[0] | Singular translation | Array index 0 | msgstr[0] "%d elemento" |
| msgstr[N] | Nth plural form | Array index N | msgstr[2] "%d elementos" |
| #: | Source code reference | Ignored in output | #: src/app.js:42 |
| #. | Extracted comment | Ignored in output | #. Translator note |
| #, | Flags (fuzzy, c-format) | fuzzy → entry skipped | #, fuzzy |
| #~ | Obsolete entry | Excluded from output | #~ msgid "old" |
| "" (empty msgid) | PO header metadata | Parsed for Plural-Forms | "Plural-Forms: nplurals=2;..." |
| Plural-Forms | Header: plural rule | Jed plural_forms field | nplurals=3; plural=(n%10==1 ...) |
| Content-Type | Header: charset | Informational only | text/plain; charset=UTF-8 |
| Language | Header: target language | Jed lang field | Language: es_ES |
| Multi-line strings | Adjacent "..." lines concatenated | Single string value | "line1" + "line2" |
| Format: raw | Simple key → value map | {"Hello": "Hola"} | Flat object |
| Format: jed | Jed 1.x compatible | Nested with domain, locale_data | Array values for plurals |
| Format: jed1.x | Same as jed | Includes header entry | Used by wp-i18n |
| Format: mf | MessageFormat style | Key → string (singular only) | No plural arrays |
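The header rows in the table (Plural-Forms, Content-Type, Language) all come from the msgstr of the entry with an empty msgid, which holds one "Name: value" pair per line. A rough sketch of extracting them follows; the helper names `parse_header` and `nplurals` are hypothetical, and the input is assumed to be the already unescaped header string.

```python
import re


def parse_header(msgstr):
    """Split the unescaped header msgstr into a {field: value} dict."""
    fields = {}
    for line in msgstr.split("\n"):
        name, sep, value = line.partition(":")  # split at the first colon only
        if sep:
            fields[name.strip()] = value.strip()
    return fields


def nplurals(plural_forms):
    """Return the declared number of plural forms, or None if absent."""
    m = re.search(r"nplurals\s*=\s*(\d+)", plural_forms)
    return int(m.group(1)) if m else None
```

The resulting dict gives direct access to the values that the table maps to the Jed plural_forms and lang fields.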

Frequently Asked Questions

Entries marked with #, fuzzy are excluded from the JSON output by default. This matches the behavior of GNU msgfmt, which treats fuzzy entries as unfinished translations. In the converter settings, you can toggle "Include fuzzy" to override this behavior, but be aware that shipping fuzzy translations to production can cause untranslated or partially translated UI elements.
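The default-skip rule with an opt-in override can be sketched as below. The entry layout (a `flags` list holding the raw `#,` lines) and the names `is_fuzzy`, `filter_entries`, and `include_fuzzy` are assumptions for illustration; the tool's internal representation is not documented here.

```python
def is_fuzzy(flag_lines):
    """True if any '#,' flag line attached to the entry lists 'fuzzy'."""
    for line in flag_lines:
        if line.startswith("#,"):
            flags = [f.strip() for f in line[2:].split(",")]
            if "fuzzy" in flags:
                return True
    return False


def filter_entries(entries, include_fuzzy=False):
    """Drop fuzzy entries unless include_fuzzy is set."""
    return [
        e for e in entries
        if include_fuzzy or not is_fuzzy(e.get("flags", []))
    ]
```

Splitting on commas matters because a flag line can carry several flags at once, for example `#, fuzzy, c-format`.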
In raw format, only msgstr[0] (the singular form) is used as the value. Plural information is lost. If your application requires plural support, use the Jed format, which stores translations as arrays: [msgid_plural, msgstr[0], msgstr[1], ...]. The Plural-Forms header is preserved in the Jed output under plural_forms.
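To make the difference concrete, here is a hedged sketch of both outputs for the same entries. The nested layout (a `domain` field, `locale_data` keyed by domain, and a header object under the empty key) follows the description in this document; the function names, the entry dict shape, and the default domain `messages` are assumptions, and context handling is left out for brevity.

```python
def to_raw(entries):
    """Flat map: only msgstr[0] survives, plural forms are dropped."""
    return {e["msgid"]: e["msgstr"][0] for e in entries}


def to_jed(entries, domain="messages", lang="", plural_forms=""):
    """Nested Jed-style object; the header is stored under the empty key."""
    table = {"": {"domain": domain, "lang": lang, "plural_forms": plural_forms}}
    for e in entries:
        # Array layout from the text: [msgid_plural, msgstr[0], msgstr[1], ...]
        table[e["msgid"]] = [e.get("msgid_plural"), *e["msgstr"]]
    return {"domain": domain, "locale_data": {domain: table}}
```

With one plural entry as input, `to_raw` keeps a single string per key while `to_jed` keeps every plural form together with the Plural-Forms rule needed to choose between them.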
Multi-line strings are supported. In PO files, a multi-line string is conventionally written as an empty initial string followed by continuation lines. For example: msgid "" followed by "first line\n" and "second line". The parser concatenates all adjacent quoted lines and applies escape sequence processing (\n, \t, \\, \") to produce the final string.
In Jed format, context is prepended to the key using the EOT separator character (U+0004): context\u0004msgid. This is the standard used by Jed, wp-i18n, and related JavaScript i18n libraries. In raw format, context is prepended with a configurable separator (default: \u0004). In mf format, context is discarded.
The converter reads files as UTF-8 using the browser's FileReader API. If your PO file uses a different charset (e.g., ISO-8859-1), the Content-Type header inside the PO file will be noted but the browser may misinterpret byte sequences. Convert your PO file to UTF-8 before using this tool. Most modern translation tools (Poedit, Weblate, Crowdin) default to UTF-8.
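If a file is not already UTF-8, one possible pre-conversion step is to decode it with the charset its own header declares and re-encode it. The sketch below assumes an ASCII-compatible source encoding and a real charset name in the header (not the CHARSET placeholder found in templates); the function name `po_bytes_to_utf8` is hypothetical.

```python
import re

_CHARSET = r"charset=[A-Za-z0-9_.:-]+"


def po_bytes_to_utf8(data, fallback="utf-8"):
    """Decode PO bytes using the declared charset and return UTF-8 bytes."""
    m = re.search(_CHARSET.encode("ascii"), data)
    # Take the name after 'charset=' or fall back when no header is present.
    charset = m.group(0).decode("ascii").split("=", 1)[1] if m else fallback
    text = data.decode(charset)
    # Rewrite the header so the declared charset matches the new encoding.
    text = re.sub(_CHARSET, "charset=UTF-8", text, count=1)
    return text.encode("utf-8")
```

Writing the returned bytes to a new .po file produces input the converter can read directly. GNU gettext's msgconv tool, with its --to-code option, is a command-line alternative for the same job.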
Multiple files can be converted at once. You can drag and drop or select several .po files simultaneously. Each file is parsed independently and produces a separate JSON output, and all results can be downloaded as individual JSON files. The domain name setting applies to all files unless a file's header contains a distinct Project-Id-Version, in which case that value is used as the domain for that file.
There is no hard limit on file size, but files larger than 10 MB trigger a warning. PO files for typical projects range from 10 KB to 2 MB. Very large files (thousands of entries) are parsed in chunks using setTimeout batching to prevent the browser from becoming unresponsive. Parsing a 5 MB file typically completes in under 500 ms.