GChemPaint (.gchp) to JSON Converter
Convert GChemPaint (.gchp) XML chemical structure files to structured JSON with atoms, bonds, coordinates, and molecular data. Free online tool.
Maximum file size: 5 MB
About
GChemPaint stores molecular structures in an XML dialect that encodes atoms (element symbol, 2D coordinates x and y, formal charge) and bonds (bond order, connectivity pairs). Downstream toolchains such as visualization libraries, cheminformatics pipelines, and web renderers commonly expect JSON input. Manual transcription of even a modest molecule like caffeine (14 non-hydrogen atoms, 15 bonds between them) is error-prone: a single transposed atom index silently corrupts the topology graph. This converter parses the full GChemPaint XML DOM, resolves internal atom-ID references to zero-based indices, and emits a clean JSON document ready for programmatic consumption.
The parser handles multi-molecule documents, bond stereochemistry annotations (wedge, dash, hash), isotope mass numbers, and formal charges. It follows standard GChemPaint export conventions only approximately, so files that depart from those conventions may not convert exactly. Note: GChemPaint files do not store 3D coordinates; only 2D layout positions are extracted. Pro tip: validate your source XML in GChemPaint before converting to catch orphaned bond references, which would otherwise appear as index −1 in the output.
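Orphaned references can also be spotted after conversion. The sketch below is a minimal check, assuming the converted JSON has already been loaded into a Python dictionary and uses the key names listed under Reference Data (molecules, bonds, begin, end). The function name find_broken_bonds and the example data are hypothetical, not part of the converter.

```python
def find_broken_bonds(doc):
    """Return (molecule id, bond id) pairs whose begin or end index is -1."""
    broken = []
    for mol in doc.get("molecules", []):
        for bond in mol.get("bonds", []):
            if bond.get("begin") == -1 or bond.get("end") == -1:
                broken.append((mol.get("id"), bond.get("id")))
    return broken


# Hypothetical converted output: bond b2 points at an atom that was missing.
example = {
    "molecules": [
        {
            "id": "m1",
            "atoms": [{"id": "a1"}, {"id": "a2"}],
            "bonds": [
                {"id": "b1", "begin": 0, "end": 1},
                {"id": "b2", "begin": 1, "end": -1},
            ],
        }
    ]
}

print(find_broken_bonds(example))  # [('m1', 'b2')]
```

An empty list means every bond endpoint resolved to a real atom index.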
Formulas
The converter performs a deterministic mapping from XML DOM nodes to JSON objects. No mathematical transformation is applied to coordinates; they are extracted verbatim. The core logic resolves bond endpoint references from string atom IDs to integer array indices.
index(atomRef) = i where atoms[i].id = atomRef, i ∈ [0, n − 1]
Where atomRef is the string value of the begin or end attribute on a <bond> element, and n is the total atom count in that molecule. If no match is found, the index defaults to −1, signaling a broken reference.
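The lookup described above can be sketched in a few lines of Python. This is an illustration of the rule, not the converter's own source; the function name resolve_index and the sample atom list are assumptions made for the example.

```python
def resolve_index(atom_ref, atoms):
    """Return the zero-based position of the atom whose id equals atom_ref.

    Returns -1 when no atom matches, which marks a broken reference.
    """
    for i, atom in enumerate(atoms):
        if atom["id"] == atom_ref:
            return i
    return -1


# Hypothetical atom list for one molecule.
atoms = [{"id": "a1"}, {"id": "a2"}, {"id": "a3"}]

print(resolve_index("a3", atoms))  # 2
print(resolve_index("a9", atoms))  # -1
```

A linear scan keeps the sketch close to the formula. For large molecules, building a dictionary from atom id to index once per molecule gives the same result with fewer comparisons.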
order = parseInt(orderAttr) with fallback 1 (single bond)
Bond order defaults to 1 when the attribute is absent or non-numeric, matching GChemPaint's implicit single-bond convention. Element symbols default to "C" (carbon) when the Element attribute is omitted, consistent with organic chemistry skeletal notation.
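Both fallbacks can be illustrated as follows. This is a sketch under stated assumptions: the helper names parse_order and element_symbol are hypothetical, and Python's int() is stricter than JavaScript's parseInt (it rejects a string such as "2x", which parseInt would read as 2), so edge cases may differ from the converter's behavior.

```python
def parse_order(order_attr):
    """Parse a bond order attribute, falling back to 1 (single bond)."""
    try:
        return int(order_attr)
    except (TypeError, ValueError):
        # Attribute absent (None) or not an integer string.
        return 1


def element_symbol(attrs):
    """Read the element symbol, accepting either attribute spelling.

    Falls back to "C" when neither attribute is present.
    """
    return attrs.get("Element") or attrs.get("element") or "C"


print(parse_order("2"))                # 2
print(parse_order(None))               # 1
print(parse_order("aromatic"))         # 1
print(element_symbol({"element": "N"}))  # N
print(element_symbol({}))              # C
```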
Reference Data
| GChemPaint XML Element | Attribute | JSON Output Key | Type | Description |
|---|---|---|---|---|
| <atom> | id | atoms[i].id | string | Internal atom identifier (e.g., "a1") |
| <atom> | Element / element | atoms[i].element | string | Element symbol: C, N, O, S, etc. |
| <atom> | x | atoms[i].x | number | 2D x-coordinate in GChemPaint units |
| <atom> | y | atoms[i].y | number | 2D y-coordinate in GChemPaint units |
| <atom> | Charge / charge | atoms[i].charge | integer | Formal charge (e.g., −1, +1) |
| <atom> | A (mass number) | atoms[i].isotope | integer | Isotope mass number if specified |
| <atom> | Hydrogens | atoms[i].hydrogens | integer | Explicit hydrogen count |
| <bond> | id | bonds[j].id | string | Internal bond identifier (e.g., "b1") |
| <bond> | order | bonds[j].order | integer | 1 = single, 2 = double, 3 = triple |
| <bond> | begin | bonds[j].begin | integer | Zero-based index into atoms array |
| <bond> | end | bonds[j].end | integer | Zero-based index into atoms array |
| <bond> | type | bonds[j].type | string | "normal", "wedge", "dash", "hash" |
| <molecule> | id | molecules[k].id | string | Molecule-level identifier |
| <molecule> | - | molecules[k].atoms | array | Array of atom objects in this molecule |
| <molecule> | - | molecules[k].bonds | array | Array of bond objects in this molecule |
| Document root | - | molecules | array | Top-level array of all molecules |
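
To show how the mapping in the table fits together, here is a minimal end-to-end sketch using Python's standard xml.etree.ElementTree and json modules. Several things are assumptions rather than facts about the converter: the sample XML is hand-written from the attribute names in the table (it is not a real GChemPaint export, and the root tag chemistry is a placeholder), tags are assumed to carry no XML namespace, and the defaults of 0 for a missing charge or coordinate and "normal" for a missing bond type are choices made for the example. Isotope and hydrogen counts are left out for brevity.

```python
import json
import xml.etree.ElementTree as ET

# Hypothetical input built from the attribute names in the table above.
SAMPLE = """
<chemistry>
  <molecule id="m1">
    <atom id="a1" x="100" y="150"/>
    <atom id="a2" element="O" x="130" y="150" charge="-1"/>
    <bond id="b1" begin="a1" end="a2"/>
    <bond id="b2" order="2" begin="a2" end="a7"/>
  </molecule>
</chemistry>
"""


def int_or(value, default):
    """Parse an integer attribute, returning default if absent or invalid."""
    try:
        return int(value)
    except (TypeError, ValueError):
        return default


def convert(xml_text):
    root = ET.fromstring(xml_text.strip())
    molecules = []
    for mol in root.iter("molecule"):
        atoms = []
        for a in mol.iter("atom"):
            atoms.append({
                "id": a.get("id"),
                "element": a.get("Element") or a.get("element") or "C",
                "x": float(a.get("x", 0)),
                "y": float(a.get("y", 0)),
                "charge": int_or(a.get("Charge") or a.get("charge"), 0),
            })
        # Map each atom id to its first position in the atoms array.
        index_of = {}
        for i, atom in enumerate(atoms):
            index_of.setdefault(atom["id"], i)
        bonds = []
        for b in mol.iter("bond"):
            bonds.append({
                "id": b.get("id"),
                "order": int_or(b.get("order"), 1),
                "begin": index_of.get(b.get("begin"), -1),
                "end": index_of.get(b.get("end"), -1),
                "type": b.get("type", "normal"),
            })
        molecules.append({"id": mol.get("id"), "atoms": atoms, "bonds": bonds})
    return {"molecules": molecules}


doc = convert(SAMPLE)
print(json.dumps(doc, indent=2))
```

In this sample, bond b2 refers to an atom id that does not exist, so its end index comes out as -1, which is the broken-reference signal described under Formulas.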