About

TTML (Timed Text Markup Language), the W3C standard formerly known as DFXP, encodes subtitle data in XML with a styling model that does not map directly to WebVTT. A careless conversion loses italic markers, drops alignment cues, or miscalculates end times by treating dur as an absolute timestamp instead of adding it to begin. This tool parses the full TTML DOM, including <styling> blocks and nested <span> elements, resolves style references by id, and emits spec-compliant WebVTT with proper cue tags. It handles offset-time formats (12.5s, 500ms) and clock-time with frames. The conversion runs entirely in the browser with no server round-trip. Limitation: TTML region-based positioning is approximated because WebVTT positioning semantics differ from TTML's region model.

Formulas

The core timing computation converts a TTML duration-based cue into an absolute end timestamp:

t_end = t_begin + t_dur

where t_begin is the parsed begin attribute in milliseconds, and t_dur is the parsed dur attribute in milliseconds. If an end attribute is present instead, it is used directly as t_end.
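
The end-time rule can be sketched in a few lines of Python. This is an illustration, not the tool's own browser code: the function name resolve_end_ms is hypothetical, and the precedence of end over dur and the skip when both are missing follow the behavior described in the FAQ section.

```python
from typing import Optional


def resolve_end_ms(
    begin_ms: int,
    dur_ms: Optional[int] = None,
    end_ms: Optional[int] = None,
) -> Optional[int]:
    """Return the absolute cue end in milliseconds, or None if the cue has no end."""
    if end_ms is not None:
        # An explicit end attribute is used directly (and wins over dur).
        return end_ms
    if dur_ms is not None:
        # dur is relative, so it is added to begin: t_end = t_begin + t_dur.
        return begin_ms + dur_ms
    # Neither attribute is present: the caller should skip the cue and warn.
    return None
```

A cue with begin at 1000 ms and dur of 5000 ms ends at 6000 ms; the same cue with end set to 10000 ms ends at 10000 ms regardless of dur.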

Timestamp parsing normalizes all TTML time expressions to milliseconds:

parse(clock) → H × 3600000 + M × 60000 + S × 1000 + ms

For frame-based timestamps (HH:MM:SS:FF), the frame count F is converted assuming a default frame rate of 30 fps:

ms = (F / 30) × 1000

For offset-time expressions, the numeric value is multiplied by the unit factor: h → 3600000, m → 60000, s → 1000, ms → 1.
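
The three parsing rules above fit into one small function. The following Python sketch is illustrative only: parse_ttml_time and its fps parameter are hypothetical names, the regular expressions cover just the formats listed on this page, and fractional results are rounded to the nearest millisecond, which is an assumption.

```python
import re

# Unit factors for offset-time expressions, in milliseconds.
UNIT_MS = {"h": 3_600_000, "m": 60_000, "s": 1_000, "ms": 1}


def parse_ttml_time(expr: str, fps: float = 30.0) -> int:
    """Normalize a TTML time expression to whole milliseconds."""
    expr = expr.strip()

    # Offset-time: a number followed by a unit, e.g. "12.5s" or "500ms".
    m = re.fullmatch(r"(\d+(?:\.\d+)?)(ms|h|m|s)", expr)
    if m:
        return round(float(m.group(1)) * UNIT_MS[m.group(2)])

    # Clock-time with frames: HH:MM:SS:FF, frames converted at the given rate.
    m = re.fullmatch(r"(\d+):(\d{2}):(\d{2}):(\d+)", expr)
    if m:
        h, mi, s, f = (int(g) for g in m.groups())
        return h * 3_600_000 + mi * 60_000 + s * 1_000 + round(f / fps * 1000)

    # Clock-time with an optional fraction: HH:MM:SS or HH:MM:SS.mmm.
    m = re.fullmatch(r"(\d+):(\d{2}):(\d{2})(?:\.(\d+))?", expr)
    if m:
        h, mi, s = int(m.group(1)), int(m.group(2)), int(m.group(3))
        ms = round(float("0." + (m.group(4) or "0")) * 1000)
        return h * 3_600_000 + mi * 60_000 + s * 1_000 + ms

    raise ValueError(f"unrecognized TTML time expression: {expr!r}")
```

With the default rate, parse_ttml_time("00:01:23:15") gives 83500 and parse_ttml_time("12.5s") gives 12500. Passing fps=25 for the same frame timestamp gives 83600, which shows how much the assumed frame rate matters.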

Style resolution follows a lookup chain: each style attribute value on a <p> or <span> is matched against the id of <style> elements in the TTML <head>. The resolved properties are then mapped to WebVTT cue tags: tts:fontStyle="italic" → <i>, tts:fontWeight="bold" → <b>, tts:textDecoration="underline" → <u>.
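
A minimal Python sketch of this lookup follows, using the standard library's ElementTree in place of the browser DOM. The names build_style_table and wrap_with_style are hypothetical, the ids are assumed to be written as xml:id (as TTML documents normally do), and only the three properties named above are mapped.

```python
import xml.etree.ElementTree as ET

TT = "{http://www.w3.org/ns/ttml}"
TTS = "{http://www.w3.org/ns/ttml#styling}"
XML_ID = "{http://www.w3.org/XML/1998/namespace}id"

# (styling attribute, value) -> WebVTT cue tag name
TAG_FOR = {
    ("fontStyle", "italic"): "i",
    ("fontWeight", "bold"): "b",
    ("textDecoration", "underline"): "u",
}


def build_style_table(root):
    """Map each <style> id to the list of WebVTT tags that style implies."""
    table = {}
    for style in root.iter(TT + "style"):
        style_id = style.get(XML_ID)
        if style_id is None:
            continue
        table[style_id] = [
            tag
            for (attr, value), tag in TAG_FOR.items()
            if style.get(TTS + attr) == value
        ]
    return table


def wrap_with_style(text, style_ref, table):
    """Wrap text in the cue tags of every style id listed in style_ref."""
    tags = []
    for style_id in style_ref.split():
        for tag in table.get(style_id, []):
            if tag not in tags:
                tags.append(tag)
    # Apply innermost tag first so the first resolved tag ends up outermost.
    for tag in reversed(tags):
        text = f"<{tag}>{text}</{tag}>"
    return text
```

Unknown style ids resolve to no tags here, so the text passes through unchanged; whether the real converter warns in that case is not stated.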

Reference Data

TTML Feature	TTML Syntax	WebVTT Equivalent	Support Status
Italic text	`tts:fontStyle="italic"`	`<i>...</i>`	Full
Bold text	`tts:fontWeight="bold"`	`<b>...</b>`	Full
Underline	`tts:textDecoration="underline"`	`<u>...</u>`	Full
Text alignment	`tts:textAlign="left|center|right"`	`align:left|center|right`	Full
Font color	`tts:color="#RRGGBB"`	`<c.colorname>` or inline	Mapped
Background color	`tts:backgroundColor`	`<c>` with class	Approximated
Duration attribute	`dur="00:00:05.000"`	Computed end time	Full
End attribute	`end="00:00:10.000"`	Direct end time	Full
Clock-time format	`HH:MM:SS.mmm`	`HH:MM:SS.mmm`	Full
Clock-time with frames	`HH:MM:SS:FF`	Converted to `.mmm`	Full (assumes 30 fps)
Offset-time seconds	`12.5s`	Converted to `HH:MM:SS.mmm`	Full
Offset-time milliseconds	`500ms`	Converted to `HH:MM:SS.mmm`	Full
Offset-time hours	`2.5h`	Converted to `HH:MM:SS.mmm`	Full
Offset-time minutes	`5m`	Converted to `HH:MM:SS.mmm`	Full
Nested `<span>`	Inline style spans	Nested WebVTT tags	Full
Line breaks	`<br />`	Newline character	Full
Region positioning	`<region>` with origin/extent	`position`/`line` settings	Approximated
Font size	`tts:fontSize`	Not supported in WebVTT	Dropped
Font family	`tts:fontFamily`	Not supported in WebVTT	Dropped
Writing mode	`tts:writingMode`	`vertical` cue setting	Partial
Opacity	`tts:opacity`	Not supported in WebVTT	Dropped
Multiple `<div>` blocks	Separate content divisions	Sequential cues	Full
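
Several rows above end in "Converted to HH:MM:SS.mmm". Once a time has been normalized to milliseconds, that formatting step might look like the Python sketch below; the function name format_vtt_timestamp is hypothetical and negative inputs are rejected as an assumption.

```python
def format_vtt_timestamp(total_ms: int) -> str:
    """Format a millisecond count as a WebVTT timestamp, HH:MM:SS.mmm."""
    if total_ms < 0:
        raise ValueError("timestamps cannot be negative")
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    seconds, millis = divmod(rem, 1_000)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d}.{millis:03d}"
```

For example, 83500 ms formats as 00:01:23.500 and 9000000 ms (the 2.5h offset) as 02:30:00.000.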

Frequently Asked Questions

When a TTML <p> element has a dur attribute, the converter adds that duration to the begin timestamp to compute the end time. When an end attribute is present, it is used directly. If both are present, end takes precedence. If neither exists, the cue is skipped with a warning.

For clock-time with frames (HH:MM:SS:FF), the converter detects the colon-separated frame component and converts it to milliseconds assuming a default frame rate of 30 fps. For example, 00:01:23:15 becomes 00:01:23.500. If your source uses a different frame rate (24, 25, 29.97), the converted fraction will be wrong; at 24 fps the error can reach about 190 ms per timestamp. For precise results, rewrite the source timestamps in fractional clock-time format (HH:MM:SS.mmm) before converting.

Nested <span> styling is preserved. The converter recursively walks the child nodes of each <p> element. A <span> referencing a style with tts:fontStyle="italic" produces <i>...</i> in the WebVTT output. Multiple nesting levels (e.g., bold inside italic) produce nested tags like <i><b>text</b></i>.
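
The recursive walk can be sketched in Python with ElementTree standing in for the browser DOM. To keep the example short it reads styling attributes placed directly on each element, which assumes any id-based style references have already been resolved; render and inline_tags are hypothetical names.

```python
import xml.etree.ElementTree as ET

TT = "{http://www.w3.org/ns/ttml}"
TTS = "{http://www.w3.org/ns/ttml#styling}"


def inline_tags(elem):
    """Return the WebVTT tags implied by styling attributes on this element."""
    tags = []
    if elem.get(TTS + "fontStyle") == "italic":
        tags.append("i")
    if elem.get(TTS + "fontWeight") == "bold":
        tags.append("b")
    if elem.get(TTS + "textDecoration") == "underline":
        tags.append("u")
    return tags


def render(elem):
    """Render an element and its descendants as WebVTT cue text."""
    if elem.tag == TT + "br":
        return "\n"  # a TTML line break becomes a newline
    # Own text, then each child followed by the text that trails it.
    body = (elem.text or "") + "".join(
        render(child) + (child.tail or "") for child in elem
    )
    for tag in reversed(inline_tags(elem)):
        body = f"<{tag}>{body}</{tag}>"
    return body
```

Because each level wraps only its own content, a bold span inside an italic span comes out with the bold tag nested inside the italic one, with no special handling for depth.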

WebVTT cue text cannot express font size, font family, or opacity, so these properties are silently dropped. Background color is only approximated through a <c> class. Text alignment (left, center, right) is preserved via the align cue setting. Region-based positioning is approximated but may not match the TTML layout exactly.

Multiple <div> blocks are supported. The converter queries all <p> elements across all <div> containers within the <body>. Cues are emitted in document order. Separate <div> elements do not produce separate WebVTT files; all cues are merged into a single output.
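
As a rough Python illustration of the merge, the sketch below gathers every paragraph under the body in document order and writes one WebVTT file. The names collect_cues and merge_to_vtt are hypothetical, and for brevity the timestamps are assumed to be in HH:MM:SS.mmm form already, with an explicit end on every cue.

```python
import xml.etree.ElementTree as ET

TT = "{http://www.w3.org/ns/ttml}"


def collect_cues(ttml_text):
    """Return (begin, end, text) for every <p> under <body>, in document order."""
    root = ET.fromstring(ttml_text)
    body = root.find(TT + "body")
    if body is None:
        return []
    # iter() walks depth-first, so cues from all <div> blocks stay in order.
    return [
        (p.get("begin"), p.get("end"), "".join(p.itertext()))
        for p in body.iter(TT + "p")
    ]


def merge_to_vtt(ttml_text):
    """Emit a single WebVTT document containing every collected cue."""
    lines = ["WEBVTT", ""]
    for begin, end, text in collect_cues(ttml_text):
        lines += [f"{begin} --> {end}", text, ""]
    return "\n".join(lines)
```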

The tool uses the browser's native DOMParser with an XML content type. If the input contains malformed XML (unclosed tags, invalid entities), the parser does not throw; it returns an error document containing a <parsererror> element. The converter detects this and displays a specific error message indicating the parse failure location when available.
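
The same idea can be shown outside the browser. The Python sketch below is an analogue, not the tool's DOMParser code: ElementTree raises ParseError rather than returning an error document, and the helper name parse_or_report and the message wording are hypothetical.

```python
import xml.etree.ElementTree as ET


def parse_or_report(text):
    """Return (root, None) on success or (None, message) on malformed XML."""
    try:
        return ET.fromstring(text), None
    except ET.ParseError as err:
        # ParseError carries the (line, column) where parsing failed.
        line, column = err.position
        return None, f"XML parse error at line {line}, column {column}: {err}"
```

Calling parse_or_report on input with an unclosed tag returns no root and a message that names the line and column of the failure.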