About

Markdown is a lightweight markup language with roughly 30 formatting primitives. Converting it to PDF requires a full text layout engine: glyph-width computation, word wrapping within the 595.28 pt width of an A4 page, line-height stacking, and page-break insertion when the vertical cursor y drops below the bottom margin. A naive conversion that ignores font metrics produces lines that overflow the text area or headings orphaned at the bottom of a page. This tool parses Markdown into a block-level AST, resolves inline formatting (bold, italic, code, links), then runs a layout pass using Helvetica width tables (CP1252 encoding, 315 glyph entries) to compute precise line breaks. The output is a valid PDF 1.4 binary with proper cross-reference tables, not an image-wrapped hack. Limitation: embedded images render as linked text references, because embedding Base64-encoded bitmaps without a library is beyond the practical scope of a single-page tool. Tables, nested lists, and fenced code blocks are fully supported.
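As a rough illustration of the first stage described above, the following Python sketch splits Markdown into a flat list of block nodes. It is not the tool's parser: the function name `parse_blocks`, the dictionary node shape, and the choice of Python are assumptions made for illustration, and only headings, bullet items, and paragraphs are covered.

```python
import re

# Hypothetical patterns for two block types; a real parser handles many more.
HEADING = re.compile(r"^(#{1,6})\s+(.*)$")
BULLET = re.compile(r"^[-*+]\s+(.*)$")


def parse_blocks(markdown):
    """Split Markdown text into a flat list of block-level nodes."""
    blocks, paragraph = [], []

    def flush():
        # Close the paragraph being accumulated, if any.
        if paragraph:
            blocks.append({"type": "paragraph", "text": " ".join(paragraph)})
            paragraph.clear()

    for line in markdown.splitlines():
        stripped = line.strip()
        if not stripped:
            flush()  # a blank line ends the current paragraph
            continue
        match = HEADING.match(stripped)
        if match:
            flush()
            blocks.append({"type": "heading",
                           "level": len(match.group(1)),
                           "text": match.group(2)})
            continue
        match = BULLET.match(stripped)
        if match:
            flush()
            blocks.append({"type": "list_item", "text": match.group(1)})
            continue
        paragraph.append(stripped)  # consecutive lines join into one paragraph
    flush()
    return blocks
```

Inline formatting (bold, italic, code, links) would be resolved in a second pass over each node's `text` field; that step is omitted here.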

Formulas

The PDF layout engine computes line breaks using per-glyph width accumulation against the available text area width. The usable width W on an A4 page is:

W = 595.28 − M_left − M_right = 595.28 − 72 − 72 = 451.28 pt

For each word, the engine computes its rendered width:

$$ w_{\text{word}} = \sum_{i=1}^{n} g_i \times \frac{\text{fontSize}}{1000} $$

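where n is the number of characters in the word and g_i is the glyph width of character i from the Helvetica metrics table (Adobe standard, values in 1/1000 of a text unit, so they scale linearly with the font size). When the accumulated line width x plus the next word's width exceeds the usable width, that is x + w_word > W, a line break is inserted before the word. The vertical cursor y then decrements by the line height L = fontSize × 1.4. A page break triggers when y < M_bottom (72 pt).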
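The width accumulation and line-break rule can be sketched in Python as follows. This is a minimal greedy wrapper under stated assumptions, not the tool's code: the width table holds only a few entries taken from the Helvetica metrics, the fallback width of 556 for unlisted characters is an assumption, and the names `text_width` and `wrap` are hypothetical.

```python
# A few Helvetica glyph widths in 1/1000 units (partial table for illustration).
HELVETICA_WIDTHS = {" ": 278, "H": 722, "e": 556, "l": 222, "o": 556,
                    "w": 722, "r": 333, "d": 556}
DEFAULT_WIDTH = 556  # assumption: fallback for characters missing from the table


def text_width(text, font_size):
    """Rendered width in points: sum of glyph widths times fontSize / 1000."""
    units = sum(HELVETICA_WIDTHS.get(ch, DEFAULT_WIDTH) for ch in text)
    return units * font_size / 1000


def wrap(text, font_size, max_width=451.28):
    """Greedy word wrap: break before a word that would push x past max_width."""
    space = text_width(" ", font_size)
    lines, current, x = [], [], 0.0
    for word in text.split():
        w = text_width(word, font_size)
        needed = w if not current else space + w  # a space precedes later words
        if current and x + needed > max_width:
            lines.append(" ".join(current))
            current, x = [word], w
        else:
            current.append(word)
            x += needed
    if current:
        lines.append(" ".join(current))
    return lines
```

A word wider than `max_width` is placed on a line of its own rather than split; how the tool treats that case is not stated in this document.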

The PDF cross-reference table offset for object k is computed as the cumulative byte length of all preceding objects:

$$ \text{offset}_k = \sum_{j=1}^{k-1} \operatorname{len}(\text{obj}_j) $$

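where len returns the byte length of the serialized PDF object, including line terminators. Offsets are measured from the first byte of the file, so the byte length of the file header that precedes object 1 is added to every offset. With correct offsets, a PDF reader can jump directly to any object it needs for rendering.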
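The cumulative-offset computation might look like the sketch below. It assumes the objects are already serialized to bytes and that the file starts with a `%PDF-1.4` header line; the function names `xref_offsets` and `xref_table` are hypothetical. Passing an empty header reduces the result to the plain sum of preceding object lengths.

```python
def xref_offsets(objects, header=b"%PDF-1.4\n"):
    """Byte offset of each serialized object, measured from the start of the file."""
    offsets = []
    position = len(header)  # objects begin right after the header bytes
    for obj in objects:
        offsets.append(position)
        position += len(obj)  # byte length includes line terminators
    return offsets


def xref_table(offsets):
    """Format a cross-reference section with fixed-width 20-byte entries."""
    lines = [f"xref\n0 {len(offsets) + 1}\n",
             "0000000000 65535 f \n"]  # entry 0 is the head of the free list
    lines += [f"{offset:010d} 00000 n \n" for offset in offsets]
    return "".join(lines)


# Usage: two tiny placeholder objects, 21 bytes each.
objects = [b"1 0 obj\n<< >>\nendobj\n", b"2 0 obj\n<< >>\nendobj\n"]
offsets = xref_offsets(objects)
table = xref_table(offsets)
```

Lengths are taken on bytes rather than on text, since any character that encodes to more than one byte would otherwise shift every later offset.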

Reference Data

Markdown Element	Syntax	PDF Rendering	Font	Size (pt)
Heading 1	`# Text`	Bold, large, top margin	Helvetica-Bold	22
Heading 2	`## Text`	Bold, medium	Helvetica-Bold	18
Heading 3	`### Text`	Bold, small	Helvetica-Bold	15
Heading 4	`#### Text`	Bold, body-size	Helvetica-Bold	13
Heading 5	`##### Text`	Bold, small	Helvetica-Bold	11
Heading 6	`###### Text`	Bold, smallest	Helvetica-Bold	10
Paragraph	Plain text	Regular body text	Helvetica	11
Bold	`**text**`	Bold inline span	Helvetica-Bold	Inherited
Italic	`*text*`	Oblique inline span	Helvetica-Oblique	Inherited
Bold Italic	`***text***`	Bold-Oblique span	Helvetica-BoldOblique	Inherited
Inline Code	`code`	Monospace, gray bg	Courier	Inherited
Code Block	```lang	Monospace block, shaded	Courier	9
Unordered List	`- item`	Bullet • indented	Helvetica	11
Ordered List	`1. item`	Number. indented	Helvetica	11
Blockquote	`> text`	Indented, gray bar	Helvetica-Oblique	11
Horizontal Rule	`---`	Gray line across page	-	-
Link	`[text](url)`	Blue underlined text	Helvetica	11
Image	`![alt](url)`	Alt text as caption	Helvetica-Oblique	10
Table	GFM pipe syntax	Bordered grid layout	Helvetica	10
Strikethrough	`~~text~~`	Regular text, no strike line drawn	Helvetica	11
Task List	`- [x] done`	Checkbox symbol + text	Helvetica	11

Frequently Asked Questions

The converter handles headings (levels 1-6), bold, italic, bold-italic, inline code, fenced code blocks, unordered and ordered lists (including nested), blockquotes, horizontal rules, links (rendered as blue underlined text with URL in parentheses), GFM-style tables, task lists with checkbox symbols, and images (rendered as italic caption text with the URL). Strikethrough text (~~text~~) is parsed but rendered without the strike line due to PDF text operator limitations in the built-in generator.

The built-in PDF generator uses Helvetica with WinAnsiEncoding (CP1252), which covers standard Latin characters, common punctuation, and Western European accented letters. Characters outside this range - such as CJK glyphs, Cyrillic, Arabic, or emoji - fall outside the encoding table and may render as substitution characters. For full Unicode support, a TrueType font embedding system would be required, which exceeds the scope of a client-side vanilla implementation.
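One way to warn about such characters before generating the PDF is to test each one against the encoding. The sketch below uses Python's built-in `cp1252` codec as a stand-in for WinAnsiEncoding, which is an approximation; the function name `unsupported_chars` is hypothetical and not part of the tool.

```python
def unsupported_chars(text):
    """Return the distinct characters CP1252 cannot encode, in first-seen order."""
    missing = []
    for ch in text:
        try:
            ch.encode("cp1252")
        except UnicodeEncodeError:
            # Outside the encoding table: would need a substitution character.
            if ch not in missing:
                missing.append(ch)
    return missing
```

An empty result means every character in the text has a slot in the encoding table; a non-empty result lists what would be substituted.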

The layout engine maintains a vertical cursor starting at 769.89 pt (A4 height 841.89 minus 72 pt top margin). After each line, the cursor decrements by the line height (font size × 1.4). When the cursor falls below 72 pt (bottom margin), a new PDF page object is created and the cursor resets. Headings include orphan protection: if fewer than 2 lines remain on a page when a heading is encountered, the heading moves to the next page.
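The cursor logic can be sketched in Python as below. The constants come from the figures above; the function names and the reading of "fewer than 2 lines remain" as a count of body-size lines that still fit above the bottom margin are assumptions.

```python
PAGE_HEIGHT = 841.89  # A4 height in points
MARGIN = 72.0         # top and bottom margin in points


def paginate(line_count, font_size=11.0):
    """Assign each line index a (index, y) position, starting a page when y < MARGIN."""
    line_height = font_size * 1.4
    top = PAGE_HEIGHT - MARGIN  # 769.89 pt
    pages, y = [[]], top
    for index in range(line_count):
        if y < MARGIN:
            pages.append([])  # new page object
            y = top           # cursor resets
        pages[-1].append((index, y))
        y -= line_height
    return pages


def heading_needs_new_page(y, font_size=11.0):
    """Orphan protection: True when fewer than 2 lines still fit on the page."""
    line_height = font_size * 1.4
    remaining = int((y - MARGIN) // line_height) + 1 if y >= MARGIN else 0
    return remaining < 2
```

With 11 pt body text, each page holds 46 lines under these assumptions, so a 100-line document spans three pages.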

You can load an existing file either by dragging and dropping a .md file onto the editor area or by clicking the upload button and selecting a file from your filesystem. The FileReader API reads the file as UTF-8 text and populates the editor. Files up to approximately 5 MB are supported; larger files may cause noticeable layout computation delays.

The output conforms to PDF 1.4, whose feature set is a subset of ISO 32000-1:2008 (PDF 1.7). It uses only standard Type 1 fonts (Helvetica, Courier) from the standard 14 set, which conforming PDF readers supply without embedding. The cross-reference table and trailer are correctly computed, so the file opens reliably in Adobe Acrobat, Chrome's built-in viewer, Preview.app, and other compliant readers. It is not PDF/A compliant (no embedded fonts, color profiles, or XMP metadata streams), so it is not suitable for long-term digital archiving without post-processing.

Tables are parsed from pipe-delimited syntax (|col1|col2|). The generator distributes the available page width equally among the columns and word-wraps cell text within each column. Borders are drawn with PDF path operators (re to append a rectangle, S to stroke it). Header rows are rendered in Helvetica-Bold. GFM alignment markers (:---, :---:, ---:) in the delimiter row are detected and applied as left, center, or right text alignment within cells.
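The alignment detection and equal-width split can be illustrated with a short Python sketch. The function names are hypothetical, and the default width of 451.28 pt assumes an A4 page with 72 pt left and right margins.

```python
def parse_alignments(delimiter_row):
    """Map each cell of a GFM delimiter row to 'left', 'center', or 'right'."""
    cells = [cell.strip() for cell in delimiter_row.strip().strip("|").split("|")]
    alignments = []
    for cell in cells:
        starts, ends = cell.startswith(":"), cell.endswith(":")
        if starts and ends:
            alignments.append("center")  # :---:
        elif ends:
            alignments.append("right")   # ---:
        else:
            alignments.append("left")    # :--- or plain ---
    return alignments


def column_widths(column_count, usable_width=451.28):
    """Distribute the usable width equally among the columns."""
    return [usable_width / column_count] * column_count
```

For a four-column table, each column gets 112.82 pt, and cell text is then wrapped against that width instead of the full text area.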