About

CSV parsing appears trivial until you encounter quoted fields containing delimiters, newlines embedded in double-quoted values, or escaped quotes (""). A naive split on commas will silently corrupt such data. This tool implements an RFC 4180-compliant finite state machine parser that handles these cases. It then serializes to YAML with automatic type inference: values like 42 become integers, TRUE becomes a boolean, and empty cells become null. Pro tip: always verify your delimiter. European CSVs often use semicolons because the comma serves as the decimal separator in those locales.
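To illustrate why a plain comma split is unsafe, here is a minimal sketch that compares it with Python's standard `csv` module. It is not this tool's parser, and the sample line is made up for the illustration.

```python
import csv
import io

# Hypothetical sample record: a quoted comma and a doubled (escaped) quote.
line = '"New York, NY",42,"He said ""hi"""'

# Naive approach: cuts the first field in two and leaves the quote marks in place.
naive = line.split(",")

# Quote-aware approach: the standard csv module understands quoting and "" escapes.
parsed = next(csv.reader(io.StringIO(line)))

print(len(naive), naive)    # 4 pieces, quotes still embedded
print(len(parsed), parsed)  # 3 clean fields
```

The naive result has one field too many and still carries the quoting characters, which is the kind of silent corruption described above.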

The converter operates entirely in your browser. No data leaves your machine. Output conforms to the YAML 1.2 specification with configurable indentation (2 to 8 spaces). Note: this tool assumes the first row contains headers. If your CSV lacks headers, enable the "No Headers" option to generate indexed keys (field_0, field_1, etc.). The maximum recommended file size is 10 MB for responsive performance.


Formulas

The CSV parser uses a finite state machine with four states. For each character c at position i, the transition function δ determines the next state:

$$
\delta : S \times \Sigma \to S
$$

where S = {FIELD_START, UNQUOTED, QUOTED, QUOTE_IN_QUOTED} and Σ is the input alphabet containing the delimiter d, the quote character q (0x22), newline characters, and all other Unicode code points.

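$$
\left\{
\begin{array}{lcl}
\text{FIELD\_START} + q & \to & \text{QUOTED} \\
\text{FIELD\_START} + d & \to & \text{emit empty field, stay in FIELD\_START} \\
\text{FIELD\_START} + c & \to & \text{UNQUOTED} \\
\text{QUOTED} + q & \to & \text{QUOTE\_IN\_QUOTED} \\
\text{QUOTE\_IN\_QUOTED} + q & \to & \text{append literal quote, QUOTED} \\
\text{QUOTE\_IN\_QUOTED} + d & \to & \text{emit field, FIELD\_START}
\end{array}
\right.
$$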
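The transitions listed above can be sketched as a small Python parser. This is an illustrative sketch, not the tool's actual source. The names `State` and `parse_csv` are hypothetical. The transitions not listed above (a delimiter in UNQUOTED, newline handling, and lenient treatment of stray characters after a closing quote) are assumptions added to make the example complete.

```python
from enum import Enum, auto


class State(Enum):
    FIELD_START = auto()
    UNQUOTED = auto()
    QUOTED = auto()
    QUOTE_IN_QUOTED = auto()


def parse_csv(text, d=",", q='"'):
    """Parse CSV text into a list of records (each a list of strings)."""
    records, record, field = [], [], []
    state = State.FIELD_START

    def emit_field():
        record.append("".join(field))
        field.clear()

    def emit_record():
        nonlocal record
        emit_field()
        records.append(record)
        record = []

    for c in text:
        if state is State.QUOTED:
            if c == q:
                state = State.QUOTE_IN_QUOTED
            else:
                field.append(c)  # delimiters and newlines are literal here
        elif state is State.QUOTE_IN_QUOTED and c == q:
            field.append(q)      # doubled quote becomes one literal quote
            state = State.QUOTED
        elif c == d:
            emit_field()         # an empty field if nothing was accumulated
            state = State.FIELD_START
        elif c == "\n":
            emit_record()        # assumption: LF ends the record
            state = State.FIELD_START
        elif c == "\r":
            continue             # assumption: CR outside quotes is part of CRLF
        elif state is State.FIELD_START and c == q:
            state = State.QUOTED
        else:
            field.append(c)      # assumption: stray characters are kept as-is
            state = State.UNQUOTED

    # Flush the last record when the input has no trailing newline.
    if field or record or state is not State.FIELD_START:
        emit_record()
    return records


sample = 'name,note\r\n"New York, NY","He said ""hi"""\r\na,,"x\ny"\r\n'
for row in parse_csv(sample):
    print(row)
```

Keeping the QUOTED branch first means delimiters and line breaks inside quotes never reach the record-splitting logic, which is the property the state machine exists to guarantee.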

Type inference applies a priority chain to each raw string value v:

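$$
\operatorname{type}(v) =
\begin{cases}
\text{NULL} & \text{if } v \in \{\texttt{""}, \texttt{"null"}, \texttt{"\textasciitilde"}\} \\
\text{BOOLEAN} & \text{if } v \in \{\texttt{"true"}, \texttt{"false"}\} \\
\text{INTEGER} & \text{if } v \text{ matches } \texttt{/\^{}-?\textbackslash d+\$/} \\
\text{FLOAT} & \text{if } v \text{ matches } \texttt{/\^{}-?\textbackslash d+\textbackslash.\textbackslash d+\$/} \\
\text{STRING} & \text{otherwise}
\end{cases}
$$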

where v is the trimmed cell content. YAML output indentation uses n spaces per nesting level, where n ∈ [2, 8]. Strings containing YAML-reserved characters (:, #, {, }, [, ], ,, &, *, !, |, >, ', ", %, @, `) are automatically single-quoted in the output.
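The priority chain for type inference can be sketched in Python as follows. The function name `infer_type` is hypothetical. Comparing booleans case-insensitively is an assumption, made because the About text says TRUE becomes a boolean while the formula lists only lowercase literals.

```python
import re

# ASCII-only digit patterns mirroring the two regular expressions above.
INT_RE = re.compile(r"-?\d+", re.ASCII)
FLOAT_RE = re.compile(r"-?\d+\.\d+", re.ASCII)


def infer_type(raw):
    """Map a raw CSV cell to None, bool, int, float, or str (first match wins)."""
    v = raw.strip()
    if v in ("", "null", "~"):
        return None
    if v.lower() in ("true", "false"):  # assumption: case-insensitive match
        return v.lower() == "true"
    if INT_RE.fullmatch(v):
        return int(v)                   # note: int("00123") drops leading zeros
    if FLOAT_RE.fullmatch(v):
        return float(v)
    return v


for cell in ["42", " 3.14 ", "TRUE", "", "00123", "2024-01-15", "1e5"]:
    print(repr(cell), "->", repr(infer_type(cell)))
```

Because the integer check runs before the string fallback, a value such as 00123 is converted to a number, while 2024-01-15 and 1e5 match neither pattern and stay strings.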

Reference Data

| CSV Feature | RFC 4180 Rule | This Tool | Example |
| --- | --- | --- | --- |
| Simple field | Unquoted, no special chars | ✓ Supported | hello |
| Comma in field | Must be double-quoted | ✓ Supported | "New York, NY" |
| Newline in field | Must be double-quoted | ✓ Supported | "Line1 Line2" |
| Double-quote in field | Escaped as "" | ✓ Supported | "He said ""hi""" |
| Empty field | Adjacent delimiters | ✓ → null | a,,c |
| Trailing CRLF | Optional on last record | ✓ Trimmed | - |
| Header row | Optional (first record) | ✓ Configurable | - |
| Semicolon delimiter | Not in RFC (common EU) | ✓ Selectable | a;b;c |
| Tab delimiter | TSV variant | ✓ Selectable | a\tb\tc |
| Pipe delimiter | Custom variant | ✓ Selectable | a\|b\|c |
| Integer detection | - | 42 → int | 42 |
| Float detection | - | 3.14 → float | 3.14 |
| Boolean detection | - | true/false | TRUE |
| Null detection | - | ✓ empty → null | (empty cell) |
| Date detection | - | ✓ ISO 8601 preserved | 2024-01-15 |
| Unicode content | Encoding-dependent | ✓ UTF-8 | 日本語 |
| Whitespace trimming | Not specified | ✓ Configurable | hello |
| YAML indent size | - | 2-8 spaces | - |
| YAML string quoting | - | Auto (special chars only) | "contains: colon" |
| Max file size | - | 10 MB | - |

Frequently Asked Questions

How does the parser handle newlines inside quoted fields?
When the parser enters the QUOTED state (after encountering an opening double-quote), all characters, including newlines (LF, CR, CRLF), are accumulated as part of the field value. The field terminates only when a closing quote is followed by a delimiter or end-of-record. In the YAML output, such multi-line values are rendered using YAML literal block scalar syntax (the pipe character) to preserve line breaks faithfully.
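As a rough sketch of the block scalar rendering, the helper below turns a multi-line value into a literal block. The name `to_block_scalar` is hypothetical. Using the `|-` chomping indicator when the value lacks a trailing newline is an assumption about how line breaks could be preserved exactly. The sketch does not cover values with several trailing newlines or leading spaces on the first line.

```python
def to_block_scalar(key, value, indent=2):
    """Render key/value as a YAML literal block scalar (simplified sketch)."""
    # Normalize CRLF and bare CR so every line break is a single LF.
    text = value.replace("\r\n", "\n").replace("\r", "\n")
    # "|" keeps one final newline; "|-" strips it (assumed choice).
    chomp = "" if text.endswith("\n") else "-"
    pad = " " * indent
    lines = text.rstrip("\n").split("\n")
    # Blank lines are emitted without padding.
    body = "\n".join(pad + line if line else "" for line in lines)
    return f"{key}: |{chomp}\n{body}"


print(to_block_scalar("note", "Line1\nLine2"))
```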
How do I convert a semicolon-delimited CSV file?
Select "Semicolon" from the delimiter dropdown. The parser will then treat semicolons as field boundaries and commas as regular characters. This is common in European locales (Germany, France, Italy) where CSV exports from Excel use semicolons. The tool also supports Tab and Pipe delimiters for TSV and other variants.
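The effect of choosing the right delimiter can be reproduced outside the tool with Python's standard `csv` module. This is a sketch with made-up sample data, in which the comma is a decimal separator.

```python
import csv
import io

# Hypothetical European-style export: semicolon-delimited, comma as decimal mark.
data = "name;price\nWidget;3,14\n"

# With delimiter=";" the comma inside 3,14 is an ordinary character.
rows = list(csv.reader(io.StringIO(data), delimiter=";"))

# With the default comma delimiter the same input is split in the wrong places.
wrong = list(csv.reader(io.StringIO(data)))

print(rows)
print(wrong)
```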
Why are leading zeros removed from values such as ZIP codes?
By default, type inference recognizes strings matching the pattern /^-?\d+$/ as integers. Leading zeros are stripped during numeric conversion. If you need to preserve leading zeros (e.g., ZIP codes, product codes), disable the "Infer Types" option. All values will then be output as quoted YAML strings, preserving the original representation exactly.
Can I convert a CSV file that has no header row?
Yes. Enable the "No Headers" toggle. The converter will generate synthetic keys using the pattern field_0, field_1, field_2, etc. Each row becomes a YAML mapping with these generated keys. Without this option, the first row is always consumed as header names and will not appear in the output data.
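The two header modes can be sketched as follows. The function name `rows_to_mappings` and its `has_header` parameter are hypothetical stand-ins for the "No Headers" toggle. Sizing the generated keys to the widest row is an assumption.

```python
def rows_to_mappings(rows, has_header=True):
    """Turn parsed CSV rows into a list of dicts, one per data row."""
    if not rows:
        return []
    if has_header:
        # The first row is consumed as keys and does not appear in the data.
        keys, data = rows[0], rows[1:]
    else:
        # Synthetic keys field_0, field_1, ... sized to the widest row (assumed).
        width = max(len(r) for r in rows)
        keys, data = [f"field_{i}" for i in range(width)], rows
    return [dict(zip(keys, row)) for row in data]


rows = [["a", "b"], ["1", "2"]]
print(rows_to_mappings(rows))
print(rows_to_mappings(rows, has_header=False))
```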
How are strings containing YAML special characters handled?
The serializer checks each string value against a set of YAML special characters including colon (:), hash (#), curly braces, square brackets, and others. If any are detected, the value is wrapped in single quotes. If the value itself contains single quotes, they are escaped by doubling them ('') per the YAML spec. Values that look like booleans or nulls when unquoted (e.g., the string "true" or "null") are also quoted when type inference is disabled.
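A minimal sketch of this quoting rule is shown below. The name `quote_if_needed` is hypothetical, the character set is copied from the list in the Formulas section, and the set of ambiguous literals (true, false, null, ~) is an assumption based on the null and boolean literals listed there.

```python
# Characters listed on this page as YAML-reserved.
SPECIAL = set(":#{}[],&*!|>'\"%@`")

# Strings that would be read as a boolean or null if left unquoted (assumed set).
AMBIGUOUS = {"true", "false", "null", "~"}


def quote_if_needed(s):
    """Return s as a YAML plain scalar, or single-quoted when required."""
    if s in AMBIGUOUS or any(ch in SPECIAL for ch in s):
        # Inside single quotes, a literal ' is written as ''.
        return "'" + s.replace("'", "''") + "'"
    return s


for value in ["hello", "contains: colon", "it's", "true"]:
    print(quote_if_needed(value))
```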
What is the maximum file size?
The tool processes files up to 10 MB. Parsing occurs entirely in the browser using a streaming state machine that processes characters sequentially without loading the entire parsed structure into memory at once. For files approaching the limit, a progress indicator displays completion percentage. Files exceeding 10 MB are rejected with an error toast to prevent browser tab crashes.
Which YAML version does the output follow?
The output conforms to YAML 1.2 (the current specification). Booleans are output as lowercase true/false (not the YAML 1.1 variants like Yes/No/On/Off). Null values are output as the literal word null. The document starts with the optional YAML document start marker (---), which can be toggled off in settings. Indentation defaults to 2 spaces per level, configurable up to 8.
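The output conventions described here can be sketched with a small serializer for flat records. The name `dump_records` and its parameters are hypothetical. How the tool applies the indent width to a top-level sequence is not stated on this page, so the layout below is an assumption. String quoting is omitted and records are assumed to be non-empty.

```python
def dump_records(records, indent=2, document_start=True):
    """Serialize a list of flat dicts as a YAML sequence of mappings (sketch)."""
    if not 2 <= indent <= 8:
        raise ValueError("indent must be between 2 and 8")

    def fmt(v):
        if v is None:
            return "null"                     # YAML 1.2 null literal
        if isinstance(v, bool):
            return "true" if v else "false"   # lowercase booleans
        return str(v)

    lines = ["---"] if document_start else []
    for rec in records:
        for i, (key, value) in enumerate(rec.items()):
            # The first key carries the "-" entry marker; later keys align with it.
            prefix = "-" + " " * (indent - 1) if i == 0 else " " * indent
            lines.append(f"{prefix}{key}: {fmt(value)}")
    return "\n".join(lines) + "\n"


print(dump_records([{"name": "Ada", "active": True, "age": None}]))
```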