User Rating 0.0
Total Usage 0 times
Drop a CLOC CSV file here or click to browse Supports .csv files up to 50 MB
or paste CSV content below
Is this tool helpful?

Your feedback helps us improve.

About

The cloc utility produces CSV reports listing language, filename, blank, comment, and code counts per source file. That flat tabular format is useless for tree-based visualizations such as CodeFlower, D3 sunbursts, or treemaps, which require nested JSON with name and children properties. This converter parses each file path, splits on directory separators, and recursively inserts nodes into a hierarchical tree where every leaf carries a size equal to total lines (blank + comment + code). Directories that contain only a single child are optionally collapsed to reduce visual noise. The tool handles both Windows (\) and POSIX (/) path separators. Limitation: CLOC's summary rows (those without a file path) are silently skipped. If your CSV uses a non-standard delimiter or was generated by a cloc fork, verify the header row matches the canonical format before conversion.

cloc csv to json code metrics lines of code codeflower cloc converter csv parser code analysis

Formulas

Each row in the CLOC CSV is parsed and the file path is decomposed into a tree. The total size for each leaf node is computed as:

size = blank + comment + code

Where blank is the count of blank lines, comment is the count of comment lines, and code is the count of code lines reported by cloc for that file.

The tree insertion algorithm operates as follows for each file path p:

split(p, /) [d1, d2, …, f]

For each directory segment di, the algorithm searches the current node's children array for a matching name. If found, it descends into that node. If not found, it creates a new branch node {name: di, children: []} and appends it. The final segment f becomes a leaf node with size, language, blank, comment, and code properties. Time complexity per file: O(k c) where k is path depth and c is max children at each level.

Reference Data

CLOC CSV ColumnTypeJSON Mapping (Tree)JSON Mapping (Flat)Notes
languageStringStored on leaf node as languagelanguage fielde.g. JavaScript, Python
filenameString (path)Split into nested children hierarchyfilename fieldSupports / and \ separators
blankIntegerStored on leaf nodeblank fieldBlank line count
commentIntegerStored on leaf nodecomment fieldComment line count
codeIntegerStored on leaf nodecode fieldCode line count
(computed)Integersize = blank + comment + codetotal fieldUsed for visualization sizing
Root nameStringTop-level name defaults to "root"N/AConfigurable in the tool
Header rowN/ASkipped during parsingSkippedMust match canonical cloc format
Summary rowsN/ASkipped (no filename)SkippedRows with SUM or empty filename
Path depthIntegerDetermines nesting level in treeN/ADeeper paths → more nesting
Duplicate dirsN/AMerged into single nodeN/AFiles in same dir share parent
Output formatJSONHierarchical: {name, children}Array: [{...}, ...]Two output modes available
File size limitN/AMax 50 MB inputBrowser memory constraint
EncodingUTF-8Assumed UTF-8Other encodings may cause errors
CLOC versionAnyCompatible with cloc 1.x - 2.xCanonical 5-column CSV format

Frequently Asked Questions

The converter expects the canonical CLOC CSV output with exactly five columns: language, filename, blank, comment, code. This is the default output when running cloc --csv --by-file. The --by-file flag is critical; without it, CLOC only reports per-language summaries which lack individual file paths and cannot be converted to a tree. Header rows and summary rows (containing "SUM" or lacking a filename) are automatically filtered out.
Rows without a valid filename are treated as summary/aggregate rows and are skipped. If your CLOC report was generated without the --by-file flag, every row is a language summary with no file path, and the converter will produce an empty tree. Additionally, rows where blank, comment, or code cannot be parsed as integers are skipped. Check the toast notifications for a count of skipped rows.
The Tree format produces a hierarchical JSON object mirroring the directory structure: each directory becomes a node with a children array, and each file becomes a leaf node with size (total lines). This is compatible with D3.js hierarchical layouts, CodeFlower, sunburst diagrams, and treemaps. The Flat format produces a simple JSON array where each element is an object with language, filename, blank, comment, code, and total. Flat format is suitable for table-based analysis or importing into spreadsheets.
Yes. The parser normalizes all path separators by splitting on both / and \ characters. A path like src\utils\helpers.js is treated identically to src/utils/helpers.js. Leading drive letters (e.g., C:\) or leading slashes are stripped to avoid creating empty root nodes.
The tool enforces a 50 MB limit on uploaded CSV files. This is a practical browser memory constraint. A typical CLOC report for a project with 100,000 files produces a CSV of approximately 5 - 15 MB. If your CSV exceeds the limit, consider filtering it with cloc --exclude-dir=vendor,node_modules before conversion.
The Tree output format produces exactly the structure expected by CodeFlower and D3's d3.hierarchy(): a root object with name (string) and children (array). Leaf nodes carry a size property instead of children. You can directly feed the output JSON into d3.hierarchy(data).sum(d => d.size) for rendering.