About

Importing CSV data into relational databases requires precise mapping of delimited text fields to typed SQL columns. Errors in quoting, type casting, or identifier escaping corrupt entire import batches. This converter parses CSV input according to RFC 4180 rules, infers column types (INTEGER, REAL, DATE, BOOLEAN, VARCHAR) by scanning all rows, and emits dialect-correct CREATE TABLE and INSERT statements. It handles quoted fields containing commas, escaped double-quotes (""), and multi-line cell values. Supported dialects: MySQL (backtick identifiers), PostgreSQL (double-quote identifiers), SQLite, SQL Server (bracket identifiers), and Oracle.

Type inference scans every non-null value in a column. A single non-numeric value demotes the column to VARCHAR. Date detection uses ISO 8601 patterns. The tool limits VARCHAR length to the maximum observed string length rounded up to the nearest power of 2 or dialect default. Batch size controls how many value tuples appear per INSERT statement, which matters for MySQL's 16 MB max_allowed_packet and similar server limits. This tool approximates type inference assuming clean, consistently formatted source data. Columns with mixed formats (e.g., dates and free text) will default to VARCHAR.

Formulas

The converter applies a deterministic type inference pipeline to each column C_j across all n data rows.

T(C_j) =

{

INTEGER if every v_i,j ≠ NULL matches ^-?\d{1,10}$FLOAT if every v_i,j matches ^-?\d+\.\d+([eE][+-]?\d+)?$BOOLEAN if every v_i,j ∈ {true,false,0,1,yes,no}DATE if every v_i,j matches ISO 8601 date patternVARCHAR(L) otherwise

Where L = max(len(v_i,j)) rounded up to the nearest power of 2, with a minimum of 16 and a maximum of 4096. NULL values (empty strings in CSV) are excluded from type inference. If all values in a column are NULL, the type defaults to VARCHAR(255).

String literal escaping replaces each single quote with two single quotes: v → replace(v, ', ''). Identifier escaping wraps column/table names in dialect-specific quote characters and strips any existing quote characters from the name.

Batch INSERT row count B controls how many value tuples are grouped per statement. For Oracle, B = 1 always (Oracle does not support multi-row INSERT via VALUES). The total number of INSERT statements generated equals nB rounded up.

Reference Data

SQL Dialect	Identifier Quoting	String Literal	Integer Type	Float Type	Text Type	Date Type	Boolean Type	NULL Literal	Batch INSERT Limit	Statement Terminator
MySQL	`name`	"value"	INT	DOUBLE	VARCHAR(n)	DATE	TINYINT(1)	NULL	~1000 rows	;
PostgreSQL	"name"	"value"	INTEGER	DOUBLE PRECISION	VARCHAR(n)	DATE	BOOLEAN	NULL	~1000 rows	;
SQLite	"name"	"value"	INTEGER	REAL	TEXT	TEXT	INTEGER	NULL	~500 rows	;
SQL Server	[name]	"value"	INT	FLOAT	NVARCHAR(n)	DATE	BIT	NULL	~1000 rows	; GO
Oracle	"NAME"	"value"	NUMBER(10)	NUMBER(15,4)	VARCHAR2(n)	DATE	NUMBER(1)	NULL	1 row (no multi-row INSERT)	;
Common CSV delimiters
Comma	Default RFC 4180 separator: ,
Semicolon	European locale default: ; (when comma is decimal separator)
Tab	TSV format: \t (U+0009)
Pipe	Legacy mainframe exports: \|
Type inference patterns
INTEGER	Regex: ^-?\d{1,10}$
FLOAT	Regex: ^-?\d+\.\d+$ or scientific notation
BOOLEAN	Values: true, false, 0, 1, yes, no
DATE	ISO 8601: YYYY-MM-DD, YYYY/MM/DD, MM/DD/YYYY, DD.MM.YYYY
VARCHAR	Fallback type when no numeric/date/boolean pattern matches all values

Frequently Asked Questions

The inference engine tests each column against type patterns in strict priority order: INTEGER → FLOAT → BOOLEAN → DATE → VARCHAR. It scans every non-null value. If even one value fails the current type's regex, the column is demoted to the next type. A single text string in a numeric column forces the entire column to VARCHAR. Empty cells are treated as NULL and excluded from type checks.

The parser follows RFC 4180. Fields wrapped in double quotes can contain the delimiter character, newline characters (CR, LF, or CRLF), and literal double quotes escaped as two consecutive double quotes (""). The parser also strips a UTF-8 BOM (byte order mark) if present at the start of the file.

Oracle's INSERT INTO ... VALUES syntax does not support multiple value tuples in a single statement (unlike MySQL, PostgreSQL, or SQLite). The converter automatically sets batch size to 1 for Oracle dialect and generates individual INSERT statements. For bulk loading into Oracle, consider using INSERT ALL or SQL*Loader, which are outside the scope of this converter.

Table and column names are sanitized in two steps. First, any existing dialect-specific quote characters are stripped from the name (e.g., backticks for MySQL). Second, the cleaned name is wrapped in the dialect's identifier quoting mechanism. This prevents names containing spaces, reserved words, or special characters from breaking the generated SQL. However, this tool generates static SQL text - it does not execute queries, so runtime injection risk is zero.

The converter processes files entirely in-browser memory. Files under 500 KB are parsed on the main thread. Files between 500 KB and approximately 50 MB are parsed in a Web Worker to prevent UI freezing. Practical limits depend on the device's available RAM. A 10 MB CSV with 20 columns typically produces around 40-60 MB of SQL output. For files exceeding 50 MB, consider splitting the CSV or using a server-side tool.

The converter scans all values in a VARCHAR column and finds the maximum string length. That value is rounded up to the nearest power of 2 (minimum 16, maximum 4096). For SQL Server, VARCHAR becomes NVARCHAR to support Unicode. For Oracle, VARCHAR becomes VARCHAR2. PostgreSQL and MySQL use standard VARCHAR(n). SQLite uses TEXT regardless of length since SQLite does not enforce VARCHAR length constraints.