About

NCBI BLAST produces XML output (format 5 or 14) containing structured alignment data: E-values, bit scores (S′), identity percentages, and pairwise sequence alignments. Raw XML is hard to read for quick analysis, and a misreading has consequences: mistaking an E-value of 1e−3 for 1e−30 can lead to false homology assignments, and the reverse can mean missed orthologs. This tool parses the full BLAST XML schema (BlastOutput → Iteration → Hit → Hsp) and renders it as a formatted HTML report with color-coded residue alignments, sortable hit tables, and highlighted conservation patterns.

The converter handles both single-query and multi-query outputs. Alignment coloring follows standard biochemistry conventions: identical residues in green, positive substitutions (per BLOSUM/PAM matrix context) in yellow, mismatches in red, gaps in gray. Output HTML is self-contained and printable. Note: this tool processes XML client-side. Files exceeding 50 MB may cause browser slowdowns. For metagenomic-scale outputs, consider command-line alternatives.

Formulas

BLAST statistical significance relies on the Karlin-Altschul equation. The E-value represents the expected number of alignments with score ≥ S occurring by chance in a database of given size:

E = K ⋅ m ⋅ n ⋅ e^(−λ⋅S)

Where K = a small scaling constant for the search space, m = effective query length, n = effective database length, λ = decay constant of the Gumbel (extreme value) score distribution, S = raw alignment score. K and λ depend on the scoring matrix and gap penalties.
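The equation above can be evaluated directly. The sketch below is a minimal illustration, not the converter's code; the function name and the parameter values in the usage lines are assumptions chosen for the example (λ = 0.267 and K = 0.041 are the values commonly quoted for gapped BLOSUM62 searches, and the lengths are made up).

```python
import math


def expect_value(K: float, lam: float, m: int, n: int, S: float) -> float:
    """Karlin-Altschul E-value: E = K * m * n * exp(-lambda * S)."""
    return K * m * n * math.exp(-lam * S)


# Illustrative inputs only: a 250-residue query against 1e9 database residues.
K, lam = 0.041, 0.267
for raw_score in (50, 100, 200):
    e = expect_value(K, lam, m=250, n=10**9, S=raw_score)
    print(f"S = {raw_score:>3}  E = {e:.2e}")
```

Because S sits in the exponent, each additional raw-score point multiplies E by the constant factor e^(−λ), which is why small score differences translate into large changes in significance.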

The normalized bit score S′ allows comparison across different scoring systems:

S′ = (λ ⋅ S − ln K) / ln 2
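A short sketch of the normalization, assuming the same symbols as in the formulas (the function names are hypothetical). Substituting S′ back into the E-value equation gives E = m ⋅ n ⋅ 2^(−S′), which the last lines check numerically with made-up inputs.

```python
import math


def bit_score(S: float, lam: float, K: float) -> float:
    """Normalized bit score: S' = (lambda * S - ln K) / ln 2."""
    return (lam * S - math.log(K)) / math.log(2)


def evalue_from_bits(bits: float, m: int, n: int) -> float:
    """E-value expressed through the bit score: E = m * n * 2^(-S')."""
    return m * n * 2.0 ** (-bits)


# Illustrative inputs only (not taken from any real search).
S, lam, K, m, n = 100, 0.267, 0.041, 250, 10**9
direct = K * m * n * math.exp(-lam * S)
via_bits = evalue_from_bits(bit_score(S, lam, K), m, n)
print(math.isclose(direct, via_bits, rel_tol=1e-9))
```

The bit-score form no longer contains K or λ, which is what makes scores comparable across scoring systems.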

Identity percentage computed per HSP:

%identity = (Hsp_identity / Hsp_align-len) × 100
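As a sketch of how this could be computed from a single Hsp element, the example below uses Python's standard xml.etree.ElementTree. The function name and the sample values are assumptions for illustration; only the two element names come from the BLAST XML schema.

```python
import xml.etree.ElementTree as ET


def percent_identity(hsp: ET.Element) -> float:
    """Percent identity of one HSP: identical columns over alignment length."""
    identity = int(hsp.findtext("Hsp_identity"))
    align_len = int(hsp.findtext("Hsp_align-len"))
    if align_len <= 0:
        raise ValueError("Hsp_align-len must be positive")
    return 100.0 * identity / align_len


# Made-up HSP: 18 identical columns in a 24-column alignment.
hsp = ET.fromstring(
    "<Hsp><Hsp_identity>18</Hsp_identity>"
    "<Hsp_align-len>24</Hsp_align-len></Hsp>"
)
print(f"{percent_identity(hsp):.1f}%")  # 18 / 24 -> 75.0%
```

Note that the denominator is the alignment length, which includes gap columns, not the query or subject length.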

This converter extracts all numeric fields from the XML and renders E-values in scientific notation. Alignment midline characters are parsed: | maps to identity (green), + maps to positive substitution (yellow), space maps to mismatch (red), - in sequences maps to gap (gray).
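The midline rules above can be sketched as a per-column classifier. This is an illustration rather than the converter's actual code; the function name and class labels are hypothetical. One assumption beyond the text: any midline character other than + or space is treated as identity, since protein searches typically write the matching residue letter in the midline instead of a pipe.

```python
def classify_columns(qseq: str, midline: str, hseq: str) -> list[str]:
    """Label each alignment column as identity, positive, mismatch, or gap."""
    if not (len(qseq) == len(midline) == len(hseq)):
        raise ValueError("qseq, midline and hseq must have equal length")
    labels = []
    for q, mid, h in zip(qseq, midline, hseq):
        if q == "-" or h == "-":
            labels.append("gap")        # gray
        elif mid == "+":
            labels.append("positive")   # yellow
        elif mid == " ":
            labels.append("mismatch")   # red
        else:
            labels.append("identity")   # green: "|" or a residue letter
    return labels


# Made-up protein alignment fragment.
print(classify_columns("MKALV", "MK +V", "MK-IV"))
# ['identity', 'identity', 'gap', 'positive', 'identity']
```

Gaps are tested first because the midline holds a space at gap columns, which would otherwise be counted as a mismatch.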

Reference Data

BLAST XML Element	Path	HTML Output	Description
BlastOutput_program	BlastOutput/BlastOutput_program	Report header	Program used (blastn, blastp, blastx, tblastn, tblastx)
BlastOutput_db	BlastOutput/BlastOutput_db	Report header	Database searched (nr, nt, refseq_protein, etc.)
BlastOutput_query-def	BlastOutput/BlastOutput_query-def	Query section title	Query sequence definition line
BlastOutput_query-len	BlastOutput/BlastOutput_query-len	Query metadata	Query sequence length in residues/bases
Parameters	BlastOutput/BlastOutput_param/Parameters	Parameters table	Matrix, gap costs, expect threshold, filters
Iteration_query-def	Iteration/Iteration_query-def	Iteration heading	Query definition for multi-query searches
Hit_num	Hit/Hit_num	Hit rank column	Sequential hit number
Hit_id	Hit/Hit_id	Accession link	Subject sequence identifier (accession)
Hit_def	Hit/Hit_def	Description column	Subject sequence definition/description
Hit_len	Hit/Hit_len	Length column	Subject sequence length
Hsp_bit-score	Hsp/Hsp_bit-score	Score column	Bit score S′ (normalized)
Hsp_score	Hsp/Hsp_score	Raw score	Raw alignment score
Hsp_evalue	Hsp/Hsp_evalue	E-value column	Expect value - statistical significance
Hsp_query-from	Hsp/Hsp_query-from	Alignment coords	Start position on query
Hsp_query-to	Hsp/Hsp_query-to	Alignment coords	End position on query
Hsp_hit-from	Hsp/Hsp_hit-from	Alignment coords	Start position on subject
Hsp_hit-to	Hsp/Hsp_hit-to	Alignment coords	End position on subject
Hsp_identity	Hsp/Hsp_identity	Identity count	Number of identical residues/bases
Hsp_positive	Hsp/Hsp_positive	Positives count	Number of positive-scoring residue pairs
Hsp_gaps	Hsp/Hsp_gaps	Gaps count	Total gap characters in alignment
Hsp_align-len	Hsp/Hsp_align-len	Alignment length	Total columns in the alignment
Hsp_qseq	Hsp/Hsp_qseq	Query alignment row	Query sequence in alignment (with gaps)
Hsp_hseq	Hsp/Hsp_hseq	Subject alignment row	Subject sequence in alignment (with gaps)
Hsp_midline	Hsp/Hsp_midline	Midline row	Conservation line: | = identity, + = positive, space = mismatch
Statistics	Iteration/Iteration_stat/Statistics	Statistics footer	Database size, lambda, kappa, entropy, effective lengths
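To show how the elements in this table nest, here is a minimal sketch that walks BlastOutput → Iteration → Hit → Hsp with Python's xml.etree.ElementTree. It is not the converter's implementation (which runs in the browser); the function name, the dictionary keys, and the sample document with its made-up identifiers and values are assumptions for illustration.

```python
import xml.etree.ElementTree as ET

SAMPLE = """<BlastOutput>
  <BlastOutput_program>blastp</BlastOutput_program>
  <BlastOutput_db>example_db</BlastOutput_db>
  <BlastOutput_iterations>
    <Iteration>
      <Iteration_query-def>query_1 (made-up)</Iteration_query-def>
      <Iteration_hits>
        <Hit>
          <Hit_num>1</Hit_num>
          <Hit_id>subject_1</Hit_id>
          <Hit_def>made-up subject</Hit_def>
          <Hit_len>120</Hit_len>
          <Hit_hsps>
            <Hsp>
              <Hsp_bit-score>52.4</Hsp_bit-score>
              <Hsp_evalue>1e-10</Hsp_evalue>
              <Hsp_identity>18</Hsp_identity>
              <Hsp_align-len>24</Hsp_align-len>
            </Hsp>
          </Hit_hsps>
        </Hit>
      </Iteration_hits>
    </Iteration>
  </BlastOutput_iterations>
</BlastOutput>"""


def summarize(xml_text: str) -> dict:
    """Collect header fields and per-query hits into plain dictionaries."""
    root = ET.fromstring(xml_text)
    report = {
        "program": root.findtext("BlastOutput_program"),
        "db": root.findtext("BlastOutput_db"),
        "queries": [],
    }
    for iteration in root.iter("Iteration"):
        hits = []
        for hit in iteration.iter("Hit"):
            hsps = [
                {
                    "bit_score": float(hsp.findtext("Hsp_bit-score")),
                    "evalue": float(hsp.findtext("Hsp_evalue")),
                    "identity": int(hsp.findtext("Hsp_identity")),
                    "align_len": int(hsp.findtext("Hsp_align-len")),
                }
                for hsp in hit.iter("Hsp")
            ]
            hits.append({"id": hit.findtext("Hit_id"),
                         "def": hit.findtext("Hit_def"),
                         "hsps": hsps})
        report["queries"].append(
            {"query_def": iteration.findtext("Iteration_query-def"),
             "hits": hits})
    return report


report = summarize(SAMPLE)
print(report["program"], len(report["queries"]))
```

Each Iteration becomes one query section, so a multi-query file simply produces a longer "queries" list.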

Frequently Asked Questions

This tool parses BLAST XML format 5 (the output of -outfmt 5 in NCBI BLAST+). It reads the standard BlastOutput root schema with nested Iteration, Hit, and Hsp elements. Format 14 (BLAST XML2) uses a different schema and is not supported. If your file has a root element of <BlastXML2> instead of <BlastOutput>, you must re-run BLAST with -outfmt 5.
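A quick way to check which schema a file uses before uploading is to inspect only its root element. The sketch below uses Python's standard library; the function name and the returned labels are hypothetical, and the namespace stripping is a precaution in case the root tag carries an XML namespace.

```python
import io
import xml.etree.ElementTree as ET


def detect_blast_xml_flavor(xml_bytes: bytes) -> str:
    """Return 'outfmt5', 'xml2', or 'unknown' based on the root element."""
    events = ET.iterparse(io.BytesIO(xml_bytes), events=("start",))
    for _event, elem in events:
        # Drop a leading "{namespace}" prefix if the tag has one.
        tag = elem.tag.rsplit("}", 1)[-1]
        if tag == "BlastOutput":
            return "outfmt5"
        if tag == "BlastXML2":
            return "xml2"
        return "unknown"  # only the first (root) element matters
    return "unknown"


print(detect_blast_xml_flavor(b"<BlastOutput></BlastOutput>"))
print(detect_blast_xml_flavor(b'<BlastXML2 xmlns="urn:example"></BlastXML2>'))
```

The loop returns on the first start event, so the rest of the document is never inspected.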

The converter parses the Hsp_midline element character by character. A pipe character (|) indicates identity (the query and subject residues are identical) and is colored green. A plus (+) indicates a positive substitution according to the scoring matrix (e.g., BLOSUM62) and is colored yellow. A space indicates a mismatch and is colored red. Gaps (- characters in Hsp_qseq or Hsp_hseq) are colored gray. For nucleotide BLAST (blastn), only identity and mismatch apply, since there are no positive substitutions. Note that in protein searches BLAST itself writes the matching residue letter in the midline at identical positions rather than a pipe; those columns also represent identity.

BLAST reports E-values as floating-point numbers. When the true E-value is smaller than approximately 1e-180, BLAST rounds it to 0.0 in the XML output. This is a limitation of the BLAST software, not the converter. An E-value of 0.0 indicates extremely high statistical significance, and the converter displays it as reported. To rank hits in this range, compare bit scores instead. As a rough guide, bit scores of about 600 and above correspond to E-values near or below 1e-180; the exact cutoff depends on the search space, since E = m ⋅ n ⋅ 2^(−S′).
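A sketch of how an Hsp_evalue string might be formatted for display while keeping a reported zero intact. The function name and the choice of two decimal places are assumptions for illustration, not the converter's documented behavior.

```python
def format_evalue(raw: str) -> str:
    """Render an Hsp_evalue string in scientific notation; keep 0 as '0.0'."""
    value = float(raw)
    if value == 0.0:
        # BLAST already rounded this value to zero; show it as reported.
        return "0.0"
    return f"{value:.2e}"


for raw in ("0", "1e-30", "4.2e-5"):
    print(raw, "->", format_evalue(raw))
```

Treating zero as a special case avoids printing it as 0.00e+00, which could be misread as a computed value rather than a rounding floor.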

Multi-query files are supported. The converter iterates over all Iteration elements in the XML. Each query generates a separate section in the HTML output with its own hit table and alignments. For files containing more than 50 iterations, the conversion may take several seconds. The progress indicator will show completion percentage. For files exceeding 50 MB or 500+ queries, performance depends on available browser memory.

Hit definitions (Hit_def) in BLAST XML can span thousands of characters when multiple database entries share identical sequences. The summary table truncates descriptions to 120 characters with an ellipsis. The full description is available in the tooltip (hover) and in the detailed alignment section below the table. No sequence data is ever truncated.
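The truncation rule can be sketched in a few lines. The function name is hypothetical, and whether the 120-character limit counts the ellipsis is not stated in the text; this sketch assumes the ellipsis is appended after 120 characters of description.

```python
def truncate_description(text: str, limit: int = 120) -> str:
    """Shorten a Hit_def for the summary table; short text is left unchanged."""
    if len(text) <= limit:
        return text
    return text[:limit].rstrip() + "…"


long_def = "hypothetical protein " * 20  # made-up 420-character description
short = truncate_description(long_def)
print(len(long_def), len(short))
```

Only the table cell is shortened; the full string would still be kept for the tooltip and the detailed alignment section.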

The converter uses the browser's DOMParser API, which reports XML syntax errors. If the XML is malformed (unclosed tags, encoding issues, truncated file), the converter displays a specific error message including the line and column of the first parsing error. Common causes include interrupted downloads (truncated files) and character encoding conflicts. Ensure your file is complete and UTF-8 encoded.
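The same kind of check can be run locally before uploading. The sketch below is a Python analogue, not the converter's DOMParser code: it relies on xml.etree.ElementTree, whose ParseError carries a position attribute holding the line and column. The function name and message format are assumptions.

```python
import xml.etree.ElementTree as ET


def parse_or_report(xml_text: str):
    """Return (root, None) on success or (None, message) on a syntax error."""
    try:
        return ET.fromstring(xml_text), None
    except ET.ParseError as err:
        line, column = err.position
        return None, f"XML parse error at line {line}, column {column}: {err}"


# A truncated document, as might result from an interrupted download.
root, message = parse_or_report("<BlastOutput>\n<Hit>")
print(message)
```

A truncated file usually fails at its last line, so an error position at the very end of the file is a hint that the download or the BLAST run was interrupted.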