About

This is a high-fidelity, privacy-centric transcription environment designed for professionals who require speed and security. Unlike solutions that route audio through their own servers, this tool calls the Web Speech API directly in your browser, with no application backend in the audio path. This keeps latency low, and the tool itself never sends your voice data to third-party analytics servers.
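
As a minimal sketch of what calling the Web Speech API from the page can look like, the helper below configures a recognizer for continuous dictation with interim results. The names `RecognizerLike`, `RecognizerCtor`, and `createRecognizer` are illustrative assumptions, not the tool's actual code; only the `lang`, `continuous`, and `interimResults` properties and the `start()`/`stop()` methods come from the browser API.

```typescript
// Minimal shape of the recognizer members used here. The browser's
// SpeechRecognition object has these plus events such as onresult.
interface RecognizerLike {
  lang: string;
  continuous: boolean;
  interimResults: boolean;
  start(): void;
  stop(): void;
}

type RecognizerCtor = new () => RecognizerLike;

// Returns a configured recognizer, or null when the browser offers none.
function createRecognizer(
  Ctor: RecognizerCtor | undefined,
  lang: string
): RecognizerLike | null {
  if (!Ctor) return null;
  const recognizer = new Ctor();
  recognizer.lang = lang;            // BCP 47 tag, e.g. "en-US"
  recognizer.continuous = true;      // keep listening across pauses
  recognizer.interimResults = true;  // deliver partial text while speaking
  return recognizer;
}

// In a browser (not executed here):
//   const w = window as any;
//   const rec = createRecognizer(w.SpeechRecognition ?? w.webkitSpeechRecognition, "en-US");
//   rec?.start();
```

Passing the constructor in as a parameter keeps the sketch usable both with the standard `SpeechRecognition` name and with the prefixed `webkitSpeechRecognition` that some browsers expose.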

The system features a Dual-Engine Architecture: a real-time Fast Fourier Transform (FFT) drives the audio visualization while the speech recognition engine runs alongside it. The visualizer lets you monitor input gain and background noise levels, which helps you keep the signal clean enough for accurate transcription. A comprehensive Command Dictionary adapts dynamically to the selected language, so complex formatting can be applied entirely by voice.
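
To illustrate the visualization side, here is a hedged sketch of how FFT output might be reduced to a handful of bars. It assumes the frequency data arrives as a `Uint8Array` of magnitudes from 0 to 255, which is what the Web Audio `AnalyserNode.getByteFrequencyData` method fills in; the function name `binsToBars` and the averaging scheme are assumptions for illustration.

```typescript
// Averages groups of FFT bins into `barCount` bar heights in the range 0..1.
// Trailing bins that do not fill a whole group are ignored.
function binsToBars(bins: Uint8Array, barCount: number): number[] {
  const binsPerBar = Math.max(1, Math.floor(bins.length / barCount));
  const bars: number[] = [];
  for (let i = 0; i < barCount; i++) {
    const group = bins.subarray(i * binsPerBar, (i + 1) * binsPerBar);
    if (group.length === 0) {
      bars.push(0); // more bars requested than bins available
      continue;
    }
    const sum = group.reduce((acc, value) => acc + value, 0);
    bars.push(sum / group.length / 255);
  }
  return bars;
}
```

In a page, the resulting values could be mapped directly to bar heights on a canvas on each animation frame.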

Formulas

Signal clarity is critical for the SpeechRecognition engine. As a rule of thumb, Word Error Rate (WER) falls as the Signal-to-Noise Ratio (SNR) rises, in rough inverse proportion to the logarithm of the SNR.

$$\mathrm{WER} \propto \frac{1}{\log(\mathrm{SNR})}$$
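
As a rough worked example of this rule of thumb (an illustration under assumptions, not a measured result), write the proportionality with an unknown constant $k$ and a base-10 logarithm of the linear SNR:

$$\mathrm{WER} \approx \frac{k}{\log_{10}(\mathrm{SNR})}, \qquad \frac{\mathrm{WER}(\mathrm{SNR}=100)}{\mathrm{WER}(\mathrm{SNR}=10)} = \frac{\log_{10} 10}{\log_{10} 100} = \frac{1}{2}$$

Under this model, raising the SNR from 10 to 100 (a tenfold improvement) would halve the expected error rate. The expression is only meaningful for SNR greater than 1, where the logarithm is positive.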

Using the integrated visualizer, aim for input peaks between -12 dB and -6 dB for the best recognition results.
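
The target range can be checked numerically. The sketch below assumes the levels are measured relative to full scale (0 dB at a sample magnitude of 1.0) and that samples are floats in [-1, 1], as produced by `AnalyserNode.getFloatTimeDomainData`; the function names and the three labels are illustrative assumptions.

```typescript
// Peak level of a block of samples in dB relative to full scale.
// Pure silence yields -Infinity, since log10(0) is -Infinity.
function peakDb(samples: Float32Array): number {
  const peak = samples.reduce((max, s) => Math.max(max, Math.abs(s)), 0);
  return 20 * Math.log10(peak);
}

type LevelAdvice = "too quiet" | "ok" | "too loud";

// Compares a peak level against the recommended -12 dB to -6 dB window.
function classifyPeak(db: number, minDb = -12, maxDb = -6): LevelAdvice {
  if (db < minDb) return "too quiet";
  if (db > maxDb) return "too loud";
  return "ok";
}
```

For example, a block whose largest sample magnitude is 0.5 peaks at about -6.02 dB, which falls just inside the recommended window.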

Reference Data

Category	Voice Command (English)	Output / Action	Context
Structure	"New Paragraph"	(Inserts double line break)	Formatting
Structure	"New Line"	(Inserts single line break)	Formatting
Punctuation	"Period" / "Full Stop"	.	Sentence End
Punctuation	"Open Quote" ... "Close Quote"	“ ... ”	Quoting
Symbols	"Hashtag"	#	Social
Symbols	"Dollar Sign"	$	Currency
Editing	"Scratch That"	(Deletes last word/phrase)	Correction
Emoticons	"Smiley Face"	:-)	Informal
Control	"Stop Recording"	(Stops the engine)	System
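
One way to model the table above in code is a lookup from spoken phrase to action. This is a minimal sketch for the English commands only; the `Command` type, `EN_COMMANDS`, and `applyPhrase` are hypothetical names, "Scratch That" is simplified to removing the last word, and spacing around inserted symbols is deliberately naive.

```typescript
type Command =
  | { kind: "insert"; text: string }
  | { kind: "deleteLast" }
  | { kind: "stop" };

// Hypothetical English command table mirroring the reference data.
const EN_COMMANDS = new Map<string, Command>([
  ["new paragraph", { kind: "insert", text: "\n\n" }],
  ["new line", { kind: "insert", text: "\n" }],
  ["period", { kind: "insert", text: "." }],
  ["full stop", { kind: "insert", text: "." }],
  ["open quote", { kind: "insert", text: "\u201C" }],
  ["close quote", { kind: "insert", text: "\u201D" }],
  ["hashtag", { kind: "insert", text: "#" }],
  ["dollar sign", { kind: "insert", text: "$" }],
  ["smiley face", { kind: "insert", text: ":-)" }],
  ["scratch that", { kind: "deleteLast" }],
  ["stop recording", { kind: "stop" }],
]);

interface ApplyResult {
  text: string;  // document text after the phrase is applied
  stop: boolean; // true when the phrase asked to stop recording
}

function applyPhrase(doc: string, phrase: string): ApplyResult {
  const spoken = phrase.trim();
  const command = EN_COMMANDS.get(spoken.toLowerCase());
  if (!command) {
    // Ordinary dictation: append, adding a space unless one is already there.
    const needsSpace = doc.length > 0 && !/\s$/.test(doc);
    return { text: doc + (needsSpace ? " " : "") + spoken, stop: false };
  }
  if (command.kind === "insert") {
    return { text: doc + command.text, stop: false };
  }
  if (command.kind === "deleteLast") {
    // Remove the last word together with the whitespace around it.
    return { text: doc.replace(/\s*\S+\s*$/, ""), stop: false };
  }
  return { text: doc, stop: true };
}
```

Supporting another language would then amount to swapping in a different table with the same action values.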

Frequently Asked Questions

After dictating, press the "Wand" icon button. This executes a Regex-based cleanup script that removes common filler words (um, uh, ah), fixes double spaces, capitalizes the first letter of sentences, and ensures proper spacing around punctuation.
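
A cleanup pass of this kind can be approximated with a few regular expressions. The sketch below is an assumption about how such a script might look, not the tool's actual implementation; `cleanTranscript` is a hypothetical name.

```typescript
function cleanTranscript(raw: string): string {
  let text = raw;
  // 1. Drop standalone fillers (um, uh, ah) and a comma that directly follows.
  text = text.replace(/\b(?:um+|uh+|ah+)\b,?/gi, "");
  // 2. Collapse runs of spaces or tabs and trim the ends (line breaks survive).
  text = text.replace(/[ \t]{2,}/g, " ").trim();
  // 3. No space before punctuation; one space after it when a letter follows.
  text = text.replace(/[ \t]+([.,!?;:])/g, "$1");
  text = text.replace(/([.,!?;:])(?=[A-Za-z])/g, "$1 ");
  // 4. Capitalize the first letter of the text and of each new sentence.
  text = text.replace(
    /(^|[.!?]\s+)([a-z])/g,
    (_match: string, lead: string, ch: string) => lead + ch.toUpperCase()
  );
  return text;
}
```

Because the rules are purely textual, this sketch would also alter things like URLs or emoticons that contain punctuation, so a production version would need exceptions for those.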

Offline support depends on your browser and OS configuration. Many browsers, including Google Chrome on desktop in its default configuration, pass audio to a server-based recognition service and therefore need a network connection. On-device recognition is available only where the browser or operating system provides it.

The visualizer shows raw audio input, so it can be active even when no text appears. In that case, the Speech Recognition engine may still be waiting for a pause before it finalizes the sentence, or the confidence score of the recognized words may be too low to display.
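
The difference between interim and final results, and the effect of a confidence cutoff, can be sketched as follows. The result shape here is a simplified stand-in for the browser's `SpeechRecognitionResult` (which exposes `isFinal` and alternatives carrying `transcript` and `confidence`); the 0.5 threshold and all names are assumptions for illustration.

```typescript
interface Alternative {
  transcript: string;
  confidence: number; // 0..1
}

interface ResultLike {
  isFinal: boolean;
  alternatives: Alternative[]; // best guess first
}

interface SplitText {
  finalText: string;   // committed text, safe to display
  interimText: string; // provisional text, may still change
}

function splitResults(results: ResultLike[], minConfidence = 0.5): SplitText {
  const finals: string[] = [];
  const interims: string[] = [];
  for (const result of results) {
    const best = result.alternatives[0];
    if (!best) continue;
    if (!result.isFinal) {
      interims.push(best.transcript.trim());
    } else if (best.confidence >= minConfidence) {
      finals.push(best.transcript.trim());
    }
    // Final results below the threshold are dropped, which is one reason
    // audio can be visible in the visualizer while no text appears.
  }
  return { finalText: finals.join(" "), interimText: interims.join(" ") };
}
```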

We use a "Zero-Knowledge" architecture: audio is processed by the browser's native engine, and the transcript is stored temporarily in your browser's localStorage (so you don't lose work if you refresh). The text is never transmitted to our backend.
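
Persisting a draft locally needs only a key-value store. In this sketch the store is passed in, so `window.localStorage` can be used in the browser; the key name `transcript-draft` and the function names are hypothetical.

```typescript
// The subset of the Web Storage interface this sketch relies on.
interface KeyValueStore {
  getItem(key: string): string | null;
  setItem(key: string, value: string): void;
}

const DRAFT_KEY = "transcript-draft"; // hypothetical key name

// Returns false if the store rejects the write (for example, when full).
function saveDraft(store: KeyValueStore, text: string): boolean {
  try {
    store.setItem(DRAFT_KEY, text);
    return true;
  } catch {
    return false;
  }
}

// Returns the saved draft, or an empty string when none exists.
function loadDraft(store: KeyValueStore): string {
  return store.getItem(DRAFT_KEY) ?? "";
}

// In a browser (not executed here):
//   saveDraft(window.localStorage, editor.value);
//   editor.value = loadDraft(window.localStorage);
```

Nothing in this flow involves a network request, which is consistent with the text staying on the device.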

The interface is WCAG 2.1 compliant and can be operated entirely from the keyboard. Use "Tab" to move between controls, and press "Space" or "Enter" to activate buttons. Press "Ctrl + Space" (or "Cmd + Space" on macOS) to toggle the microphone on or off from anywhere on the page.
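
The microphone shortcut can be detected with a small predicate on the key event. The sketch below uses a minimal event shape with the `code`, `ctrlKey`, and `metaKey` fields found on browser `KeyboardEvent` objects; the names `KeyLike`, `isMicToggle`, and `nextMicState` are illustrative assumptions.

```typescript
// The subset of KeyboardEvent fields needed to recognize the shortcut.
interface KeyLike {
  code: string;     // physical key, "Space" for the space bar
  ctrlKey: boolean;
  metaKey: boolean; // Cmd on macOS
}

function isMicToggle(event: KeyLike): boolean {
  return event.code === "Space" && (event.ctrlKey || event.metaKey);
}

// Returns the new microphone state after handling a key event.
function nextMicState(micOn: boolean, event: KeyLike): boolean {
  return isMicToggle(event) ? !micOn : micOn;
}

// In a browser (not executed here):
//   document.addEventListener("keydown", (e) => {
//     if (isMicToggle(e)) e.preventDefault();
//     micOn = nextMicState(micOn, e);
//   });
```

Note that some operating systems reserve certain combinations for themselves (on macOS, Cmd + Space usually opens Spotlight), in which case the page may never receive the event.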