Text to Speech & Speech to Text
Convert text to natural speech or transcribe your voice in 50+ languages — free, no login needed.
How to Use
Choose a Tab
Select "Text to Speech" to convert written text into spoken audio, or "Speech to Text" to transcribe your voice into written text.
Text to Speech — Type & Configure
Enter your text, pick a voice and language, set the speed, pitch, and volume using the sliders, then click Play. The word being spoken is highlighted in real time.
Speech to Text — Start Recording
Select your language, click "Start Recording", and speak naturally. The transcript appears instantly. Interim words show in grey and finalise as you pause.
Save Your Results
Copy the transcript to clipboard or download it as a .txt file. Use the Clear button to reset and start a new session.
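The real-time word highlighting mentioned in the Text to Speech step can be approximated with the Web Speech API's boundary events, which report a character offset into the spoken text. The following is a minimal sketch, not this tool's actual code: the names `BoundaryLike`, `WordSpan`, and `wordSpanFor` are hypothetical, and it assumes the highlight should run to the next whitespace when the engine does not report a length.

```typescript
// Minimal shape of the fields read here. In a browser they come from the
// SpeechSynthesisEvent passed to utterance.onboundary.
interface BoundaryLike {
  name: string;        // "word" or "sentence"
  charIndex: number;   // offset of the boundary in the utterance text
  charLength?: number; // not reported by every engine
}

// Hypothetical result type: the slice of text to highlight (end is exclusive).
interface WordSpan {
  start: number;
  end: number;
}

// Work out which slice of the text to highlight for one boundary event.
// Falls back to scanning for the next whitespace when charLength is missing.
function wordSpanFor(text: string, boundary: BoundaryLike): WordSpan | null {
  if (boundary.name !== "word") return null;
  const start = boundary.charIndex;
  if (start < 0 || start >= text.length) return null;
  if (boundary.charLength && boundary.charLength > 0) {
    return { start, end: Math.min(text.length, start + boundary.charLength) };
  }
  const gap = text.slice(start).search(/\s/);
  return { start, end: gap === -1 ? text.length : start + gap };
}
```

In a browser this would be called from `utterance.onboundary`, wrapping `text.slice(span.start, span.end)` in a highlighted element. Some voices may not fire word boundaries at all, so the highlight is best treated as optional.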
Speech Tools — Text to Speech and Voice to Text in Your Browser
Your browser has built-in speech synthesis and speech recognition features that are more capable than most people realise. Text-to-speech converts written text into spoken audio using your device's voice engine. Speech recognition (voice-to-text) transcribes spoken words into text using your microphone. Both work directly in the browser without installing any software.
Text to speech
Paste any text and have it read aloud. Control the voice (from the available voices on your device), the speed (rate), and the pitch. The available voices depend on what your operating system and browser provide — Windows, macOS, iOS, and Android each ship with different voice packages, and some have more natural-sounding options than others.
Most modern devices include at least one English voice. Some devices include regional English accents (Indian English, British English, Australian English). Hindi and other Indian-language voices are available on some Android devices through Google's text-to-speech engine, but availability varies by device and Android version.
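For readers curious how the voice, speed, and pitch controls map onto the browser, here is a minimal sketch using the Web Speech API's `speechSynthesis` and `SpeechSynthesisUtterance`. It is an illustration under assumptions, not this tool's implementation: `SpeechSettings`, `normalizeSettings`, and `speak` are hypothetical names. The ranges used for clamping (rate 0.1 to 10, pitch 0 to 2, volume 0 to 1) are the ones the Web Speech API defines.

```typescript
interface SpeechSettings {
  rate: number;   // 1 is normal speed
  pitch: number;  // 1 is normal pitch
  volume: number; // 1 is full volume
}

const clamp = (value: number, min: number, max: number): number =>
  Math.min(max, Math.max(min, value));

// Fill in defaults and keep slider values inside the ranges the API accepts.
function normalizeSettings(input: Partial<SpeechSettings>): SpeechSettings {
  return {
    rate: clamp(input.rate ?? 1, 0.1, 10),
    pitch: clamp(input.pitch ?? 1, 0, 2),
    volume: clamp(input.volume ?? 1, 0, 1),
  };
}

// Browser only. Returns false where speech synthesis is unavailable.
function speak(
  text: string,
  voiceName: string | null,
  input: Partial<SpeechSettings>
): boolean {
  const g = globalThis as any;
  if (!g.speechSynthesis || !g.SpeechSynthesisUtterance) return false;

  const utterance = new g.SpeechSynthesisUtterance(text);
  const settings = normalizeSettings(input);
  utterance.rate = settings.rate;
  utterance.pitch = settings.pitch;
  utterance.volume = settings.volume;

  // Use the requested voice if the device has it; otherwise keep the default.
  const voice = g.speechSynthesis
    .getVoices()
    .find((candidate: any) => candidate.name === voiceName);
  if (voice) utterance.voice = voice;

  g.speechSynthesis.cancel(); // stop anything still being spoken
  g.speechSynthesis.speak(utterance);
  return true;
}
```

In Chrome the list returned by `getVoices()` can be empty until the `voiceschanged` event has fired, so a voice dropdown is usually filled in from that event as well as on page load.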
Speech recognition (voice to text)
Click the microphone button and speak — your words are transcribed to text in real time. This uses the Web Speech API, which on most browsers routes your audio through the browser vendor's speech recognition service (Google's service for Chrome, Apple's for Safari). The transcription accuracy is generally good for clear speech in a quiet environment.
The transcription appears as you speak, with the final text appearing when you pause or stop. You can continue speaking to append more text, then copy the transcribed result when you're done.
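The split between interim text and final text described above can be sketched as follows. This assumes the tool uses the Web Speech API's `SpeechRecognition` with `interimResults` enabled. The helper names (`TranscriptState`, `applyResults`, `startDictation`) and the small `...Like` interfaces are hypothetical stand-ins for the browser's own event types.

```typescript
interface AlternativeLike { transcript: string; }
interface ResultLike {
  isFinal: boolean;
  readonly [index: number]: AlternativeLike; // [0] is the best alternative
}
interface ResultEventLike {
  resultIndex: number;            // first result that changed in this event
  results: ArrayLike<ResultLike>;
}

interface TranscriptState {
  finalText: string;   // committed text
  interimText: string; // provisional text, e.g. shown in grey
}

// Fold one result event into the transcript: final results are appended,
// interim results replace the previous provisional text.
function applyResults(state: TranscriptState, update: ResultEventLike): TranscriptState {
  let finalText = state.finalText;
  let interimText = "";
  for (let i = update.resultIndex; i < update.results.length; i++) {
    const result = update.results[i];
    if (!result) continue;
    const transcript = result[0]?.transcript ?? "";
    if (result.isFinal) finalText += transcript;
    else interimText += transcript;
  }
  return { finalText, interimText };
}

// Browser only. Returns a stop function, or null where recognition is missing.
function startDictation(
  lang: string,
  onUpdate: (state: TranscriptState) => void
): (() => void) | null {
  const g = globalThis as any;
  const Recognition = g.SpeechRecognition ?? g.webkitSpeechRecognition;
  if (!Recognition) return null;

  const recognition = new Recognition();
  recognition.lang = lang;            // e.g. "en-IN"
  recognition.continuous = true;      // keep listening across pauses
  recognition.interimResults = true;  // deliver provisional text too

  let state: TranscriptState = { finalText: "", interimText: "" };
  recognition.onresult = (update: ResultEventLike) => {
    state = applyResults(state, update);
    onUpdate(state);
  };
  recognition.start();
  return () => recognition.stop();
}
```

Keeping `applyResults` free of browser objects means the transcript logic can be checked without a microphone. Only `startDictation` touches the recognition service.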
Common use cases
Accessibility: Text-to-speech is fundamentally an accessibility tool. For users with visual impairments, reading difficulties such as dyslexia, or fatigue from long hours of screen reading, having text read aloud makes content more accessible. Fully sighted users benefit too: listening while commuting or exercising is another way to take in written content.
Proofreading: Having your own writing read back to you catches errors that your eyes skip over when reading. The disconnect between your "intended" reading and what the text actually says becomes obvious when heard rather than seen. Writers and editors use this technique deliberately.
Language learning: Listen to how text you've written in a language you're learning is pronounced. This is useful for checking that your written sentences sound natural in English or another language.
Hands-free note taking: Dictate notes, ideas, or draft content when typing is impractical, for example while cooking, sitting in a parked car, or working with physical materials. Voice-to-text captures your thoughts without needing a keyboard.
Listening to long documents: Have a long article, document, or email read aloud so you can take it in while doing something else. This is particularly useful for people who absorb content faster by listening than by reading.
Draft transcription: Speaking a first draft of an email, document, or message is often faster than typing, especially for people who think better out loud. Voice-to-text gives you a rough draft that you can then edit.
Tips
For voice-to-text, speak clearly and at a moderate pace. The recognition engine handles natural speech well but struggles with very fast speech, with accents that are under-represented in its training data, and with a lot of background noise. A headset microphone usually produces noticeably better results than a built-in laptop microphone.
For text-to-speech, punctuation affects natural-sounding output. Commas and periods create natural pauses. Sentences that run together without punctuation sound rushed and unnatural. If the spoken output sounds wrong, check the punctuation in your input text.
Limitations
Speech recognition requires microphone permission, and your browser will ask for it the first time. If you declined previously, you'll need to reset the permission in your browser settings. Microphone access is only active while you're recording; nothing is recorded in the background.
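When microphone permission has been declined, the browser reports an error code instead of starting recognition. A minimal sketch of turning those codes into readable messages is shown below. The codes are the ones defined by the Web Speech API for `SpeechRecognitionErrorEvent.error`. The function name and the message wording are hypothetical and are not taken from this tool.

```typescript
// Map a Web Speech API recognition error code to a message for the user.
function describeRecognitionError(code: string): string {
  switch (code) {
    case "not-allowed":
    case "service-not-allowed":
      return "Microphone access is blocked. Reset the permission in your browser settings.";
    case "audio-capture":
      return "No working microphone was found.";
    case "no-speech":
      return "No speech was detected. Please try again.";
    case "network":
      return "The speech recognition service could not be reached.";
    case "language-not-supported":
      return "The selected language is not supported for recognition.";
    case "aborted":
      return "Recording was stopped.";
    default:
      return `Speech recognition failed (${code}).`;
  }
}
```

In a browser this would typically be wired up as `recognition.onerror = (e) => showMessage(describeRecognitionError(e.error))`, where `showMessage` is whatever the page uses to display a notice.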
Voice-to-text accuracy for Indian English accents varies. Google Chrome's speech recognition generally handles Indian English well, but less common accents, regional pronunciation patterns, and code-switching between English and Hindi mid-sentence may produce transcription errors. Always review and edit the transcription before using it.
The Web Speech API is not uniformly supported across all browsers. Chrome has the best support for both TTS and speech recognition. Firefox has limited speech recognition support. Safari supports TTS well but speech recognition support varies. On mobile, the experience depends heavily on the device and browser version.
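Because support differs between browsers, a page can check for each half of the Web Speech API before showing the matching tab. The sketch below is one way to do that. `SpeechSupport` and `detectSpeechSupport` are hypothetical names, and the check for the `webkit`-prefixed constructor reflects how Chrome and Safari currently expose recognition.

```typescript
interface SpeechSupport {
  synthesis: boolean;   // text to speech
  recognition: boolean; // speech to text
}

// Inspect a global-like object (window in a browser) for the speech interfaces.
function detectSpeechSupport(scope: object): SpeechSupport {
  return {
    synthesis: "speechSynthesis" in scope && "SpeechSynthesisUtterance" in scope,
    recognition: "SpeechRecognition" in scope || "webkitSpeechRecognition" in scope,
  };
}
```

Calling `detectSpeechSupport(window)` in a browser returns the real answer for that browser. Passing the scope as a parameter also lets the function be exercised with plain objects.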
Frequently Asked Questions
Which browsers support speech recognition?
Google Chrome and Microsoft Edge support the speech recognition part of the Web Speech API. Firefox has experimental support, but it is disabled by default. Safari on iOS supports recognition in newer versions. For the most reliable experience, use Chrome or Edge on desktop.
Does text to speech work offline?
It depends on your operating system. Most browsers play back voices installed on your device. Some voices are downloaded automatically the first time they are used and then work offline. Some high-quality cloud voices may require an internet connection. Local system voices (marked "offline" in the voice list) work without a connection.
Is my audio sent to a server?
In Chrome, the Web Speech API sends audio to Google's speech recognition servers for transcription. This is handled by the browser, not by this website. Other browsers may process audio differently, and Firefox's recognition support is experimental. This tool itself never receives, stores, or transmits your audio or transcription text.
How many voices are available?
The number of available voices depends on your operating system and browser. Windows typically offers 20–40 voices, macOS offers 50+ with additional downloadable options, and Chrome on Android provides 10–30. Use the pill buttons above the voice dropdown to filter by language and quickly find voices for a specific language.
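A language filter and an "offline" marker like the ones described in these answers can be built from two fields the browser exposes on every voice: `lang` and `localService`. The sketch below is an illustration only. `VoiceLike`, `baseLanguage`, `groupByLanguage`, and `voiceLabel` are hypothetical names, and the label format is made up. Splitting on an underscore as well as a hyphen is an assumption, there to cover engines that report tags such as `en_IN`.

```typescript
// The subset of SpeechSynthesisVoice fields used here.
interface VoiceLike {
  name: string;
  lang: string;          // BCP 47 tag such as "en-IN"
  localService: boolean; // true when the voice runs on the device
}

// "en-IN" -> "en". Also accepts "en_IN" (assumption: some engines use "_").
function baseLanguage(tag: string): string {
  return tag.split(/[-_]/)[0]?.toLowerCase() ?? "";
}

// Bucket voices by base language so one filter button can cover all regions.
function groupByLanguage(voices: readonly VoiceLike[]): Map<string, VoiceLike[]> {
  const groups = new Map<string, VoiceLike[]>();
  for (const voice of voices) {
    const key = baseLanguage(voice.lang);
    const bucket = groups.get(key) ?? [];
    bucket.push(voice);
    groups.set(key, bucket);
  }
  return groups;
}

// Text for one dropdown entry, flagging voices that work without a connection.
function voiceLabel(voice: VoiceLike): string {
  return voice.localService
    ? `${voice.name} (${voice.lang}, offline)`
    : `${voice.name} (${voice.lang})`;
}
```

In a browser the input would be `speechSynthesis.getVoices()`, and each key of the returned map would become one filter button.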
Are Indian languages supported?
Yes. For speech-to-text, Hindi, Tamil, Telugu, Bengali, Marathi, Gujarati, Kannada, Malayalam, Punjabi, and Urdu are all supported in the language selector. For text-to-speech, the availability of Indian language voices depends on your operating system. Windows and macOS include some by default, and additional voices can be installed through system settings.
Can I download my transcript?
Yes. Once your transcription is complete, click the "Download .txt" button to save the full transcript as a plain text file directly to your device. You can also click "Copy Transcript" to copy the text to your clipboard and paste it into any application.
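For the curious, saving and copying a transcript can be done entirely in the browser, which fits the statement that the tool does not transmit your text. The sketch below shows one common approach using a `Blob` download link and `navigator.clipboard.writeText`. It is not this tool's code: the function names and the `transcript-YYYY-MM-DD.txt` file name pattern are hypothetical.

```typescript
// Hypothetical naming scheme: transcript-2024-03-05.txt (date in UTC).
function transcriptFileName(date: Date): string {
  return `transcript-${date.toISOString().slice(0, 10)}.txt`;
}

// Browser only. Offers the text as a .txt download; returns false elsewhere.
function downloadTranscript(text: string, fileName: string): boolean {
  const g = globalThis as any;
  if (!g.document || !g.Blob || !g.URL?.createObjectURL) return false;

  const blob = new g.Blob([text], { type: "text/plain;charset=utf-8" });
  const url = g.URL.createObjectURL(blob);
  const link = g.document.createElement("a");
  link.href = url;
  link.download = fileName;
  link.click();
  g.URL.revokeObjectURL(url); // release the temporary object URL
  return true;
}

// Browser only. Resolves to false if the clipboard is unavailable or refused.
async function copyTranscript(text: string): Promise<boolean> {
  const clipboard = (globalThis as any).navigator?.clipboard;
  if (!clipboard?.writeText) return false;
  try {
    await clipboard.writeText(text);
    return true;
  } catch {
    return false;
  }
}
```

Clipboard writes generally need a secure (HTTPS) page and a user action such as a button click, which is why `copyTranscript` reports failure instead of throwing.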