What languages are supported for Streaming Speech-to-text?

Language support for Streaming STT depends on the model you select. AssemblyAI offers several streaming models with different language capabilities:

Available Models and Language Support

Model                               Languages Supported
Universal-3 Pro Streaming           English, Spanish, German, French, Portuguese, Italian (native code switching)
Universal-Streaming English         English only
Universal-Streaming Multilingual    English, Spanish, German, French, Portuguese, Italian (per turn)
Whisper Streaming                   99+ languages (auto-detected)

Choosing the Right Model

  • If you need the highest accuracy with multilingual code switching support, use Universal-3 Pro Streaming (u3-rt-pro).
  • If you only need English transcription at the lowest cost, use Universal-Streaming English (universal-streaming-english).
  • If you need multilingual support at a lower cost, use Universal-Streaming Multilingual (universal-streaming-multilingual).
  • If you need support for languages beyond the six listed above, use Whisper Streaming (whisper-rt), which supports 99+ languages.
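
The selection rules above can be sketched as a small helper. This is only an illustration: the function name choose_streaming_model, the need_code_switching flag, and the two-letter language codes are assumptions made for this example and are not part of the AssemblyAI API. Only the model identifiers come from the list above.

```python
# Hypothetical helper that mirrors the guidance above; not an AssemblyAI API.
# The two-letter codes are an assumed shorthand for the six listed languages.
SIX_LANGUAGES = {"en", "es", "de", "fr", "pt", "it"}


def choose_streaming_model(languages, need_code_switching=False):
    """Return a speech_model identifier for the given language needs."""
    langs = {code.lower() for code in languages}
    if not langs:
        raise ValueError("at least one language is required")
    # Anything outside the six listed languages needs Whisper Streaming.
    if not langs <= SIX_LANGUAGES:
        return "whisper-rt"
    # Highest accuracy with multilingual code switching.
    if need_code_switching:
        return "u3-rt-pro"
    # English only, at the lowest cost.
    if langs == {"en"}:
        return "universal-streaming-english"
    # Multilingual support at a lower cost.
    return "universal-streaming-multilingual"


if __name__ == "__main__":
    print(choose_streaming_model(["en"]))
    print(choose_streaming_model(["en", "es"], need_code_switching=True))
    print(choose_streaming_model(["ja"]))
```

The helper checks for unsupported languages first, because Whisper Streaming is the only model listed here that covers languages beyond the six.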

For more details on model selection, see the Model selection page.

Difference From Pre-recorded STT

The language_code parameter mentioned in some AssemblyAI documentation applies to Pre-recorded STT, not Streaming STT. For Streaming STT, you select a model with the speech_model parameter, and the model you select determines which languages are supported.
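
As a rough sketch of how speech_model might be supplied when opening a streaming session, the snippet below builds a connection URL with the model as a query parameter. The endpoint URL is a placeholder, and passing speech_model and sample_rate as query parameters is an assumption for this example; check the Streaming STT API reference for the actual endpoint and connection parameters.

```python
from urllib.parse import urlencode

# Placeholder endpoint (assumption): replace with the one in the API reference.
STREAMING_ENDPOINT = "wss://streaming.example.invalid/ws"


def build_streaming_url(speech_model, sample_rate=16000, base_url=STREAMING_ENDPOINT):
    """Build a connection URL that selects the streaming model.

    Note that no language_code is sent: for Streaming STT the language
    coverage follows from the chosen speech_model.
    """
    params = {"speech_model": speech_model, "sample_rate": sample_rate}
    return f"{base_url}?{urlencode(params)}"


if __name__ == "__main__":
    print(build_streaming_url("universal-streaming-multilingual"))
```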

To stay informed about new features and improvements, including language support updates, you can follow our Changelog.