Model selection | AssemblyAI

The speech_model connection parameter lets you specify which model to use for streaming transcription.

speech_model is required

You must include the speech_model parameter in every streaming transcription request. There is no default model. If you omit speech_model, the request will fail.
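
As a minimal sketch of this requirement, the helper below refuses to build a connection URL when speech_model is missing, so the mistake surfaces before any request is sent. The function name `build_streaming_url` is hypothetical and not part of any SDK; the endpoint and parameter names are the ones used in the end-to-end example on this page.

```python
from urllib.parse import urlencode

# Endpoint as used in the end-to-end example on this page.
API_ENDPOINT_BASE_URL = "wss://streaming.assemblyai.com/v3/ws"


def build_streaming_url(params: dict) -> str:
    """Return the WebSocket URL, failing early if speech_model is absent.

    Hypothetical helper: it only mirrors the rule that there is no default model.
    """
    if not params.get("speech_model"):
        raise ValueError("speech_model is required; there is no default model")
    return f"{API_ENDPOINT_BASE_URL}?{urlencode(params)}"


url = build_streaming_url({"sample_rate": 16000, "speech_model": "u3-rt-pro"})
print(url)  # wss://streaming.assemblyai.com/v3/ws?sample_rate=16000&speech_model=u3-rt-pro
```

Checking on the client side is a design choice, not a requirement: the server rejects the request either way, but a local error is easier to trace than a failed connection.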

Recommended model

We recommend Universal-3 Pro Streaming as your primary model for streaming transcription. It provides the highest accuracy with sub-300ms latency, native multilingual code switching, and advanced prompting support, which makes it well suited to voice agents and real-time applications.

Available models

| Name | Parameter | Description | Best for |
| --- | --- | --- | --- |
| Universal-3 Pro Streaming | `"speech_model": "u3-rt-pro"` | The most accurate model with the fastest word emissions for voice agents that demand the highest quality. Best-in-class accuracy with advanced prompting capabilities. Supports EN, ES, DE, FR, PT, IT. | Real-time voice agents needing premium accuracy, elite entity accuracy, IVR replacement, agent assist, multilingual code-switching |
| Universal-Streaming English | `"speech_model": "universal-streaming-english"` | An English transcription model offering a good balance of speed and cost-effectiveness. | Cost-effective English real-time transcription, English-only real-time apps |
| Universal-Streaming Multilingual | `"speech_model": "universal-streaming-multilingual"` | A multilingual transcription model offering a good balance of speed and cost-effectiveness. Supports EN, ES, DE, FR, PT, IT. | Cost-effective multilingual streaming across EN/ES/DE/FR/PT/IT |
| Whisper Streaming | `"speech_model": "whisper-rt"` | An open-source Whisper model enhanced with AssemblyAI’s reliable infrastructure and unlimited scale. Supports 99+ languages at an accessible price point. | Language coverage beyond 6 languages, open-source model preference, cost-sensitive multilingual transcription |
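
The parameter values in the table can be kept in one place and checked before connecting. The sketch below is one way to do that, with the values and language lists copied from the table. The names `SPEECH_MODELS` and `validate_speech_model` are hypothetical and not part of any SDK.

```python
SIX_LANGUAGES = frozenset({"EN", "ES", "DE", "FR", "PT", "IT"})

# Keys are the speech_model values from the table above.
SPEECH_MODELS = {
    "u3-rt-pro": {
        "name": "Universal-3 Pro Streaming",
        "languages": SIX_LANGUAGES,
    },
    "universal-streaming-english": {
        "name": "Universal-Streaming English",
        "languages": frozenset({"EN"}),
    },
    "universal-streaming-multilingual": {
        "name": "Universal-Streaming Multilingual",
        "languages": SIX_LANGUAGES,
    },
    # The table lists "99+ languages" without enumerating them,
    # so None here means "not enumerated in this sketch".
    "whisper-rt": {
        "name": "Whisper Streaming",
        "languages": None,
    },
}


def validate_speech_model(value: str) -> str:
    """Return value unchanged if it is a known speech_model, else raise."""
    if value not in SPEECH_MODELS:
        known = ", ".join(sorted(SPEECH_MODELS))
        raise ValueError(f"unknown speech_model {value!r}; expected one of: {known}")
    return value


print(SPEECH_MODELS[validate_speech_model("whisper-rt")]["name"])  # Whisper Streaming
```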

Choosing a model

| Feature | Universal-3 Pro Streaming | Universal-Streaming English | Universal-Streaming Multilingual | Whisper Streaming |
| --- | --- | --- | --- | --- |
| Latency | Fast | Fastest | Fast | Moderate |
| Partial transcripts | Yes | Yes | Yes | Yes |
| Multilingual | Native Code Switching | No | Per Turn | 99+ languages (auto-detected) |
| Entity accuracy | Best | Okay | Okay | Okay |
| Disfluencies & filler words | Yes | No | No | No |
| Language detection | Yes | No | Yes | Yes (with confidence scores) |
| Non-speech tags | No | No | No | Yes (`[Silence]`, `[Music]`, etc.) |
| Customization | Keyterms prompting (known context) + Native prompting (unknown context) | Keyterms prompting (known context) | Keyterms prompting (known context) | No |
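
The comparison can be turned into a small decision helper. The sketch below is one possible reading of the table, not an official selection rule: the function name `choose_speech_model`, its arguments, and the order of the checks are all assumptions made for illustration.

```python
from typing import Iterable

# Languages listed for the Universal models in the tables above.
SIX_LANGUAGES = {"EN", "ES", "DE", "FR", "PT", "IT"}


def choose_speech_model(
    languages: Iterable[str],
    *,
    cost_sensitive: bool = False,
    needs_native_prompting: bool = False,
    needs_non_speech_tags: bool = False,
) -> str:
    """Pick a speech_model value from a few requirements (hypothetical helper)."""
    langs = {lang.upper() for lang in languages}
    if not langs:
        raise ValueError("at least one language is required")

    # Only Whisper Streaming covers languages beyond the six and emits non-speech tags.
    if needs_non_speech_tags or not langs <= SIX_LANGUAGES:
        if needs_native_prompting:
            raise ValueError("no single model in the table meets both requirements")
        return "whisper-rt"

    # Universal-3 Pro Streaming is the recommended default and the only model
    # with native prompting.
    if needs_native_prompting or not cost_sensitive:
        return "u3-rt-pro"

    # Cost-sensitive workloads fall back to the Universal-Streaming models.
    if langs == {"EN"}:
        return "universal-streaming-english"
    return "universal-streaming-multilingual"


print(choose_speech_model(["en", "es"], cost_sensitive=True))  # universal-streaming-multilingual
```

Other rows of the table, such as latency or entity accuracy, are not modeled here and may matter more for a given application.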

For detailed setup and configuration of Universal-3 Pro Streaming, see the Universal-3 Pro Streaming page. For prompting guidance, see the Prompting guide.

For detailed setup and configuration of Whisper Streaming, see the Whisper Streaming documentation.

End-to-end example

You can select a model by setting the speech_model connection parameter when connecting to the streaming API:

Python

import pyaudio
import websocket
import json
import threading
import time
from urllib.parse import urlencode

YOUR_API_KEY = "<YOUR_API_KEY>"

CONNECTION_PARAMS = {
    "sample_rate": 16000,
    "speech_model": "u3-rt-pro",  # or "universal-streaming-english", "universal-streaming-multilingual", "whisper-rt"
    "min_turn_silence": 100,
    "max_turn_silence": 1000,
    # "format_turns": True,  # Whether to return formatted final transcripts (not applicable to u3-rt-pro)
}
API_ENDPOINT_BASE_URL = "wss://streaming.assemblyai.com/v3/ws"
API_ENDPOINT = f"{API_ENDPOINT_BASE_URL}?{urlencode(CONNECTION_PARAMS)}"

FRAMES_PER_BUFFER = 800
SAMPLE_RATE = CONNECTION_PARAMS["sample_rate"]
CHANNELS = 1
FORMAT = pyaudio.paInt16

audio = None
stream = None
ws_app = None
audio_thread = None
stop_event = threading.Event()

def on_open(ws):
    print("WebSocket connection opened.")
    def stream_audio():
        global stream
        while not stop_event.is_set():
            try:
                audio_data = stream.read(FRAMES_PER_BUFFER, exception_on_overflow=False)
                ws.send(audio_data, websocket.ABNF.OPCODE_BINARY)
            except Exception as e:
                print(f"Error streaming audio: {e}")
                break

    global audio_thread
    audio_thread = threading.Thread(target=stream_audio)
    audio_thread.daemon = True
    audio_thread.start()

def on_message(ws, message):
    try:
        data = json.loads(message)
        msg_type = data.get("type")

        if msg_type == "Begin":
            print(f"Session began: ID={data.get('id')}")
        elif msg_type == "Turn":
            transcript = data.get("transcript", "")
            end_of_turn = data.get("end_of_turn", False)
            if end_of_turn:
                print(f"\r{' ' * 80}\r{transcript}")
            else:
                print(f"\r{transcript}", end="")
        elif msg_type == "Termination":
            print(f"\nSession terminated: {data.get('audio_duration_seconds', 0)}s of audio")
    except Exception as e:
        print(f"Error handling message: {e}")

def on_error(ws, error):
    print(f"\nWebSocket Error: {error}")
    stop_event.set()

def on_close(ws, close_status_code, close_msg):
    print(f"\nWebSocket Disconnected: Status={close_status_code}")
    global stream, audio
    stop_event.set()
    if stream:
        if stream.is_active():
            stream.stop_stream()
        stream.close()
    if audio:
        audio.terminate()

def run():
    global audio, stream, ws_app

    audio = pyaudio.PyAudio()
    stream = audio.open(
        input=True,
        frames_per_buffer=FRAMES_PER_BUFFER,
        channels=CHANNELS,
        format=FORMAT,
        rate=SAMPLE_RATE,
    )
    print("Speak into your microphone. Press Ctrl+C to stop.")

    ws_app = websocket.WebSocketApp(
        API_ENDPOINT,
        header={"Authorization": YOUR_API_KEY},
        on_open=on_open,
        on_message=on_message,
        on_error=on_error,
        on_close=on_close,
    )

    ws_thread = threading.Thread(target=ws_app.run_forever)
    ws_thread.daemon = True
    ws_thread.start()

    try:
        while ws_thread.is_alive():
            time.sleep(0.1)
    except KeyboardInterrupt:
        print("\nStopping...")
        stop_event.set()
        if ws_app and ws_app.sock and ws_app.sock.connected:
            ws_app.send(json.dumps({"type": "Terminate"}))
            time.sleep(2)
        if ws_app:
            ws_app.close()
        ws_thread.join(timeout=2.0)

if __name__ == "__main__":
    run()
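
The Turn handling in on_message can be exercised without a microphone or a live connection by factoring it into a pure function. The sketch below does that under two assumptions: `parse_turn` is a hypothetical name, and the sample payloads are made up for illustration, containing only the fields the example above reads (`type`, `transcript`, `end_of_turn`).

```python
import json
from typing import Optional, Tuple


def parse_turn(message: str) -> Optional[Tuple[str, bool]]:
    """Return (transcript, end_of_turn) for a Turn message, or None otherwise."""
    data = json.loads(message)
    if data.get("type") != "Turn":
        return None
    return data.get("transcript", ""), bool(data.get("end_of_turn", False))


# Illustrative payloads only; real messages may carry additional fields.
partial = json.dumps({"type": "Turn", "transcript": "hello", "end_of_turn": False})
final = json.dumps({"type": "Turn", "transcript": "hello world", "end_of_turn": True})

print(parse_turn(partial))  # ('hello', False)
print(parse_turn(final))    # ('hello world', True)
```

Because the function takes a string and returns plain values, the same checks can run against any of the four models by replaying recorded messages.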