parakeet-tdt-0.6b-v3
by nvidia
600M multilingual ASR with 25-language support and automatic language detection
nvidia/parakeet-tdt-0.6b-v3
mixpeek://transcription@v1/nvidia_parakeet_tdt_v3
Overview
Parakeet TDT 0.6B v3 is NVIDIA's multilingual speech-to-text model, built on the FastConformer-TDT architecture and trained on over 670,000 hours of audio from NVIDIA's Granary dataset. It extends the English-only v2 to 25 European languages with automatic language detection, achieving a 6.34% average WER on the Hugging Face Open ASR Leaderboard while keeping throughput among the highest of any multilingual model.
On Mixpeek, Parakeet TDT powers cost-efficient multilingual transcription pipelines where Whisper-class accuracy is needed at lower compute cost. Its 600M parameter count and FastConformer architecture deliver excellent throughput for batch processing large audio and video archives across European languages.
Architecture
The model pairs a FastConformer encoder with a Token-and-Duration Transducer (TDT) decoder and has 600M parameters. It uses a unified SentencePiece tokenizer with an 8,192-token vocabulary, supports audio up to 3 hours long via local attention mode, and identifies the spoken language automatically across its 25 supported languages.
Mixpeek SDK Integration
import { Mixpeek } from "mixpeek";

const mx = new Mixpeek({ apiKey: "API_KEY" });

await mx.collections.ingest({
  collection_id: "my-collection",
  source: { url: "https://example.com/interview.mp4" },
  feature_extractors: [
    {
      name: "transcription",
      version: "v1",
      params: { model_id: "nvidia/parakeet-tdt-0.6b-v3" }
    }
  ]
});
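For batch processing an archive, the same request body can be generated once per source URL. The following is a minimal sketch under stated assumptions: `buildTranscriptionRequest` and `ingestAll` are hypothetical helpers, and `IngestClient` is a hand-written interface that covers only the `collections.ingest` call shown above. It is not the SDK's actual type definition, and the SDK may offer its own batch facilities.

```typescript
// Shape of the request body used in the SDK example above.
interface TranscriptionRequest {
  collection_id: string;
  source: { url: string };
  feature_extractors: Array<{
    name: string;
    version: string;
    params: { model_id: string };
  }>;
}

// Assumption: a minimal stand-in for the client, covering only the one
// method used in the example above. Not the SDK's real typings.
interface IngestClient {
  collections: { ingest(request: TranscriptionRequest): Promise<unknown> };
}

const PARAKEET_MODEL_ID = "nvidia/parakeet-tdt-0.6b-v3";

// Hypothetical helper: builds the ingest request for one media URL.
function buildTranscriptionRequest(
  collectionId: string,
  url: string
): TranscriptionRequest {
  return {
    collection_id: collectionId,
    source: { url },
    feature_extractors: [
      {
        name: "transcription",
        version: "v1",
        params: { model_id: PARAKEET_MODEL_ID },
      },
    ],
  };
}

// Hypothetical helper: submits the URLs one at a time and returns how many
// requests were sent. Sequential submission keeps the sketch simple.
async function ingestAll(
  client: IngestClient,
  collectionId: string,
  urls: string[]
): Promise<number> {
  let submitted = 0;
  for (const url of urls) {
    await client.collections.ingest(buildTranscriptionRequest(collectionId, url));
    submitted += 1;
  }
  return submitted;
}
```

Passing the `mx` instance from the example above as `client` would reuse the same call for every URL. Submitting sequentially is a deliberate simplification; a real pipeline might add concurrency limits and retries.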
Capabilities
- 25 European languages with automatic detection
- 1.93% WER on LibriSpeech test-clean
- 6.34% average WER on Open ASR Leaderboard
- Audio up to 3 hours via local attention mode
- Word-level timestamps included
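Word-level timestamps make it straightforward to derive caption-style segments from a transcript. The sketch below groups consecutive words into captions, starting a new caption after a long pause or when a caption would run too long. The `WordTimestamp` shape, the `groupWords` helper, and both thresholds are assumptions for illustration; the actual field names in a transcription result may differ.

```typescript
// Assumption: one entry per recognized word, with times in seconds.
interface WordTimestamp {
  word: string;
  start: number;
  end: number;
}

interface Caption {
  start: number;
  end: number;
  text: string;
}

// Hypothetical helper: splits a word sequence into captions whenever the
// pause before a word exceeds maxGapSec, or adding the word would make the
// caption longer than maxDurationSec.
function groupWords(
  words: WordTimestamp[],
  maxGapSec = 0.6,
  maxDurationSec = 5
): Caption[] {
  const captions: Caption[] = [];
  let current: WordTimestamp[] = [];

  // Emit the words collected so far as one caption.
  const flush = (): void => {
    if (current.length === 0) return;
    captions.push({
      start: current[0].start,
      end: current[current.length - 1].end,
      text: current.map((w) => w.word).join(" "),
    });
    current = [];
  };

  for (const w of words) {
    if (current.length > 0) {
      const last = current[current.length - 1];
      const gap = w.start - last.end;
      const duration = w.end - current[0].start;
      if (gap > maxGapSec || duration > maxDurationSec) flush();
    }
    current.push(w);
  }
  flush();
  return captions;
}
```

The pause threshold tends to matter more than the duration cap for readable captions, since pauses usually coincide with phrase boundaries.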
Use Cases on Mixpeek
Benchmarks
| Dataset | Metric | Score | Source |
|---|---|---|---|
| LibriSpeech test-clean | WER | 1.93% | NVIDIA, 2025 — Model Card |
| LibriSpeech test-other | WER | 3.59% | NVIDIA, 2025 — Model Card |
| Open ASR Leaderboard (avg) | WER | 6.34% | NVIDIA, 2025 — Model Card |
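For readers comparing these scores with their own data: WER (word error rate) is the number of word substitutions, deletions, and insertions needed to turn the model output into the reference transcript, divided by the number of reference words. The sketch below computes it with a word-level edit distance. `wordErrorRate` is an illustrative helper, not the scoring code behind the table; published benchmarks typically normalize casing and punctuation before scoring, which this sketch does not do.

```typescript
// Illustrative helper: WER = (substitutions + deletions + insertions) / reference words,
// computed as a Levenshtein distance over whitespace-separated words.
function wordErrorRate(reference: string, hypothesis: string): number {
  const ref = reference.trim().split(/\s+/).filter(Boolean);
  const hyp = hypothesis.trim().split(/\s+/).filter(Boolean);
  if (ref.length === 0) {
    throw new Error("reference must contain at least one word");
  }

  // prev[j] holds the edit distance between the reference words processed
  // so far and the first j hypothesis words.
  let prev: number[] = Array.from({ length: hyp.length + 1 }, (_, j) => j);

  for (let i = 1; i <= ref.length; i++) {
    const curr: number[] = [i];
    for (let j = 1; j <= hyp.length; j++) {
      const substitutionCost = ref[i - 1] === hyp[j - 1] ? 0 : 1;
      curr[j] = Math.min(
        prev[j] + 1, // deletion: reference word missing from hypothesis
        curr[j - 1] + 1, // insertion: extra word in hypothesis
        prev[j - 1] + substitutionCost // match or substitution
      );
    }
    prev = curr;
  }

  return prev[hyp.length] / ref.length;
}

// One substituted word out of four reference words gives a WER of 0.25 (25%).
const exampleWer = wordErrorRate("a b c d", "a x c d");
```

Because insertions count as errors, WER can exceed 100% when the hypothesis is much longer than the reference.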
Performance
Specification
Research Paper
Canary-1B-v2 & Parakeet-TDT-0.6B-v3: Efficient Multilingual ASR
arxiv.org
Build a pipeline with parakeet-tdt-0.6b-v3
Add this model to a processing pipeline alongside other extractors. Combine with retrieval stages for end-to-end search.