Qwen3-ASR-1.7B
by Qwen
State-of-the-art open-source ASR for 52 languages with streaming and offline modes
Qwen/Qwen3-ASR-1.7B
mixpeek://transcription@v1/qwen3_asr_1b_v1
Overview
Qwen3-ASR-1.7B is Alibaba's flagship open-source speech recognition model, supporting 52 languages and dialects. It combines a 300M-parameter AuT audio encoder with a Qwen3-1.7B decoder, achieving state-of-the-art performance among open-source ASR models and competing with established systems such as OpenAI Whisper large v3 and the strongest proprietary APIs.
On Mixpeek, Qwen3-ASR powers multilingual transcription pipelines that need language coverage beyond European languages. Its dual-mode architecture supports both streaming inference with 1-8 second chunks and offline processing of long recordings, so the same model serves real-time and batch workloads.
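To make the 1-8 second chunking concrete, here is a minimal sketch of splitting a buffer of audio samples into fixed-length chunks before streaming them. It is illustrative only: the 16 kHz sample rate and the `chunkAudio` helper are assumptions, not part of the Mixpeek SDK or the Qwen3-ASR release, and the real streaming path may choose chunk sizes dynamically.

```typescript
// Assumption: 16 kHz mono PCM. The page does not state the model's sample rate.
const SAMPLE_RATE_HZ = 16000;

// Hypothetical helper: split samples into consecutive chunks of `chunkSeconds`,
// clamped to the 1-8 second range described above.
function chunkAudio(samples: Float32Array, chunkSeconds: number): Float32Array[] {
  const seconds = Math.min(8, Math.max(1, chunkSeconds));
  const chunkSize = Math.round(seconds * SAMPLE_RATE_HZ);
  const chunks: Float32Array[] = [];
  for (let start = 0; start < samples.length; start += chunkSize) {
    // subarray returns a view on the same buffer, so no audio data is copied
    chunks.push(samples.subarray(start, Math.min(start + chunkSize, samples.length)));
  }
  return chunks;
}

// 5 seconds of silence split into 2-second chunks: the last chunk is shorter.
const demoChunks = chunkAudio(new Float32Array(5 * SAMPLE_RATE_HZ), 2);
const demoDurations = demoChunks.map((c) => c.length / SAMPLE_RATE_HZ);
console.log(demoDurations); // [ 2, 2, 1 ]
```

Shorter chunks reduce latency at the cost of less context per chunk; longer chunks do the opposite.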
Architecture
The AuT audio encoder (300M parameters, attention-encoder-decoder design, hidden size 1024) compresses audio 8x into 12.5 Hz representations. A Qwen3-1.7B decoder generates the text. A dynamic flash attention window of 1 to 8 seconds enables both streaming and offline inference. Total: 1.7B parameters.
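The numbers above imply a simple relationship between audio duration and the length of the sequence the decoder sees. The sketch below works through that arithmetic. The 100 Hz input frame rate is derived from the stated 8x compression and 12.5 Hz output, not quoted from the source, and the exact rounding and padding used by the model are not specified here, so the counts are approximate.

```typescript
const ENCODER_OUTPUT_HZ = 12.5; // stated: 12.5 Hz representations
const COMPRESSION_FACTOR = 8;   // stated: 8x compression

// Derived, not stated: 12.5 Hz * 8 = 100 input frames per second (a 10 ms hop).
const INPUT_FRAME_HZ = ENCODER_OUTPUT_HZ * COMPRESSION_FACTOR;

// Approximate number of encoder output frames for `seconds` of audio.
// Real implementations round or pad; that detail is an open assumption.
function approxEncoderFrames(seconds: number): number {
  return seconds * ENCODER_OUTPUT_HZ;
}

console.log(INPUT_FRAME_HZ);         // 100
console.log(approxEncoderFrames(1)); // 12.5
console.log(approxEncoderFrames(8)); // 100
```

Under these assumptions, the 1s-8s attention window spans roughly 12 to 100 encoder frames.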
Mixpeek SDK Integration
    import { Mixpeek } from "mixpeek";

    const mx = new Mixpeek({ apiKey: "API_KEY" });

    await mx.collections.ingest({
      collection_id: "my-collection",
      source: { url: "https://example.com/multilingual-video.mp4" },
      feature_extractors: [
        {
          name: "transcription",
          version: "v1",
          params: {
            model_id: "Qwen/Qwen3-ASR-1.7B"
          }
        }
      ]
    });
Capabilities
- 52 languages and dialects with automatic language detection
- 1.63% WER on LibriSpeech Clean (offline mode)
- Streaming inference with 1-8 second dynamic chunks
- Timestamp prediction for word-level alignment
- Competitive with strongest proprietary ASR APIs
Benchmarks
| Dataset | Metric | Score | Source |
|---|---|---|---|
| LibriSpeech Clean (offline) | WER | 1.63% | Alibaba, Jan 2026 — Technical Report |
| LibriSpeech Other (offline) | WER | 3.38% | Alibaba, Jan 2026 — Technical Report |
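For readers unfamiliar with the metric, word error rate (WER) is the word-level edit distance between a reference transcript and the model output, divided by the number of reference words. The sketch below is a generic illustration, not the evaluation script behind the scores above. The `wordErrorRate` function and the example sentences are made up for this purpose, and published benchmarks typically apply text normalization (casing, punctuation) that is omitted here.

```typescript
// WER = (substitutions + deletions + insertions) / number of reference words,
// computed with a word-level Levenshtein distance.
function wordErrorRate(reference: string, hypothesis: string): number {
  const ref = reference.trim().split(/\s+/).filter(Boolean);
  const hyp = hypothesis.trim().split(/\s+/).filter(Boolean);
  if (ref.length === 0) {
    throw new Error("reference must contain at least one word");
  }
  // prev[j] holds the distance between the ref words processed so far
  // and the first j hypothesis words.
  let prev: number[] = Array.from({ length: hyp.length + 1 }, (_, j) => j);
  for (let i = 1; i <= ref.length; i++) {
    const curr: number[] = [i];
    for (let j = 1; j <= hyp.length; j++) {
      const cost = ref[i - 1] === hyp[j - 1] ? 0 : 1;
      curr.push(
        Math.min(
          prev[j] + 1,        // deletion: reference word missing from output
          curr[j - 1] + 1,    // insertion: extra word in output
          prev[j - 1] + cost, // match or substitution
        ),
      );
    }
    prev = curr;
  }
  return prev[hyp.length] / ref.length;
}

// One substitution ("sat" -> "sit") and one deletion ("the") over 6 reference words.
const demoWer = wordErrorRate("the cat sat on the mat", "the cat sit on mat");
console.log((demoWer * 100).toFixed(2) + "%"); // 33.33%
```

By this definition, a 1.63% WER corresponds to fewer than two word errors per hundred reference words.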
Research Paper
Qwen3-ASR Technical Report
arxiv.org
Build a pipeline with Qwen3-ASR-1.7B
Add this model to a processing pipeline alongside other extractors. Combine with retrieval stages for end-to-end search.