pplx-embed-v1-0.6b
by perplexity-ai
Diffusion-pretrained 0.6B text embeddings with INT8 quantization — SOTA at sub-1B scale
perplexity-ai/pplx-embed-v1-0.6b
mixpeek://text_extractor@v1/perplexity_pplx_embed_v1_06b
Overview
pplx-embed-v1-0.6B is Perplexity AI's lightweight text embedding model, built on a Qwen3 base with diffusion-based continued pre-training and bidirectional attention. It natively produces INT8-quantized embeddings, which cuts storage by 4x relative to FP32 while maintaining retrieval quality. At 0.6B parameters, it scores 68.6 nDCG@10 on MTEB Retrieval, ahead of the similarly sized Qwen3-Embed-0.6B (61.2) and BGE-M3 (62.3).
The model supports 32K context length and 1024-dimensional embeddings, with optional binary quantization for 32x storage reduction. On Mixpeek, pplx-embed provides a fast, storage-efficient embedding backbone for text-heavy retrieval pipelines where index size and inference cost are primary constraints.
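The 4x and 32x figures follow directly from the bits stored per dimension (32 for FP32, 8 for INT8, 1 for binary). The sketch below works through that arithmetic for 1024-dimensional vectors. The corpus size of one million vectors and the helper name `indexBytes` are illustrative assumptions, and the calculation covers raw vector payload only, not index overhead.

```typescript
// Bits stored per embedding dimension under each format.
const BITS_PER_DIM = { fp32: 32, int8: 8, binary: 1 } as const;
type Format = keyof typeof BITS_PER_DIM;

// Raw bytes needed to store `numVectors` embeddings of `dims` dimensions.
// Hypothetical helper: ignores any index or metadata overhead.
function indexBytes(numVectors: number, dims: number, format: Format): number {
  return (numVectors * dims * BITS_PER_DIM[format]) / 8;
}

const DIMS = 1024; // embedding width stated in the overview
const NUM_VECTORS = 1_000_000; // assumed corpus size, for illustration only

const fp32 = indexBytes(NUM_VECTORS, DIMS, "fp32");
const int8 = indexBytes(NUM_VECTORS, DIMS, "int8");
const binary = indexBytes(NUM_VECTORS, DIMS, "binary");

console.log(fp32 / int8); // 4
console.log(fp32 / binary); // 32
```

Under these assumptions a single vector takes 4096 bytes in FP32, 1024 bytes in INT8, and 128 bytes in binary form.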
Architecture
Bidirectional-attention transformer built on Qwen3 with diffusion-based continued pre-training. 0.6B parameters. 32K context length. Natively produces INT8-quantized 1024-dimensional embeddings. Supports binary quantization for 32x storage reduction.
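To make the binary option concrete, here is a minimal sketch of one common binarization scheme: keep only the sign of each dimension, pack eight dimensions into a byte, and compare vectors by Hamming distance. This is an assumption for illustration; the page does not say how pplx-embed derives its binary embeddings, and the function names `binarize` and `hammingDistance` are hypothetical.

```typescript
// Pack the sign of each dimension into one bit (1 if value > 0, else 0).
// Assumed scheme: sign thresholding at zero, most significant bit first.
function binarize(embedding: Int8Array): Uint8Array {
  const packed = new Uint8Array(Math.ceil(embedding.length / 8));
  for (let i = 0; i < embedding.length; i++) {
    if (embedding[i] > 0) {
      packed[i >> 3] |= 1 << (7 - (i & 7));
    }
  }
  return packed;
}

// Hamming distance: number of bit positions where the two codes differ.
function hammingDistance(a: Uint8Array, b: Uint8Array): number {
  if (a.length !== b.length) throw new Error("length mismatch");
  let distance = 0;
  for (let i = 0; i < a.length; i++) {
    let diff = a[i] ^ b[i];
    while (diff !== 0) {
      diff &= diff - 1; // clear the lowest set bit
      distance++;
    }
  }
  return distance;
}

// Toy 8-dimensional vectors; real embeddings would have 1024 dimensions.
const a = binarize(Int8Array.of(12, -3, 0, 45, -7, 8, -1, 2));
const b = binarize(Int8Array.of(10, 4, -2, 40, -9, -8, -1, 3));

console.log(a[0]); // 149 (0b10010101)
console.log(hammingDistance(a, b)); // 2
```

With this packing, a 1024-dimensional embedding occupies 128 bytes, which matches the 32x reduction relative to FP32.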
Mixpeek SDK Integration
import { Mixpeek } from "mixpeek";

const mx = new Mixpeek({ apiKey: "API_KEY" });

// Managed: create a collection over a bucket; Mixpeek runs this model's extractor
const collection = await mx.collections.create({
  namespace_id: "my-namespace",
  collection_name: "my-collection",
  source: { type: "bucket", bucket_ids: ["bkt_your_bucket"] },
  feature_extractor: {
    feature_extractor_name: "text_embedding",
    version: "v1",
    parameters: { model_id: "perplexity-ai/pplx-embed-v1-0.6b" },
  },
});
Capabilities
- 68.6 nDCG@10 on MTEB Retrieval — SOTA at sub-1B scale
- Native INT8 quantization (4x storage reduction)
- Optional binary embeddings (32x storage reduction)
- 32K context window for long documents
- Beats BGE-M3 and Qwen3-Embed-0.6B on retrieval benchmarks
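As a rough illustration of working with INT8 embeddings directly, the sketch below scores two `Int8Array` vectors with cosine similarity, accumulating in ordinary JavaScript numbers so the products cannot overflow the 8-bit range. Cosine similarity is an assumed scoring function here; the page does not state which similarity measure the model or Mixpeek uses, and `cosineSimilarity` is a hypothetical helper.

```typescript
// Cosine similarity between two INT8 embeddings.
// Sums are kept in double-precision numbers, not in 8-bit storage.
function cosineSimilarity(a: Int8Array, b: Int8Array): number {
  if (a.length !== b.length) throw new Error("length mismatch");
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  // Define similarity with an all-zero vector as 0 to avoid dividing by zero.
  if (normA === 0 || normB === 0) return 0;
  return dot / Math.sqrt(normA * normB);
}

// Toy 2-dimensional vectors; real embeddings would have 1024 dimensions.
console.log(cosineSimilarity(Int8Array.of(3, 4), Int8Array.of(3, 4))); // 1
console.log(cosineSimilarity(Int8Array.of(1, 0), Int8Array.of(0, 1))); // 0
```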
Benchmarks
| Dataset | Metric | Score | Source |
|---|---|---|---|
| MTEB Retrieval (en) | nDCG@10 | 68.6 | Perplexity AI, 2026 — arxiv:2602.11151 |
| BERGEN End-to-End RAG | Avg score | Beats Qwen3-embedding-4B on 3/5 tasks | Perplexity AI, 2026 — arxiv:2602.11151 |
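For readers unfamiliar with the metric, nDCG@10 divides the discounted cumulative gain of the top 10 results by the gain of an ideal ordering. The sketch below uses the linear-gain form with a log2 position discount and assumes the input array holds relevance labels for every judged document in ranked order. It is not the benchmark's evaluation harness, and the function names are hypothetical. It returns a value between 0 and 1, whereas the scores in the table appear to be on a 0 to 100 scale.

```typescript
// Discounted cumulative gain over the first k results (linear gain, log2 discount).
function dcgAtK(relevances: number[], k: number): number {
  return relevances
    .slice(0, k)
    .reduce((sum, rel, i) => sum + rel / Math.log2(i + 2), 0);
}

// nDCG@k: DCG of the given ranking divided by DCG of the ideal ranking.
// Assumes `ranked` contains labels for all judged documents of the query.
function ndcgAtK(ranked: number[], k: number): number {
  const ideal = [...ranked].sort((x, y) => y - x);
  const idealDcg = dcgAtK(ideal, k);
  return idealDcg === 0 ? 0 : dcgAtK(ranked, k) / idealDcg;
}

// Relevant documents at ranks 1 and 3; the ideal ordering has them at ranks 1 and 2.
console.log(ndcgAtK([1, 0, 1, 0, 0], 10).toFixed(3)); // "0.920"
```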
Research Paper
pplx-embed: State-of-the-Art Embedding Models for Web-Scale Retrieval
arxiv.org
Build a pipeline with pplx-embed-v1-0.6b
Add this model to a processing pipeline alongside other extractors. Combine with retrieval stages for end-to-end search.