Why Migrate
Weaviate introduced multimodal capabilities with modules like multi2vec-clip and img2vec-neural. Mixpeek builds on this direction but takes a fundamentally different approach.
Where Weaviate adds vector search to a database, Mixpeek starts from the file itself. A single video becomes transcripts, visual embeddings, scene descriptions, and detected entities, each independently searchable, all stored across cost tiers, and reassembled through configurable pipelines.
| Weaviate | Mixpeek |
|---|---|
| Vectorization modules bolt onto storage | Feature extraction is the core of the pipeline |
| GraphQL queries with vector search | Multi-stage retrieval pipelines: search, filter, rerank, enrich |
| Single storage tier | Tiered storage: hot, cold, archive, up to 90% savings |
| Schema-per-class design | Namespace-level organization with collections per processing pipeline |
| Module-based multimodal support | Native decomposition: one file becomes many searchable layers |
Concept Mapping
| Weaviate | Mixpeek | Notes |
|---|---|---|
| Class | Collection | Defines processing pipeline and feature extraction for a data type |
| Property | Document field | Documents contain extracted features, metadata, and source lineage |
| Object (with vector) | Document (with features) | Documents hold multiple feature types, not just one vector |
| Module (e.g., text2vec-openai) | Feature Extractor | Built-in extractors: multimodal, image, text, face-identity, document, and more |
| GraphQL query | Retriever execution | Retrievers are multi-stage pipelines, not single queries |
| nearText / nearVector | Semantic search stage | One stage in a larger pipeline |
| where filter | Attribute filter stage | Filters compose as pipeline stages alongside search and ranking |
| Cross-reference | Semantic JOIN / Taxonomy | Connect documents across collections using vector similarity |
Migration Steps
Map Classes to Collections
Each Weaviate class becomes a Mixpeek collection with a feature extractor. Instead of choosing a vectorization module, you choose an extractor that matches your content type.
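As a rough sketch of this mapping, the snippet below translates a Weaviate class definition into a collection spec. The input follows Weaviate's schema shape (`class`, `vectorizer`, `properties`). The output keys, the `class_to_collection` helper, and the module-to-extractor pairing are assumptions made for illustration, not the Mixpeek API; only the extractor names come from the concept table above.

```python
# Assumed pairing of Weaviate vectorizer modules with extractor names.
# The extractor names appear in the concept table; the pairing itself
# is illustrative and should be checked against your content.
MODULE_TO_EXTRACTOR = {
    "text2vec-openai": "text",
    "multi2vec-clip": "multimodal",
    "img2vec-neural": "image",
}


def class_to_collection(weaviate_class: dict, namespace: str) -> dict:
    """Translate a Weaviate class definition into a hypothetical collection spec."""
    module = weaviate_class.get("vectorizer", "none")
    if module not in MODULE_TO_EXTRACTOR:
        raise ValueError(f"No extractor mapping for module {module!r}")
    return {
        "namespace": namespace,
        "collection_name": weaviate_class["class"].lower(),
        "feature_extractor": MODULE_TO_EXTRACTOR[module],
        # Weaviate properties carry over as document fields.
        "fields": [p["name"] for p in weaviate_class.get("properties", [])],
    }


product_class = {
    "class": "Product",
    "vectorizer": "multi2vec-clip",
    "properties": [
        {"name": "title", "dataType": ["text"]},
        {"name": "image", "dataType": ["blob"]},
    ],
}

collection_spec = class_to_collection(product_class, namespace="catalog")
print(collection_spec)
```

Raising on an unmapped module is deliberate: it surfaces classes that need a manual extractor choice rather than silently picking a default.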
Re-ingest Your Data
Upload source files through the Mixpeek pipeline instead of importing Weaviate objects. The pipeline extracts richer features than a single vectorization module.
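One way to prepare for re-ingestion is to turn exported Weaviate objects into a manifest of source files. The sketch below assumes each exported object keeps a pointer to its original file in a property named `source_url`; that property name, the manifest layout, and the `build_ingest_manifest` helper are all hypothetical. Stored vectors are dropped on purpose, since features are re-extracted from the source files.

```python
def build_ingest_manifest(exported_objects, source_field="source_url"):
    """Split exported objects into upload records and IDs that lack a source file."""
    manifest, skipped = [], []
    for obj in exported_objects:
        props = dict(obj.get("properties", {}))
        source = props.pop(source_field, None)
        if source is None:
            # No original file to re-process; needs manual handling.
            skipped.append(obj["id"])
            continue
        manifest.append({
            "collection": obj["class"].lower(),
            "source_file": source,
            # Remaining properties travel along as metadata; the old vector is not copied.
            "metadata": {**props, "weaviate_id": obj["id"]},
        })
    return manifest, skipped


exported = [
    {"class": "Video", "id": "obj-1", "vector": [0.1, 0.2],
     "properties": {"title": "Launch demo", "source_url": "s3://bucket/demo.mp4"}},
    {"class": "Video", "id": "obj-2", "vector": [0.3, 0.4],
     "properties": {"title": "No file on record"}},
]

manifest, skipped = build_ingest_manifest(exported)
print(manifest)
print(skipped)
```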
Replace GraphQL Queries with Retrievers
Weaviate’s GraphQL queries map to Mixpeek retrievers. The difference: retrievers chain multiple stages together.
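To make the correspondence concrete, the sketch below takes the arguments of a Weaviate `Get` query (`nearText`, `where`, `limit`) and produces a list of stages. The GraphQL string is shown only for reference. The stage names, the retriever layout, and the `to_retriever` helper are assumptions for illustration, not the actual Mixpeek retriever schema.

```python
# A Weaviate GraphQL query of the kind being replaced (reference only, not executed).
WEAVIATE_QUERY = """
{
  Get {
    Article(
      nearText: {concepts: ["solar energy"]}
      where: {path: ["year"], operator: GreaterThan, valueInt: 2020}
      limit: 5
    ) { title year }
  }
}
"""

# Assumed short names for a few Weaviate filter operators.
OPERATORS = {"Equal": "eq", "GreaterThan": "gt", "LessThan": "lt"}


def to_retriever(collection, concepts, where=None, limit=10):
    """Express nearText + where as an ordered list of hypothetical stages."""
    stages = [{"stage": "semantic_search", "query": " ".join(concepts)}]
    if where is not None:
        # Weaviate encodes the value type in the key name (valueInt, valueText, ...).
        value_key = next(k for k in where if k.startswith("value"))
        stages.append({
            "stage": "attribute_filter",
            "field": ".".join(where["path"]),
            "op": OPERATORS[where["operator"]],
            "value": where[value_key],
        })
    return {"collection": collection, "stages": stages, "limit": limit}


retriever = to_retriever(
    "article",
    concepts=["solar energy"],
    where={"path": ["year"], "operator": "GreaterThan", "valueInt": 2020},
    limit=5,
)
print(retriever)
```

Keeping `limit` on the retriever rather than on the search stage means the filter does not shrink the result set below the requested size in this sketch.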
Build Multi-Stage Pipelines
Go beyond what Weaviate’s query language supports. Chain semantic search with attribute filters, reranking, and enrichment in a single retriever.
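The chaining idea can be illustrated with a small in-memory pipeline. This is a toy model, not Mixpeek code: the documents, the two-dimensional embeddings, and the stage helpers are invented for the example. Each stage is a function from a list of documents to a list of documents, so stages compose in any order.

```python
import math

# Toy documents with made-up 2-D embeddings and attributes.
DOCS = [
    {"id": "a", "embedding": [1.0, 0.0], "year": 2019, "views": 50},
    {"id": "b", "embedding": [0.9, 0.1], "year": 2022, "views": 10},
    {"id": "c", "embedding": [0.8, 0.6], "year": 2023, "views": 900},
    {"id": "d", "embedding": [0.0, 1.0], "year": 2023, "views": 5},
]


def cosine(u, v):
    dot = sum(x * y for x, y in zip(u, v))
    return dot / (math.hypot(*u) * math.hypot(*v))


def semantic_search(query_vec, top_k):
    def stage(docs):
        scored = [{**d, "score": cosine(query_vec, d["embedding"])} for d in docs]
        return sorted(scored, key=lambda d: d["score"], reverse=True)[:top_k]
    return stage


def attribute_filter(field, predicate):
    return lambda docs: [d for d in docs if predicate(d[field])]


def rerank(key):
    return lambda docs: sorted(docs, key=key, reverse=True)


def enrich(field, fn):
    return lambda docs: [{**d, field: fn(d)} for d in docs]


def run_pipeline(docs, stages):
    # Feed the output of each stage into the next.
    for stage in stages:
        docs = stage(docs)
    return docs


results = run_pipeline(DOCS, [
    semantic_search([1.0, 0.0], top_k=3),            # keep the 3 nearest documents
    attribute_filter("year", lambda y: y >= 2022),    # drop older documents
    rerank(key=lambda d: d["views"]),                 # reorder by popularity
    enrich("label", lambda d: f"{d['id']} ({d['year']})"),
])
print([r["label"] for r in results])  # ['c (2023)', 'b (2022)']
```

Here the search stage keeps `a`, `b`, and `c`, the filter removes `a`, and the rerank puts `c` ahead of `b` because it has more views.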
What You Gain
| Capability | Weaviate | Mixpeek |
|---|---|---|
| File decomposition | Vectorize one property at a time | Decompose a file into multiple searchable layers automatically |
| Multi-stage retrieval | Single query with optional filters | Chain search, filter, rerank, and enrich stages in one pipeline |
| Tiered storage | All data at one storage tier | Hot, cold, and archive tiers, up to 90% savings |
| Cross-modal search | Modules per class, limited cross-modal | Native cross-modal: text query finds video moments, audio segments, image regions |
| No infrastructure to manage | Self-hosted or managed cluster | Fully managed API: no clusters, no module configuration |
| Complete lineage | Objects in a class | Trace results back through document, object, and source file |
Next Steps
Quickstart
Get Mixpeek running in 10 minutes
Feature Extractors
Learn about automatic feature extraction
Retrievers
Build multi-stage retrieval pipelines
Core Concepts
Understand the data model

