    HF · Segmentation · Apache 2.0

    sam3.1

    by facebook

    7x faster multi-object tracking via Object Multiplex shared-memory architecture

    270K downloads/month
    848M params

    Identifiers

    Model ID: facebook/sam3.1
    Feature URI: mixpeek://image_extractor@v1/facebook_sam31_v1
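
The Feature URI above appears to pack three parts into one string: an extractor name, a version, and a feature name. The sketch below splits it apart. The layout `mixpeek://<extractor>@<version>/<feature>` is inferred from this single example rather than from a published spec, and `parseFeatureUri` and `FeatureUri` are hypothetical names, not part of the Mixpeek SDK.

```typescript
// Hypothetical helper: the URI layout is an assumption inferred from one example.
interface FeatureUri {
  extractor: string; // e.g. "image_extractor"
  version: string;   // e.g. "v1"
  feature: string;   // e.g. "facebook_sam31_v1"
}

function parseFeatureUri(uri: string): FeatureUri | null {
  // Assumed shape: mixpeek://<extractor>@<version>/<feature>
  const match = /^mixpeek:\/\/([^@\/]+)@([^\/]+)\/(.+)$/.exec(uri);
  if (match === null) return null; // not in the assumed format
  const [, extractor = "", version = "", feature = ""] = match;
  return { extractor, version, feature };
}

const parsedUri = parseFeatureUri("mixpeek://image_extractor@v1/facebook_sam31_v1");
console.log(parsedUri);
```

Returning `null` for anything that does not match keeps the caller from silently working with a half-parsed identifier.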

    Overview

    SAM 3.1 is Meta's update to the Segment Anything Model 3 that introduces Object Multiplex, a shared-memory approach for joint multi-object tracking. Instead of processing each object independently, SAM 3.1 bundles all tracked objects into a single forward pass with global reasoning, delivering 7x faster inference at 128 objects on a single H100 GPU while improving accuracy on 6 of 7 video segmentation benchmarks.

    On Mixpeek, SAM 3.1 replaces SAM 3 as the default segmentation model for video analytics pipelines that track multiple objects at once. The Object Multiplex architecture halves VRAM usage (from 8 GB to 4 GB in FP16) while doubling throughput from 16 to 32 FPS, making multi-object tracking practical for production-scale video processing.

    Architecture

    SAM 3.1 keeps the detector-tracker architecture of SAM 3 (848M parameters) and adds the Object Multiplex extension. Object Multiplex uses shared memory to process up to 16 objects jointly per forward pass, reasoning globally across all tracked objects at once. The DETR-based detector is conditioned on text prompts, geometric prompts, and image exemplars.
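
The page states a limit of 16 objects per forward pass but also reports results at 128 tracked objects. It does not say how larger object counts are handled. One plausible reading, shown here purely as an assumption, is that tracked objects are split into groups of at most 16, with one forward pass per group. `groupForMultiplex` is a hypothetical helper, not an SDK or model API.

```typescript
// Limit taken from the architecture description; the grouping strategy is an assumption.
const MAX_OBJECTS_PER_PASS = 16;

// Split a list of tracked objects into consecutive groups of at most `groupSize`.
function groupForMultiplex<T>(objects: T[], groupSize: number = MAX_OBJECTS_PER_PASS): T[][] {
  if (!Number.isInteger(groupSize) || groupSize < 1) {
    throw new RangeError("groupSize must be a positive integer");
  }
  const groups: T[][] = [];
  for (let i = 0; i < objects.length; i += groupSize) {
    groups.push(objects.slice(i, i + groupSize));
  }
  return groups;
}

// 128 tracked object IDs, as in the headline benchmark setting.
const objectIds = Array.from({ length: 128 }, (_, i) => i);
const multiplexGroups = groupForMultiplex(objectIds);
console.log(multiplexGroups.length); // 8
```

Under this assumption, 128 objects would take 8 joint passes per frame instead of 128 independent ones.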

    Mixpeek SDK Integration

    import { Mixpeek } from "mixpeek";

    // Replace "API_KEY" with your Mixpeek API key.
    const mx = new Mixpeek({ apiKey: "API_KEY" });

    // Ingest a video and run SAM 3.1 segmentation with Object Multiplex enabled.
    await mx.collections.ingest({
      collection_id: "my-collection",
      source: { url: "https://example.com/video.mp4" },
      feature_extractors: [{
        name: "segmentation",
        version: "v1",
        params: {
          model_id: "facebook/sam3.1",
          enable_multiplex: true
        }
      }]
    });
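
When the same extractor settings are reused across several ingest calls, it can help to build the `feature_extractors` entry in one place. The sketch below does that without calling the SDK. The field names mirror the snippet above; the `SegmentationExtractor` type and the `sam31Extractor` helper are hypothetical and only reflect the fields shown on this page, not a documented schema.

```typescript
// Hypothetical type: covers only the fields shown in the snippet above.
interface SegmentationExtractor {
  name: "segmentation";
  version: string;
  params: {
    model_id: string;
    enable_multiplex: boolean;
  };
}

// Build the SAM 3.1 extractor entry; multiplex is on by default, as in the page's example.
function sam31Extractor(enableMultiplex: boolean = true): SegmentationExtractor {
  return {
    name: "segmentation",
    version: "v1",
    params: {
      model_id: "facebook/sam3.1",
      enable_multiplex: enableMultiplex,
    },
  };
}

// Request body with the same shape as the ingest call above.
const ingestRequest = {
  collection_id: "my-collection",
  source: { url: "https://example.com/video.mp4" },
  feature_extractors: [sam31Extractor()],
};
console.log(JSON.stringify(ingestRequest, null, 2));
```

The resulting object could then be passed to `mx.collections.ingest(...)` as in the original snippet.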

    Capabilities

    • 7x faster than SAM 3 at 128 tracked objects (H100)
    • Object Multiplex: joint multi-object tracking in single forward pass
    • Improved on 6/7 VOS benchmarks including +2.0 on MOSEv2
    • Half the VRAM of SAM 3 (4GB vs 8GB FP16)
    • 32 FPS multi-object tracking on H100

    Use Cases on Mixpeek

    • Multi-object video tracking at scale: people, products, vehicles across surveillance feeds
    • Brand and logo tracking across advertising content with dozens of simultaneous objects
    • Production video editing with real-time multi-object segmentation and masking

    Benchmarks

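    Dataset             | Metric | Score                       | Source
    --------------------|--------|-----------------------------|---------------------------------
    MOSEv2 (video seg.) | J&F    | +2.0 over SAM 3             | Meta, Mar 2026 (SAM 3.1 Release)
    SA-V (video seg.)   | J&F    | Improved on 6/7 benchmarks  | Meta, Mar 2026 (SAM 3.1 Release)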
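
For readers unfamiliar with the metric: in video object segmentation, J&F is conventionally the mean of region similarity J (intersection-over-union of predicted and ground-truth masks) and contour accuracy F (a boundary F-measure). The sketch below computes J for two binary masks and averages it with a given F value. Computing F needs boundary matching and is left out; the function names are illustrative, and treating two empty masks as a perfect match is a choice made for this sketch.

```typescript
// Region similarity J: intersection-over-union of two binary masks (flattened).
function jaccard(pred: Uint8Array, truth: Uint8Array): number {
  if (pred.length !== truth.length) {
    throw new RangeError("masks must have the same length");
  }
  let intersection = 0;
  let union = 0;
  for (let i = 0; i < pred.length; i++) {
    const p = pred[i] !== 0;
    const t = truth[i] !== 0;
    if (p && t) intersection++;
    if (p || t) union++;
  }
  // Sketch convention: two empty masks count as a perfect match.
  return union === 0 ? 1 : intersection / union;
}

// J&F is the arithmetic mean of J and F (F is supplied by the caller here).
function jAndF(j: number, f: number): number {
  return (j + f) / 2;
}

// One overlapping pixel out of three covered pixels gives J = 1/3.
const j = jaccard(Uint8Array.of(1, 1, 0, 0), Uint8Array.of(1, 0, 1, 0));
console.log(j, jAndF(j, 0.5));
```

Scores are usually reported on a 0 to 100 scale, so the "+2.0" in the table presumably means two points of J&F.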

    Performance

    Input Size: 1024x1024 px
    GPU Latency: ~31 ms / frame (H100, 16 objects multiplex)
    GPU Throughput: ~32 FPS (H100, multi-object)
    GPU Memory: ~4 GB (FP16, half of SAM 3)
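
The latency and throughput figures above agree with each other: throughput in frames per second is 1000 divided by the per-frame latency in milliseconds. A short check, with illustrative function names, also gives the amortized per-object cost when 16 objects share one pass:

```typescript
// Convert per-frame latency (ms) to frames per second.
function fpsFromLatencyMs(latencyMs: number): number {
  if (!(latencyMs > 0)) throw new RangeError("latencyMs must be positive");
  return 1000 / latencyMs;
}

// Amortized latency per object when several objects share one forward pass.
function perObjectLatencyMs(latencyMs: number, objectsPerPass: number): number {
  if (!(objectsPerPass > 0)) throw new RangeError("objectsPerPass must be positive");
  return latencyMs / objectsPerPass;
}

const fps = fpsFromLatencyMs(31);             // ~31 ms per frame from the table
const perObject = perObjectLatencyMs(31, 16); // 16 objects multiplexed per pass
console.log(fps.toFixed(1)); // "32.3"
console.log(perObject);      // just under 2 ms per object
```

So ~31 ms per frame corresponds to the quoted ~32 FPS.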

    Specification

    Framework: HF
    Organization: facebook
    Feature: Segmentation
    Output: mask + label
    Modalities: video, image
    Retriever: Mask Filter
    Parameters: 848M
    License: Apache 2.0
    Downloads/mo: 270K

    Research Paper

    SAM 3.1: Faster Multi-Object Tracking with Object Multiplex

    arxiv.org

    Build a pipeline with sam3.1

    Add this model to a processing pipeline alongside other extractors. Combine with retrieval stages for end-to-end search.
