    Image Text To Text Models

    Browse AI models for multimodal decomposition and recomposition pipelines — plug any model into your extractors.

    506 models available

    Showing 124 of 506 models

    Image Text To Text

    Qwen/Qwen3-VL-2B-Instruct

    61.6M
    411
    transformers
    Image Text To Text

    google/gemma-4-31B-it

    10.4M
    2,763
    transformers
    Image Text To Text

    google/gemma-4-26B-A4B-it

    10.1M
    1,000
    transformers
    Image Text To Text

    Qwen/Qwen3.5-4B

    8.1M
    559
    transformers
    Image Text To Text

    Qwen/Qwen3.5-9B

    7.7M
    1,476
    transformers
    Image Text To Text

    Qwen/Qwen3-VL-8B-Instruct

    7.6M
    912
    transformers
    Image Text To Text

    Qwen/Qwen3.6-27B-FP8

    6.9M
    224
    transformers
    Image Text To Text

    zai-org/GLM-OCR

    6.1M
    1,774
    transformers
    Image Text To Text

    Qwen/Qwen3.6-35B-A3B

    6.0M
    1,887
    transformers
    Image Text To Text

    Qwen/Qwen3.6-35B-A3B-FP8

    5.1M
    230
    transformers
    Image Text To Text

    Qwen/Qwen2.5-VL-7B-Instruct

    4.7M
    1,544
    transformers
    Image Text To Text

    Qwen/Qwen2.5-VL-3B-Instruct

    4.6M
    650
    transformers
    Image Text To Text

    cyankiwi/gemma-4-26B-A4B-it-AWQ-4bit

    4.6M
    73
    transformers
    Image Text To Text

    Qwen/Qwen3.6-27B

    4.2M
    1,428
    transformers
    Image Text To Text

    Qwen/Qwen2-VL-2B-Instruct

    3.5M
    502
    transformers
    Image Text To Text

    llava-hf/llava-1.5-7b-hf

    3.4M
    361
    transformers
    Image Text To Text

    Qwen/Qwen3.5-27B

    3.3M
    976
    transformers
    Image Text To Text

    Qwen/Qwen3-VL-4B-Instruct

    3.1M
    389
    transformers
    Image Text To Text

    deepseek-ai/DeepSeek-OCR

    3.1M
    3,242
    transformers
    Image Text To Text

    unsloth/gemma-4-26B-A4B-it-GGUF

    3.0M
    779
    Image Text To Text

    Qwen/Qwen3.5-35B-A3B

    3.0M
    1,432
    transformers
    Image Text To Text

    vikhyatk/moondream2

    2.9M
    1,413
    transformers
    Image Text To Text

    google/gemma-3-12b-it

    2.8M
    720
    transformers
    Image Text To Text

    Qwen/Qwen2-VL-7B-Instruct

    2.8M
    1,275
    transformers