Developers

Get your models up and running fast on Tenstorrent hardware. With two open-source SDKs, you can get as close to the metal as possible, or let our AI compiler do the work.

Models

Explore models optimized for Tenstorrent hardware.

Don’t see yours listed? Check out TT-Forge, our compiler, to get other models running today.

44 models
bge-large-en-v1.5

Purpose-built for semantic search, dense retrieval, and RAG. BAAI, 335M.

Feature Extraction
335M
DeepSeek-R1

Mixture-of-experts reasoning powerhouse rivaling closed frontier models on math and code. DeepSeek, 671B.

Text Generation
671B
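The mixture-of-experts idea behind models like DeepSeek-R1 can be sketched with a toy top-k gate: softmax the router logits, keep the k highest-scoring experts, and renormalize their weights so each token is processed by only a few experts. This is an illustrative stdlib-Python sketch, not DeepSeek's actual routing, which adds shared experts and load balancing.

```python
import math

def route_tokens(gate_logits, k=2):
    """Top-k expert routing: softmax the gate logits, keep the k
    largest experts, renormalize their probabilities to sum to 1."""
    m = max(gate_logits)
    exps = [math.exp(x - m) for x in gate_logits]  # stable softmax numerator
    total = sum(exps)
    probs = [e / total for e in exps]
    top = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    mass = sum(probs[i] for i in top)
    return {i: probs[i] / mass for i in top}  # expert index -> routing weight

# Four experts, two selected per token.
weights = route_tokens([2.0, 0.5, 1.5, -1.0], k=2)
```

Only the selected experts run a forward pass for that token, which is how a 671B-parameter model activates a small fraction of its weights per step.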
efficientnet-b0

Compound-scaled CNN delivering state-of-the-art accuracy at a fraction of the parameters. Google, 5.3M.

Image Classification
5.3M
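EfficientNet's compound scaling grows depth, width, and input resolution together from a single coefficient phi; the base coefficients below are the grid-searched values reported in the EfficientNet paper, and the sketch just shows the arithmetic that turns the B0 baseline into the larger variants.

```python
# Compound scaling: one coefficient phi scales depth, width, and input
# resolution together, with alpha * beta**2 * gamma**2 ~ 2 so that each
# unit of phi roughly doubles FLOPs.
alpha, beta, gamma = 1.2, 1.1, 1.15  # base coefficients from the paper

def scale(phi):
    return {
        "depth": alpha ** phi,        # layer-count multiplier
        "width": beta ** phi,         # channel-count multiplier
        "resolution": gamma ** phi,   # input-size multiplier
    }

b0 = scale(0)  # EfficientNet-B0 is the phi = 0 baseline
b3 = scale(3)  # larger variants just increase phi
```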
FLUX.1 [dev]

Flow-matching transformer for photorealistic text-to-image with strong prompt adherence. Black Forest Labs, 12B.

Text-to-Image
12B
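Flow-matching training, the objective behind FLUX.1, regresses a network onto the constant velocity of a straight path from a noise sample to a data sample. A toy 1-D sketch of that training pair follows; the real model operates on latent image tensors, not scalars.

```python
def interpolate(x0, x1, t):
    """Position on the straight noise-to-data path at time t in [0, 1]."""
    return (1.0 - t) * x0 + t * x1

def velocity_target(x0, x1):
    """The constant velocity along that path, which the model learns to predict."""
    return x1 - x0

x0, x1 = 0.2, 1.0                 # noise sample and data sample (scalars for illustration)
xt = interpolate(x0, x1, 0.5)     # midpoint training input
v = velocity_target(x0, x1)       # regression target at every t
```

At sampling time, integrating the predicted velocity from t = 0 to t = 1 carries noise to an image, which is why distilled variants can cut the step count so aggressively.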
FLUX.1 [schnell]

FLUX distilled to 4 steps — full quality, fraction of the compute. Black Forest Labs, 12B.

Text-to-Image
12B
gemma-3-1b-it

Tiny and capable — instruction-tuned for edge and on-device use. Google, 1B.

Text Generation
1B
gemma-3-27b

Largest Gemma 3 base — 140-language reasoning, coding, long context. Google, 27B.

Text Generation
27B
gemma-3-27b-it

Largest Gemma 3 — 140-language reasoning, coding, long context. Google, 27B.

Text Generation
27B
gemma-3-4b-it

Punches above its weight in reasoning and code. Google, 4B.

Text Generation
4B
gpt-oss-120b

Open GPT-style model for deep reasoning in self-hosted deployments. OpenAI, 120B.

Text Generation
120B
gpt-oss-20b

Open GPT-style generation at a practical self-hosted scale. OpenAI, 20B.

Text Generation
20B
Llama-3.1-8B-Instruct

Multilingual instruction following, tool use, and function calling. Meta, 8B.

Text Generation
8B
Llama-3.2-11B-Vision-Instruct

Charts, image Q&A, visual documents — multimodal on a single accelerator. Meta, 11B.

Image-Text-to-Text
11B
Llama-3.2-1B-Instruct

Instruction-tuned to run anywhere — on-device at 1B. Meta, 1B.

Text Generation
1B
Llama-3.2-3B-Instruct

Lightweight agent backbone for low-latency, resource-constrained deployments. Meta, 3B.

Text Generation
3B
Llama-3.2-90B-Vision

Image + text inputs at full scale on Llama's reasoning foundation. Meta, 90B.

Image-Text-to-Text
90B
Llama-3.2-90B-Vision-Instruct

Document analysis, chart reading, OCR-level image understanding at full scale. Meta, 90B.

Image-Text-to-Text
90B
Llama-3.3-70B

Llama 3.1 refined — better math, code, and multilingual, same 128K context. Meta, 70B.

Text Generation
70B
Llama-3.3-70B-Instruct

Stronger structured tasks, tool use, and reasoning than prior Llama generations. Meta, 70B.

Text Generation
70B
Mistral-7B-Instruct-v0.3

Function-calling and instruction following for production and agentic use. Mistral AI, 7B.

Text Generation
7B
MobileNet V2

Inverted residuals and linear bottlenecks — strong accuracy at near-zero inference cost. Google, 3.4M.

Image Classification
3.4M
Mochi 1

Text-to-video focused on motion quality and temporal coherence. Genmo, 10B.

Text-to-Video
10B
Motif Vision 6B Preview

Preview release for text-to-video generation from natural language prompts. Motif, 6B.

Text-to-Video
6B
Qwen2.5-72B-Instruct

Instruction-tuned across Chinese, English, coding, and structured output at scale. Alibaba, 72B.

Text Generation
72B
Qwen2.5-7B-Instruct

Tuned for code, math, and structured output at an efficient scale. Alibaba, 7B.

Text Generation
7B
Qwen2.5-VL-32B-Instruct

Document understanding, chart analysis, and multi-image Q&A. Alibaba, 32B.

Image-Text-to-Text
32B
Qwen2.5-VL-3B-Instruct

Vision-language for low-latency multimodal deployment on constrained hardware. Alibaba, 3B.

Image-Text-to-Text
3B
Qwen2.5-VL-72B-Instruct

Vision-language at scale — documents, charts, and scene understanding. Alibaba, 72B.

Image-Text-to-Text
72B
Qwen2.5-VL-7B-Instruct

Visual Q&A, OCR, and document parsing in a practical VLM footprint. Alibaba, 7B.

Image-Text-to-Text
7B
Qwen3-32B

Toggleable chain-of-thought for on-demand deep reasoning. Alibaba, 32B.

Text Generation
32B
Qwen3-8B

Reasoning depth on demand without a throughput penalty. Alibaba, 8B.

Text Generation
8B
Qwen3-Embedding-4B

Multilingual embeddings for retrieval and semantic similarity. Alibaba, 4B.

Embedding
4B
Qwen3-Embedding-8B

Higher-capacity multilingual embeddings for retrieval and reranking. Alibaba, 8B.

Embedding
8B
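Embedding models like these are typically used by comparing vectors with cosine similarity: embed the query and the documents, then return the document whose vector points in the most similar direction. A minimal pure-Python retrieval sketch follows; real pipelines use the model's high-dimensional embeddings and a vector index rather than hand-written vectors.

```python
import math

def cosine(a, b):
    """Cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def top_match(query_vec, doc_vecs):
    """Index of the document embedding closest to the query."""
    return max(range(len(doc_vecs)), key=lambda i: cosine(query_vec, doc_vecs[i]))

query = [0.9, 0.1, 0.0]                     # toy query embedding
docs = [[0.0, 1.0, 0.0],                    # unrelated document
        [1.0, 0.2, 0.1],                    # near-duplicate of the query
        [0.1, 0.1, 1.0]]                    # unrelated document
best = top_match(query, docs)               # picks the near-duplicate
```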
QwQ-32B

Chain-of-thought with self-reflection for math, science, and logic. Alibaba, 32B.

Text Generation
32B
ResNet-50

Skip connections that made deep networks trainable — the image classification baseline. Microsoft Research, 25M.

Image Classification
25M
SegFormer (b0)

Mix-transformer segmentation without positional encoding — accurate at low compute. NVIDIA, 3.8M.

Image Segmentation
3.8M
SpeechT5 (TTS task)

Unified encoder-decoder for natural text-to-speech synthesis. Microsoft, 307M.

Text-to-Speech
307M
Stable Diffusion 3.5 Large

MMDiT architecture — strong text rendering and prompt control for image generation. Stability AI, 8B.

Text-to-Image
8B
SD-XL 1.0-base

Dual-encoder base for high-resolution synthesis and fine-tuning pipelines. Stability AI, 6.6B.

Text-to-Image
6.6B
unet-base-vgg

Skip connections preserve spatial detail for precise segmentation — VGG backbone. 31M.

Image Segmentation
31M
vit-base

Image patches + self-attention, no convolutions — the original vision transformer. Google, 86M.

Image Classification
86M
vovnet-19b-ra

One-Shot Aggregation avoids DenseNet's redundant paths for better accuracy-per-FLOP. 11.2M.

Image Classification
11.2M
Wan2.2

Causal video transformer for text-to-video with strong motion coherence. Alibaba, 14B.

Text-to-Video
14B
whisper-large-v3

99 languages, 680K hours of training audio — built for robust speech recognition. OpenAI, 1.5B.

Speech-to-Text
1.5B

Start a bounty

Build an open future with us. Fix bugs, add features, get paid.

Join our developer community

Get the latest info, ask questions, review our open-source repos.