Developers

Get your models up and running fast on Tenstorrent hardware. With two open-source SDKs, you can get as close to the metal as possible, or let our AI compiler do the work.

Models

Explore models optimized for Tenstorrent hardware.

Don’t see yours listed? Check out TT-Forge, our compiler, to get other models running today.

44 models
bge-large-en-v1.5

Purpose-built for semantic search, dense retrieval, and RAG. BAAI, 335M.

Feature Extraction
335M
DeepSeek-R1

Mixture-of-experts reasoning powerhouse rivaling closed frontier models on math and code. DeepSeek, 671B.

Text Generation
671B
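The mixture-of-experts idea behind models like DeepSeek-R1 can be sketched with a toy top-k gate: softmax the router logits, keep the k highest-scoring experts, and renormalize their weights so each token is processed by only a few experts. This is an illustrative stdlib-Python sketch, not DeepSeek's actual routing, which adds shared experts and load balancing.

```python
import math

def route_tokens(gate_logits, k=2):
    """Top-k expert routing: softmax the gate logits, keep the k
    largest experts, renormalize their probabilities to sum to 1."""
    m = max(gate_logits)
    exps = [math.exp(x - m) for x in gate_logits]  # stable softmax numerator
    total = sum(exps)
    probs = [e / total for e in exps]
    top = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    mass = sum(probs[i] for i in top)
    return {i: probs[i] / mass for i in top}  # expert index -> routing weight

# Four experts, two selected per token.
weights = route_tokens([2.0, 0.5, 1.5, -1.0], k=2)
```

Only the selected experts run a forward pass for that token, which is how a 671B-parameter model activates a small fraction of its weights per step.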
efficientnet-b0

Compound-scaled CNN delivering state-of-the-art accuracy at a fraction of the parameters. Google, 5.3M.

Image Classification
5.3M
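EfficientNet's compound scaling grows depth, width, and input resolution together from a single coefficient phi; the base coefficients below are the grid-searched values reported in the EfficientNet paper, and the sketch just shows the arithmetic that turns the B0 baseline into the larger variants.

```python
# Compound scaling: one coefficient phi scales depth, width, and input
# resolution together, with alpha * beta**2 * gamma**2 ~ 2 so that each
# unit of phi roughly doubles FLOPs.
alpha, beta, gamma = 1.2, 1.1, 1.15  # base coefficients from the paper

def scale(phi):
    return {
        "depth": alpha ** phi,        # layer-count multiplier
        "width": beta ** phi,         # channel-count multiplier
        "resolution": gamma ** phi,   # input-size multiplier
    }

b0 = scale(0)  # EfficientNet-B0 is the phi = 0 baseline
b3 = scale(3)  # larger variants just increase phi
```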
FLUX.1 [dev]

Flow-matching transformer for photorealistic text-to-image with strong prompt adherence. Black Forest Labs, 12B.

Text-to-Image
12B
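Flow-matching training, the objective behind FLUX.1, regresses a network onto the constant velocity of a straight path from a noise sample to a data sample. A toy 1-D sketch of that training pair follows; the real model operates on latent image tensors, not scalars.

```python
def interpolate(x0, x1, t):
    """Position on the straight noise-to-data path at time t in [0, 1]."""
    return (1.0 - t) * x0 + t * x1

def velocity_target(x0, x1):
    """The constant velocity along that path, which the model learns to predict."""
    return x1 - x0

x0, x1 = 0.2, 1.0                 # noise sample and data sample (scalars for illustration)
xt = interpolate(x0, x1, 0.5)     # midpoint training input
v = velocity_target(x0, x1)       # regression target at every t
```

At sampling time, integrating the predicted velocity from t = 0 to t = 1 carries noise to an image, which is why distilled variants can cut the step count so aggressively.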
FLUX.1 [schnell]

FLUX distilled to 4 steps — full quality, fraction of the compute. Black Forest Labs, 12B.

Text-to-Image
12B
gemma-3-1b-it

Tiny and capable — instruction-tuned for edge and on-device use. Google, 1B.

Text Generation
1B
gemma-3-27b

Largest Gemma 3 base — 140-language reasoning, coding, long context. Google, 27B.

Text Generation
27B
gemma-3-27b-it

Largest Gemma 3 — 140-language reasoning, coding, long context. Google, 27B.

Text Generation
27B
gemma-3-4b-it

Punches above its weight in reasoning and code. Google, 4B.

Text Generation
4B
gpt-oss-120b

Open GPT-style model for deep reasoning in self-hosted deployments. OpenAI, 120B.

Text Generation
120B
gpt-oss-20b

Open GPT-style generation at a practical self-hosted scale. OpenAI, 20B.

Text Generation
20B
Llama-3.1-8B-Instruct

Multilingual instruction following, tool use, and function calling. Meta, 8B.

Text Generation
8B
Llama-3.2-11B-Vision-Instruct

Charts, image Q&A, visual documents — multimodal on a single accelerator. Meta, 11B.

Image-Text-to-Text
11B
Llama-3.2-1B-Instruct

Instruction-tuned to run anywhere — on-device at 1B. Meta, 1B.

Text Generation
1B
Llama-3.2-3B-Instruct

Lightweight agent backbone for low-latency, resource-constrained deployments. Meta, 3B.

Text Generation
3B
Llama-3.2-90B-Vision

Image + text inputs at full scale on Llama's reasoning foundation. Meta, 90B.

Image-Text-to-Text
90B
Llama-3.2-90B-Vision-Instruct

Document analysis, chart reading, OCR-level image understanding at full scale. Meta, 90B.

Image-Text-to-Text
90B
Llama-3.3-70B

Llama 3.1 refined — better math, code, and multilingual, same 128K context. Meta, 70B.

Text Generation
70B
Llama-3.3-70B-Instruct

Stronger structured tasks, tool use, and reasoning than prior Llama generations. Meta, 70B.

Text Generation
70B
Mistral-7B-Instruct-v0.3

Function-calling and instruction following for production and agentic use. Mistral AI, 7B.

Text Generation
7B
MobileNet V2

Inverted residuals and linear bottlenecks — strong accuracy at near-zero inference cost. Google, 3.4M.

Image Classification
3.4M
Mochi 1

Text-to-video focused on motion quality and temporal coherence. Genmo, 10B.

Text-to-Video
10B
Motif Vision 6B Preview

Preview release for text-to-video generation from natural language prompts. Motif, 6B.

Text-to-Video
6B
Qwen2.5-72B-Instruct

Instruction-tuned across Chinese, English, coding, and structured output at scale. Alibaba, 72B.

Text Generation
72B
Qwen2.5-7B-Instruct

Tuned for code, math, and structured output at an efficient scale. Alibaba, 7B.

Text Generation
7B
Qwen2.5-VL-32B-Instruct

Document understanding, chart analysis, and multi-image Q&A. Alibaba, 32B.

Image-Text-to-Text
32B
Qwen2.5-VL-3B-Instruct

Vision-language for low-latency multimodal deployment on constrained hardware. Alibaba, 3B.

Image-Text-to-Text
3B
Qwen2.5-VL-72B-Instruct

Vision-language at scale — documents, charts, and scene understanding. Alibaba, 72B.

Image-Text-to-Text
72B
Qwen2.5-VL-7B-Instruct

Visual Q&A, OCR, and document parsing in a practical VLM footprint. Alibaba, 7B.

Image-Text-to-Text
7B
Qwen3-32B

Toggleable chain-of-thought for on-demand deep reasoning. Alibaba, 32B.

Text Generation
32B
Qwen3-8B

Reasoning depth on demand without a throughput penalty. Alibaba, 8B.

Text Generation
8B
Qwen3-Embedding-4B

Multilingual embeddings for retrieval and semantic similarity. Alibaba, 4B.

Embedding
4B
Qwen3-Embedding-8B

Higher-capacity multilingual embeddings for retrieval and reranking. Alibaba, 8B.

Embedding
8B
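Embedding models like these are typically used by comparing vectors with cosine similarity: embed the query and the documents, then return the document whose vector points in the most similar direction. A minimal pure-Python retrieval sketch follows; real pipelines use the model's high-dimensional embeddings and a vector index rather than hand-written vectors.

```python
import math

def cosine(a, b):
    """Cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def top_match(query_vec, doc_vecs):
    """Index of the document embedding closest to the query."""
    return max(range(len(doc_vecs)), key=lambda i: cosine(query_vec, doc_vecs[i]))

query = [0.9, 0.1, 0.0]                     # toy query embedding
docs = [[0.0, 1.0, 0.0],                    # unrelated document
        [1.0, 0.2, 0.1],                    # near-duplicate of the query
        [0.1, 0.1, 1.0]]                    # unrelated document
best = top_match(query, docs)               # picks the near-duplicate
```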
QwQ-32B

Chain-of-thought with self-reflection for math, science, and logic. Alibaba, 32B.

Text Generation
32B
ResNet-50

Skip connections that made deep networks trainable — the image classification baseline. Microsoft Research, 25M.

Image Classification
25M
SegFormer (b0)

Mix-transformer segmentation without positional encoding — accurate at low compute. NVIDIA, 3.8M.

Image Segmentation
3.8M
SpeechT5 (TTS task)

Unified encoder-decoder for natural text-to-speech synthesis. Microsoft, 307M.

Text-to-Speech
307M
Stable Diffusion 3.5 Large

MMDiT architecture — strong text rendering and prompt control for image generation. Stability AI, 8B.

Text-to-Image
8B
SD-XL 1.0-base

Dual-encoder base for high-resolution synthesis and fine-tuning pipelines. Stability AI, 6.6B.

Text-to-Image
6.6B
unet-base-vgg

Skip connections preserve spatial detail for precise segmentation — VGG backbone. 31M.

Image Segmentation
31M
vit-base

Image patches + self-attention, no convolutions — the original vision transformer. Google, 86M.

Image Classification
86M
vovnet-19b-ra

One-Shot Aggregation avoids DenseNet's redundant paths for better accuracy-per-FLOP. 11.2M.

Image Classification
11.2M
Wan2.2

Causal video transformer for text-to-video with strong motion coherence. Alibaba, 14B.

Text-to-Video
14B
whisper-large-v3

99 languages, 680K hours of training audio — built for robust speech recognition. OpenAI, 1.5B.

Speech-to-Text
1.5B

Start a bounty

Build an open future with us. Fix bugs, add features, get paid.

Join our developer community

Get the latest info, ask questions, review our open-source repos.