Mistral Small 3.1 24B Instruct 2503
Model Overview
This model is an instruction-fine-tuned version of Mistral-Small-3.1-24B-Base-2503. It adds state-of-the-art vision understanding and extends long-context capabilities to 128k tokens without compromising text performance. With 24 billion parameters, it achieves top-tier capabilities in both text and vision tasks. Mistral Small 3.1 can be deployed locally and is exceptionally "knowledge-dense": once quantized, it fits within a single RTX 4090 or a MacBook with 32GB of RAM.
It is ideal for:
- Fast-response conversational agents.
- Low-latency function calling.
- Subject matter experts via fine-tuning.
- Local inference for hobbyists and organizations handling sensitive data.
- Programming and math reasoning.
- Long document understanding.
- Visual understanding.
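The claim that the quantized model fits on a single RTX 4090 (24 GB of VRAM) or a 32GB MacBook can be sanity-checked with back-of-the-envelope arithmetic. The sketch below estimates weight storage only; the bit-widths are illustrative assumptions rather than the quantization schemes of any particular release, and the estimate ignores the KV cache, activations, and runtime overhead.

```python
# Rough weights-only memory estimate for a 24B-parameter model.
# Assumption: every parameter is stored at the same bit-width; KV cache,
# activations, and runtime overhead are NOT included.

def weight_memory_gb(num_params: float, bits_per_weight: float) -> float:
    """Return the approximate weight footprint in gigabytes (1 GB = 1e9 bytes)."""
    return num_params * bits_per_weight / 8 / 1e9

NUM_PARAMS = 24e9  # 24 billion parameters, as stated in the overview

# Illustrative bit-widths (assumed, not taken from the model card).
for label, bits in [("16-bit", 16), ("8-bit", 8), ("4-bit", 4)]:
    print(f"{label:>6}: ~{weight_memory_gb(NUM_PARAMS, bits):.0f} GB of weights")
```

At 16 bits per weight the weights alone exceed both memory budgets, which is why the overview qualifies the claim with "once quantized".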
Key Features:
- Vision: Analyzes images and provides insights based on visual content in addition to text.
- Multilingual: Supports dozens of languages, including English, French, German, Greek, Hindi, Indonesian, Italian, Japanese, Korean, Malay, Nepali, Polish, Portuguese, Romanian, Russian, Serbian, Spanish, Swedish, Turkish, Ukrainian, Vietnamese, Arabic, Bengali, Chinese, Farsi.
- Agent-Centric: Offers best-in-class agentic capabilities with native function calling and JSON output.
- Advanced Reasoning: State-of-the-art conversational and reasoning capabilities.
- Apache 2.0 License: Open license allowing usage and modification for both commercial and non-commercial purposes.
- Context Window: 128k tokens.
- System Prompt: Maintains strong adherence to, and support for, system prompts.
- Tokenizer: Utilizes a Tekken tokenizer with a 131k vocabulary size.
- Model Source: mistralai/Mistral-Small-3.1-24B-Instruct-2503
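To make the function-calling and JSON output features listed above more concrete, here is a minimal sketch of how an application might dispatch a JSON tool call to a local function. The JSON shape (`name` plus `arguments`), the `get_weather` tool, and the `dispatch_tool_call` helper are all hypothetical and chosen for illustration; the actual wire format depends on the serving stack and chat template in use.

```python
import json

# Hypothetical local tool; a real agent would call an external service here.
def get_weather(city: str) -> str:
    return f"Sunny in {city}"

# Registry mapping tool names to Python callables (illustrative only).
TOOLS = {"get_weather": get_weather}

def dispatch_tool_call(raw: str) -> str:
    """Parse a JSON tool call and run the matching registered tool.

    Assumed shape: {"name": "<tool>", "arguments": {...}}.
    """
    call = json.loads(raw)
    name = call["name"]
    args = call.get("arguments", {})
    if name not in TOOLS:
        raise KeyError(f"unknown tool: {name}")
    return TOOLS[name](**args)

# A string standing in for model output; no model is invoked in this sketch.
example_output = '{"name": "get_weather", "arguments": {"city": "Paris"}}'
print(dispatch_tool_call(example_output))
```

Keeping the registry explicit means the application, not the model, decides which functions can be executed.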
Multi-Model QPC Configuration #1
| Precision | SoCs / Tensor Slicing | NSP Cores (per SoC) | Batch Size | Chunking Prompt Length | Context Length (CL) | CCL Enabled | QPC URL | QPC Size | QPC Download | ONNX URL | ONNX Download | Generation Date |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| MXFP6 | 4 | 16 | 1 | 128 | 65536 | True | https://dc00tk1pxen80.cloudfront.net/SDK1.21.4.0/mistralai/Mistral-Small-3.1-24B-Instruct-2503/mistralai_Mistral-Small-3.1-24B-Instruct-2503_Encoder_qpc_16cores_128pl_65536cl_[2048,4096,8192,12288,16384,24576,32768,65536]ccl_4devices.tar.gz | 2.8GB | Download | https://dc00tk1pxen80.cloudfront.net/SDK1.21.4.0/mistralai/Mistral-Small-3.1-24B-Instruct-2503/mistralai_Mistral-Small-3.1-24B-Instruct-2503_Encoder_ONNX.tar.gz | Download | 15-May-2026 |
| MXFP6 | 4 | 16 | 1 | 128 | 65536 | True | https://dc00tk1pxen80.cloudfront.net/SDK1.21.4.0/mistralai/Mistral-Small-3.1-24B-Instruct-2503/mistralai_Mistral-Small-3.1-24B-Instruct-2503_Decoder_qpc_16cores_128pl_65536cl_[2048,4096,8192,12288,16384,24576,32768,65536]ccl_4devices.tar.gz | 55GB | Download | https://dc00tk1pxen80.cloudfront.net/SDK1.21.4.0/mistralai/Mistral-Small-3.1-24B-Instruct-2503/mistralai_Mistral-Small-3.1-24B-Instruct-2503_Decoder_ONNX.tar.gz | Download | 15-May-2026 |
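The QPC archive names in the table appear to encode the same settings as the table columns (cores, prompt length, context length, CCL values, device count). As a sketch, the following parses those fields out of a URL so a script can check that a downloaded archive matches the intended configuration. The naming pattern is inferred from the two rows above and is an assumption, not a documented contract; `QpcConfig` and `parse_qpc_url` are hypothetical names.

```python
import re
from dataclasses import dataclass
from typing import List

# Pattern inferred from the archive names in the table (assumed, not documented).
QPC_NAME = re.compile(
    r"_(?P<component>Encoder|Decoder)_qpc"
    r"_(?P<cores>\d+)cores"
    r"_(?P<prompt_len>\d+)pl"
    r"_(?P<ctx_len>\d+)cl"
    r"_\[(?P<ccl>[\d,]+)\]ccl"
    r"_(?P<devices>\d+)devices\.tar\.gz$"
)

@dataclass
class QpcConfig:
    component: str   # "Encoder" or "Decoder"
    cores: int       # NSP cores per SoC
    prompt_len: int  # chunking prompt length
    ctx_len: int     # context length (CL)
    ccl: List[int]   # CCL values listed in the file name
    devices: int     # number of SoCs / tensor slices

def parse_qpc_url(url: str) -> QpcConfig:
    """Extract configuration fields from a QPC archive URL."""
    m = QPC_NAME.search(url)
    if m is None:
        raise ValueError(f"unrecognized QPC archive name: {url}")
    return QpcConfig(
        component=m["component"],
        cores=int(m["cores"]),
        prompt_len=int(m["prompt_len"]),
        ctx_len=int(m["ctx_len"]),
        ccl=[int(v) for v in m["ccl"].split(",")],
        devices=int(m["devices"]),
    )

# Encoder row from the table above.
ENCODER_URL = (
    "https://dc00tk1pxen80.cloudfront.net/SDK1.21.4.0/mistralai/"
    "Mistral-Small-3.1-24B-Instruct-2503/"
    "mistralai_Mistral-Small-3.1-24B-Instruct-2503_Encoder_qpc_16cores_128pl_65536cl_"
    "[2048,4096,8192,12288,16384,24576,32768,65536]ccl_4devices.tar.gz"
)

print(parse_qpc_url(ENCODER_URL))
```

The parsed values can then be compared with the corresponding table row before the archive is unpacked.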