Mistral Large 3 675B Instruct 2512 is a state-of-the-art multimodal granular Mixture-of-Experts model with 675B total parameters and 41B active parameters, trained from scratch on 3,000 H200 GPUs. As the instruct-tuned FP8 variant, it is optimized for reliable instruction following, agentic workflows, production-grade assistants, and long-context enterprise tasks. It pairs a 673B-parameter language MoE backbone with a 2.5B-parameter vision encoder, enabling rich multimodal understanding across text and images.

The model supports dozens of languages and maintains strong system-prompt adherence, making it well suited to global and structured enterprise use. It runs on a single node of B200 or H200 GPUs in FP8, and can also operate in NVFP4 mode on H100 or A100 hardware. With a 256k context window, it excels at long-document comprehension, deep retrieval workflows, and complex knowledge-intensive tasks.
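As a quick illustration of the instruct and multimodal interface, the sketch below sends a mixed text-and-image request through an OpenAI-compatible endpoint such as a local vLLM server. The base URL, API key, and the model identifier `mistralai/Mistral-Large-3-675B-Instruct-2512` are assumptions for illustration, not values confirmed by this card; substitute the ones from your actual deployment.

```python
# Minimal chat-completion sketch against an OpenAI-compatible endpoint
# (for example, a local vLLM server). The endpoint, key, and model name
# below are illustrative assumptions; replace them with your own values.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")

response = client.chat.completions.create(
    model="mistralai/Mistral-Large-3-675B-Instruct-2512",
    messages=[
        # System prompt: the model is tuned for strong system-prompt adherence.
        {"role": "system", "content": "You are a concise enterprise assistant."},
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "Summarize the key figures in this chart."},
                # Hypothetical image URL, standing in for any accessible image.
                {"type": "image_url", "image_url": {"url": "https://example.com/chart.png"}},
            ],
        },
    ],
    max_tokens=512,
    temperature=0.2,
)
print(response.choices[0].message.content)
```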
Features
- Granular Mixture-of-Experts architecture with 675B total and 41B active parameters
- 2.5B-parameter vision encoder enabling advanced multimodal reasoning
- FP8 instruct-tuned model optimized for chat, agentic workflows, and production assistants
- Supports dozens of languages, including English, Spanish, German, Chinese, and Japanese
- Strong system-prompt adherence and native function calling with JSON output (see the function-calling sketch after this list)
- 256k context window for long-document understanding, retrieval workflows, and enterprise knowledge tasks
- Deployable on a single node of B200/H200 GPUs (FP8) or H100/A100 GPUs (NVFP4)
- Open-source under the Apache 2.0 license, permitting commercial and research use
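To make the function-calling bullet concrete, here is a hedged sketch of a tool call through the same OpenAI-compatible interface. The `get_order_status` tool schema is hypothetical, invented purely for illustration, and the endpoint and model identifier are the same assumptions as in the example above.

```python
# Hedged sketch of native function calling with JSON output through an
# OpenAI-compatible endpoint. Tool schema, endpoint, and model name are
# illustrative assumptions, not values confirmed by this card.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")

tools = [{
    "type": "function",
    "function": {
        "name": "get_order_status",  # hypothetical tool for illustration
        "description": "Look up the shipping status of a customer order.",
        "parameters": {
            "type": "object",
            "properties": {"order_id": {"type": "string"}},
            "required": ["order_id"],
        },
    },
}]

response = client.chat.completions.create(
    model="mistralai/Mistral-Large-3-675B-Instruct-2512",
    messages=[{"role": "user", "content": "Where is order 8472?"}],
    tools=tools,
)

# The model emits the call as structured JSON arguments; parse them with
# json.loads before dispatching to the real implementation.
call = response.choices[0].message.tool_calls[0]
print(call.function.name, call.function.arguments)
```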