GigaChat 3 Ultra
GigaChat 3 Ultra is a 702-billion-parameter Mixture-of-Experts (MoE) model built from scratch for frontier-level reasoning, multilingual capability, and deep Russian-language fluency. It activates only 36 billion parameters per token, which keeps inference speeds practical despite the model's total size. The model was trained on a 14-trillion-token corpus combining natural, multilingual, and high-quality synthetic data to strengthen reasoning, math, coding, and linguistic performance. Unlike modified foreign checkpoints, GigaChat 3 Ultra is entirely original, giving developers full control, modern alignment, and a dataset free of inherited limitations. Its architecture uses MoE, Multi-Token Prediction (MTP), and Multi-head Latent Attention (MLA), which keeps it compatible with the open-source ecosystem and makes it easy to integrate with popular inference and fine-tuning tools. With leading results on Russian benchmarks and competitive performance on global tasks, GigaChat 3 Ultra is one of the largest and most capable open-source LLMs in the world.
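To make the "36 billion active out of 702 billion total" figure concrete, here is a minimal sketch of the arithmetic, plus a generic top-k router of the kind MoE layers commonly use. The router is an illustration only: the listing does not describe GigaChat 3 Ultra's actual routing, so the function names, the number of experts, and the value of `k` are all assumptions.

```python
import math


def active_fraction(active_params: float, total_params: float) -> float:
    """Share of the model's parameters used for a single token."""
    return active_params / total_params


def top_k_route(router_logits, k):
    """Pick the k highest-scoring experts and softmax-normalize their weights.

    Generic top-k gating sketch (hypothetical helper), not the model's
    documented router.
    """
    top = sorted(range(len(router_logits)),
                 key=lambda i: router_logits[i], reverse=True)[:k]
    peak = max(router_logits[i] for i in top)  # subtract max for stability
    exps = [math.exp(router_logits[i] - peak) for i in top]
    total = sum(exps)
    return [(i, e / total) for i, e in zip(top, exps)]


# Figures from the description above: 36B active, 702B total.
fraction = active_fraction(36e9, 702e9)

# Toy example: 4 experts, 2 selected per token (both numbers are assumptions).
routes = top_k_route([0.1, 2.0, -1.0, 1.5], k=2)

print(f"active share per token: {fraction:.1%}")
print("selected experts:", routes)
```

Under these figures, roughly one parameter in twenty participates in any given token, which is why per-token compute is much closer to that of a 36B dense model than a 702B one.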
DeepScaleR
DeepScaleR is a 1.5-billion-parameter language model fine-tuned from DeepSeek-R1-Distill-Qwen-1.5B using distributed reinforcement learning and an iterative context-lengthening strategy that gradually increases its context window from 8K to 24K tokens during training. It was trained on ~40,000 carefully curated mathematical problems drawn from competition-level datasets such as AIME (1984–2023), AMC (pre-2023), Omni-MATH, and STILL. DeepScaleR achieves 43.1% accuracy on AIME 2024, a gain of roughly 14.3 percentage points over the base model, and surpasses the proprietary o1-preview model despite its much smaller size. It also posts strong results on a suite of math benchmarks (e.g., MATH-500, AMC 2023, Minerva Math, OlympiadBench), demonstrating that small, efficient models tuned with RL can match or exceed larger baselines on reasoning tasks.
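The context-lengthening idea can be sketched as a simple staged schedule. This is an illustration under stated assumptions: the description only gives the 8K start and 24K end points, so the number of stages (three here), the evenly spaced steps, and the helper name `context_schedule` are hypothetical.

```python
def context_schedule(start_tokens: int, end_tokens: int, num_stages: int):
    """Evenly spaced maximum context lengths, one per training stage.

    Hypothetical helper: the stage count and even spacing are assumptions,
    not details taken from the description above.
    """
    if num_stages < 2:
        raise ValueError("need at least two stages to lengthen the context")
    step = (end_tokens - start_tokens) / (num_stages - 1)
    return [round(start_tokens + i * step) for i in range(num_stages)]


# 8K -> 24K tokens, as stated in the description; 3 stages is an assumption.
schedule = context_schedule(8 * 1024, 24 * 1024, 3)

for stage, max_len in enumerate(schedule, start=1):
    print(f"stage {stage}: max context = {max_len} tokens")
```

Starting with a short window keeps early RL rollouts cheap; the window is widened only in later stages, when the model benefits from longer reasoning traces.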
Phi-4-reasoning
Phi-4-reasoning is a 14-billion-parameter transformer-based language model optimized for complex reasoning tasks, including math, coding, algorithmic problem solving, and planning. Trained via supervised fine-tuning of Phi-4 on carefully curated "teachable" prompts and reasoning demonstrations generated using o3-mini, it generates detailed reasoning chains that make effective use of inference-time compute. A variant, Phi-4-reasoning-plus, adds outcome-based reinforcement learning and produces longer reasoning traces. Phi-4-reasoning outperforms significantly larger open-weight models such as DeepSeek-R1-Distill-Llama-70B and approaches the performance of the full DeepSeek-R1 model across a wide range of reasoning tasks. For environments with constrained computing or latency, the smaller Phi-4-mini-reasoning, fine-tuned with synthetic data generated by DeepSeek-R1, provides high-quality, step-by-step problem solving.
Gemma 3n
Gemma 3n is Google's state-of-the-art open multimodal model, engineered for on-device performance and efficiency. Made for responsive, low-footprint local inference, it targets intelligent, on-the-go applications. It analyzes and responds to combined images and text, with video and audio support coming soon, and lets developers build interactive features that put user privacy first and work reliably offline. Its mobile-first architecture, co-designed by Google's mobile hardware teams and industry leaders, has a significantly reduced memory footprint: a 4B active memory footprint, with the ability to create submodels for quality-latency tradeoffs. Gemma 3n is Google's first open model built on this shared architecture, and developers can begin experimenting with it today in an early preview.