NVIDIA Model Optimizer (ModelOpt) 0.42.0
Name                                        Modified    Size
nvidia_modelopt-0.42.0-py3-none-any.whl     2026-03-09  1.0 MB
ModelOpt 0.42.0 Release source code.tar.gz  2026-03-06  11.8 MB
ModelOpt 0.42.0 Release source code.zip     2026-03-06  12.6 MB
README.md                                   2026-03-06  2.2 kB
Totals: 4 items, 25.4 MB

Bug Fixes

  • Fixed calibration data generation with multiple samples in the ONNX workflow.

New Features

  • Added a standalone type inference option (--use_standalone_type_inference) to ONNX AutoCast as an experimental alternative to ONNX's infer_shapes. This option performs type-only inference without shape inference, which can help when shape inference fails or when you want to avoid extra shape inference overhead.
  • Added quantization support for the Kimi K2 Thinking model from the original int4 checkpoint.
  • Introduced support for params-constraint-based automatic neural architecture search in Minitron pruning (mcore_minitron) as an alternative to manual pruning with export_config. See examples/pruning/README.md for more details.
  • Added an example of Minitron pruning with the Megatron-Bridge framework, including advanced params-constraint-based pruning, plus a new distillation example. See examples/megatron_bridge/README.md.
  • Added support for calibration data with multiple samples in .npz format in the ONNX AutoCast workflow.
  • Added the --opset option to the ONNX quantization CLI to specify the target opset version for the quantized model.
  • Enabled support for context parallelism in Eagle speculative decoding for both HuggingFace and Megatron Core models.
  • Added unified Hugging Face export support for diffusers pipelines/components.
  • Added support for LTX-2 and Wan2.2 (T2V) in the diffusers quantization workflow.
  • Added PTQ support for GLM-4.7, including loading MTP layer weights from a separate mtp.safetensors file and exporting them as-is.
  • Added support for image-text data calibration in PTQ for Nemotron VL models.
  • Enabled advanced weight scale search for NVFP4 quantization and its export pathway.
  • Added PTQ support for Nemotron Parse.
  • Added distillation support for LTX-2. See examples/diffusers/distillation/README.md for more details.
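The multi-sample .npz calibration data mentioned above can be prepared with plain NumPy. A minimal sketch, assuming a model with a single input tensor; the file name, input name `"input"`, and sample shape are all illustrative, not part of the ModelOpt API:

```python
import numpy as np

# Generate a few synthetic calibration samples (stand-ins for real
# preprocessed inputs) and stack them along a new leading sample axis.
rng = np.random.default_rng(0)
samples = np.stack(
    [rng.standard_normal((3, 224, 224)).astype(np.float32) for _ in range(8)]
)  # shape: (8, 3, 224, 224) -- 8 samples of a 3x224x224 input

# Save one keyword-named array per model input into a single .npz archive.
np.savez("calib_data.npz", input=samples)

# Reload to confirm the layout: arrays are keyed by input name,
# with samples stacked along axis 0.
loaded = np.load("calib_data.npz")
print(loaded["input"].shape)  # (8, 3, 224, 224)
```

Real calibration data should of course come from representative preprocessed inputs rather than random noise; the point here is only the archive layout (one named array per input, samples on the leading axis).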
Source: README.md, updated 2026-03-06