NVIDIA Model Optimizer (ModelOpt) 0.42.0
Name                                        Modified    Size
nvidia_modelopt-0.42.0-py3-none-any.whl     2026-03-09  1.0 MB
ModelOpt 0.42.0 Release source code.tar.gz  2026-03-06  11.8 MB
ModelOpt 0.42.0 Release source code.zip     2026-03-06  12.6 MB
README.md                                   2026-03-06  2.2 kB
Totals: 4 items, 25.4 MB

Bug Fixes

  • Fixed calibration data generation with multiple samples in the ONNX workflow.

New Features

  • Added a standalone type inference option (--use_standalone_type_inference) to ONNX AutoCast as an experimental alternative to ONNX's infer_shapes. This option performs type-only inference without shape inference, which can help when shape inference fails or when you want to avoid extra shape inference overhead.
  • Added quantization support for the Kimi K2 Thinking model from the original int4 checkpoint.
  • Introduced support for params-constraint-based automatic neural architecture search in Minitron pruning (mcore_minitron) as an alternative to manual pruning with export_config. See examples/pruning/README.md for more details.
  • Added an example of Minitron pruning with the Megatron-Bridge framework, including advanced params-constraint-based pruning, plus a new distillation example. See examples/megatron_bridge/README.md.
  • Added support for calibration data with multiple samples in .npz format in the ONNX AutoCast workflow.
  • Added the --opset option to the ONNX quantization CLI to specify the target opset version for the quantized model.
  • Enabled support for context parallelism in Eagle speculative decoding for both HuggingFace and Megatron Core models.
  • Added unified Hugging Face export support for diffusers pipelines/components.
  • Added support for LTX-2 and Wan2.2 (T2V) in the diffusers quantization workflow.
  • Added PTQ support for GLM-4.7, including loading MTP layer weights from a separate mtp.safetensors file and exporting them as-is.
  • Added support for image-text data calibration in PTQ for Nemotron VL models.
  • Enabled advanced weight scale search for NVFP4 quantization and its export pathway.
  • Added PTQ support for Nemotron Parse.
  • Added distillation support for LTX-2. See examples/diffusers/distillation/README.md for more details.
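The multi-sample .npz calibration data mentioned above can be prepared with plain NumPy. A minimal sketch, assuming a model with a single input tensor; the file name, input name `"input"`, and sample shape are all illustrative, not part of the ModelOpt API:

```python
import numpy as np

# Generate a few synthetic calibration samples (stand-ins for real
# preprocessed inputs) and stack them along a new leading sample axis.
rng = np.random.default_rng(0)
samples = np.stack(
    [rng.standard_normal((3, 224, 224)).astype(np.float32) for _ in range(8)]
)  # shape: (8, 3, 224, 224) -- 8 samples of a 3x224x224 input

# Save one keyword-named array per model input into a single .npz archive.
np.savez("calib_data.npz", input=samples)

# Reload to confirm the layout: arrays are keyed by input name,
# with samples stacked along axis 0.
loaded = np.load("calib_data.npz")
print(loaded["input"].shape)  # (8, 3, 224, 224)
```

Real calibration data should of course come from representative preprocessed inputs rather than random noise; the point here is only the archive layout (one named array per input, samples on the leading axis).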
Source: README.md, updated 2026-03-06