LMDeploy - Browse /v0.12.1 at SourceForge.net

Name	Modified	Size	Downloads / Week
lmdeploy-0.12.1+cu128-cp310-cp310-manylinux2014_x86_64.whl	2026-02-13	98.8 MB	0
lmdeploy-0.12.1+cu128-cp310-cp310-win_amd64.whl	2026-02-13	36.1 MB	0
lmdeploy-0.12.1+cu128-cp311-cp311-manylinux2014_x86_64.whl	2026-02-13	98.8 MB	0
lmdeploy-0.12.1+cu128-cp311-cp311-win_amd64.whl	2026-02-13	36.1 MB	0
lmdeploy-0.12.1+cu128-cp312-cp312-manylinux2014_x86_64.whl	2026-02-13	98.9 MB	0
lmdeploy-0.12.1+cu128-cp312-cp312-win_amd64.whl	2026-02-13	36.1 MB	1
lmdeploy-0.12.1+cu128-cp313-cp313-manylinux2014_x86_64.whl	2026-02-13	98.9 MB	0
lmdeploy-0.12.1+cu128-cp313-cp313-win_amd64.whl	2026-02-13	36.1 MB	0
README.md	2026-02-13	2.5 kB	0
v0.12.1 source code.tar.gz	2026-02-13	1.5 MB	0
v0.12.1 source code.zip	2026-02-13	2.2 MB	0
Totals: 11 Items		543.4 MB	1
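
The wheel names in the listing follow the standard wheel naming scheme (distribution, version, Python tag, ABI tag, platform tag), so the right file for a given interpreter and OS can be picked by reading the name. Below is a minimal sketch that splits one of the names into those fields. The helper name `parse_wheel_name` is hypothetical, and the code assumes the five-field form used above (no optional build tag). The `+cu128` suffix is the version's local label; reading it as a CUDA 12.8 build is an assumption based on the name, not something the listing states.

```python
def parse_wheel_name(filename: str) -> dict:
    """Split a wheel filename into its naming-scheme fields.

    Hypothetical helper for illustration. Assumes the five-field form
    seen in the listing (no optional build tag).
    """
    if not filename.endswith(".whl"):
        raise ValueError(f"not a wheel file: {filename}")
    stem = filename[: -len(".whl")]
    fields = stem.split("-")
    if len(fields) != 5:
        raise ValueError(f"expected 5 dash-separated fields, got {len(fields)}")
    name, version, python_tag, abi_tag, platform_tag = fields
    # The part after '+' is the local version label (here 'cu128').
    release, _, local = version.partition("+")
    return {
        "name": name,
        "release": release,
        "local": local,
        "python_tag": python_tag,      # e.g. cp312 = CPython 3.12
        "abi_tag": abi_tag,
        "platform_tag": platform_tag,  # e.g. win_amd64
    }


info = parse_wheel_name("lmdeploy-0.12.1+cu128-cp312-cp312-win_amd64.whl")
for key, value in info.items():
    print(f"{key}: {value}")
```

For real tooling, the third-party `packaging` library provides a more complete parser; the sketch above only covers the names shown in this listing.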

What's Changed

fix rotary embedding for transformers v5 by @grimoire in https://github.com/InternLM/lmdeploy/pull/4303
Improve metrics log by @CUHKSZzxy in https://github.com/InternLM/lmdeploy/pull/4297
Support ignore layers in quant config for qwen3 models by @RunningLeon in https://github.com/InternLM/lmdeploy/pull/4293
add custom noaux kernel by @grimoire in https://github.com/InternLM/lmdeploy/pull/4345
fix qwen3vl with transformers5 by @grimoire in https://github.com/InternLM/lmdeploy/pull/4348

fix tool call parser's streaming cursor by @lvhan028 in https://github.com/InternLM/lmdeploy/pull/4333
Fix data race for guided decoding in TP mode by @lzhangzz in https://github.com/InternLM/lmdeploy/pull/4341
fa3 check by @grimoire in https://github.com/InternLM/lmdeploy/pull/4340
Fix time series preprocess by @CUHKSZzxy in https://github.com/InternLM/lmdeploy/pull/4339
Negative KV sequence length error in Attention op by @jinminxi104 in https://github.com/InternLM/lmdeploy/pull/4316
fix qwen3-vl-moe long context by @grimoire in https://github.com/InternLM/lmdeploy/pull/4342
fix: move quantized norm to CPU instead of stale q_linear reference in smooth_quant by @Mr-Neutr0n in https://github.com/InternLM/lmdeploy/pull/4352
update noaux-kernel check by @grimoire in https://github.com/InternLM/lmdeploy/pull/4358

change INPUT_CUDA_VERSION to 12.6.2 by @lvhan028 in https://github.com/InternLM/lmdeploy/pull/4322
add Qwen3-8B accuracy evaluation in llm_compressor.md by @43758726 in https://github.com/InternLM/lmdeploy/pull/4319
[ci] refactor ete testcase by @zhulinJulia24 in https://github.com/InternLM/lmdeploy/pull/4274
Set alias interns1_1 for interns1_pro by @lvhan028 in https://github.com/InternLM/lmdeploy/pull/4334
build(docker): skip FA2 when use cu13 by @windreamer in https://github.com/InternLM/lmdeploy/pull/4356
bump version to v0.12.1 by @lvhan028 in https://github.com/InternLM/lmdeploy/pull/4350

Full Changelog: https://github.com/InternLM/lmdeploy/compare/v0.12.0...v0.12.1

Source: README.md, updated 2026-02-13