| Name | Modified | Size | Downloads / Week |
|---|---|---|---|
| 0.5.2 source code.tar.gz | 2026-02-13 | 2.1 MB | |
| 0.5.2 source code.zip | 2026-02-13 | 4.2 MB | |
| README.md | 2026-02-13 | 2.8 kB | |
| Totals: 3 Items | | 6.3 MB | 0 |
OpenCompass v0.5.2 Release Notes
Highlights
- Extensive New Benchmark Support: We have introduced comprehensive support for Scientific and General Benchmarks, including SciReasoner, Biology Instructions, Mol Instructions, CMPhysBench, IFBench, LCB-pro, etc.
- New Model & API Support: Added support for Intern-S1-Pro and TeleChat API evaluation examples.
- Infrastructure & Enhancements: Fixed bugs, improved evaluation pipelines, and updated CI.
New Features
- Introduced support for HMMT2025 (#2305), AMO-Bench (#2305), IMO-Bench (#2305), ATLAS (#2297), OpenSWI (#2312), CMPhysBench (#2313), Biology Instructions (#2314), Mol Instructions (#2326), ARC_AGI_2 (#2330), IFBench (#2354), SciReasoner (#2360), PI-LLM (#2283), ProcessBench (#2274), and LCB_pro (#2361).
- Added monitoring of multi-dimensional evaluation metrics, including output length, logprobs, and finish reasons (#2351).
- Added support for Intern-S1-Pro evaluation examples (#2394).
- Added support for TeleChat API inference (#2371).
- Added LLM-judge-based config for C-Eval (#2398).
Bug Fixes
- Fixed output-completeness and related issues in OpenAISDKStreaming (#2367, #2389, #2399).
- Removed Pyext from the runtime requirements (#2306).
- Fixed pattern matching in Smolinstruct (#2384).
- Fixed a buffer-related error in the LiveCodeBench evaluation (#2393).
Enhancements and Refactors
Infrastructure Refactors:
- Updated LCBench (#2166).
- Added headers as input param in BigCodeBench (#2302).
- Updated rjob with metadata name (#2316).
- Parametrized timeout in OpenAISDK (#2352).
- Added meta logger in OpenICLInferTask (#2383).
CI/CD Improvements:
- Refactored dailytest (#2308).
- Added CI for new datasets (#2358, #2369).
- Changed the GitHub runner (#2373).
- Added uni-test (#2390).
Welcome New Contributors
A warm welcome and special thanks to our newest contributors who made this release possible:
- @zhuangziGiantfish made their first contribution in #2283.
- @xgao922 made their first contribution in #2307.
- @Jensen246 made their first contribution in #2310.
- @ccx06 made their first contribution in #2371.
Full Changelog: https://github.com/open-compass/opencompass/compare/0.5.1.post1...0.5.2
Thank you for using OpenCompass! These updates empower deeper insights and more reliable evaluations. Keep exploring, and stay tuned for future innovations!