The interactive file manager requires Javascript. Please enable it or use sftp or scp.
You may still browse the files here.

Name	Modified	Size	InfoDownloads / Week
Parent folder
torchrl-0.11.0-cp314-cp314t-macosx_11_0_arm64.whl	2026-01-27	2.4 MB	1
torchrl-0.11.0-cp314-cp314t-manylinux_2_28_aarch64.whl	2026-01-27	2.1 MB	1
torchrl-0.11.0-cp314-cp314t-manylinux_2_28_x86_64.whl	2026-01-27	2.1 MB	0
torchrl-0.11.0-cp314-cp314t-win_amd64.whl	2026-01-27	2.0 MB	0
torchrl-0.11.0-cp313-cp313-win_amd64.whl	2026-01-27	2.0 MB	1
torchrl-0.11.0-cp313-cp313t-macosx_11_0_arm64.whl	2026-01-27	2.4 MB	0
torchrl-0.11.0-cp313-cp313t-manylinux_2_28_aarch64.whl	2026-01-27	2.1 MB	0
torchrl-0.11.0-cp313-cp313t-manylinux_2_28_x86_64.whl	2026-01-27	2.1 MB	1
torchrl-0.11.0-cp313-cp313t-win_amd64.whl	2026-01-27	2.0 MB	0
torchrl-0.11.0-cp314-cp314-macosx_11_0_arm64.whl	2026-01-27	2.4 MB	0
torchrl-0.11.0-cp314-cp314-manylinux_2_28_aarch64.whl	2026-01-27	2.1 MB	0
torchrl-0.11.0-cp314-cp314-manylinux_2_28_x86_64.whl	2026-01-27	2.1 MB	1
torchrl-0.11.0-cp314-cp314-win_amd64.whl	2026-01-27	2.0 MB	1
torchrl-0.11.0-cp311-cp311-manylinux_2_28_aarch64.whl	2026-01-27	2.1 MB	1
torchrl-0.11.0-cp311-cp311-manylinux_2_28_x86_64.whl	2026-01-27	2.1 MB	0
torchrl-0.11.0-cp311-cp311-win_amd64.whl	2026-01-27	2.0 MB	1
torchrl-0.11.0-cp312-cp312-macosx_11_0_arm64.whl	2026-01-27	2.4 MB	1
torchrl-0.11.0-cp312-cp312-manylinux_2_28_aarch64.whl	2026-01-27	2.1 MB	1
torchrl-0.11.0-cp312-cp312-manylinux_2_28_x86_64.whl	2026-01-27	2.1 MB	0
torchrl-0.11.0-cp312-cp312-win_amd64.whl	2026-01-27	2.0 MB	0
torchrl-0.11.0-cp313-cp313-macosx_12_0_arm64.whl	2026-01-27	2.4 MB	0
torchrl-0.11.0-cp313-cp313-manylinux_2_28_aarch64.whl	2026-01-27	2.1 MB	1
torchrl-0.11.0-cp313-cp313-manylinux_2_28_x86_64.whl	2026-01-27	2.1 MB	0
torchrl-0.11.0-cp310-cp310-macosx_11_0_arm64.whl	2026-01-27	2.4 MB	1
torchrl-0.11.0-cp310-cp310-manylinux_2_28_aarch64.whl	2026-01-27	2.1 MB	1
torchrl-0.11.0-cp310-cp310-manylinux_2_28_x86_64.whl	2026-01-27	2.1 MB	1
torchrl-0.11.0-cp310-cp310-win_amd64.whl	2026-01-27	2.0 MB	1
torchrl-0.11.0-cp311-cp311-macosx_11_0_arm64.whl	2026-01-27	2.4 MB	0
README.md	2026-01-27	34.3 kB	0
TorchRL v0.11.0 source code.tar.gz	2026-01-27	9.2 MB	0
TorchRL v0.11.0 source code.zip	2026-01-27	9.9 MB	0
Totals: 31 Items		79.1 MB	15

TorchRL v0.11.0 Release Notes

Highlights

Dreamer overhaul - Comprehensive improvements to Dreamer world model training: async collectors with profiling, RSSM fixes (scan mode, noise injection, explicit dimensions), torch.compile compatibility for value functions and TDLambda estimator, optimized DreamerEnv to avoid CUDA syncs, and updated sota-implementation with better configs. @vmoens
Weight synchronization schemes - New modular weight sync infrastructure (torchrl.weight_update) with SharedMem, MultiProcess, and vLLM-specific (NCCL, double-buffer) schemes. Collectors now integrate seamlessly with weight sync schemes for distributed training. @vmoens
Major collector refactor - The collector codebase has been completely restructured. The monolithic collectors.py is now split into focused modules (_single.py, _multi_base.py, _multi_sync.py, _multi_async.py, _runner.py, base.py), with cleaner separation of concerns. (#3233) @vmoens
LLM objectives: DAPO & CISPO - New DAPO (Direct Advantage Policy Optimization) and CISPO (Clipped Importance Sampling Policy Optimization) algorithms for LLM training. @vmoens
Trainer infrastructure - New SAC Trainer, configuration system for algorithms, timing utilities, and async collection support within trainers. @vmoens
Tool services - New tool service infrastructure for LLM agents with Python executor, MCP tools, and web search capabilities. @vmoens
Deprecated APIs removed - All deprecation warnings from v0.10 have been promoted to hard errors for v0.11. (#3369) @vmoens
New environment backends - Added Procgen environments support with a new ProcgenEnv wrapper. (#3331) @ParamThakkar123
Multi-env execution - GymEnv, BraxEnv, and DMControlEnv now support a num_envs/num_workers parameter to run multiple environments in a single call via ParallelEnv. (#3343, #3370, #3337) @ParamThakkar123

Installation

:::bash
pip install torchrl==0.11.0

Breaking Changes

[v0.11] Remove deprecated features and replace warnings with errors (#3369) @vmoens
Removes deprecated KLRewardTransform from transforms/llm.py (use torchrl.envs.llm.KLRewardTransform)
Removes LogReward and Recorder classes from trainers (use LogScalar and LogValidationReward)
Removes unbatched_*_spec properties from VmasWrapper/VmasEnv (use full_*_spec_unbatched)
Deletes deprecated rlhf.py modules (data/rlhf.py, envs/transforms/rlhf.py, modules/models/rlhf.py)
Removes replay_buffer_chunk parameter from MultiCollector
Replaces minimum/maximum deprecation warnings with TypeError in Bounded spec
Replaces critic_coef/entropy_coef deprecation warnings with TypeError in PPO and A2C losses
[Major] Major refactoring of collectors (#3233) @vmoens
Splits the 5000+ line collectors.py into focused modules for single/multi sync/async collectors
Creates new _constants.py, _runner.py, base.py modules
Introduces cleaner weight synchronization scheme integration
Improves test coverage for multi-device and shared-device weight updates
Some internal APIs have changed; external API remains compatible

Dreamer World Model Improvements

These changes significantly improve Dreamer training performance, torch.compile compatibility, and usability. @vmoens

[Feature] Refactor Dreamer training with async collectors, profiling, and improved config (cc917bae)
Major overhaul of the Dreamer sota-implementation with async data collection
Adds profiling support for performance analysis
Improved configuration with better defaults and documentation
Updated README with detailed usage instructions
[Refactor] Dreamer implementation updates (3ab4b306)
Refactors Dreamer training script for better maintainability
Updates config.yaml with improved hyperparameters
Enhances dreamer_utils.py with additional helper functions
[Feature] Add noise argument and scan mode to RSSMRollout (8653b6e6)
Adds noise argument to control stochastic sampling during rollout
Implements scan mode for efficient sequential processing
109 lines added to model_based.py for improved RSSM flexibility
[Feature] Add explicit dimensions and device support to RSSM modules (f350fe06)
Adds explicit dimension handling for batch, time, and feature dims
Improves device placement for RSSM components
[Feature] Logging & RSSM fixes (d0829794)
Fixes RSSM module behavior and adds logging improvements
Updates batched_envs.py and wandb logger
[Refactor] Use compile-aware helpers in Dreamer objectives (d8c3887d)
Updates Dreamer objectives to use torch.compile-compatible helpers
Improves performance when using torch.compile
[BugFix] Optimize DreamerEnv to avoid CUDA sync in done checks (d57fdecf)
Eliminates unnecessary CUDA synchronizations in done flag checking
Significant performance improvement for GPU-based Dreamer training
[BugFix] Fix ModelBasedEnvBase for torch.compile compatibility (7ff663d6)
Makes ModelBasedEnvBase compatible with torch.compile
[Feature] Add allow_done_after_reset parameter to ModelBasedEnvBase (24ae0424)
Adds flexibility for environments that may signal done immediately after reset
[BugFix] Final polish for Dreamer utils and Collector tests (5aefdd5c)
Final cleanup and polish for Dreamer implementation

Weight Synchronization Schemes

New modular infrastructure for weight synchronization between training and inference workers. @vmoens

[Feature] Weight Synchronization Schemes - Core Infrastructure (e8f6fa5a)
New torchrl.weight_update module with 2400+ lines of weight sync infrastructure
SharedMemWeightSyncScheme: Uses shared memory for fast intra-node sync
MultiProcessWeightSyncScheme: Uses multiprocessing queues for cross-process sync
Comprehensive documentation and examples in docs/source/reference/collectors.rst
[Feature] vLLM Weight Synchronization Schemes (d0c8b7e9)
VllmNCCLWeightSyncScheme: NCCL-based weight sync for vLLM distributed inference
VllmDoubleBufferWeightSyncScheme: Double-buffered async weight updates
1876 lines of vLLM-specific weight sync code
[Feature] Collectors - Weight Sync Scheme Integration (a7707ca0)
Integrates weight sync schemes into collector infrastructure
Updates GRPO and expert-iteration implementations to use new schemes
Adds examples for multi-weight update patterns
[Refactor] Weight sync schemes refactor (ae0ae064)
Refines weight sync API and adds additional schemes
Improves test coverage with 342+ new test lines

LLM Training: DAPO & CISPO

New policy optimization algorithms for LLM fine-tuning. @vmoens

[Feature] DAPO (9d5c2767)
Implements Direct Advantage Policy Optimization for LLM training
Adds DAPO-specific loss computation to torchrl/objectives/llm/grpo.py
[Feature] CISPO (ed0d8dc6)
Implements Clipped Importance Sampling Policy Optimization
Alternative to PPO/GRPO with different clipping strategy
[Refactor] Refactor GRPO as a separate class (2bc3cb70)
Separates GRPO implementation for better modularity

Trainer Infrastructure

New trainer algorithms, configuration system, and utilities. @vmoens

[Trainers] SAC Trainer and algorithms (02d4bfd1)
New SAC Trainer implementation with 786 lines of code
Complete sota-implementation in sota-implementations/sac_trainer/
Trainer configuration via YAML files
[Feature] Trainer Algorithms - Configuration System (6bc201a5)
New configuration system in torchrl/trainers/algorithms/configs/
Configs for collectors, data, modules, objectives, transforms, weight sync schemes
Enables hydra-style configuration composition
[Feature] Trainer Infrastructure - Timing and Utilities (dc215236)
Adds timing utilities to trainer infrastructure
263 lines of enhanced trainer functionality
[Feature] Async collection within trainers (5f1eb2c5)
Enables asynchronous data collection during training
Improves training throughput
[Feature] PPO Trainer Updates (129f3d50)
Updates to PPO trainer with new features

Tool Services for LLM Agents

New infrastructure for tool-augmented LLM agents. @vmoens

[Feature] Tool services (9ca0e407)
New torchrl/services/ module for tool execution
Python executor service for safe code execution
MCP (Model Context Protocol) tool integration
Web search tool example
609 lines of documentation in docs/source/reference/services.rst
Comprehensive test coverage in test/test_services.py
[Feature] Transform Module - ModuleTransform and Ray Service Refactor (7b85c716)
New ModuleTransform for applying nn.Modules as transforms
Refactored Ray service integration
Moves ray_service.py to torchrl/envs/transforms/

torch.compile Compatibility

Fixes to enable torch.compile with various TorchRL components. @vmoens

[BugFix] Fix value functions for torch.compile compatibility (3bdc7b18)
295 lines of new tests in test/compile/test_value.py
Fixes value function implementations for compile compatibility
[BugFix] Fix TDLambdaEstimator for torch.compile compatibility (11e22ee9)
Updates TDLambdaEstimator to work with torch.compile
[Feature] Add compile-aware timing and profiling helpers (764d9f71)
New utilities for profiling compiled code
[Refactor] Add compile-aware timing and profiling helpers (c82447e6)
Performance utilities that work correctly under torch.compile

Features

[Feature] Added EXP3 Scoring function (#3013) @ParamThakkar123
Implements the EXP3 (Exponential-weight algorithm for Exploration and Exploitation) scoring function for MCTS
Adds torchrl/data/map/ module with hash-based storage, query utilities, and tree structures for efficient state lookups
[Feature] Add num_workers parameter to BraxEnv (#3370) @ParamThakkar123
Allows running multiple Brax environments in parallel via ParallelEnv by specifying num_workers > 1
Consistent API with GymEnv and DMControlEnv multi-env support
[Feature] Added num_envs parameter in GymEnv (#3343) @ParamThakkar123
When num_envs > 1, GymEnv automatically returns a lazy ParallelEnv wrapping multiple environment instances
Simplifies multi-env setup without manually constructing ParallelEnv
[Environments] Added Procgen environments (#3331) @ParamThakkar123
New ProcgenWrapper and ProcgenEnv classes to wrap OpenAI Procgen environments
Converts Procgen observations to TorchRL-compliant TensorDict outputs
Supports all 16 Procgen game environments (coinrun, starpilot, etc.)
[Feature] Loss make_value_estimator takes a ValueEstimatorBase class (#3336) @ParamThakkar123
Allows passing a ValueEstimatorBase class directly to make_value_estimator() for more flexible value estimation configuration
Adds tests and documentation for the new API
[Feature] Make custom_range public in ActionDiscretizer (#3333) @vmoens
Exposes custom_range parameter in ActionDiscretizer for user-defined discretization ranges
[Feature] Added num_envs parameter in DMControlEnv (#3337) @ParamThakkar123
Similar to GymEnv, allows running multiple DMControl environments via num_workers parameter
Returns a lazy ParallelEnv when num_workers > 1
[Feature] Auto-configure exploration module specs from environment (#3317) @bsprenger
Collectors now automatically configure exploration module specs (like AdditiveGaussianModule) from the environment's action spec
Adds set_exploration_modules_spec_from_env() utility function
Reduces boilerplate when using exploration modules with delayed spec initialization
[Feature] Add NPU Support for Single Agent (#3229) @lowdy1
Adds Huawei NPU (Ascend) device support for single-agent training
Updates device handling logic to recognize NPU devices
[Feature] Add support for trackio (#3196) @Xmaster6y
Integrates with trackio for experiment tracking and logging
New TrackioLogger class for seamless integration
[Feature] Add a new Trainer hook point process_loss (#3259) @Xmaster6y
Adds a new hook point in the Trainer that runs after loss computation but before backward pass
Useful for loss modification, logging, or gradient accumulation strategies
[Feature] Ensure MultiSyncDataCollectors returns data ordered by worker id (#3243) @LCarmi
Data from MultiSyncDataCollectors is now consistently ordered by worker ID
Makes debugging and analysis easier when working with multi-worker collectors
[Feature] Add TanhModuleConfig (#3255) @bsprenger
Adds configuration class for TanhModule to support serialization and hydra-style configs
[Feature] add TensorDictSequentialConfig (#3248) @bsprenger
Adds configuration class for TensorDictSequential modules
[Feature] Weight loss outputs when using prioritized sampler (#3235) @vmoens
Loss modules now properly weight outputs when using prioritized replay buffers
Adds importance_key parameter to loss functions (DQN, DDPG, SAC, TD3, TD3+BC)
When importance weights are present in the sampled data, losses are automatically weighted
Adds comprehensive tests for prioritized replay buffer integration with all loss functions
[Feature] Composite specs can create named tensors with 'zero' and 'rand' (#3214) @louisfaury
Composite specs now propagate dimension names when creating tensors via zero() and rand()
[Feature] Enable storing rollouts on a different device (#3199) @Xmaster6y
Collectors can now store rollouts on a different device than the execution device
Useful for CPU storage while running on GPU
[Feature] Named dims in Composite (#3174) @vmoens
Adds named dimension support to Composite specs via names property and refine_names() method
Enables better integration with PyTorch's named tensors
Supports ellipsis notation for partial name specification

Bug Fixes

[BugFix] Fix TransformersWrapper ChatHistory.full not being set (#3375) @vmoens
[BugFix] Fix AsyncEnvPool + LLMCollector with yield_completed_trajectories (#3373) @vmoens
Fixes interaction between AsyncEnvPool and LLMCollector when yielding completed trajectories
[BugFix] Fix LLM test failures (#3360) @vmoens
Comprehensive fixes for LLM-related test failures across multiple modules
[BugFix] Fixed MultiSyncCollector set_seed and split_trajs issue (#3352) @ParamThakkar123
Fixes seed propagation and trajectory splitting in MultiSyncDataCollector
[BugFix] Fix VecNormV2 device gathering (#3368) @vmoens
Fixes device handling when gathering statistics in VecNormV2 transform
[BugFix] Fix AsyncEnv - LLMCollector integration (#3365) @vmoens
Resolves integration issues between async environments and LLM collectors
[BugFix] Fix VecNormV2 GPU device handling for stateful mode (#3364) @BY571
Fixes GPU device handling in VecNormV2 when using stateful normalization mode
[BugFix] Fix Ray collector iterator bug and process group cleanup (#3363) @vmoens
Fixes iterator behavior in Ray-based collectors and ensures proper cleanup of process groups
[BugFix] Fix LLM CI by replacing uvx with uv package (#3356) @vmoens
[Bugfix] Wrong minari download first element (#3106) @marcosgalleterobbva
Fixes incorrect first element handling when downloading Minari datasets
[BugFix,Test] Fix envpool failing tests (#3345) @vmoens
[BugFix] Fixed ParallelEnv + aSyncDataCollector / MultiSyncDataCollector not working if replay_buffer is given (#3341) @ParamThakkar123
Fixes compatibility issue when using ParallelEnv with collectors that have a replay buffer attached
[BugFix] treat 1-D MultiDiscrete as MultiCategorical and accept flattened masks (#3342) @ParamThakkar123
Improves MultiDiscrete action space handling to accept flattened action masks
[BugFix] Fixes register_save_hook bug (#3340) @ParamThakkar123
Fixes save hook registration in replay buffers
[BugFix,Test] recompiles with string failure (#3338) @vmoens
[BugFix] Fix Safe modules annotation and doc for losses (#3334) @vmoens
[BugFix] Fix ray modules circular import (#3319) @vmoens
[BugFix] Fix SACLoss target_entropy="auto" ignoring action space dimensionality (#3292) @vmoens
Fixes automatic entropy target computation to properly account for action space dimensions
[BugFix] Fix agent_dim in multiagent nets & account for neg dims (#3290) @vmoens
Fixes agent dimension handling in multi-agent networks, including support for negative dimension indices
[BugFix] Added a missing .to(device) call in _from_transformers_generate_history (#3289) @michalgregor
[BugFix] Fix old pytorch dependencies (#3266) @vmoens
Updates compatibility shims for older PyTorch versions
[BugFix] RSSMRollout not advancing state/belief across time steps in Dreamer (#3236) @cmdout
Fixes critical bug where RSSM rollout was not properly advancing hidden states in Dreamer world model
[Doc,BugFix] Fix doc and collectors (#3250) @vmoens
[BugFix] use correct field names in InitTrackerConfig (#3245) @bsprenger
[BugFix] Use torch.zeros for argument in torch.where (#3239) @sebimarkgraf
[BugFix] Fix wrong assertion about collector and buffer (#3176) @vmoens
[BugFix] AttributeError in accept_remote_rref_udf_invocation (#3168) @vmoens

Bug Fixes (ghstack commits without PR numbers)

[BugFix] Final polish for Collector and Exploration (558396d7) @vmoens
Final fixes for collector and exploration module interactions
[BugFix] Collector robustness & Async fixes (cadf23bc) @vmoens
Improves collector robustness and fixes async collection issues
[BugFix] TensorStorage key filtering (0667a580) @vmoens
Fixes key filtering in TensorStorage
[BugFix] Replay Buffer prefetch & SliceSampler (9d34dbe7) @vmoens
Fixes prefetching behavior and SliceSampler issues
[BugFix] Fix target_entropy computation for composite action specs (06960c64) @vmoens
Correctly computes target entropy for composite action spaces
[BugFix] Fix device initialization in CrossQLoss.maybe_init_target_entropy (416e4545) @vmoens
Fixes device placement for CrossQ loss entropy initialization
[BugFix] Fix schemes and refactor collectors to make them readable (364e0386) @vmoens
Fixes weight sync schemes and improves collector code readability
[BugFix] Fix collector devices (888095fd) @vmoens
Fixes device handling in collectors
[BugFix] Fix tests (963fdd43) @vmoens
[BugFix] Fix GRPO tests and runs (47ad9d8f) @vmoens
[BugFix] Handle Lock/RLock types in EnvCreator (6985ca25) @vmoens
Properly handles threading locks in EnvCreator serialization
[BugFix] Defer filter_warnings import to avoid module load issues (8ea954cd) @vmoens
[BugFix] Fix CUDA sync in forked subprocess (4a2a2749) @vmoens
Fixes CUDA synchronization issues in forked processes
[BugFix] Fix unique ref to lambda func (b6fe45ee) @vmoens
[BugFix] Add pybind11 check and Windows extension pattern fix (5e89e4e7) @vmoens

Additional Features (ghstack commits without PR numbers)

[Feature] Storage Shared Initialization for Multiprocessing (61c178e5) @vmoens
Enables shared storage initialization for multiprocessing collectors
Updates distributed collector examples
[Feature] Memmap storage cleanup (7f9ea748) @vmoens
Improves memmap storage cleanup and resource management
[Feature] Collector Profiling (02ed47ed) @vmoens
Adds profiling capabilities to collectors for performance analysis
[Feature] auto_wrap_envs in PEnv (d781f9e9) @vmoens
Automatic environment wrapping in ParallelEnv
[Feature] Auto-wrap lambda functions with EnvCreator for spawn compatibility (d250c182) @vmoens
Automatically wraps lambda functions with EnvCreator for spawn multiprocessing compatibility
[Feature] Collectors' getattr_policy and getattr_env (fbdbb617) @vmoens
Adds attribute access methods for policy and env in collectors
[Feature] track_policy_version in collectors.py (a089cc48) @vmoens
Adds policy version tracking in collectors for distributed training
[Feature] Aggregation strategies (0e6a3565) @vmoens
New aggregation strategies for multi-worker data collection
[Feature] kl_mask_threshold (7ab48a4c) @vmoens
Adds KL divergence masking threshold for LLM training
[Feature] Add timing options (13434ebb) @vmoens
Additional timing options for performance monitoring
[Feature] float32 patch (01d2801b) @vmoens
Float32 precision handling improvements
[Feature] Support callable scale in IndependentNormal and TanhNormal distributions (d3aba7d1) @vmoens
Allows scale parameter to be a callable in distribution modules

Documentation

[Doc] Huge doc refactoring (3d5dd1ac) @vmoens
Major documentation restructuring with new sections:
- collectors_basics.rst, collectors_single.rst, collectors_distributed.rst
- collectors_weightsync.rst, collectors_replay.rst
- data_datasets.rst, data_replaybuffers.rst, data_samplers.rst
Adds pre-commit hook for Sphinx section underline checking
[Docs] Update LLM_TEST_ISSUES.md with fix status (#3374) @vmoens
[Docs] Enable doc builds and tutorial runs (#3335) @vmoens
Re-enables Sphinx-gallery tutorial execution in documentation builds
Adds process cleanup between tutorials to prevent resource leaks

CI / Infrastructure

[CI] Speed up slow tests in tests-gpu/tests-cpu (#3395) @vmoens
Optimizes slow test execution with better parallelization and test isolation
[CI] Bump version to 0.11.0 (#3392) @vmoens
[CI] Fix auto-label workflow for fork PRs (#3388) @vmoens
[CI] Better release workflow (#3386) @vmoens
Comprehensive release workflow with sanity checks, wheel collection, docs updates, and PyPI publishing
Adds dry-run mode for testing releases
Automatically updates stable docs symlink and versions.html
[CI] Fix auto-labelling (#3387) @vmoens
[CI] Auto-tag PRs (#3381) @vmoens
Adds automatic labeling for PRs based on changed files
[CI] Add release agent prompt for LLM-assisted releases (#3380) @vmoens
Adds comprehensive release instructions for LLM-assisted release automation
[CI] Add release workflow with PyPI trusted publishing (#3379) @vmoens
Implements PyPI trusted publishing via OIDC authentication
Eliminates need for PyPI API tokens in secrets
[CI] Add granular label support for environment and data workflow jobs (#3371) @vmoens
[CI] Fix PettingZoo CI by updating Python 3.9 to 3.10 (#3362) @vmoens
[CI] Fix GenDGRL CI by adding missing requests dependency (#3361) @vmoens
[CI] Fix IsaacLab tests by using explicit conda Python path (#3358) @vmoens
[CI] Fix M1 build pip command not found (#3359) @vmoens
[CI] Fix Windows build for free-threaded Python (3.13t, 3.14t) (#3357) @vmoens
[CI,BugFix] Fix Habitat CI by upgrading to Python 3.10 (#3346) @vmoens
Upgrades Habitat CI to Python 3.10 and builds habitat-sim from source
[CI] Fix Jumanji CI by adding missing requests dependency (#3349) @vmoens
[CI] Fix Chess CI by correcting test file path typo (#3353) @vmoens
[CI] Skip Windows-incompatible tests in optional deps CI (#3348) @vmoens
[CI] Fix GPU benchmark failures (#3347) @vmoens
[CI] Upgrade doc python version (#3222) @vmoens
[CI] Use pip install (#3200) @vmoens
Migrates CI workflows from setup.py install to pip install

CI / Infrastructure (ghstack commits without PR numbers)

[Setup] Python 3.14 in, python 3.9 out (ff86ab78) @vmoens
Adds Python 3.14 support and removes Python 3.9
[CI] LLM tests integration (cccfaa69) @vmoens
Integrates LLM tests into CI pipeline
[CI] Fix benchmarks for LLMs (e7ec9c31) @vmoens
[CI] Test setup (2644b899) @vmoens
[CI] Isolate distributed tests (5c5992d9) @vmoens
[CI] Fix uv install with --no-deps (546a1b73) @vmoens
[CI] Fix SOTA runs (57000fcc) @vmoens
[CI] Fix libs (071d079f) @vmoens
[CI] Fix missing librhash0 in doc CI (8d0f0613) @vmoens
[CI] Add --upgrade flag for torch installs and install ffmpeg/xvfb (ce9931943) @vmoens
[CI] Quiet coverage combine (1ac0fd0a) @vmoens
[CI] Fix tensordict install in lib tests (8f487d40) @vmoens

Tests

[Tests] Check traj_ids shape with unbatched envs (#3393) @vmoens
[Tests] Fix test isolation in test_set_gym_environments and related tests (#3382) @vmoens
Improves test isolation to prevent cross-test contamination
[Test] Use mock Llama tokenizer instead of skipping gated model test (#3376) @vmoens
[Test] Skip Llama test when tokenizer unavailable instead of xfail (#3372) @vmoens
[Tests] Remove _utils_internal.py in tests (#3281) @vmoens
Removes deprecated internal test utilities and updates tests to use public APIs
[Test] Check LinearizeReward obs transform (#3241) @vmoens
[Test,Benchmark] Move mocking classes file and bench for non-tensor env (#3257) @vmoens
[Tests] Fix vmas seeding test (#3210) @matteobettini
[Tests] Reintroduce VMAS football (#3178) @matteobettini

Tests (ghstack commits without PR numbers)

[Test] Use spawn in older pytorch (9b0492906) @vmoens
[Test] Fix test_num_threads that instantiate the env in the main process (852dd61b) @vmoens
[Test] Use class references instead of lambdas in transform tests (7dd1e61d) @vmoens
[Test] Add test for lambda wrapping in ParallelEnv (4b2b2279) @vmoens
[Test] Ensure collector shutdown with try/finally (aa2b0311) @vmoens
[Test] Add filterwarnings for unclosed resources (b493ab34) @vmoens
[Test] Add retry decorator to flaky vecnorm test (6d908b69) @vmoens
[Test] Simplify num_threads test to check threads in env factory (295cc20e) @vmoens
[Test] Ignore script_method deprecation warning (eaaf97a4) @vmoens
[Test] Fix failing 3.14 tests (66a48f3c) @vmoens
[Test] Remove the forked decorator of the distributed checks (4db59660) @vmoens
[Test] Test RB+Isaac+Ray (a0d650f9) @vmoens
[Test] Fix flaky parallel test (4f013a81) @vmoens
[Test] Fix vc1 (bd93f136) @vmoens
[Quality] Fix flaky test (03ffc820) @vmoens

Refactors / Maintenance

[Refactor] Updated num_envs parameter to num_workers for consistency (#3354) @ParamThakkar123
Renames num_envs to num_workers for consistency across environment wrappers
[Quality] replace reduce to reduction, better error message for invalid mask (#3179) @Kayzwer

Refactors (ghstack commits without PR numbers)

[Refactor] Move WEIGHT_SYNC_TIMEOUT to collectors._constants (3ceb6b9a) @vmoens
Centralizes weight sync timeout configuration
[Refactor] Remove *TensorSpec classes (7ea8d861) @vmoens
Removes legacy TensorSpec class aliases
[Refactor] Rename collectors (8779f2a3) @vmoens
Renames collector classes for consistency
[Refactor] Non-daemonic processes in PEnv (aeb2e9b2) @vmoens
Changes ParallelEnv to use non-daemonic processes
[Refactor] Make env creator optional for Ray (b599d9b8) @vmoens
Makes environment creator optional in Ray collectors
[Refactor] Move decorate_thread_sub_func to torchrl.testing.mp_helpers (e3e9e6a7) @vmoens
Moves multiprocessing test helpers to proper location
[Refactor] Use non_blocking transfers in distribution modules (5c75777f) @vmoens
Uses non-blocking GPU transfers in distribution modules for better performance
[Refactor] Remove MUJOCO_EGL_DEVICE_ID auto-setting from torchrl init (8fac4c54) @vmoens
[Refactor] Add WEIGHT_SYNC_TIMEOUT constant for collector weight synchronization (ab3768aa) @vmoens
[BugFix,Test,Refactor] Refactor tests (eb8a885b) @vmoens
[Refactor,Test] Move compile test to dedicated folder (253b8ddd) @vmoens

Dependencies

[Dependencies] Bump ray from 2.46.0 to 2.52.1 (#3258) @dependabot
[Environment] Fix envpool wrapper (#3339) @vmoens
Updates envpool wrapper for compatibility with latest envpool versions

Typing

[Typing] Edit wrongfully set str type annotations (e7583b3b) @vmoens
Fixes incorrect string type annotations across the codebase

Other

Revert "replace reduce to reduction, better error message for invalid mask" (#3182) @vmoens
Fix (#3180) @matteobettini
[Versioning] Fix doc versioning (#3175) @vmoens

Contributors

Thanks to all contributors to this release:

@vmoens (Vincent Moens)
@ParamThakkar123 (Param Thakkar)
@bsprenger (Ben Sprenger)
@Xmaster6y (Yoann Poupart)
@BY571
@LCarmi (Luca Carminati)
@louisfaury (Faury Louis)
@matteobettini (Matteo Bettini)
@lowdy1
@michalgregor (Michal Gregor)
@cmdout
@sebimarkgraf (Sebastian Moßburger)
@Kayzwer
@marcosgalleterobbva (Marcos Galletero Romero)
@dependabot

Source: README.md, updated 2026-01-27

TorchRL Files

A modular, primitive-first, python-first PyTorch library

TorchRL v0.11.0 Release Notes

Highlights

Installation

Breaking Changes

Dreamer World Model Improvements

Weight Synchronization Schemes

LLM Training: DAPO & CISPO

Trainer Infrastructure

Tool Services for LLM Agents

torch.compile Compatibility

Features

Bug Fixes

Bug Fixes (ghstack commits without PR numbers)

Additional Features (ghstack commits without PR numbers)

Documentation

CI / Infrastructure

CI / Infrastructure (ghstack commits without PR numbers)

Tests

Tests (ghstack commits without PR numbers)

Refactors / Maintenance

Refactors (ghstack commits without PR numbers)

Dependencies

Typing

Other

Contributors

TorchRL Files

A modular, primitive-first, python-first PyTorch library

Get an email when there's a new version of TorchRL

TorchRL v0.11.0 Release Notes

Highlights

Installation

Breaking Changes

Dreamer World Model Improvements

Weight Synchronization Schemes

LLM Training: DAPO & CISPO

Trainer Infrastructure

Tool Services for LLM Agents

torch.compile Compatibility

Features

Bug Fixes

Bug Fixes (ghstack commits without PR numbers)

Additional Features (ghstack commits without PR numbers)

Documentation

CI / Infrastructure

CI / Infrastructure (ghstack commits without PR numbers)

Tests

Tests (ghstack commits without PR numbers)

Refactors / Maintenance

Refactors (ghstack commits without PR numbers)

Dependencies

Typing

Other

Contributors