
[Feature] MuJoCo custom envs with selectable physics backend #3700

Merged
vmoens merged 2 commits into pytorch:main from vmoens:mujoco-torch-envs on May 30, 2026

@vmoens vmoens commented May 1, 2026

Collaborator

Summary

  • Adds torchrl/envs/custom/mujoco/ -- a base MujocoEnv(EnvBase, abc.ABC) plus 5 task subclasses (HumanoidEnv, AntEnv, Walker2dEnv, HopperEnv, SatelliteEnv).
  • Three swappable physics backends behind one kwarg: mujoco-torch (default; native torch + vmap, torch.compile-friendly), mjx (JAX-vmap+jit, DLPack-bridged), mujoco (C-bindings, single-env, composed via ParallelEnv/SerialEnv through a metaclass).
  • Rendering supported on every backend: from_pixels=True / pixel_only / render_width / render_height / camera_id. Torch backend uses mujoco-torch's pure-torch ray-cast renderer; mjx/mujoco use mujoco.Renderer.

Design

MujocoEnv(EnvBase, abc.ABC, metaclass=_MujocoMeta)
  - abstract _compute_reward / _compute_done
  - default _make_obs (cat(qpos[skip:], qvel)); subclasses override for richer obs
  - xml_path kwarg accepts local path *or* http(s) URL (no need to vendor)
  - _MujocoMeta dispatches batching:
      mujoco-torch / mjx: num_envs=N (vmap). num_workers / parallel raise.
      mujoco:             num_envs and num_workers are aliases (mutually exclusive);
                          parallel=True -> ParallelEnv (default), False -> SerialEnv.
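The batching rules above can be sketched as a small validation function. This is a minimal illustration, not the code in _MujocoMeta: the names `plan_dispatch` and `DispatchPlan`, the `backend` keyword name, and the behaviour when no batch size is given are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Optional

# Backends that batch in-process via vmap (per the design notes above).
VMAP_BACKENDS = {"mujoco-torch", "mjx"}


@dataclass(frozen=True)
class DispatchPlan:
    """Hypothetical summary of how an env would be constructed."""
    backend: str
    batch_size: Optional[int]   # vmapped envs inside a single instance
    wrapper: Optional[str]      # "ParallelEnv", "SerialEnv", or None
    num_workers: Optional[int]  # worker count handed to the wrapper


def plan_dispatch(backend="mujoco-torch", num_envs=None,
                  num_workers=None, parallel=None):
    """Validate batching kwargs and describe the resulting construction."""
    if backend in VMAP_BACKENDS:
        # vmap backends only understand num_envs.
        if num_workers is not None or parallel is not None:
            raise ValueError(
                f"{backend!r} batches with num_envs; "
                "num_workers / parallel are not supported"
            )
        return DispatchPlan(backend, num_envs, None, None)

    if backend != "mujoco":
        raise ValueError(f"unknown backend {backend!r}")

    # C-bindings backend: num_envs and num_workers are aliases.
    if num_envs is not None and num_workers is not None:
        raise ValueError("num_envs and num_workers are mutually exclusive")
    n = num_envs if num_envs is not None else num_workers
    if n is None:
        # Assumption: with no count given, a plain single env is built.
        return DispatchPlan(backend, None, None, None)

    # parallel defaults to True, so only an explicit False selects SerialEnv.
    wrapper = "SerialEnv" if parallel is False else "ParallelEnv"
    return DispatchPlan(backend, None, wrapper, n)
```

For example, `plan_dispatch("mujoco", num_workers=4)` describes a 4-worker ParallelEnv, while `plan_dispatch("mjx", num_workers=4)` raises.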

_PhysicsBackend
  - reset(qpos, qvel) / reset_mask / step(ctrl, frame_skip)
  - qpos / qvel / time properties
  - render(camera_id, width, height, background) -> (B, H, W, 3) uint8
  - _TorchBackend, _MJXBackend, _MujocoBackend
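To make the backend contract concrete, here is a rough sketch of the interface as an abstract base class, with a toy pure-Python implementation. The class names `PhysicsBackendSketch` and `ToyBackend`, the `reset_mask(mask, qpos, qvel)` argument list, and the per-env `time` are assumptions for illustration; only the method and property names come from the outline above.

```python
import abc


class PhysicsBackendSketch(abc.ABC):
    """Hypothetical rendering of the _PhysicsBackend contract."""

    @abc.abstractmethod
    def reset(self, qpos, qvel): ...

    @abc.abstractmethod
    def reset_mask(self, mask, qpos, qvel): ...  # argument list is assumed

    @abc.abstractmethod
    def step(self, ctrl, frame_skip): ...

    @property
    @abc.abstractmethod
    def qpos(self): ...

    @property
    @abc.abstractmethod
    def qvel(self): ...

    @property
    @abc.abstractmethod
    def time(self): ...

    @abc.abstractmethod
    def render(self, camera_id, width, height, background): ...


class ToyBackend(PhysicsBackendSketch):
    """B independent 1-DoF point masses driven by an acceleration command."""

    def __init__(self, num_envs, dt=0.01):
        self._n, self._dt = num_envs, dt
        self.reset([0.0] * num_envs, [0.0] * num_envs)

    def reset(self, qpos, qvel):
        self._qpos, self._qvel = list(qpos), list(qvel)
        self._time = [0.0] * self._n

    def reset_mask(self, mask, qpos, qvel):
        # Only envs flagged in the mask are reset; the rest keep their state.
        for i, flagged in enumerate(mask):
            if flagged:
                self._qpos[i], self._qvel[i], self._time[i] = qpos[i], qvel[i], 0.0

    def step(self, ctrl, frame_skip):
        # Semi-implicit Euler, repeated frame_skip times per env step.
        for _ in range(frame_skip):
            for i in range(self._n):
                self._qvel[i] += ctrl[i] * self._dt
                self._qpos[i] += self._qvel[i] * self._dt
                self._time[i] += self._dt

    @property
    def qpos(self):
        return list(self._qpos)

    @property
    def qvel(self):
        return list(self._qvel)

    @property
    def time(self):
        return list(self._time)

    def render(self, camera_id, width, height, background):
        # Nested lists shaped (B, H, W, 3), filled with the background colour.
        return [[[list(background) for _ in range(width)]
                 for _ in range(height)] for _ in range(self._n)]
```

Keeping the env logic against an interface like this is what lets the reward, done, and observation code stay identical across the three real backends.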

The satellite task's reward includes a manipulability-based singularity penalty: r = -|q_err| - lambda_s/sqrt(det(J J^T) + eps) - lambda_u * |a|^2, where J is the CMG cluster Jacobian and a is the action. Two cluster geometries are supported, a 4-CMG pyramid (skew = arctan(sqrt(2))) and a 6-CMG orthogonal cluster, both hardcoded in _math.py. Target attitude is sampled uniformly on SO(3) at reset via Marsaglia's method.
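The two pieces of math in this paragraph can be sketched in plain Python. This is an illustration under assumptions, not the contents of _math.py: the function names are hypothetical, |q_err| is taken to be a Euclidean norm, J is passed in as a 3 x N nested list, and the `eps` default is a placeholder. The sampler is Marsaglia's rejection method for a uniform point on the 3-sphere, read as a unit quaternion.

```python
import math
import random


def sample_unit_quaternion(rng):
    """Marsaglia's method: uniform point on S^3, i.e. a uniform rotation."""
    while True:
        x1, y1 = rng.uniform(-1.0, 1.0), rng.uniform(-1.0, 1.0)
        s1 = x1 * x1 + y1 * y1
        if s1 < 1.0:
            break
    while True:
        x2, y2 = rng.uniform(-1.0, 1.0), rng.uniform(-1.0, 1.0)
        s2 = x2 * x2 + y2 * y2
        if 0.0 < s2 < 1.0:
            break
    scale = math.sqrt((1.0 - s1) / s2)
    return (x1, y1, x2 * scale, y2 * scale)


def gram3(J):
    """J J^T for a Jacobian given as 3 rows of N entries."""
    return [[sum(a * b for a, b in zip(ri, rj)) for rj in J] for ri in J]


def det3(m):
    """Determinant of a 3x3 matrix by cofactor expansion."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)


def satellite_reward(q_err, J, action, *, lambda_s, lambda_u, eps=1e-6):
    """r = -|q_err| - lambda_s / sqrt(det(J J^T) + eps) - lambda_u * |a|^2."""
    attitude = math.sqrt(sum(e * e for e in q_err))
    # det(J J^T) is non-negative in exact arithmetic; clamp float round-off.
    manipulability_sq = max(det3(gram3(J)), 0.0)
    singularity = lambda_s / math.sqrt(manipulability_sq + eps)
    effort = lambda_u * sum(a * a for a in action)
    return -attitude - singularity - effort
```

The `eps` term is what keeps the penalty finite when J loses rank: at an exact singularity the penalty saturates at lambda_s/sqrt(eps) instead of diverging.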

XML assets

Bundled under torchrl/envs/custom/mujoco/assets/ with a NOTICE preserving the upstream Apache-2.0 attribution (DeepMind / mujoco-torch). Subclasses can also point at remote URLs to avoid re-vendoring.

Tests

test/libs/test_mujoco.py -- parametrized tests covering specs, rollouts, dispatch validation, satellite finite-reward over 50 steps, custom XML loading, and from_pixels/pixel_only/render() across all available backends. Skipped cleanly when a backend isn't installed.
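One way to get the "skipped cleanly" behaviour is to probe for each backend's import without actually importing it. The sketch below is not taken from test_mujoco.py: the helper names are hypothetical, and the mapping from backend name to module name (`mujoco_torch`, `mujoco.mjx`, `mujoco`) is an assumption about the packages' import names.

```python
import importlib.util

# Assumed import names for each backend's package.
BACKEND_MODULES = {
    "mujoco-torch": "mujoco_torch",
    "mjx": "mujoco.mjx",
    "mujoco": "mujoco",
}


def module_available(name):
    """True if `name` can be located, without raising when it is absent."""
    try:
        return importlib.util.find_spec(name) is not None
    except (ImportError, ValueError):
        # find_spec imports parent packages of dotted names, so a missing
        # parent surfaces as ModuleNotFoundError rather than a None result.
        return False


def available_backends(modules=None):
    """Backend names whose module can be located on this machine."""
    modules = BACKEND_MODULES if modules is None else modules
    return [backend for backend, mod in modules.items() if module_available(mod)]
```

In a pytest suite the result would typically feed `pytest.mark.skipif(not module_available(...), reason=...)` or the parameter list of `pytest.mark.parametrize`, so that missing backends show up as skips rather than errors.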

CI

Dedicated path-triggered workflow .github/workflows/test-linux-mujoco.yml (mirrors test-linux-libs.yml's brax pattern) with scripts under .github/unittest/linux_libs/scripts_mujoco/. Pinned versions:

  • mujoco==3.7.0, mujoco-mjx==3.7.0 (3.8.0 removed mjENBL_MULTICCD, which mujoco-torch 0.2.0 still references)
  • mujoco-torch==0.2.0
  • jax[cuda12]>=0.7.0,<0.11

Test plan

  • pytest test/libs/test_mujoco.py -v -- pending rerun after the satellite examples were split out into #3802 ([Feature] Add satellite MuJoCo SAC examples); the previous local run, before the split, had 58 passed on macOS
  • pytest test/test_custom_envs.py -v -- 8 passed (no regression on pre-existing custom envs)
  • check_env_specs passes for every (env class, backend) pair
  • Satellite singularity term remains finite over 50-step random rollouts (4 and 6 CMGs)
  • compile_step=True smoke runs on mujoco-torch backend
  • from_pixels=True produces uint8 (B, H, W, 3) tensors with non-trivial RGB content on all three backends
  • Dedicated CI workflow lights up on this PR

pytorch-bot Bot commented May 1, 2026


🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/rl/3700

Note: Links to docs will display an error until the docs builds have been completed.

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@meta-cla meta-cla Bot added the CLA Signed label May 1, 2026
@github-actions github-actions Bot added the Documentation, CI, Environments, and Feature labels May 1, 2026
@vmoens vmoens force-pushed the mujoco-torch-envs branch from 59d39b6 to a7699ec May 30, 2026 00:14
@vmoens vmoens force-pushed the mujoco-torch-envs branch from a7699ec to 3ed0658 May 30, 2026 00:19
@vmoens vmoens merged commit 97ae8cc into pytorch:main May 30, 2026
26 of 34 checks passed

Labels

CI, CLA Signed, Documentation, Environments, Examples, Feature


1 participant