
Gradients suddenly explode around step 1000 — Mean value_function loss: inf #107

@xiaoc10020015

Description


################################################################################
Learning iteration 10108/50000

                   Computation: 57897 steps/s (collection: 1.551s, learning 0.147s)
         Mean action noise std: 0.53
      Mean value_function loss: inf
           Mean surrogate loss: 0.0000
             Mean entropy loss: 21.9732
                   Mean reward: 16.13
           Mean episode length: 976.62

Episode_Reward/track_lin_vel_xy: 0.7884
Episode_Reward/track_ang_vel_z: 0.1463
Episode_Reward/alive: 0.1468
Episode_Reward/base_linear_velocity: -0.0255
Episode_Reward/base_angular_velocity: -0.0898
Episode_Reward/joint_vel: -0.2325
Episode_Reward/joint_acc: -0.0927
Episode_Reward/action_rate: -0.7571
Episode_Reward/dof_pos_limits: -0.0077
Episode_Reward/energy: -0.0061
Episode_Reward/joint_deviation_arms: -0.1371
Episode_Reward/joint_deviation_waists: -0.0967
Episode_Reward/joint_deviation_legs: -0.1566
Episode_Reward/flat_orientation_l2: -0.0198
Episode_Reward/base_height: -0.0009
Episode_Reward/gait: 0.4804
Episode_Reward/feet_slide: -0.0511
Episode_Reward/feet_clearance: 0.9356
Episode_Reward/undesired_contacts: -0.0080
Curriculum/terrain_levels: 4.7236
Curriculum/lin_vel_cmd_levels: 1.0000
Metrics/base_velocity/error_vel_xy: 0.4269
Metrics/base_velocity/error_vel_yaw: 1.7772
Episode_Termination/time_out: 3.7500
Episode_Termination/base_height: 0.0000
Episode_Termination/bad_orientation: 0.2917

               Total timesteps: 993755136
                Iteration time: 1.70s
                  Time elapsed: 04:32:11
                           ETA: 17:54:05
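Once the value-function loss reads `inf`, every subsequent optimizer step pushes non-finite values into the network weights, which is how the policy's std eventually becomes NaN and the sampler crashes below. A minimal defensive guard (a sketch only — `safe_update_step`, its parameters, and where it would hook in are assumptions, not part of rsl_rl's actual `PPO.update`) is to skip any update whose loss is not finite and clip gradients otherwise:

```python
import torch


def safe_update_step(loss: torch.Tensor, optimizer: torch.optim.Optimizer,
                     params, max_grad_norm: float = 1.0) -> bool:
    """Hypothetical guard: skip the step when the loss is non-finite,
    otherwise clip gradients before stepping. Illustrative, not rsl_rl API."""
    if not torch.isfinite(loss):
        optimizer.zero_grad()
        return False  # divergence detected; drop this minibatch update
    optimizer.zero_grad()
    loss.backward()
    torch.nn.utils.clip_grad_norm_(params, max_grad_norm)
    optimizer.step()
    return True
```

This does not fix the underlying divergence (a smaller learning rate, reward/observation normalization, or a lower value-loss coefficient are the usual levers), but it prevents one bad batch from corrupting the weights irreversibly.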

Error executing job with overrides: []
Traceback (most recent call last):
  File "/home/jeff/Downloads/IsaacLab-2.2.0/source/isaaclab_tasks/isaaclab_tasks/utils/hydra.py", line 101, in hydra_main
    func(env_cfg, agent_cfg, *args, **kwargs)
  File "/home/jeff/Desktop/humanoid_project/scripts/rsl_rl/train.py", line 204, in main
    runner.learn(num_learning_iterations=agent_cfg.max_iterations, init_at_random_ep_len=True)
  File "/home/jeff/anaconda3/envs/isaac/lib/python3.11/site-packages/rsl_rl/runners/on_policy_runner.py", line 262, in learn
    loss_dict = self.alg.update()
                ^^^^^^^^^^^^^^^^^
  File "/home/jeff/anaconda3/envs/isaac/lib/python3.11/site-packages/rsl_rl/algorithms/ppo.py", line 260, in update
    self.policy.act(obs_batch, masks=masks_batch, hidden_states=hid_states_batch[0])
  File "/home/jeff/anaconda3/envs/isaac/lib/python3.11/site-packages/rsl_rl/modules/actor_critic.py", line 122, in act
    return self.distribution.sample()
           ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/jeff/anaconda3/envs/isaac/lib/python3.11/site-packages/torch/distributions/normal.py", line 74, in sample
    return torch.normal(self.loc.expand(shape), self.scale.expand(shape))
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
RuntimeError: normal expects all elements of std >= 0.0

Set the environment variable HYDRA_FULL_ERROR=1 for a complete stack trace.
2026-01-15T06:26:12Z [16,454,501ms] [Warning] [omni.physx.plugin] USD stage detach not called, holding a loose ptr to a stage!
2026-01-15T06:26:13Z [16,455,410ms] [Warning] [carb] Recursive unloadAllPlugins() detected!
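The final `RuntimeError` can be reproduced in isolation: once the actor's std tensor contains NaN (NaN fails PyTorch's `std >= 0` check), `Normal.sample()` raises exactly this error. A minimal sketch, assuming the divergence has already poisoned the std — the clamp shown is a possible mitigation, not something rsl_rl does:

```python
import torch

# Reproduce: a NaN std fails torch.normal's "std >= 0.0" check on sampling.
mean = torch.zeros(4)
bad_std = torch.full((4,), float("nan"))
try:
    # validate_args=False skips the constructor check, so the failure
    # surfaces inside sample(), as in the traceback above.
    torch.distributions.Normal(mean, bad_std, validate_args=False).sample()
except RuntimeError as e:
    print(e)  # prints the same "normal expects all elements of std >= 0.0"

# One possible band-aid (hypothetical patch): sanitize the std before
# building the distribution, replacing NaN and bounding the range.
safe_std = torch.nan_to_num(bad_std, nan=1.0).clamp(min=1e-6, max=10.0)
sample = torch.distributions.Normal(mean, safe_std).sample()
```

Clamping only masks the symptom; the root cause is the `inf` value-function loss several hundred iterations earlier, so it is worth logging `torch.isfinite` checks on the critic loss to catch the divergence when it starts.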
