[nav2_mppi] Add optional CUDA-accelerated backend for Jetson/Edge devices #5956

@functionhx-art

Feature request

Feature description

Currently, nav2_mppi relies entirely on the CPU for trajectory sampling and evaluation. This is efficient on desktop-class CPUs but becomes a significant bottleneck on edge computing platforms like NVIDIA Jetson Orin. In obstacle-dense environments, or with a large number of sampled trajectories ($K > 500$), CPU utilization spikes and often limits the control frequency to sub-optimal levels (e.g., below 20 Hz), which restricts the robot's ability to maneuver at high speed.

I propose adding an optional CUDA-accelerated backend that offloads these computations to the GPU, with the goal of improving real-time performance on ARM-based SoCs that include an NVIDIA GPU.

Implementation considerations

I suggest implementing this as an optional plugin-based optimization. Key technical points include:

  • Parallel Computing: Use cuRAND for parallel noise generation and custom CUDA kernels for trajectory rollouts and scoring.
  • Memory Optimization: Leverage Unified Memory (managed memory) to minimize host-to-device data transfer overhead, specifically targeting Jetson devices, where the CPU and GPU share the same physical memory.
  • Build System: The CUDA backend will be gated behind a CMake flag (e.g., -DENABLE_CUDA=ON), so default builds stay unchanged and non-NVIDIA users take on no additional dependencies.
  • Pros:
    • Much higher sampling density (e.g., $K > 2000$).
    • Significantly lower latency and higher control frequency.
    • Reduced CPU overhead for other critical tasks like perception or localization.
  • Cons:
    • Additional build-time dependency on the CUDA Toolkit for developers who explicitly enable this feature.
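
To make the parallelism argument concrete, here is a minimal CPU-side sketch in plain C++ of the sample, rollout, and score steps listed above. It is only an illustration under stated assumptions: `State`, `Control`, `sampleNoise`, `rolloutCost`, and `scoreSamples` are hypothetical names, not nav2_mppi classes or functions, the motion model is a simple unicycle, and the cost is just the squared distance from the final state to the goal. `std::mt19937` stands in for the role cuRAND would play on the GPU.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <random>
#include <vector>

// Hypothetical types for illustration; these are not nav2_mppi classes.
struct State { double x, y, yaw; };
struct Control { double v, w; };  // linear and angular velocity

// Draw K*T Gaussian control perturbations (the job cuRAND would do on a GPU).
// Layout: noise[k * T + t] is the perturbation for sample k at time step t.
std::vector<Control> sampleNoise(std::size_t K, std::size_t T,
                                 double sigma_v, double sigma_w, unsigned seed)
{
  std::mt19937 rng(seed);
  std::normal_distribution<double> nv(0.0, sigma_v);
  std::normal_distribution<double> nw(0.0, sigma_w);
  std::vector<Control> noise(K * T);
  for (auto & n : noise) {
    n.v = nv(rng);
    n.w = nw(rng);
  }
  return noise;
}

// Roll out sample k with a unicycle model and return its cost.
// It reads only shared, read-only inputs and its own slice of the noise,
// so no sample depends on any other sample.
double rolloutCost(std::size_t k, std::size_t T, const State & start,
                   const std::vector<Control> & nominal,
                   const std::vector<Control> & noise,
                   double dt, const State & goal)
{
  State s = start;
  for (std::size_t t = 0; t < T; ++t) {
    const double v = nominal[t].v + noise[k * T + t].v;
    const double w = nominal[t].w + noise[k * T + t].w;
    s.x += v * std::cos(s.yaw) * dt;
    s.y += v * std::sin(s.yaw) * dt;
    s.yaw += w * dt;
  }
  const double dx = s.x - goal.x;
  const double dy = s.y - goal.y;
  return dx * dx + dy * dy;  // toy cost: squared distance to the goal
}

// Score all K samples. On a GPU, this loop is what a kernel launch would
// replace, with one thread handling each value of k.
std::vector<double> scoreSamples(std::size_t K, const State & start,
                                 const std::vector<Control> & nominal,
                                 const std::vector<Control> & noise,
                                 double dt, const State & goal)
{
  const std::size_t T = nominal.size();
  assert(noise.size() == K * T);
  std::vector<double> costs(K);
  for (std::size_t k = 0; k < K; ++k) {
    costs[k] = rolloutCost(k, T, start, nominal, noise, dt, goal);
  }
  return costs;
}
```

Because each call to `rolloutCost` touches only its own slice of `noise` and writes a single entry of `costs`, the loop over `k` has no cross-iteration dependencies. That independence is the property a CUDA backend would exploit; the real critics in nav2_mppi are more involved than this toy cost, so the sketch says nothing about the achievable speedup.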

Recent research, such as "MPPI-Generic: A CUDA Library for Stochastic Trajectory Optimization" (arXiv:2409.07563), has already demonstrated the feasibility and performance gains of such an approach. I am a Robotics Algorithm Engineer and I am willing to contribute the implementation and a PR for this feature.
