SanD-Planner: Sample-Efficient Diffusion Planner in B-Spline Space for Robust Local Navigation

Notice: This research summary and analysis were automatically generated using AI technology. For authoritative details, please refer to the original arXiv source.

Generating reliable local plans in highly cluttered and dynamic environments has long been an obstacle to practical deployment. Two fundamental bottlenecks are acquiring large-scale expert demonstrations across diverse scenes and learning efficiently from limited data. This paper proposes SanD-Planner, a sample-efficient diffusion-based local planner that performs depth-image-based imitation learning within the clamped B-spline space. By operating within this compact space, the proposed algorithm inherently yields smooth outputs with bounded prediction errors over local supports, naturally aligning with receding-horizon execution. Integrating an ESDF-based safety checker with explicit clearance and time-to-completion metrics further reduces the training burden associated with value-function learning for feasibility assessment. Experiments show that, trained on only $500$ episodes (merely $0.25\%$ of the demonstration scale used by the baseline), SanD-Planner achieves state-of-the-art performance on the evaluated open benchmark, attaining success rates of $90.1\%$ in simulated cluttered environments and $72.0\%$ in indoor simulations. Zero-shot transfer to real-world experiments in both 2D and 3D scenes further supports these results. The dataset and pre-trained models will also be open-sourced.


💡 Research Summary

SanD‑Planner tackles the long‑standing problem of reliable local navigation in cluttered and dynamic environments by marrying a diffusion‑based generative policy with a compact, structured trajectory representation. The authors argue that the dominant trend in learning‑based navigation—scaling up expert demonstrations to hundreds of thousands of trajectories—poses prohibitive data and compute requirements, especially for domains where data collection is costly (e.g., marine robotics). Instead of relying on massive datasets, SanD‑Planner injects strong inductive bias at the representation level: a clamped cubic B‑spline defined by a fixed set of eight control points.

Key technical components

  1. Input encoding – At each replanning step the robot receives four consecutive depth images, a 3‑D relative goal, and the previously executed velocity. Each depth frame is processed by a lightweight ResNet‑18 backbone; intermediate feature maps are flattened into visual tokens, augmented with 2‑D positional and temporal embeddings. Goal and velocity are projected by separate MLPs. All tokens are concatenated and passed through a two‑layer Transformer encoder, yielding a multimodal context vector Cₜ.

  2. B‑spline control‑point space – The trajectory $\tau_t$ is expressed as a clamped cubic B‑spline: $\tau_t(u)=\sum_{i=0}^{N-1} B_{i,3}(u)\,Q_{t,i}$, where $Q_{t,i}\in\mathbb{R}^3$ are the control points and $N=8$. The first control point is fixed at the robot’s current pose ($Q_{t,0}=0$). This representation provides three crucial advantages: (a) parameter efficiency – a low‑dimensional manifold for the diffusion model to explore; (b) built‑in smoothness – $C^2$ continuity is guaranteed by the spline itself, removing the need for explicit smoothness losses; (c) representation‑level robustness – local support (each curve segment depends only on four neighboring control points) and the convex‑hull property bound trajectory deviation by the maximum control‑point error, preventing small prediction errors from exploding into large path distortions.

  3. Conditional diffusion policy – Expert demonstrations are first fitted with B‑splines to obtain ground‑truth control‑point vectors $x_0$. A conditional denoising diffusion probabilistic model (DDPM) with v‑prediction parameterization is trained to model $p_\theta(x_0 \mid C_t)$. During inference, the model performs $S$ denoising steps (typically 100–200) to sample $K$ candidate control‑point sets $\{Q_t^{(k)}\}_{k=1}^{K}$. Each set is mapped to a continuous trajectory via the spline basis.

  4. Geometric critic – Rather than learning a value function, the authors employ an explicit safety evaluator based on a Euclidean Signed Distance Field (ESDF). For each candidate trajectory, the minimum clearance to obstacles and an estimated time‑to‑completion are computed. A weighted cost $J(\tau)=w_1\cdot(1/\text{clearance})+w_2\cdot\text{time}$ is evaluated for every candidate, and the trajectory with the lowest $J$ is selected for execution. The initial velocity of the chosen trajectory is fed back as the previous‑velocity input $v^{\text{prev}}_{t+1}$ to maintain temporal consistency across replanning cycles.

  5. Training efficiency and results – The entire system is trained on only 500 expert episodes (≈0.25 % of the data used by the strongest baseline, NavDP). In the public benchmark of cluttered simulated environments, SanD‑Planner achieves a 90.1 % success rate; in indoor simulations it reaches 72.0 %, both surpassing or matching state‑of‑the‑art methods that rely on orders of magnitude more data. Zero‑shot transfer to a real‑world Unitree Go2 quadruped demonstrates robust navigation through narrow passages, stairs, and dynamic obstacles without any fine‑tuning.

  6. Ablation and representation study – The authors conduct a systematic comparison among three trajectory parameterizations: discrete waypoints, interpolating cubic splines, and B‑splines. Under identical training protocols, B‑splines consistently yield higher success rates, lower average path deviation under synthetic perturbations, and better sample efficiency, confirming the theoretical advantages discussed.

  7. Reproducibility – All datasets, pretrained models, and the full training/evaluation pipeline are promised to be released as open‑source, facilitating immediate replication and further research.
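The B‑spline representation in item 2 can be illustrated with a minimal sketch in plain Python. It assumes uniformly spaced interior knots on [0, 1] (the summary does not state the knot spacing), and the names `clamped_knots`, `basis`, and `evaluate`, as well as the example control points and noise level, are hypothetical. The sketch evaluates an eight‑control‑point clamped cubic spline with the Cox-de Boor recursion and compares the worst curve deviation against the worst control‑point error.

```python
import math
import random


def clamped_knots(n_ctrl=8, degree=3):
    """Clamped knot vector on [0, 1]; uniform interior knots are an assumption."""
    n_inner = n_ctrl - degree - 1
    inner = [(j + 1) / (n_inner + 1) for j in range(n_inner)]
    return [0.0] * (degree + 1) + inner + [1.0] * (degree + 1)


def basis(i, p, u, knots):
    """Cox-de Boor recursion for the basis function B_{i,p}(u)."""
    if p == 0:
        # Half-open knot spans; the last non-empty span is closed at u = 1.
        if knots[i] <= u < knots[i + 1]:
            return 1.0
        if u == knots[-1] and knots[i] < knots[i + 1] == knots[-1]:
            return 1.0
        return 0.0
    left = right = 0.0
    d1 = knots[i + p] - knots[i]
    if d1 > 0:
        left = (u - knots[i]) / d1 * basis(i, p - 1, u, knots)
    d2 = knots[i + p + 1] - knots[i + 1]
    if d2 > 0:
        right = (knots[i + p + 1] - u) / d2 * basis(i + 1, p - 1, u, knots)
    return left + right


def evaluate(ctrl, u, degree=3):
    """Evaluate tau(u) = sum_i B_{i,3}(u) * Q_i for a list of control points."""
    knots = clamped_knots(len(ctrl), degree)
    w = [basis(i, degree, u, knots) for i in range(len(ctrl))]
    return [sum(w[i] * q[d] for i, q in enumerate(ctrl)) for d in range(len(ctrl[0]))]


# Hypothetical control points in the robot frame; the first one is the origin.
ctrl = [[0.0, 0.0, 0.0]] + [[0.5 * i, 0.3 * math.sin(i), 0.0] for i in range(1, 8)]

# Perturb every control point except the fixed first one (illustrative noise).
rng = random.Random(0)
noisy = [ctrl[0]] + [[c + rng.uniform(-0.05, 0.05) for c in q] for q in ctrl[1:]]

max_ctrl_err = max(math.dist(a, b) for a, b in zip(ctrl, noisy))
us = [k / 50 for k in range(51)]
max_curve_err = max(math.dist(evaluate(ctrl, u), evaluate(noisy, u)) for u in us)
print(max_curve_err <= max_ctrl_err + 1e-12)
```

Because the basis functions are non‑negative and sum to one at every parameter value, the final comparison should hold for any perturbation, which is the bounded‑deviation property that item 2 attributes to the convex‑hull argument.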

Limitations and future directions – The current system relies solely on depth images; incorporating RGB, semantic segmentation, or proprioceptive cues could improve performance in texture‑rich or highly dynamic scenes. Extending the spline representation to variable degree or non‑uniform knot spacing may allow more expressive curves for highly non‑linear maneuvers. Finally, integrating predictive models of moving obstacles could further enhance safety in densely populated environments.

In summary, SanD‑Planner demonstrates that a well‑chosen structured representation, combined with a diffusion‑based generative policy and an explicit geometric safety check, can achieve state‑of‑the‑art local navigation with dramatically fewer expert demonstrations. This work opens a promising avenue for sample‑efficient, robust robot navigation in real‑world settings.
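The candidate-selection step of the geometric critic described above can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: spherical obstacles stand in for a voxel‑grid ESDF, time‑to‑completion is estimated as path length plus remaining straight‑line distance to the goal divided by a nominal speed, and candidates below a minimum clearance are rejected outright. The names `esdf`, `score`, `select`, and `arc`, together with the weights and thresholds, are hypothetical.

```python
import math


def esdf(p, obstacles):
    """Stand-in ESDF: signed distance from p to the nearest spherical obstacle.
    A real system would query a distance grid; spheres are an assumption."""
    return min(math.dist(p, center) - radius for center, radius in obstacles)


def score(traj, goal, obstacles, w1=1.0, w2=0.5, speed=1.0, min_clear=0.1):
    """Cost J = w1 * (1 / clearance) + w2 * time for one sampled trajectory."""
    clearance = min(esdf(p, obstacles) for p in traj)
    if clearance < min_clear:
        return math.inf  # assumed hard safety check: reject unsafe candidates
    length = sum(math.dist(a, b) for a, b in zip(traj, traj[1:]))
    eta = (length + math.dist(traj[-1], goal)) / speed  # assumed time estimate
    return w1 / clearance + w2 * eta


def select(candidates, goal, obstacles, **kwargs):
    """Return the index of the lowest-cost candidate and all costs."""
    costs = [score(t, goal, obstacles, **kwargs) for t in candidates]
    best = min(range(len(costs)), key=costs.__getitem__)
    return best, costs


def arc(offset, n=21):
    """Hypothetical candidate: a path from x=0 to x=4 bulging sideways by `offset`."""
    return [(4.0 * k / (n - 1), offset * math.sin(math.pi * k / (n - 1)), 0.0)
            for k in range(n)]


obstacles = [((2.0, 0.0, 0.0), 0.5)]  # one sphere blocking the straight line
goal = (4.0, 0.0, 0.0)
candidates = [arc(0.0), arc(0.7), arc(1.5)]
best, costs = select(candidates, goal, obstacles)
print(best, costs)
```

In this toy scene the straight candidate passes through the obstacle and receives infinite cost, so the choice falls to one of the two detours according to the trade‑off between clearance and time set by `w1` and `w2`. If every candidate is rejected, `select` still returns index 0 with infinite cost, so a caller would need to check the cost before executing.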

