Adapting Reinforcement Learning for Path Planning in Constrained Parking Scenarios

Notice: This research summary and analysis were automatically generated using AI. For full accuracy, please refer to the original arXiv source.

Real-time path planning in constrained environments remains a fundamental challenge for autonomous systems. Classical planners, while effective under perfect-perception assumptions, are often sensitive to real-world perception constraints and rely on online search procedures that incur high computational costs. In complex surroundings, this renders real-time deployment prohibitive. To overcome these limitations, we introduce a Deep Reinforcement Learning (DRL) framework for real-time path planning in parking scenarios. In particular, we focus on challenging scenes with tight spaces that require a high number of reversal maneuvers and adjustments. Unlike classical planners, our solution does not require ideal, structured perception and, in principle, could avoid the need for additional modules such as localization and tracking, resulting in a simpler and more practical implementation. Moreover, at test time the policy generates actions through a single forward pass at each step, which is lightweight enough for real-time deployment. The task is formulated as a sequential decision-making problem grounded in bicycle-model dynamics, enabling the agent to directly learn navigation policies that respect vehicle kinematics and environmental constraints in a closed-loop setting. A new benchmark is developed to support both training and evaluation, capturing diverse and challenging scenarios. Our approach achieves state-of-the-art success rates and efficiency, surpassing classical planner baselines by +96% in success rate and +52% in efficiency. Furthermore, we release our benchmark as an open-source resource to foster future research in autonomous systems. The benchmark and accompanying tools are available at https://github.com/dqm5rtfg9b-collab/Constrained_Parking_Scenarios.


💡 Research Summary

The paper addresses the problem of real‑time path planning for autonomous vehicles in highly constrained parking environments, where traditional planners such as Hybrid A* struggle due to imperfect perception, high online search costs, and the need for additional modules (localization, tracking). The authors propose a deep reinforcement learning (DRL) framework that treats parking as a sequential decision‑making task grounded in a kinematic bicycle model, ensuring that the learned policy respects non‑holonomic vehicle constraints.

Problem formulation – The vehicle state is defined as (x, y, θ, δ): rear-axle position (x, y), heading θ, and steering angle δ. The action space is discrete, combining a longitudinal displacement (forward, backward, none) with a steering-angle change (left, right, none). At each step the bicycle model updates the state, guaranteeing physically feasible motions.
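The transition described above can be sketched as a single discrete update of the kinematic bicycle model. The wheelbase, step sizes, and steering limit below are illustrative assumptions, not values from the paper:

```python
import math

def bicycle_step(state, action, wheelbase=2.8, ds=0.5,
                 ddelta=math.radians(5), delta_max=math.radians(35)):
    """One discrete transition of the rear-axle kinematic bicycle model.

    state  = (x, y, theta, delta): rear-axle position, heading, steering angle
    action = (move, steer): move in {-1, 0, +1} (backward / none / forward),
             steer in {-1, 0, +1} (right / none / left)
    """
    x, y, theta, delta = state
    move, steer = action
    # Update the steering angle first, clamped to the mechanical limit.
    delta = max(-delta_max, min(delta_max, delta + steer * ddelta))
    d = move * ds  # signed longitudinal displacement along the arc
    # Rear-axle kinematics: heading changes with curvature tan(delta) / L.
    theta += d * math.tan(delta) / wheelbase
    x += d * math.cos(theta)
    y += d * math.sin(theta)
    return (x, y, theta, delta)
```

Because the new state is produced by this model rather than predicted freely by the network, every transition the agent experiences is kinematically feasible by construction.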

Input representation – To keep the comparison fair with Hybrid A*, the RL agent receives the same information: the ego pose, the target pose, and sparse obstacle contours (e.g., lidar point clouds). Because these contours are sparse, the authors design a lightweight cross‑attention feature extractor that first transforms all points into an ego‑centric frame, normalizes them, and then lets the ego query obstacle features. This design mimics the perception limits of a real system while remaining computationally cheap.
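A minimal sketch of this pipeline is given below: transform contour points into the ego frame, normalize them, and let a single ego query attend over the obstacle features. The embedding dimension, random projections, and point count are placeholders for the paper's learned network:

```python
import numpy as np

def to_ego_frame(points, ego_pose):
    """Transform world-frame obstacle points into the ego (rear-axle) frame."""
    x, y, theta = ego_pose
    c, s = np.cos(theta), np.sin(theta)
    R = np.array([[c, s], [-s, c]])              # world -> ego rotation
    return (points - np.array([x, y])) @ R.T

def cross_attention(query, keys, values):
    """Single-head scaled dot-product attention: one ego query over obstacles."""
    scores = keys @ query / np.sqrt(query.shape[-1])
    w = np.exp(scores - scores.max())
    w /= w.sum()
    return w @ values                             # weighted obstacle summary

rng = np.random.default_rng(0)
contour = rng.uniform(-10, 10, size=(32, 2))      # sparse obstacle contour points
local = to_ego_frame(contour, ego_pose=(1.0, 2.0, 0.3))
local /= np.abs(local).max()                      # normalize into [-1, 1]
# In the paper a learned network produces these embeddings; random
# projections stand in for them here.
W_k, W_v = rng.normal(size=(2, 16)), rng.normal(size=(2, 16))
ego_query = rng.normal(size=16)
feat = cross_attention(ego_query, local @ W_k, local @ W_v)
print(feat.shape)   # (16,)
```

The fixed-size output vector is what makes sparse, variable-length contours digestible by a small policy network.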

Training challenges and solutions – Randomly sampling initial poses can produce infeasible configurations (e.g., the car intersecting a wall) because collision checks are impossible with only contours. The authors introduce a rollout‑based initialization that moves the vehicle away from the goal using the bicycle dynamics and adds heading perturbations, guaranteeing feasible start states and stabilizing training.
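The idea can be illustrated with the short sketch below, which inlines a minimal bicycle-model step; the step count and heading-perturbation range are assumptions for illustration:

```python
import math
import random

def step(state, move, steer, L=2.8, ds=0.5, dd=math.radians(5)):
    """Minimal kinematic bicycle transition (rear-axle model)."""
    x, y, th, de = state
    de = max(-math.radians(35), min(math.radians(35), de + steer * dd))
    d = move * ds
    th += d * math.tan(de) / L
    return (x + d * math.cos(th), y + d * math.sin(th), th, de)

def rollout_init(goal_pose, steps=20, seed=None):
    """Roll the vehicle away from the goal with random feasible actions,
    then perturb its heading. Because every intermediate state is produced
    by the dynamics, the resulting start state is reachable by construction."""
    rng = random.Random(seed)
    state = (*goal_pose, 0.0)                    # start at the goal, wheels straight
    for _ in range(steps):
        state = step(state, rng.choice([-1, 1]), rng.choice([-1, 0, 1]))
    x, y, th, de = state
    return (x, y, th + rng.uniform(-0.3, 0.3), de)
```

Sampling start states this way sidesteps the impossibility of collision-checking against sparse contours alone.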

Action‑chunking – Instead of selecting a single primitive at every time step, the policy outputs a “chunk” that is executed for several consecutive steps. This dramatically reduces the number of forward passes, improves exploration efficiency, and still allows fine‑grained control when needed.
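Execution of a chunk can be sketched as follows, again with a minimal inlined bicycle step; the hand-written chunk stands in for what a single policy forward pass would emit:

```python
import math

def step(state, move, steer, L=2.8, ds=0.5, dd=math.radians(5)):
    """Minimal kinematic bicycle transition (rear-axle model)."""
    x, y, th, de = state
    de = max(-math.radians(35), min(math.radians(35), de + steer * dd))
    d = move * ds
    th += d * math.tan(de) / L
    return (x + d * math.cos(th), y + d * math.sin(th), th, de)

def execute_chunk(state, chunk):
    """Apply a chunk of primitive actions open-loop: the policy is queried
    once, then every primitive in the chunk is executed in turn."""
    traj = [state]
    for move, steer in chunk:
        state = step(state, move, steer)
        traj.append(state)
    return traj

# One forward pass would emit e.g. "forward-left x 4"; hand-written here.
chunk = [(1, 1)] * 4
traj = execute_chunk((0.0, 0.0, 0.0, 0.0), chunk)
```

With a chunk length of k, the number of network inferences per episode drops by roughly a factor of k, which is where both the runtime and exploration gains come from.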

Benchmark – ParkBench – Recognizing the lack of realistic parking datasets, the authors construct ParkBench, comprising 51 scenarios extracted from real‑world data. Scenarios cover rear‑in parking in narrow aisles, occluded corner spots, and other tight maneuvers. The benchmark is wrapped in an OpenAI‑Gym‑compatible simulator, enabling standard RL pipelines.
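The interface ParkBench exposes can be sketched as a minimal Gym-style environment. The observation layout, reward shaping, and dynamics constants below are placeholders, not ParkBench's actual specification:

```python
import math

class ParkingEnv:
    """Minimal Gym-style reset/step interface sketch for a parking scenario."""

    def __init__(self, goal=(0.0, 0.0, 0.0)):
        self.goal = goal            # target pose (x, y, theta)
        self.state = None           # (x, y, theta, delta)

    def reset(self):
        self.state = (5.0, 3.0, math.pi, 0.0)   # fixed start for the sketch
        return self._obs()

    def step(self, action):
        move, steer = action
        x, y, th, de = self.state
        # Kinematic bicycle update with illustrative constants.
        de = max(-0.6, min(0.6, de + steer * 0.087))
        d = move * 0.5
        th += d * math.tan(de) / 2.8
        self.state = (x + d * math.cos(th), y + d * math.sin(th), th, de)
        dist = math.hypot(self.state[0] - self.goal[0],
                          self.state[1] - self.goal[1])
        done = dist < 0.2
        reward = -dist                           # dense shaping placeholder
        return self._obs(), reward, done, {}

    def _obs(self):
        # Ego pose + steering concatenated with the target pose.
        return self.state + self.goal
```

Conforming to the reset/step contract is what lets off-the-shelf RL libraries train on the benchmark without custom glue code.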

Experiments – The DRL planner is trained on ParkBench with curriculum learning and evaluated against a strong Hybrid A* baseline (public implementation). Results show a +96% increase in success rate (i.e., reaching the target without collision) and a +52% improvement in efficiency, measured by path length and time. The policy runs in a single forward pass per chunk, meeting real-time constraints on typical onboard hardware.

Strengths

  1. Physical realism – By embedding the bicycle model directly into the transition dynamics, the policy never proposes infeasible motions.
  2. Lightweight perception handling – Sparse contour input plus cross‑attention yields a compact network suitable for embedded deployment.
  3. Training stability – The rollout initialization eliminates impossible start states, accelerating convergence.
  4. Open resources – Both the benchmark and code are released, fostering reproducibility and future research.

Limitations

  • Evaluation is purely simulation‑based; real‑world robustness to sensor noise, dynamic obstacles, and model mismatch remains untested.
  • The discrete action set may limit ultra‑fine steering adjustments required for some high‑precision parking tasks.
  • Action‑chunk length introduces a trade‑off between exploration speed and control granularity; optimal settings may be scenario‑dependent.

Conclusion – The work demonstrates that a carefully designed DRL planner can replace a classical planner in constrained parking scenarios, achieving superior success rates and efficiency while operating within real‑time budgets. The introduction of ParkBench fills a critical gap in the community, and the presented techniques (cross‑attention feature extraction, action‑chunking, rollout initialization) provide a solid foundation for future extensions such as handling dynamic agents, continuous action spaces, or hybrid planner‑RL systems.

