Driving Through Uncertainty: Risk-Averse Control with LLM Commonsense for Autonomous Driving under Perception Deficits
Partial perception deficits can compromise autonomous vehicle safety by disrupting environmental understanding. Existing protocols typically default to entirely risk-avoidant actions such as immediate stops, which are detrimental to navigation goals and lack flexibility for rare driving scenarios. Yet, in cases of minor risk, halting the vehicle may be unnecessary, and more adaptive responses are preferable. In this paper, we propose LLM-RCO, a risk-averse framework leveraging large language models (LLMs) to integrate human-like driving commonsense into autonomous systems facing perception deficits. LLM-RCO features four key modules interacting with the dynamic driving environment: hazard inference, short-term motion planner, action condition verifier, and safety constraint generator, enabling proactive and context-aware actions in such challenging conditions. To enhance the driving decision-making of LLMs, we construct DriveLM-Deficit, a dataset of 53,895 video clips featuring perception deficits that obscure safety-critical objects, annotated for LLM fine-tuning in hazard inference and motion planning. Extensive experiments in adverse driving conditions with the CARLA simulator demonstrate that LLM-RCO promotes proactive maneuvers over purely risk-averse actions in perception deficit scenarios, underscoring its value for boosting autonomous driving resilience against perception loss.
💡 Research Summary
This paper addresses a critical challenge in autonomous driving: maintaining safe and efficient navigation when the vehicle’s perception system is partially compromised due to sensor failures or adversarial attacks. Conventional fail-safe protocols often default to overly conservative actions like immediate full stops for any perception uncertainty. While safe, this approach is inefficient, disrupts traffic flow, and is not always necessary for minor risks. The authors argue for a more nuanced, human-like strategy that uses commonsense reasoning to assess potential dangers and respond appropriately.
To this end, the paper proposes LLM-RCO (LLM-Guided Resilient Control Override), a novel risk-averse framework that leverages the commonsense knowledge and reasoning capabilities of multimodal Large Language Models (LLMs). The core idea is to use the LLM not for end-to-end driving control, but as a “safety co-pilot” that overrides the primary autonomous driving stack during perception deficits. The framework operates in a two-phase “planning and acting” cycle to reduce LLM inference latency.
The planning phase is driven by two key LLM-based modules:
- Hazard Inference Module: This module analyzes a short history of multi-view camera frames to infer potential hazards (e.g., pedestrians, vehicles) and their likely motion within the areas affected by the perception deficit.
- Short-term Motion Planner: Based on the hazard inference, current navigation goals, and immediate perception, this module generates a variable-length sequence of “condition-action” pairs for future timesteps. It can choose between two high-level strategies: “Move” (proceed cautiously with planned actions) or “Stop-Observe-Move” (halt temporarily to gather more information before replanning).
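The planner's output described above can be pictured as a small data structure. The following is a minimal Python sketch; the class names, fields, and example strings are illustrative assumptions for exposition, not the paper's actual interface:

```python
from dataclasses import dataclass
from enum import Enum
from typing import List

class Strategy(Enum):
    """The planner's two high-level strategies."""
    MOVE = "move"                             # proceed cautiously with planned actions
    STOP_OBSERVE_MOVE = "stop_observe_move"   # halt, gather more frames, then replan

@dataclass
class ConditionAction:
    """One future timestep: the action executes only if the condition still holds."""
    condition: str  # e.g. "consistent deficit pattern with no immediate hazard"
    action: str     # e.g. "creep forward", "yield", "stop"

@dataclass
class ShortTermPlan:
    strategy: Strategy
    steps: List[ConditionAction]  # variable-length sequence of condition-action pairs

# Hypothetical plan for a scenario where the inferred risk is low
plan = ShortTermPlan(
    strategy=Strategy.MOVE,
    steps=[
        ConditionAction("deficit pattern unchanged, no object within 5 m", "creep forward"),
        ConditionAction("deficit pattern unchanged, no object within 5 m", "resume lane speed"),
    ],
)
```

The key property is that each planned action carries its own validity condition, which is what allows the acting phase to re-check the plan cheaply at every timestep without re-invoking the LLM.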
The acting phase sequentially executes the planned actions, but each execution is gated by a rule-based Action Condition Verifier. This verifier checks in real-time whether the pre-planned condition (e.g., “consistent deficit pattern with no immediate hazard”) still holds true by analyzing deficit consistency across frames and the proximity of any detected objects. If the check fails, the system triggers an immediate replanning. A Safety Constraint Generator module also sets dynamic control limits based on contextual factors like weather and traffic density.
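The verify-then-act loop described above can be sketched as follows. This is a simplified illustration, assuming the deficit is represented as a per-pixel mask and detected objects as distances; the thresholds and function signatures are invented for exposition, not taken from the paper:

```python
def verify_condition(prev_deficit_mask, cur_deficit_mask,
                     object_distances, min_safe_distance=5.0,
                     max_mask_change=0.1):
    """Rule-based check: deficit pattern is consistent across frames
    and no detected object is within the safety margin."""
    # Fraction of mask entries whose deficit status changed between frames
    changed = sum(p != c for p, c in zip(prev_deficit_mask, cur_deficit_mask))
    consistency_ok = changed / max(len(cur_deficit_mask), 1) <= max_mask_change
    proximity_ok = all(d >= min_safe_distance for d in object_distances)
    return consistency_ok and proximity_ok

def act(plan_steps, get_frame_state, execute, replan):
    """Execute planned actions in order; a failed check triggers immediate replanning."""
    prev_mask = None
    for step in plan_steps:
        mask, distances = get_frame_state()
        if prev_mask is not None and not verify_condition(prev_mask, mask, distances):
            return replan()  # pre-planned condition no longer holds
        execute(step)
        prev_mask = mask
```

Because the verifier is rule-based rather than LLM-based, this gate runs at every timestep with negligible latency, reserving LLM calls for the (re)planning phase.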
A significant contribution is the creation of the DriveLM-Deficit dataset, containing 53,895 video clips with simulated perception deficits for safety-critical objects. This dataset was used to fine-tune a compact VL-LLM (Qwen2-VL-2B-Instruct) specifically for the tasks of hazard inference and motion planning under deficits, addressing the lack of expert data for this niche scenario.
Extensive closed-loop evaluations were conducted in the CARLA simulator under adverse conditions. The LLM-RCO framework was tested overriding two different state-of-the-art autonomous driving agents (TransFuser and InterFuser). Results demonstrated that LLM-RCO consistently improved driving performance metrics compared to baseline fail-safe strategies. It enabled more proactive and context-aware maneuvers—such as cautious progression when risk was inferred to be low, or decisive stopping when hazards were likely—leading to smoother and safer navigation through perception loss scenarios. The work underscores the potential of LLM-based commonsense reasoning to enhance the resilience and safety of autonomous vehicles facing partial perception failures.