Large Language Models (LLMs) often rely on long chain-of-thought (CoT) reasoning to solve complex tasks. While effective, these trajectories are frequently inefficient, leading to high latency from excessive token generation, or to unstable reasoning that alternates between underthinking (shallow, inconsistent steps) and overthinking (repetitive, verbose reasoning). In this work, we study the structure of reasoning trajectories and uncover specialized attention heads that correlate with distinct cognitive behaviors such as verification and backtracking. By lightly intervening on these heads at inference time, we can steer the model away from inefficient modes. Building on this insight, we propose CREST, a training-free method for Cognitive REasoning Steering at Test-time. CREST has two components: (1) an offline calibration step that identifies cognitive heads and derives head-specific steering vectors, and (2) an inference-time procedure that rotates hidden representations to suppress components along those vectors. CREST adaptively suppresses unproductive reasoning behaviors, yielding both higher accuracy and lower computational cost. Across diverse reasoning benchmarks and models, CREST improves accuracy by up to 17.5% while reducing token usage by 37.6%, offering a simple and effective pathway to faster, more reliable LLM reasoning. Code is available at https://github.com/togethercomputer/CREST.
Recent advances in Reinforcement Learning (RL)-based training (Shao et al., 2024) have substantially improved the reasoning capabilities of large language models (LLMs), enabling the emergence of "aha" moments and allowing them to excel in complex tasks such as coding (Jiang et al., 2024) and planning (Huang et al., 2024; Valmeekam et al., 2023). This capability is largely enabled by extended Chain-of-Thought (CoT) reasoning processes. While effective, the reasoning trajectories generated by LLMs are often suboptimal. From an efficiency perspective, long CoT processes consume significantly more tokens than standard responses, leading to increased latency that is especially problematic for on-device applications. In terms of performance, recent studies have shown that LLMs often struggle with overthinking (Chen et al., 2024), generating unnecessarily verbose explanations for simple problems, and with underthinking (Wang et al., 2025), halting reasoning prematurely before fully exploring complex solutions. Surprisingly, some work even suggests that effective reasoning can emerge without any explicit thinking process (Ma et al., 2025a).
To guide and enhance the reasoning process, prior work has primarily focused on directly controlling response length (Muennighoff et al., 2025; Luo et al., 2025a; Ma et al., 2025b; Sun et al., 2025; Yang et al., 2025c). However, there has been limited exploration of the internal cognitive mechanisms that underlie and drive these reasoning behaviors. Drawing inspiration from cognitive psychology, where deliberate processes such as planning, verification, and backtracking (often associated with System 2 thinking) are known to enhance human problem-solving, we posit that analogous cognitive behaviors can be identified and, importantly, steered within LLMs. In particular, we hypothesize that certain model components, such as attention heads, specialize in tracking and modulating these distinct reasoning patterns.
In this work, we categorize reasoning processes into two types: linear reasoning (i.e., step-by-step problem solving) and non-linear reasoning (e.g., backtracking, verification, and other divergent behaviors (Gandhi et al., 2025)). To understand how these behaviors are represented in the activation space, we label individual reasoning steps accordingly and train a simple linear classifier to distinguish between them based on hidden activations. Using linear probes, we identify a small subset of attention heads, referred to as cognitive heads, whose activations are highly predictive of reasoning type. Moreover, steering these heads effectively alters the model’s cognitive trajectory without additional training.
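To make the probing procedure concrete, the following is a minimal sketch (not the authors' released code) of how one could rank attention heads by how well their activations separate linear from non-linear reasoning steps. It assumes a hypothetical cache acts of per-step, per-head activations and binary step labels; the helper name rank_cognitive_heads and the use of scikit-learn logistic-regression probes are illustrative choices.

import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

def rank_cognitive_heads(acts: np.ndarray, labels: np.ndarray, top_k: int = 16):
    """Rank attention heads by linear-probe accuracy on reasoning-type labels.

    acts:   [num_steps, num_layers, num_heads, head_dim] cached per-head outputs.
    labels: [num_steps], 0 = linear step, 1 = non-linear step (verification, backtracking, ...).
    """
    num_steps, num_layers, num_heads, _ = acts.shape
    scores = np.zeros((num_layers, num_heads))
    for layer in range(num_layers):
        for head in range(num_heads):
            X = acts[:, layer, head, :]  # features: this head's activation for each step
            probe = LogisticRegression(max_iter=1000)
            # Cross-validated probe accuracy measures how predictive this head is of reasoning type.
            scores[layer, head] = cross_val_score(probe, X, labels, cv=5).mean()
    flat = np.argsort(scores.ravel())[::-1][:top_k]
    return [(int(i) // num_heads, int(i) % num_heads, float(scores.ravel()[i])) for i in flat]

Heads surfaced this way would be the candidates for the steering interventions described next.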
Based on these findings, we introduce CREST (Cognitive REasoning Steering at Test-time), a training-free framework for dynamically adjusting reasoning behaviors during inference. CREST first performs a simple offline calibration to identify cognitive heads and compute steering vectors from representative reasoning examples. At test time, it applies activation interventions based on these vectors to adaptively guide the model’s reasoning trajectory, suppressing inefficient cognitive modes and encouraging effective reasoning behavior. Importantly, CREST is compatible with a wide range of pre-trained LLMs and requires no task-specific retraining or gradient updates, making it highly scalable and practical for real-world applications. The test-time steering itself incurs negligible overhead: it matches baseline throughput while reducing token consumption, yielding an overall end-to-end efficiency gain.
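As a rough illustration of such an inference-time intervention, the sketch below removes the component of a cognitive head's output that lies along its calibrated steering vector via a forward hook. This is a hedged approximation under assumed PyTorch conventions (attention output laid out as [batch, seq, num_heads * head_dim]), not the exact CREST implementation; names such as suppress_along, make_hook, and steering_vec are hypothetical.

import torch

def suppress_along(h: torch.Tensor, v: torch.Tensor, alpha: float = 1.0) -> torch.Tensor:
    # Remove a fraction alpha of the component of h along the (normalized) steering vector v.
    v = v / v.norm()
    return h - alpha * (h @ v).unsqueeze(-1) * v

def make_hook(head_idx: int, head_dim: int, v: torch.Tensor, alpha: float = 1.0):
    # Forward hook for an attention module whose first output is [batch, seq, num_heads * head_dim].
    def hook(module, inputs, output):
        out = output[0] if isinstance(output, tuple) else output
        start, end = head_idx * head_dim, (head_idx + 1) * head_dim
        out[..., start:end] = suppress_along(out[..., start:end], v.to(out.dtype), alpha)
        return (out, *output[1:]) if isinstance(output, tuple) else out
    return hook

# Hypothetical usage with a layer/head index and steering vector obtained from offline calibration:
# handle = model.model.layers[12].self_attn.register_forward_hook(
#     make_hook(head_idx=5, head_dim=128, v=steering_vec, alpha=1.0))
# ... run generation ...
# handle.remove()

Because the hook only edits a small slice of one layer's attention output, its per-token cost is negligible relative to the forward pass, which is consistent with the throughput claim above.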
In summary, our key contributions are as follows: (i) Cognitive Head Discovery: We provide empirical evidence for the existence of cognitive attention heads that correlate with specific reasoning behaviors, offering new interpretability into how cognitive patterns are represented within a model’s hidden states. (ii) Test-Time Behavioral Steering: We propose a plug-and-play activation intervention technique that enables test-time steering of reasoning behaviors without additional training. (iii) Comprehensive Evaluation: We validate our method across a diverse set of reasoning benchmarks, including MATH500, AMC23, AIME, LiveCodeBench, GPQA-D, and Calendar Planning, demonstrating that CREST not only enhances reasoning accuracy (up to 17.50%, R1-1.5B on AMC23) but also substantially reduces token usage (up to 37.60%, R1-1.5B on AMC23).
We organize prior research into three categories and defer additional related work to Appendix A.
Reasoning Models. Early chain-of-thought (CoT) prompting (Wei et al., 2022) and self-consistency decoding (Wang et al., 2022) demonstrated that sampling diverse reasoning paths and selecting the majority answer improves accuracy. Structured search frameworks extend this idea: Tree-of-Thought (Yao, 2023), Graph-of-Thought (Besta et al., 2024), and Forest-of-Thought (Bi et al., 2024).
Recent “thinking” model releases like Open