Distilling LLM Reasoning into Graph of Concept Predictors

Notice: This research summary and analysis were automatically generated using AI. For authoritative details, please refer to the original arXiv source.

Deploying Large Language Models (LLMs) for discriminative workloads is often limited by inference latency, compute, and API costs at scale. Active distillation reduces these costs by querying an LLM oracle to train compact discriminative students, but most pipelines distill only final labels, discarding intermediate reasoning signals and offering limited diagnostics of what reasoning is missing and where errors arise. We propose Graph of Concept Predictors (GCP), a reasoning-aware active distillation framework that externalizes the teacher’s decision process as a directed acyclic graph and mirrors it with modular concept predictors in the student. GCP enhances sample efficiency through a graph-aware acquisition strategy that targets uncertainty and disagreement at critical reasoning nodes. Additionally, it improves training stability and efficiency by performing targeted sub-module retraining, which attributes downstream loss to specific concept predictors and updates only the most influential modules. Experiments on eight NLP classification benchmarks demonstrate that GCP enhances performance under limited annotation budgets while yielding more interpretable and controllable training dynamics. Code is available at: https://github.com/Ziyang-Yu/GCP.


💡 Research Summary

The paper addresses the practical bottlenecks of deploying large language models (LLMs) for high‑throughput discriminative tasks—namely inference latency, compute cost, and API fees—by proposing a reasoning‑aware active distillation framework called Graph of Concept Predictors (GCP). Traditional distillation pipelines treat the LLM as a black‑box oracle that only provides final labels, discarding the rich intermediate reasoning traces (e.g., chain‑of‑thought, tree‑of‑thought) that are known to boost LLM performance on compositional problems. This loss of structure leads to sparse supervision, poor interpretability, and limited diagnostic capability for the student model.

GCP tackles these issues by first externalizing the teacher’s reasoning process as a directed acyclic graph (DAG) of “concept” nodes. Each node represents an explicit intermediate reasoning state, and edges encode semantic or causal dependencies. The student model mirrors this graph with modular sub‑networks: a root encoder produces an embedding for the input text, and each non‑root node computes its embedding by applying a node‑specific transition function to the concatenated embeddings of its parents. The final node’s output yields the task prediction. This design generalizes concept bottleneck models (CBMs) by supporting multi‑path, hierarchical reasoning, and by learning the transition functions jointly with the concepts.
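The student architecture described above can be sketched as follows. This is a minimal illustrative reconstruction, not the paper's exact implementation: the node names, dimensions, and the affine-plus-tanh transition functions are assumptions made for the example.

```python
import numpy as np

class ConceptNode:
    """One concept predictor in the student DAG (illustrative sketch)."""
    def __init__(self, parents, in_dim, out_dim, rng):
        self.parents = parents               # names of parent nodes
        self.W = rng.normal(0, 0.1, (in_dim, out_dim))
        self.b = np.zeros(out_dim)

    def forward(self, parent_embs):
        # Transition function: concatenate parent embeddings,
        # then apply an affine map and a tanh nonlinearity.
        h = np.concatenate(parent_embs)
        return np.tanh(h @ self.W + self.b)

class GCPStudent:
    def __init__(self, topo_order, nodes):
        self.topo_order = topo_order         # topological order of non-root nodes
        self.nodes = nodes                   # name -> ConceptNode

    def forward(self, root_emb):
        # The root encoder's output seeds the graph; each non-root node
        # consumes its parents' embeddings in topological order.
        embs = {"root": root_emb}
        for name in self.topo_order:
            node = self.nodes[name]
            embs[name] = node.forward([embs[p] for p in node.parents])
        return embs                          # final node's embedding = task logits

rng = np.random.default_rng(0)
d = 8
nodes = {
    "c1": ConceptNode(["root"], d, d, rng),
    "c2": ConceptNode(["root"], d, d, rng),
    "out": ConceptNode(["c1", "c2"], 2 * d, 3, rng),  # hypothetical 3-class task
}
student = GCPStudent(["c1", "c2", "out"], nodes)
embs = student.forward(rng.normal(size=d))
print(embs["out"].shape)  # (3,)
```

Because every concept's embedding is exposed in `embs`, intermediate reasoning states can be inspected or supervised individually, which is what distinguishes this design from a monolithic classifier.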

Active learning in GCP is graph‑aware. Instead of selecting samples solely on final‑output uncertainty, three complementary criteria are evaluated for every unlabeled instance:

  1. Structure‑Weighted Uncertainty (E_unc) – node‑wise entropy weighted by degree centrality, aggregating uncertainty across the whole graph;
  2. Topology‑Aware Gradient Diversity (D_grad) – a distance computed from node‑level loss gradients, encouraging selection of samples that activate diverse reasoning pathways;
  3. Graph‑Aware Representativeness (D_KL) – a core‑set style objective using a topology‑weighted KL divergence between concept embeddings, ensuring coverage of the concept space.
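The first criterion admits a compact sketch. The functional form below (per-node predictive entropy weighted by degree centrality and summed over the graph) follows the description above, but the exact weighting scheme is an assumption for illustration.

```python
import numpy as np

def entropy(p):
    # Shannon entropy of a discrete distribution, clipped for stability.
    p = np.clip(np.asarray(p, dtype=float), 1e-12, 1.0)
    return float(-np.sum(p * np.log(p)))

def structure_weighted_uncertainty(node_probs, centrality):
    # E_unc (illustrative form): node-wise entropy weighted by each
    # node's degree centrality, aggregated across the whole DAG.
    return sum(centrality[n] * entropy(p) for n, p in node_probs.items())

# Toy comparison: a sample with diffuse concept predictions should
# score higher than one where every node is confident.
centrality = {"c1": 0.5, "c2": 0.5, "out": 1.0}
confident = {"c1": [0.97, 0.03], "c2": [0.95, 0.05], "out": [0.98, 0.01, 0.01]}
uncertain = {"c1": [0.50, 0.50], "c2": [0.60, 0.40], "out": [0.34, 0.33, 0.33]}
print(structure_weighted_uncertainty(uncertain, centrality) >
      structure_weighted_uncertainty(confident, centrality))  # True
```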

The three candidate sets are intersected (consensus selection) to obtain a batch that is simultaneously uncertain, diverse, and representative. If the intersection is smaller than the annotation budget, remaining slots are filled from the union using a tie‑breaker such as E_unc.
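The consensus-then-fill procedure can be written down directly; the sketch below assumes each criterion has already produced a candidate set of instance indices, with `e_unc` scores available as the tie-breaker.

```python
def consensus_select(cand_unc, cand_grad, cand_repr, e_unc, budget):
    """Consensus selection (sketch): intersect the three candidate sets;
    if that underfills the budget, top up from the union, ranked by E_unc."""
    batch = sorted(cand_unc & cand_grad & cand_repr,
                   key=e_unc.get, reverse=True)[:budget]
    if len(batch) < budget:
        pool = (cand_unc | cand_grad | cand_repr) - set(batch)
        batch += sorted(pool, key=e_unc.get, reverse=True)[: budget - len(batch)]
    return batch

# Toy example: only instance 0 is flagged by all three criteria,
# so instances 2 and 4 (highest remaining E_unc) fill the budget.
e_unc = {0: 0.9, 1: 0.1, 2: 0.8, 3: 0.4, 4: 0.7}
batch = consensus_select({0, 2, 3}, {0, 2, 4}, {0, 3, 4}, e_unc, budget=3)
print(batch)  # [0, 2, 4]
```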

Retraining is also guided by the graph. After each acquisition round, a counterfactual “rerun” is performed: each concept node’s output is perturbed or masked, and the resulting change in the final loss is measured. Nodes whose perturbation yields the largest loss increase are deemed high‑impact and are selectively fine‑tuned, while the rest of the network remains frozen. This targeted sub‑module updating dramatically reduces computation compared to full‑model retraining and improves training stability because gradients are propagated through a well‑structured, lower‑dimensional subspace.
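The counterfactual attribution step reduces to ranking nodes by how much masking them inflates the final loss. The helper below is a sketch under that assumption; `loss_fn` stands in for a full forward pass with one node's output masked (`None` meaning no mask), and the toy numbers are invented for illustration.

```python
def rank_nodes_by_impact(loss_fn, node_names):
    # Counterfactual "rerun" (sketch): evaluate the final task loss with
    # each concept node masked in turn; the loss increase over the
    # unmasked baseline is that node's impact score.
    base = loss_fn(None)
    impact = {n: loss_fn(n) - base for n in node_names}
    return sorted(impact, key=impact.get, reverse=True)

# Toy losses: masking "c1" hurts the most, so it is the first module
# selected for fine-tuning while the rest stay frozen.
toy_losses = {None: 0.40, "c1": 0.95, "c2": 0.55, "out": 0.70}
order = rank_nodes_by_impact(toy_losses.get, ["c1", "c2", "out"])
print(order)  # ['c1', 'out', 'c2']
```

In practice only the top-ranked modules would be unfrozen each round, which is what keeps the retraining cost well below that of a full-model update.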

Empirical evaluation spans eight NLP classification benchmarks covering sentiment analysis, topic classification, stance detection, and clinical risk prediction. Under strict annotation budgets (as low as 1%–5% of the full training set), GCP consistently outperforms strong baselines that use standard active learning with LLM annotation, as well as recent step‑by‑step distillation methods. Ablation studies show that removing any of the three acquisition components degrades performance, and that full‑model retraining (instead of selective sub‑module updates) leads to slower convergence and higher variance. Visualizations of concept‑level predictions and gradient flows illustrate that GCP provides interpretable traces of the decision process, enabling users to pinpoint which reasoning steps are failing and to intervene (e.g., by adding concept‑level supervision).

In summary, GCP demonstrates that preserving and transferring the internal reasoning structure of LLMs is both feasible and beneficial. By aligning active learning, distillation, and optimization with a graph of concepts, the framework achieves substantial cost savings, higher accuracy under limited supervision, and improved model transparency. The authors suggest future directions such as automated extraction of concept graphs from arbitrary LLM prompts, extensions to multimodal or temporal data, and interactive human‑AI workflows where users can edit or correct specific concepts to steer model behavior.

