MetaCLASS: Metacognitive Coaching for Learning with Adaptive Self-regulation Support


Large language models can generate fluent explanations, but effective tutoring requires supporting the learner's thought process, not just delivering content. Metacognitive tutoring targets this gap by prompting planning, monitoring, debugging, and evaluation, and, crucially, by deciding when to be active versus minimally present based on learner signals and trajectory. We introduce MetaCLASS, a learning-science-grounded framework that formulates metacognitive tutoring as move selection over 11 interpretable actions aligned to self-regulated learning processes. MetaCLASS uses a two-phase framework that first plans a pedagogical trajectory conditioned on learner profiles (calibration, help-seeking) and then generates natural dialogue consistent with that plan. This yields a dataset of 1,015 conversations (7,711 turns) annotated with turn-level metacognitive labels and validated for pedagogical contingency and trajectory adherence. We benchmark nine LLMs on predicting the next coach move given the problem and dialogue context. The best model achieves only 43.2% accuracy, and models exhibit a compulsive intervention bias: in turns where effective metacognitive tutoring requires silence (41.7% of cases), models predict "no intervention" only 4.2% of the time, while severely over-predicting high-intervention moves. These results show that traditional content-based tutoring ability does not translate to metacognitive tutoring competence, positioning MetaCLASS as a testbed for developing intelligent tutors that promote self-regulated learning.


💡 Research Summary

The paper addresses a critical gap in current large language model (LLM) tutoring systems: while they excel at generating fluent explanations, they rarely support the learner’s metacognitive processes that are essential for self‑regulated learning (SRL). To fill this gap, the authors introduce MetaCLASS, a learning‑science‑grounded framework that treats metacognitive tutoring as a decision‑making problem over a compact, interpretable action space.

MetaCLASS defines eleven coach moves aligned with the four components of the Metacognitive Awareness Inventory (MAI): Planning (e.g., eliciting goals and strategies), Monitoring (checking progress, probing uncertainty, surfacing inconsistencies), Debugging (suggesting alternative approaches, prompting resource use), Evaluation (reflecting on outcomes), plus two dialogue‑flow moves (neutral continuation prompts) and a strategic “NO_INTERVENTION” action that formalizes productive silence. By making “no intervention” a first‑class action, the framework captures the pedagogical value of restraint, a principle highlighted in human tutoring research.
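The compact action space described above can be sketched as an enumeration. This is a minimal illustration: only "NO_INTERVENTION" and the four MAI categories are named in the summary, so the individual move identifiers here are hypothetical placeholders, not the paper's actual labels.

```python
from enum import Enum

class CoachMove(Enum):
    """Illustrative 11-move action space; member names are guesses
    derived from the MAI-aligned categories in the summary."""
    # Planning
    ELICIT_GOAL = "elicit_goal"
    ELICIT_STRATEGY = "elicit_strategy"
    # Monitoring
    CHECK_PROGRESS = "check_progress"
    PROBE_UNCERTAINTY = "probe_uncertainty"
    SURFACE_INCONSISTENCY = "surface_inconsistency"
    # Debugging
    SUGGEST_ALTERNATIVE = "suggest_alternative"
    PROMPT_RESOURCE_USE = "prompt_resource_use"
    # Evaluation
    REFLECT_ON_OUTCOME = "reflect_on_outcome"
    # Dialogue flow
    CONTINUE_NEUTRAL = "continue_neutral"
    ACKNOWLEDGE = "acknowledge"
    # Productive silence as a first-class action
    NO_INTERVENTION = "no_intervention"
```

Making silence an explicit enum member (rather than the absence of output) is what lets a policy, and a benchmark, treat restraint as a predictable, scorable decision.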

The system operates in two phases. In Phase 1, a pedagogical planner constructs a full trajectory for each tutoring episode. This trajectory is conditioned on (1) learner profiles—calibration (over‑, under‑, well‑calibrated) and help‑seeking style (avoidant, executive, dependent)—and (2) problem analysis that identifies knowledge, strategy, monitoring, and execution gaps. The trajectory consists of a sequence of events, observable learner signals, the appropriate coach move, and the intended effect on the learner’s cognition. In Phase 2, an LLM generates the natural‑language dialogue, strictly following the pre‑planned trajectory while allowing flexible phrasing.
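The Phase 1 output described above can be pictured as a simple data structure that the Phase 2 generator then consumes. The field names and string values below are assumptions for illustration; the summary specifies only the conceptual contents (profile, gap analysis, and per-event signal/move/effect triples).

```python
from dataclasses import dataclass
from typing import List

@dataclass
class LearnerProfile:
    calibration: str   # "over", "under", or "well" calibrated
    help_seeking: str  # "avoidant", "executive", or "dependent"

@dataclass
class TrajectoryEvent:
    learner_signal: str   # observable behavior, e.g. "expresses overconfidence"
    coach_move: str       # one of the 11 moves, e.g. "probe_uncertainty"
    intended_effect: str  # targeted change in the learner's cognition

@dataclass
class PedagogicalTrajectory:
    """Phase 1 plan: conditioned on the profile and identified gaps;
    Phase 2 verbalizes the events in order with flexible phrasing."""
    profile: LearnerProfile
    gaps: List[str]  # knowledge / strategy / monitoring / execution
    events: List[TrajectoryEvent]
```

Separating the plan (structured, auditable) from the surface dialogue (free-form) is what makes the resulting dataset checkable for trajectory adherence.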

To evaluate the framework, the authors automatically generated 1,015 tutoring conversations (7,711 turns) across three math problem sets (GSM8K, MATH, AIME). Human annotators labeled each turn with the corresponding coach move and learner profile. Validation showed high pedagogical quality: the dialogues were contingent on learner signals and adhered to the planned trajectory (average consistency score ≈ 0.84).

The core benchmark, "Coach Move Prediction," asks models to predict the optimal next coach move given the problem statement and dialogue context. Nine state-of-the-art LLMs (including GPT-4, Claude, and Llama variants) were tested, and the best achieved only 43.2% accuracy. Notably, although "NO_INTERVENTION" was the correct action in 41.7% of turns, models predicted it on only 4.2% of those turns, revealing a strong "compulsive intervention bias": models over-produce high-intervention moves and under-predict strategic silence.
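The two numbers in this evaluation, overall accuracy and how rarely models choose silence when silence is correct, can be computed from parallel gold/predicted label lists. A minimal sketch, assuming a simple string-label format (the paper's actual evaluation harness may differ):

```python
def intervention_bias(gold, pred, silent="NO_INTERVENTION"):
    """Return (overall accuracy, recall on turns whose gold move is silence).

    gold, pred: parallel lists of coach-move labels (illustrative format).
    A low second value with many silent gold turns indicates the
    'compulsive intervention bias' described above."""
    accuracy = sum(g == p for g, p in zip(gold, pred)) / len(gold)
    preds_on_silent = [p for g, p in zip(gold, pred) if g == silent]
    silent_recall = (sum(p == silent for p in preds_on_silent) / len(preds_on_silent)
                     if preds_on_silent else 0.0)
    return accuracy, silent_recall

# Toy example: the model intervenes on one of the two silent turns.
gold = ["NO_INTERVENTION", "PROBE_UNCERTAINTY", "NO_INTERVENTION", "CHECK_PROGRESS"]
pred = ["PROBE_UNCERTAINTY", "PROBE_UNCERTAINTY", "NO_INTERVENTION", "PROBE_UNCERTAINTY"]
acc, silent_recall = intervention_bias(gold, pred)  # → (0.5, 0.5)
```

On the benchmark's scale, the reported figures correspond to an accuracy of 0.432 and a silent-turn recall of 0.042 despite silence being correct in 41.7% of turns.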

These findings demonstrate that proficiency in content generation does not translate to metacognitive tutoring competence. MetaCLASS therefore provides a novel testbed for developing and evaluating intelligent tutors that can genuinely scaffold self‑regulated learning, not just answer questions. The paper suggests future work on human‑in‑the‑loop studies, reinforcement‑learning‑based policy optimization, and extending the action space to other domains.

