Editrail: Understanding AI Usage by Visualizing Student-AI Interaction in Code

Notice: This research summary and analysis were automatically generated using AI. For full accuracy, please refer to the [Original Paper Viewer] below or the original arXiv source.

Programming instructors hold diverse philosophies about integrating generative AI into their classes. Some encourage students to use AI, while others restrict or forbid it. Regardless of their approach, all instructors benefit from understanding how their students actually use AI while writing code. Such insight helps instructors assess whether AI use aligns with their pedagogical goals, enables timely intervention when unproductive usage patterns appear, and informs effective policies for AI use. However, our survey of programming instructors found that many lack visibility into how students use AI in their code-writing processes. To address this challenge, we introduce Editrail, an interactive system that enables instructors to track students' AI usage, create personalized assessments, and provide timely interventions, all within the workflow of monitoring coding histories. We found that Editrail enables instructors to accurately detect AI use that conflicts with pedagogical goals and to determine when, and for which students, intervention is needed.


💡 Research Summary

The paper addresses a pressing problem in computer science education: instructors lack direct visibility into how students use generative AI while writing code. While some educators encourage AI use and others prohibit it, all need concrete evidence of student‑AI interaction to align practice with pedagogical goals, intervene in a timely manner, and craft effective policies. To fill this gap, the authors introduce Editrail, an interactive visualization system that embeds AI usage data into the normal workflow of monitoring coding histories.

System Design
Editrail captures every edit event from a version‑controlled coding environment and tags those edits that originate from AI suggestions (e.g., Copilot, ChatGPT). In the UI, AI‑generated contributions appear as colored “trails” overlaid on a timeline of the student’s code evolution. The interface consists of three coordinated panels: (1) a high‑level timeline showing the sequence of edits with AI trails highlighted, (2) a detailed view of the selected edit displaying the student’s code alongside the AI‑generated snippet and any modifications the student made, and (3) a compact chat log summary that can be expanded to reveal the full dialogue. This design lets instructors quickly assess whether a student is merely copying AI output, iteratively refining it, or using it as a learning scaffold.
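To make the capture-and-tag pipeline concrete, here is a minimal sketch of how AI-tagged edit events could be grouped into timeline "trails". The `EditEvent` record, the `source` labels, and the `ai_trails` function are illustrative assumptions, not the paper's actual data model or implementation:

```python
from dataclasses import dataclass
from itertools import groupby

@dataclass
class EditEvent:
    timestamp: float  # seconds since the coding session started
    source: str       # hypothetical provenance label: "student" or "ai"
    text: str         # the inserted or modified code

def ai_trails(events):
    """Group consecutive AI-originated edits into (start, end, count) trails,
    the kind of span a timeline view could highlight as an overlay."""
    ordered = sorted(events, key=lambda e: e.timestamp)
    trails = []
    for source, run in groupby(ordered, key=lambda e: e.source):
        run = list(run)
        if source == "ai":
            trails.append((run[0].timestamp, run[-1].timestamp, len(run)))
    return trails
```

For example, a session where the student writes, accepts two AI suggestions, edits by hand, then accepts one more would yield two trails, letting the interface render two highlighted spans on the timeline rather than hundreds of individual event markers.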

Empirical Evaluation
The authors collected data from 20 students solving two Python programming problems, yielding 188–319 lines of code, 2,410–2,610 edit events, and 42 AI chat sessions. A within‑subject study with 12 university instructors compared Editrail against a baseline tool that only displayed raw code and chat logs. Instructors using Editrail identified AI‑dependent work more accurately, made faster judgments about policy violations, and were better able to target interventions (e.g., inserting a quiz on a concept the student over‑relied on). Qualitative feedback highlighted Editrail’s usefulness for policy enforcement, personalized assessment creation, and real‑time support.

Contributions

  1. A large‑scale survey of 27 lead programming instructors across 15 universities, revealing a universal “visibility gap” and informing design goals.
  2. A publicly released dataset of student‑AI interaction logs for future research.
  3. The Editrail prototype, which visualizes AI contributions as trails and supports in‑situ instructor interventions.
  4. Evidence from a controlled study that Editrail bridges the visibility gap, enabling more precise alignment of AI use with instructional objectives.

Limitations and Future Work
The study is limited to Python and a modest sample size, so generalization to other languages or massive MOOCs remains to be tested. The system currently relies on a single AI model and simple tagging; richer semantic analysis of prompts and model outputs could improve automatic classification of “productive” vs. “unproductive” AI use. Future directions include scaling to larger classes, supporting multiple AI services, and integrating automated meta‑cognitive feedback that encourages students to reflect on their AI‑assisted problem‑solving process.

Overall, Editrail demonstrates that making AI contributions visible within the natural coding workflow empowers instructors to monitor, assess, and guide student learning more effectively in an era where generative AI is becoming an integral part of programming education.

