Lessons Learned from Integrating Generative AI into an Introductory Undergraduate Astronomy Course at Harvard

Notice: This research summary and analysis were automatically generated using AI technology. For absolute accuracy, please refer to the original arXiv source.

We describe our efforts to fully integrate generative artificial intelligence (GAI) into an introductory undergraduate astronomy course. Ordered by student perception of utility, GAI was used in instructional Python notebooks, in a subset of assignments, for student presentation preparations, and as a participant (in conjunction with a RAG-encoded textbook) in a course Slack channel. Assignments were divided into GAI-encouraged and GAI-discouraged. We incentivized student mastery of the material through midterm and final exams in which electronics were not allowed. Student evaluations of the course showed no reduction compared to the non-GAI version from the previous year.


💡 Research Summary

This paper presents a detailed case study on the full integration of generative artificial intelligence (GAI) into “Extragalactic Astronomy and Cosmology,” an introductory undergraduate astronomy course at Harvard University in Fall 2025. With 14 enrolled students, the course served as a testbed for exploring practical pedagogical strategies for GAI adoption.

The instructors were motivated by the need to prepare students for a future where GAI is ubiquitous, to enhance hands-on learning by shortening Python coding and debugging cycles, to provide 24/7 interactive tutoring, and to improve their own teaching productivity. They simultaneously acknowledged significant risks, including the potential for GAI to bypass foundational knowledge acquisition, produce hallucinations, and create cybersecurity concerns.

The pedagogical design centered on a balanced AI policy and a deliberate incentive structure. The syllabus explicitly encouraged GAI use to enhance learning but prohibited it from replacing intellectual struggle. Assignments were strategically divided into “GAI-encouraged” and “GAI-discouraged” categories. Crucially, a significant portion of the final grade (45%) was allocated to proctored midterm and final exams where all electronic devices, including GAI, were forbidden. This structure was designed to incentivize students to achieve genuine mastery of the core material independently.

From the student perspective, students experimented with seven distinct GAI-augmented learning tools, which were ranked by perceived utility. The least successful was unstructured, broad-topic exploration with GAI, which students found no more valuable than Wikipedia. The most successful was GAI assistance for generating and debugging Python code within Colab notebooks. Tools in between, listed in increasing order of perceived value, included GAI-assisted presentation preparation, an AI tutor driven by pre-written prompts in the notebooks, and a Retrieval-Augmented Generation (RAG) Slack bot that answered student questions by referencing the course textbook. This Slack bot, built using OpenAI’s Assistants API, also created a shared, public record of Q&A, moving beyond purely private 1:1 AI interactions.

The technical implementation employed a multi-tool approach, primarily using Google’s Gemini suite within Colab and OpenAI’s models for the Slack bot, selecting tools appropriate for specific tasks. The RAG system for the Slack bot was highlighted as a key method for grounding AI responses in authoritative course material to improve accuracy.
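The paper does not publish the bot's code, but the retrieve-then-generate pattern behind RAG can be sketched locally. The snippet below is a minimal, illustrative toy: the textbook chunks, the bag-of-words retriever, and the prompt template are all assumptions for demonstration, standing in for the dense-embedding retrieval and LLM call a production system (such as one built on OpenAI's Assistants API) would use.

```python
# Toy sketch of retrieval-augmented generation (RAG): ground an answer in
# course material by retrieving the most relevant passage before prompting
# a language model. Uses a bag-of-words cosine retriever for illustration;
# real systems use dense embeddings and then call an LLM with the prompt.
import math
from collections import Counter

# Hypothetical snippets standing in for indexed chunks of the course textbook.
TEXTBOOK_CHUNKS = [
    "The Hubble constant relates a galaxy's recession velocity to its distance.",
    "Cosmic microwave background radiation is relic light from recombination.",
    "Dark matter is inferred from galactic rotation curves and lensing.",
]

def bow(text: str) -> Counter:
    """Lowercased bag-of-words term counts."""
    return Counter(text.lower().replace(".", " ").replace("'", " ").split())

def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two term-count vectors."""
    dot = sum(a[w] * b[w] for w in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(question: str, chunks=TEXTBOOK_CHUNKS) -> str:
    """Return the chunk most similar to the question."""
    q = bow(question)
    return max(chunks, key=lambda c: cosine(q, bow(c)))

def grounded_prompt(question: str) -> str:
    """Assemble the prompt an LLM would receive: retrieved context + question."""
    context = retrieve(question)
    return f"Answer using only this course material:\n{context}\n\nQ: {question}"

print(grounded_prompt("What does the Hubble constant relate?"))
```

Tethering the model to retrieved source text in this way is what lets a RAG bot cite the authoritative textbook rather than free-associate, which is the hallucination-mitigation strategy the instructors highlight.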

The study’s outcomes indicated no reduction in student course evaluations compared to the non-GAI version of the course taught the previous year, suggesting the integration did not negatively impact the overall learning experience. Key lessons learned include:

1. GAI is more effective for specific, bounded tasks (e.g., coding, explaining defined concepts) than for open-ended exploration by novices.
2. A scaffolded approach that clearly designates where GAI use is and is not appropriate is essential.
3. High-stakes, no-GAI assessments are a critical motivator for ensuring students develop underlying competency.
4. Techniques like RAG can mitigate hallucinations by tethering GAI to reliable source material.
5. GAI can also significantly aid instructors in course preparation and administration.

While not a large-scale controlled experiment, this case study offers valuable, nuanced insights into the practical challenges and potential strategies for responsibly and effectively integrating generative AI into discipline-specific higher education.

