AI Experimentation on Campus: Clara Hardy (CLAS)

6 March 2026

This blog post is part of a series of posts covering community members’ experiments with AI in the classroom and the workplace.

For a while now, advertisers have pitched generative AI (GenAI) to instructors for its potential to facilitate individualized learning. The idea is that, through prompting and adaptive conversation, an AI chatbot can craft a learning experience specifically tailored to the progress, needs, and strengths of the individual student. Clara Hardy, Professor of Classics, recently ran an experiment with her Ancient Greek language students to put this idea to the test.

Hardy used Playlab, an AI platform designed to help instructors develop AI applications and chatbots for use in the classroom. By prompting Playlab with written instructions and the vocabulary lists from the class textbooks, she designed a chatbot that could generate simple stories in Ancient Greek for students to practice with. Students would only have to tell the chatbot which chapters they had read so far, and the AI would generate a story using vocabulary from those chapters.

In their testing, though, Hardy and her students found that the AI bots struggled with several important tasks. For one, the Playlab bot was unable to process the spreadsheet format of the textbook vocabulary lists. Because changing this format would involve rewriting hundreds of vocabulary terms, Hardy and her students switched to using Claude, another popular AI tool. But even when they could get Claude to read the vocabulary lists, it still generated text with significant grammatical errors. This was especially problematic, as the bots were intended to help beginner and intermediate students who didn’t yet have the mastery needed to identify these errors. 

When the experiment came to a close, Hardy asked her students about their experiences with the AI bots. One student described them as “peers who have learned only a little more Greek.” Another student described them as “extra, very needy students.” In her own reflection, Hardy noted that the situation presented an interesting dilemma about “AI-powered learning” in general: students need some level of knowledge to identify errors, but having this knowledge in advance also diminishes the need for the AI learning tool to begin with. In other words, when students are in a position to learn, they are poorly placed to catch the AI’s errors; once they are able to catch those errors, they likely have less to learn from the AI.

Much of this dilemma comes down to the fact that, whereas textbooks and published learning materials are reviewed and edited to reflect the cumulative expertise of the field, AI-generated content is new and unreviewed. But seeing as AI tools are advertised precisely for their ability to generate content on the fly, the prospect of meticulously reviewing their output seems to defeat the apparent advantage they have over traditional learning materials.

On the whole, while she found the AI’s performance disappointing, Hardy remained hopeful that the technical difficulties she encountered were not necessarily a sign that AI could never be used in this way. She speculated, for instance, that this strategy might work better for subjects with more consistent publicly available information, or with AI tools actually trained by experts. In any case, though, it is crucial for us to proceed with caution and greet unqualified claims of expertise with a curious, critical spirit.