Are Large Language Models Capable of Assessing Students’ Written Products?
A Pilot Study in Higher Education
DOI:
https://doi.org/10.6093/2284-0184/10671
Keywords:
Large Language Models (LLMs), AI-Assisted Assessment, Technology-Enhanced Assessment, Artificial Intelligence in Education, Assessment Rubrics, Higher Education, Student Assessment, Authentic Tasks, Academic Assessment, Educational Technology
Abstract
The rapid adoption of large language models (LLMs) like ChatGPT in higher education raises critical questions about their capabilities for assessment. This pilot study explores whether current LLMs can support university instructors in evaluating students’ written work using rubrics, even for open-ended tasks. Five prominent LLMs (ChatGPT-3.5, ChatGPT-4, Claude 2, Bing Chat, Bard) plus an open-source outsider (OpenChat 3.5) evaluated 21 anonymized group projects from an education course using a five-criterion rubric. Their scores were compared to those of two human expert raters through statistical analyses. Results showed that Claude 2 and ChatGPT-4 had the highest overall agreement with the human raters, although the open-source OpenChat 3.5 performed well above expectations for its size. Agreement varied by criterion: LLM scoring aligned more closely with human raters on basic objectives but diverged on complex dimensions such as evaluating assessment practices and the design of the educational project. Current LLMs show promise in supporting assessment but lack independent scoring ability, especially for sophisticated rubric dimensions. Further research should refine prompting techniques and specialize models, moving towards AI-assisted rather than autonomous evaluation. The main limitations of this study are the small sample size and the limited range of disciplines. This study provides initial evidence for the possibilities and pitfalls of LLM assessment aid in higher education.
Downloads
License
Authors who publish in this journal agree to the following:
- Authors retain the rights to their work and grant the journal the right of first publication, with the work simultaneously licensed under a Creative Commons Attribution License that allows others to share the work provided they indicate the authorship and the initial publication in this journal.
- Authors may enter into other non-exclusive licensing agreements for the distribution of the published version of the work (e.g., depositing it in an institutional repository or publishing it in a monograph), provided they indicate that the work was first published in this journal.
- Authors may distribute their work online (e.g., in institutional repositories or on their website) before and during the submission process, as this can lead to productive exchanges and increase citations of the published work (see The Effect of Open Access).