Are Large Language Models Capable of Assessing Students’ Written Products?

A Pilot Study in Higher Education

Authors

  • Daniele Agostini, Università di Trento

DOI:

https://doi.org/10.6093/2284-0184/10671

Keywords:

Large Language Models (LLMs), AI-Assisted Assessment, Technology-Enhanced Assessment, Artificial Intelligence in Education, Assessment Rubrics, Higher Education, Student Assessment, Authentic Tasks, Academic Assessment, Educational Technology

Abstract

The rapid adoption of large language models (LLMs) such as ChatGPT in higher education raises critical questions about their capabilities for assessment. This pilot study explores whether current LLMs can support university instructors in evaluating students’ written work using rubrics, even for open-ended tasks. Five prominent LLMs (ChatGPT-3.5, ChatGPT-4, Claude 2, Bing Chat, Bard) plus one open-source outsider (OpenChat 3.5) evaluated 21 anonymized group projects from an education course using a five-criterion rubric. Their scores were compared with those of two human expert raters through statistical analyses. Claude 2 and ChatGPT-4 showed the highest overall agreement with the human raters, although the open-source OpenChat 3.5 performed remarkably well for its size. Agreement varied by criterion: LLM scores aligned more closely with human ratings on basic objectives but diverged on more complex dimensions, such as evaluating assessment practices and the design of the educational project. Current LLMs show promise in supporting assessment but cannot yet score independently, especially on sophisticated rubric dimensions. Further research should refine prompting techniques and specialize models, moving towards AI-assisted rather than autonomous evaluation. The main limitations of this study are the small sample size and the limited disciplinary scope. This study provides initial evidence of the possibilities and pitfalls of LLM assessment support in higher education.
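The abstract states that LLM scores were compared with two human expert raters through statistical analyses, without naming the specific statistic. As a minimal, hypothetical sketch of how agreement on ordinal rubric scores could be computed, the snippet below uses quadratically weighted Cohen's kappa via scikit-learn; the example scores, the single criterion, and the choice of statistic are illustrative assumptions, not the paper's reported method.

```python
# Illustrative sketch (not from the paper): agreement between one model's
# rubric scores and a human rater, using quadratically weighted Cohen's kappa,
# a common statistic for ordinal rubric scales.
from sklearn.metrics import cohen_kappa_score

# Hypothetical scores on a single 1-5 rubric criterion for 21 group projects.
human_scores = [4, 3, 5, 2, 4, 4, 3, 5, 4, 2, 3, 4, 5, 3, 4, 2, 5, 4, 3, 4, 5]
llm_scores   = [4, 3, 4, 3, 4, 5, 3, 5, 4, 2, 3, 3, 5, 4, 4, 2, 4, 4, 3, 4, 5]

# weights="quadratic" penalizes large disagreements more than near-misses,
# which suits ordered rubric levels.
kappa = cohen_kappa_score(human_scores, llm_scores, weights="quadratic")
print(f"Quadratically weighted kappa: {kappa:.2f}")
```

In practice such a statistic would be computed per model and per rubric criterion, which is consistent with the abstract's finding that agreement varied across criteria.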

Published

2024-01-16

Section

Brain Education Cognition