In a recent experimental exam, artificial intelligence outperformed real students without being detected by the graders.
Researchers from the University of Reading (UK) created 33 fictional students and used the AI tool ChatGPT to generate answers for psychology course tests at the university. They report that the AI submissions scored on average 0.5 points higher than those of human students. The AI-written essays were also “almost undetectable,” with 94% going unnoticed by the teachers grading them.
According to the study published in the journal PLOS ONE, the 6% detection rate may even be an overestimation. “It is concerning that AI’s papers scored significantly higher than those of actual students. Therefore, students could cheat undetected by using AI and, in doing so, achieve higher scores than those who do not cheat,” the researchers wrote.
Associate Professor Peter Scarfe and Professor Etienne Roesch, who led the research, stated that their findings serve as a “wake-up call” for educators worldwide. “Many institutions have eliminated traditional exams for more comprehensive assessments. Our research shows the international importance of understanding how AI will impact educational evaluations. We do not necessarily have to revert entirely to handwritten tests – but the global education sector will need to evolve to respond to AI,” Scarfe noted.
University professors could not distinguish between AI-written answers and answers written by students.
In the study, the answers and essays were submitted for first-, second-, and third-year courses without the graders' knowledge. The AI submissions scored higher than actual university students in the first two years. However, the researchers noted that humans scored higher in the third-year exam, a finding that “supports the viewpoint that AI is still struggling with more abstract reasoning.”
The researchers describe their experiment as the largest study of its kind to date. Academics have raised concerns about the impact of AI on education, and the University of Glasgow (Scotland) recently reinstated in-person exams for one course.
Earlier this year, a study found that most college students had used AI programs to help them write essays, but only 5% admitted to copying AI-generated text verbatim into their work.