In recent years, AI has evolved at a rapid pace, quickly becoming integrated into everyday life. It’s bringing automation and efficiencies to many sectors and industries – with education and assessment being no exception.
Initially, discussions around AI in education centred on concerns that it might encourage dishonesty amongst students and increase cheating, raising questions about fairness and academic integrity. Over time, however, the narrative has shifted: increasingly, institutions are recognising that AI can be a valuable tool to support and enhance student learning.
With growing acceptance that AI is here to stay, many organisations are establishing clear policies for responsible and acceptable use. Furthermore, this evolving mindset is opening the door for AI use in streamlining and improving efficiency across educational operations.
In this blog, we consider how AI may be harnessed to enhance assessment processes.
One of the biggest challenges facing institutions is creating fresh exam questions. Not only must they test the required knowledge, but they must also be fair, robust, and pitched at an appropriate level of difficulty. Writing such questions can be incredibly time-consuming and resource-intensive.
Using AI for question creation therefore has the potential to deliver significant time savings. The critical question, however, is whether AI-generated items can maintain quality and meet the strict criteria demanded by educators.
This has been explored in a recently published study: Quality assurance and validity of AI-generated single best answer questions (2025). Researchers from the University of St Andrews used a widely available large language model (LLM) to generate 220 single best answer (SBA) questions. An expert panel then reviewed all 220 questions for accuracy, quality, and alignment with the curriculum. Of these, 69% met the inclusion criteria with minimal or no edits, suggesting that AI can produce high-quality, curriculum-aligned items efficiently.
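To make this concrete, here’s a minimal sketch of what LLM-assisted question drafting can look like in practice. It is not the study’s actual pipeline; the model name, prompt wording, and output format are purely illustrative assumptions, and every generated item would still need expert review before use.

```python
# Minimal sketch of LLM-assisted SBA drafting (illustrative only; not
# the pipeline used in the St Andrews study). Requires the `openai`
# package and an OPENAI_API_KEY environment variable.
from openai import OpenAI

client = OpenAI()

PROMPT_TEMPLATE = """You are a medical assessment writer.
Write one single best answer (SBA) question on the topic: {topic}.
Include a clinical vignette, a lead-in question, five options (A-E),
the single correct answer, and a one-sentence rationale.
Align the question with an early-years undergraduate medicine curriculum."""

def draft_sba(topic: str) -> str:
    """Ask the model to draft one SBA item for later expert review."""
    response = client.chat.completions.create(
        model="gpt-4o",  # illustrative model choice
        messages=[{"role": "user",
                   "content": PROMPT_TEMPLATE.format(topic=topic)}],
    )
    return response.choices[0].message.content

print(draft_sba("management of community-acquired pneumonia"))
```

The value here is scale: a script like this can draft hundreds of candidate items in minutes, shifting the educator’s effort from writing to reviewing.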
To further evaluate performance, fifty AI-authored and fifty human-authored questions were selected and used to create two exams for Year 1 and Year 2 ScotGEM (Scottish Graduate-Entry Medicine) students, with each exam containing an equal split of AI- and human-authored questions. Both were delivered as formative assessments via the Speedwell eSystem.
Experts then reviewed the performance of both question types and found no statistically significant difference between AI- and human-authored questions on key performance metrics (facility and discrimination index). This indicates that AI-assisted question generation can supplement traditional methods without compromising quality.
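For readers less familiar with these metrics: facility is the proportion of candidates who answer an item correctly, and the discrimination index measures how well an item separates stronger candidates from weaker ones. Below is a minimal sketch of one common way to compute both (the upper/lower-group method) from a binary response matrix; the study’s exact calculations may differ.

```python
# Item statistics from a binary response matrix
# (rows = candidates, columns = questions; 1 = correct, 0 = incorrect).
import numpy as np

def item_statistics(responses: np.ndarray, group_fraction: float = 0.27):
    """Return (facility, discrimination) arrays, one value per question.

    facility       -- proportion of all candidates answering correctly
    discrimination -- correct rate in the top-scoring group minus the
                      correct rate in the bottom-scoring group
    """
    totals = responses.sum(axis=1)   # each candidate's total score
    order = np.argsort(totals)       # candidates, weakest first
    n_group = max(1, int(len(totals) * group_fraction))
    lower, upper = order[:n_group], order[-n_group:]

    facility = responses.mean(axis=0)
    discrimination = responses[upper].mean(axis=0) - responses[lower].mean(axis=0)
    return facility, discrimination

responses = np.array([
    [1, 1, 0],
    [1, 0, 0],
    [1, 1, 1],
    [0, 0, 0],
    [1, 1, 1],
    [0, 1, 0],
])
facility, discrimination = item_statistics(responses)
print("facility:", facility)              # question 1: 4/6 candidates correct
print("discrimination:", discrimination)  # higher = separates candidates better
```

A question almost everyone answers correctly has a high facility (it is easy), while a question that strong candidates get right and weak candidates get wrong has a high discrimination index; comparing these values across the two question pools is how the study judged AI items against human-authored ones.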
This is encouraging for written questions and naturally leads to considering how AI could benefit other aspects of medical education. For example, could AI be used to enhance OSCE examinations? OSCEs are an important part of many medical education programmes, but they are notorious for being complex and time-consuming to administer and deliver.
Using AI in OSCE assessments could help reduce this burden by assisting educators in developing new OSCE scenarios or patient histories, as well as helping to produce consistent mark sheets with clear, objective marking criteria for examiners.
As the technology continues to evolve, AI could be used to create video simulations with virtual patients and clinical scenarios. This could ease the pressure of sourcing physical actors and venues, which is often difficult and expensive, while also offering the benefit of a completely consistent experience for candidates.
With an AI-simulated station, every candidate is presented with the same clinical cues, timing, and responses, removing variation that may arise from differences in acting style, interpretation, or fatigue. This could help reduce unintended bias and ensure that performance is judged on clinical decision-making rather than on how a scenario is presented, which in turn could improve the reliability of assessment outcomes.
Exam marking is another area of the assessment process that is time-consuming and labour-intensive. Whilst MCQ-type exams can already be marked automatically with supporting software, such as the eSystem, this has traditionally not been possible for written assessments such as short-answer or essay-style exams. However, this is set to change with AI.
Using AI to assist in the marking of written exams can provide rapid, consistent scoring, reducing the administrative burden on educators and enabling faster turnaround times for results. This allows academic staff to focus more on teaching and curriculum development rather than manually processing large volumes of exam scripts.
The benefits don’t end there; AI can also enhance reliability and fairness by ensuring scoring criteria are applied consistently and reducing the variation that can occur between human markers. Furthermore, it can highlight patterns in student responses, flagging areas where many candidates struggle and providing actionable insights for curriculum improvement.
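As a rough sketch of how rubric-based AI marking might work, the example below applies the same rubric prompt to every script, which is where the consistency described above comes from. The rubric, prompt wording, and model choice are illustrative assumptions, not a description of any particular marking product, and a human examiner would review each result.

```python
# Illustrative sketch of rubric-based AI-assisted marking. The rubric,
# prompt, and model are hypothetical; outputs go to a human for review.
import json
from openai import OpenAI

client = OpenAI()

RUBRIC = {  # criterion -> maximum marks (hypothetical example)
    "identifies_diagnosis": 2,
    "justifies_with_findings": 2,
    "outlines_initial_management": 1,
}

def mark_answer(question: str, answer: str) -> dict:
    """Score one short answer against the rubric, returning per-criterion marks."""
    prompt = (
        f"Mark this short-answer response against the rubric.\n"
        f"Question: {question}\nAnswer: {answer}\n"
        f"Rubric (criterion: max marks): {json.dumps(RUBRIC)}\n"
        'Reply with JSON only: {"scores": {criterion: marks}, "comment": str}.'
    )
    response = client.chat.completions.create(
        model="gpt-4o",  # illustrative
        messages=[{"role": "user", "content": prompt}],
        response_format={"type": "json_object"},  # request parseable JSON
    )
    return json.loads(response.choices[0].message.content)
```

Because the output is structured per criterion, aggregating scores across a whole cohort also makes it straightforward to flag the criteria that many candidates miss, which is exactly the kind of pattern-spotting described above.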
Additionally, AI can generate clear, personalised feedback for students, helping them understand their strengths and areas for development. Historically, providing detailed feedback on written exams has often been prohibitively time-consuming. However, with AI automatically producing a feedback summary, examiners can simply review it and accept or amend it as necessary.
This combination of speed, accuracy, and insight has the potential to support higher-quality assessment and ultimately deliver more effective learning.
With AI, feedback can go a step further and transform into a personalised, adaptive learning experience for candidates. By analysing individual performance and response patterns, it’s plausible to imagine that AI could deliver a series of formative tests individualised to the learner, giving them the opportunity to work on personal learning gaps and progress at a pace that’s right for them.
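As a toy illustration of that adaptive idea, the sketch below tracks a learner’s per-topic accuracy and steers the next formative questions toward their weakest areas. The data model is hypothetical, purely to show the selection logic.

```python
# Toy adaptive-selection sketch: target the learner's weakest topics next.
from collections import defaultdict

def weakest_topics(history, k=2):
    """history: list of (topic, was_correct) tuples from past attempts."""
    attempts = defaultdict(lambda: [0, 0])  # topic -> [correct, total]
    for topic, correct in history:
        attempts[topic][0] += int(correct)
        attempts[topic][1] += 1
    accuracy = {t: c / n for t, (c, n) in attempts.items()}
    return sorted(accuracy, key=accuracy.get)[:k]  # lowest accuracy first

history = [
    ("cardiology", True), ("cardiology", True),
    ("renal", False), ("renal", True),
    ("respiratory", False), ("respiratory", False),
]
print(weakest_topics(history))  # ['respiratory', 'renal'] -> drill these next
```

A real system would of course weigh question difficulty, recency, and curriculum coverage as well, but the principle is the same: let performance data decide what each learner sees next.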
Beyond this, using AI in assessment has the potential to make exams more accessible. Integrating AI for translation or language support could give candidates the option to take an exam in their preferred language, levelling the playing field by allowing them to focus on demonstrating their knowledge rather than their language proficiency. This would be particularly advantageous for candidates studying abroad, who are often required to sit exams in a language other than their own.
Similarly, AI can assist with accessibility features, such as text-to-speech, reading support, or alternative formats, helping institutions provide a more inclusive assessment experience and ensuring that candidates can take an exam in the way that best works for them.
Using AI in assessment offers transformative potential. It presents opportunities for educators to save time and enhance their processes. Here, we have considered just a few areas where AI could elevate exam processes. As the technology continues to evolve, more opportunities will emerge.
What’s clear, however, is that rather than fearing AI will replace jobs, we should view it as a partnership between technology and human expertise: a tool that assists and enhances human roles rather than replacing them, improving efficiency and accuracy while freeing up time to drive meaningful improvements.
Discover how our exam software can streamline your assessment processes, improve efficiency, and support more effective, reliable exams. Get in touch to see how we can help enhance your exam delivery and assessment outcomes.