The scientific journal Modeling, Optimization and Information Technology
Online media
ISSN 2310-6018

Artificial intelligence in the task of generating distractors for test questions

Dagaev A. 

UDC 004.89
DOI: 10.26102/2310-6018/2025.49.2.028


Creating high-quality distractors for test items is a labor-intensive task that plays a crucial role in the accurate assessment of knowledge. Existing approaches often produce implausible alternatives or fail to reflect typical student errors. This paper proposes an AI-based algorithm for distractor generation: a large language model (LLM) first constructs a correct chain of reasoning for a given question and answer, and then introduces typical misconceptions into that chain to generate incorrect but plausible answer choices that capture common student misunderstandings. The algorithm was evaluated on questions from the Russian-language datasets RuOpenBookQA and RuWorldTree, using both automatic metrics and expert assessment. The results show that the proposed algorithm outperforms baseline methods such as direct prompting and semantic modification, generating distractors with higher plausibility, relevance, and diversity, and closer similarity to human-authored reference distractors. This work contributes to the field of automated assessment material generation, offering a tool that supports the development of more effective evaluation resources for educators, educational platform developers, and researchers in natural language processing.
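The two-stage scheme described in the abstract can be sketched as follows. This is an illustrative outline, not the author's exact implementation: the prompts, the `llm` callable (any function mapping a prompt string to a completion string), and the helper name `generate_distractors` are assumptions introduced for the example.

```python
def generate_distractors(llm, question, correct_answer, n=3):
    """Sketch of two-stage distractor generation with an LLM.

    llm: any callable str -> str (e.g. a wrapper around an LLM API).
    Stage 1 elicits a correct chain of reasoning; stage 2 perturbs it
    with a typical misconception to obtain plausible wrong answers.
    """
    # Stage 1: build the correct reasoning chain for the question.
    reasoning = llm(
        f"Question: {question}\n"
        f"Correct answer: {correct_answer}\n"
        "Explain step by step why this answer is correct."
    )
    # Stage 2: inject typical student misconceptions into the chain
    # and read off the incorrect-but-plausible answers they lead to.
    distractors = []
    for i in range(n):
        wrong = llm(
            f"Question: {question}\n"
            f"Correct reasoning: {reasoning}\n"
            f"Introduce a typical student misconception (variant {i + 1}) "
            "into this reasoning and state only the incorrect answer it produces."
        ).strip()
        # Discard empty outputs and accidental repeats of the right answer.
        if wrong and wrong != correct_answer:
            distractors.append(wrong)
    return distractors
```

In practice the stage-2 call would be followed by the plausibility, relevance, and diversity checks the paper evaluates; here the only filter shown is rejecting the correct answer itself.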

1. Awalurahman H.W., Budi I. Automatic Distractor Generation in Multiple-Choice Questions: A Systematic Literature Review. PeerJ Computer Science. 2024;10. https://doi.org/10.7717/peerj-cs.2441

2. Kumar A.P., Nayak A., K. M.Sh., Goyal Sh., Chaitanya. A Novel Approach to Generate Distractors for Multiple Choice Questions. Expert Systems with Applications. 2023;225. https://doi.org/10.1016/j.eswa.2023.120022

3. Bitew S.K., Hadifar A., Sterckx L., Deleu J., Develder Ch., Demeester Th. Learning to Reuse Distractors to Support Multiple-Choice Question Generation in Education. IEEE Transactions on Learning Technologies. 2022;17:375–390. https://doi.org/10.1109/tlt.2022.3226523

4. Artsi Ya., Sorin V., Konen E., Glicksberg B.S., Nadkarni G., Klang E. Large Language Models for Generating Medical Examinations: Systematic Review. BMC Medical Education. 2024;24. https://doi.org/10.1186/s12909-024-05239-y

5. Shi F., Chen X., Misra K., et al. Large Language Models Can Be Easily Distracted by Irrelevant Context. In: Proceedings of the 40th International Conference on Machine Learning, ICML 2023: Volume 202, 23–29 July 2023, Honolulu, Hawaii, USA. PMLR; 2023. P. 31210–31227.

6. Lee Yo., Kim S., Jo Yo. Generating Plausible Distractors for Multiple-Choice Questions via Student Choice Prediction. arXiv. URL: https://arxiv.org/abs/2501.13125 [Accessed 12th April 2025].

7. De-Fitero-Dominguez D., Garcia-Lopez E., Garcia-Cabot A., Del-Hoyo-Gabaldon J.-A., Moreno-Cediel A. Distractor Generation Through Text-to-Text Transformer Models. IEEE Access. 2024;12:25580–25589. https://doi.org/10.1109/access.2024.3361673

8. Zhang L., Zou B., Aw A.T. Empowering Tree-Structured Entailment Reasoning: Rhetorical Perception and LLM-driven Interpretability. In: Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024), 20–25 May 2024, Torino, Italy. ELRA and ICCL; 2024. P. 5783–5793.

9. Feng W., Lee J., McNichols H., et al. Exploring Automated Distractor Generation for Math Multiple-choice Questions via Large Language Models. In: Findings of the Association for Computational Linguistics: NAACL 2024, 16–21 June 2024, Mexico City, Mexico. Association for Computational Linguistics; 2024. P. 3067–3082. https://doi.org/10.18653/v1/2024.findings-naacl.193

10. Cai X., Wang Ch., Long Q., Zhou Yu., Xiao M. Knowledge Hierarchy Guided Biological-Medical Dataset Distillation for Domain LLM Training. arXiv. URL: https://arxiv.org/abs/2501.15108 [Accessed 12th April 2025].

11. Wang R., Jiang Yu., Tao Yu., Li M., Wang X., Ge Sh. High-Quality Distractors Generation for Human Exam Based on Reinforcement Learning from Preference Feedback. In: Natural Language Processing and Chinese Computing: 13th National CCF Conference, NLPCC 2024: Proceedings: Part IV, 01–03 November 2024, Hangzhou, China. Singapore: Springer; 2024. P. 94–106. https://doi.org/10.1007/978-981-97-9440-9_8

12. Maity S., Deroy A., Sarkar S. A Novel Multi-Stage Prompting Approach for Language Agnostic MCQ Generation Using GPT. In: Advances in Information Retrieval: 46th European Conference on Information Retrieval, ECIR 2024: Proceedings: Part III, 24–28 March 2024, Glasgow, UK. Cham: Springer; 2024. P. 268–277. https://doi.org/10.1007/978-3-031-56063-7_18

13. Shen Ch.-H., Kuo Yi-L., Fan Ya.-Ch. Personalized Cloze Test Generation with Large Language Models: Streamlining MCQ Development and Enhancing Adaptive Learning. In: Proceedings of the 17th International Natural Language Generation Conference, 23–27 September 2024, Tokyo, Japan. Association for Computational Linguistics; 2024. P. 314–319.

14. Wang H.-J., Hsieh K.-Yu., Yu H.-Ch., et al. Distractor Generation Based on Text2Text Language Models with Pseudo Kullback-Leibler Divergence Regulation. In: Findings of the Association for Computational Linguistics: ACL 2023, 09–14 July 2023, Toronto, Canada. Association for Computational Linguistics; 2023. P. 12477–12491. https://doi.org/10.18653/v1/2023.findings-acl.790

Dagaev Alexander

Moscow Polytechnic University

Moscow, Russian Federation

Keywords: distractor generation, artificial intelligence, large language models, knowledge assessment, test items, automated test generation, NLP

For citation: Dagaev A. Artificial intelligence in the task of generating distractors for test questions. Modeling, Optimization and Information Technology. 2025;13(2). URL: https://moitvivt.ru/ru/journal/pdf?id=1915 DOI: 10.26102/2310-6018/2025.49.2.028 (In Russ.).



Received 21.04.2025

Revised 13.05.2025

Accepted 22.05.2025