<?xml version="1.0" encoding="UTF-8"?>
<article article-type="research-article" dtd-version="1.3" xml:lang="ru" xmlns:xlink="http://www.w3.org/1999/xlink" xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:noNamespaceSchemaLocation="https://metafora.rcsi.science/xsd_files/journal3.xsd">
  <front>
    <journal-meta>
      <journal-id journal-id-type="publisher-id">moitvivt</journal-id>
      <journal-title-group>
        <journal-title xml:lang="ru">Моделирование, оптимизация и информационные технологии</journal-title>
        <trans-title-group xml:lang="en">
          <trans-title>Modeling, Optimization and Information Technology</trans-title>
        </trans-title-group>
      </journal-title-group>
      <issn pub-type="epub">2310-6018</issn>
      <publisher>
        <publisher-name>Издательство</publisher-name>
      </publisher>
    </journal-meta>
    <article-meta>
      <article-id pub-id-type="doi">10.26102/2310-6018/2025.48.1.021</article-id>
      <article-id pub-id-type="custom" custom-type="elpub">1799</article-id>
      <title-group>
        <article-title xml:lang="ru">Метод генерации вопросов закрытого типа с использованием LLM</article-title>
        <trans-title-group xml:lang="en">
          <trans-title>A method for generating closed-type questions using LLMs</trans-title>
        </trans-title-group>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <name-alternatives>
            <name name-style="eastern" xml:lang="ru">
              <surname>Дагаев</surname>
              <given-names>Александр Евгеньевич</given-names>
            </name>
            <name name-style="western" xml:lang="en">
              <surname>Dagaev</surname>
              <given-names>Alexander Evgenevich</given-names>
            </name>
          </name-alternatives>
          <email>a.e.dagaev@mospolytech.ru</email>
          <xref ref-type="aff">aff-1</xref>
        </contrib>
      </contrib-group>
      <aff-alternatives id="aff-1">
        <aff xml:lang="ru">Московский политехнический университет</aff>
        <aff xml:lang="en">Moscow Polytechnic University</aff>
      </aff-alternatives>
      <pub-date pub-type="epub">
        <day>01</day>
        <month>01</month>
        <year>2026</year>
      </pub-date>
      <volume>1</volume>
      <issue>1</issue>
      <elocation-id>10.26102/2310-6018/2025.48.1.021</elocation-id>
      <permissions>
        <copyright-statement>Copyright © Авторы, 2026</copyright-statement>
        <copyright-year>2026</copyright-year>
        <license license-type="creative-commons-attribution" xlink:href="https://creativecommons.org/licenses/by/4.0/">
          <license-p>This work is licensed under a Creative Commons Attribution 4.0 International License</license-p>
        </license>
      </permissions>
      <self-uri xlink:href="https://moitvivt.ru/ru/journal/article?id=1799"/>
      <abstract xml:lang="ru">
        <p>В исследовании представлен метод генерации вопросов закрытого типа, использующий большие языковые модели (LLM) для повышения качества и релевантности создаваемых вопросов. Предложенная структура объединяет этапы генерации, верификации и корректировки, что позволяет не исключать некачественные вопросы, а улучшать их с использованием обратной связи. Метод был протестирован на трех популярных наборах данных: SQuAD, Natural Questions и RACE. Ключевые метрики оценки ROUGE, BLEU и METEOR стабильно показывали улучшения производительности на всех протестированных моделях. В исследовании использовались четыре варианта LLM: O1, O1-mini, GPT-4o и GPT-4o-mini, при этом O1 достигла наивысших результатов по всем наборам данных и метрикам. Экспертная оценка показала увеличение точности до 14,4 % по сравнению с генерацией без верификации и корректировки. Полученные результаты подчеркивают эффективность метода в обеспечении большей ясности, фактической корректности и контекстуальной релевантности в сгенерированных вопросах. Сочетание автоматизированной верификации и корректировки дополнительно улучшает результаты, демонстрируя потенциал LLM в совершенствовании задач генерации текста. Результаты работы будут полезны исследователям в области обработки естественного языка, образовательных технологий, а также специалистам, работающим над адаптивными системами обучения и программным обеспечением корпоративного обучения.</p>
      </abstract>
      <trans-abstract xml:lang="en">
        <p>This study presents a method for closed-ended question generation leveraging large language models (LLMs) to improve the quality and relevance of generated questions. The proposed framework combines the stages of generation, verification, and refinement, which allows low-quality questions to be improved through feedback rather than simply discarded. The method was tested on three widely recognized datasets: SQuAD, Natural Questions, and RACE. Key evaluation metrics, including ROUGE, BLEU, and METEOR, consistently showed performance gains across all tested models. Four LLM configurations were used: O1, O1-mini, GPT-4o, and GPT-4o-mini, with O1 achieving the highest results across all datasets and metrics. Expert evaluation revealed an accuracy improvement of up to 14.4% compared to generation without verification and refinement. The results highlight the method's effectiveness in ensuring greater clarity, factual correctness, and contextual relevance in generated questions. The combination of automated verification and refinement further enhances outcomes, showcasing the potential of LLMs to refine text generation tasks. These findings will benefit researchers in natural language processing and educational technology, as well as professionals working on adaptive learning systems and corporate training software.</p>
      </trans-abstract>
      <kwd-group xml:lang="ru">
        <kwd>генерация вопросов</kwd>
        <kwd>большие языковые модели</kwd>
        <kwd>искусственный интеллект</kwd>
        <kwd>обработка естественного языка</kwd>
        <kwd>O1</kwd>
        <kwd>O1-mini</kwd>
        <kwd>GPT-4o</kwd>
        <kwd>GPT-4o-mini</kwd>
      </kwd-group>
      <kwd-group xml:lang="en">
        <kwd>question generation</kwd>
        <kwd>large language models</kwd>
        <kwd>artificial intelligence</kwd>
        <kwd>natural language processing</kwd>
        <kwd>O1</kwd>
        <kwd>O1-mini</kwd>
        <kwd>GPT-4o</kwd>
        <kwd>GPT-4o-mini</kwd>
      </kwd-group>
      <funding-group>
        <funding-statement xml:lang="ru">Исследование выполнено без спонсорской поддержки.</funding-statement>
        <funding-statement xml:lang="en">The study was performed without external funding.</funding-statement>
      </funding-group>
    </article-meta>
  </front>
  <back>
    <ref-list>
      <title>References</title>
      <ref id="cit1">
        <label>1</label>
        <mixed-citation xml:lang="ru">Huang J.-H., Zhu H., Shen Yi., et al. Image2Text2Image: A Novel Framework for Label-Free Evaluation of Image-to-Text Generation with Text-to-Image Diffusion Models [Preprint]. arXiv. URL: https://doi.org/10.48550/arXiv.2411.05706 [Accessed 3rd January 2025].</mixed-citation>
      </ref>
      <ref id="cit2">
        <label>2</label>
        <mixed-citation xml:lang="ru">Chen Q., Wang Y., Wang F., et al. Decoding text from electroencephalography signals: A novel Hierarchical Gated Recurrent Unit with Masked Residual Attention Mechanism. Engineering Applications of Artificial Intelligence. 2025;139. https://doi.org/10.1016/j.engappai.2024.109615</mixed-citation>
      </ref>
      <ref id="cit3">
        <label>3</label>
        <mixed-citation xml:lang="ru">Zakareya S., Alsaleem N., Alnaghmaish A., et al. Evaluating the Discrimination Index of AI-Generated vs. Human-Generated Multiple-Choice Questions: Action Research. In: ICERI2024 Proceedings: 17th annual International Conference of Education, Research and Innovation, 11–13 November 2024, Seville, Spain. IATED; 2024. pp. 221–226. https://doi.org/10.21125/iceri.2024.0137</mixed-citation>
      </ref>
      <ref id="cit4">
        <label>4</label>
        <mixed-citation xml:lang="ru">Shetty N., Li Yo. Detailed Image Captioning and Hashtag Generation. Future Internet. 2024;16(12). https://doi.org/10.3390/fi16120444</mixed-citation>
      </ref>
      <ref id="cit5">
        <label>5</label>
        <mixed-citation xml:lang="ru">Kwiatkowski T., Palomaki J., Redfield O., et al. Natural Questions: A Benchmark for Question Answering Research. Transactions of the Association for Computational Linguistics. 2019;7:453–466. https://doi.org/10.1162/tacl_a_00276</mixed-citation>
      </ref>
      <ref id="cit6">
        <label>6</label>
        <mixed-citation xml:lang="ru">Lai G., Xie Q., Liu H., et al. RACE: Large-Scale ReAding Comprehension Dataset from Examinations. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, 09–11 September 2017, Copenhagen, Denmark. Association for Computational Linguistics; 2017. pp. 785–794. https://doi.org/10.18653/v1/D17-1082</mixed-citation>
      </ref>
      <ref id="cit7">
        <label>7</label>
        <mixed-citation xml:lang="ru">Thorne W., Robinson A., Peng B., et al. Increasing the Difficulty of Automatically Generated Questions via Reinforcement Learning with Synthetic Preference. In: Proceedings of the 4th International Conference on Natural Language Processing for Digital Humanities, 16 November 2024, Miami, USA. Association for Computational Linguistics; 2024. pp. 450–462. https://doi.org/10.18653/v1/2024.nlp4dh-1.43</mixed-citation>
      </ref>
      <ref id="cit8">
        <label>8</label>
        <mixed-citation xml:lang="ru">Ribeiro M.T., Singh S., Guestrin C. Semantically Equivalent Adversarial Rules for Debugging NLP Models. In: Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics: Volume 1: Long Papers, 15–20 July 2018, Melbourne, Australia. Association for Computational Linguistics; 2018. pp. 856–865. https://doi.org/10.18653/v1/P18-1079</mixed-citation>
      </ref>
      <ref id="cit9">
        <label>9</label>
        <mixed-citation xml:lang="ru">Brown T., Mann B., Ryder N., et al. Language Models Are Few-Shot Learners. In: Advances in Neural Information Processing Systems 33: 34th Conference on Neural Information Processing Systems 2020, NeurIPS 2020, 06–12 December 2020, Vancouver, Canada. 2020. pp. 1877–1901.</mixed-citation>
      </ref>
      <ref id="cit10">
        <label>10</label>
        <mixed-citation xml:lang="ru">Bian Yu., Huang J., Cai X., et al. On Attention Redundancy: A Comprehensive Study. In: Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, NAACL-HLT 2021, 06–11 June 2021, Online. Association for Computational Linguistics; 2021. pp. 930–945. https://doi.org/10.18653/v1/2021.naacl-main.72</mixed-citation>
      </ref>
      <ref id="cit11">
        <label>11</label>
        <mixed-citation xml:lang="ru">Jiang N., De Marneffe M.-C. He Thinks He Knows Better than the Doctors: BERT for Event Factuality Fails on Pragmatics. Transactions of the Association for Computational Linguistics. 2021;9:1081–1097. https://doi.org/10.1162/tacl_a_00414</mixed-citation>
      </ref>
      <ref id="cit12">
        <label>12</label>
        <mixed-citation xml:lang="ru">Lafkiar S., En Nahnahi N. An End-to-End Transformer-Based Model for Arabic Question Generation. Multimedia Tools and Applications. 2024. https://doi.org/10.1007/s11042-024-19958-3</mixed-citation>
      </ref>
      <ref id="cit13">
        <label>13</label>
        <mixed-citation xml:lang="ru">Balepur N., Gu F., Ravichander A., et al. Reverse Question Answering: Can an LLM Write a Question so Hard (or Bad) that it Can't Answer? [Preprint]. arXiv. URL: https://doi.org/10.48550/arXiv.2410.15512 [Accessed 3rd January 2025].</mixed-citation>
      </ref>
      <ref id="cit14">
        <label>14</label>
        <mixed-citation xml:lang="ru">Ye W., Zhang Q., Zhou X., et al. Correcting Factual Errors in LLMs via Inference Paths Based on Knowledge Graph. In: Proceedings of the 2024 International Conference on Computational Linguistics and Natural Language Processing (CLNLP), 19–21 July 2024, Yinchuan, China. IEEE; 2024. pp. 12–16. https://doi.org/10.1109/CLNLP64123.2024.00011</mixed-citation>
      </ref>
      <ref id="cit15">
        <label>15</label>
        <mixed-citation xml:lang="ru">Wei X., Chen H., Yu H., et al. Guided Knowledge Generation with Language Models for Commonsense Reasoning. In: Findings of the Association for Computational Linguistics: EMNLP 2024, 12–16 November 2024, Miami, USA. Association for Computational Linguistics; 2024. pp. 1103–1136. https://doi.org/10.18653/v1/2024.findings-emnlp.61</mixed-citation>
      </ref>
    </ref-list>
    <fn-group>
      <fn fn-type="conflict">
        <p>The author declares that there is no conflict of interest.</p>
      </fn>
    </fn-group>
  </back>
</article>