Application of machine learning to determine the order of adjectives in English
The scientific journal Modeling, Optimization and Information Technology
Online media
ISSN 2310-6018

Application of machine learning for adjective ordering in English sentences

Terekhova A.D., Terekhov G.V., Sychev O.A.

UDC 004.891.2
DOI: 10.26102/2310-6018/2023.40.1.028


The article presents a method for ordering adjectives in English sentences by determining the hypernym of each adjective. Determining a hypernym can be framed as a classification task, so the most popular machine-learning classification methods were compared: k-nearest neighbors, logistic regression, decision tree classifier, support vector machine, and naive Bayes. The models were trained on a sample of adjectives labeled with their hypernyms. For each adjective, similar adjectives were selected from the training sample, and the most semantically appropriate hypernym was determined from them. Word similarity was taken from GloVe embeddings. The optimal hyperparameter values for the k-nearest neighbors method were selected by grid search. Classification quality was evaluated for each method using precision, recall, and F1-measure. Because no ready-made datasets of classified adjectives were available, 300 adjectives were classified manually to create the necessary samples.
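The idea summarized in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the three-dimensional vectors below are made-up stand-ins for GloVe embeddings, the hypernym labels (`size`, `age`, `color`) and their ranking are assumptions based on the common textbook order of English adjectives, and the helper names `predict_hypernym` and `order_adjectives` are hypothetical. The sketch classifies an unseen adjective by a majority vote among its k most similar labeled adjectives (cosine similarity), then sorts adjectives by the rank of the predicted hypernym.

```python
import math
from collections import Counter

# Made-up 3-dimensional vectors standing in for GloVe embeddings (assumption).
TOY_VECTORS = {
    "red": (0.90, 0.10, 0.00), "blue": (0.80, 0.20, 0.10), "green": (0.85, 0.15, 0.05),
    "big": (0.10, 0.90, 0.00), "small": (0.20, 0.80, 0.10), "huge": (0.15, 0.95, 0.05),
    "old": (0.00, 0.10, 0.90), "young": (0.10, 0.20, 0.80), "ancient": (0.05, 0.10, 0.95),
    # Adjectives that are not in the labeled sample:
    "tiny": (0.20, 0.85, 0.05), "purple": (0.88, 0.12, 0.10), "new": (0.10, 0.15, 0.85),
}

# Hypothetical training sample: adjective -> hypernym (category).
TRAIN_LABELS = {
    "red": "color", "blue": "color", "green": "color",
    "big": "size", "small": "size", "huge": "size",
    "old": "age", "young": "age", "ancient": "age",
}

# Assumed ranking of hypernyms: size precedes age, which precedes color.
HYPERNYM_RANK = {"size": 0, "age": 1, "color": 2}


def cosine(u, v):
    """Cosine similarity between two vectors of equal length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def predict_hypernym(word, k=3):
    """Majority vote among the k labeled adjectives most similar to `word`."""
    query = TOY_VECTORS[word]
    neighbors = sorted(
        TRAIN_LABELS,
        key=lambda w: cosine(query, TOY_VECTORS[w]),
        reverse=True,
    )[:k]
    votes = Counter(TRAIN_LABELS[w] for w in neighbors)
    return votes.most_common(1)[0][0]


def order_adjectives(adjectives, k=3):
    """Sort adjectives by the rank of their predicted hypernym."""
    return sorted(adjectives, key=lambda w: HYPERNYM_RANK[predict_hypernym(w, k)])


print(predict_hypernym("tiny"))                     # size
print(order_adjectives(["purple", "new", "tiny"]))  # ['tiny', 'new', 'purple']
```

With real embeddings the neighborhood size k and the distance measure would be chosen by a search over candidate values, as the abstract describes for the k-nearest neighbors method; here k is fixed at 3 only to keep the example short.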

1. Mitrovic A., Koedinger K.R., Martin B. A comparative analysis of cognitive tutoring and constraint-based modeling. Lecture Notes in Computer Science. 2003;2702:313–322. DOI: 10.1007/3-540-44963-9_42.

2. Uglev V.A., Sychev O.A., Anikin A.V. Data mining of digital footprint during assessment grading for intelligent decision making during learning process. Zhurnal Sibirskogo federal'nogo universiteta. Tekhnika i tekhnologii = Journal of Siberian Federal University. Engineering & Technologies. 2022;15(1):121–136. DOI: 10.17516/1999-494X-0378. (In Russ.).

3. Malkani N. A Comprehensive guide on General English for competitive examinations. Agra, Oswal Publishers; 2020. 518 p.

4. Yogish D., Manjunath T.N., Hegadi S.R. Review on natural language processing trends and techniques using NLTK. Recent Trends in Image Processing and Pattern Recognition. 2018;1037:589–606. DOI: 10.1007/978-981-13-9187-3_53.

5. Bird S., Klein E., Loper E. Natural language processing with Python: analyzing text with the natural language toolkit. O’Reilly Media, Inc; 2009. 502 p.

6. Cheng X., Kong X., Liao L., Li B. A combined method for usage of NLP libraries towards analyzing software documents. Advanced Information Systems Engineering. CAiSE 2020. Lecture Notes in Computer Science. 2020;12127:515–529. DOI: 10.1007/978-3-030-49435-3_32.

7. Sarkar D. Text Analytics with Python: A Practitioner's Guide to Natural Language Processing. New York, Apress; 2019. 698 p.

8. Fellbaum C. WordNet: an Electronic Lexical Database. Cambridge, MIT Press; 1998. 422 p. DOI: 10.7551/mitpress/7287.001.0001.

9. Pennington J., Socher R., Manning C.D. Glove: Global vectors for word representation. Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP). 2014:1532–1543. DOI: 10.3115/v1/D14-1162.

10. Larose D.T., Larose C.D. Discovering knowledge in data: an introduction to data mining. New Jersey, Wiley-Interscience, John Wiley & Sons, Inc; 2005. 222 p.

11. Haneen A.A.A., Ahmad B.A.H. Effects of distance measure choice on K-nearest neighbor classifier performance: a review. Big Data. 2019:221–248.

12. Li B. Importance weighted feature selection strategy for text classification. International Conference on Asian Language Processing (IALP). 2016:344–347.

13. Cristianini N., Shawe-Taylor J. An introduction to support vector machines: and other kernel-based learning methods. Cambridge, Cambridge University Press; 2000. 204 p. DOI: 10.1017/CBO9780511801389.

14. Shafieezadeh-Abadeh S., Esfahani P.M., Kuhn D. Distributionally robust logistic regression. Advances in Neural Information Processing Systems. 2015:1576–1584.

15. Champandard A.J. AI Game Development: Synthetic Creatures with Learning and Reactive Behaviors. San Francisco, New Riders Pub; 2003. 500 p.

Terekhova Anastasia Dmitrievna

Email: nastyakr@list.ru


Volgograd State Technical University
OZON Tech

Volgograd, Russian Federation

Terekhov Grigory Vladimirovich

Email: grvlter@gmail.com


Volgograd State Technical University

Volgograd, Russian Federation

Sychev Oleg Aleksandrovich
Candidate of Technical Sciences, Associate Professor
Email: oasychev@gmail.com


Volgograd State Technical University

Volgograd, Russian Federation

Keywords: adjective ordering, natural language processing, word vector representation, GloVe, classification methods, hypernyms

For citation: Terekhova A.D., Terekhov G.V., Sychev O.A. Application of machine learning for adjective ordering in English sentences. Modeling, Optimization and Information Technology. 2023;11(1). Available from: https://moitvivt.ru/ru/journal/pdf?id=1301 DOI: 10.26102/2310-6018/2023.40.1.028 (In Russ.).



Received 11.01.2023

Revised 09.03.2023

Accepted 20.03.2023

Published 22.03.2023