Keywords: sentiment analysis, computational linguistics, machine learning, classification features, hybrid intelligent system, support vector machine, random forest
Models and methods for sentiment analysis of texts in Bashkir language
UDC 004.048
DOI: 10.26102/2310-6018/2020.30.3.016
Research on automatic opinion extraction remains relevant. The article gives a formal description of the term "opinion" and formulates tasks according to which properties of an opinion are to be determined. It describes the problems that arise in sentiment analysis, the approaches to solving them, and ready-made software implementations. Available corpora of texts in the Bashkir language are presented, together with a task statement for sentiment analysis in the Bashkir language. The proposed solution includes an algorithm for tagging the texts, a preprocessing algorithm, a choice of classification features, and classification algorithms. The article also presents the results of a computational experiment aimed at identifying the most effective classifier according to quality metrics. The results of this work and the developed software solution, based on an SVM trained with stochastic gradient descent, which showed the highest precision, recall, and F-measure, can be used for sentiment analysis of news sites in the Bashkir language. The research presented in this article was supported by RFBR grants 19-07-00709 and 20-08-00668 and by the Ministry of Science and Higher Education of the Russian Federation within the State Assignment of Ufa State Aviation Technical University No. FEUE-2020-0007.
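The classifier named in the abstract, a linear SVM trained with stochastic gradient descent, and the quality metrics used to compare classifiers can be illustrated with a minimal sketch. It is not the authors' implementation: the bag-of-words features, the hyperparameters (`epochs`, `lr`, `reg`), the function names, and the four English toy sentences are all assumptions made for illustration, and the article's actual tagging, preprocessing, and feature selection steps are not reproduced here.

```python
from collections import Counter


def bag_of_words(text):
    """Hypothetical feature extractor: lowercase, split on whitespace, count tokens."""
    return Counter(text.lower().split())


def score(weights, bias, features):
    """Linear decision function w.x + b over sparse token counts."""
    return bias + sum(weights.get(tok, 0.0) * cnt for tok, cnt in features.items())


def train_svm_sgd(samples, labels, epochs=20, lr=0.1, reg=0.01):
    """Linear SVM fitted by SGD on the L2-regularised hinge loss.

    Labels must be +1 (positive) or -1 (negative). Hyperparameter
    values are illustrative assumptions, not taken from the article.
    """
    weights, bias = {}, 0.0
    for _ in range(epochs):
        for features, y in zip(samples, labels):
            margin = y * score(weights, bias, features)
            # Gradient step for the L2 penalty: shrink every weight.
            for tok in weights:
                weights[tok] *= 1.0 - lr * reg
            # Hinge loss is active only when the margin is below 1.
            if margin < 1.0:
                for tok, cnt in features.items():
                    weights[tok] = weights.get(tok, 0.0) + lr * y * cnt
                bias += lr * y
    return weights, bias


def predict(weights, bias, features):
    return 1 if score(weights, bias, features) >= 0.0 else -1


def precision_recall_f1(y_true, y_pred, positive=1):
    """Precision, recall, and F-measure (F1) for the chosen positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Toy data (hypothetical, in English for readability).
texts = ["good great news", "great happy result",
         "bad sad news", "terrible bad result"]
labels = [1, 1, -1, -1]

samples = [bag_of_words(t) for t in texts]
weights, bias = train_svm_sgd(samples, labels)
predictions = [predict(weights, bias, s) for s in samples]
print(precision_recall_f1(labels, predictions))
```

Evaluating on the training sentences, as done here, only checks that the sketch runs; a real comparison of classifiers would use held-out texts.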
For citation: Suleimanov A.K., Sharipova M.A., Smetanina O.N., Sazonova E.Y., Mironov K.V. Models and methods for sentiment analysis of texts in Bashkir language. Modeling, Optimization and Information Technology. 2020;8(3). URL: https://moit.vivt.ru/wp-content/uploads/2020/08/SuleimanovSoavtors_3_20_1.pdf DOI: 10.26102/2310-6018/2020.30.3.016 (In Russ).
Published 30.09.2020