The scientific journal Modeling, Optimization and Information Technology
ISSN 2310-6018

Concept and architecture of parsing and storing a unified database of patents and scientific journal publications

Kozina S.A., Kulinchenko I.A., Korobkin D.M., Fomenkov S.A.

UDC 004.853
DOI: 10.26102/2310-6018/2024.47.4.024

Existing methods of automated data collection simplify the process but often suffer from low reliability, efficiency, and speed. Unstable connections, IP address blocking, and changes in site structure cause data loss and require constant monitoring of the parsing process, which raises the cost of maintaining and operating such systems. Developing new approaches and tools for parsing the required information is therefore an urgent task that could transform the field of data mining. The article describes the development of a module that parses information from patent systems and the websites of physics and technology journals using modern technologies and approaches, and presents the results of testing its operability. The tool can be useful for patent offices, researchers, students, engineers, and scientists working in this subject area. Using such a module opens up new opportunities for data mining and strategic decision-making in innovative development, including in-depth analysis of technological trends, identification of promising developments, and the building of innovative development strategies.

Kozina Svetlana Alexandrovna


Volgograd State Technical University

Volgograd, Russian Federation

Kulinchenko Inna Alexandrovna

Volgograd State Technical University

Volgograd, Russian Federation

Korobkin Dmitriy Mikhailovich
Candidate of Technical Sciences, Docent


Volgograd State Technical University

Volgograd, Russian Federation

Fomenkov Sergey Alekseevich
Doctor of Technical Sciences, Professor


Volgograd State Technical University

Volgograd, Russian Federation

Keywords: patents, physics and technology journals, parsing, scalability, fault tolerance

For citation: Kozina S.A., Kulinchenko I.A., Korobkin D.M., Fomenkov S.A. Concept and architecture of parsing and storing a unified database of patents and scientific journal publications. Modeling, Optimization and Information Technology. 2024;12(4). URL: https://moitvivt.ru/ru/journal/pdf?id=1740 DOI: 10.26102/2310-6018/2024.47.4.024 (In Russ.).


Received 13.11.2024

Revised 25.11.2024

Accepted 27.11.2024

Published 31.12.2024