Existing methods of automated data collection simplify the process but often suffer from low reliability, efficiency, and speed. Unstable connections, IP address blocking, and changes in site structure lead to data loss and require constant monitoring of the parsing process, which raises the cost of maintaining and operating such systems. The development of new approaches and tools for parsing the required information is therefore a pressing task that could substantially advance the field of data mining. The article describes the development of a module for parsing information from patent systems and the websites of physics and technology journals using modern technologies and approaches, and presents the results of testing its operability. The tool may be useful to patent offices, researchers, students, engineers, and scientists working in this subject area. Such a module opens up new opportunities for data mining and strategic decision-making in innovative development, including in-depth analysis of technological trends, identification of promising developments, and the building of innovation strategies.
Kozina Svetlana Alexandrovna
Volgograd State Technical University
Volgograd, Russian Federation
Kulinchenko Inna Alexandrovna
Volgograd State Technical University
Volgograd, Russian Federation
Korobkin Dmitriy Mikhailovich
Candidate of Technical Sciences, Docent
Volgograd State Technical University
Volgograd, Russian Federation
Fomenkov Sergey Alekseevich
Doctor of Technical Sciences, Professor
Volgograd State Technical University
Volgograd, Russian Federation