Разработка алгоритма индексирования данных на основе структуры данных CW-tree с применением параллельных вычислений
Работая с нашим сайтом, вы даете свое согласие на использование файлов cookie. Это необходимо для нормального функционирования сайта, показа целевой рекламы и анализа трафика. Статистика использования сайта отправляется в «Яндекс» и «Google»
Научный журнал Моделирование, оптимизация и информационные технологииThe scientific journal Modeling, Optimization and Information Technology
Online media
issn 2310-6018

Development of a data indexing algorithm based on the CW-tree data structure using parallel computing

idShevskiy V.S.

UDC 004.657
DOI: 10.26102/2310-6018/2021.32.1.013

  • Abstract
  • List of references
  • About authors

The purpose of this scientific research was to develop an algorithm for data indexing using paral-lel computing technologies, the use of which would lead to a decrease in the time for processing queries to the database. In the course of the research, an algorithm for data indexing was devel-oped based on a special tree-based data storage structure called CW-tree (Constantly Wide tree). The CW-tree traversal in accordance with the proposed indexing algorithm is carried out in an asynchronous mode, which is achieved through the use of additional start vertices. Testing has shown that, in comparison with existing analogs of indexing algorithms, the proposed algorithm based on the CW-tree has advantages, the main of which is the complete parallelization of data retrieval both at the top level and at the level of the tree leaves. To establish the effect of using the CW-tree algorithm, a comparative study was carried out, namely: reading data using the B-tree, which is widely used in almost all modern database management systems, and reading data using the proposed CW-tree ... In the course of the study, queries to the database were built in such a way that a search was carried out for a specific key, it was established which approach would process a certain number of search queries faster. According to the results of the testing, it was found that searching for data in the CW-tree takes much less time than searching in the B-tree.

1. Nouri Z., Tu Y. GPU-Based Parallel Indexing for Concurrent Spatial Query Processing; Distributed and Parallel Databases. SSDBM'18: Proceedings of the 30th International Confer-ence on Scientific and Statistical Database Management. 2018;1-12. Available at: https://doi.org/10.1145/3221269.3221296 (accessed 20.01.202)

2. Shun J., Blelloch G. E. A Simple Parallel Cartesian Tree Algorithm and its Application to Parallel Suffix Tree Construction. Proceedings of the Thirteenth Workshop on Algorithm En-gineering and Experiments (ALENEX). 2011;(1):48-58.

3. Jarke M., Koch J. Query optimization in database systems. ACM Computing Surveys (CsUR). 1984;16(2):111-152.

4. Smith J. M., Chang P. Y. T. Optimizing the performance of a relational algebra database interface. Communications of the ACM (CACM). 1975;18(10):568-579.

5. Wu S., Li F., Mehrotra S., Chhin Ooi B. Query Optimization for Massively Parallel Da-ta Processing. SoCC Conference. 2011;(12):1–13.

6. Shnaiderman L., Shmueli O. A Parallel Tree Pattern Query Processing Algorithm for Graph Databases using a GPGPU. Proceedings of the Workshops of the EDBT/ICDT 2015 Joint Conference (EDBT/ICDT). 2020;1330(24):141. Available at: http://ceur-ws.org/Vol-1330/paper-24.pdf (accessed 20.01.2020)

7. Battre D., Angulo D. S. MPI framework for parallel searching in large biological data-bases. Journal of Parallel and Distributed Computing. 2006;66(12):1503–1511.

8. Dewan H. M., Stolfo S. J. Scalable Parallel and Distributed Expert Database Systems with Predictive Load Balancing. Journal of Parallel and Distributed Computing. 1994;22(3):506–522.

9. Ren D. Q. Algorithm level power efficiency optimization for CPU-GPU processing element in data intensive SIMD/SPMD computing. Journal of Parallel and Distributed Computing. 2011;72(2):245–253.

10. Liu P., Sun W., Zhang J., Zheng B. An automaton-based index scheme supporting twig queries for on-demand XML data broadcast. Journal of Parallel and Distributed Computing. 2015;(86):82–97.

11. Wani M. A., Bhat W. A. Dataset for forensic analysis of B-tree file system. Data in Brief. 2018;(18):2013–2018.

Shevskiy Vladislav Sergeevich

Email: immortalghost@yandex.ru

ORCID |

FSAEI OF HE Saint Petersburg State Electrotechnical University “LETI” named after V.I. Ul-yanov (Lenin),

Saint Petersburg, Russian Federation

Keywords: data indexing algorithms, tree data structure, parallel computing, multithreading, databases, search queries, start node, target node

For citation: Shevskiy V.S. Development of a data indexing algorithm based on the CW-tree data structure using parallel computing. Modeling, Optimization and Information Technology. 2021;9(1). Available from: https://moitvivt.ru/ru/journal/pdf?id=905 DOI: 10.26102/2310-6018/2021.32.1.013 (In Russ).

489

Full text in PDF