Keywords: approximate query processing, query processing algorithms, query pipeline, cluster sampling, data warehouse, hybrid transactional-analytical data processing
Developing an algorithm for approximation query pipeline processing in a relational database management system
UDC 004.65
DOI: 10.26102/2310-6018/2022.38.3.027
The article considers an algorithm for approximate query processing in relational database management systems. The algorithm produces approximate results for queries with aggregation and grouping, which makes it suitable for analytical query processing where the goal is to reduce response time. The presented algorithms implement random cluster sampling and rely on software that optimizes the distribution of the sample space according to a sample quality metric; the coefficient of variation is chosen as this metric. The article also proposes a model of the analytical query pipeline in the form of a directed acyclic graph. The approximate query processing algorithm is extended to operate on a query flow, so that a confidence interval is estimated together with the result of processing the query pipeline. The algorithm can be used to develop special database processor software that implements an architecture for approximate query processing in relational databases. The approach belongs to research on synthesizing the structure of hybrid data warehouses that implement transactional-analytical data processing. An experimental evaluation of the presented approach is left to further research.
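The idea of answering an aggregate query from a random sample of clusters, together with a confidence interval and a coefficient of variation, can be illustrated with a minimal sketch. It is not the algorithm from the article: it uses the textbook estimator of a population total under simple random sampling of clusters without replacement. The function name `approximate_sum`, the representation of a table as a list of clusters (for example, data pages), and the default `z = 1.96` for a roughly 95% normal-approximation interval are all assumptions made for illustration.

```python
import math
import random
import statistics


def approximate_sum(clusters, sample_size, z=1.96, seed=None):
    """Estimate SUM over all rows from a random sample of clusters.

    clusters    -- list of lists of numbers (hypothetical stand-in for data pages)
    sample_size -- number of clusters to read, at least 2
    Returns (estimate, half_width, cv), where the confidence interval is
    estimate +/- half_width and cv is the coefficient of variation.
    """
    if not 2 <= sample_size <= len(clusters):
        raise ValueError("sample_size must be between 2 and len(clusters)")

    rng = random.Random(seed)
    total_clusters = len(clusters)

    # Simple random sample of whole clusters, without replacement.
    sampled = rng.sample(clusters, sample_size)
    cluster_totals = [sum(cluster) for cluster in sampled]

    # Scale the mean cluster total up to the whole table.
    estimate = total_clusters * statistics.fmean(cluster_totals)

    # Variance of the estimator with the finite population correction.
    fpc = 1 - sample_size / total_clusters
    variance = (total_clusters ** 2) * fpc * statistics.variance(cluster_totals) / sample_size
    std_error = math.sqrt(variance)

    # Coefficient of variation: relative standard error of the estimate.
    cv = std_error / abs(estimate) if estimate != 0 else math.inf
    return estimate, z * std_error, cv


if __name__ == "__main__":
    table = [[1, 2], [3, 4], [5, 6], [7, 8]]
    est, half_width, cv = approximate_sum(table, sample_size=2, seed=0)
    print(f"SUM is approximately {est:.1f} +/- {half_width:.1f} (CV = {cv:.2f})")
```

In this sketch a smaller coefficient of variation indicates a more reliable estimate, so it can serve as a stopping criterion or as a way to compare alternative sample allocations; reading every cluster makes the interval collapse to the exact answer.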
For citation: Filimonov A.V. Developing an algorithm for approximation query pipeline processing in a relational database management system. Modeling, Optimization and Information Technology. 2022;10(3). URL: https://moitvivt.ru/ru/journal/pdf?id=1242. DOI: 10.26102/2310-6018/2022.38.3.027. (In Russ.).
Received 19.09.2022
Revised 27.09.2022
Accepted 30.09.2022
Published 30.09.2022