Влияние размера образа ключа на производительность хеш-таблиц в современных архитектурах
Работая с сайтом, я даю свое согласие на использование файлов cookie. Это необходимо для нормального функционирования сайта, показа целевой рекламы и анализа трафика. Статистика использования сайта обрабатывается системой Яндекс.Метрика
Научный журнал Моделирование, оптимизация и информационные технологииThe scientific journal Modeling, Optimization and Information Technology
Online media
issn 2310-6018

Impact of key representation size on hash tables performance in modern CPUs

Buevich E.A. 

UDC 519.6
DOI: 10.26102/2310-6018/2025.50.3.014

  • Abstract
  • List of references
  • About authors

This paper considers a method for increasing the search speed in hash tables with links if the problem assumes that the performance is limited by the throughput of one of the interfaces between the storage levels (caches L1, L2, L3, memory). To reduce the impact of this limitation, an algorithm for optimal use of the cache line size, the minimum portion of information transferred between the storage levels, is proposed. The paper shows that there is an optimal size of information about a key in a hash table (key representation) for a specific problem and architecture; equations are given for its numerical and approximate analytical calculation for the cases of a key present and absent in the table. A separate case of using a part of a key as its representation in the table is considered. An algorithm for working with inconvenient key representation sizes that are not a power of two is proposed. The presented calculation results confirm the increase in search performance when using a calculated key representation size compared to other options. The presented experimental result confirms the assumption that the associated complication of the code has virtually no effect on performance due to partial processor idleness. The work assumes collision resolution via chains, but similar calculations should be applicable to other methods given their specific features.

1. Knuth D.E. The Art of Computer Programming. Volume 3. Sorting and Searching. Moscow: I.D. Vil'yams; 2018. 832 p. (In Russ.).

2. Farach-Colton M., Krapivin A., Kuszmaul W. Optimal Bounds for Open Addressing Without Reordering. In: 2024 IEEE 65th Annual Symposium on Foundations of Computer Science (FOCS), 27–30 October 2024, Chicago, IL, USA. IEEE; 2024. P. 594–605. https://doi.org/10.1109/FOCS61266.2024.00045

3. Zukowski M., Héman S., Boncz P. Architecture-Conscious Hashing. In: DaMoN '06: Proceedings of the 2nd International Workshop on Data Management on New Hardware, 25 June 2006, Chicago, IL, USA. New York: Association for Computing Machinery; 2006. https://doi.org/10.1145/1140402.114041

4. De Cristo F., De Almeida E., Alves M. ViViD Cuckoo Hash: Fast Cuckoo Table Building in SIMD. In: Proceedings of the 20th Symposium on High Performance Computing Systems, SSCAD 2019, 16–18 October 2019, Campo Grande, MS, Brasil. Porto Alegre: Sociedade Brasileira de Computação; 2019. P. 288–299. https://doi.org/10.5753/wscad.2019.8676

5. Thorup M. String Hashing for Linear Probing. In: SODA '09: Proceedings of the Twentieth Annual ACM-SIAM Symposium on Discrete Algorithms, 04–06 January 2009, New York, USA. Philadelphia: Society for Industrial and Applied Mathematics; 2009. P. 655–664. https://doi.org/10.1137/1.9781611973068.72

6. Barredo A., Cebrian J.M., Valero M., Casas M., Moreto M. Efficiency Analysis of Modern Vector Architectures: Vector ALU Sizes, Core Counts and Clock Frequencies. The Journal of Supercomputing. 2020;76(3):1960–1979. https://doi.org/10.1007/s11227-019-02841-6

7. Chen S., Ailamaki A., Gibbons P.B., Mowry T.C. Improving Hash Join Performance Through Prefetching. ACM Transactions on Database Systems. 2007;32(3). https://doi.org/10.1145/1272743.1272747

8. Ross K.A. Efficient Hash Probes on Modern Processors. In: 2007 IEEE 23rd International Conference on Data Engineering, 15 April 2006 – 20 April 2007, Istanbul, Turkey. IEEE; 2007. P. 1297–1301. https://doi.org/10.1109/ICDE.2007.368997

9. Herlihy M., Shavit N., Tzafrir M. Hopscotch Hashing. In: Distributed Computing: 22nd International Symposium, DISC 2008: Proceedings, 22–24 September 2008, Arcachon, France. Berlin, Heidelberg: Springer; 2008. P. 350–364. https://doi.org/10.1007/978-3-540-87779-0_24

10. Tripathi R.R.K., Singh P.K., Singh S. Templing and Temple Search-Same Height Hashing. [Preprint]. ResearchSquare. URL: https://doi.org/10.21203/rs.3.rs-1947527/v1 [Accessed 21st May 2025].

Buevich Evgeniy Andreevich

Moscow State Technological University "STANKIN"

Moscow, Russian Federation

Keywords: hash, hash-table, open addressing, chain, collision, memory level parallelism, cache, cache-line, cache miss

For citation: Buevich E.A. Impact of key representation size on hash tables performance in modern CPUs. Modeling, Optimization and Information Technology. 2025;13(3). URL: https://moitvivt.ru/ru/journal/pdf?id=1974 DOI: 10.26102/2310-6018/2025.50.3.014 .

44

Full text in PDF

Received 27.05.2025

Revised 27.06.2025

Accepted 08.07.2025