Keywords: window function, Hamming window, spectral analysis, voice signal processing, parameter optimization, gradient descent, biometric identification, spectrum estimation accuracy, STFT
Modified window function based on the Hamming window for improving the accuracy of determining the voice spectrum in audio recordings
UDC 004.622
DOI: 10.26102/2310-6018/2025.50.3.037
This paper addresses the accuracy of estimating the spectral characteristics of voice signals in audio recordings. A modification of the classical Hamming window function is proposed that introduces an optimizable parameter. The work is motivated by the need for more reliable voice recognition and identification systems, particularly in biometric and authentication applications. The main objective is to develop an algorithm that computes the optimal value of this parameter, maximizing the quality of spectral analysis for specific voice frequency ranges. The parameter of the modified function is optimized by gradient descent. Quality is assessed by a weighted sum of spectral characteristics: peak factor, spectral line width, and signal-to-noise ratio. Experiments were conducted on test signals simulating male (200–400 Hz) and female (220–880 Hz) voices. The results show that the proposed approach improves the accuracy of determining spectral components, most notably in the male baritone range (up to 5.42% improvement), by identifying fundamental frequencies more clearly and reducing side-lobe levels relative to the classical Hamming window. The findings indicate that adapting window functions to specific frequency ranges of voice signals is promising. The proposed algorithm can be used to improve biometric identification systems and other applications that require accurate spectral analysis of voice.
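The procedure summarized in the abstract can be illustrated with a minimal Python sketch. The abstract does not give the exact form of the modified window or of the quality metric, so everything below is an assumption made for illustration: the window is taken to be the generalized Hamming form w(n) = alpha - (1 - alpha) cos(2πn/(N - 1)), which reduces to the classical window at alpha = 0.54; the functions `modified_hamming`, `quality`, and `optimize_alpha`, the unit weights, the definitions of the three spectral characteristics, and the 300 Hz test tone are all hypothetical. The gradient is approximated by finite differences, and the search ascends the score, which is equivalent to gradient descent on its negative.

```python
import numpy as np


def modified_hamming(n_samples: int, alpha: float) -> np.ndarray:
    """Generalized Hamming window (assumed form); alpha = 0.54 is the classical window."""
    n = np.arange(n_samples)
    return alpha - (1.0 - alpha) * np.cos(2.0 * np.pi * n / (n_samples - 1))


def quality(alpha: float, signal: np.ndarray, weights=(1.0, 1.0, 1.0), band: int = 8) -> float:
    """Hypothetical weighted score: rewards peak factor and SNR, penalizes line width."""
    power = np.abs(np.fft.rfft(signal * modified_hamming(len(signal), alpha))) ** 2
    peak = int(np.argmax(power))
    lo, hi = max(peak - band, 0), min(peak + band + 1, len(power))
    local = power[lo:hi]
    bins = np.arange(lo, hi)
    # Peak factor: peak magnitude relative to the RMS magnitude of the spectrum.
    peak_factor = np.sqrt(power[peak] / power.mean())
    # Line width: RMS spread (in bins) of the power around the peak.
    width = np.sqrt(np.sum((bins - peak) ** 2 * local) / np.sum(local))
    # SNR in dB: power near the peak versus power in all other bins.
    noise = max(power.sum() - local.sum(), 1e-12)
    snr_db = 10.0 * np.log10(local.sum() / noise)
    w_pf, w_width, w_snr = weights
    return float(w_pf * peak_factor - w_width * width + w_snr * snr_db)


def optimize_alpha(signal, alpha0=0.54, lr=1e-3, steps=100, eps=1e-4, bounds=(0.5, 1.0)):
    """Finite-difference gradient ascent on the score; returns the best alpha seen."""
    alpha = best_alpha = alpha0
    best_score = quality(alpha0, signal)
    for _ in range(steps):
        # Central-difference estimate of d(score)/d(alpha).
        grad = (quality(alpha + eps, signal) - quality(alpha - eps, signal)) / (2.0 * eps)
        alpha = float(np.clip(alpha + lr * grad, *bounds))
        score = quality(alpha, signal)
        if score > best_score:
            best_alpha, best_score = alpha, score
    return best_alpha, best_score


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    fs, n = 8000, 1024  # hypothetical sampling rate and frame length
    t = np.arange(n) / fs
    # Hypothetical test signal: a 300 Hz tone in white noise.
    test_signal = np.sin(2 * np.pi * 300.0 * t) + 0.1 * rng.standard_normal(n)
    alpha, score = optimize_alpha(test_signal)
    print(f"alpha = {alpha:.4f}, score = {score:.3f} "
          f"(classical Hamming: {quality(0.54, test_signal):.3f})")
```

Because the sketch keeps the best parameter value visited, the returned score is never lower than that of the classical Hamming window on the same signal. How much it improves depends entirely on the assumed metric and weights, so it should not be compared with the figures reported in the paper.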
For citation: Shulzhenko A.D., Gorbunova D.A., Novoseltseva A.M., Davidchuk A.G. Modified window function based on the Hamming window for improving the accuracy of determining the voice spectrum in audio recordings. Modeling, Optimization and Information Technology. 2025;13(3). URL: https://moitvivt.ru/ru/journal/pdf?id=2016 DOI: 10.26102/2310-6018/2025.50.3.037 (In Russ.).
Received 03.07.2025
Revised 04.08.2025
Accepted 11.08.2025