The scientific journal Modeling, Optimization and Information Technology
Online media
ISSN 2310-6018

Reconstruction c, φ and E50 from laboratory data: interpretable ensemble and model comparison

Tishin N.R.

UDC 004.942+624.131.37
DOI: 10.26102/2310-6018/2026.56.5.002

The article addresses the reconstruction of soil strength and deformation characteristics, namely cohesion c, internal friction angle φ, and secant deformation modulus E50, from the physical and classification features available in routine laboratory reports. The problem is relevant because, in engineering-geological practice, mechanical parameters are not determined for every sample, although they are essential for foundation design calculations and for parameterizing geotechnical models. The study draws on an archive of laboratory soil data that underwent quality control, filtering, and informative feature engineering, followed by independent external validation. Machine learning models for tabular data (CatBoost, FT-Transformer, and a multitask neural network) are compared, and an interpretable ensemble of these models is also considered. Feature importance analysis is used to assess the physical consistency of the predictions. The best performance is achieved by an ensemble dominated by CatBoost, FT-Transformer (0.10) + CatBoost (0.90), which yields mean WAPE = 13.16%, mean R² = 0.877, and mean Acc±20% = 76.36%. On the test set, the best solutions reconstruct the target parameters with high quality, and external validation on an independent site confirms the robustness of the approach. The parameters c and φ are reconstructed most reliably, whereas E50 is harder to predict because it is more sensitive to testing conditions and to the structural features of the soil.
The practical significance of the study is that the proposed approach enables a justified reconstruction of missing mechanical soil parameters from standard laboratory test data. It can be used in digital systems for engineering-geological modeling, laboratory data processing, and the preparation of design parameters for engineering practice.
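
The reported ensemble and its quality metrics can be illustrated with a short sketch. It is not the author's code. It assumes that the numbers in parentheses, FT-Transformer (0.10) + CatBoost (0.90), are blending weights applied to the two models' predictions, that WAPE is the sum of absolute errors divided by the sum of absolute true values, and that Acc±20% is the share of predictions within ±20% of the measured value. The abstract does not define these metrics, so the function names and formulas below are hypothetical, and the sample numbers are invented for illustration.

```python
from typing import List, Sequence


def blend(pred_ft: Sequence[float], pred_cb: Sequence[float],
          w_ft: float = 0.10, w_cb: float = 0.90) -> List[float]:
    """Weighted average of two models' predictions for one target.

    Default weights follow the abstract (assumed to be blending weights).
    """
    if len(pred_ft) != len(pred_cb):
        raise ValueError("prediction vectors must have the same length")
    return [w_ft * a + w_cb * b for a, b in zip(pred_ft, pred_cb)]


def wape(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Assumed definition: 100 * sum|y - y_hat| / sum|y|, in percent."""
    num = sum(abs(t - p) for t, p in zip(y_true, y_pred))
    den = sum(abs(t) for t in y_true)
    return 100.0 * num / den


def r2(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_t = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_t) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def acc_within(y_true: Sequence[float], y_pred: Sequence[float],
               tol: float = 0.20) -> float:
    """Assumed definition: percent of predictions with relative error <= tol."""
    hits = sum(abs(p - t) <= tol * abs(t) for t, p in zip(y_true, y_pred))
    return 100.0 * hits / len(y_true)


if __name__ == "__main__":
    # Illustrative friction angles in degrees, not data from the study.
    phi_true = [18.0, 22.0, 27.0, 31.0]
    phi_ft = [20.0, 21.0, 24.0, 33.0]   # hypothetical FT-Transformer output
    phi_cb = [18.5, 22.5, 26.0, 30.0]   # hypothetical CatBoost output

    phi_ens = blend(phi_ft, phi_cb)
    print("WAPE, %:", wape(phi_true, phi_ens))
    print("R2:", r2(phi_true, phi_ens))
    print("Acc within 20%, %:", acc_within(phi_true, phi_ens))
```

Under these assumptions, the "mean" values quoted in the abstract would correspond to averaging each metric over the three targets c, φ, and E50.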

Tishin Nikita Romanovich

Email: tnick1502@mail.ru

JSC MOSTDORGEOTREST
Bauman Moscow State Technical University

Moscow, Russian Federation

Keywords: engineering geology, soil mechanics, parameter reconstruction, tabular data, CatBoost, FT-Transformer, multi-task learning, ensembling, SHAP

For citation: Tishin N.R. Reconstruction c, φ and E50 from laboratory data: interpretable ensemble and model comparison. Modeling, Optimization and Information Technology. 2026;14(5). URL: https://moitvivt.ru/ru/journal/article?id=2249 DOI: 10.26102/2310-6018/2026.56.5.002 (In Russ.).

© Tishin N.R. This article is published under the Creative Commons Attribution-NonCommercial 4.0 International license (CC BY-NC 4.0)
Received 25.02.2026

Revised 07.04.2026

Accepted 30.04.2026