Use of multimodal neural network techniques to assess quality of roadways

Abstract

This article addresses the automatic detection of pavement defects using multimodal neural network methods.

Aim. To develop and experimentally evaluate a multimodal neural network method for automatically detecting pavement defects using combined analysis of visual and three-dimensional data.

Methods. The Faster R-CNN model is used to detect damaged areas, the Swin Transformer Small model to classify visual fragments, and the PointNet model to analyze surface geometry from lidar data. The predictions from the three modalities are combined by weighted summation (weights 0.1, 0.6, and 0.4, respectively). Training and testing are conducted on the RSRD multimodal dataset, which includes RGB images and point clouds acquired under various road and weather conditions.
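The weighted summation described above can be illustrated with a minimal late-fusion sketch. It is not the authors' implementation: the modality names, the `fuse_scores` helper, and the example score vectors are hypothetical, and the mapping of the weights 0.1, 0.6, and 0.4 to Faster R-CNN, Swin Transformer, and PointNet is assumed from the order in which the models are listed. Because the stated weights sum to 1.1, the sketch optionally rescales by the weight total; whether the original method does so is not stated.

```python
from typing import Dict, List, Sequence

# Weights as given in the abstract; the assignment to modalities is an
# assumption based on the order in which the models are listed.
WEIGHTS: Dict[str, float] = {
    "detector": 0.1,          # Faster R-CNN (assumed)
    "image_classifier": 0.6,  # Swin Transformer Small (assumed)
    "point_cloud": 0.4,       # PointNet (assumed)
}


def fuse_scores(
    scores: Dict[str, Sequence[float]],
    weights: Dict[str, float] = WEIGHTS,
    normalize: bool = True,
) -> List[float]:
    """Combine per-class scores from several modalities by weighted sum."""
    if set(scores) != set(weights):
        raise ValueError("scores and weights must cover the same modalities")
    lengths = {len(vec) for vec in scores.values()}
    if len(lengths) != 1:
        raise ValueError("all modalities must score the same set of classes")

    fused = [0.0] * lengths.pop()
    for name, vec in scores.items():
        for i, score in enumerate(vec):
            fused[i] += weights[name] * score

    if normalize:
        # Rescale so the fused scores are comparable to probabilities.
        total = sum(weights.values())
        fused = [value / total for value in fused]
    return fused


# Hypothetical per-class scores for one fragment: [no defect, pothole].
example = {
    "detector": [0.2, 0.8],
    "image_classifier": [0.7, 0.3],
    "point_cloud": [0.4, 0.6],
}
fused = fuse_scores(example)
predicted_class = max(range(len(fused)), key=fused.__getitem__)
```

In this hypothetical example the image classifier carries the largest weight, so its preference for the first class outweighs the other two modalities.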

Results. Experiments show that the multimodal approach raises classification accuracy to as high as 95.57% and significantly improves defect detection metrics. For the pothole class, recall increased by 27% and F1-score by 20% compared with using the individual models.
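For reference, the per-class metrics reported above, recall (sometimes translated as completeness) and F1-score, follow the standard definitions sketched below. The helper name `precision_recall_f1` and the counts in the example are hypothetical and are not taken from the paper.

```python
from typing import Tuple


def precision_recall_f1(tp: int, fp: int, fn: int) -> Tuple[float, float, float]:
    """Return (precision, recall, F1) for one class from its counts.

    tp: defects of this class that were correctly detected
    fp: detections of this class that were wrong
    fn: defects of this class that were missed
    """
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0.0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical counts for a single class, for illustration only.
precision, recall, f1 = precision_recall_f1(tp=8, fp=2, fn=2)
```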

Conclusions. The developed architecture demonstrates high stability and accuracy in roadway analysis tasks. The results confirm the effectiveness of integrating visual and spatial data and the value of multimodal methods for building intelligent road infrastructure monitoring systems.

About the authors

Mikhail G. Gorodnichev

Moscow Technical University of Communications and Informatics

Email: m.g.gorodnichev@mtuci.ru
ORCID iD: 0000-0003-1739-9831
SPIN-code: 4576-9642

Candidate of Engineering Sciences, Associate Professor, Dean of the Faculty of Information Technology

8A Aviamotornaya Street, Moscow, 111024, Russian Federation

Ksenia A. Polyantseva

Moscow Technical University of Communications and Informatics

Email: k.a.poliantseva@mtuci.ru
ORCID iD: 0000-0002-7102-4208
SPIN-code: 8112-8560

Candidate of Technical Sciences, Associate Professor of the Department of Data Mining

8A Aviamotornaya Street, Moscow, 111024, Russian Federation

Igor D. Razumovsky

Moscow Technical University of Communications and Informatics

Author for correspondence.
Email: igor.raz@list.ru

Student

8A Aviamotornaya Street, Moscow, 111024, Russian Federation



Copyright (c) 2026 Gorodnichev M.G., Polyantseva K.A., Razumovsky I.D.

Creative Commons License
This work is licensed under a Creative Commons Attribution 4.0 International License.
