Using convolutional neural networks for acoustic-based emergency vehicle detection
- Authors: Lisov A.A.1, Kulganatov A.Z.1, Panishev S.A.1
- Affiliations: South Ural State University
- Issue: Vol 9, No 1 (2023)
- Pages: 95-107
- Section: Original studies
- URL: https://journals.rcsi.science/transj/article/view/126662
- DOI: https://doi.org/10.17816/transsyst20239195-107
- ID: 126662
Abstract
Background: A siren is a special signal given by emergency vehicles such as fire trucks, police cars and ambulances to warn drivers or pedestrians on the road. However, drivers sometimes may not hear the siren due to the sound insulation of a modern car, the noise of city traffic, or their own inattention. This problem can lead to a delay in the provision of emergency services or even to traffic accidents.
Aim: To develop an acoustic method for detecting emergency vehicles on the road using convolutional neural networks.
Materials and Methods: The algorithm converts sound from the external environment into a spectrogram, which is then analyzed by a convolutional neural network. The siren and city traffic sounds came from an open dataset (“Emergency Vehicle Siren Sounds”) compiled from Internet sources such as Google and YouTube and stored in the “.wav” audio format. The code was developed on the Google Colab platform using cloud storage.
Results: The experiments showed that the proposed method and neural network model identify the type of sound with an average accuracy of 93.3 % and a recognition time of 0.0004 s ± 5 %.
Conclusion: Using the developed technology to recognize siren sounds in city noise will improve traffic safety and increase the chances of preventing a dangerous situation. The system can also assist hearing-impaired people, both while driving and in everyday life, by notifying them in time that emergency services are nearby.
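The sound-to-spectrogram step named in Materials and Methods can be illustrated with a minimal sketch. This is not the authors' code: the function name `log_spectrogram`, the frame length, the hop size, the decibel scaling, and the synthetic siren-like test tone are all assumptions made for illustration. A real pipeline would load the dataset's “.wav” files instead and pass the resulting array to the convolutional network as a single-channel image.

```python
import numpy as np


def log_spectrogram(signal, sample_rate, frame_len=1024, hop=512):
    """Return (freqs, times, spectrogram in dB) for a mono signal.

    frame_len and hop are illustrative defaults, not values from the paper.
    """
    signal = np.asarray(signal, dtype=np.float64)
    if signal.ndim != 1 or len(signal) < frame_len:
        raise ValueError("expected a mono signal of at least frame_len samples")
    window = np.hanning(frame_len)
    n_frames = 1 + (len(signal) - frame_len) // hop
    # Slice the signal into overlapping frames, one row per frame.
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    frames = signal[idx] * window
    # Magnitude of the one-sided FFT of each windowed frame.
    magnitude = np.abs(np.fft.rfft(frames, axis=1))
    # Convert to decibels; the small constant avoids log(0).
    spec_db = 20.0 * np.log10(magnitude + 1e-10)
    freqs = np.fft.rfftfreq(frame_len, d=1.0 / sample_rate)
    times = (np.arange(n_frames) * hop + frame_len / 2) / sample_rate
    return freqs, times, spec_db.T  # shape: (frequency bins, time frames)


# Synthetic stand-in for a siren (assumption): a tone whose frequency
# sweeps between 600 Hz and 1200 Hz twice per 4 seconds.
sample_rate = 16000
t = np.arange(2 * sample_rate) / sample_rate  # 2 seconds of audio
inst_freq = 900 + 300 * np.sin(2 * np.pi * 0.5 * t)
phase = 2 * np.pi * np.cumsum(inst_freq) / sample_rate
siren_like = np.sin(phase)

freqs, times, spec = log_spectrogram(siren_like, sample_rate)
peak_freqs = freqs[spec.argmax(axis=0)]  # dominant frequency per frame
print(spec.shape, peak_freqs.min(), peak_freqs.max())
```

The dominant frequency per frame traces the sweep of the test tone, which is the kind of time-frequency pattern a convolutional network can learn to separate from broadband traffic noise.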
About the authors
Andrey A. Lisov
South Ural State University
Author for correspondence.
Email: lisov.andrey2013@yandex.ru
ORCID iD: 0000-0001-7282-8470
SPIN-code: 1956-3662
postgraduate student
Russian Federation, Chelyabinsk
Askar Z. Kulganatov
South Ural State University
Email: kulganatov97@gmail.com
ORCID iD: 0000-0002-7576-7949
SPIN-code: 7607-9723
postgraduate student
Russian Federation, Chelyabinsk
Sergei A. Panishev
South Ural State University
Email: panishef.serega@mail.ru
ORCID iD: 0000-0003-2753-2341
SPIN-code: 2676-5207
postgraduate student
Russian Federation, Chelyabinsk
References
- Kanzaria HK, Probst MA, Hsia RY. Emergency department death rates dropped by nearly 50 percent, 1997–2011. Health Affairs. 2016 Jul 1;35(7):1303-8. doi: 10.1377/hlthaff.2015.1394
- Lee J, Park J, Kim KL, Nam J. Sample-level deep convolutional neural networks for music auto-tagging using raw waveforms. arXiv preprint arXiv:1703.01789. 2017 Mar 6. doi: 10.48550/arXiv.1703.01789
- Zhu Z, Engel JH, Hannun A. Learning multiscale features directly from waveforms. arXiv preprint arXiv:1603.09509. 2016 Mar 31. doi: 10.48550/arXiv.1603.09509
- Choi K, Fazekas G, Sandler M. Automatic tagging using deep convolutional neural networks. arXiv preprint arXiv:1606.00298. 2016 Jun 1. doi: 10.48550/arXiv.1606.00298
- Nasrullah Z, Zhao Y. Music artist classification with convolutional recurrent neural networks. In: 2019 International Joint Conference on Neural Networks (IJCNN) 2019 Jul 14 (pp. 1-8). IEEE. doi: 10.1109/IJCNN.2019.8851988
- Wang Z, Muknahallipatna S, Fan M, et al. Music classification using an improved CRNN with multi-directional spatial dependencies in both time and frequency dimensions. In: 2019 International Joint Conference on Neural Networks (IJCNN) 2019 Jul 14 (pp. 1-8). IEEE. doi: 10.1109/IJCNN.2019.8852128
- Dieleman S, Brakel P, Schrauwen B. Audio-based music classification with a pretrained convolutional network. In: 12th International Society for Music Information Retrieval Conference (ISMIR-2011) 2011 (pp. 669-674). University of Miami.
- Chen MT, Li BJ, Chi TS. CNN based two-stage multi-resolution end-to-end model for singing melody extraction. In: ICASSP 2019-2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2019 May 12 (pp. 1005-1009). IEEE. doi: 10.1109/ICASSP.2019.8683630
- Phan H, Koch P, Katzberg F, et al. Audio scene classification with deep recurrent neural networks. arXiv preprint arXiv:1703.04770. 2017 Mar 14. doi: 10.48550/arXiv.1703.04770
- Gimeno P, Viñals I, Ortega A, et al. Multiclass audio segmentation based on recurrent neural networks for broadcast domain data. EURASIP Journal on Audio, Speech, and Music Processing. 2020 Dec;2020:1-9.
- Dai J, Liang S, Xue W, et al. Long short-term memory recurrent neural network based segment features for music genre classification. In: 2016 10th International Symposium on Chinese Spoken Language Processing (ISCSLP) 2016 Oct 17 (pp. 1-5). IEEE. doi: 10.1109/ISCSLP.2016.7918369
- Zhang Z, Xu S, Zhang S, et al. Attention based convolutional recurrent neural network for environmental sound classification. Neurocomputing. 2021 Sep 17;453:896-903. doi: 10.1016/j.neucom.2020.08.069
- Wang H, Zou Y, Chong D, Wang W. Environmental sound classification with parallel temporal-spectral attention. arXiv preprint arXiv:1912.06808. 2019 Dec 14. doi: 10.48550/arXiv.1912.06808
- Sang J, Park S, Lee J. Convolutional recurrent neural networks for urban sound classification using raw waveforms. In: 2018 26th European Signal Processing Conference (EUSIPCO) 2018 Sep 3 (pp. 2444-2448). IEEE. doi: 10.23919/EUSIPCO.2018.8553247
- Choi K, Fazekas G, Sandler M, Cho K. Convolutional recurrent neural networks for music classification. In: 2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2017 Mar 5 (pp. 2392-2396). IEEE. doi: 10.1109/ICASSP.2017.7952585
- Gwardys G, Grzywczak D. Deep image features in music information retrieval. International Journal of Electronics and Telecommunications. 2014;60:321-6. doi: 10.2478/eletel-2014-0042
- Krizhevsky A, Sutskever I, Hinton GE. Imagenet classification with deep convolutional neural networks. Communications of the ACM. 2017 May 24;60(6):84-90. doi: 10.1145/3065386
- Deng J, Dong W, Socher R, et al. Imagenet: A large-scale hierarchical image database. In: 2009 IEEE Conference on Computer Vision and Pattern Recognition 2009 Jun 20 (pp. 248-255). IEEE. doi: 10.1109/CVPR.2009.5206848
- Emergency Vehicle Siren Sounds [Internet]. Kaggle [cited 2023 February 23]. Available from: https://www.kaggle.com/vishnu0399/emergency-vehicle-siren-sounds
- CNN for audio recognition. GitHub [cited 2023 February 23]. Available from: https://github.com/AnLiMan/CNN-for-audio-recognition