Speech recognition technology in radiology

Nikita D. Kudryavtsev; Kristina A. Bardasova; Anna N. Khoruzhaya

doi:10.17816/DD321420

Authors: Kudryavtsev N.D.¹, Bardasova K.A.², Khoruzhaya A.N.¹
Affiliations:
1. Moscow Center for Diagnostics and Telemedicine
2. Ural State Medical University
Issue: Vol 4, No 2 (2023)
Pages: 185-196
Section: Reviews
URL: https://journals.rcsi.science/DD/article/view/146885
DOI: https://doi.org/10.17816/DD321420
ID: 146885

Abstract

Speech recognition devices are promising tools for the healthcare system. Speech recognition technology has a relatively long history in Western healthcare systems, dating back to the 1970s. However, it became widely used only at the beginning of the 21st century, when it began replacing medical transcriptionists. In Russian healthcare, the technology is relatively new: its active development began only in the early 2010s, and its implementation in healthcare started in the late 2010s. This delay is due to the idiosyncrasies of the Russian language and the limited computational power available at the beginning of the 21st century.

Currently, hardware and software systems for speech recognition are used to complete medical records by voice and can reduce the time needed to prepare reports for radiological examinations compared with traditional keyboard text input.

This literature review provides a brief history of the development of speech recognition technology and its application in radiology. Key scientific studies showing its efficacy in Western healthcare systems are summarized. The use of voice recognition technology in Russia is described, and its effectiveness is evaluated. The prospects for further development of this technology in Russian healthcare are outlined.

Keywords

medical records, radiation diagnostics, radiology, speech recognition software, voice input

About the authors

Nikita D. Kudryavtsev

Moscow Center for Diagnostics and Telemedicine

Email: KudryavtsevND@zdrav.mos.ru
ORCID iD: 0000-0003-4203-0630
SPIN-code: 1125-8637
Russian Federation, Moscow

Kristina A. Bardasova

Ural State Medical University

Email: bardasovakris@mail.ru
ORCID iD: 0009-0002-4310-1357
SPIN-code: 1156-7627
Russian Federation, Ekaterinburg

Anna N. Khoruzhaya

Moscow Center for Diagnostics and Telemedicine

Author for correspondence.
Email: KhoruzhayaAN@zdrav.mos.ru
ORCID iD: 0000-0003-4857-5404
SPIN-code: 7948-6427
Russian Federation, Moscow

Supplementary files

Fig. 1. A simplified scheme of the operation of a classical speech recognition system. An algorithm for recognizing the phrase “signs of osteochondrosis” is presented.

Fig. 2. Workplace of a radiologist at the Moscow Reference Center for Radiation Diagnostics, equipped with a speech recognition system. The process of completing medical records is shown.
