Reference medical datasets (MosMedData) for independent external evaluation of algorithms based on artificial intelligence in diagnostics
- Authors: Pavlov N.A., Andreychenko A.E., Vladzymyrskyy A.V., Revazyan A.A., Kirpichev Y.S., Morozov S.P.
- Affiliations: Moscow Center for Diagnostics and Telemedicine
- Issue: Vol 2, No 1 (2021)
- Pages: 49-66
- Section: Technical Reports
- URL: https://journals.rcsi.science/DD/article/view/60635
- DOI: https://doi.org/10.17816/DD60635
- ID: 60635
Abstract
The article describes a novel approach to creating annotated medical datasets for testing artificial intelligence-based diagnostic solutions. It sets out four stages of dataset formation: planning, selection of initial data, annotation and verification, and documentation. It also presents examples of datasets created using the described methods. The technique is scalable and versatile, and it can be applied to other areas of medicine and healthcare that are being automated and developed using artificial intelligence and big data technologies.
About the authors
Nikolay A. Pavlov
Moscow Center for Diagnostics and Telemedicine
Author for correspondence.
Email: n.pavlov@npcmr.ru
ORCID iD: 0000-0002-4309-1868
SPIN-code: 9960-4160
https://pavlov.rocks
Russian Federation, 28-1, Srednyaya Kalitnikovskaya street, 109029, Moscow
Anna E. Andreychenko
Moscow Center for Diagnostics and Telemedicine
Email: a.andreychenko@npcmr.ru
ORCID iD: 0000-0001-6359-0763
SPIN-code: 6625-4186
PhD
Russian Federation, 28-1, Srednyaya Kalitnikovskaya street, 109029, Moscow
Anton V. Vladzymyrskyy
Moscow Center for Diagnostics and Telemedicine
Email: a.vladzimirsky@npcmr.ru
ORCID iD: 0000-0002-2990-7736
SPIN-code: 3602-7120
MD, Dr. Sci. (Med.)
Russian Federation, 28-1, Srednyaya Kalitnikovskaya street, 109029, Moscow
Anush A. Revazyan
Moscow Center for Diagnostics and Telemedicine
Email: anushrevazyan@gmail.com
ORCID iD: 0000-0003-1589-2382
Russian Federation, 28-1, Srednyaya Kalitnikovskaya street, 109029, Moscow
Yury S. Kirpichev
Moscow Center for Diagnostics and Telemedicine
Email: y.kirpichev@npcmr.ru
ORCID iD: 0000-0002-9583-5187
SPIN-code: 3362-3428
Russian Federation, 28-1, Srednyaya Kalitnikovskaya street, 109029, Moscow
Sergey P. Morozov
Moscow Center for Diagnostics and Telemedicine
Email: morozov@npcmr.ru
ORCID iD: 0000-0001-6545-6170
SPIN-code: 8542-1720
MD, Dr. Sci. (Med.), Professor
Russian Federation, 28-1, Srednyaya Kalitnikovskaya street, 109029, Moscow