Transformer-Based Classification of User Queries for Medical Consultancy

Abstract

A new approach is presented that uses the RuBERT model to classify user queries in the field of medical consultations according to the expert's specialization. In the course of the study, an extensive dataset was collected and used to fine-tune the RuBERT model. The resulting model achieved an F1-score above 91.8%, both under block cross-validation and under a train/test split of the dataset. The approach demonstrates strong generalization across various medical subdomains, such as cardiology, neurology, and dermatology. The proposed approach reduces the time needed to identify the most suitable specialist and thereby improves the quality of consultations and medical care.
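The setup the abstract describes (fine-tuning RuBERT for specialty classification and scoring it with a weighted F1-score) can be sketched with the Hugging Face transformers and datasets libraries. Everything below is an illustrative assumption rather than the authors' configuration: the DeepPavlov/rubert-base-cased checkpoint, the three-label specialty set, the toy queries, and the hyperparameters.

    # A minimal sketch, assuming Hugging Face `transformers`, `datasets`,
    # and scikit-learn; checkpoint, labels, texts, and hyperparameters
    # are hypothetical stand-ins for the paper's actual setup.
    from datasets import Dataset
    from sklearn.metrics import f1_score
    from transformers import (AutoModelForSequenceClassification,
                              AutoTokenizer, Trainer, TrainingArguments)

    # Hypothetical label set: three of the specialties named in the abstract.
    LABELS = ["cardiology", "neurology", "dermatology"]

    CHECKPOINT = "DeepPavlov/rubert-base-cased"  # RuBERT (Kuratov & Arkhipov)
    tokenizer = AutoTokenizer.from_pretrained(CHECKPOINT)
    model = AutoModelForSequenceClassification.from_pretrained(
        CHECKPOINT, num_labels=len(LABELS))

    # Toy user queries standing in for the collected consultation dataset.
    data = Dataset.from_dict({
        "text": [
            "Боль в груди и одышка при нагрузке",     # chest pain, dyspnea
            "Частые головные боли и головокружение",  # headaches, dizziness
            "Сыпь и зуд на коже предплечья",          # rash and itching
        ],
        "label": [0, 1, 2],
    }).train_test_split(test_size=0.34)

    def tokenize(batch):
        # Fixed-length padding so the default collator can batch examples.
        return tokenizer(batch["text"], truncation=True,
                         padding="max_length", max_length=128)

    data = data.map(tokenize, batched=True)

    def compute_metrics(eval_pred):
        logits, labels = eval_pred
        return {"f1": f1_score(labels, logits.argmax(-1),
                               average="weighted")}

    trainer = Trainer(
        model=model,
        args=TrainingArguments(output_dir="rubert-specialty",
                               num_train_epochs=3,
                               per_device_train_batch_size=16),
        train_dataset=data["train"],
        eval_dataset=data["test"],
        compute_metrics=compute_metrics,
    )
    trainer.train()
    print(trainer.evaluate())  # reports weighted F1 on the held-out split

A plain train/test split is shown for brevity; the block cross-validation mentioned in the abstract could be approximated by repeating this loop over grouped folds, for example with scikit-learn's GroupKFold.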

Copyright (c) 2024 The Russian Academy of Sciences
