Developing a computer system for student learning based on vision-language models
- Authors: Shchetinin E.Y. (1), Glushkova A.G. (2), Demidova A.V. (3)
- Affiliations:
  1. Financial University under the Government of the Russian Federation
  2. Endeavor
  3. RUDN University
- Issue: Vol 32, No 2 (2024)
- Pages: 234-241
- Section: Articles
- URL: https://journals.rcsi.science/2658-4670/article/view/315422
- DOI: https://doi.org/10.22363/2658-4670-2024-32-2-234-241
- EDN: https://elibrary.ru/CUCXTY
- ID: 315422
Abstract
In recent years, artificial intelligence methods have been developed in various fields, particularly in education. Developing computer systems for student learning is an important task that can significantly improve learning outcomes. Deep learning methods have gained immense popularity in the educational process. The most successful among them are models that account for the multimodal nature of information, in particular the combination of text, sound, images, and video. The difficulty in processing such data is that combining multimodal inputs by channel concatenation methods that ignore the heterogeneity of the modalities is inefficient. To solve this problem, this paper proposes an inter-channel attention module. The paper presents a vision-language computer system for student learning based on concatenating multimodal input data with the inter-channel attention module. It is shown that effective and flexible learning systems and technologies built on such models make it possible to adapt the educational process to the individual needs of students and to increase its efficiency.
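The abstract does not specify how the inter-channel attention module is built. As a rough illustration only, the sketch below shows one generic form of channel attention applied after concatenating text and image channels: each output channel becomes a softmax-weighted mix of all channels, so the fusion is no longer a plain concatenation. The function name `inter_channel_attention`, the toy feature values, and the parameter-free dot-product scoring are all assumptions for this example and may differ from the authors' module.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    peak = max(row)
    exps = [math.exp(v - peak) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def inter_channel_attention(channels):
    """Hypothetical channel attention over C equal-length feature vectors.

    Returns the re-weighted channels and the C x C attention matrix.
    """
    # Affinity between every pair of channels (dot product).
    scores = [[sum(a * b for a, b in zip(ci, cj)) for cj in channels]
              for ci in channels]
    weights = [softmax(row) for row in scores]

    length = len(channels[0])
    fused = []
    for i, ci in enumerate(channels):
        # Attention-weighted mix of all channels for output channel i.
        mixed = [sum(weights[i][j] * channels[j][k]
                     for j in range(len(channels)))
                 for k in range(length)]
        # Residual connection keeps the original channel content.
        fused.append([x + y for x, y in zip(ci, mixed)])
    return fused, weights


# Toy features (assumed values): two text channels and one image channel.
text_features = [[0.2, 0.1, 0.4], [0.0, 0.3, 0.1]]
image_features = [[0.5, 0.2, 0.1]]

# Plain concatenation along the channel axis, then attention across channels.
fused, attn = inter_channel_attention(text_features + image_features)
```

In a trained system the scoring step would normally involve learned projections; this sketch omits them to keep the channel-mixing idea visible.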
About the authors
Eugeny Yu. Shchetinin
Financial University under the Government of the Russian Federation
Author for correspondence.
Email: riviera-molto@mail.ru
ORCID iD: 0000-0003-3651-7629
Scopus Author ID: 16408533100
ResearcherId: O-8287-2017
Doctor of Physical and Mathematical Sciences, Lecturer at the Department of Mathematics
49 Leningradsky Ave, Moscow, 125993, Russian Federation
Anastasia G. Glushkova
Endeavor
Email: aglushkova@endeavorco.com
ORCID iD: 0000-0002-8285-0847
Scopus Author ID: 57485591900
Researcher
Endeavor, London
Anastasia V. Demidova
RUDN University
Email: demidova-av@rudn.ru
ORCID iD: 0000-0003-1000-9650
Candidate of Physical and Mathematical Sciences, Assistant Professor at the Department of Probability Theory and Cyber Security
6 Miklukho-Maklaya St, Moscow, 117198, Russian Federation
