Expanding the Database of a Balanced Linguistic Corpus with Values from a Dictionary of Tonality (corpus experiment)

Abstract

This study develops and tests an algorithm for expanding a balanced dynamic linguistic corpus of more than 3 million tokens with connotative characteristics. To this end, the authors rely on original software solutions created at the Laboratory for Fundamental and Applied Issues of Virtual Education at Moscow State Linguistic University. The result is a properly functioning corpus whose individual fragments can be supplemented with data on the connotations of tokens and sentences.
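As a rough illustration of what supplementing tokens and sentences with connotation data can look like, here is a minimal Python sketch. It is not the authors' software: the dictionary entries, the class names (`Token`, `Sentence`), the helper functions, and the rule of averaging token values to get a sentence value are all assumptions made for this example.

```python
from dataclasses import dataclass
from statistics import mean
from typing import Dict, List, Optional

# Hypothetical tonality dictionary: lowercase word form -> polarity score.
# Real dictionaries of tonality are far larger and may use other scales.
TONALITY: Dict[str, float] = {"good": 0.5, "bad": -0.5, "terrible": -1.0}


@dataclass
class Token:
    text: str
    tonality: Optional[float] = None  # None: word not found in the dictionary


@dataclass
class Sentence:
    tokens: List[Token]
    tonality: Optional[float] = None  # None: no scored tokens in the sentence


def make_sentence(text: str) -> Sentence:
    """Split on whitespace; a real corpus would use a proper tokenizer."""
    return Sentence([Token(word) for word in text.split()])


def annotate(sentence: Sentence, dictionary: Dict[str, float]) -> Sentence:
    """Attach dictionary values to tokens, then derive a sentence value."""
    scores = []
    for token in sentence.tokens:
        token.tonality = dictionary.get(token.text.lower())
        if token.tonality is not None:
            scores.append(token.tonality)
    # Assumed aggregation rule: mean of the scored tokens.
    sentence.tonality = mean(scores) if scores else None
    return sentence


example = annotate(make_sentence("The plot is good but the ending is terrible"), TONALITY)
```

Because the annotation is stored next to each token rather than in a separate index, a single fragment of the corpus can be re-annotated on its own, which matches the abstract's point about supplementing individual fragments. Negation and multiword expressions are ignored in this sketch.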

About the authors

Alexey Gorozhanov

Moscow State Linguistic University

Corresponding author
Email: a.gorozhanov@linguanet.ru

Doctor of Philology (Dr. habil.), Associate Prof., Professor in the Department of German Language Grammar and History, Faculty for German Language

Russia

Darya Stepanova

Minsk State Linguistic University

Email: daryastepanova79@gmail.com

PhD (Philology), Associate Prof., Associate Professor in the Department of Theory and Practice of English Speech, Faculty for English Language

Belarus

Creative Commons License
This article is available under the Creative Commons Attribution 4.0 International License.
