Substantive criteria for referring statements from texts to "events" and "factors"
- Authors: Loginova I.V. (1), Piekalnits A.S. (1), Sabidaeva E.A. (1), Antasheva M.S. (1), Morozov L.A. (2)
- Affiliations:
  - (1) National Research University "Higher School of Economics"
  - (2) PJSC Sberbank
- Issue: No 4 (2024)
- Pages: 93-110
- Section: Analysis of Textual and Graphical Information
- URL: https://journals.rcsi.science/2071-8594/article/view/278303
- DOI: https://doi.org/10.14357/20718594240408
- EDN: https://elibrary.ru/DDBAJC
- ID: 278303
Abstract
This paper aims to improve and automate language models that extract statements about events and factors from text documents, using a purpose-built system of linguistic markers. It reports the results of testing text-mining models for event and factor extraction on analytical studies in human potential, the social sciences, and the humanities. The linguistic models are evaluated by comparing results obtained in three modes: automatic, manual (with expert-analytical validation), and semi-automatic (using the implemented system of linguistic markers). The proposed approaches improved performance in extracting statements that contain events and factors.
About the authors
Irina V. Loginova
National Research University "Higher School of Economics"
Author for correspondence.
Email: iloginova@hse.ru
Head of Department
Russian Federation, Moscow
Anna S. Piekalnits
National Research University "Higher School of Economics"
Email: apiekalnits@hse.ru
Leading Expert
Russian Federation, Moscow
Elizaveta A. Sabidaeva
National Research University "Higher School of Economics"
Email: esabidaeva@hse.ru
Leading Expert
Russian Federation, Moscow
Mariia S. Antasheva
National Research University "Higher School of Economics"
Email: msantasheva@hse.ru
Research Intern
Russian Federation, Moscow
Lev A. Morozov
PJSC Sberbank
Email: lamorozov@sberbank.ru
Leading Backend Developer
Russian Federation, Moscow
