Technique for Identifying Texts Generated by Large Language Models
- Authors: Fedotova A.M.¹, Romanov A.S.¹
- Affiliations: ¹ TUSUR
- Issue: Vol 24, No 5 (2025)
- Pages: 1444-1470
- Section: Artificial intelligence, knowledge and data engineering
- URL: https://journals.rcsi.science/2713-3192/article/view/350763
- DOI: https://doi.org/10.15622/ia.24.5.7
- ID: 350763
About the authors
A. M. Fedotova
TUSUR
Email: afedotowaa@yandex.ru
Lenin Ave. 40
A. S. Romanov
TUSUR
Email: alexx.romanov@gmail.com
Lenin Ave. 40