The quality improvement method for detecting attacks on web applications using pre-trained natural language models

Olga A. Kovaleva; Ковалева Ольга Александровна; Alexey Vladimirovich Samokhvalov; Самохвалов Алексей Владимирович; Mikhail A. Liashkov; Ляшков Михаил Андреевич; Sergey Yurevich Pchelintsev; Пчелинцев Сергей Юрьевич

doi:10.18500/1816-9791-2024-24-3-442-451

The quality improvement method for detecting attacks on web applications using pre-trained natural language models

Authors: Kovaleva O.A.¹^,2, Samokhvalov A.V.¹, Liashkov M.A.¹, Pchelintsev S.Y.¹
Affiliations:
1. Derzhavin Tambov State University
2. Tambov State Technical University
Issue: Vol 24, No 3 (2024)
Pages: 442-451
Section: Computer Sciences
URL: https://journals.rcsi.science/1816-9791/article/view/353383
DOI: https://doi.org/10.18500/1816-9791-2024-24-3-442-451
EDN: https://elibrary.ru/OJWHMC
ID: 353383

Cite item

Full Text

Abstract
About the authors
References
Supplementary files
Statistics

Abstract

This paper explores the use of deep learning techniques to improve the performance of web application firewalls (WAFs), describes a specific method for improving the performance of web application firewalls, and presents the results of its testing on publicly available CSIC 2010 data. Most web application firewalls work on the basis of rules that have been compiled by experts. When running, firewalls inspect HTTP requests exchanged between client and server to detect attacks and block potential threats. Manual drafting of rules requires experts' time, and distributed ready-made rule sets do not take into account the specifics of particular user applications, therefore they allow many false positives and miss many network attacks. In recent years, the use of pretrained language models has led to significant improvements in a diverse set of natural language processing tasks as they are able to perform knowledge transfer. The article describes the adaptation of these approaches to the field of information security, i.e. the use of a pretrained language model as a feature extractor to match an HTTP request with a feature vector. These vectors are then used to train the classifier. We offer a solution that consists of two stages. In the first step, we create a deep pre-trained language model based on normal HTTP requests to the web application. In the second step, we use this model as a feature extractor and train a one-class classifier. Both steps are performed for each application. The experimental results show that the proposed approach significantly outperforms the classical Mod-Security approaches based on rules configured using OWASP CRS and does not require the involvement of a security expert to define trigger rules.

Keywords

firewalls, HTTP request analysis, pre-trained language models

References

Hacker A. J. Importance of web application firewall technology for protecting web-based resources. ICSA Labs an Independent Verizon Business, 2008, pp. 7. Available at: https://img2.helpnetsecurity.com/dl/articles/ICSA_Whitepaper.pdf (accessed December 28, 2022).
Sureda Riera T., Bermejo Higuera J. R., Bermejo Higuera J., Martinez Herraiz J. J., Sicilia Montalvo J. A. Prevention and fighting against web attacks through anomaly detection technology. A systematic review. Sustainability, 2020, vol. 12, iss. 12, art. 4945. https://doi.org/10.3390/su12124945
Betarte G., Martinez R., Pardo A. Web application attacks detection using machine learning techniques. 17th IEEE International Conference on Machine Learning and Applications (ICMLA). Orlando, 2018, pp. 1065–1072. https://doi.org/10.1109/ICMLA.2018.00174
Betarte G., Gimenez E., Martinez R., Pardo A. Improving web application rewalls through anomaly detection. 17th IEEE International Conference on Machine Learning and Applications (ICMLA). Orlando, 2018, pp. 779–784. https://doi.org/10.1109/ICMLA.2018.00124
Martinez R. Enhancing web application attack detection using machine learning. Montevideo, UdelaR – Area Informatica del Pedeciba, 2019. 82 p.
Liu Y., Ott M., Goyal N., Du J., Joshi M., Chen D., Levy O., Lewis M., Zettlemoyer L., Stoyanov V. RoBERTa: A robustly optimized BERT pretraining approach. ICLR 2020 Conference Blind Submission. Addis Ababa, 2020. Available at: https://openreview.net/forum?id=SyxS0T4tvS (accessed January 15, 2023).
Devlin J., Chang M. W., Lee K., Toutanova K. BERT: Pre-training of deep bidirectional transformers for language understanding. NAACL-HLT 2019. Minneapolis, 2019, pp. 4171–4186.
Radford A., Wu J., Child R., Luan D., Amodei D., Sutskever I. Language models are unsupervised multitask learners. OpenAI Blog, 2019, vol. 1, iss. 8, pp. 9.
Peters M. E., Neumann M., Iyyer M., Gardner M., Clark C., Lee K., Zettlemoyer L. Deep contextualized word representations. Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies. New Orleans, 2018, pp. 2227–2237. https://doi.org/10.18653/v1/N18-1202
Mikolov T., Chen K., Corrado G., Dean J. Efficient estimation of word representations in vector space. Computer Science, 2013. arXiv:1301.3781v3 [cs.CL]. https://doi.org/10.48550/arXiv.1301.3781
Bengio Y., Ducharme R., Vincent P., Janvin C. A neural probabilistic language model. Journal of Machine Learning Research, 2003, vol. 3, pp. 1137–1155.
Olah C. Deep learning, NLP, and representations. GitHub blog, posted on 2014, July, 7. Available at: https://colah.github.io/posts/2014-07-NLP-RNNs-Representations/ (accessed January 15, 2023).
Luong M. T., Socher R., Manning C. D. Better word representations with recursive neural networks for morphology. Proceedings of the Seventeenth Conference on Computational Natural Language Learning, 2013, pp. 104–113.
Zou W. Y., Socher R., Cer D., Manning C. D. Bilingual word embeddings for phrase-based machine translation. Proceedings of the 2013 Conference on Empirical Methods in Natural Language Processing, 2013, pp. 1393–1398.
Ethayarajh K. How Contextual are Contextualized Word Representations? Comparing the Geometry of BERT, ELMo, and GPT-2 Embeddings. Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP). Hong Kong, 2019, pp. 55–65. https://doi.org/10.18653/v1/D19-1006
Kruegel C., Vigna G. Anomaly detection of web-based attacks. Proceedings of CCS, 2003, pp. 251–261. https://doi.org/10.1145/948109.948144
Corona I., Ariu D., Giacinto G. HMM-Web: A framework for the detection of attacks against web applications. Proceedings of ICC, 2009, pp. 1–6. https://doi.org/10.1109/ICC.2009.5199054
Torrano-Gimenez C., Perez-Villegas A., Maranon G. A. An anomaly-based approach for intrusion detection in web traffic. Journal of Information Assurance and Security, 2010, vol. 5, pp. 446–454.
Yuan G., Li B., Yao Y., Zhang S. Deep learning enabled subspace spectral ensemble clustering approach for web anomaly detection. 2017 International Joint Conference on Neural Networks (IJCNN). Anchorage, AK, USA, 2017, pp. 3896–3903. https://doi.org/10.1109/IJCNN.2017.7966347
Yu Y., Yan H., Guan H., Zhou H. DeepHTTP: Anomalous HTTP Traffic Detection and Malicious Pattern Mining Based on Deep Learning. IET Information Security. Singapore, 2020, vol. 1299. https://doi.org/10.1007/978-981-33-4922-3_11
Qin Z. Q., Ma X. K., Wang Y. J. Attentional payload anomaly detector for web applications. International Conference on Neural Information Processing. Springer, 2018, pp. 588–599. https://doi.org/10.1007/978-3-030-04212-7_52
Vartouni A. M., Teshnehlab M., Kashi S. S. Leveraging deep neural networks for anomaly-based web application firewall. IET Information Security, 2019, iss. 13, pp. 352–361. https://doi.org/10.1049/iet-ifs.2018.5404
Sennrich R., Haddow B., Birch A. Neural machine translation of rare words with subword units. Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics. Berlin, 2015, pp. 1715–1725. https://doi.org/10.18653/v1/P16-1162
Vaswani A., Shazeer N., Parmar N., Uszkoreit J., Jones L., Gomez A. N., Kaiser L., Polosukhin I. Attention is all you need. Advances in Neural Information Processing Systems 30, 2017. https://doi.org/10.48550/arXiv.1706.03762
Scholkopf B., Platt J. C., Shawe-Taylor J., Smola A. J., Williamson R. C. Estimating the support of a high-dimensional distribution. Neural Computation, 2001, iss. 13, pp. 1443–1471. https://doi.org/10.1162/089976601750264965

Supplementary files

Supplementary Files

Action

1. JATS XML

Download

Username
Password
Remember me

Forgot password?	Register

Username
Password
Remember me

Forgot password?	Register

Vol 25, No 3 (2025)

Vol 25, No 3 (2025)

The quality improvement method for detecting attacks on web applications using pre-trained natural language models

Full Text

Abstract

Keywords

About the authors

Olga A. Kovaleva

Alexey Vladimirovich Samokhvalov

Mikhail A. Liashkov

Sergey Yurevich Pchelintsev

References

Supplementary files