The Hybrid Method for Accurate Patent Classification
- Authors: Yadrintsev V.1,2, Sochenkov I.1,3
- Affiliations:
- Federal Research Center Computer Science and Control of the Russian Academy of Sciences
- Peoples’ Friendship University of Russia (RUDN University)
- Lomonosov Moscow State University
- Issue: Vol 40, No 11 (2019)
- Pages: 1873-1880
- Section: Article
- URL: https://journals.rcsi.science/1995-0802/article/view/206101
- DOI: https://doi.org/10.1134/S1995080219110325
- ID: 206101
Abstract
This article is dedicated to stacking two approaches to patent classification. The first is a linguistically supported k-nearest neighbors algorithm that searches for topically similar documents by comparing vectors of lexical descriptors. The second is fastText, a word-embedding-based method in which a sentence (or document) vector is obtained by averaging n-gram embeddings, and a multinomial logistic regression then uses these vectors as features. On Russian and English datasets, we show that the stacking classifier achieves better results than either single classifier.
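The stacking idea described in the abstract can be illustrated with a minimal, standard-library sketch. It is not the authors' implementation, and everything specific in it is an assumption made for illustration: lexical descriptors are replaced by whitespace-token counts, each n-gram "embedding" is a fixed one-hot hashed bucket rather than a learned fastText vector, the meta-classifier is a second multinomial logistic regression, and its training features are produced by leave-one-out predictions. The toy documents and the function names (`knn_probs`, `ngram_vector`, `fit_stack`, `predict_stack`) are hypothetical.

```python
import math
import zlib
from collections import Counter

def bow(text):
    # Stand-in for a vector of lexical descriptors: plain token counts.
    return Counter(text.lower().split())

def cosine(a, b):
    dot = sum(v * b[t] for t, v in a.items() if t in b)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def knn_probs(text, train, n_classes, k=3):
    # Base model 1: similarity-weighted vote of the k most similar documents.
    q = bow(text)
    sims = sorted(((cosine(q, bow(t)), y) for t, y in train),
                  key=lambda p: p[0], reverse=True)[:k]
    votes = [0.0] * n_classes
    for sim, y in sims:
        votes[y] += sim
    total = sum(votes)
    return [v / total for v in votes] if total else [1.0 / n_classes] * n_classes

def ngram_vector(text, n=3, dim=32):
    # Document vector = average of per-n-gram vectors (toy one-hot hashed buckets).
    s = " " + text.lower() + " "
    grams = [s[i:i + n] for i in range(len(s) - n + 1)]
    vec = [0.0] * dim
    for g in grams:
        vec[zlib.crc32(g.encode("utf-8")) % dim] += 1.0 / len(grams)
    return vec

def softmax(z):
    m = max(z)
    e = [math.exp(v - m) for v in z]
    total = sum(e)
    return [v / total for v in e]

def lr_probs(W, x):
    # Multinomial logistic regression; the last weight in each row is the bias.
    return softmax([sum(w * xj for w, xj in zip(row, x)) + row[-1] for row in W])

def train_lr(X, y, n_classes, epochs=100, rate=0.5):
    # Plain stochastic gradient descent on the cross-entropy loss.
    W = [[0.0] * (len(X[0]) + 1) for _ in range(n_classes)]
    for _ in range(epochs):
        for x, label in zip(X, y):
            p = lr_probs(W, x)
            for c, row in enumerate(W):
                g = p[c] - (1.0 if c == label else 0.0)
                for j, xj in enumerate(x):
                    row[j] -= rate * g * xj
                row[-1] -= rate * g
    return W

def base_features(text, train, n_classes):
    # Meta-features: class probabilities from both base models, concatenated.
    # Base model 2 is retrained on every call, which is acceptable only for toy data.
    W = train_lr([ngram_vector(t) for t, _ in train], [y for _, y in train], n_classes)
    return knn_probs(text, train, n_classes) + lr_probs(W, ngram_vector(text))

def fit_stack(train, n_classes):
    # Leave-one-out: each document's meta-features come from models that never saw it.
    meta_X = [base_features(t, train[:i] + train[i + 1:], n_classes)
              for i, (t, _) in enumerate(train)]
    return train_lr(meta_X, [y for _, y in train], n_classes)

def predict_stack(text, train, meta_W, n_classes):
    return lr_probs(meta_W, base_features(text, train, n_classes))

# Hypothetical toy corpus: class 0 = mechanical, class 1 = chemical.
train = [
    ("gear shaft bearing assembly", 0), ("rotary valve with spring seal", 0),
    ("piston engine crank shaft", 0), ("bearing housing for gear pump", 0),
    ("polymer compound synthesis method", 1), ("acid catalyst reaction mixture", 1),
    ("organic solvent polymer coating", 1), ("catalyst for compound oxidation", 1),
]
meta_W = fit_stack(train, 2)
probs = predict_stack("gear pump with shaft seal", train, meta_W, 2)
print(probs)  # class probabilities from the stacked model
```

The design point the sketch tries to capture is that the meta-classifier sees only the base models' class probabilities, so it can learn how much to trust each of them per class. A real system would use learned embeddings and a held-out or cross-validated split instead of the leave-one-out loop used here.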
About the authors
V. Yadrintsev
Federal Research Center Computer Science and Control of the Russian Academy of Sciences; Peoples’ Friendship University of Russia (RUDN University)
Author for correspondence.
Email: vvyadrincev@gmail.com
Russian Federation, Moscow, 119333; Moscow, 117198
I. Sochenkov
Federal Research Center Computer Science and Control of the Russian Academy of Sciences; Lomonosov Moscow State University
Author for correspondence.
Email: sochenkov@isa.ru
Russian Federation, Moscow, 119333; Moscow, 119991