The Hybrid Method for Accurate Patent Classification


Abstract

This article is devoted to stacking two approaches to patent classification. The first is a linguistically supported k-nearest neighbors algorithm that searches for topically similar documents by comparing vectors of lexical descriptors. The second is fastText, a word-embedding-based model in which a sentence (or document) vector is obtained by averaging n-gram embeddings, and a multinomial logistic regression then uses these vectors as features. On Russian and English datasets, we show that the stacking classifier achieves better results than either single classifier.
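To make the idea of stacking concrete, here is a minimal pure-Python sketch. It does not reproduce the authors' system. Both base models are simplified stand-ins (a cosine-similarity k-NN over word counts, and a character n-gram profile matcher in place of fastText). The toy documents, the labels, the leave-one-out scheme for building meta-features, and all function names are assumptions made for illustration. Only the overall structure follows the abstract: two base classifiers emit class probabilities, and a multinomial logistic regression is trained on top of them.

```python
import math
from collections import Counter

def softmax(scores):
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def cosine(a, b):
    """Cosine similarity between two sparse count vectors (Counters)."""
    dot = sum(v * b[k] for k, v in a.items() if k in b)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def char_ngrams(text, n=3):
    padded = f"<{text.lower()}>"
    return [padded[i:i + n] for i in range(len(padded) - n + 1)]

def knn_proba(doc, train_docs, train_labels, classes, k=3):
    """Stand-in base model 1: similarity-weighted k-NN over word counts."""
    query = Counter(doc.lower().split())
    scored = sorted(((cosine(query, Counter(d.lower().split())), y)
                     for d, y in zip(train_docs, train_labels)), reverse=True)[:k]
    votes = {c: 1e-6 for c in classes}  # small prior avoids division by zero
    for sim, y in scored:
        votes[y] += sim
    total = sum(votes.values())
    return [votes[c] / total for c in classes]

def ngram_proba(doc, train_docs, train_labels, classes, n=3):
    """Stand-in base model 2 (NOT fastText): cosine to per-class n-gram profiles."""
    profiles = {c: Counter() for c in classes}
    for d, y in zip(train_docs, train_labels):
        profiles[y].update(char_ngrams(d, n))
    query = Counter(char_ngrams(doc, n))
    return softmax([5.0 * cosine(query, profiles[c]) for c in classes])

def meta_features(doc, train_docs, train_labels, classes, base_models):
    """Concatenate the class-probability outputs of all base models."""
    feats = []
    for model in base_models:
        feats.extend(model(doc, train_docs, train_labels, classes))
    return feats

def fit_stack(docs, labels, base_models, epochs=300, lr=0.5):
    """Train a softmax-regression meta-learner on leave-one-out base outputs."""
    classes = sorted(set(labels))
    X = [meta_features(d, docs[:i] + docs[i + 1:], labels[:i] + labels[i + 1:],
                       classes, base_models) for i, d in enumerate(docs)]
    dim = len(X[0])
    W = [[0.0] * dim for _ in classes]
    b = [0.0] * len(classes)
    for _ in range(epochs):
        for x, y in zip(X, labels):
            p = softmax([sum(w * v for w, v in zip(W[c], x)) + b[c]
                         for c in range(len(classes))])
            for c in range(len(classes)):
                grad = p[c] - (1.0 if classes[c] == y else 0.0)
                b[c] -= lr * grad
                for j in range(dim):
                    W[c][j] -= lr * grad * x[j]
    return classes, W, b

def predict_stack(doc, docs, labels, base_models, model):
    classes, W, b = model
    x = meta_features(doc, docs, labels, classes, base_models)
    p = softmax([sum(w * v for w, v in zip(W[c], x)) + b[c]
                 for c in range(len(classes))])
    return classes[p.index(max(p))], p

# Toy data (hypothetical; not from the paper's Russian or English datasets).
DOCS = ["wind turbine rotor blade", "turbine blade cooling channel",
        "gas turbine rotor assembly", "antibody composition treating tumors",
        "pharmaceutical composition comprising antibody",
        "treating cancer using antibody composition"]
LABELS = ["mechanics"] * 3 + ["medicine"] * 3
BASE_MODELS = [knn_proba, ngram_proba]

MODEL = fit_stack(DOCS, LABELS, BASE_MODELS)
print(predict_stack("cooling channel inside rotor blade", DOCS, LABELS, BASE_MODELS, MODEL))
```

The meta-learner is trained on leave-one-out predictions so that it never sees a base model's output for a document that model was fitted on. With real data, k-fold out-of-fold predictions would be the more usual choice.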

About the authors

V. Yadrintsev

Federal Research Center Computer Science and Control of the Russian Academy of Sciences; Peoples’ Friendship University of Russia (RUDN University)

Corresponding author
Email: vvyadrincev@gmail.com
Russia, Moscow, 119333; Moscow, 117198

I. Sochenkov

Federal Research Center Computer Science and Control of the Russian Academy of Sciences; Lomonosov Moscow State University

Corresponding author
Email: sochenkov@isa.ru
Russia, Moscow, 119333; Moscow, 119991


Copyright © Pleiades Publishing, Ltd., 2019
