Word Embedding for Semantically Related Words: An Experimental Study

M. S. Karyaeva; P. I. Braslavski; V. A. Sokolov

doi:10.3103/S0146411619070083

Authors: Karyaeva M.S.¹, Braslavski P.I.², Sokolov V.A.¹
Affiliations:
1. Demidov Yaroslavl State University
2. Ural Federal University
Issue: Vol. 53, No. 7 (2019)
Pages: 638-643
Section: Article
URL: https://journals.rcsi.science/0146-4116/article/view/175889
DOI: https://doi.org/10.3103/S0146411619070083
ID: 175889

Abstract

The ability to identify semantic relations between words has made the word2vec model widely used in NLP tasks. Word2vec rests on a simple idea: two words that occur in similar contexts should receive a high similarity score. Each word is represented as a vector, so words whose vectors lie close together can be interpreted as similar. This makes it possible to extract semantic relations (synonymy, hypernymy, hyponymy, and others) automatically. Extracting semantic relations by hand is time-consuming and prone to bias, and it requires the help of experts. Unfortunately, the associative list of words that word2vec returns does not consist of semantically related words only. In this paper, we present some additional criteria that may help to solve this problem. Observations and experiments with well-known characteristics, such as word frequency and position in the associative list, might be useful for improving the extraction of semantic relations for the Russian language using word embeddings. In the experiments, we use a word2vec model trained on Flibusta, with word pairs from Wiktionary serving as examples of semantic relationships. Semantically related words are applicable to thesauri, ontologies, and intelligent systems for natural language processing.
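
As a rough illustration of the idea described in the abstract, the sketch below builds an associative list by cosine similarity and then filters it by list position and word frequency. It is not the authors' procedure: the toy vectors, the frequencies, the function names, and the thresholds `max_rank` and `min_frequency` are all assumptions made up for this example.

```python
import math

# Toy 3-dimensional vectors and corpus frequencies, invented for illustration only.
VECTORS = {
    "cat": [0.9, 0.1, 0.0],
    "kitten": [0.85, 0.2, 0.05],
    "dog": [0.7, 0.3, 0.1],
    "milk": [0.4, 0.1, 0.8],
    "car": [0.0, 0.9, 0.3],
}
FREQUENCY = {"cat": 1200, "kitten": 150, "dog": 1500, "milk": 900, "car": 2000}


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def associative_list(word, vectors):
    """Return all other words sorted by decreasing cosine similarity to `word`."""
    target = vectors[word]
    scored = [
        (other, cosine(target, vec))
        for other, vec in vectors.items()
        if other != word
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


def filter_candidates(word, vectors, frequency, max_rank=3, min_frequency=100):
    """Keep neighbours in the top `max_rank` positions with enough corpus frequency.

    Both thresholds are hypothetical; they only show how list position and
    word frequency could act as additional criteria.
    """
    ranked = associative_list(word, vectors)
    return [
        (other, rank, score)
        for rank, (other, score) in enumerate(ranked, start=1)
        if rank <= max_rank and frequency.get(other, 0) >= min_frequency
    ]


if __name__ == "__main__":
    for other, rank, score in filter_candidates("cat", VECTORS, FREQUENCY):
        print(f"{rank}. {other} (similarity {score:.3f}, frequency {FREQUENCY[other]})")
```

With a real model, the associative list would come from the trained embeddings rather than from a hand-written dictionary, and suitable thresholds would have to be chosen empirically.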

Keywords

word embedding, word2vec, semantic relations, thesaurus, hyponymy, hypernymy, synonymy
