TWENTY SIMILARITY FUNCTIONS OF TWO FINITE SEQUENCES

I. Burdonov; Бурдонов И.; A. Maksimov; Максимов А.

doi:10.31857/S0132347423050035

TWENTY SIMILARITY FUNCTIONS OF TWO FINITE SEQUENCES

Autores: Burdonov I.¹, Maksimov A.¹
Afiliações:
1. Ivannikov Institute for System Programming of the Russian Academy of Sciences
Edição: Nº 5 (2023)
Páginas: 3-18
Seção: PARALLEL AND DISTRIBUTED SOFTWARE
URL: https://journals.rcsi.science/0132-3474/article/view/141777
DOI: https://doi.org/10.31857/S0132347423050035
EDN: https://elibrary.ru/ZYAIZN
ID: 141777

Citar

Texto integral

Resumo
Sobre autores
Bibliografia
Arquivos suplementares
Estatísticas

Resumo

The article discusses the various numerical functions that determine the degree of "similarity" of the two given final sequences. These similarity measures are based on the concept we define of embedding in a sequence. A special case of such an attachment is the usual sub-subsequence. Other cases further require equality of distances between adjacent sub-sequence symbols in both sequences. This is generalization of the concept of a sequence segment (substring) in which these distances are unit. In addition, equality of distances from the beginning of the sequences to the first embedding symbol or from the last embedding symbol to the end of the sequences may be required. Except these last two cases, the attachment can be in a sequence several times. The literature uses functions such as the number of common attachments or the number of attachment occurrence pairs in a sequence. In addition to them, we enter three more functions: the sum of the lengths of total investments, the sum of the minima of the number of occurrences of a common embedding in both sequences and the similarity function based on the largest number of symbols of the common embedding. In total, 20 numerical functions are considered, for 17 of which algorithms (including new ones) of polynomial complexity are proposed, for two more functions, algorithms have exponential complexity with reduced a measure of degree. The Conclusion gives a brief comparative description of these investments and functions.

Palavras-chave

sequence analysis common subsequences longest and maximal common subsequences canonical embedding matching joint embeddings algorithms for subsequence and embeddings combinatorics similarity axioms

Sobre autores

I. Burdonov

Ivannikov Institute for System Programming of the Russian Academy of Sciences

Email: igor@ispras.ru
Russia, Moscow

A. Maksimov

Ivannikov Institute for System Programming of the Russian Academy of Sciences

Autor responsável pela correspondência
Email: andrew@ispras.ru
Russia, Moscow

Bibliografia

Wagner R., Fischer M. The string-to-string correction problem // Journal of the ACM. 1974. V. 21. № 1. P. 168–173. https://dl.acm.org/doi/10.1145/321796.321811
Wang H. All common subsequences, in: M.M. Veloso (Ed.), IJCAI 2007, Proceedings of the 20th International Joint Conference on Artificial Intelligence, Hyderabad, India, 2007. https://www.aaai.org/Papers/IJCAI/2007/IJCAI07-101.pdf
Cees Elzinga, Sven Rahmann, Hui Wang: Algorithms for Subsequence Combinatorics. Theoretical Computer Science. 2008. V. 409. № 3. P. 394–404. https://doi.org/10.1016/j.tcs.2008.08.035
Gilbert Ritschard and Matthias Studer (editors). Proceedings of the International Conference on Sequence Analysis and Related Methods (LaCOSA II). Lausanne, Switzerland, June 8–10, 2016. https://www.academia.edu/83294569/Proceedings_of_the_International_Conference_on_Sequence_Analysis_and_Related_Methods_LaCOSA_II_Lausanne_Switzerland_June_8_10_2016
Знаменский С.В. Модель и аксиомы метрик сходства, Программные системы: теория и приложения. 2017. Т. 8. Вып. 4. С. 347–357. https://doi.org/10.25209/2079-3316-2017-8-4-347-357
Conte A., Grossi R., Punzi G. et al. Enumeration of Maximal Common Subsequences Between Two Strings // Algorithmica. 2022. V. 84. P. 757–783. https://doi.org/10.1007/s00453-021-00898-5

Arquivos suplementares

Ação

1. JATS XML

Baixar

Nome de usuário
Senha
Lembrar usuário

Esqueceu a senha?	Cadastro

Nome de usuário
Senha
Lembrar usuário

Esqueceu a senha?	Cadastro

Nº 4 (2025)

Nº 4 (2025)

TWENTY SIMILARITY FUNCTIONS OF TWO FINITE SEQUENCES

Texto integral

Resumo

Palavras-chave

Sobre autores

I. Burdonov

A. Maksimov

Bibliografia

Arquivos suplementares