TWENTY SIMILARITY FUNCTIONS OF TWO FINITE SEQUENCES

Мұқаба

Дәйексөз келтіру

Толық мәтін

Ашық рұқсат Ашық рұқсат
Рұқсат жабық Рұқсат берілді
Рұқсат жабық Тек жазылушылар үшін

Аннотация

The article discusses the various numerical functions that determine the degree of "similarity" of the two given final sequences. These similarity measures are based on the concept we define of embedding in a sequence. A special case of such an attachment is the usual sub-subsequence. Other cases further require equality of distances between adjacent sub-sequence symbols in both sequences. This is generalization of the concept of a sequence segment (substring) in which these distances are unit. In addition, equality of distances from the beginning of the sequences to the first embedding symbol or from the last embedding symbol to the end of the sequences may be required. Except these last two cases, the attachment can be in a sequence several times. The literature uses functions such as the number of common attachments or the number of attachment occurrence pairs in a sequence. In addition to them, we enter three more functions: the sum of the lengths of total investments, the sum of the minima of the number of occurrences of a common embedding in both sequences and the similarity function based on the largest number of symbols of the common embedding. In total, 20 numerical functions are considered, for 17 of which algorithms (including new ones) of polynomial complexity are proposed, for two more functions, algorithms have exponential complexity with reduced a measure of degree. The Conclusion gives a brief comparative description of these investments and functions.

Авторлар туралы

I. Burdonov

Ivannikov Institute for System Programming of the Russian Academy of Sciences

Email: igor@ispras.ru
Russia, Moscow

A. Maksimov

Ivannikov Institute for System Programming of the Russian Academy of Sciences

Хат алмасуға жауапты Автор.
Email: andrew@ispras.ru
Russia, Moscow

Әдебиет тізімі

  1. Wagner R., Fischer M. The string-to-string correction problem // Journal of the ACM. 1974. V. 21. № 1. P. 168–173. https://dl.acm.org/doi/10.1145/321796.321811
  2. Wang H. All common subsequences, in: M.M. Veloso (Ed.), IJCAI 2007, Proceedings of the 20th International Joint Conference on Artificial Intelligence, Hyderabad, India, 2007. https://www.aaai.org/Papers/IJCAI/2007/IJCAI07-101.pdf
  3. Cees Elzinga, Sven Rahmann, Hui Wang: Algorithms for Subsequence Combinatorics. Theoretical Computer Science. 2008. V. 409. № 3. P. 394–404. https://doi.org/10.1016/j.tcs.2008.08.035
  4. Gilbert Ritschard and Matthias Studer (editors). Proceedings of the International Conference on Sequence Analysis and Related Methods (LaCOSA II). Lausanne, Switzerland, June 8–10, 2016. https://www.academia.edu/83294569/Proceedings_of_the_International_Conference_on_Sequence_Analysis_and_Related_Methods_LaCOSA_II_Lausanne_Switzerland_June_8_10_2016
  5. Знаменский С.В. Модель и аксиомы метрик сходства, Программные системы: теория и приложения. 2017. Т. 8. Вып. 4. С. 347–357. https://doi.org/10.25209/2079-3316-2017-8-4-347-357
  6. Conte A., Grossi R., Punzi G. et al. Enumeration of Maximal Common Subsequences Between Two Strings // Algorithmica. 2022. V. 84. P. 757–783. https://doi.org/10.1007/s00453-021-00898-5

© И. Бурдонов, А. Максимов, 2023

Осы сайт cookie-файлдарды пайдаланады

Біздің сайтты пайдалануды жалғастыра отырып, сіз сайттың дұрыс жұмыс істеуін қамтамасыз ететін cookie файлдарын өңдеуге келісім бересіз.< / br>< / br>cookie файлдары туралы< / a>