Fast Trie-Based Method for Multiple Pairwise Sequence Alignment
- Authors: Yakovlev P.A.
- Affiliations: Biocad Company
- Issue: Vol 99, No 1 (2019)
- Pages: 64-67
- Section: Mathematics
- URL: https://journals.rcsi.science/1064-5624/article/view/225623
- DOI: https://doi.org/10.1134/S1064562419010198
- ID: 225623
Abstract
A method is presented for efficiently comparing a symbol sequence with every string in a set; it runs considerably faster than naively comparing the sequence with each string in turn. The speedup comes from an original algorithm that combines a prefix tree (trie) with the standard dynamic programming algorithm for computing the edit distance (Levenshtein distance) between strings. The efficiency of the method is confirmed by numerical experiments on data sets of tens of millions of biological sequences of variable domains of monoclonal antibodies.
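The general idea of combining a prefix tree with the Levenshtein dynamic program can be illustrated with a minimal Python sketch. This is not the author's algorithm, whose details are not given in the abstract; it is the textbook technique of walking a trie depth-first and extending one DP row per trie node, so that strings sharing a prefix also share the rows computed for that prefix. The names `TrieNode`, `build_trie`, and `distances_to_all` are hypothetical and chosen only for this illustration.

```python
from typing import Dict, List, Optional


class TrieNode:
    """One node of a prefix tree; `word` is set where a stored string ends."""

    def __init__(self) -> None:
        self.children: Dict[str, "TrieNode"] = {}
        self.word: Optional[str] = None


def build_trie(strings: List[str]) -> TrieNode:
    """Insert every string into a trie and return its root."""
    root = TrieNode()
    for s in strings:
        node = root
        for ch in s:
            node = node.children.setdefault(ch, TrieNode())
        node.word = s
    return root


def distances_to_all(query: str, root: TrieNode) -> Dict[str, int]:
    """Levenshtein distance from `query` to every string stored in the trie.

    Each trie node corresponds to one row of the classic DP table, so the
    row for a shared prefix is computed once and reused by all descendants.
    """
    result: Dict[str, int] = {}
    # Row for the empty prefix: turning "" into query[:j] costs j insertions.
    first_row = list(range(len(query) + 1))
    if root.word is not None:  # the empty string was stored in the set
        result[root.word] = first_row[-1]

    # Explicit stack instead of recursion, so long sequences are safe.
    stack = [(child, ch, first_row) for ch, child in root.children.items()]
    while stack:
        node, ch, prev = stack.pop()
        row = [prev[0] + 1]  # stored prefix grew by one char, query prefix is empty
        for j in range(1, len(query) + 1):
            cost = 0 if query[j - 1] == ch else 1
            row.append(min(row[j - 1] + 1,       # insertion
                           prev[j] + 1,          # deletion
                           prev[j - 1] + cost))  # match or substitution
        if node.word is not None:
            result[node.word] = row[-1]
        for next_ch, child in node.children.items():
            stack.append((child, next_ch, row))
    return result
```

For example, `distances_to_all("kitten", build_trie(["sitting", "kitten", "mitten", "kit"]))` returns a dictionary mapping `"sitting"` to 3, `"kitten"` to 0, `"mitten"` to 1, and `"kit"` to 3. In this sketch the work is proportional to the number of trie nodes times the query length, rather than the total length of all strings times the query length, which is where the saving over one-by-one comparison comes from when many strings share prefixes.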
About the authors
P. A. Yakovlev
Biocad Company
Author for correspondence.
Email: yakovlev@biocad.ru
Russian Federation, St. Petersburg, 191186