Comparative Analysis of Structural Variant Callers on the Short-Read Whole-Genome Sequencing Data

Abstract

In this study, three structural variant callers (Manta, Smoove, and Delly) were evaluated on whole-genome sequencing data under varying conditions: four alignment pipelines (DRAGEN; the GDC DNA-Seq Alignment Workflow; the GDC DNA-Seq Alignment Workflow followed by the GDC DNA-Seq Co-Cleaning Workflow; and NovoAlign), two raw read lengths (2 × 150 bp and 2 × 250 bp), and different mean genome coverage values. The results were compared with the reference call set of the GIAB team, and structural variants were additionally validated by Sanger sequencing. Deletions and insertions were detected best by Manta, which reached 89–96% accuracy and 59–70% sensitivity for deletions, and 96–99% accuracy and 15–36% sensitivity for insertions. Smoove and Delly gave less accurate and less sensitive results overall (Smoove: 91–95% accuracy and 8–54% sensitivity for deletions; Delly: 78–87% accuracy and 31–66% sensitivity for deletions, and 99–100% accuracy and 1–13% sensitivity for insertions). Using two or even three structural variant callers simultaneously did not improve accuracy or sensitivity for deletions. The accuracy and sensitivity of the callers increased with mean genome coverage, whereas increasing the read length from 150 to 250 bp affected the accuracy and sensitivity of individual tools to varying degrees. The accuracy of the callers also depended on the size range of the structural variants: for example, Manta detected deletions best at 200 bp and longer, Delly at 1000 to 10 000 bp, and Smoove at 200 to 10 000 bp.
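The two metrics reported above can be illustrated with a minimal Python sketch. The abstract does not define "accuracy"; the sketch assumes it means the share of reported calls that match the reference set (precision), and that sensitivity is the share of reference variants that were recovered. The class `BenchmarkCounts`, the function names, the size bins, and the example counts are hypothetical and are not taken from the study.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BenchmarkCounts:
    """Hypothetical container for a comparison of one call set with a reference set."""
    true_positives: int   # calls that match a reference variant
    false_positives: int  # calls with no matching reference variant
    false_negatives: int  # reference variants that were not called


def precision(counts: BenchmarkCounts) -> float:
    """Share of reported calls that are correct (assumed meaning of 'accuracy')."""
    called = counts.true_positives + counts.false_positives
    return counts.true_positives / called if called else 0.0


def sensitivity(counts: BenchmarkCounts) -> float:
    """Share of reference variants that were recovered."""
    expected = counts.true_positives + counts.false_negatives
    return counts.true_positives / expected if expected else 0.0


def size_bin(length_bp: int) -> str:
    """Assign a variant length to an illustrative size bin (boundaries are assumptions)."""
    if length_bp < 200:
        return "<200"
    if length_bp < 1000:
        return "200-999"
    if length_bp <= 10000:
        return "1000-10000"
    return ">10000"


if __name__ == "__main__":
    # Made-up counts for illustration only; not results from the study.
    example = BenchmarkCounts(true_positives=60, false_positives=5, false_negatives=40)
    print(f"precision:   {precision(example):.3f}")
    print(f"sensitivity: {sensitivity(example):.3f}")
    print(f"size bin for a 1500 bp deletion: {size_bin(1500)}")
```

Computing both metrics separately for each size bin is one way to make visible the size dependence the abstract describes, since a caller can be precise in one length range and miss most variants in another.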

About the authors

A. A. Mkrtchian

Federal State Budgetary Institution “Centre for Strategic Planning and Management of Biomedical Health Risks”
of the Federal Medical Biological Agency

Author for correspondence.
Email: AMkrtchyan@cspfmba.ru
Russia, 119121, Moscow

S. M. Yudin

Federal State Budgetary Institution “Centre for Strategic Planning and Management of Biomedical Health Risks”
of the Federal Medical Biological Agency

Email: AMkrtchyan@cspfmba.ru
Russia, 119121, Moscow

A. A. Keskinov

Federal State Budgetary Institution “Centre for Strategic Planning and Management of Biomedical Health Risks”
of the Federal Medical Biological Agency

Email: AMkrtchyan@cspfmba.ru
Russia, 119121, Moscow

V. S. Yudin

Federal State Budgetary Institution “Centre for Strategic Planning and Management of Biomedical Health Risks”
of the Federal Medical Biological Agency

Email: AMkrtchyan@cspfmba.ru
Russia, 119121, Moscow

T. A. Shpakova

Federal State Budgetary Institution “Centre for Strategic Planning and Management of Biomedical Health Risks”
of the Federal Medical Biological Agency

Email: AMkrtchyan@cspfmba.ru
Russia, 119121, Moscow

L. V. Frolova

Federal State Budgetary Institution “Centre for Strategic Planning and Management of Biomedical Health Risks”
of the Federal Medical Biological Agency

Email: AMkrtchyan@cspfmba.ru
Russia, 119121, Moscow

E. A. Snigir

Federal State Budgetary Institution “Centre for Strategic Planning and Management of Biomedical Health Risks”
of the Federal Medical Biological Agency

Email: AMkrtchyan@cspfmba.ru
Russia, 119121, Moscow

A. P. Sergeev

Federal State Budgetary Institution “Centre for Strategic Planning and Management of Biomedical Health Risks”
of the Federal Medical Biological Agency

Email: AMkrtchyan@cspfmba.ru
Russia, 119121, Moscow

D. V. Svetlichny

Federal State Budgetary Institution “Centre for Strategic Planning and Management of Biomedical Health Risks”
of the Federal Medical Biological Agency

Email: AMkrtchyan@cspfmba.ru
Russia, 119121, Moscow

M. N. Pilipenko

Federal State Budgetary Institution “Centre for Strategic Planning and Management of Biomedical Health Risks”
of the Federal Medical Biological Agency

Email: AMkrtchyan@cspfmba.ru
Russia, 119121, Moscow

A. A. Ivashechkin

Federal State Budgetary Institution “Centre for Strategic Planning and Management of Biomedical Health Risks”
of the Federal Medical Biological Agency

Email: AMkrtchyan@cspfmba.ru
Russia, 119121, Moscow

P. U. Zemsky

Federal State Budgetary Institution “Centre for Strategic Planning and Management of Biomedical Health Risks”
of the Federal Medical Biological Agency

Email: AMkrtchyan@cspfmba.ru
Russia, 119121, Moscow

S. I. Mitrofanov

Federal State Budgetary Institution “Centre for Strategic Planning and Management of Biomedical Health Risks”
of the Federal Medical Biological Agency

Email: AMkrtchyan@cspfmba.ru
Russia, 119121, Moscow

P. G. Kazakova

Federal State Budgetary Institution “Centre for Strategic Planning and Management of Biomedical Health Risks”
of the Federal Medical Biological Agency

Email: AMkrtchyan@cspfmba.ru
Russia, 119121, Moscow

K. S. Grammatikati

Federal State Budgetary Institution “Centre for Strategic Planning and Management of Biomedical Health Risks”
of the Federal Medical Biological Agency

Email: AMkrtchyan@cspfmba.ru
Russia, 119121, Moscow

V. I. Skvortsova

The Federal Medical Biological Agency of Russia

Email: AMkrtchyan@cspfmba.ru
Russia, 123182, Moscow


Copyright (c) 2023 A. A. Mkrtchian, K. S. Grammatikati, P. G. Kazakova, S. I. Mitrofanov, P. U. Zemsky, A. A. Ivashechkin, M. N. Pilipenko, D. V. Svetlichny, A. P. Sergeev, E. A. Snigir, L. V. Frolova, T. A. Shpakova, V. S. Yudin, A. A. Keskinov, S. M. Yudin, V. I. Skvortsova
