Review of heterozygosity visualization approaches in the context of conservation research
- Authors: Tomarovsky A.A. (1, 2), Totikov A.A. (1, 2), Yakupova A.R. (1, 2), Graphodatsky A.S. (1), Kliver S.F. (1)
- Affiliations:
  1. Institute of Molecular and Cellular Biology, Siberian Branch, Russian Academy of Sciences
  2. Novosibirsk State University
- Issue: Vol 21, No 4 (2023)
- Pages: 383-400
- Section: Methodology in ecological genetics
- URL: https://journals.rcsi.science/ecolgenet/article/view/254606
- DOI: https://doi.org/10.17816/ecogen609552
- ID: 254606
Abstract
Heterozygosity is one of the key metrics in conservation biology, and its assessment contributes to the accurate design of conservation programs for endangered species. With the development of whole-genome sequencing technologies, heterozygosity can now be estimated more accurately not only at the level of an individual organism, but also at the population and species levels. Contemporary conservation studies process large volumes of whole-genome data, which complicates interpretation and calls for modern visualization methods that present results clearly and correctly. In this review, we examine the main types of visualization of heterozygosity estimates obtained using various approaches. We describe the theory underlying each visualization method and discuss its characteristics using examples from studies of non-model species with different conservation statuses. The review provides insight into current tools for heterozygosity assessment and subsequent visualization, as well as current trends in this field.
About the authors
Andrey A. Tomarovsky
Institute of Molecular and Cellular Biology, Siberian Branch, Russian Academy of Sciences; Novosibirsk State University
Author for correspondence.
Email: andrey.tomarovsky@gmail.com
ORCID iD: 0000-0002-6414-704X
SPIN-code: 6727-8664
Scopus Author ID: 57264872500
Russian Federation, Novosibirsk; Novosibirsk
Azamat A. Totikov
Institute of Molecular and Cellular Biology, Siberian Branch, Russian Academy of Sciences; Novosibirsk State University
Email: a.totickov1@gmail.com
ORCID iD: 0000-0003-1236-631X
SPIN-code: 9767-3971
Scopus Author ID: 57265434800
Russian Federation, Novosibirsk; Novosibirsk
Aliya R. Yakupova
Email: aliyah.yakupova@gmail.com
ORCID iD: 0000-0003-1486-0864
SPIN-code: 4292-0609
Scopus Author ID: 57264122200
independent researcher
Germany
Alexander S. Graphodatsky
Institute of Molecular and Cellular Biology, Siberian Branch, Russian Academy of Sciences
Email: graf@mcb.nsc.ru
ORCID iD: 0000-0002-8282-1085
SPIN-code: 4436-9033
Scopus Author ID: 7003878913
Dr. Sci. (Biology)
Russian Federation, Novosibirsk
Sergei F. Kliver
Email: mahajrod@gmail.com
ORCID iD: 0000-0002-2965-3617
SPIN-code: 8635-4259
Scopus Author ID: 56449314300
independent researcher
Denmark