AU Institutional Repository

Low-coverage sequencing cost-effectively detects known and novel variation in underrepresented populations

dc.contributor.author Martin, Alicia R.
dc.contributor.author Atkinson, Elizabeth G.
dc.contributor.author Chapman, Sinéad B.
dc.contributor.author Stevenson, Anne
dc.contributor.author Stroud, Rocky E.
dc.contributor.author Abebe, Tamrat
dc.contributor.author Akena, Dickens
dc.contributor.author Alemayehu, Melkam
dc.contributor.author Ashaba, Fred K.
dc.contributor.author Atwoli, Lukoye
dc.contributor.author Bowers, Tera
dc.contributor.author Chibnik, Lori B.
dc.contributor.author Daly, Mark J.
dc.contributor.author DeSmet, Timothy
dc.contributor.author Dodge, Sheila
dc.contributor.author Fekadu, Abebaw
dc.contributor.author Ferriera, Steven
dc.contributor.author Gelaye, Bizu
dc.contributor.author Gichuru, Stella
dc.contributor.author Injera, Wilfred E.
dc.contributor.author James, Roxanne
dc.contributor.author Kariuki, Symon M.
dc.contributor.author Kigen, Gabriel
dc.contributor.author Koenen, Karestan C.
dc.contributor.author Kwobah, Edith
dc.contributor.author Kyebuzibwa, Joseph
dc.contributor.author Majara, Lerato
dc.contributor.author Musinguzi, Henry
dc.contributor.author Mwema, Rehema M.
dc.contributor.author Neale, Benjamin M.
dc.contributor.author Newman, Carter P.
dc.contributor.author Newton, Charles R.J.C.
dc.contributor.author Pickrell, Joseph K.
dc.contributor.author Ramesar, Raj
dc.contributor.author Shiferaw, Welelta
dc.contributor.author Stein, Dan J.
dc.contributor.author Teferra, Solomon
dc.contributor.author Merwe, Celia van der
dc.contributor.author Zingela, Zukiswa
dc.contributor.author NeuroGAP-Psychosis Study Team
dc.date.accessioned 2025-06-04T07:18:26Z
dc.date.available 2025-06-04T07:18:26Z
dc.date.issued 2021-10-25
dc.identifier.uri http://41.89.205.12/handle/123456789/2611
dc.description.abstract Genetic studies in underrepresented populations identify disproportionate numbers of novel associations. However, most genetic studies use genotyping arrays and sequenced reference panels that best capture variation most common in European ancestry populations. To compare data generation strategies best suited for underrepresented populations, we sequenced the whole genomes of 91 individuals to high coverage as part of the Neuropsychiatric Genetics of African Population-Psychosis (NeuroGAP-Psychosis) study with participants from Ethiopia, Kenya, South Africa, and Uganda. We used a downsampling approach to evaluate the quality of two cost-effective data generation strategies, GWAS arrays versus low-coverage sequencing, by calculating the concordance of imputed variants from these technologies with those from deep whole-genome sequencing data. We show that low-coverage sequencing at a depth of ≥4× captures variants of all frequencies more accurately than all commonly used GWAS arrays investigated and at a comparable cost. Lower depths of sequencing (0.5–1×) performed comparably to commonly used low-density GWAS arrays. Low-coverage sequencing is also sensitive to novel variation; 4× sequencing detects 45% of singletons and 95% of common variants identified in high-coverage African whole genomes. Low-coverage sequencing approaches surmount the problems induced by the ascertainment of common genotyping arrays, effectively identify novel variation particularly in underrepresented populations, and present opportunities to enhance variant discovery at a cost similar to traditional approaches. en_US
dc.description.sponsorship ALUPE UNIVERSITY en_US
dc.language.iso en en_US
dc.publisher ALUPE UNIVERSITY en_US
dc.subject Low-coverage sequencing cost-effectively detects known and novel variation in underrepresented populations en_US
dc.title Low-coverage sequencing cost-effectively detects known and novel variation in underrepresented populations en_US
dc.type Other en_US

