Author(s):
Cole M. Williams, Jared O'Connell, Ethan Jewett, William A. Freyman, 23andMe Research Team, Christopher R. Gignoux, Sohini Ramachandran, Amy L. Williams
Publish date:
22 July 2025
Journal:
Human Genetics and Genomics Advances
PubMed ID:
40702725

Abstract

Haplotype phasing, the process of determining which genetic variants are physically located on the same chromosome, is crucial for genetic analyses. Here, we benchmark SHAPEIT and Beagle, two state-of-the-art phasing methods, on two large datasets: >8 million research-consented 23andMe, Inc. customers and the UK Biobank (UKB). Remarkably, both methods’ median switch error rate (SER) (after excluding single SNP switches, which we call “blips”) is 0.00% across all tested 23andMe trio children and 0.026% in British samples from UKB. Across UKB samples, switch errors predominantly occur in regions lacking identity-by-descent (IBD) coverage. SHAPEIT and Beagle excel at intra-chromosomal phasing, but lack the ability to phase across chromosomes, motivating us to develop HAPTiC (HAPlotype Tiling and Clustering), an inter-chromosomal phasing method that assigns paternal and maternal variants genome-wide. Our approach uses IBD segments to phase blocks of variants on different chromosomes. HAPTiC represents the segments a focal individual shares with their relatives as nodes in a signed graph and performs spectral clustering. We test HAPTiC on 1,022 UKB trios, yielding a median per-site phase error of 0.13% in regions covered by IBD segments (45.1% of sites). We also ran HAPTiC in the 23andMe database and found a median phase error rate of 0.49% in Europeans (100% of sites) and 0.16% in admixed Africans (99.8% of sites). HAPTiC enables analyses that require the parent-of-origin of variants, such as association studies and ancestry inference of untyped parents.
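The abstract says only that HAPTiC represents shared IBD segments as nodes in a signed graph and applies spectral clustering. As a rough illustration of that general idea, and not the authors' implementation, the sketch below splits a small signed graph into two groups using the eigenvector of the signed Laplacian with the smallest eigenvalue. Everything specific here is an assumption made for the example: the function name `signed_spectral_bipartition`, the toy edge weights, the convention that positive weights mean "likely same parent" and negative weights mean "likely different parents", and the choice of the signed Laplacian itself.

```python
import numpy as np


def signed_spectral_bipartition(weights):
    """Split the nodes of a signed graph into two groups (labels 0 and 1).

    `weights` is a symmetric matrix with a zero diagonal. In this toy
    convention (an assumption, not taken from the paper), a positive entry
    suggests two segments come from the same parent and a negative entry
    suggests they come from different parents.
    """
    w = np.asarray(weights, dtype=float)
    if w.ndim != 2 or w.shape[0] != w.shape[1]:
        raise ValueError("weights must be a square matrix")
    if not np.allclose(w, w.T):
        raise ValueError("weights must be symmetric")

    # Signed Laplacian: degrees are sums of absolute edge weights.
    degree = np.abs(w).sum(axis=1)
    laplacian = np.diag(degree) - w

    # eigh returns eigenvalues in ascending order, so column 0 is the
    # eigenvector belonging to the smallest eigenvalue.
    _, eigenvectors = np.linalg.eigh(laplacian)
    v = eigenvectors[:, 0]

    # The sign of each entry gives the group of the corresponding node.
    labels = (v >= 0).astype(int)

    # An eigenvector's overall sign is arbitrary; fix node 0 to label 0.
    if labels[0] == 1:
        labels = 1 - labels
    return labels


# Toy graph with five hypothetical segment nodes.
toy_weights = np.zeros((5, 5))
for i, j, weight in [
    (0, 1, 2.0),   # same-parent evidence
    (1, 2, 1.0),   # same-parent evidence
    (3, 4, 2.0),   # same-parent evidence
    (0, 3, -1.0),  # different-parent evidence
    (2, 4, -1.5),  # different-parent evidence
]:
    toy_weights[i, j] = toy_weights[j, i] = weight

labels = signed_spectral_bipartition(toy_weights)
print(labels.tolist())
```

The two labels only separate the nodes into groups; a clustering of this kind cannot say on its own which group is paternal and which is maternal. The sketch also assumes a connected graph, since disconnected components would each get an arbitrary orientation.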
