Author(s):
Fan Wang, Wei Zhang, Fang Yao
Publish date:
1 June 2025
Journal:
The Annals of Applied Statistics

Abstract

In this Supplementary Material, we present theoretical properties, the key lemmas, and the proofs of the lemmas and theorems in our work. In addition, all code and instructions for implementing the simulations and real data analysis are included in the file “dBiRS_Code”. The code is also available in the GitHub repository https://github.com/ZWCR7/dBiRS.

The identification of genetic signal regions in the human genome is critical for understanding the genetic architecture of complex traits and diseases. Numerous methods based on scan algorithms (e.g., QSCAN, SCANG, SCANG-STAAR) have been developed to allow dynamic window sizes in whole-genome association studies. Beyond scan algorithms, we have recently developed the binary and re-search (BiRS) algorithm, which is more computationally efficient than scan-based methods and exhibits superior statistical power. However, the BiRS algorithm is based on a two-sample mean test for binary traits and does not account for multidimensional covariates or nonbinary outcomes. In this work, we propose a new maximal score test based on summary statistics computed from a generalized linear model, which accommodates regression-based statistics and allows testing of both continuous and binary outcomes. We then present a distributed version of the BiRS algorithm (dBiRS) that incorporates this new test. dBiRS computes blockwise results in parallel and aggregates them through a central machine to ensure both detection accuracy and computational efficiency. It has theoretical guarantees for controlling familywise error rates and false discovery rates while maintaining the power advantages of the original algorithm. Applying dBiRS to detect genetic regions associated with fluid intelligence and prospective memory using whole-exome sequencing data from the UK Biobank, we validate previous findings and identify numerous novel rare variants near newly implicated genes. These discoveries offer valuable insights into the genetic basis of cognitive performance and neurodegenerative disorders, highlighting the potential of dBiRS as a scalable and powerful tool for whole-genome signal region detection.

Related projects

Canela-Xandri et al. (2018) built an atlas of genetic associations for 660 binary traits using ~452,000 related and unrelated UK…

Institution:
Peking University, China
