Westlake Researchers Generate Most Comprehensive Collection of Genetic Variants Associated with RNA Splicing in the Human Brain

25, 2022

Email: zhangchi@westlake.edu.cn
Phone: +86-(0)571-86886861
Office of Public Affairs

On Aug 18, 2022, a research article entitled “Genetic control of RNA splicing and its distinct role in complex trait variation”, from a team led by Dr Ting Qi and Prof Jian Yang of the School of Life Sciences, Westlake University, was published online in Nature Genetics. The team developed a splicing quantitative trait locus (sQTL) mapping method, named THISTLE, with improved power over competing methods. Applying THISTLE together with a complementary sQTL mapping strategy to brain transcriptomic (n = 2,865) and genotype data, the team identified 12,794 genes with 1,864,200 cis-sQTLs, providing the most comprehensive collection of genetic variants associated with a broad spectrum of alternative splicing events in the human brain. By integrating the sQTL data with genome-wide association study (GWAS) data for twelve brain-related complex traits (including diseases), the team identified 244 genes associated with the traits through cis-sQTLs. About 61% of these genes could not be discovered using the corresponding expression QTL (eQTL) data, demonstrating the distinct role of most sQTLs in the genetic regulation of complex trait variation.

Most human traits, such as behaviours, physiological characteristics and disease susceptibilities, are influenced by many genetic variants, each with a small effect. The genome-wide association study (GWAS) is a widely used study design for detecting genetic variants associated with a trait or disease of interest. It compares the genetic information of a large cohort of people and uses statistical methods to scan the genome, variant by variant, for those associated with the trait. GWAS has led to the discovery of tens of thousands of genetic variants associated with human complex traits and common diseases. However, the function of most trait-associated variants is uncharacterized, and the mechanisms through which the variants exert their effects on the traits remain largely elusive.
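To make the idea of scanning the genome concrete, here is a minimal sketch in Python of a single-variant association scan on simulated data. It is an illustration only, not THISTLE or any method from the study: the function name `association_scan`, the cohort size, the allele frequency, the effect size and the choice of variant 7 as the causal variant are all assumptions made for the example, and the p-values use a simple normal approximation.

```python
import math
import random


def association_scan(genotypes, trait):
    """Test each variant separately by regressing the trait on allele dosage.

    genotypes: one list per variant, holding a 0/1/2 dosage for every person.
    trait: one trait value per person.
    Returns a (slope, z, p) tuple per variant.
    """
    n = len(trait)
    mean_y = sum(trait) / n
    results = []
    for dosages in genotypes:
        mean_x = sum(dosages) / n
        sxx = sum((x - mean_x) ** 2 for x in dosages)
        if sxx == 0:  # monomorphic variant: nothing to test
            results.append((0.0, 0.0, 1.0))
            continue
        sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(dosages, trait))
        slope = sxy / sxx
        # Residual sum of squares of the simple linear regression.
        rss = sum((y - mean_y - slope * (x - mean_x)) ** 2
                  for x, y in zip(dosages, trait))
        se = math.sqrt(rss / (n - 2) / sxx)
        z = slope / se
        p = math.erfc(abs(z) / math.sqrt(2))  # two-sided, normal approximation
        results.append((slope, z, p))
    return results


# Simulated cohort (all values are hypothetical, chosen for illustration).
rng = random.Random(0)
N_PEOPLE, N_VARIANTS, CAUSAL, EFFECT, FREQ = 2000, 50, 7, 0.5, 0.3

# Each dosage is the number of copies (0, 1 or 2) of the tested allele.
genotypes = [
    [(rng.random() < FREQ) + (rng.random() < FREQ) for _ in range(N_PEOPLE)]
    for _ in range(N_VARIANTS)
]
# Only the causal variant influences the trait; the rest is random noise.
trait = [EFFECT * genotypes[CAUSAL][i] + rng.gauss(0, 1) for i in range(N_PEOPLE)]

results = association_scan(genotypes, trait)
top = min(range(N_VARIANTS), key=lambda i: results[i][2])
print("strongest association at variant", top, "with p =", results[top][2])
```

With these simulated settings the scan should single out the causal variant, whose p-value falls far below 5e-8, the threshold conventionally used for genome-wide significance. Real analyses also adjust for covariates such as ancestry, which this sketch omits.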

Because most GWAS signals lie in non-coding regions of the genome, one hypothesis is that the genetic variants affect the traits through genetic regulation of gene expression. Over the past few years, Yang’s lab has developed a series of methods that integrate eQTL data with GWAS data to prioritize the genes responsible for GWAS signals. However, only a moderate proportion of GWAS signals have been attributed to cis-eQTLs. Likely reasons include limited power, spatiotemporal eQTL effects that occur only in specific tissues or cell types at specific developmental stages, the focus on genomic regions in cis, and mechanisms beyond genetic control of mRNA abundance.

Genetic control of pre-mRNA splicing, exerted by variants known as sQTLs, is another fundamental mechanism of gene regulation but is heavily underexplored compared with eQTLs. In this study, the team developed a powerful sQTL mapping method, THISTLE, and used both simulations and real data to demonstrate its improved power over competing methods. They then applied THISTLE, together with a complementary sQTL mapping method, to the largest publicly available brain transcriptomic dataset (n = 2,865) with genotype data to detect sQTLs. Finally, they integrated the sQTL summary statistics with GWAS data for twelve brain-related traits (including diseases) of large sample sizes (n = 51,710 to 766,345), prioritizing 244 genes associated with the traits through genetic regulation of splicing.

By comparing the sQTLs with the eQTLs identified in this study, they showed that ~61% of the sQTLs are distinct from the eQTLs, suggesting that sQTL mapping warrants more attention in future research. After integrating the sQTLs with GWAS data for the twelve brain-related traits, they found that ~61% of the trait-associated genes identified through sQTLs could not be discovered through eQTLs, demonstrating the distinct contribution of sQTLs to the genetic architecture underpinning complex trait variation.

THISTLE is efficient, robust and versatile, and is applicable to data from different tissues, cell types or even species. The team has also developed an online tool (https://yanglab.westlake.edu.cn/data/brainmeta) for visualizing or downloading the sQTL and eQTL summary statistics generated in this study. These datasets will help future studies to understand the molecular mechanisms underpinning the genetic regulation of splicing in the brain, identify functional genes and variants for other brain-related phenotypes, and improve genomic risk prediction.