Studying moderate phenotypes associated with Mendelian disorder genes
Douglas Wightman & Danielle Posthuma

Rare genetic mutations can cause complex syndromes in which patients have symptoms in multiple organs. These disorders and mutations are catalogued in online databases such as OMIM and ClinVar. We have found that different mutations in genes linked to Mendelian disorders can lead to more moderate phenotypes that are harder to quantify, such as regional brain volume changes. This project will use the UK Biobank to investigate whether carriers of predicted loss-of-function and missense mutations in genes linked to Mendelian disorders present with more moderate phenotypes. Additionally, this project will investigate whether mutations in two separate genes in the same pathway lead to more severe phenotypes. This project is entirely computational and will require the use of bash scripting in a UNIX environment and R. The statistical associations will be determined using a regression framework. A literature review of Mendelian disorder genes and epistatic relationships at the start of the project would be useful for gene and phenotype selection.
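
As a rough illustration of the regression framework mentioned above, the sketch below regresses a quantitative phenotype on carrier status for two genes plus their interaction term, which is one common way to ask whether carrying mutations in both genes is worse than expected from each alone. It is written in Python rather than R, all names (`fit_ols`, `carriers`) and all numbers are hypothetical, and a real analysis would add covariates such as age, sex and ancestry components and would use an established statistics package.

```python
def fit_ols(design, outcome):
    """Ordinary least squares via the normal equations (X'X) b = X'y,
    solved with Gauss-Jordan elimination. Illustrative only."""
    p = len(design[0])
    # Build the augmented matrix [X'X | X'y].
    aug = []
    for i in range(p):
        row = [sum(r[i] * r[j] for r in design) for j in range(p)]
        row.append(sum(r[i] * y for r, y in zip(design, outcome)))
        aug.append(row)
    for col in range(p):
        # Partial pivoting for numerical stability.
        pivot = max(range(col, p), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for r in range(p):
            if r != col:
                factor = aug[r][col]
                aug[r] = [vr - factor * vc for vr, vc in zip(aug[r], aug[col])]
    return [aug[i][p] for i in range(p)]


# Made-up toy data: (carrier of gene A, carrier of gene B, phenotype value).
carriers = [
    (0, 0, 10.0), (0, 0, 10.0),
    (1, 0, 9.0), (1, 0, 9.0),
    (0, 1, 9.5), (0, 1, 9.5),
    (1, 1, 6.5), (1, 1, 6.5),
]

# Design matrix columns: intercept, gene A, gene B, A x B interaction.
design = [[1.0, a, b, a * b] for a, b, _ in carriers]
phenotype = [y for _, _, y in carriers]

intercept, effect_a, effect_b, interaction = fit_ols(design, phenotype)
print(intercept, effect_a, effect_b, interaction)
```

In this toy data the interaction coefficient is negative, meaning double carriers have a lower phenotype value than the two single-gene effects would predict additively.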

Explaining comorbidities in Alzheimer’s disease patients
Douglas Wightman & Danielle Posthuma

Alzheimer’s disease is partially heritable and partially attributable to environmental influences. Alzheimer’s disease patients differ in the number and type of other diseases (comorbidities) they present with. The cause of these differences in comorbidities may be environmental or genetic. This project will cluster Alzheimer’s disease cases by their comorbidities to find groups of cases that share similar diseases and then use genetic and environmental data in the UK Biobank to find associations that explain the clusters. Date of diagnosis information can also be used to identify clusters of comorbidities that occur before and after Alzheimer’s disease diagnosis. This project is entirely computational and will require the use of bash scripting in a UNIX environment and R (or Python). The statistical associations will be determined using a regression framework. Clustering can be performed using latent class analysis or other methods (including machine learning).
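
To make the clustering step more concrete, here is a minimal sketch of a two-class latent class model fitted with the EM algorithm on binary comorbidity indicators. Everything in it is an assumption for illustration: the function name `fit_latent_classes`, the fixed choice of two classes, the deterministic starting values, the smoothing constants and the made-up patient data. A real analysis would compare several numbers of classes, use multiple random starts and rely on an established package.

```python
def fit_latent_classes(rows, n_iter=50):
    """Two-class latent class model for binary indicators, fitted by EM.

    rows: list of equal-length lists of 0/1 values (one list per patient).
    Returns class weights, per-class indicator probabilities and the
    posterior class membership probabilities for each patient.
    """
    n, m = len(rows), len(rows[0])
    # Deterministic start: seed the classes from the first and last patient.
    theta = [[0.25 + 0.5 * x for x in rows[0]],
             [0.25 + 0.5 * x for x in rows[-1]]]
    weight = [0.5, 0.5]
    resp = []
    for _ in range(n_iter):
        # E-step: posterior probability of each class for each patient.
        resp = []
        for row in rows:
            lik = []
            for k in range(2):
                p = weight[k]
                for x, t in zip(row, theta[k]):
                    p *= t if x else (1.0 - t)
                lik.append(p)
            total = sum(lik)
            resp.append([value / total for value in lik])
        # M-step: update class weights and indicator probabilities.
        # The +0.5 / +1.0 smoothing keeps probabilities away from 0 and 1.
        for k in range(2):
            mass = sum(r[k] for r in resp)
            weight[k] = mass / n
            theta[k] = [
                (sum(r[k] * row[j] for r, row in zip(resp, rows)) + 0.5)
                / (mass + 1.0)
                for j in range(m)
            ]
    return weight, theta, resp


# Made-up data: columns are four hypothetical comorbidities (1 = diagnosed).
patients = [
    [1, 1, 0, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 0, 1],
]

weight, theta, resp = fit_latent_classes(patients)
labels = [max(range(2), key=lambda k: r[k]) for r in resp]
print(labels)
```

The resulting class labels could then serve as the outcome in a regression on genetic or environmental variables.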

Comparing the association between genetic variants and expression on a gene and transcript level
Douglas Wightman & Danielle Posthuma

Genome-wide association studies (GWAS) use expression quantitative trait loci (eQTLs) to help connect variants associated with a phenotype to a specific gene. eQTLs are genetic variants that affect how much a gene is expressed. A gene can encode multiple versions (transcripts), and these transcripts are present in different quantities in different cell types. eQTLs can therefore be measured for the expression of a single transcript or at the gene level (an average of the transcripts). Most GWAS use gene-level eQTLs to connect variants to genes and may be missing transcript-specific effects. This project will estimate the local genetic correlation between gene-level eQTLs and transcript-level eQTLs using the eQTL Catalogue data. Then, transcripts whose eQTLs are distinct from the gene-level eQTLs will be identified. These transcript-specific eQTLs will be used to connect genetic variants associated with diseases to specific transcripts. These data may then give insight into which cell types are relevant to the GWAS phenotype. This project is entirely computational and will require the use of bash scripting in a UNIX environment and R (or Python). The statistical associations will be determined using a regression framework.

Causal and phenome-wide approaches to DZ twinning
Dorret Boomsma & Nikki Hubers

Twinning, i.e. a pregnancy resulting in more than one offspring, is widely observed in mammalian species. In humans, dizygotic (DZ) twinning is related to other evolutionarily important traits such as fertility and longevity.
Paradoxically, DZ twinning is more common in women who smoke and who have a higher BMI, traits that are otherwise related to female infertility risk (Hoekstra et al., 2010). To test possible causal pathways, we propose a Mendelian randomization (MR) study (e.g. Sanderson et al., 2022) based on the latest GWAS meta-analysis of DZ twinning (Mbarek et al., 2024) with smoking, BMI and possibly other lifestyle traits.
A more extended phenome-wide association study (PheWAS) can be carried out by applying latent causal variable (LCV) methods (see for example Aman et al., 2022). Both approaches take published GWAS as input.
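
For orientation, a basic two-sample MR analysis can be sketched with the inverse-variance weighted (IVW) estimator, which combines per-variant Wald ratios (outcome effect divided by exposure effect) using GWAS summary statistics. The sketch below is in Python, the function name `ivw_estimate` is hypothetical, and the effect sizes are invented for illustration; they are not taken from any of the cited studies. It uses the simple fixed-effect standard error and ignores pleiotropy, which dedicated MR software handles with additional sensitivity analyses.

```python
import math


def ivw_estimate(beta_exposure, beta_outcome, se_outcome):
    """Fixed-effect inverse-variance weighted MR estimate.

    Equivalent to a weighted mean of the Wald ratios beta_outcome / beta_exposure
    with weights beta_exposure**2 / se_outcome**2.
    """
    weights = [bx * bx / (se * se) for bx, se in zip(beta_exposure, se_outcome)]
    numerator = sum(
        bx * by / (se * se)
        for bx, by, se in zip(beta_exposure, beta_outcome, se_outcome)
    )
    estimate = numerator / sum(weights)
    std_error = math.sqrt(1.0 / sum(weights))
    return estimate, std_error


# Invented summary statistics for three variants:
# effects on the exposure (e.g. BMI) and on the outcome (e.g. DZ twinning).
beta_bmi = [0.10, 0.20, 0.30]
beta_twinning = [0.05, 0.10, 0.15]
se_twinning = [0.01, 0.02, 0.01]

estimate, std_error = ivw_estimate(beta_bmi, beta_twinning, se_twinning)
print(estimate, std_error)
```

Here every variant has the same Wald ratio, so the IVW estimate equals that common ratio; with real data the ratios differ and their heterogeneity is itself informative about pleiotropy.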

Aman AM, García-Marín LM, Thorp JG, Campos AI, Cuellar-Partida G, Martin NG, Rentería ME. Phenome-wide screening of the putative causal determinants of depression using genetic data. Hum Mol Genet. 2022;31(17):2887-2898. doi:10.1093/hmg/ddac081.
Hoekstra C, Willemsen G, van Beijsterveldt CE, Lambalk CB, Montgomery GW, Boomsma DI. Body composition, smoking, and spontaneous dizygotic twinning. Fertil Steril. 2010;93(3):885-893. doi:10.1016/j.fertnstert.2008.10.012.
Mbarek H, Gordon SD, Duffy DL, Hubers N, Mortlock S, Beck JJ, Hottenga JJ, Pool R, Dolan CV, Actkins KV, Gerring ZF, Van Dongen J, Ehli EA, Iacono WG, McGue M, Chasman DI, Gallagher CS, Schilit SLP, Morton CC, Paré G, Willemsen G, Whiteman DC, Olsen CM, Derom C, Vlietinck R, Gudbjartsson D, Cannon-Albright L, Krapohl E, Plomin R, Magnusson PKE, Pedersen NL, Hysi P, Mangino M, Spector TD, Palviainen T, Milaneschi Y, Penninx BW, Campos AI, Ong KK, Perry JRB, Lambalk CB, Kaprio J, Ólafsson Í, Duroure K, Revenu C, Rentería ME, Yengo L, Davis L, Derks EM, Medland SE, Stefansson H, Stefansson K, Del Bene F, Reversade B, Montgomery GW, Boomsma DI, Martin NG. Genome-wide association study meta-analysis of dizygotic twinning illuminates genetic regulation of female fecundity. Hum Reprod. 2024;39(1):240-257. doi:10.1093/humrep/dead247.
Sanderson E, Glymour MM, Holmes MV, Kang H, Morrison J, Munafò MR, Palmer T, Schooling CM, Wallace C, Zhao Q, Smith GD. Mendelian randomization. Nat Rev Methods Primers. 2022;2:6. doi:10.1038/s43586-021-00092-5.