Predicting the Prevalence of Complex Genetic Diseases from Individual Genotype Profiles Using Capsule Networks

Diseases with a complex genetic architecture, of which amyotrophic lateral sclerosis (ALS) is a prominent example, are hard to disentangle. "Human mind-friendly" methods, such as the linear approaches used in genome-wide association studies (GWAS), remain blind to many genetic variants that matter. Conversely, methods sophisticated enough to fully capture the characteristic interplay of genetic variants remain black boxes to the human mind.

Here, we make a determined and, as we argue, very promising attempt to bridge this gap. We present DiseaseCapsule, a capsule network-based approach that predicts the occurrence of a genetically complex disease from individual genotype profiles. Importantly, capsule networks are considerably easier to decompose into their components than ordinary deep neural networks, which promotes the interpretability of their results.

The data we used are from Project MinE, a large-scale study that aims to reveal the genetic and epigenetic mechanisms underlying ALS within a globally concerted collaboration. Specifically, we used data from the Dutch cohort of the project, which contains 7213 healthy ('control') individuals and 3192 individuals with ALS. The cohort includes 5208 females and 5197 males. All participants of the study were genotyped using an Illumina 2.5M SNP array.
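
As a quick sanity check on the cohort figures above, the short sketch below recomputes the cohort size and the class balance. The majority-class baseline it reports is our own illustrative reference point, not a result from the study, and the names `cohort_summary` and `majority_baseline_accuracy` are hypothetical.

```python
# Cohort counts as stated in the text (Dutch cohort of Project MinE).
N_CONTROL = 7213
N_ALS = 3192
N_FEMALE = 5208
N_MALE = 5197


def cohort_summary():
    """Summarise cohort size and class balance from the stated counts."""
    total = N_CONTROL + N_ALS
    # The case/control split and the sex split should describe the same people.
    if total != N_FEMALE + N_MALE:
        raise ValueError("case/control and sex counts disagree")
    return {
        "total": total,
        "als_fraction": N_ALS / total,
        # Illustrative assumption: accuracy of a trivial classifier that
        # always predicts the larger class.
        "majority_baseline_accuracy": max(N_CONTROL, N_ALS) / total,
    }


summary = cohort_summary()
print(summary)
```

Because roughly two thirds of the cohort are controls, raw accuracy on these data is only informative when read against this kind of trivial baseline.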