SEARCH for Diabetes in Youth Study - Genetic Risk Score

Genetic risk scores (GRS) summarize polygenic disease risk in a single measure and can aid in disease classification. GRS studies have focused primarily on adult populations of recent European ancestry. We aimed to assess the utility of GRS for classifying diabetes type among racially and ethnically diverse youth in the United States. We used data from SEARCH baseline visits and follow-up visits for which key data elements were available. A total of 2,260 participants are included.
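To illustrate how a GRS collapses many variants into one number, the following is a minimal sketch of one common construction: a weighted sum of risk-allele dosages. It is not necessarily the scoring method used in this study. The variant identifiers, weights, and the function name `genetic_risk_score` are hypothetical and chosen only for illustration.

```python
def genetic_risk_score(dosages, weights):
    """Return the weighted sum of risk-allele dosages for one participant.

    dosages: dict mapping variant ID -> risk-allele dosage (0 to 2)
    weights: dict mapping variant ID -> per-allele weight (for example a log odds ratio)
    Both inputs are illustrative assumptions, not values from the study.
    """
    missing = set(weights) - set(dosages)
    if missing:
        raise KeyError(f"no dosage for variants: {sorted(missing)}")
    return sum(weights[v] * dosages[v] for v in weights)


# Hypothetical weights and one hypothetical participant.
weights = {"rs_example_1": 0.50, "rs_example_2": 0.25, "rs_example_3": 1.00}
dosages = {"rs_example_1": 2, "rs_example_2": 1, "rs_example_3": 0}

score = genetic_risk_score(dosages, weights)
print(score)  # 0.50*2 + 0.25*1 + 1.00*0 = 1.25
```

Imputed data typically give fractional dosages between 0 and 2, which this function handles without change.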

Phenotypic Data to be submitted here include:

We performed genotyping using the Illumina Multi-Ethnic Global Array (MEGA), which has 1,697,069 genotyped variants, including 748,291 with minor allele frequency <0.01. Genotyping and preliminary quality control checks were performed at the Colorado Center for Personalized Medicine. After additional quality control, 2,238 samples and 900,743 variants remained for analyses. Samples genotyped on the MEGA array were categorized using SEARCH etiologic type and consisted predominantly of type 1 diabetes cases (n=2,051), along with type 2 diabetes (n=133) and other diabetes (n=52). The other diabetes group includes monogenic diabetes, genetically confirmed by either a clinical genetic test or a test performed as an ancillary study to SEARCH. The median reported age at DNA collection was 11.2 years (interquartile range 7.6 – 14.1), with a minimum age of 1.9 years and a maximum of 21.9 years.
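For readers unfamiliar with the minor allele frequency threshold mentioned above, here is a small sketch of how the frequency can be computed for a single biallelic variant from genotype calls coded as alternate-allele counts. The coding scheme, the example genotypes, and the function name `minor_allele_frequency` are assumptions for illustration and do not describe the study's quality control pipeline.

```python
def minor_allele_frequency(genotypes):
    """Return the minor allele frequency for one biallelic variant.

    genotypes: list of alternate-allele counts (0, 1, or 2), with None for
    missing calls. Returns None if no genotype was called.
    """
    called = [g for g in genotypes if g is not None]
    if not called:
        return None
    # Each called diploid sample contributes two alleles.
    alt_freq = sum(called) / (2 * len(called))
    # The minor allele is whichever of the two alleles is less common.
    return min(alt_freq, 1 - alt_freq)


# Hypothetical variant: 10 called samples, one heterozygote, one missing call.
example = [0, 0, 1, 0, None, 0, 0, 0, 0, 0, 0]
maf = minor_allele_frequency(example)  # 1 alternate allele out of 20
is_rare = maf < 0.01                   # threshold quoted in the text
print(maf, is_rare)
```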

We used additional data genotyped on the Affymetrix 500K imputation scaffold chip, which has 239,279 genotyped variants. This cohort consisted predominantly of type 2 diabetes cases (n=417), along with type 1 diabetes (n=104) and other diabetes types (n=16). After additional quality control, 537 samples and 235,967 variants remained. The median reported age at collection was 11.2 years (interquartile range 8.1 – 14.2), with a minimum of 2.0 years and a maximum of 21.1 years. About 100 type 2 diabetes cases were also genotyped on the MEGA chip to check concordance between the data sets and to facilitate strand alignment between the two chips before genotype imputation.

The two data sets had 230,228 genotyped variants in common, and concordance between the genotypes was high (mean correlation r² = 0.95 for SNPs used in the GRS). We used the 1000 Genomes reference panel to impute each data set separately, resulting in 34.5M and 27.8M well-imputed variants (imputation r² > 0.8) for the MEGA and Affymetrix data sets, respectively. We combined the high-quality imputed variants for analysis.
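As a rough illustration of the concordance measure, the sketch below computes the squared Pearson correlation between genotype calls for the same samples at one variant on two arrays. The genotype vectors and the function name `squared_correlation` are hypothetical, and the study's actual concordance procedure may differ in detail.

```python
from statistics import mean


def squared_correlation(x, y):
    """Return the squared Pearson correlation between two genotype vectors.

    x, y: equal-length lists of allele counts (0, 1, or 2) for the same
    samples on two arrays. Returns None if either vector has no variation.
    """
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length vectors with at least 2 samples")
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return None  # correlation is undefined for a monomorphic variant
    return (sxy * sxy) / (sxx * syy)


# Hypothetical calls for six samples typed on both arrays.
mega_calls = [0, 1, 2, 1, 0, 2]
affy_calls_same = [0, 1, 2, 1, 0, 2]     # fully concordant
affy_calls_one_off = [0, 1, 2, 1, 0, 1]  # one discordant call

r2_same = squared_correlation(mega_calls, affy_calls_same)
r2_one_off = squared_correlation(mega_calls, affy_calls_one_off)
print(r2_same, r2_one_off)
```

Averaging this quantity over a set of variants gives a mean concordance figure of the kind quoted above.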