Data

Gas chromatography-mass spectrometry (GC-MS)-based metabolomics was used to identify molecular signatures of Type 1 Diabetes (T1D) in urine. We use three cohorts at different stages post-diagnosis: new onset (IUSOM), within one year of diagnosis (BDCD), and 6 years after diagnosis (CNMC).

A total of 91 metabolites were identified in all three datasets, with complete data in each cohort.
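
One way to arrive at such a shared set is to keep only the metabolites that are fully measured in every cohort. The sketch below is illustrative only: the data layout, the function name, and the toy metabolite names and values are assumptions, not taken from the project code.

```python
def shared_complete_metabolites(cohorts):
    """Return metabolites present, with no missing values, in every cohort.

    `cohorts` maps a cohort name to {metabolite name: list of sample values};
    None marks a missing measurement. This layout is an assumption.
    """
    complete_sets = []
    for columns in cohorts.values():
        complete = {
            name for name, values in columns.items()
            if all(v is not None for v in values)
        }
        complete_sets.append(complete)
    return sorted(set.intersection(*complete_sets))


# Toy data with hypothetical metabolite names and values.
toy = {
    "CNMC": {"glucose": [1.0, 2.0], "alanine": [0.5, 0.7], "citrate": [0.1, None]},
    "BDCD": {"glucose": [1.5, 2.5], "alanine": [0.4, 0.6], "citrate": [0.2, 0.3]},
    "IUSOM": {"glucose": [3.0, 4.0], "alanine": [0.9, 0.8]},
}
shared = shared_complete_metabolites(toy)
print(shared)  # ['alanine', 'glucose']
```

Here citrate is dropped because it has a missing value in one cohort and is absent from another.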

Cohort 1 (CNMC):

  • 32 T1D cases; 32 healthy sibling controls

Cohort 2 (BDCD):

  • 27 T1D cases; 27 healthy sibling controls

Cohort 3 (IUSOM):

  • 12 T1D cases; 20 healthy controls
  • Cases and controls matched on sex and age

Cohort  Group                Time Post-Dx  Age (std dev)  % Female
CNMC    Control (Sibling)    -             11.56 (3.54)   53.12
CNMC    T1D (Sibling)        6 yrs         11.94 (3.13)   46.88
BDCD    Control (Sibling)    -             11.44 (2.74)   37.04
BDCD    T1D (Sibling)        1 yr          10.37 (2.49)   25.93
IUSOM   Control (Unrelated)  -             12.1 (3.74)    40.00
IUSOM   T1D (Unrelated)      48 hrs        10.75 (3.19)   25.00

Owners: The Diabetes Autoimmunity Study in the Young (DAISY), Barbara Davis Center for Diabetes (BDCD), Children’s National Medical Center (CNMC), Indiana University School of Medicine (IUSOM)
Objective: Predict new-onset T1D from available metabolomic biomarkers, and evaluate model generalizability by measuring cross-cohort predictive performance.
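
Cross-cohort evaluation can be pictured as a train-on-one, test-on-each loop. The sketch below is a minimal illustration under stated assumptions: AUC is assumed as the performance metric (the source does not name one), `fit_mean_difference` is a toy stand-in for the random forest, and all names and data are hypothetical.

```python
def auc(labels, scores):
    """Area under the ROC curve via pairwise comparison (ties count as 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def cross_cohort_auc(cohorts, fit):
    """Train on each cohort, score every cohort; return {(train, test): AUC}.

    `cohorts` maps name -> (X, y); `fit(X, y)` returns a function mapping
    rows of X to scores. Pairs with train == test reuse the training data,
    so those values are optimistic.
    """
    results = {}
    for train_name, (X_train, y_train) in cohorts.items():
        predict = fit(X_train, y_train)
        for test_name, (X_test, y_test) in cohorts.items():
            results[(train_name, test_name)] = auc(y_test, predict(X_test))
    return results


def fit_mean_difference(X, y):
    """Toy stand-in for a random forest: score by one feature, signed by class means."""
    cases = [row[0] for row, label in zip(X, y) if label == 1]
    controls = [row[0] for row, label in zip(X, y) if label == 0]
    sign = 1.0 if sum(cases) / len(cases) >= sum(controls) / len(controls) else -1.0
    return lambda rows: [sign * row[0] for row in rows]


# Two made-up cohorts with a single feature each (label 1 = T1D, 0 = control).
toy = {
    "A": ([[0.1], [0.2], [0.8], [0.9]], [0, 0, 1, 1]),
    "B": ([[0.3], [0.6], [0.5], [0.7]], [0, 0, 1, 1]),
}
results = cross_cohort_auc(toy, fit_mean_difference)
print(results[("A", "B")])  # 0.75
```

The off-diagonal entries, such as train on A and test on B, are the ones that speak to generalizability.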

Approach

Model: Random Forest

Preprocessing: A near-zero variance filter was applied to the data prior to model fitting. No predictors were removed by this filter. Metabolite ratios were additionally computed for all unique pairs of the 91 measured metabolites in each cohort (91 × 90 / 2 = 4095 ratios per cohort).
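
The two preprocessing steps can be sketched as follows. This is an illustration, not the project's code: the near-zero variance cutoffs mirror the defaults commonly used by R's caret package and are assumed here, the direction of each ratio (a/b rather than b/a) is arbitrary, and the metabolite names and values are placeholders.

```python
from collections import Counter
from itertools import combinations


def is_near_zero_variance(values, freq_cut=95 / 5, unique_cut=10.0):
    """Flag a predictor using caret-style cutoffs (assumed, not from the source)."""
    counts = Counter(values).most_common(2)
    if len(counts) == 1:
        return True  # a constant column carries no information
    freq_ratio = counts[0][1] / counts[1][1]
    percent_unique = 100.0 * len(set(values)) / len(values)
    return freq_ratio > freq_cut and percent_unique < unique_cut


def pairwise_ratios(sample):
    """Return {"a/b": value} for every unordered pair of metabolites in one sample.

    Assumes no denominator is zero; real data would need a guard for that.
    """
    ratios = {}
    for a, b in combinations(sorted(sample), 2):
        ratios[f"{a}/{b}"] = sample[a] / sample[b]
    return ratios


# Hypothetical sample with 91 placeholder metabolites (values are made up).
sample = {f"m{i:02d}": float(i + 1) for i in range(91)}
ratios = pairwise_ratios(sample)
print(len(ratios))  # 4095, i.e. 91 * 90 / 2
```

Continuous metabolite intensities rarely repeat, which fits the observation that the filter removed no predictors.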

Tuning: Grid search
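
Grid search evaluates every combination of candidate hyperparameter values and keeps the best-scoring one. The sketch below shows the idea only. The source does not list the tuned hyperparameters or the scoring scheme, so `n_trees`, `max_features`, and `toy_score` are hypothetical; in practice the score would be a cross-validated performance estimate of the random forest.

```python
from itertools import product


def grid_search(param_grid, score_fn):
    """Evaluate every combination in `param_grid`; return (best_params, best_score)."""
    names = sorted(param_grid)
    best_params, best_score = None, float("-inf")
    for values in product(*(param_grid[name] for name in names)):
        params = dict(zip(names, values))
        score = score_fn(params)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score


# Hypothetical grid; the source does not list the tuned hyperparameters.
grid = {"n_trees": [100, 500, 1000], "max_features": [5, 9, 20]}


def toy_score(params):
    """Stand-in for a cross-validated score of a random forest (made up)."""
    return -abs(params["max_features"] - 9) - abs(params["n_trees"] - 500) / 1000


best_params, best_score = grid_search(grid, toy_score)
print(best_params)  # {'max_features': 9, 'n_trees': 500}
```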

Final Models:

Data are accessible on DataHub. Code for data processing, model tuning, and final model fitting/evaluation is available on GitHub.

Results

Non-ratio models: Cross-cohort Predictive Performance
Ratio models: Cross-cohort Predictive Performance

Non-ratio models: Most Important Predictors Across Cohorts
Ratio models: Most Important Predictors Across Cohorts

Note: Metabolites (or metabolite ratios) highlighted in red showed statistically significant differences at the 0.05 level in at least one cohort.