iCPAGdb - A hypothesis engine for cross-phenotype genetic associations connecting molecular, cellular, and human disease phenotypes


1. Select a data set to review

2. Filter
1. Compute
p-threshold1 (factor1 × 10^-x1)
p-threshold2 (factor2 × 10^-x2)
Note: p-threshold maximums are H2P2 = 1 × 10^-5, NHGRI = 5 × 10^-8, all others = 1 × 10^-3

2. Filter
1. Upload a GWAS file
Note: Maximum file size is 1GB. Expected upload time is approximately 30 seconds per 100MB. For faster upload, reduce your input file to two columns (SNP and P-value) and/or pre-clump it to keep only lead SNPs at your desired threshold. Upload progress is indicated in the bar below the "Browse" button. To download a sample GWAS file for review, click here: sample GWAS file (severe COVID-19, Ellinghaus et al. 2020)

2. Compute
p-threshold1 (factor1 × 10^-x1)
p-threshold2 (factor2 × 10^-x2)
Note: p-threshold maximums are H2P2 = 1 × 10^-5, NHGRI = 5 × 10^-8, all others = 1 × 10^-3

3. Filter

Quick-start guide

Review iCPAGdb: Explore pre-calculated iCPAGdb results

  1. Using the selection table (in section 1) click the row with the pair of GWAS datasets that you want to compare (combinations of NHGRI-EBI GWAS catalog, metabolomics, or cellular host-pathogen traits) at the specified p-value thresholds and with the specified population used in calculating LD. Results appear in the "Table" and "Heatmap" sections below the query controls.
    • Table: Each pairwise trait combination is listed (Trait1, Trait2) along with the number of SNPs that overlap directly (Nsharedirect), the number that overlap by LD-proxy (NshareLD), and the number that overlap overall (SNPshareall); -log10(p-values) calculated by Fisher’s exact test (-log10(PFisher)) and corrected by either Benjamini-Hochberg (-log10(PadjustFDR)) or Bonferroni (-log10(PBonferroni)); the shared SNPs (SNPshared); similarity indices (Jaccard and Chao-Sorensen); and ontology information based on EFO codes in the NHGRI-EBI GWAS Catalog. Results can be sorted by any column or filtered by text in any column.
    • Heatmap: By default, the heatmap is based on the Chao-Sorensen similarity index, but options for Jaccard similarity or -log10(PFisher) are also available. Hover over cells to inspect the plotted value for pairs of phenotypes. Click and drag to zoom in on a particular region of the heatmap.
  2. The Filter section (section 2) has controls to alter the contents of the results table and heatmap:
    • Include all SNPs check-box: The SNPshared column of the output lists all shared SNPs between the two traits, but by default lists only the first 100 characters for easier viewing. Checking this box overrides the default limit and will cause display of the complete shared SNP string.
    • Trait (phenotype), SNP, and EFO parent (for grouping) filter specifiers: Table records and heatmap cells are limited to phenotype pairs in which either GWAS set contains the specified filter values. Multiple EFOs can be selected.
    • Include compound EFOs check-box: By default, "compound phenotypes" that list more than one phenotype in NHGRI-EBI were included as part of iCPAGdb analyses but are not included in output, as it is unclear which phenotype was tested for genetic association. However, these phenotypes can be made visible, and will be plotted, by checking this box.
    • Heatmap metric selector: Switches between the Fisher, Bonferroni, FDR, Jaccard, and Chao-Sorensen measures.
    • Top significant phenotypes to plot: Limits the heatmap to the most significant phenotypic relationships (start small, rendering takes time)
  3. Click the "Table" or "Heatmap" buttons to alternate between views
  4. Click the "Download" button to download the current table of results (filters applied)
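
The table columns described in step 1 can be illustrated with a small sketch. It is not the iCPAGdb implementation: the function names, the LD-proxy lookup (ld_proxies), the size of the SNP universe, and the one-sided form of Fisher's exact test are all assumptions made for illustration, and the Chao-Sorensen index is omitted.

```python
from math import comb, log10

def shared_snps(snps1, snps2, ld_proxies):
    """Return (direct, via_ld): trait-1 SNPs shared with trait 2.

    ld_proxies is a hypothetical dict mapping a SNP to the set of SNPs in LD with it.
    """
    s1, s2 = set(snps1), set(snps2)
    direct = s1 & s2
    via_ld = {s for s in s1 - direct if ld_proxies.get(s, set()) & s2}
    return direct, via_ld

def jaccard(snps1, snps2):
    """Jaccard similarity: size of intersection over size of union."""
    s1, s2 = set(snps1), set(snps2)
    union = s1 | s2
    return len(s1 & s2) / len(union) if union else 0.0

def fisher_enrichment_p(n_shared, n1, n2, n_universe):
    """One-sided Fisher's exact p-value: P(overlap >= n_shared), hypergeometric tail.

    n_universe (total SNPs that could have been shared) is an assumed input.
    """
    total = comb(n_universe, n2)
    tail = sum(comb(n1, k) * comb(n_universe - n1, n2 - k)
               for k in range(n_shared, min(n1, n2) + 1))
    return tail / total

def bonferroni(pvals):
    """Multiply each p-value by the number of tests, capped at 1."""
    m = len(pvals)
    return [min(1.0, p * m) for p in pvals]

def benjamini_hochberg(pvals):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):  # walk from the largest p-value down
        i = order[rank - 1]
        running_min = min(running_min, pvals[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

# Toy data; the rsIDs and LD map are made up.
trait1 = ["rs1", "rs2", "rs3", "rs4"]
trait2 = ["rs3", "rs4", "rs5"]
direct, via_ld = shared_snps(trait1, trait2, {"rs1": {"rs5"}})
p = fisher_enrichment_p(len(direct), len(trait1), len(trait2), n_universe=10)
print(sorted(direct), sorted(via_ld), jaccard(trait1, trait2), -log10(p))
```

In this sketch the direct and LD-proxy counts correspond to Nsharedirect and NshareLD, and the -log10 transform matches the way the p-value columns are reported in the table.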

Upload GWAS and compute: Upload your own GWAS summary statistics (SNPs and p-values for a desired phenotype), then have iCPAGdb clump the data and run analyses against the datasets in iCPAGdb

  1. Click "Browse" to locate a GWAS summary file on your computer. Data reside on our server only for the purpose and duration of computation. Your data are deleted immediately after results are returned to you. Results are for research purposes only. No interpretation of significance of any reported result is implied. The input file should contain comma- or tab-delimited SNP and p columns, with column headings. Once a local file is selected, the first five records are displayed so that you can verify that the uploaded file contains the observations you intend to use and review the column headers. Tab and comma delimiters are also examined in the first five records.
  2. Specify the delimiter style of the uploaded file (comma or tab)
  3. Enter the column headers, as they appear in the file, for the SNP and p columns
  4. Select an iCPAGdb dataset (GWAS source 2) to be compared to your GWAS (GWAS source 1)
  5. Use the p-threshold sliders to select filtering limits for both datasets. Any SNPs in your data, or SNP-phenotype combinations in the iCPAGdb data, with p-values above the corresponding threshold are excluded from analysis. The p-threshold for your GWAS ranges from 5 × 10^-3 to 1 × 10^-20. Maximum thresholds for iCPAGdb data vary by source: 5 × 10^-8 for NHGRI, 1 × 10^-5 for H2P2, and 1 × 10^-3 for all others.
  6. Select the continental group to use in identification of shared SNPs by LD-proxy
  7. Click "Compute CPAG"
  8. Filter results as in Review
  9. Click the "Download" button to download the current (filtered) contents of the results table
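
Before uploading, a file can be reduced to its SNP and p columns and trimmed to a chosen threshold. The following is a minimal sketch using only the Python standard library. The function names and the example column headers (SNP, CHR, P) are hypothetical, the delimiter guess is a simple character count rather than the check the portal performs, and no clumping is attempted.

```python
import csv
import io

def p_threshold(factor, exponent):
    """Build a threshold of the form factor x 10^-exponent."""
    return factor * 10.0 ** (-exponent)

def guess_delimiter(text, n_records=5):
    """Choose tab or comma by counting both in the header and first records."""
    head = text.splitlines()[: n_records + 1]
    tabs = sum(line.count("\t") for line in head)
    commas = sum(line.count(",") for line in head)
    return "\t" if tabs > commas else ","

def reduce_gwas(text, snp_col, p_col, threshold, delimiter=None):
    """Keep only the SNP and p columns, and rows with p <= threshold."""
    delimiter = delimiter or guess_delimiter(text)
    reader = csv.DictReader(io.StringIO(text), delimiter=delimiter)
    out = io.StringIO()
    writer = csv.writer(out, delimiter=delimiter, lineterminator="\n")
    writer.writerow([snp_col, p_col])
    for row in reader:
        if float(row[p_col]) <= threshold:
            writer.writerow([row[snp_col], row[p_col]])
    return out.getvalue()

# Toy input with made-up records; a real file would be read from disk.
sample = "SNP\tCHR\tP\nrs1\t1\t2e-9\nrs2\t3\t0.04\nrs3\t9\t1e-6\n"
print(reduce_gwas(sample, "SNP", "P", p_threshold(1, 5)))
```

Rows with p-values above the threshold are dropped, mirroring the exclusion rule in step 5, and the smaller two-column file should upload faster.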


About iCPAGdb

iCPAGdb (interactive Cross-Phenotype Analysis of GWAS database) provides an atlas of human traits connected through shared genetic architecture. While genome-wide association studies (GWAS) have successfully identified thousands of genetic variants associated with human diseases and traits, understanding how genetic differences impact disease risk and severity remains a formidable challenge. iCPAGdb integrates the results of GWAS across phenotypic scales, identifying and quantifying the significance of pleiotropic loci that impact molecular, cellular, and organismal traits. The goal is to provide a resource that allows experts on a particular human trait to easily develop hypotheses for molecular and cellular phenotypes that underlie the physiology of that trait. Molecules and cellular pathways implicated in this way could serve as novel biomarkers or targets for therapeutic approaches.

iCPAGdb leverages:
  1. the huge and expanding catalog of past GWAS of human diseases and traits collected at the NHGRI-EBI GWAS Catalog
  2. GWAS of molecular and cellular traits. Specifically for:

Users can explore pre-computed iCPAGdb output for these datasets or upload their own GWAS results for rapid comparison against these datasets. As an example, we demonstrate the utility of iCPAGdb with a published GWAS of severe COVID-19: Ellinghaus D, Degenhardt F, Bujanda L, Buti M, Albillos A, Invernizzi P, et al. Genomewide Association Study of Severe Covid-19 with Respiratory Failure. N Engl J Med. 2020.

iCPAGdb publication: An atlas connecting shared genetic architecture of human diseases and molecular phenotypes provides insight into COVID-19 susceptibility

Info on the web app: iCPAGdb web app design

Contacts:
Regarding iCPAGdb: Liuyang Wang or Dennis Ko
Regarding iCPAGdb Portal: Liuyang Wang or Tom Balmat
