
Genes

The official gene sets of the IGVF Catalog come from GENCODE releases. Genes, transcripts, and gene structure (UTRs, exons, etc.) are loaded from the comprehensive annotation file as nodes. This includes protein-coding genes, lncRNAs, and all other gene types annotated by GENCODE.

Coding Variant Functional and Prediction Scores

[protein page]

Proteins Table

This table displays information about proteins associated with the gene.
Column | Description
Protein ID | Unique identifier for the protein (click for more details)
Protein Name | Common name of the protein
Full Name | Complete scientific name of the protein
Source | Origin of the protein information (click for source details if available)

Transcripts Table

This table shows different transcripts (RNA versions) of the gene. Genes are linked to transcripts via gene-transcript edges (also from GENCODE).
Column | Description
Region | Genomic location of the transcript (click to view the region)
Transcript ID | Unique identifier for the transcript (click for more details)
Transcript Name | Name of the transcript (click for more information)
Gene Name | Name of the gene this transcript belongs to (click for gene details)
Source | Origin of the transcript information (click for source details if available)
Version | Version number of the transcript information

Variants Table

Variants are linked to genes through both functional characterization experiments and QTLs. For example, the dataset IGVFDS4359OODY reports the effects of 183 variants in the PPIF promoter, measured with a CRISPR method called Variant-EFFECTS.
eQTLs and splice QTLs are encoded as gene-variant edges. These datasets have been loaded from the EBI eQTL Catalogue.
Additional QTLs have been loaded from the African Functional Genomics Resource (AFGR).

Enhancer-Gene Prediction

This table reports predicted enhancers for this gene from the ENCODE-rE2G model and their relevant cell types. Each row reports one predicted enhancer and cell type.
Column | Description
Cell Type | Cell type in which the enhancer is predicted to regulate the gene
Score | Strength of the prediction (range: 0 to 1, higher indicates a more confident prediction)
Start | Genomic start coordinate of the enhancer
End | Genomic end coordinate of the enhancer
Dataset | Source dataset (click for more details)
Model | Predictive model. Currently: ENCODE-rE2G
Distance (Enhancer to Gene) | Genomic distance between the enhancer and gene body
Scores range from 0 (no prediction) to 1 (confident prediction). Currently, this table includes predictions from the ENCODE-rE2G model across 1700 ENCODE biosamples (see Gschwind et al., bioRxiv 2023). The table is initially sorted by Score in descending order, showing the strongest predictions first.
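If you export the rows of this table, the default ordering can be reproduced offline. The sketch below is illustrative only: the `EnhancerPrediction` class, the `strongest_first` helper, the `min_score` cutoff, and the sample rows are all hypothetical and are not part of the IGVF Catalog.

```python
from dataclasses import dataclass


@dataclass
class EnhancerPrediction:
    """One table row (hypothetical structure, for illustration only)."""
    cell_type: str
    score: float  # prediction score, expected in [0, 1]
    start: int    # genomic start coordinate of the enhancer
    end: int      # genomic end coordinate of the enhancer


def strongest_first(rows, min_score=0.0):
    """Keep rows with score >= min_score, sorted by score in descending order."""
    kept = [row for row in rows if row.score >= min_score]
    return sorted(kept, key=lambda row: row.score, reverse=True)


# Made-up rows; coordinates and scores are not real predictions.
rows = [
    EnhancerPrediction("K562", 0.42, 79_340_000, 79_340_500),
    EnhancerPrediction("HepG2", 0.91, 79_346_000, 79_346_600),
    EnhancerPrediction("GM12878", 0.08, 79_350_000, 79_350_400),
]
top = strongest_first(rows, min_score=0.1)
```

Here `top` lists the HepG2 row first, then K562; the GM12878 row falls below the assumed cutoff and is dropped.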

Pathways Enrichment

Pathways and the genes they contain are loaded from Reactome.

Other Edge Types (UI in development)

Gene-Gene Edges

There are two public sources of gene-gene edges: a coexpression matrix from CoXPresdb and genetic interactions from BioGRID.

Gene-Variant-Drug (PharmGKB)

The PharmGKB resource, also known as ClinPGx, provides relationships between genes (and their variants) and drug effects.

Gene Structure

This table provides detailed structural information about this gene including exons, introns, UTRs, and coding sequences.
Column | Description
Type | Type of genomic element (exon, intron, UTR, etc.)
Transcript | Transcript identifier this element belongs to
Exon Number | Number of the exon (if applicable)
Start | Genomic start coordinate
End | Genomic end coordinate
Strand | Strand orientation (+ or -)
Source | Origin of the structural information (click for source details)

Gene Interactions & Coexpression

This table displays genes that interact with or are coexpressed with this gene.
Column | Description
Related Gene | Gene that interacts with or is coexpressed with this gene (click for details)
Source | Source of the interaction data
Z-Score (CoXPresdb) | Coexpression z-score from CoXPresdb
Interaction (BioGRID) | Type of interaction from BioGRID
Detection (BioGRID) | Detection method used in BioGRID
PMIDs | PubMed IDs supporting the interaction
BioGRID Confidence | Confidence score from BioGRID database
IntAct Confidence | Confidence score from IntAct database

This table shows variants that are related to this gene through various evidence sources including eQTL, sQTL, and other regulatory mechanisms.
Column | Description
rsID | Reference SNP ID(s) for the variant (click for variant details)
Chromosome | Chromosome where the variant is located
Position | Genomic position of the variant
Ref | Reference allele
Alt | Alternative allele
HGVS | HGVS notation for the variant
Evidence Sources | Types of evidence linking the variant to this gene
Biological Context | Biological contexts where the relationship is observed
Max -log10(p-value) | Maximum -log10(p-value) across all evidence sources
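As a rough illustration of the last column, the sketch below computes the maximum -log10(p-value) for one variant. It assumes each evidence source reports a raw p-value; the function names and the sample values are hypothetical.

```python
import math


def neg_log10(p_value):
    """Return -log10(p) for a p-value in (0, 1]."""
    if not 0.0 < p_value <= 1.0:
        raise ValueError("p-value must be in (0, 1]")
    return -math.log10(p_value)


def max_neg_log10(p_values_by_source):
    """Largest -log10(p-value) across evidence sources (the smallest p-value wins)."""
    return max(neg_log10(p) for p in p_values_by_source.values())


# Made-up p-values for a single variant-gene pair.
evidence = {"eQTL": 1e-8, "sQTL": 1e-3}
strongest = max_neg_log10(evidence)
```

With these made-up inputs the eQTL entry dominates, so `strongest` is approximately 8.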

This table displays genes and proteins that are related to this gene through various biological relationships.
Column | Description
Related Entity | Name of the related gene or protein (click for details)
Type | Whether the related entity is a Gene or Protein
Location | Genomic location (for genes) or N/A (for proteins)
Description | Description or alternative names for the entity

Each table can be sorted by clicking on the column headers. You can also use the search box above each table to filter the results.

Interactive Visualizations

The gene page includes several interactive visualization components that provide rich insights into gene function, variants, and relationships.

Functional Score Distribution

An interactive histogram showing the distribution of functional/predictive scores for coding variants in this gene. Features:
  • Displays score distributions across different data sources (REVEL, ClinVar, etc.)
  • Hover over bars to see exact counts
  • Automatically updates when selecting different data sources from the table above
  • X-axis shows functional scores (typically 0-1 range)
  • Y-axis shows count of variants
This visualization helps identify patterns in variant pathogenicity predictions and the overall functional landscape of the gene.
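The binning behind a histogram like this can be sketched as follows. This is a minimal illustration that assumes equal-width bins over the 0-1 range; the bin count, the handling of out-of-range scores, and the sample scores are assumptions, not the page's actual implementation.

```python
def score_histogram(scores, n_bins=10):
    """Count scores in n_bins equal-width bins covering [0, 1]."""
    counts = [0] * n_bins
    for score in scores:
        if not 0.0 <= score <= 1.0:
            continue  # assumption: scores outside [0, 1] are skipped
        # A score of exactly 1.0 is placed in the last bin.
        index = min(int(score * n_bins), n_bins - 1)
        counts[index] += 1
    return counts


# Made-up functional scores.
scores = [0.05, 0.12, 0.15, 0.5, 0.97, 1.0]
counts = score_histogram(scores, n_bins=5)
```

Each entry of `counts` corresponds to one bar (y-axis count) for a score interval of width 0.2 on the x-axis.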

Biobank OR Plot

A forest plot visualization showing odds ratios (OR) from biobank studies for different variant classifications. Features:
  • Displays logOR values with confidence intervals as horizontal lines
  • Different colors represent different evidence types:
    • Blue: ClinVar classifications
    • Green: Calibrated predictions
    • Red: Author labels
  • Hover over data points to see exact logOR and confidence interval values
  • Vertical dashed line at logOR = 0 for reference
  • Interactive legend on the right side
Each row represents a different variant classification (e.g., “Pathogenic Very Strong”, “Likely Pathogenic”) with its associated odds ratio from biobank data.

Coding Variants Preview

An interactive, horizontally scrollable browser for exploring coding variants with functional scores. Features:
  • Scrollable Interface: Navigate through variants using horizontal scroll
  • Label Preferences: Choose between rsID, SPDI notation, or rsID-only filtering
  • Auto-selection: Variants are automatically selected as you scroll
  • Detailed View: Selected variant shows:
    • Complete variant details (chromosome, position, ref/alt alleles)
    • HGVS notation and SPDI format
    • rsID when available
    • Functional scores from multiple sources with visual score bars
  • Score Visualization: Each score is displayed with a progress bar and numerical value
This component allows efficient exploration of the functional impact across all coding variants in the gene.
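For readers unfamiliar with SPDI labels, the sketch below formats a simple substitution as sequence:position:deletion:insertion. SPDI positions are 0-based, so a 1-based position is shifted by one. The helper name, the one-entry accession lookup, and the example coordinate are hypothetical, and the sketch does not normalize insertions or deletions.

```python
# Tiny illustrative lookup; a real mapping would cover every chromosome.
GRCH38_ACCESSIONS = {"chr10": "NC_000010.11"}


def to_spdi(chrom, pos_1based, ref, alt):
    """Format a simple variant as an SPDI string (0-based position)."""
    accession = GRCH38_ACCESSIONS[chrom]
    return f"{accession}:{pos_1based - 1}:{ref}:{alt}"


# Made-up coordinate, not a real catalog variant.
label = to_spdi("chr10", 79_347_444, "C", "T")
```

The resulting label is `NC_000010.11:79347443:C:T`.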

Gene Interaction Network

An interactive network graph showing gene-gene interactions and co-expression relationships. Features:
  • Interactive Navigation:
    • Pan by clicking and dragging
    • Zoom with mouse wheel or zoom buttons
    • Click nodes to navigate to related genes
  • Visual Elements:
    • Central gene (query) shown in teal
    • Related genes shown in coral/orange
    • Edge thickness represents interaction strength
    • Node hover effects with black borders
  • Data Integration: Combines multiple interaction databases (BioGRID, CoXPresdb, etc.)
  • Force-Directed Layout: Nodes automatically arrange based on interaction strength
  • Legend: Shows node type meanings
The network helps visualize the functional context and regulatory relationships of the gene.
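One way to think about the data integration step is as a merge of edges from several sources around the query gene. The sketch below is an assumption-laden illustration: the edge tuple layout, the `strength` values (imagined as already scaled to a common range), and the gene names other than the query are hypothetical, and the page's actual rule for edge thickness is not documented here.

```python
from collections import defaultdict


def build_network(query_gene, edges):
    """Group edges by neighbor of query_gene, tracking sources and strongest edge."""
    network = defaultdict(lambda: {"sources": set(), "strength": 0.0})
    for gene_a, gene_b, source, strength in edges:
        if gene_a == gene_b or query_gene not in (gene_a, gene_b):
            continue  # skip self-loops and edges not touching the query gene
        neighbor = gene_b if gene_a == query_gene else gene_a
        entry = network[neighbor]
        entry["sources"].add(source)
        entry["strength"] = max(entry["strength"], strength)
    return dict(network)


# Made-up edges: (gene_a, gene_b, source, strength).
edges = [
    ("PPIF", "GENE_A", "BioGRID", 1.0),
    ("GENE_A", "PPIF", "CoXPresdb", 0.6),
    ("GENE_B", "GENE_C", "BioGRID", 1.0),
]
network = build_network("PPIF", edges)
```

In this example `GENE_A` appears once in the network with both sources attached, while the edge between `GENE_B` and `GENE_C` is ignored because it does not involve the query gene.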

Pathway Enrichment Tree

A hierarchical tree visualization showing pathway associations organized by biological processes. Features:
  • Interactive Navigation:
    • Horizontal panning with click-and-drag
    • Arrow button for quick navigation
    • Scroll bar at bottom for position reference
  • Hierarchical Structure:
    • Root: “Pathways”
    • Groups: “Top-level pathways” and “GO biological processes”
    • Leaves: Individual pathways
  • Visual Coding:
    • Different colors for pathway types
    • Rounded rectangles with connecting lines
    • Hover tooltips with detailed information
  • Legend: Color-coded explanation of pathway types
This visualization helps understand the biological processes and pathways this gene participates in.
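The root, group, and leaf levels described above map naturally onto a nested structure. The following sketch shows one possible representation; the dictionary layout, the helper names, and the example pathway names are assumptions made for illustration only.

```python
def build_pathway_tree(pathways_by_group):
    """Build a three-level tree: root -> groups -> individual pathways."""
    return {
        "name": "Pathways",
        "children": [
            {"name": group, "children": [{"name": p} for p in sorted(names)]}
            for group, names in pathways_by_group.items()
        ],
    }


def leaf_names(node):
    """Collect the names of all leaves (nodes without a 'children' key)."""
    if "children" not in node:
        return [node["name"]]
    names = []
    for child in node["children"]:
        names.extend(leaf_names(child))
    return names


# Made-up pathway assignments for a gene.
tree = build_pathway_tree({
    "Top-level pathways": ["Metabolism"],
    "GO biological processes": ["protein folding", "apoptotic process"],
})
leaves = leaf_names(tree)
```

Walking the tree with `leaf_names` returns the individual pathways in group order, which is the same information the leaves of the visualization convey.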

Biobank OR Plot Details

The Biobank OR Plot specifically shows the following.
Data Sources:
  • ClinVar: Clinical variant classifications
  • Calibrated Predictions: Computationally calibrated pathogenicity scores
  • Author Labels: Original author classifications from functional studies
Interpretation:
  • Points to the right of the reference line (logOR > 0) suggest increased disease risk
  • Confidence intervals show statistical uncertainty
  • Multiple data sources provide complementary evidence for variant impact
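To make the interpretation concrete, the sketch below computes a logOR and a Wald-style 95% confidence interval from a 2x2 table of counts. This is a standard textbook calculation offered as an assumption; the source does not state how the plotted values are derived, and the counts are made up.

```python
import math


def log_odds_ratio(a, b, c, d, z=1.96):
    """Return (logOR, lower, upper) from a 2x2 table.

    a = carriers with disease,     b = carriers without disease,
    c = non-carriers with disease, d = non-carriers without disease.
    """
    if min(a, b, c, d) <= 0:
        raise ValueError("all counts must be positive")
    log_or = math.log((a * d) / (b * c))
    standard_error = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    return log_or, log_or - z * standard_error, log_or + z * standard_error


# Made-up counts for one variant classification.
log_or, lower, upper = log_odds_ratio(20, 80, 10, 90)
```

In this example the point estimate lies to the right of the reference line (logOR > 0), but the interval still crosses 0, so the evidence for increased risk would be considered uncertain.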