Overview

The IGVF Catalog contains variant nodes primarily sourced from FAVOR.
Variants not present in FAVOR but present in the other variant collections below were also loaded.
Variants may be identified using SPDI, HGVS, or their genomic position.
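
As an illustration of the SPDI identifier mentioned above, the following minimal sketch splits an SPDI string into its four fields (sequence, position, deletion, insertion). The `Spdi` class, the `parse_spdi` helper, and the example identifier are hypothetical and are not part of the catalog. The sketch assumes the deletion is written as explicit bases; SPDI also permits a deletion length, which is not handled here.

```python
from typing import NamedTuple


class Spdi(NamedTuple):
    sequence: str   # reference sequence accession
    position: int   # 0-based offset of the first deleted base
    deletion: str   # deleted (reference) bases
    insertion: str  # inserted (alternate) bases


def parse_spdi(spdi: str) -> Spdi:
    """Split a 'sequence:position:deletion:insertion' string into its parts."""
    parts = spdi.split(":")
    if len(parts) != 4:
        raise ValueError(f"expected 4 colon-separated fields, got {len(parts)}")
    sequence, position, deletion, insertion = parts
    return Spdi(sequence, int(position), deletion, insertion)


# Hypothetical example identifier, not taken from the catalog.
variant = parse_spdi("NC_000001.11:999:A:G")
print(variant.position + 1)  # SPDI positions are 0-based, so this prints 1000
```

Because SPDI positions are 0-based, add 1 when comparing against 1-based coordinates such as those used in VCF files.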

Non-coding Variant Edges

| Source | Class | Edge Description | Datasets |
| --- | --- | --- | --- |
| TopLD | statistical assessment | Variant-to-variant edges describing variants in linkage disequilibrium (LD). | TopLD |
| caQTL | statistical assessment | Variant-to-genomic element edges describing variants associated with chromatin accessibility. | ENCODE, AFGR |
| eQTL | statistical assessment | Variant-to-gene edges describing variants associated with gene expression. | EBI eQTL, AFGR |
| sQTL | statistical assessment | Variant-to-gene edges describing variants associated with alternative splicing. | AFGR |
| Variant-EFFECTS | observed data | Variant-to-gene edges describing causal effects of variants on endogenous gene expression from genome editing experiments. | Variant-EFFECTS datasets (Jesse Engreitz, Stanford) |
| STARR-seq | observed data | Variant-to-biosample edges describing allele-specific regulatory activity in the K562 cell line. | STARR-seq datasets (Tim Reddy, Duke) |
| BlueSTARR | prediction | Variant-to-genomic element edges describing predicted regulatory activity in ENCODE cCRE sequences using a STARR-seq–trained model. | BlueSTARR dataset (Bill Majoros, Duke) |
| MPRA | observed data | Variant-to-genomic element edges describing variant activity within genomic elements tested in MPRA. | lentiMPRA datasets (Nadav Ahituv, UCSF) |

Enhancer-Gene Model Prediction Table

This table shows which genes are predicted to be regulated by enhancers overlapping the variant you’re viewing.

| Column | Description |
| --- | --- |
| Cell Type | Cell type in which the enhancer is predicted to regulate the gene |
| Target Gene | Gene predicted to be regulated by the enhancer (click for gene details) |
| Score | Strength of the prediction, from 0 (no prediction) to 1 (confident prediction) |
| Dataset | Source dataset (click for more details) |
| Model | Predictive model. Currently: ENCODE-rE2G |
| Variant-Gene Distance | Genomic distance between the variant and gene body |

Currently, this table includes predictions from the ENCODE-rE2G model across 1700 ENCODE biosamples (see Gschwind et al., bioRxiv 2023). The table is initially sorted by Score in descending order, so the strongest predictions appear first.

Variants in Linkage Disequilibrium

This table lists variants in linkage disequilibrium with the query variant and summarizes functional evidence about those variants. Each row reports one variant in one ancestry. LD information is sourced from 1000 Genomes Phase 3, queried from Ensembl.

| Column | Description |
| --- | --- |
| rsID | The reference SNP ID (click for variant details) |
| LD (r²) | Measure of linkage disequilibrium (squared correlation coefficient) |
| LD (D’) | Measure of linkage disequilibrium (D prime statistic) |
| Cell Types w/ pred. EG link | Cell types with predicted enhancer-gene links |
| Genes w/ pred. EG link | Genes with predicted enhancer-gene links |
| QTL Types | Types of QTL associations |
| Genes w/ QTLs | Genes with QTL associations |
| Ancestry | The population ancestry for the LD information |
| Most Severe Consequence | The predicted functional impact |

You can filter this table by ancestry using the dropdown menu above the table.
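
For readers unfamiliar with the two LD columns, the sketch below shows the standard textbook definitions of r² and D’ for a pair of biallelic sites, computed from haplotype and allele frequencies. It is an illustration only; the function name and the example frequencies are hypothetical, and it is not the pipeline used to produce the values in the table.

```python
from typing import Tuple


def ld_stats(p_ab: float, p_a: float, p_b: float) -> Tuple[float, float]:
    """Return (r_squared, d_prime) for two biallelic sites.

    p_ab: frequency of the haplotype carrying allele A (site 1) and allele B (site 2)
    p_a, p_b: frequencies of allele A and allele B (both must be strictly between 0 and 1)
    """
    d = p_ab - p_a * p_b  # raw disequilibrium coefficient D
    r_squared = d * d / (p_a * (1 - p_a) * p_b * (1 - p_b))
    # D' rescales |D| by its maximum possible value given the allele frequencies.
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    d_prime = abs(d) / d_max if d_max > 0 else 0.0
    return r_squared, d_prime


# Hypothetical frequencies chosen for illustration.
r2, d_prime = ld_stats(p_ab=0.4, p_a=0.5, p_b=0.5)
print(round(r2, 2), round(d_prime, 2))  # 0.36 0.6
```

Both statistics range from 0 to 1 in this form. D’ reaches 1 whenever at least one of the four possible haplotypes is absent, whereas r² reaches 1 only when the two sites are perfectly correlated.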

GWAS Association

This table shows phenotypes (traits or diseases) associated with the variant through genome-wide association studies (GWAS).

| Column | Description |
| --- | --- |
| Lead Variant | Lead variant identifier for the association |
| Study ID | Identifier for the GWAS study |
| Trait | The associated phenotype or trait |
| Lead Variant P-value | Statistical significance of the association |
| Beta | Effect size and direction |
| 95% Confidence Interval | Confidence interval for the effect size |
| PMID | PubMed ID for the study (click to view publication) |
| Author (Year) | First author and publication year |
| Study N | Sample size of the study |
| LD (r²) | Linkage disequilibrium with the lead variant |
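
To show how the Beta and 95% Confidence Interval columns typically relate, here is a minimal sketch using the usual normal approximation (beta ± 1.96 × standard error). The standard error is not a column in this table, so it is an assumed input here, and the catalog's reported intervals come from the source studies rather than from this calculation.

```python
from typing import Tuple


def beta_confidence_interval(
    beta: float, standard_error: float, z: float = 1.96
) -> Tuple[float, float]:
    """Normal-approximation confidence interval around an effect size."""
    margin = z * standard_error
    return beta - margin, beta + margin


# Hypothetical effect size and standard error.
low, high = beta_confidence_interval(beta=0.10, standard_error=0.02)
print(f"{low:.4f} to {high:.4f}")  # 0.0608 to 0.1392
```

An interval that excludes 0 corresponds to an association that is significant at roughly the 5% level under the same approximation.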

Coding Variant Prediction

This table provides predictions about the variant’s impact on protein function if it affects a protein-coding region.

| Column | Description |
| --- | --- |
| Name | Name identifier for the coding variant |
| Reference | Reference allele |
| Alternate | Alternate allele |
| Protein Name | Name of the affected protein |
| Gene Name | Name of the affected gene (click for gene details) |
| Transcript ID | Transcript identifier (click for transcript details) |
| Amino Acid Position | Position of the amino acid change |
| HGVS Protein | HGVS notation for the protein change |
| HGVS | HGVS notation for the variant |
| Reference Codon | Reference codon sequence |
| Codon Position | Position within the codon |
| SIFT Score | SIFT deleteriousness prediction score |
| SIFT4G Score | SIFT4G deleteriousness prediction score |
| Polyphen2 HDIV Score | PolyPhen-2 HumDiv prediction score |
| Polyphen2 HVAR Score | PolyPhen-2 HumVar prediction score |
| VEST4 Score | VEST4 pathogenicity prediction score |
| M-CAP Score | M-CAP pathogenicity prediction score |
| REVEL Score | REVEL pathogenicity prediction score |
| MutPred Score | MutPred pathogenicity prediction score |
| BayesDel addAF Score | BayesDel with allele frequency score |
| BayesDel noAF Score | BayesDel without allele frequency score |
| VARITY R Score | VARITY_R score (model trained on rare variants) |
| VARITY ER Score | VARITY_ER score (model trained on extremely rare variants) |
| VARITY R LOO Score | VARITY_R leave-one-out score |
| VARITY ER LOO Score | VARITY_ER leave-one-out score |
| ESM-1b Score | ESM-1b evolutionary scale modeling score |
| EVE Score | EVE (Evolutionary model of Variant Effect) score |
| AlphaMissense Score | AlphaMissense pathogenicity prediction score |
| CADD Raw Score | CADD raw deleteriousness score |
| Source | Source of the prediction data (click for details) |

Allelic effect(s) on transcription factor binding

This table shows effects of the variant on allelic binding in transcription factor (TF) ChIP-seq experiments.

| Column | Description |
| --- | --- |
| Protein | Transcription factor protein (click for protein details) |
| Gene | Gene encoding the transcription factor (click for gene details) |
| Biosample | Biological sample or cell type |
| Motif | Transcription factor binding motif (click for motif details) |
| Motif Summary | Summary of motif binding effects |
| Motif Fold-Change | Fold change in motif binding affinity |
| Ref Score | Binding score for reference allele |
| Alt Score | Binding score for alternate allele |
| Ref FDR | False discovery rate for reference allele |
| Alt FDR | False discovery rate for alternate allele |
| Motif Position | Position of the motif |
| Motif Strand | Strand orientation of the motif |
| Source | Source of the binding data |

pQTL

This table shows protein quantitative trait loci (pQTL) associated with this variant.

| Column | Description |
| --- | --- |
| Gene | Gene associated with the protein (click for gene details) |
| Protein | Protein affected by the variant (click for protein details) |
| Biosample | Biological sample or tissue type |
| Beta | Effect size of the variant on protein levels |
| P-value | Statistical significance of the association |
| Source | Source of the pQTL data (click for source details) |

QTL

This table shows quantitative trait loci (QTL) associated with this variant.

| Column | Description |
| --- | --- |
| QTL Type | Type of QTL (eQTL, sQTL, etc.) |
| Gene | Target gene (click for gene details) |
| Tissue | Tissue or cell type where the QTL was detected |
| Effect Size | Magnitude of the variant’s effect |
| P-value | Statistical significance of the association |

Associated Genes and Protein

This table shows genes and proteins associated with this variant through various mechanisms including eQTLs and motif binding.

| Column | Description |
| --- | --- |
| Name | Name of the associated gene or protein (click for details) |
| Type | Whether the entity is a Gene or Protein |
| Association Type | Type of association (eQTL, motif binding, etc.) |
| Source | Source of the association data |
| Context/Motif | Biological context or motif information |
| P-value | Statistical significance of the association |

Associated Genes (QTLs)

This table shows genes associated with this variant via eQTL or sQTL evidence.

| Column | Description |
| --- | --- |
| Gene | Associated gene name (click for gene details) |
| Type | Type of QTL association |
| P-value (-log10) | Statistical significance (-log10 transformed) |
| Effect Size | Magnitude of the variant’s effect on gene expression |
| Biological Context | Tissue or cell type context |
| Source | Source of the QTL data (click for source details) |
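
The P-value (-log10) column uses a transformed scale on which larger values mean stronger evidence. A minimal sketch of the transform and its inverse follows; the function names and the example p-value are illustrative and not part of the catalog.

```python
import math


def neg_log10(p_value: float) -> float:
    """Convert a p-value in (0, 1] to the -log10 scale."""
    if not 0 < p_value <= 1:
        raise ValueError("p-value must be in the interval (0, 1]")
    return -math.log10(p_value)


def from_neg_log10(value: float) -> float:
    """Recover the p-value from its -log10 representation."""
    return 10 ** (-value)


print(round(neg_log10(5e-8), 2))  # 7.3
```

On this scale, each additional unit corresponds to a tenfold smaller p-value.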

Associated Drug

This table shows drugs associated with this variant from PharmGKB.

| Column | Description |
| --- | --- |
| Drug | Drug identifier (click for drug details) |
| Gene Symbols | Genes associated with the drug response |
| PMID | PubMed ID supporting the association (click to view publication) |
| Phenotype Categories | Categories of phenotypic effects |
| Source | Source of the drug association data (click for source details) |

Associated Disease

This table shows diseases associated with this variant from ClinGen.

| Column | Description |
| --- | --- |
| Disease | Associated disease or condition (click for disease details) |
| Gene | Gene associated with the disease variant (click for gene details) |
| Assertion | Clinical significance assertion |
| PMIDs | PubMed IDs supporting the association |
| Source | Source of the disease association data (click for source details) |

Biosample Evidence

This table shows experimental evidence per biosample for this variant.

| Column | Description |
| --- | --- |
| Biosample | Biosample name (click for biosample details) |
| Synonyms | Alternative names for the biosample |
| Description | Description of the biosample |
| Method | Experimental method used |
| Label | Evidence label or type |
| Input Counts | Input counts for reference and alternate alleles |
| Output Counts | Output counts for reference and alternate alleles |
| Variant SPDI | SPDI notation for the variant |
| Post. Prob. Effect | Posterior probability of effect |
| log2 Fold Change | Log2 fold change measurement |
| 95% Confidence Interval | Confidence interval for the measurement |
| Source | Source of the experimental evidence |
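
To give intuition for how the Input Counts, Output Counts, and log2 Fold Change columns relate, the sketch below computes a naive allelic log2 fold change with a pseudocount. This is an assumption-laden illustration: the function name, the pseudocount, and the counts are hypothetical, and the values reported in the table (including the posterior probability and confidence interval) come from the source datasets' own statistical models, not from this formula.

```python
import math


def naive_log2_fold_change(
    in_ref: int, in_alt: int, out_ref: int, out_alt: int, pseudocount: float = 1.0
) -> float:
    """Naive log2 ratio of alternate vs. reference output/input enrichment."""
    # The pseudocount avoids division by zero and log of zero for empty counts.
    ref_ratio = (out_ref + pseudocount) / (in_ref + pseudocount)
    alt_ratio = (out_alt + pseudocount) / (in_alt + pseudocount)
    return math.log2(alt_ratio / ref_ratio)


# Hypothetical counts: the alternate allele yields twice the output per input.
print(naive_log2_fold_change(in_ref=99, in_alt=99, out_ref=99, out_alt=199))  # 1.0
```

A positive value indicates higher activity for the alternate allele, and a negative value indicates higher activity for the reference allele.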

Linkage Disequilibrium Detail

This table provides detailed pairwise linkage disequilibrium information for this variant.

| Column | Description |
| --- | --- |
| Linked Variant (rsID) | rsID of the variant in LD (click for variant details) |
| Position | Genomic position of the linked variant |
| Base Pair Change | Nucleotide change for the linked variant |
| SPDI | SPDI notation for the linked variant |
| HGVS | HGVS notation for the linked variant |
| Context Variant Base Pair | Base pair change for the query variant |
| Context Variant (rsID) | rsID of the query variant (click for variant details) |
| Context Position | Genomic position of the query variant |
| Context SPDI | SPDI notation for the query variant |
| Context HGVS | HGVS notation for the query variant |
| LD (r²) | Linkage disequilibrium r-squared value |
| LD (D’) | Linkage disequilibrium D-prime value |
| Ancestry | Population ancestry for the LD calculation |
| Label | Additional label or classification |
| Source | Source of the LD data (click for source details) |

Each table can be sorted by clicking on the column headers. You can also use the search box above each table to filter the results.

Interactive Visualizations

The variant page includes several interactive visualizations covering variant effects, population genetics, and functional predictions.

Primary Variant Visualization

An animated visualization showing the molecular mechanism of the variant using SPDI notation. Features:
  • Animated Sequence: Shows the variant change as deletion and insertion events
  • Color Coding:
    • Teal (#337788): Insertion allele
    • Coral (#CC8877): Deletion allele
  • Multi-phase Animation:
    • Initial display of deletion event
    • Fade-in of insertion allele
    • Arrow indicating the replacement direction
  • Genomic Context: Displays surrounding genomic sequence with placeholder bases
  • Responsive Design: Automatically adjusts to container width
This visualization helps users understand the molecular nature of the variant change at the nucleotide level.
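
The deletion-then-insertion view described above mirrors how an SPDI variant acts on a sequence. The following minimal sketch applies that two-step change to a short string; the `apply_spdi` helper and the example sequences are hypothetical, and this is not the page's rendering code.

```python
def apply_spdi(reference: str, position: int, deletion: str, insertion: str) -> str:
    """Delete `deletion` at 0-based `position` in `reference`, then insert `insertion`."""
    end = position + len(deletion)
    # Check that the bases to delete really occur at this position.
    if reference[position:end] != deletion:
        raise ValueError("deleted bases do not match the reference sequence")
    return reference[:position] + insertion + reference[end:]


# Hypothetical context sequence for illustration.
print(apply_spdi("ACGTACGT", 3, "T", "C"))  # substitution: ACGCACGT
print(apply_spdi("ACGTACGT", 2, "GT", ""))  # pure deletion: ACACGT
```

Substitutions, deletions, and insertions all fit the same pattern; only the lengths of the deleted and inserted strings differ.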

Allele Frequency Distribution

An interactive bar chart displaying population-specific allele frequencies from gnomAD. Features:
  • Population Breakdown: Shows frequencies across major ancestry groups:
    • African, Amish, Ashkenazi Jewish, East Asian, Finnish
    • Native American, Non-Finnish European, Other, South Asian
  • Interactive Elements:
    • Hover over bars to see exact frequency values
    • Color changes on hover for visual feedback
    • Precise tooltips with 3 significant figures
  • Animated Rendering: Bars animate from bottom to top on initial load
  • Rotated Labels: Population labels displayed at 45-degree angle for readability
  • Responsive Scaling: Y-axis automatically scales to accommodate data range
This visualization provides crucial population genetics context for variant interpretation.
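
As a rough illustration of the tooltip precision and axis scaling described above, this sketch formats frequencies to 3 significant figures and picks a y-axis maximum from the data. The function names, the 10% headroom, and the frequencies are assumptions for illustration; the chart's actual implementation is not shown in this document.

```python
from typing import Dict


def format_frequency(value: float) -> str:
    """Format a frequency with 3 significant figures (trailing zeros are dropped)."""
    return f"{value:.3g}"


def y_axis_max(frequencies: Dict[str, float], headroom: float = 1.1) -> float:
    """Scale the y-axis to the largest frequency plus some headroom (assumed 10%)."""
    return max(frequencies.values()) * headroom


# Hypothetical allele frequencies, not real gnomAD values.
frequencies = {"African": 0.0123456, "Finnish": 0.2, "East Asian": 0.00004567}
for population, value in frequencies.items():
    print(population, format_frequency(value))
# African 0.0123
# Finnish 0.2
# East Asian 4.57e-05
```

Very rare alleles switch to scientific notation under this format, which keeps small frequencies readable without showing long runs of leading zeros.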

Summary Data Table with Score Visualization

An interactive table combining functional prediction scores with visual progress bars. Features:
  • Score Visualization: Each numerical score displayed with:
    • Horizontal progress bar showing relative magnitude
    • Color-coded bar (brand color) indicating score strength
    • Exact numerical value alongside visual representation
  • Sortable Interface: Click column headers to sort by score values
  • Dynamic Columns: Table adapts to show only relevant columns:
    • Functional class (when available)
    • Catalog links (when available)
    • Portal links (when available)
  • Clamped Scoring: All scores normalized to 0-1 range for consistent visualization
  • Loading States: Graceful handling of data loading with appropriate placeholders
This component allows quick visual comparison of functional prediction scores across different tools and databases.
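
The clamping step described above can be sketched as follows. Note that clamping only cuts off values outside the 0-1 range and does not rescale scores that use a different native range; whether the page also rescales any scores is not stated here. The function names are hypothetical.

```python
def clamp01(score: float) -> float:
    """Limit a score to the closed interval [0, 1]."""
    return max(0.0, min(1.0, score))


def bar_width_percent(score: float) -> float:
    """Width of the progress bar as a percentage of the cell."""
    return 100.0 * clamp01(score)


print(bar_width_percent(0.25))  # 25.0
print(bar_width_percent(1.7))   # 100.0 (clamped)
print(bar_width_percent(-0.2))  # 0.0 (clamped)
```

The exact numerical value is still displayed next to the bar, so a clamped bar does not hide the underlying score.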

Linkage Disequilibrium Heatmap

An interactive heatmap showing pairwise linkage disequilibrium relationships with nearby variants. Features:
  • Dual Metrics: Toggle between r² and D’ measurements using dropdown selector
  • Interactive Exploration:
    • Hover over cells to see detailed LD statistics
    • Tooltips display both variant IDs and precise r²/D’ values
    • Mouse tracking with real-time feedback
  • Color Gradient: Intensity represents LD strength (white to teal)
  • Variant Organization: Variants automatically sorted by genomic position
  • Export Functionality: Download heatmap as PNG image
  • Responsive Design: Square aspect ratio that scales with container
  • Legend: Color scale reference showing value mapping
The heatmap reveals local LD structure and helps identify variant clusters that may represent the same causal signal.
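
A minimal sketch of the data preparation behind such a heatmap follows: variants are ordered by genomic position, pairwise values are arranged in a square matrix, and each value is mapped onto a white-to-teal gradient. The function names, the example variants, and the diagonal value of 1.0 are assumptions; the teal hex value is borrowed from the Primary Variant Visualization section and may differ from the heatmap's actual color.

```python
from typing import Dict, List, Optional, Tuple


def ld_matrix(
    positions: Dict[str, int], pair_values: Dict[Tuple[str, str], float]
) -> Tuple[List[str], List[List[Optional[float]]]]:
    """Order variants by position and build a symmetric matrix of LD values."""
    order = sorted(positions, key=positions.get)
    matrix = []
    for a in order:
        row = []
        for b in order:
            if a == b:
                row.append(1.0)  # a variant is in complete LD with itself
            else:
                # Pairs may be stored in either orientation; missing pairs stay None.
                row.append(pair_values.get((a, b), pair_values.get((b, a))))
        matrix.append(row)
    return order, matrix


def white_to_teal(value: float, teal: Tuple[int, int, int] = (0x33, 0x77, 0x88)) -> str:
    """Linearly interpolate from white (0) to teal (1) and return a hex color."""
    v = max(0.0, min(1.0, value))
    channels = [round(255 + (c - 255) * v) for c in teal]
    return "#{:02x}{:02x}{:02x}".format(*channels)


# Hypothetical variants and r² values.
order, matrix = ld_matrix(
    {"rsB": 200, "rsA": 100, "rsC": 300},
    {("rsA", "rsB"): 0.8, ("rsC", "rsB"): 0.3},
)
print(order)       # ['rsA', 'rsB', 'rsC']
print(matrix[0])   # [1.0, 0.8, None]
print(white_to_teal(0.0), white_to_teal(1.0))  # #ffffff #337788
```

Leaving unknown pairs as `None` lets a renderer draw them as empty cells instead of implying zero LD.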

Ancestry Filtering

An animated interface for filtering LD data by population ancestry. Features:
  • Smooth Transitions: Framer Motion animations for state changes
  • Toggle Interaction:
    • Click ancestry labels to filter data
    • Selected ancestry highlighted with brand color
    • Clear button (×) to remove filter
  • Population Mapping: Displays both short codes and full population names:
    • ASJ (Ashkenazi Jewish), EAS (East Asian), AFR (African)
    • FIN (Finnish), NFE (Non-Finnish European), SAS (South Asian)
    • AMI (Amish), OTH (Other), AMR (American), EUR (European)
  • Dynamic Visibility: Only shows when multiple ancestries are available
  • Hover Effects: Scale animations on hover for visual feedback
This component enables population-specific analysis of linkage disequilibrium patterns.
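
The filtering behavior described above can be sketched as a code-to-name mapping, a visibility check, and a filter function. The mapping copies the population labels listed above; the row structure (a dictionary with an `ancestry` key) and the function names are assumptions made for illustration.

```python
from typing import Dict, List, Optional

ANCESTRY_LABELS: Dict[str, str] = {
    "ASJ": "Ashkenazi Jewish", "EAS": "East Asian", "AFR": "African",
    "FIN": "Finnish", "NFE": "Non-Finnish European", "SAS": "South Asian",
    "AMI": "Amish", "OTH": "Other", "AMR": "American", "EUR": "European",
}


def show_ancestry_filter(rows: List[dict]) -> bool:
    """The filter is only useful when more than one ancestry is present."""
    return len({row["ancestry"] for row in rows}) > 1


def filter_by_ancestry(rows: List[dict], selected: Optional[str]) -> List[dict]:
    """Return rows for the selected ancestry, or all rows when the filter is cleared."""
    if selected is None:
        return list(rows)
    return [row for row in rows if row["ancestry"] == selected]


# Hypothetical LD rows.
rows = [
    {"rsid": "rs1", "ancestry": "EUR"},
    {"rsid": "rs2", "ancestry": "AFR"},
    {"rsid": "rs3", "ancestry": "EUR"},
]
print(show_ancestry_filter(rows))                                  # True
print([row["rsid"] for row in filter_by_ancestry(rows, "EUR")])    # ['rs1', 'rs3']
print(ANCESTRY_LABELS["EUR"])                                      # European
```

Passing `None` as the selection corresponds to pressing the clear button (×), which restores the unfiltered rows.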