Part 6·6.4·16 min read

Genetic Diseases and Pathogenic Variants

How inherited DNA variants cause disease — inheritance patterns, variant databases, and the clinical pipeline from sequencing to diagnosis.

Figure: Point mutation: synonymous vs. missense outcomes

Every person carries an estimated 4–5 million variants relative to the human reference genome, including roughly 60 de novo variants (not present in either parent). Most are benign. About 50–100 are in genes associated with disease. A handful may be medically actionable. The clinical genetics problem, identifying which variants cause disease in a specific patient, is one of the most consequential applications of bioinformatics.

Modes of Inheritance

The way a disease is transmitted from parent to child depends on whether the causal variant is dominant or recessive, and whether it lies on an autosome, the X chromosome, or mitochondrial DNA.

Autosomal Dominant (AD)

One mutant allele is sufficient to cause disease. Each child of an affected parent has a 50% chance of inheriting the variant.

Mechanism: the mutant protein is toxic (gain of function), it interferes with the normal protein (dominant negative), or a 50% reduction in functional protein is insufficient (haploinsufficiency).

Examples:

  • Huntington's disease (HTT CAG expansion): expanded polyglutamine tract in huntingtin creates a toxic gain-of-function
  • BRCA1/2 (hereditary breast/ovarian cancer): dominantly inherited cancer predisposition; one functional copy is usually sufficient for DNA repair, but a second hit in somatic cells removes it and leads to cancer
  • Marfan syndrome (FBN1): dominant negative; the mutant fibrillin-1 disrupts normal fibrillin assembly

AD conditions often show variable expressivity (affected individuals differ in severity) and incomplete penetrance (not everyone who carries the variant develops the disease).

Autosomal Recessive (AR)

Both alleles must be mutated for disease. Carriers (one mutant allele) are usually healthy. Two carrier parents have a 25% probability of an affected child with each pregnancy.

Mechanism: requires complete or near-complete loss of function. A single functional allele usually provides enough protein for normal function.
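The 50% (dominant) and 25% (recessive) figures both follow from enumerating which allele each parent transmits. A minimal sketch, assuming a single locus with two alleles and equal transmission probability; the function name `cross` and the genotype strings are illustrative only:

```python
from collections import Counter
from fractions import Fraction
from itertools import product


def cross(parent1: str, parent2: str) -> dict[str, Fraction]:
    """Return offspring genotype probabilities for a one-locus cross.

    Genotypes are two-character strings such as "Aa"; each parent
    transmits one of its two alleles with equal probability.
    """
    counts = Counter(
        "".join(sorted(a + b)) for a, b in product(parent1, parent2)
    )
    total = sum(counts.values())
    return {genotype: Fraction(n, total) for genotype, n in counts.items()}


# Autosomal recessive: two carriers ("Aa" x "Aa"); "aa" is affected.
ar = cross("Aa", "Aa")
# Autosomal dominant: affected heterozygote x unaffected ("Dd" x "dd").
ad = cross("Dd", "dd")

print(ar["aa"])  # probability of an affected child, recessive case
print(ad["Dd"])  # probability of an affected child, dominant case
```

The sketch ignores penetrance, de novo events, and linkage; it only reproduces the textbook Mendelian ratios.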

Examples:

  • Cystic fibrosis (CFTR): ~1/25 Northern European carriers; ~1/2500 births affected
  • Sickle cell disease (HBB E6V): HbS homozygotes have profound hemolytic anemia; heterozygotes have sickle cell trait, with mild or no symptoms but protection against malaria
  • Phenylketonuria (PKU) (PAH): phenylalanine hydroxylase deficiency; dietary phenylalanine accumulates → neurotoxicity; newborn screening + dietary treatment prevents intellectual disability
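The cystic fibrosis numbers above are mutually consistent: if about 1 in 25 people is a carrier and partners are unrelated, the chance that both parents are carriers is (1/25)², and a quarter of their children are affected. A short check, assuming random mating and ignoring affected parents (the function name is illustrative):

```python
from fractions import Fraction


def affected_birth_rate(carrier_freq: Fraction) -> Fraction:
    """Expected frequency of affected births for a recessive condition."""
    both_carriers = carrier_freq * carrier_freq  # both parents are carriers
    return both_carriers * Fraction(1, 4)        # 25% risk per pregnancy


cf = affected_birth_rate(Fraction(1, 25))
print(cf)  # 1/2500
```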

X-linked

The causal gene lies on the X chromosome. Males (XY) are hemizygous: they have only one X, so a single recessive allele causes disease. Females (XX) can be carriers.

X-linked recessive (XLR): males affected, females usually carriers.

  • Duchenne muscular dystrophy (DMD): frame-disrupting dystrophin variants; males affected; females carriers
  • Hemophilia A (F8) and B (F9)
  • Color blindness (OPN1LW/OPN1MW)
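The X-linked recessive pattern can be enumerated the same way as an autosomal cross, except that the father transmits either his X (to daughters) or his Y (to sons). A minimal sketch, where "X" is a normal X chromosome and "x" is an X carrying the recessive variant; the labels and function name are illustrative:

```python
from collections import Counter
from fractions import Fraction
from itertools import product


def x_linked_cross(mother: tuple[str, str], father_x: str) -> dict[str, Fraction]:
    """Offspring outcomes for an X-linked recessive variant.

    The mother passes one of her two X chromosomes; the father passes
    either his X (child is a daughter) or his Y (child is a son).
    """
    outcomes = Counter()
    for maternal, paternal in product(mother, (father_x, "Y")):
        if paternal == "Y":
            label = "affected son" if maternal == "x" else "unaffected son"
        elif maternal == "x" and paternal == "x":
            label = "affected daughter"
        elif "x" in (maternal, paternal):
            label = "carrier daughter"
        else:
            label = "unaffected daughter"
        outcomes[label] += 1
    total = sum(outcomes.values())
    return {label: Fraction(n, total) for label, n in outcomes.items()}


# Carrier mother, unaffected father: half of sons are affected.
carrier_mother = x_linked_cross(("X", "x"), "X")
# Non-carrier mother, affected father: all daughters carry, no sons affected.
affected_father = x_linked_cross(("X", "X"), "x")
```

With a carrier mother, each of the four outcomes (unaffected daughter, carrier daughter, unaffected son, affected son) has probability 1/4.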

X-linked dominant (XLD): affects both males and females.

  • Rett syndrome (MECP2): seen mostly in females; more severe and often lethal in hemizygous males

Mitochondrial

Mitochondrial DNA (mtDNA) encodes 37 genes. mtDNA is maternally inherited (all mitochondria in the embryo come from the oocyte). Affected mothers pass mtDNA to all children; affected fathers do not.

Mitochondrial diseases affect high-energy tissues: brain, muscle, heart. Examples: MELAS (mitochondrial encephalomyopathy), Leber's hereditary optic neuropathy (LHON).

A complication is heteroplasmy: a cell may contain a mixture of normal and mutant mtDNA. Disease severity correlates with the proportion of mutant mtDNA, which can vary between tissues and change over time.
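In sequencing data, the heteroplasmy level at a site is commonly estimated as the fraction of reads carrying the mutant allele. A minimal sketch with made-up read counts; the tissue names and numbers are hypothetical and only illustrate that the same variant can sit at different levels in different tissues:

```python
def heteroplasmy(ref_reads: int, alt_reads: int) -> float:
    """Fraction of reads carrying the mutant allele at one mtDNA site."""
    depth = ref_reads + alt_reads
    if depth == 0:
        raise ValueError("no coverage at this site")
    return alt_reads / depth


# Hypothetical (ref, alt) read counts for one variant in three tissues.
tissues = {"blood": (900, 100), "muscle": (300, 700), "urine": (500, 500)}
levels = {tissue: heteroplasmy(ref, alt) for tissue, (ref, alt) in tissues.items()}

for tissue, level in levels.items():
    print(f"{tissue}: {level:.0%} mutant mtDNA")
```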

De Novo Mutations

Not all genetic diseases are inherited. De novo variants, new variants not present in either parent, arise in the germ line (egg or sperm) or in the early embryo. They're found by trio sequencing: sequencing the patient plus both parents and looking for variants in the patient that aren't in either parent.
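At its core, the trio comparison is a set difference. The sketch below assumes each call set has already been reduced to (chromosome, position, ref, alt) tuples; the coordinates are made up for illustration. Real pipelines also check read depth and genotype quality in the parents, since a variant the parents merely failed to have called looks identical to a de novo one here.

```python
Variant = tuple[str, int, str, str]  # (chrom, pos, ref, alt)


def candidate_de_novo(
    child: set[Variant], mother: set[Variant], father: set[Variant]
) -> list[Variant]:
    """Variants called in the child but in neither parent."""
    inherited = mother | father
    return sorted(child - inherited)


# Hypothetical call sets for a trio.
child = {("chr1", 1000, "A", "G"), ("chr2", 5000, "C", "T"), ("chr4", 777, "G", "A")}
mother = {("chr1", 1000, "A", "G")}
father = {("chr2", 5000, "C", "T"), ("chr9", 42, "T", "C")}

de_novo = candidate_de_novo(child, mother, father)
print(de_novo)
```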

De novo rate: ~1 × 10⁻⁸ per nucleotide per generation → approximately 60 de novo single-nucleotide variants per person. Paternal age is the dominant factor: the mutation rate in sperm increases with age (sperm precursor cells undergo far more cell divisions than eggs).
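The step from the per-nucleotide rate to about 60 variants per person is a single multiplication. The genome size used here (about 3.1 billion bases per haploid copy) is an assumption not stated in the text:

```python
MUTATION_RATE = 1e-8        # per nucleotide per generation (from the text)
HAPLOID_GENOME_BP = 3.1e9   # assumed approximate haploid genome size
COPIES_INHERITED = 2        # one genome copy from each parent

expected_de_novo = MUTATION_RATE * HAPLOID_GENOME_BP * COPIES_INHERITED
print(round(expected_de_novo))
```

The result lands in the low 60s, which is where the "approximately 60" figure comes from.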

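De novo variants cause:

  • ~50% of severe intellectual disability cases
  • A substantial share of severe early-onset neurodevelopmental conditions (autism, epilepsy), and they contribute to schizophrenia risk
  • Most cases of achondroplasia (FGFR3 G380R accounts for ~98% of cases, and roughly 80% of cases are de novo)

De novo status is a strong indicator of pathogenicity: if a variant in a plausible disease gene is absent from both unaffected parents of a patient with a serious phenotype, it's likely causative.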

Variable Expressivity and Penetrance

Even with a clearly pathogenic variant, not everyone who carries it is equally affected:

Penetrance: the fraction of individuals with a variant who display the phenotype. BRCA1 pathogenic variants confer ~70–80% lifetime risk of breast cancer, which is high but not 100% (incomplete penetrance).

Variable expressivity: individuals with the same pathogenic variant can have different severity. People with NF1 (neurofibromatosis type 1) range from mild café-au-lait spots only to severe neurofibromas and malignant peripheral nerve sheath tumors.

Modifying factors: other genetic variants (modifier genes) and environmental exposures can modulate penetrance and expressivity. This is why GWAS and polygenic risk scores are relevant even for monogenic diseases: genetic background modifies outcomes.

The Clinical Genetics Pipeline

Step 1: Clinical Recognition and Ordering Sequencing

The clinical process begins with a patient (or family) presenting with symptoms suggesting a genetic condition. The clinician decides what test to order:

  • Chromosomal microarray: first-tier test for intellectual disability/autism; detects CNVs across the genome
  • Single-gene test: when the clinical presentation strongly suggests a specific diagnosis (e.g., DMD in a boy with early-onset proximal weakness)
  • Gene panel: sequencing of multiple genes associated with the same phenotype spectrum (e.g., hereditary cancer panel: BRCA1/2, PALB2, CHEK2, ATM, etc.)
  • Exome sequencing: all coding regions (~1–2% of the genome); first-tier for unexplained pediatric disease
  • Genome sequencing: the full genome; highest diagnostic yield; increasingly cost-competitive

Step 2: Variant Calling and Annotation

Raw sequencing reads → variant calling → annotation.

Annotation pipeline:

  1. ANNOVAR/VEP: predict functional consequence (synonymous, missense, stop-gain, splice site)
  2. Population frequencies: gnomAD allele frequency (AF). If AF > 1% in gnomAD, the variant is unlikely to cause a dominant disease with high penetrance
  3. In silico pathogenicity predictors: SIFT, PolyPhen-2, REVEL, AlphaMissense (uses structure prediction)
  4. ClinVar lookup: known classifications from other labs
  5. Literature search: published case reports, functional studies
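The annotation steps above end up as a set of per-variant fields that are then filtered. A minimal sketch of that filtering stage, assuming annotations have already been parsed into records; the `AnnotatedVariant` fields, the consequence labels, and the example values are hypothetical and do not mirror the actual VEP or ANNOVAR output schema:

```python
from dataclasses import dataclass


@dataclass
class AnnotatedVariant:
    gene: str
    consequence: str   # e.g. "missense", "synonymous", "stop_gained"
    gnomad_af: float   # population allele frequency; 0.0 if absent
    clinvar: str       # e.g. "pathogenic", "benign", "" if no record


# Consequence classes kept for review (an illustrative choice).
DAMAGING = {"missense", "stop_gained", "frameshift", "splice_site"}


def prioritize(variants: list[AnnotatedVariant], max_af: float = 0.01):
    """Keep rare, potentially damaging variants not already called benign."""
    kept = [
        v for v in variants
        if v.consequence in DAMAGING
        and v.gnomad_af <= max_af
        and v.clinvar not in {"benign", "likely_benign"}
    ]
    return sorted(kept, key=lambda v: v.gnomad_af)  # rarest first


# Hypothetical annotated calls.
calls = [
    AnnotatedVariant("GENE_A", "missense", 0.15, ""),
    AnnotatedVariant("GENE_B", "stop_gained", 0.0, ""),
    AnnotatedVariant("GENE_C", "synonymous", 0.0001, ""),
    AnnotatedVariant("GENE_D", "missense", 0.0002, "benign"),
    AnnotatedVariant("GENE_E", "missense", 0.0005, "pathogenic"),
]
shortlist = prioritize(calls)
print([v.gene for v in shortlist])
```

The 1% cutoff matches the rule of thumb in step 2; for recessive conditions or specific disease prevalences a different threshold would usually be chosen.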

Step 3: Variant Interpretation (ACMG/AMP Guidelines)

The ACMG/AMP 2015 guidelines (updated by ClinGen) provide a standardized framework:

Evidence type | Criteria examples
Population data | PM2: absent from controls; BS1: allele frequency above expected
Computational | PP3: multiple in silico tools predict damaging; BP4: predict benign
Functional | PS3: well-established functional studies show damaging effect; BS3: shows no damaging effect
Segregation | PP1: co-segregates with disease in multiple affected family members
De novo | PS2: confirmed de novo in affected individual; PM6: assumed de novo
Allelic | PM3: detected in trans with a pathogenic variant (for AR)
Case data | PS4: prevalence in affected significantly increased vs. controls

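Criteria are combined into a final classification: pathogenic, likely pathogenic, VUS, likely benign, or benign. The 2015 guidelines do this with combining rules (for example, one very strong plus one strong criterion gives pathogenic). The later points-based adaptation (Tavtigian et al., 2020) scores supporting, moderate, strong, and very strong evidence as 1, 2, 4, and 8 points, with benign evidence counted as negative: pathogenic is ≥10 points, likely pathogenic 6–9, VUS 0–5, likely benign −1 to −6, and benign ≤ −7.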

Step 4: Return of Results

Variants are reported in a clinical format:

  • Pathogenic/likely pathogenic variants: reported with interpretation and implications
  • VUS: reported with explanation that significance is unclear; patient and family may return for re-evaluation as evidence accumulates
  • Benign/likely benign: usually not reported

Secondary findings: ACMG recommends reporting pathogenic variants in 81 medically actionable genes (the SF v3.2 list) regardless of the test indication, including BRCA1/2, Lynch syndrome genes, cardiac channelopathy genes, and familial hypercholesterolemia genes. If a patient undergoes clinical exome or genome sequencing for any reason and a BRCA1 pathogenic variant is found, it is reported unless the patient has opted out of secondary findings.

Key Databases for Clinical Variant Interpretation

Database | Content | URL
ClinVar | Variant-disease interpretations from labs | ncbi.nlm.nih.gov/clinvar
gnomAD | Population frequencies (~730k exomes and ~76k genomes in v4, ~800k individuals in total) | gnomad.broadinstitute.org
OMIM | Gene-disease associations + phenotype descriptions | omim.org
LOVD | Locus-specific variant databases (per gene) | lovd.nl
ClinGen | Gene-disease validity curation; variant curation rules | clinicalgenome.org
SpliceAI | Deep learning splice effect prediction | Available as VEP plugin
AlphaMissense | Structure-based missense pathogenicity from DeepMind | Available via VEP/ANNOVAR

Pharmacogenomics: Variants That Affect Drug Response

Not all medically relevant variants cause disease. Some determine how a patient metabolizes drugs:

DPYD: variants reduce dihydropyrimidine dehydrogenase activity → severely reduced 5-fluorouracil (chemotherapy) clearance → potentially fatal toxicity. CPIC guidelines recommend dose reduction or an alternative drug based on DPYD genotype, which is why genotyping before 5-FU prescription is increasingly adopted.

CYP2C19: poor metabolizers (two loss-of-function alleles) fail to convert clopidogrel (anti-platelet) to its active form → reduced antiplatelet effect → higher cardiovascular event risk. Ultra-rapid metabolizers have increased activation.

TPMT/NUDT15: variants reduce thiopurine methyltransferase (TPMT) or NUDT15 activity → increased active thioguanine nucleotide levels from azathioprine/6-mercaptopurine → bone marrow toxicity. Genotype-guided dosing is standard of care in pediatric leukemia.

The CPIC (Clinical Pharmacogenomics Implementation Consortium) publishes evidence-based guidelines for gene-drug pairs, a direct application of clinical variant interpretation to precision pharmacology.
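Pharmacogenomic reports usually translate a pair of alleles (a diplotype) into a metabolizer phenotype. The sketch below shows that translation for CYP2C19 in a deliberately simplified form. The star-allele names and function assignments are not from the text above: they are limited to a few common alleles (*1 normal function, *2 and *3 no function, *17 increased function), and real calls should use the full CPIC allele tables.

```python
# Simplified allele-function assignments (assumption; common alleles only).
ALLELE_FUNCTION = {"*1": "normal", "*2": "none", "*3": "none", "*17": "increased"}


def cyp2c19_phenotype(allele_a: str, allele_b: str) -> str:
    """Map a CYP2C19 diplotype to a metabolizer phenotype (simplified)."""
    functions = [ALLELE_FUNCTION[allele_a], ALLELE_FUNCTION[allele_b]]
    n_none = functions.count("none")
    n_increased = functions.count("increased")
    if n_none == 2:
        return "poor metabolizer"
    if n_none == 1:
        return "intermediate metabolizer"
    if n_increased == 2:
        return "ultrarapid metabolizer"
    if n_increased == 1:
        return "rapid metabolizer"
    return "normal metabolizer"


for diplotype in [("*1", "*1"), ("*1", "*2"), ("*2", "*2"), ("*17", "*17")]:
    print("/".join(diplotype), "->", cyp2c19_phenotype(*diplotype))
```

An unknown allele raises `KeyError` rather than silently defaulting to normal function, which is the safer failure mode for a clinical lookup.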

The Growing Burden of VUS

As sequencing becomes routine, the volume of variants of uncertain significance (VUS) has grown dramatically. A patient undergoing hereditary cancer panel testing receives an average of 1–2 VUS, in addition to any pathogenic findings.

A VUS creates clinical uncertainty: it is neither actionable nor dismissible. ClinGen's curation working groups are systematically re-evaluating variants across genes to reclassify VUS as evidence accumulates. Machine learning methods (including AlphaMissense, trained on evolutionary and structural data) are improving in silico pathogenicity prediction, which helps reduce the VUS burden computationally.

The long-term trend: more evidence, better algorithms, and larger databases are progressively reclassifying VUS as either pathogenic or benign, making clinical sequencing results progressively more interpretable.