Every person carries an estimated 4–5 million variants relative to the human reference genome, including roughly 60 de novo variants (not present in either parent), of which only one or two typically fall in coding sequence. Most are benign. About 50–100 are in genes associated with disease. A handful may be medically actionable. The clinical genetics problem, identifying which variants cause disease in a specific patient, is one of the most consequential applications of bioinformatics.
Modes of Inheritance
The way a disease is transmitted from parent to child depends on whether it's dominant or recessive, and whether the gene is on an autosome, the X chromosome, or mitochondrial DNA.
Autosomal Dominant (AD)
One mutant allele is sufficient to cause disease. Each child of an affected parent has a 50% chance of inheriting the variant.
Mechanism: the mutant protein is toxic (gain of function), it interferes with the normal protein (dominant negative), or a 50% reduction in functional protein is insufficient (haploinsufficiency).
Examples:
- Huntington's disease (HTT CAG expansion): expanded polyglutamine tract in huntingtin creates a toxic gain-of-function
- BRCA1/2 (hereditary breast/ovarian cancer): dominantly inherited predisposition; one functional copy is usually sufficient for DNA repair, but a second hit in somatic cells removes it and leads to cancer (the two-hit model)
- Marfan syndrome (FBN1): dominant negative; the mutant fibrillin-1 disrupts normal fibrillin assembly
AD conditions often show variable expressivity (affected individuals differ in severity) and incomplete penetrance (not everyone who carries the variant develops the disease).
Autosomal Recessive (AR)
Both alleles must be mutated for disease. Carriers (one mutant allele) are usually healthy. Two carrier parents have a 25% probability of an affected child with each pregnancy.
Mechanism: requires complete or near-complete loss of function. A single functional allele usually provides enough protein for normal function.
Examples:
- Cystic fibrosis (CFTR variants): ~1/25 Northern European carriers; ~1/2500 births affected
- Sickle cell disease (HBB E6V): HbS homozygotes have profound hemolytic anemia; heterozygotes have sickle cell trait, usually with mild or no symptoms, and are protected against severe malaria
- Phenylketonuria (PKU) (PAH variants): phenylalanine hydroxylase deficiency; dietary phenylalanine accumulates → neurotoxicity; newborn screening + dietary treatment prevents intellectual disability
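The cystic fibrosis figures above are linked by simple arithmetic, which the following minimal sketch works through. It assumes random mating and Hardy-Weinberg proportions; the function names are illustrative and not taken from any library.

```python
import math
from fractions import Fraction


def incidence_from_carrier_frequency(carrier_freq: Fraction) -> Fraction:
    """Approximate AR disease incidence under random mating.

    Both parents must be carriers (carrier_freq squared), and each child
    of two carriers has a 1/4 chance of inheriting both mutant alleles.
    """
    return carrier_freq * carrier_freq * Fraction(1, 4)


def carrier_frequency_from_incidence(incidence: float) -> float:
    """Hardy-Weinberg estimate: q = sqrt(incidence), carriers = 2pq."""
    q = math.sqrt(incidence)
    p = 1.0 - q
    return 2.0 * p * q


cf_incidence = incidence_from_carrier_frequency(Fraction(1, 25))
print(cf_incidence)  # 1/2500

cf_carriers = carrier_frequency_from_incidence(1 / 2500)
print(round(cf_carriers, 4))  # 0.0392, i.e. close to 1 in 25
```

The two directions agree only approximately because the first treats the carrier frequency as 2q while the second uses the exact 2pq term.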
X-linked
The causal gene lies on the X chromosome. Males (XY) are hemizygous: they have only one X, so a single recessive allele causes disease. Females (XX) can be carriers.
X-linked recessive (XLR): males affected, females usually carriers.
- Duchenne muscular dystrophy (DMD): frame-disrupting dystrophin variants; males affected; females carriers
- Hemophilia A (F8) and B (F9)
- Color blindness (OPN1LW/MW)
X-linked dominant (XLD): affects both males and females.
- Rett syndrome (MECP2): more severe in males; often lethal in hemizygous males
Mitochondrial
Mitochondrial DNA (mtDNA) has 37 genes. mtDNA is maternally inherited (essentially all mitochondria in the embryo come from the oocyte). Affected mothers pass mtDNA to all children; affected fathers do not.
Mitochondrial diseases affect high-energy tissues: brain, muscle, heart. Examples: MELAS (mitochondrial encephalomyopathy), Leber's hereditary optic neuropathy (LHON).
A complication: heteroplasmy. Cells may contain a mixture of normal and mutant mtDNA. Disease severity correlates with the proportion of mutant mtDNA, which can vary between tissues and change over time.
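As a rough illustration of how heteroplasmy is quantified, the sketch below estimates the mutant fraction per tissue from sequencing read counts. The read counts and tissue names are hypothetical, and real pipelines also account for sequencing error and nuclear mtDNA-like sequences, which this sketch ignores.

```python
def heteroplasmy_fraction(mutant_reads: int, total_reads: int) -> float:
    """Fraction of reads at an mtDNA site that carry the mutant allele."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    if not 0 <= mutant_reads <= total_reads:
        raise ValueError("mutant_reads must be between 0 and total_reads")
    return mutant_reads / total_reads


# Hypothetical (mutant, total) read counts at one site in two tissues.
tissue_counts = {"blood": (120, 1000), "muscle": (780, 1000)}

levels = {
    tissue: heteroplasmy_fraction(mutant, total)
    for tissue, (mutant, total) in tissue_counts.items()
}

for tissue, level in levels.items():
    print(f"{tissue}: {level:.0%} mutant mtDNA")
```

A gap like the one in this made-up example is why a blood sample alone can understate the mutant load in an affected tissue.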
De Novo Mutations
Not all genetic diseases are inherited. De novo mutations (new variants not present in either parent) arise in the germline (egg or sperm) or in the early embryo. They're found by trio sequencing: sequencing the patient plus both parents and looking for variants in the patient that aren't in either parent.
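The trio comparison can be sketched as a simple genotype filter. This is a minimal illustration, not a production caller: the variant IDs and genotypes are hypothetical, and real pipelines add read-depth, genotype-quality, and allele-balance checks to avoid calling sequencing errors as de novo.

```python
from typing import Dict, List, Tuple

# Allele pair per individual: 0 = reference allele, 1 = alternate allele.
Genotype = Tuple[int, int]


def is_candidate_de_novo(child: Genotype, mother: Genotype, father: Genotype) -> bool:
    """True when the child carries an alternate allele that neither parent has."""
    child_has_alt = 1 in child
    parents_ref_only = mother == (0, 0) and father == (0, 0)
    return child_has_alt and parents_ref_only


def find_de_novo(trio_calls: Dict[str, Dict[str, Genotype]]) -> List[str]:
    """Return the IDs of variants that pass the trio filter."""
    return [
        variant_id
        for variant_id, gts in trio_calls.items()
        if is_candidate_de_novo(gts["child"], gts["mother"], gts["father"])
    ]


# Hypothetical calls for three variants in one family.
trio_calls = {
    "var_A": {"child": (0, 1), "mother": (0, 0), "father": (0, 0)},  # de novo candidate
    "var_B": {"child": (0, 1), "mother": (0, 1), "father": (0, 0)},  # inherited from mother
    "var_C": {"child": (0, 0), "mother": (0, 1), "father": (0, 1)},  # not present in child
}

candidates = find_de_novo(trio_calls)
print(candidates)  # ['var_A']
```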
De novo rate: ~1 × 10⁻⁸ per base pair per generation → approximately 60 de novo single-nucleotide variants per person. Paternal age is the dominant factor: the mutation rate in sperm increases with age (the male germline undergoes far more cell divisions than the female germline).
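The estimate of about 60 follows from multiplying the per-base rate by the size of the diploid genome. The sketch below shows the arithmetic; the haploid genome size of 3.1 billion base pairs is an assumed round figure that the text does not state.

```python
MUTATION_RATE_PER_BP = 1e-8  # per base pair per generation (from the text)
HAPLOID_GENOME_BP = 3.1e9    # assumed approximate human haploid genome size
COPIES_PER_PERSON = 2        # one genome copy inherited from each parent

expected_de_novo = MUTATION_RATE_PER_BP * HAPLOID_GENOME_BP * COPIES_PER_PERSON
print(round(expected_de_novo))  # 62
```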
De novo mutations cause:
- ~50% of severe intellectual disability cases
- A substantial share of early-onset neurodevelopmental conditions (autism, epileptic encephalopathies), with a smaller contribution to schizophrenia
- Achondroplasia (FGFR3 G380R accounts for ~98% of cases; about 80% of cases arise de novo)
De novo status is a strong indicator of pathogenicity: if a variant is not inherited from either unaffected parent and sits in a gene that fits the patient's serious phenotype, it's likely causative.
Variable Expressivity and Penetrance
Even with a clearly pathogenic variant, not everyone who carries it is equally affected:
Penetrance: the fraction of individuals with a variant who display the phenotype. BRCA1 pathogenic variants confer ~70–80% lifetime risk of breast cancer, which is high but not 100% (incomplete penetrance).
Variable expressivity: individuals with the same pathogenic variant can have different severity. People with NF1 (neurofibromatosis type 1) range from mild café-au-lait spots only to severe neurofibromas and malignant peripheral nerve sheath tumors.
Modifying factors: other genetic variants (modifier genes) and environmental exposures can modulate penetrance and expressivity. This is why GWAS and polygenic risk scores are relevant even for monogenic diseases: genetic background modifies outcomes.
The Clinical Genetics Pipeline
Step 1: Clinical Recognition and Ordering Sequencing
The clinical process begins with a patient (or family) presenting with symptoms suggesting a genetic condition. The clinician decides what test to order:
- Chromosomal microarray: first-tier test for intellectual disability/autism; detects CNVs across the genome
- Single-gene sequencing: when the clinical presentation strongly suggests a specific diagnosis (e.g., DMD in a boy with early-onset proximal weakness)
- Gene panel: sequencing of multiple genes associated with the same phenotypic spectrum (e.g., hereditary cancer panel: BRCA1/2, PALB2, CHEK2, ATM, etc.)
- Exome sequencing: all coding regions (~1–2% of the genome); first-tier for unexplained pediatric disease
- Genome sequencing: the full genome; highest diagnostic yield; increasingly cost-competitive
Step 2: Variant Calling and Annotation
Raw reads → alignment → variant calling → annotation.
Annotation pipeline:
- ANNOVAR/VEP: predict functional consequence (synonymous, missense, stop-gain, splice site)
- Population frequencies: gnomAD allele frequency (AF). If AF > 1% in gnomAD, the variant is unlikely to cause a dominant disease with high penetrance
- In silico pathogenicity predictors: SIFT, PolyPhen-2, REVEL, AlphaMissense (uses structure prediction)
- ClinVar lookup: known classifications from other labs
- Literature search: published case reports, functional studies
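The frequency and consequence steps of this annotation pipeline can be sketched as a filter over annotated variants. This is a simplified illustration: the `AnnotatedVariant` record, its field names, the consequence labels, and the example variants are all hypothetical rather than the output format of ANNOVAR or VEP. The 1% cutoff comes from the rule of thumb above.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class AnnotatedVariant:
    variant_id: str
    consequence: str            # e.g. "synonymous", "missense", "stop_gain", "splice_site"
    gnomad_af: Optional[float]  # None means the variant is absent from gnomAD


# Consequence classes kept for review in this sketch (an assumption, not a standard).
POTENTIALLY_DAMAGING = {"missense", "stop_gain", "splice_site"}


def prioritize(variants: List[AnnotatedVariant], max_af: float = 0.01) -> List[AnnotatedVariant]:
    """Keep rare variants whose predicted consequence may alter the protein."""
    kept = []
    for v in variants:
        af = 0.0 if v.gnomad_af is None else v.gnomad_af
        if af <= max_af and v.consequence in POTENTIALLY_DAMAGING:
            kept.append(v)
    return kept


variants = [
    AnnotatedVariant("var_1", "missense", 0.15),      # too common
    AnnotatedVariant("var_2", "synonymous", 0.0001),  # rare but likely silent
    AnnotatedVariant("var_3", "stop_gain", None),     # absent from gnomAD
    AnnotatedVariant("var_4", "missense", 0.00002),   # rare missense
]

shortlist = [v.variant_id for v in prioritize(variants)]
print(shortlist)  # ['var_3', 'var_4']
```

In practice the frequency cutoff is tuned to the disease's prevalence and inheritance mode, and recessive conditions tolerate higher carrier frequencies than the 1% used here.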
Step 3: Variant Interpretation (ACMG/AMP Guidelines)
The ACMG/AMP 2015 guidelines (updated by ClinGen) provide a standardized framework:
| Evidence type | Criteria examples |
|---|---|
| Population data | PM2: absent from controls; BS1: allele frequency above expected |
| Computational | PP3: multiple in silico tools predict damaging; BP4: predict benign |
| Functional | PS3: well-established functional studies show damaging effect; BS3: shows no damaging effect |
| Segregation | PP1: co-segregates with disease in multiple affected family members |
| De novo | PS2: confirmed de novo in affected individual; PM6: assumed de novo |
| Allelic | PM3: detected in trans with a pathogenic variant (for AR) |
| Case data | PS4: prevalence in affected individuals significantly increased vs. controls |
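Evidence is then combined into a final classification: pathogenic, likely pathogenic, VUS, likely benign, or benign. The 2015 guidelines do this with combining rules; the ClinGen-endorsed point-based adaptation assigns points by evidence strength (supporting 1, moderate 2, strong 4, very strong 8, with negative points for benign evidence) and calls a variant pathogenic at ≥10 points.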
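A point-based classifier can be sketched in a few lines. The weights and thresholds below follow the point-based adaptation of the ACMG/AMP framework as I understand it (supporting 1, moderate 2, strong 4, very strong 8; pathogenic ≥10, likely pathogenic 6 to 9, VUS 0 to 5, likely benign −1 to −6, benign ≤ −7) and should be checked against current ClinGen guidance before any real use. The strength of each criterion is read from its code at default strength; stand-alone BA1 and lab-specific strength adjustments are deliberately not handled.

```python
import re
from typing import Iterable

# Points per evidence strength (assumed from the point-based adaptation).
STRENGTH_POINTS = {"P": 1, "M": 2, "S": 4, "VS": 8}

# Criterion codes look like PVS1, PS2, PM2, PP3 (pathogenic) or BS1, BP4 (benign).
_CODE = re.compile(r"^(P|B)(VS|S|M|P)(\d+)$")


def criterion_points(code: str) -> int:
    """Signed points for one criterion code at its default strength."""
    match = _CODE.match(code)
    if match is None:
        raise ValueError(f"unsupported criterion code: {code}")
    direction, strength, _number = match.groups()
    points = STRENGTH_POINTS[strength]
    return points if direction == "P" else -points


def classify(criteria: Iterable[str]) -> str:
    """Sum the points and map the total onto the five-tier classification."""
    total = sum(criterion_points(code) for code in criteria)
    if total >= 10:
        return "pathogenic"
    if total >= 6:
        return "likely pathogenic"
    if total >= 0:
        return "VUS"
    if total >= -6:
        return "likely benign"
    return "benign"


print(classify(["PS2", "PM2", "PP3"]))  # 4 + 2 + 1 = 7  -> likely pathogenic
print(classify(["PS2", "PS3", "PM2"]))  # 4 + 4 + 2 = 10 -> pathogenic
print(classify(["BS1", "BP4"]))         # -4 - 1 = -5    -> likely benign
```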
Step 4: Return of Results
Variants are reported in a clinical format:
- Pathogenic/likely pathogenic variants: reported with interpretation and implications
- VUS: reported with explanation that significance is unclear; patient and family may return for re-evaluation as evidence accumulates
- Benign/likely benign: usually not reported
Secondary findings: ACMG recommends reporting pathogenic variants in 81 genes that are medically actionable regardless of the test indication, including BRCA1/2, Lynch syndrome genes, cardiac channelopathy genes, and familial hypercholesterolemia genes. If you sequence a patient for any reason and find a BRCA1 pathogenic variant, you report it, unless the patient has opted out of secondary findings.
Key Databases for Clinical Variant Interpretation
| Database | Content | URL |
|---|---|---|
| ClinVar | Variant-disease interpretations from labs | ncbi.nlm.nih.gov/clinvar |
| gnomAD | Population allele frequencies (v4: ~730k exomes, ~76k genomes) | gnomad.broadinstitute.org |
| OMIM | Gene-disease associations + phenotype descriptions | omim.org |
| LOVD | Locus-specific variant databases (per gene) | lovd.nl |
| ClinGen | Gene-disease validity curation; variant curation rules | clinicalgenome.org |
| SpliceAI | Deep learning splice effect prediction | Available as VEP plugin |
| AlphaMissense | Structure-based missense pathogenicity from DeepMind | Available via VEP/ANNOVAR |
Pharmacogenomics: Variants That Affect Drug Response
Not all medically relevant variants cause disease; some determine how a patient metabolizes drugs:
DPYD: variants reduce dihydropyrimidine dehydrogenase activity → severely reduced 5-fluorouracil (chemotherapy) clearance → potentially fatal toxicity. CPIC guidelines give DPYD genotype-based dosing recommendations for 5-FU, which makes genotyping before prescription valuable.
CYP2C19: poor metabolizers (two loss-of-function alleles) fail to convert clopidogrel (anti-platelet) to its active form → reduced antiplatelet effect → higher cardiovascular event risk. Ultra-rapid metabolizers have increased activation.
TPMT/NUDT15: variants in either gene impair thiopurine inactivation (TPMT encodes thiopurine methyltransferase; NUDT15 encodes a nucleotide diphosphatase) → increased thioguanine nucleotide levels from azathioprine/6-mercaptopurine → bone marrow toxicity. Genotype-guided dosing is standard of care in pediatric leukemia.
The CPIC (Clinical Pharmacogenomics Implementation Consortium) publishes evidence-based guidelines for gene-drug pairs, a direct application of clinical variant interpretation to precision pharmacology.
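Translating a genotype into a metabolizer phenotype is essentially a table lookup. The sketch below does this for a small, commonly cited subset of CYP2C19 star alleles (*1 normal function, *2 and *3 no function, *17 increased function). It is an illustrative simplification: the allele table is deliberately incomplete, the function name is made up, and any clinical use should rely on the current CPIC allele functionality and diplotype-phenotype tables.

```python
from typing import Dict, Tuple

# Assumed allele functions for a small subset of CYP2C19 star alleles.
ALLELE_FUNCTION: Dict[str, str] = {
    "*1": "normal",
    "*2": "none",
    "*3": "none",
    "*17": "increased",
}

# Phenotype for each alphabetically sorted pair of allele functions.
PHENOTYPE_BY_FUNCTIONS: Dict[Tuple[str, str], str] = {
    ("increased", "increased"): "ultrarapid metabolizer",
    ("increased", "normal"): "rapid metabolizer",
    ("normal", "normal"): "normal metabolizer",
    ("none", "normal"): "intermediate metabolizer",
    ("increased", "none"): "intermediate metabolizer",
    ("none", "none"): "poor metabolizer",
}


def cyp2c19_phenotype(allele_1: str, allele_2: str) -> str:
    """Map a diplotype such as ('*1', '*2') to a metabolizer phenotype."""
    for allele in (allele_1, allele_2):
        if allele not in ALLELE_FUNCTION:
            raise ValueError(f"allele not in this simplified table: {allele}")
    functions = sorted((ALLELE_FUNCTION[allele_1], ALLELE_FUNCTION[allele_2]))
    return PHENOTYPE_BY_FUNCTIONS[(functions[0], functions[1])]


print(cyp2c19_phenotype("*1", "*1"))   # normal metabolizer
print(cyp2c19_phenotype("*2", "*2"))   # poor metabolizer
print(cyp2c19_phenotype("*1", "*17"))  # rapid metabolizer
```

A poor-metabolizer result is the case where the text above predicts reduced clopidogrel activation.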
The Growing Burden of VUS
As sequencing becomes routine, the volume of variants of uncertain significance (VUS) has grown dramatically. A patient undergoing hereditary cancer panel testing receives an average of 1–2 VUS, in addition to any pathogenic findings.
VUS burden creates clinical uncertainty: a VUS is neither actionable nor dismissible. ClinGen's curation working groups are systematically re-evaluating variants across genes to reclassify VUS as evidence accumulates. Machine learning methods (including AlphaMissense, trained on evolutionary and structural data) are improving in silico pathogenicity prediction, which helps reduce the VUS burden computationally.
The long-term trend: more evidence, better algorithms, and larger databases are progressively reclassifying VUS as either pathogenic or benign, making clinical sequencing results progressively more interpretable.