Part 0 · 0.3 · 10 min read

Map of the Territory

An overview of all nine parts of Bio for Devs — what each covers, how they connect, and which paths matter most for your work.


Before you start, it helps to know the shape of what you're about to read. This chapter maps the full curriculum: what each part covers, how the parts depend on each other, and which paths are most relevant for the kind of work you do.

The Nine Parts

Part 0 — Why This Matters is what you're reading now. It explains the gap between software and biology, how this site is structured, and how to use it. You can read it in under thirty minutes.

Part 1 — The Infrastructure of Life builds the foundation. The cell as a system, the key molecular players, the membrane as a boundary. This is the hardware layer: before you can understand what the software does, you need to know what it runs on. The In Practice chapter introduces NCBI and the major biological databases you'll query constantly.

Part 2 — The Genetic Code is the heart of the curriculum. DNA as source code, genes as functions, RNA as bytecode, proteins as executables. This part culminates in the Central Dogma — the fundamental information flow of biology — and then shows you how to work with it using Biopython.

Part 3 — Control and Regulation is where biology gets interesting. Gene expression is not a static read of the source code — it's a dynamic, context-dependent process. Epigenetics, splicing, regulatory networks. This part explains how the same genome can produce hundreds of different cell types. The In Practice chapter builds a gene regulatory network using NetworkX and the STRING protein interaction database.

Part 4 — Communication and Signaling explains how cells talk to each other and how they respond to their environment. Receptors, ligands, signaling cascades, the cell cycle. This is the event-driven architecture of biology.

Part 5 — Virology and Immunology covers a topic most developers care about but few understand mechanistically: how viruses work, how the immune system responds, and how vaccines and therapies exploit these mechanisms. The In Practice chapter works with viral sequence data using BLAST.

Part 6 — Variation, Evolution and Disease connects the molecular machinery to population-level phenomena. Mutations, cancer, evolutionary optimization, genetic diseases. The In Practice chapter introduces VCF files — the standard format for genomic variant data.

Part 7 — Computational Neuroscience covers the neuron as a computational unit, biological neural networks, plasticity, brain signals as data, and brain-computer interfaces. This part has the most direct connection to ML and AI. The In Practice chapter analyzes EEG signals using MNE-Python.

Part 8 — Biostatistics and ML Applied to Biology is the capstone. It explains why biostatistics is different from general statistics, covers the essential tests and methods, and walks through a full RNA-seq analysis pipeline. This is the part that turns domain knowledge into working analyses.
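To make the Part 2 analogy above a little more concrete, here is a minimal sketch of the Central Dogma's information flow (DNA to RNA to protein) in plain Python. It is a toy, not how Part 2 does it: the codon table is a hand-picked subset of the standard genetic code, and the names `CODON_TABLE`, `transcribe`, and `translate` are illustrative assumptions rather than Biopython's API.

```python
# Toy illustration only: a tiny subset of the standard genetic code.
# The full table has 64 codons; this dict is an assumption made for brevity.
CODON_TABLE = {
    "AUG": "M",  # methionine, also the usual start codon
    "UUU": "F",  # phenylalanine
    "GGC": "G",  # glycine
    "GCU": "A",  # alanine
    "UAA": "*",  # stop
    "UAG": "*",  # stop
    "UGA": "*",  # stop
}


def transcribe(coding_dna: str) -> str:
    """DNA (coding strand) -> mRNA: same sequence with T replaced by U."""
    return coding_dna.upper().replace("T", "U")


def translate(mrna: str) -> str:
    """mRNA -> protein: read codons three bases at a time until a stop codon."""
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        # Raises KeyError for codons outside the toy table.
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


dna = "ATGTTTGGCGCTTAA"
mrna = transcribe(dna)
protein = translate(mrna)
print(mrna)     # AUGUUUGGCGCUUAA
print(protein)  # MFGA
```

The two functions mirror the two arrows of the Central Dogma: one rewrites the alphabet, the other maps triplets to amino acids and halts at a stop codon.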

How the Parts Connect

The curriculum has a loose dependency graph. Some parts require earlier parts; others can be read more independently.

Part 0 (orientation)
    └── Part 1 (cell infrastructure)
            └── Part 2 (genetic code)    ← central hub
                    ├── Part 3 (regulation)
                    ├── Part 4 (signaling)
                    ├── Part 5 (virology)
                    ├── Part 6 (variation & disease)
                    └── Part 7 (neuroscience)

Parts 3–7 (all of them)
    └── Part 8 (stats & ML)

Parts 1 and 2 are prerequisites for everything else. You don't need to have memorized them, but you need to have read them. The rest of the curriculum assumes you know what DNA is, what a protein does, and what the Central Dogma says.

Parts 3–7 are relatively independent of each other, though they share vocabulary. You can read them in any order after Part 2. The exception: Part 7 is much easier after Part 3, because gene regulation in neurons follows the same principles as gene regulation everywhere else.

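Part 8 builds on all of the above. The statistical methods only make sense if you understand what you're measuring and why, and the ML applications require knowing what the features represent biologically. If you jump to it early, as the data-scientist path below does, expect to return to the earlier parts as gaps appear.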
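If you think in graphs, the dependencies described above can be sketched as a small DAG. The encoding below is an assumption based on the prose (Parts 3–7 each depend on Part 2, and Part 8 depends on all of Parts 3–7); the names `PREREQS`, `reading_order`, and `all_prereqs` are hypothetical. The "Part 7 is easier after Part 3" advice is a recommendation, so it is deliberately not encoded as a hard edge.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical encoding: each part maps to the set of parts it depends on.
PREREQS = {
    0: set(),
    1: {0},
    2: {1},
    3: {2},
    4: {2},
    5: {2},
    6: {2},
    7: {2},
    8: {3, 4, 5, 6, 7},
}


def reading_order(prereqs):
    """Return one valid reading order (prerequisites always come first)."""
    return list(TopologicalSorter(prereqs).static_order())


def all_prereqs(part, prereqs):
    """Return every part that comes before `part`, directly or indirectly."""
    seen = set()
    stack = list(prereqs[part])
    while stack:
        p = stack.pop()
        if p not in seen:
            seen.add(p)
            stack.extend(prereqs[p])
    return seen


order = reading_order(PREREQS)
print(order)                            # starts with 0, 1, 2 and ends with 8
print(sorted(all_prereqs(3, PREREQS)))  # [0, 1, 2]
```

Any order the sorter returns is a valid way through the curriculum; the position of Parts 3–7 relative to each other is not fixed, which matches the "any order after Part 2" advice.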

For Different Audiences

If you're a software engineer moving into biotech or genomics, read Parts 0–3 first. That covers the vocabulary you'll encounter most — DNA, genes, expression, regulation. Then read Part 6 for variants and Part 8 for the analysis methods.

If you're a data scientist working with omics data, start with Parts 1–2 for the biological context, then go straight to Part 8. Come back to Parts 3–6 as specific topics come up in your work.

If you're an ML engineer working on protein structure, drug discovery, or genomics models, Parts 2 and 3 are essential. Understanding what proteins are at the molecular level — not just as sequences or 3D structures — will make your feature engineering much more principled. Part 8 is directly useful for evaluation methodology.

If you're a biologist who wants to understand the computational side, you can skim Parts 0–3 (you already know this material) and focus on the In Practice chapters, which explain the tools in biological terms.

What You'll Be Able to Do

After Part 1: You can read papers that describe cell-level experiments and understand what's being measured and why.

After Part 2: You can understand what bioinformatics tools are actually computing — what it means to align sequences, call variants, or quantify gene expression.

After Part 3: You can read about regulatory mechanisms and gene networks without losing the thread. You understand why the same gene can behave differently in different cells.

After Part 8: You can design and critique biological data analyses. You know what the statistical assumptions are, why they matter in biology specifically, and what "good enough to publish" looks like.

The In Practice Chapters

Each In Practice chapter ends a part with working code. These chapters are self-contained — you can run the code without having done the theoretical chapters, and the theoretical chapters don't require you to have run the code. But the two together are more than the sum of their parts.

The map is not the territory. You'll encounter concepts in your work that this curriculum doesn't cover in depth. That's expected. The goal is to give you the conceptual foundation from which you can navigate the territory yourself.

Start reading. The gap closes faster than you think.