Database Indexing: Why Your Queries Are Slow (Library Edition)
Imagine a library with 10 million books and no catalog system. No index, no author listing, no subject categories. Just shelves upon shelves of unsorted books.
Want to find all books by Gabriel García Márquez? Better start reading.
That's your database without indexes.
The Bookshelf Analogy
Full Table Scan: The Worst-Case Scenario
Without an index, finding a specific book means checking every single book in the library. In database terms, this is called a "full table scan." It's slow, inefficient, and painful.
The Index: A Sorted Catalog
An index is like a card catalog, sorted by author, title, or subject. You look up "García Márquez," find the card, and it tells you exactly which shelf and row to go to.
In database terms, the index stores the values you're searching for, along with pointers to the actual data rows. Instead of scanning millions of rows, the database scans the index—which is much smaller and sorted.
How Indexes Work
The B-Tree: The Gold Standard
Most databases use B-tree indexes (the "B" stands for "balanced"). Think of it like a sorted array that lets you find things quickly.
- **Root:** The top of the tree. Starts your search.
- **Branches:** Intermediate nodes that narrow down your search.
- **Leaves:** The bottom level. Contains the actual values and pointers to data.
The magic of B-trees is that they're balanced. No matter how much data you add, the search always takes the same number of steps. It's O(log n)—fast.
Hash Indexes: The Quick Lookup
For exact matches, hash indexes are faster. Think of it like a dictionary: you look up a word, and it gives you the definition instantly.
But hash indexes only work for exact matches. Can't do range queries ("find all books published between 2000 and 2010").
What Makes a Good Index
selectivity: The Unique Factor
An index on "gender" (M/F) is almost useless—it only splits data in half. An index on "email" is highly selective—each value is unique.
The more unique values, the better the index.
Column Order Matters
If you query by (last_name, first_name), a composite index on (last_name, first_name) works. But an index on just (first_name) won't help.
The Write Penalty
Every INSERT, UPDATE, or DELETE needs to update every relevant index. More indexes = slower writes.
It's a trade-off. But for read-heavy applications, it's usually worth it.
The Bottom Line
Indexes are the difference between "query takes 10 seconds" and "query takes 10 milliseconds."
Get them right, and your database flies. Get them wrong, and you're reading every book in the library.