Back to Break It Down
Break It Down

Database Indexing: Why Your Queries Are Slow (Library Edition)

Qentium TeamNov 30, 20245 min read

Imagine a library with 10 million books and no catalog system. No index, no author listing, no subject categories. Just shelves upon shelves of unsorted books.

Want to find all books by Gabriel García Márquez? Better start reading.

That's your database without indexes.

The Bookshelf Analogy

Full Table Scan: The Worst-Case Scenario

Without an index, finding a specific book means checking every single book in the library. In database terms, this is called a "full table scan." It's slow, inefficient, and painful.

The Index: A Sorted Catalog

An index is like a card catalog, sorted by author, title, or subject. You look up "García Márquez," find the card, and it tells you exactly which shelf and row to go to.

In database terms, the index stores the values you're searching for, along with pointers to the actual data rows. Instead of scanning millions of rows, the database scans the index—which is much smaller and sorted.

How Indexes Work

The B-Tree: The Gold Standard

Most databases use B-tree indexes (the "B" stands for "balanced"). Think of it like a sorted array that lets you find things quickly.

  • **Root:** The top of the tree. Starts your search.
  • **Branches:** Intermediate nodes that narrow down your search.
  • **Leaves:** The bottom level. Contains the actual values and pointers to data.

The magic of B-trees is that they're balanced. No matter how much data you add, the search always takes the same number of steps. It's O(log n)—fast.

Hash Indexes: The Quick Lookup

For exact matches, hash indexes are faster. Think of it like a dictionary: you look up a word, and it gives you the definition instantly.

But hash indexes only work for exact matches. Can't do range queries ("find all books published between 2000 and 2010").

What Makes a Good Index

selectivity: The Unique Factor

An index on "gender" (M/F) is almost useless—it only splits data in half. An index on "email" is highly selective—each value is unique.

The more unique values, the better the index.

Column Order Matters

If you query by (last_name, first_name), a composite index on (last_name, first_name) works. But an index on just (first_name) won't help.

The Write Penalty

Every INSERT, UPDATE, or DELETE needs to update every relevant index. More indexes = slower writes.

It's a trade-off. But for read-heavy applications, it's usually worth it.

The Bottom Line

Indexes are the difference between "query takes 10 seconds" and "query takes 10 milliseconds."

Get them right, and your database flies. Get them wrong, and you're reading every book in the library.