Indexes are the single most impactful performance tool available to database users. They trade write speed and storage space for dramatically faster reads — and understanding their internal data structures is essential to using them well.

What Is an Index, Really?

An index is a separate data structure that the database maintains alongside your table. It stores a subset of your columns in a format that allows fast lookups, and holds pointers back to the full rows in the original table (the "heap file"). Think of it like the index of a textbook — you do not read the whole book to find a topic, you consult the index, get a page number, and jump there.

B-Tree Indexes

The default index type in PostgreSQL, MySQL, SQLite, and most relational databases. A B-Tree (Balanced Tree) keeps its keys in sorted order across a multi-level tree where every leaf node is at the same depth. This gives O(log n) lookup, insert, and delete performance, and crucially supports:

  • Equality lookups: WHERE email = 'a@b.com'
  • Range queries: WHERE created_at BETWEEN '2025-01-01' AND '2025-06-01'
  • Sorting without a sort step: ORDER BY last_name
  • Prefix matching: WHERE name LIKE 'Ada%'
SQL — POSTGRESQL
-- B-Tree index (default) on a frequently queried column CREATE INDEX idx_users_email ON users (email); -- Composite index: useful when you always filter by -- both columns together. Column ORDER matters. CREATE INDEX idx_orders_user_date ON orders (user_id, created_at DESC);

Hash Indexes

Hash indexes compute a hash of the key and store it in a hash table. Average O(1) lookup — faster than B-Tree for exact equality. The trade-off: they cannot support range queries or sorting. Use them when your query pattern is purely equality-based on a high-cardinality column.

SQL — POSTGRESQL
-- Hash index: O(1) equality, no range support CREATE INDEX idx_sessions_token ON sessions USING hash (token); -- Only useful for: WHERE token = '...' -- Useless for: WHERE token > '...' -- ORDER BY token

When NOT to Add an Index

Every index has costs that compound as you add more:

  • Write overhead — every INSERT, UPDATE, and DELETE must also update every index on the table. A table with 8 indexes on it effectively does 9 writes per row mutation.
  • Storage — indexes can consume as much or more space than the table itself.
  • Small tables — if a table fits in memory, a sequential scan is often faster than an index lookup (no pointer indirection).
  • Low-cardinality columns — an index on a boolean column (true/false) is almost always useless.
The query planner knows all of this. Use EXPLAIN ANALYSE to see whether your index is actually being used — and what the cost estimate is with and without it.

Covering Indexes

A covering index includes all columns that a query needs, so the database can answer entirely from the index without touching the heap. This is the highest-performance scenario because it eliminates all I/O on the main table.