Beginner's Guide: SQL Index: Why and How?
Last updated: February 3rd 2025
Introduction
Imagine a stack of hay, where each piece of hay has a uniquely colored string attached to it. If you want to retrieve a specific piece of hay, all you need to do is pull the string connected to it.
The alternative would be going over the entire stack of hay, one by one, checking every piece—definitely not an efficient approach.
In this analogy: - The haystack represents a heap (unsorted data). - The string represents an index.
Creating a Users Table
Here’s an example of creating a simple users
table:
CREATE TABLE users ( id INT, username VARCHAR(50), age INT );
Now let’s query for a user with the username superuser
:
SELECT username, id, age FROM users WHERE username = 'superuser';
⚠️ Why This Is Bad
Without an index, this query performs a full table scan (O(N) time complexity).
That’s quite possibly the worst thing you could do to your database. If you repeatedly run O(N) queries for every read operation, you might as well write your data on parchment and send it to the user via homing pigeon.
So… What Should You Do?
Tie every hay piece to a string—in other words, create a non-clustered index:
CREATE INDEX idx_users_table_by_username ON users(username);
Alternatively, you can define a PRIMARY KEY
while creating the table:
CREATE TABLE users ( id INT, username VARCHAR(50) PRIMARY KEY, age INT );
Why Is Indexing Powerful?
Imagine you have a table with 1 billion records. Here’s how an index changes the game:
Scenario | Complexity | Operations |
---|---|---|
Without an Index | O(N) | Up to 1 billion operations |
With an Index | O(log N) | Approximately 30 operations |
Be Careful with Indexes
1. Size
Indexes can consume up to 20% of the table’s size, depending on the database engine and type of index. While the performance benefits usually outweigh the storage cost, be thoughtful about which columns to index.
2. Slower Writes
Indexes make write operations slower (e.g., INSERT
, UPDATE
, DELETE
) because the database has to update the index every time data changes. It’s like adding more strings to your haystack—eventually, it becomes harder to manage.
3. Not Every Column Deserves an Index
Indexing low-selectivity columns (e.g., gender with only two values) is like alphabetizing two books on a shelf—pointless.
Focus on columns with high selectivity, like unique usernames or email addresses.
4. Rebuild Indexes Regularly
Over time, indexes can become fragmented, like a chaotic desk drawer. Rebuilding indexes periodically keeps them efficient and organized.
Conclusion
Indexes are incredibly powerful, but they come with trade-offs. Use them wisely, monitor their impact, and keep your database performing at its best!
This article was written by Ahmad Adel. Ahmad is a freelance writer and also a backend developer.
Related articles
-
Beginner's Guide: Clustered vs Non-clustered Index
Short article on Clustered and Non-clustered indexes
Last updated: February 3rd 2025
-
Database Clustering: Theory
Brief theory on database clustering
Last updated: February 27th 2025
-
Database Clustering: The Infrastructure
On setting up infrastructure for clustering your Database
Last updated: February 27th 2025