Database Partitioning in PostgreSQL

Last updated: February 5th 2025

Introduction

A wise man once said: “The best way to query a billion-rows table is not to query a billion-rows table.”

Indexing is a hell of a tool; it can easily boost your database performance by 500x with a single line of code. Unfortunately, sometimes even indexing isn’t enough. If you happen to work with a table containing billions of rows—yes, billions with a “B”—then you’ll face a new set of challenges.

Let’s play pretend. You are a software engineer at The New York Times, and they’ve tasked you with the mission of digitizing their entire archive. They want to make all content, starting from September 18, 1851, available online. This archive contains 4K images of articles, transcriptions, and historical videos and audios.

While this data holds immense historical value, it will likely generate $0 in revenue. Why? Because no one cares; no one is reading it. But you, being a clever software engineer, decide to cut costs by reserving the precious M.2 SSDs for your “hot” articles that bring millions of visitors daily and keeping these historical records on old, cheap, and slow HDDs. After all, they won’t get much traffic anyway.

To optimize storage, you decide to partition the data by range as follows:

Historical Era

1851–1870, 1871–1890, 1891–1914, 1915–1928, 1929–1939, 1940–1946, 1947–1956, 1957–1969, 1970–1979, 1980–1989, 1990–1999, 2000–2008

These partitions will rest on some cheap drive—maybe not an HDD, but definitely not an M.2. Each range represents part of human history.

Modern Era

  • 2009–2016 (Obama era, social media growth)
  • 2017–2020 (Trump presidency, polarization)
  • 2021–2023 (Post-Trump, COVID-19 recovery)

Now, you instruct your developers to create a special kind of routing pattern to accommodate the partitioned database:
https://www.nytimes.com/{year}/{month}/{day}/{slug}

This will result in URLs like the following:

https://www.nytimes.com/1940/07/27/archives/bugs-bunny-by-tex-avery-in-all-the-cinemas-starring-today.html

Implementing Partitioning

Step 1: Give ownership of the partition to PostgreSQL

$ sudo chown postgres:postgres /path/to/partition

Step 2: Create the partitioned structure

Create tablespaces for different storage tiers

CREATE TABLESPACE hdd LOCATION '/path/to/hdd';
CREATE TABLESPACE ssd LOCATION '/path/to/ssd';

Create the main partitioned table

CREATE TABLE nyt_articles (
    article_id SERIAL,
    publication_date DATE NOT NULL,
    slug TEXT NOT NULL,
    content TEXT,
    media_urls TEXT[],
    PRIMARY KEY (article_id, publication_date)
) PARTITION BY RANGE (publication_date);

Historical partitions on HDD

CREATE TABLE historical_1851_1870 PARTITION OF nyt_articles
    FOR VALUES FROM ('1851-01-01') TO ('1871-01-01')
    TABLESPACE hdd;
CREATE TABLE historical_1871_1890 PARTITION OF nyt_articles
    FOR VALUES FROM ('1871-01-01') TO ('1891-01-01')
    TABLESPACE hdd;

Modern partitions on SSD

CREATE TABLE modern_2009_2016 PARTITION OF nyt_articles
    FOR VALUES FROM ('2009-01-01') TO ('2017-01-01')
    TABLESPACE ssd;
CREATE TABLE modern_2017_2020 PARTITION OF nyt_articles
    FOR VALUES FROM ('2017-01-01') TO ('2021-01-01')
    TABLESPACE ssd;
CREATE TABLE modern_2021_2023 PARTITION OF nyt_articles
    FOR VALUES FROM ('2021-01-01') TO ('2024-01-01')
    TABLESPACE ssd;

Future partitions

CREATE TABLE modern_2024_2030 PARTITION OF nyt_articles
    FOR VALUES FROM ('2024-01-01') TO ('2030-01-01')
    TABLESPACE ssd;

Now, every time an insert occurs, PostgreSQL will handle the partitioning for you.

Conclusion

This article outlined steps on how to implement partitioning in PostgreSQL.

This article was written by Ahmad AdelAhmad is a freelance writer and also a backend developer.