Matthew Hodge
Full Stack Developer

MySQL Table Partitioning: A Practical Guide with Examples

When your tables start hitting tens of millions of rows, queries that used to be instant start dragging. Indexes help, but at a certain scale they're not always enough. MySQL table partitioning is one of those tools that can make a real difference — and it's often overlooked because it feels more "DBA territory" than developer territory.

It's not. As a developer working on data-heavy applications, understanding partitioning will save you when indexes alone aren't cutting it.


What Is Table Partitioning?

Partitioning splits a single logical table into multiple physical storage segments called partitions, each holding a subset of the data. From the application side, nothing changes — you still query the same table with the same SQL. MySQL handles the routing internally.

The key benefit is partition pruning: when MySQL knows that the data you're querying only lives in certain partitions, it skips the rest entirely. For large tables filtered on the partition key, this can reduce the amount of data scanned from millions of rows to thousands.


Setting Up: A Sample Orders Table

Let's use an orders table to demonstrate each partition type. First, create the base dataset:

CREATE DATABASE partitioning_demo;
USE partitioning_demo;

RANGE Partitioning

RANGE partitioning splits data based on column value ranges. The most common use case is partitioning by date — splitting years or months into separate partitions.

CREATE TABLE orders (
    id          INT NOT NULL AUTO_INCREMENT,
    customer_id INT NOT NULL,
    total       DECIMAL(10, 2) NOT NULL,
    status      VARCHAR(20) NOT NULL,
    created_at  DATE NOT NULL,
    PRIMARY KEY (id, created_at)
)
PARTITION BY RANGE (YEAR(created_at)) (
    PARTITION p2022 VALUES LESS THAN (2023),
    PARTITION p2023 VALUES LESS THAN (2024),
    PARTITION p2024 VALUES LESS THAN (2025),
    PARTITION p2025 VALUES LESS THAN (2026),
    PARTITION p_future VALUES LESS THAN MAXVALUE
);

Insert some sample data across years:

INSERT INTO orders (customer_id, total, status, created_at) VALUES
(1,  120.00, 'completed', '2022-03-15'),
(2,  340.50, 'completed', '2022-07-22'),
(3,   89.99, 'refunded',  '2023-01-10'),
(4,  450.00, 'completed', '2023-06-18'),
(5,  230.00, 'pending',   '2024-02-28'),
(6,  670.00, 'completed', '2024-08-14'),
(7,  115.00, 'completed', '2025-01-05'),
(8,  290.00, 'pending',   '2025-04-20');

Now run an EXPLAIN to see partition pruning in action:

EXPLAIN SELECT * FROM orders WHERE created_at BETWEEN '2024-01-01' AND '2024-12-31';

Output:

+----+-------------+--------+------------+------+---------------+------+---------+------+------+----------+-------------+
| id | select_type | table  | partitions | type | possible_keys | key  | key_len | ref  | rows | filtered | Extra       |
+----+-------------+--------+------------+------+---------------+------+---------+------+------+----------+-------------+
|  1 | SIMPLE      | orders | p2024      | ALL  | NULL          | NULL | NULL    | NULL |    2 |   100.00 | Using where |
+----+-------------+--------+------------+------+---------------+------+---------+------+------+----------+-------------+

Only p2024 is scanned. The 2022, 2023, 2025 and future partitions are skipped entirely.

Compare that to a query without a partition filter:

EXPLAIN SELECT * FROM orders WHERE status = 'completed';
+----+-------------+--------+--------------------+------+---------------+------+---------+------+------+----------+-------------+
| id | select_type | table  | partitions         | type | possible_keys | key  | key_len | ref  | rows | filtered | Extra       |
+----+-------------+--------+--------------------+------+---------------+------+---------+------+------+----------+-------------+
|  1 | SIMPLE      | orders | p2022,p2023,p2024, | ALL  | NULL          | NULL | NULL    | NULL |    8 |    62.50 | Using where |
|    |             |        | p2025,p_future     |      |               |      |         |      |      |          |             |
+----+-------------+--------+--------------------+------+---------------+------+---------+------+------+----------+-------------+

All partitions scanned — because status is not the partition key. This is why choosing the right partition key matters.

Viewing Partition Info

SELECT
    PARTITION_NAME,
    TABLE_ROWS,
    DATA_LENGTH,
    INDEX_LENGTH
FROM
    information_schema.PARTITIONS
WHERE
    TABLE_SCHEMA = 'partitioning_demo'
    AND TABLE_NAME = 'orders';
+----------------+------------+-------------+--------------+
| PARTITION_NAME | TABLE_ROWS | DATA_LENGTH | INDEX_LENGTH |
+----------------+------------+-------------+--------------+
| p2022          |          2 |       16384 |        16384 |
| p2023          |          2 |       16384 |        16384 |
| p2024          |          2 |       16384 |        16384 |
| p2025          |          2 |       16384 |        16384 |
| p_future       |          0 |       16384 |            0 |
+----------------+------------+-------------+--------------+

LIST Partitioning

LIST partitioning maps specific discrete values to partitions. Useful when you're partitioning by something like region, status group, or tenant type.

CREATE TABLE tickets (
    id         INT NOT NULL AUTO_INCREMENT,
    subject    VARCHAR(255) NOT NULL,
    region     VARCHAR(20) NOT NULL,
    priority   INT NOT NULL,
    PRIMARY KEY (id, region)
)
PARTITION BY LIST COLUMNS (region) (
    PARTITION p_europe  VALUES IN ('UK', 'DE', 'FR', 'NL'),
    PARTITION p_americas VALUES IN ('US', 'CA', 'MX', 'BR'),
    PARTITION p_apac    VALUES IN ('AU', 'JP', 'SG', 'IN')
);
INSERT INTO tickets (subject, region, priority) VALUES
('Login issue',      'UK', 2),
('Billing question', 'US', 1),
('Slow performance', 'AU', 3),
('Password reset',   'DE', 1),
('API timeout',      'SG', 2);
EXPLAIN SELECT * FROM tickets WHERE region IN ('UK', 'DE', 'FR');
+----+-------------+---------+------------+------+---------------+------+---------+------+------+----------+-------------+
| id | select_type | table   | partitions | type | possible_keys | key  | key_len | ref  | rows | filtered | Extra       |
+----+-------------+---------+------------+------+---------------+------+---------+------+------+----------+-------------+
|  1 | SIMPLE      | tickets | p_europe   | ALL  | NULL          | NULL | NULL    | NULL |    2 |    66.67 | Using where |
+----+-------------+---------+------------+------+---------------+------+---------+------+------+----------+-------------+

Only the Europe partition is touched.


HASH Partitioning

HASH partitioning distributes rows evenly across a fixed number of partitions based on a hash of a column. You don't get partition pruning on range queries, but you get an even data spread — useful for load balancing writes across partitions.

CREATE TABLE sessions (
    id         BIGINT NOT NULL AUTO_INCREMENT,
    user_id    INT NOT NULL,
    payload    TEXT,
    last_seen  DATETIME NOT NULL,
    PRIMARY KEY (id, user_id)
)
PARTITION BY HASH (user_id) PARTITIONS 8;

With HASH partitioning, MySQL decides which partition a row goes into automatically:

INSERT INTO sessions (user_id, payload, last_seen) VALUES
(1001, 'data...', NOW()),
(1002, 'data...', NOW()),
(1003, 'data...', NOW()),
(1004, 'data...', NOW()),
(1005, 'data...', NOW());
SELECT
    PARTITION_NAME,
    TABLE_ROWS
FROM
    information_schema.PARTITIONS
WHERE
    TABLE_SCHEMA = 'partitioning_demo'
    AND TABLE_NAME = 'sessions';
+----------------+------------+
| PARTITION_NAME | TABLE_ROWS |
+----------------+------------+
| p0             |          1 |
| p1             |          1 |
| p2             |          0 |
| p3             |          0 |
| p4             |          1 |
| p5             |          1 |
| p6             |          0 |
| p7             |          1 |
+----------------+------------+

Rows spread across partitions based on user_id % 8.


Adding and Dropping Partitions

One of the practical benefits of RANGE partitioning by date is that you can drop old data instantly by dropping a partition — no slow DELETE scans, no table locks.

-- Drop all 2022 data instantly
ALTER TABLE orders DROP PARTITION p2022;

-- Add a new partition for 2026
ALTER TABLE orders ADD PARTITION (
    PARTITION p2026 VALUES LESS THAN (2027)
);

Dropping a partition is an O(1) metadata operation. A DELETE WHERE created_at < '2023-01-01' on a 50-million-row table is not.


When to Use Partitioning

Partitioning is a good fit when:

  • You have very large tables (tens of millions of rows or more).
  • You regularly query on the partition key — if your queries don't filter on it, you won't benefit.
  • You want to drop old data cheaply (archiving or rolling window patterns).
  • You have high write throughput and want to distribute I/O across partitions.

It's probably overkill when:

  • Your table is well under a few million rows — a well-placed index will do the job.
  • Your queries aren't aligned with any natural partition key.
  • You're on a managed DB service with storage limitations where physical segmentation doesn't help as much.

Things to Watch Out For

  • The partition key must be part of the primary key and any unique key — MySQL enforces this.
  • Foreign keys and partitioned tables don't mix — MySQL doesn't support foreign keys on partitioned tables.
  • MAXVALUE partitions can't be subdivided later without reorganising, so plan ahead.
  • Not all storage engines support partitioning — InnoDB is fine, but some others aren't.

Final Thoughts

Table partitioning isn't magic, but when the conditions are right it genuinely changes the game. The ability to drop a year's worth of data in milliseconds, or prune a query to scan one-fifth of a table, is something you can't easily replicate with indexes alone.

Start by looking at your biggest, oldest tables. If they're growing unbounded and you're regularly querying by date or some other range column, partitioning is worth adding to your toolkit.


To get us started with some useful MySQL commands I asked ChatGPT the following: "provide a basic employee table containing fake data with columns id, name, department and salary where the department information is in another table".

Lets go ahead and make a database called useful_mysql_commands and import the script below:

import.sql
CREATE TABLE departments (
  id INT AUTO_INCREMENT PRIMARY KEY,
  name VARCHAR(255)
);

INSERT INTO departments (name)
VALUES
  ('Sales'),
  ('Marketing'),
  ('Technology');

CREATE TABLE employees (
  id INT AUTO_INCREMENT PRIMARY KEY,
  name VARCHAR(255),
  department_id INT,
  salary INT,
  FOREIGN KEY (department_id) REFERENCES departments(id)
);

INSERT INTO employees (name, department_id, salary)
VALUES
  ('John Doe', 1, NULL),
  ('Jane Doe', 2, 65000),
  ('Bob Smith', 3, 75000),
  ('Alice Johnson', 1, 65000),
  ('Michael Davis', 2, 75000),
  ('Sarah Wilson', 3, 55000),
  ('William Brown', 1, 75000),
  ('Emma Wilson', 2, 65000),
  ('David Johnson', 3, 55000),
  ('Natalie Davis', 1, 65000),
  ('Richard Wilson', 2, 75000),
  ('Angela Johnson', 3, 55000),
  ('Christopher Davis', 1, 65000),
  ('Ashley Brown', 2, 75000),
  ('Brian Wilson', 3, 55000),
  ('Margaret Johnson', 1, 65000),
  ('Evelyn Davis', 2, 75000),
  ('Dorothy Wilson', 3, 55000),
  ('Joseph Brown', 1, 65000),
  ('Ava Johnson', 2, 75000),
  ('Daniel Davis', 3, 55000),
  ('Ruth Wilson', 1, 65000),
  ('Ryan Johnson', 2, 75000),
  ('Edward Davis', 3, 55000),
  ('Maria Brown', 1, 75000),
  ('Samantha Johnson', 2, 65000),
  ('Jenna Davis', 3, 55000),
  ('Harold Wilson', 1, 65000),
  ('Isabelle Johnson', 2, 75000),
  ('Brandon Davis', 3, 55000),
  ('Madison Brown', 1, 65000),
  ('Keith Johnson', 2, 75000),
  ('Deborah Wilson', 3, 55000),
  ('Mark Davis', 1, 65000),
  ('Olivia Johnson', 2, 75000),
  ('Avery Davis', 3, 55000),
  ('Aiden Wilson', 1, 65000),
  ('Lucas Johnson', 2, 75000),
  ('Ethan Davis', 3, 55000),
  ('Mia Brown', 1, 75000),
  ('William Johnson', 2, 65000),
  ('Sofia Davis', 3, 55000),
  ('Mason Wilson', 1, 65000),
  ('Emily Johnson', 2, 75000),
  ('Ella Davis', 3, 55000);

Great, you might have noticed that our first record is null regarding John Doe's salary (which is intended), now that our Database is ready and our records have been imported, let's take a look at some useful MySQL commands

1. COALESCE

The COALESCE function is a function in MySQL used to return the first non-NULL value from a list of expressions. It is often used to substitute missing or NULL values with a default value.

Example:

Suppose we have a table named employees with columns id, name, department, and salary. Some of the salary values are missing in the table, and we want to replace the missing values with the value R10,000.

To replace the missing salary values with R10,000, we can use the following query:

caolesce.sql
SELECT 
    id, 
    name
    COALESCE(salary, 10000) as salary
FROM 
    employees;

In this query, the COALESCE function is used to substitute missing or NULL salary values with R10,000. The resulting table will have the missing salary values replaced with R10,000.

In conclusion, the COALESCE function is an important function in MySQL that can help you manage missing or NULL values in your data by providing a default value.

2. Simple Join

A join is a way to combine rows from two or more tables based on a related column between them. In MySQL, we have several types of join like inner join, left join, right join, and full outer join.

Example:

Using the two tables named employees and departments, To get the information of employees along with their respective departments, we can use the following query:

simple-join.sql
SELECT 
    employees.name, 
    departments.name
FROM 
    employees
JOIN 
    departments 
ON 
    employees.department_id = departments.id;

3. Group By

The GROUP BY clause is used to group rows with similar values into summary rows. It is often used with aggregate functions like SUM, AVG, MAX, MIN, COUNT, etc.

Example:

To get the average salary of employees grouped by departments, we can use the following query:

simple-group-by.sql
SELECT 
    departments.name,
    AVG(salary)
FROM 
    employees
JOIN 
    departments 
ON 
    employees.department_id = departments.id
GROUP BY 
    departments.name;

4. Having Clause on Group By

The HAVING clause is used to filter the grouped result set and limit the number of groups returned. It is used after the GROUP BY clause and works like a filter on the groups.

Example:

To get the average salary of employees grouped by departments where the average salary is greater than R60,000, we can use the following query:

having-clause.sql
SELECT 
    departments.name,
    AVG(salary)
FROM 
    employees
JOIN 
    departments 
ON 
    employees.department_id = departments.id
GROUP BY 
    departments.name;
HAVING
    AVG(salary) > 60000;

5. Indexes what they are and how to create one

Indexes are a common way to improve the performance of queries in MySQL. By creating an index on one or more columns of a table, you can speed up searches and reduce the amount of data that needs to be scanned.

Here's how you can create an index on the employees table:

index-employee-name.sql
CREATE INDEX idx_employee_name ON employees (name);

In this example, we're creating an index on the name column of the employees table. The idx_employee_name is the name of the index, which you can choose to be anything meaningful.

If you frequently run queries that search for employees by name, this index will significantly improve the performance of those queries. For example, here's a query that searches for all employees with a name containing 'John':

select-after-index.sql
SELECT 
    * 
FROM 
    employees
WHERE 
    name 
LIKE '%John%';

With the index in place, this query will run much faster and be more efficient than without the index.

It's important to note that while indexes can improve query performance, they can also slow down updates and inserts, since the database has to maintain the index whenever data is changed. So it's important to consider the trade-off and choose which columns to index carefully.