MySQL Index Overviews: Composite B-Tree Indexes

Comments 0

Share to social media

Indexes in MySQL are one of the primary ways to enhance query performance and they are especially useful when the primary use case of our project refers to reading data stored in a database. We‘ve already told you about the nuanced world of indexes in MySQL – and there we‘ve mentioned that MySQL has multiple types of SQL indexes available for you to choose from.

The primary index type available within MySQL is the B-Tree index which we‘ve already covered in one of our previous articles. If you know your way around MySQL though, you will certainly be aware of other nuances of indexes, too – and one of those nuances has to do with the fact that B-Tree indexes can also consist of multiple columns (commonly referred to as composite indexes). In this example, we’re using MariaDB as our DBMS of choice. Percona Server for MySQL and MySQL Server will act identically.

Find the queries recreating the table structure and composite indexes in the appendix, and let’s get started.

What are Composite Indexes?

In the database world, composite indexes refer to indexes on multiple columns. Composite indexes are sometimes referred to as multicolumn indexes because that‘s precisely what they are – such indexes are keys on multiple columns, while indexes on the same column are referred to as simple indexes.

Multicolumn indexes have various specifics unique to themselves – depending on the database management system you find yourself using, you will find that composite indexes can be created on up to 16 or 32 columns, that these kinds of indexes only use B-Tree (MySQL) or GIN, BRIN, and GiST index types (PostgreSQL), and that a multicolumn index can be similar to a sorted array.

Composite Indexes in MySQL

Let‘s not get too far away from our primary DBMS of interest – MySQL. As noted before, the specifics of the components within your DBMS depend directly on the database management system you find yourself using – that‘s true for every component including composite indexes.

Composite indexes in MySQL:

  • May consist of up to 16 columns.
  • Will be used if your query uses all columns in the composite index or the first X columns in the composite index allowing the query optimizer to use any leftmost prefix of the index to look up rows.
  • Have a crucially important feature – the order of columns within the index.

Here‘s a basic composite index definition:

When composite indexes are in use, the order of the columns within the index is a crucial point. One cannot skip columns or read them out of order for the index to be used.

Can you guess what we just did? We‘ve created a composite index called composite_idx on a table named demo_table and created the multicolumn index on three columns – c1, c2, and c3. Not exactly rocket science, is it?

Such an index definition enables you to have indexed search capabilities on queries that act in three ways:

  1. Queries that make use of the c1 column
  2. Queries that use both c1 and c2 columns
  3. Queries that utilize all three – c1, c2, and c3 – columns.

Interesting, right? Such an interesting behavior applicable to defined multicolumn indexes also means that certain limits are in place – aside from being able to define up to 16 columns and not more, users employing composite indexes should also be aware of the fact that MySQL can only access or use your composite indexes in order, and MySQL will also stop using the composite index after the first range condition. We’ve provided an example of how everything looks like below.

Let‘s try them out! We have a simple table depicting items purchased by customers that internally looks like so:

D:\Misc\Redgate\Blogs\May 2024\MySQL Index Deep Dive - Composite Indexes\Images\DESCRIBE demo_table.png

Image 1 – DESCRIBE Query on a MariaDB Table without a Composite Index

Now, we define a composite index on the columns item_purchased and quantity_purchased, then inspect (DESCRIBE) the table again:

The output should be similar to:

D:\Misc\Redgate\Blogs\May 2024\MySQL Index Deep Dive - Composite Indexes\Images\DESCRIBE with Composite Indexes.png

Image 2 – DESCRIBE Query on a Table with a Composite Index

Great! We now see that we have a key – an index – referred to as “MUL” – a multiple-column, or composite, index. Do note that various database management systems display such indexes differently and MySQL (and its counterparts) will only show “MUL” to the side of the first column that makes up a composite index, but inspecting the table structure in phpMyAdmin (or other SQL clients, for that matter) will show you that your column has multiple indexes more clearly once you inspect the table structure. See the example below:

D:\Misc\Redgate\Blogs\May 2024\MySQL Index Deep Dive - Composite Indexes\Images\Indexes in phpMyAdmin.png

Image 3 – Indexes in phpMyAdmin

Now we will dig into the internals of composite indexes, then run a couple of SQL queries to better understand the internals of multicolumn indexes in our MySQL infrastructure.

Index Order in Composite Indexes and Internals

Here we come back to the main capability of composite indexes: they are B-Tree indexes, but with a special feature in that they contain multiple columns within one index. As such, the order of columns is crucially important: SQL queries will run the most efficiently when you have a composite index with a column order ranging from the most to least distinct counting from the left. In other words, try to order your columns in such an order that the column that returns the highest amount of distinct rows when queried with the SELECT DISTINCT query comes first.

Suppose we have a demo_data table like so with a bunch of data inside of it (running COUNT(*) queries in InnoDB can take much longer than in this example if you run bigger data sets – InnoDB doesn’t store an internal row count inside of itself unlike MyISAM):

The output should be similar to:

D:\Misc\Redgate\Blogs\May 2024\MySQL Index Deep Dive - Composite Indexes\Images\DESCRIBE demo_data.png

Image 4 – The Structure of demo_data Table

Let’s now add a composite index spanning all three columns with this query (the USING BTREE clause is optional and is the default):

So for our table:

Executing this you will see the following:

D:\Misc\Redgate\Blogs\May 2024\MySQL Index Deep Dive - Composite Indexes\Images\Adding Composite Index on demo_data.png

Image 5 – Adding a B-Tree Composite Index

Let’s now confirm our index has been added on all three columns:

D:\Misc\Redgate\Blogs\May 2024\MySQL Index Deep Dive - Composite Indexes\Images\Composite Indexes in phpmyAdmin.png

Image 6 – A B-Tree Composite Index on a Table

Great. Now, take note of the index order – it’s email, then username, then IP. Not the other way around! We can prove the index order is at play by running a query like SHOW INDEXES and observing the Seq_in_index column:

The output will be:

D:\Misc\Redgate\Blogs\May 2024\MySQL Index Deep Dive - Composite Indexes\Images\SHOW INDEXES FROM demo_data.png

Image 7 – SHOW INDEXES in MariaDB

This means that our index will be used if we use queries like these (note that we use all columns inside of the composite index):

Output:

D:\Misc\Redgate\Blogs\May 2024\MySQL Index Deep Dive - Composite Indexes\Images\Using Indexes on demo_data.png

Image 8 – Using Composite Indexes Searching for Multiple Columns

Our composite index will also be used if we search solely for the first column from the left-hand side (in this case, email):

You will see something like the following:

D:\Misc\Redgate\Blogs\May 2024\MySQL Index Deep Dive - Composite Indexes\Images\EXPLAIN demo_data WHERE email.png

Image 9 – Using Composite Indexes Searching for One Column

You may also see a situation where possible_keys may be depicted as NULL, but the key (chosen index) wouldn’t be NULL:

This should return:

D:\Misc\Redgate\Blogs\May 2024\MySQL Index Deep Dive - Composite Indexes\Images\possible_keys NULL key not null.png

Image 10 – NULL or NOT NULL?

Such situations can happen if MySQL (in this case, MariaDB) thinks that none of the indexes are suitable for a lookup, but all the columns selected by our query are columns of an index, which is exactly what’s happening here.

If we had another, non-indexed column, our composite index would still be used if our query would use that column and a column in the composite index:

Executing this should return something like:

D:\Misc\Redgate\Blogs\May 2024\MySQL Index Deep Dive - Composite Indexes\Images\new_column and using index.png

Image 11 – Using a Composite Index with a Non-Indexed Column

However, if our query wouldn’t use the leftmost prefix in the composite index (this column is email), no index would be used:

The output:

D:\Misc\Redgate\Blogs\May 2024\MySQL Index Deep Dive - Composite Indexes\Images\No Composite Index Used on demo_data.png

Image 12 – No Usage of Composite Index

Analyzing Composite Indexes in MySQL

Now that we know the basics of how composite indexes work internally, we can start analyzing them.

The first SQL query we run is a simple SELECT with one column in question:

Executing this, you should see something similar to:

D:\Misc\Redgate\Blogs\May 2024\MySQL Index Deep Dive - Composite Indexes\Images\EXPLAIN Simple.png

Image 14 – EXPLAINing a Simple SELECT Query

The output of the EXPLAIN query shows that MariaDB elects to use our composite index. Our index has a key length of 482 bytes and that allows us to determine how many parts of our composite index are used by MySQL. This time, it’s a safe bet that MariaDB uses the item_purchased part of the composite_idx and we can confirm it by swapping the column order with a query like so:

Then trying the query again:

Now you will see:

D:\Misc\Redgate\Blogs\May 2024\MySQL Index Deep Dive - Composite Indexes\Images\EXPLAIN Simple on demo_table2.png

Image 15 – The Same SQL Query with a Composite Index in Switched Order

See – no possible keys here! Why? We didn’t use the leftmost prefix of the column that was indexed, we’ve used the rightmost prefix instead. In

This also explains the key length of 482 bytes – MariaDB stops using the multicolumn index at the first range condition and key_len specifies the number of bytes that are used from the index.

Let’s try again – this time we run a different query:

This should be the output:

D:\Misc\Redgate\Blogs\May 2024\MySQL Index Deep Dive - Composite Indexes\Images\EXPLAIN without Using Indexes.png

Image 16 – A Query That Doesn’t Use Indexes

This time we see that MySQL didn’t even consider any indexes to be used. To explain this, we need to take a step back and remember the specifics of multicolumn indexes – such indexes will only be used for finding rows if we use a leftmost prefix of the index – in our case, that’s the item_purchased column. Let’s include the column, then try again:

Now you will see the index has been used:

D:\Misc\Redgate\Blogs\May 2024\MySQL Index Deep Dive - Composite Indexes\Images\Wildcard - Index Used.png

Image 17 – A Query That Uses a Composite Index

Woohoo! MariaDB uses our index again. MariaDB can traverse an index backward too and let’s not forget that the usual rules for indexes apply here too – attentive readers will notice that we have a wildcard only at the end of our search statement because otherwise, no index will be used no matter the order of the columns:

Now you see:

D:\Misc\Redgate\Blogs\May 2024\MySQL Index Deep Dive - Composite Indexes\Images\Wildcard - No Index Used.png

Image 18 – a Query with a Wildcard

A query searching for exact matches, however, would use the composite index:

You can see that in the output:

D:\Misc\Redgate\Blogs\May 2024\MySQL Index Deep Dive - Composite Indexes\Images\Perfect Equality Operator.png

Image 19 – a Query Searching for Exact Matches

Do note that where we see that MySQL uses the composite index, the length of the key (the index) remains being 482 bytes no matter what we do – that means that MySQL stops using the multicolumn index at the first range condition and doesn’t use the other part of the index (we defined a composite index on two columns – item_purchased and quantity_purchased.)

Composite Indexes, Sorting, and Search Operators

Everything gets even more interesting when sorting operations are involved – suppose we have a table like so:

You can see the table information:

D:\Misc\Redgate\Blogs\May 2024\MySQL Index Deep Dive - Composite Indexes\Images\DESCRIBE people.png

Image 20 – Another Example of a Table Using Composite Indexes

On this table, we’ve defined a composite index like so:

We will now check the plans of a couple of SELECT queries with the ORDER BY clause:

This will return the following outputs:

D:\Misc\Redgate\Blogs\May 2024\MySQL Index Deep Dive - Composite Indexes\Images\EXPLAINS - People.png

Image 21 – Exploring Sorting in Composite Indexes

Now we‘ve run three queries – attentive readers will notice that the Extra column in all of the result sets is different:

  1. The first result ordered by the last_name was using the index condition and using the WHERE clause. The first example is using the composite index (composite_idx) because we search for two columns (first_name and last_name) that are both part of a composite index.
  2. The second result ordered by the birthday was using the index condition, using a WHERE clause, and using a filesort operation. Sorting occurs when no applicable index satisfies the ordering of rows.
  3. The third result searching for a birthday and ordering by birthday was using a WHERE clause and using a filesort operation.

Here‘s what that means:

  • Using the index condition means that MySQL filters rows by using the index and then uses the table to retrieve the matched rows.
  • Using a WHERE clause can have different meanings. Some argue that this piece of information is frequently inconsistent at best, but in this specific case, it most likely means that the expression within a WHERE clause performed and index seek operation.
  • Using a filesort operation means that rows in the table were read and sorted during an extra sorting phase during query execution.

Aside from the explanation of the query internals, one would benefit from using all of the parts of the composite index. Doing so is remarkably simple and in our case, would look like so:

Pay special attention to the second part of the query straight after the WHERE clause – we search for a first_name and ORDER BY last_name because our index looks exactly like that:

To use a composite index with an ORDER BY clause, ensure that you search for the leftmost column, then order by the rest with an equality operator.

Do keep in mind that the direction your ORDER BY is sorted in also has a say here:

  • MySQL will avoid a filesort operation if all parts of your composite index are sorted in an ascending (ASC) or descending (DESC) order and it has all of the rows from the equality match. This may seem like a narrow use case, but it’s worth keeping in mind. Remember that queries are tasks composed of smaller tasks – eliminating any unnecessary smaller task will make the performance of your query faster.
  • You may not be able to avoid a filesort operation if one part of your composite index is sorted in a different manner than the other (e.g. if one part of your query is sorted in an ASCending order and the other in a DESCending order.) If you want your database to make use of your covering index, make sure that your query criteria specifies that – e.g. if you have a covering index on two columns, search for one column and the other or search for one column and order by the other, etc.
  • If you search for something in a way that‘s backwards to the way you defined your composite indexes, you may be privy to a backwards index scan where MySQL decides to start searching for something starting from the end of the root node of the B-Tree index.

Key Takeaways and Summary

Composite, or multicolumn, indexes are indexes that are defined on multiple columns. To effectively use such indexes, make sure to use the “leftmost prefix“ principle: put the column you‘re accessing first in the first place, the second – in the second place, and so on.

An understanding of indexes will take you very far in the database world and beyond. The understanding of multicolumn indexes in your specific database management system of choice – in our case, MySQL and its counterparts like MariaDB Server and Percona Server – will significantly improve the performance of SQL queries that use filtering or sorting operations on columns.

Before embarking on any kind of index within your database management system, make sure to choose the proper index type after educating yourself on the Simple Talk site, educate yourself on their upsides and downsides, and most importantly – use indexes wisely. Use the index, Luke! Use indexes wisely and until next time.

Appendix – Table Structure and Demo Data

In case you want to create the tables and indexes in this blog, the table structure and all demo data is available here.

To generate a bunch of random data, take any data set with >100K rows, or anonymize if you have live data (this SQL query will update all usernames to one of 10 random values):

If you’re lazy, you can also use RAND() like so to create quick sample data (this query will set all usernames to random numeric values):

Anonymize email addresses with these queries (replace email domains with different domains):

And anonymize IP addresses like so (these queries will anonymize the first 200,000 IP addresses to 127.0.0.1, and the rest to 127.0.0.2):

Load comments

About the author

Lukas Vileikis

See Profile

Lukas Vileikis is an ethical hacker and a frequent conference speaker. Since 2014, Lukas has found and responsibly disclosed security flaws in some of the most visited websites in Lithuania. He runs one of the biggest & fastest data breach search engines in the world - BreachDirectory.com, frequently speaks at conferences and blogs in multiple places including his blog over at lukasvileikis.com.