In the MySQL world, there are a couple of ways to take backups of your data. Backups in MySQL can be categorized into two distinct categories: logical or physical. MySQL also comes with a lot of tools helping us achieve our backup objectives: we have walked you through one of the most popular offerings in this space – Percona XtraBackup – in the second part of this series, and the first part of this series covered one of the main logical MySQL backup tools called
mysqldump. As far as backups are concerned, though, we are far from done – for example,
mysqldump alone has a brother called
mysqlpump. Since the two CLI-based backup tools are frequently confused with each other, the aim of this blog is to clear the confusion up a little.
What is mysqlpump?
mysqlpump is a backup utility that is used via the command-line interface. The tool is very similar to
mysqldump in that it provides us with the ability to take logical backups, but also different at the same time – the goal of
mysqlpump is to be an extendable, parallel-supporting replacement of
mysqldump. In their blog from 2015, MySQL team said that one of the primary aims of introducing
mysqlpump was not be forced to implement legacy functionality that is provided by
mysqlpump was introduced to the world of MySQL in 2015 – the tool was added to MySQL 5.7.8.
The most basic use case of
mysqlpump looks like this (the tool can be used in the same fashion both on Linux and on Windows):
mysqlpump is designed in such a way that it will only dump tables that were created by a user – in other words, it will refrain from backing up tables that were created internally by MySQL (it won‘t touch any of the internal tables.)
The one feature that is exclusive to
mysqlpump, though, is the ability to issue backups using multiple threads at once – to figure out how many threads your system has, follow these steps:
- Linux users – issue a command like
lscpu | egrep ‘Thread|CPU\(s\)’and look at the output (you should see something similar to “Thread(s) per core.”)
- Windows users – open up task manager and look at the number next to “Logical processors” – they represent the amount of threads in the system:
Once you know the number of threads in your system, think about the number of threads you wish (and can) allocate to
mysqlpump – things might seem difficult and complex, but your thought process shouldn’t take more than a couple of seconds –
mysqlpump is usually quick to complete too.
Once you know the number of threads in your infrastructure, configure the number of threads that can be used together with
mysqlpump. Threads are to be configured using the
--parallel-schemas options. Also, consider providing the username and password that you’re using to access MySQL itself through my.cnf for secure access: passwords provided through the CLI can be observed by accessing the history of issued commands. If you don’t want to do that, feel free to provide a username and a password, but be aware that
mysqlpump itself will issue you warnings if you do so:
mysqlpump --default-parallelism=[threads] > data.sql
Since the “specialty” of
mysqlpump is to back up data using multiple threads, we can also define the number of threads that we want the tool to use to dump a specific database – this functionality of
mysqlpump can be exceptionally useful if we’re working with bigger sets of data, but need a logical backup to be taken.
mysqlpump – the Details
- To specify more threads to dump a large database and less threads for smaller databases, look at the option named
parallel-schemas. If you define your parameters like so, 6 threads will be used for larger databases, and 2 for smaller ones:
mysqlpump --parallel-schemas=6:large_db1,large_db2 --parallel_schemas=2:small_db [--default-parallelism=x] > backup.sql
--default-parallelismoption: a default number of threads when dumping other databases (databases that are not defined) can also be specified.
- To back up only specific databases, define your query like so (here db_1 and db_2 represent two separate databases):
mysqlpump –-databases db_1 db_2 > databases.sql
You can also accomplish the same goal like so:
mysqlpump --include-databases=db_1,db_2 --result-file=dump.sql
mysqlpumpcan also perform an “empty” dump (meaning that it can only back up the schema of the database, but not the data contained within):
mysqlpump --include-databases=data_1,db2,db3 --skip-dump-rows --result-file=mydata.sql
mysqlpumpalso gives us the ability to use wildcards inside of any parameter. That means that if we have a lot of databases, backing a part of them (only those starting with the letter x, for example) is a breeze:
mysqlpump --include-databases=x% --result-file=s_databases.sql
- We can also exclude databases from being backed up like so (the following query would exclude all databases starting with
test_and back up all the rest):
mysqlpump --exclude-databases=test_% --result-file=data.sql
- We can also work with entire patterns (the following query would exclude all tables matching the
demopattern from the beginning):
mysqlpump --exclude-tables=__demo --result-file=backup.sql
- And last but not least,
mysqlpumpis also able to work with events and routines – all the same, just specify events or routines instead of databases and tables:
mysqlpump --[include|exclude]-[events|routines]=title1,title2 --result-file=verycoolbackup.sql
mysqlpump can do a number of other things as well – all of the information on
mysqlpump, as with everything MySQL-related, can be found at the documentation.
The Downsides of mysqlpump
However, as powerful as
mysqlpump is, it’s also not without its weaknesses. Research made by Giuseppe Maxia back in September 2015 suggests that
mysqlpump is faster than
mysqldump, but only slightly – the blogger provides an example where
mysqldump takes 3 minutes and 33 seconds to execute, whereas mysqlpump takes 2 minutes and 55 seconds – the difference is there, but we presume that the blogger has expected it to be way bigger than it was.
The research was made with approximately 20 million rows inside of the tables in the database – a figure big enough for both scripts to handle, but not large enough to invoke
SELECT * INTO OUTFILE (a command used to back up bigger sets of data.) This suggests that
mysqlpump is good if we find ourselves needing to use specific options not available to
mysqldump, but doesn’t provide much of an upside otherwise.
mysqlpump entered the MySQL scene in 2015 with the release of MySQL 5.7.8. At first it was thought that it could be a reliable replacement for
mysqldump and also offer a couple of “exotic” options not available to its counterpart, but as time went by
mysqlpump proved that it still needs more refinement to be considered a reliable replacement for mysqldump.
mysqlpump are part of the MySQL’s ecosystem to this day, yet people steer towards
mysqldump more than they do towards
mysqlpump – while
mysqlpump has its use cases, it is thought that its speed is not significant enough to outweigh its counterpart.
We hope that you’ve enjoyed reading this article, stay around the Redgate blog to learn more about everything database-related, and until next time.