Backing up MySQL Part 4: mydumper

In this blog, we walk you through one of the most prominent MySQL logical backup solutions - a CLI-based tool called mydumper.

There are many tools we can use to back up our MySQL databases. Some are well-known and used by the best technology companies out there (mysqldump comes to mind), and some are a little less famous, but still have their place in the MySQL world. Enter mydumper – the tool is built by the engineering team over at Percona and it‘s supposedly created to address performance issues caused by mysqldump.

Prerequisites and Installation

mydumper is a tool that lets people take logical backups of their MySQL databases. The tool is built according to the best MySQL practices, it‘s being actively maintained, and works on all flavors of MySQL, including Percona Server and MariaDB. Note that the tool is not built-in to MySQL – to use it, one needs to install the development versions of MySQL or any of its flavors, install a couple of development tools, and install the development versions of GLib, ZLib, PCRE, and ZSTD – once all of these libraries are installed, we should follow the instructions for installation that are available here.

All of these libraries accomplish specific goals – GLib is a bundle of libraries, ZLib is used to compress data, PCRE stands for „Perl-Compatible Regular Expressions“, and ZSTD – otherwise known as ZStandard – is a compression algorithm developed by Facebook.

The installation of mydumper will provide us with:

  • The ability to export and import data using multiple threads in parallel, sort of like mysqlpump does.
  • The ability to use regex expressions when exporting data.
  • Easily manageable output – all tables will be backed up into separate files exclusive to them, data and metadata will be separated, etc.

Using mydumper

Once mydumper is installed, we can start playing with the tool. Even though there are multiple advanced features provided by the tool, the syntax isn‘t rocket science and looks like the following:

mydumper [options]

That‘s it, really. Well, users will still need to specify the host (--host), user (--user), and password (--password) to work with, but other than that, only the options remain. The output that is received will, of course, vary and depend directly on the options that are specified, so let‘s look into them as well:

  • Users can specify the --database option to provide mydumper with the database to work with (dump data from.) Alternatively, the --B option can also be specified.
  • The --tables option, or its shortened version --T, will let users specify a comma-delimited list of tables to take a copy of.
  • The databases and tables that are being dumped need to have an output directory that can be specified by appending --outputdir or --o (the directory must be writable.)
  • Users can also split INSERT statements into smaller size by using --statement-size (or --s.) Note that the size of INSERT statements are to be specified in bytes (default size – 1,000,000 bytes.)
  • mydumper allows its users to specify which storage engines to ignore (i.e. mydumper will not back up tables built on this specific storage engine) by using --ignore-engines or --i.
  • The --regex or --r option will let users specify databases and tables using regex in the format of „database.table.“ Something like this should do:
    mydumper --regex „(^demo_db\.demo_database$)“
  • To specify the number of threads mydumper is able to use (the default number of allocated threads is 4), consider the --threads or --t option.
  • mydumper is also able to automatically send long-running queries to the timeout realm or kill long-running queries: that can be done by either specifying the --long-query-guard or --kill-long-queries parameters (or their shorter counterparts which are --l and --k respectively.)

The options specified above should tell you a little about just how powerful mydumper can be when used properly – however, do note that other options can be used as wellmydumper also comes with a brother called myloader which is essentially a backup tool that restores data generated by mydumper from a directory specified by the user. To use myloader, invoke it like so:

myloader --directory=[directory] [--overwrite_tables] --user=your_user

The --directory option takes the directory inside of which the backup created by mydumper is being stored, and the overwrite-tables option can also be invoked to overwrite all existing tables.

Make sure to read up on the docs surrounding the two tools here and follow Percona’s blog to learn more about the tools they develop – explore the Databases part of SimpleTalk to learn more about the functionality of all kinds of databases (our blog is not limited to MySQL!) and how they impact the behavior of applications, and we’ll see you in the next one.