MySQL is a core technology here at Yext, and since we’ve grown from a scrappy roomful of engineers to a mid-sized company, we’ve had to peel various subsets of data from our original single database and move them onto separate systems. The challenge varies depending on the actual data (or the systems that utilize them) in question.

Since we’re talking about growing out of a “startup” phase, we 1) don’t have a big SAN where we can just snapshot entire filesystems on our production databases and move them around, and 2) would generally like to incorporate a MySQL upgrade while moving data onto a brand new system.

As we’ve expanded from our original two databases (master and slave) to our current 58 separate instances, we’ve used (or at least tried) a few different techniques. This particular blog post was inspired by the last migration, in which we moved about 700GB of business listings data from the original main database into its own separate environment.

The basic available methods are as follows:

Full Binary Copy

This is potentially the fastest method - it means copying the entire MySQL data directory from one server to another, thus reproducing the original database in its entirety (and subsequently deleting the tables you don’t need on the new system). If this is actually an option, it’s a great one. It’s simple and relatively fast. Unfortunately, you need to ensure that the files on disk don’t change throughout this process. There are a few things to consider here, depending on your environment, but you’re typically talking about down-time through the entire process. This also may involve some additional steps to upgrade grant tables, etc. if moving major MySQL versions.

Partial Binary Copy

You can do a binary copy of individual InnoDB tables, provided you are using the “file per table” configuration. Once they are moved to the new server, you manually create the table structure, and then import the tablespace to load the data. This can take a few minutes, but is rather efficient. There are some caveats, however. Most notably (at least for us), it can not handle partitioned tables at all. You could theoretically use this method for all the non-paritioned tables, and then do the dump/import on the partitioned tables. In practice, however, the benefits of using the partial binary copy method completey disappear if you have to restore even a dozen or so gigabytes by the dump/import method, as you’re talking about significant wall-clock time, and adding significant complexity to your migration procedure.

MySQL Dump/Import

This method is very straightforward, like the full binary copy. It has the advantage of making any sort of migration (major MySQL versions, storage engines, etc.) fairly easy. But it takes quite a while to re-import the data. If you’re dealing with hundreds of GB or several TB, this option will be eliminated by the time requirement. Importing the data will generally take an order of magnitude longer than dumping or binary copying.

The basic process is to run a mysqldump on the source data in which you’re interested, and then pipe the resulting output into a mysql command against the destination database. For example:

db1 ~ $ mysqldump -u user -ppass dbname | pv | mysql -u user -ppass -h db2 dbname

Of course, you can redirect the original output to a file, and import via the mysql command later, if that suits your needs. Regardless, the overall copy speed can vary tremendously depending on the actual data in question and the destination server’s capabilities, but in our environment (with fairly good hardware), we expect an average of about 3-5MB/s, which is awfully slow.

Restore from innobackupex Backup

This is similar to a full binary copy, but innobackupex allows a huge reduction in down-time, if your data-set is large. It generally takes a great deal more time, and more human attention, but it can be done with very little interruption to production systems. Basically, innobackupex allows you to take a full binary backup of your database while it’s actively running. The way this works is by effectively keeping a log of all the changes made during its backup process and storing them in a log file of sorts. When you restore this data, you can “apply” the log and thus account for all the changes made on the live system during the (potentially multi-hour) backup.

This whole process can take a fair bit of time depending on a few factors like the size of your data-set, the activity of your database during the backup, and the destination hardware. Because of this, the innobackupex option is not performed during a change window - you do this while your production systems are running. But later, after you restore the backup to your new system, you make it a slave of the original master, to account for all the data accumulated during the restore process. This makes the process take even longer, but the key point is that you haven’t had any down-time yet.

Once the new slave catches up to the original master, you’re ready to have a change window in which you migrate your applications to the new system.

Scaling Problems

Based on the aforementioned methods available, we have consistently chosen to use the innobackupex option. It requires a fair bit of preparation, but it nearly eliminates down-time and provides a fairly smooth transition.

Unfortunately, the last migration we performed uncovered a nasty caveat. After performing the innobackupex backup, restoring it on the new hardware, applying the innobackupex log, slaving it off the original master and waiting, we realized that the load on our legacy master server had scaled to the point that our slave would never catch up.

This is because our original master is MySQL 5.5. This version (and older versions) have relatively good single-threaded replication performance, and the entire set of slaves have no issue keeping up with the master. Unfortunately, the new system, with a newer MySQL, has significantly worse single-threaded replication performance. This was quite a disappointing realization to come to at 3AM on a Sunday.

The solution to this is pretty straightforward, but it certainly put the brakes on our original migration. In order for this to succeed the next time, we had to perform the innobackupex procedure, but restore it onto a 5.5 environment so that the new servers could catch up and remain in sync with the master, and then incorporate an in-place MySQL upgrade during our change window.

This is a pretty reasonable migration strategy, but it took a while to arrive at it, due to the creeping size of our data and activity (innobackupex logs and replication load) on the legacy master.