I have a database that is constantly growing, so I must take great care of these records.
I searched the internet and found that Oracle Corporation (owner of MySQL since 2010) provides a paid product, MySQL Enterprise Backup, that performs this type of backup with a single command, mysqlbackup.
An example of use:
mysqlbackup --backup-image=/backups/sales.mbi --backup-dir=/backup-tmp backup-to-image
This tool can perform both incremental and differential backups.
My question basically is: How to perform an incremental or differential backup with basic MySQL commands?
How to perform an incremental or differential backup with basic MySQL commands?
Incremental backups can be made using the tools provided by MySQL, such as mysqldump, and then using tools such as diff to store only the differences.
The problem is that mysqldump has several drawbacks for business use:
The backup process overloads the CPU and memory because of the way the data is obtained. Multiple SQL queries are sent to the server, which must process the results and send them to mysqldump. At the same time, mysqldump must collect that information and save it in SQL format so that it can be replayed later.
The backups are much larger than the data on disk. Tables with binary data (blobs) are dumped in hexadecimal.
Compressing the dump afterwards or on the fly (using pipes) would increase the backup time or the processor load, and also the restore time, because the data would have to be decompressed.
The backup takes far longer than reading the same data directly from disk.
The restore is far slower than writing the same data directly to disk.
Although the tool provides options such as --single-transaction for making consistent backups at the database level, it is not robust (*) and can block access to all tables and databases on the server if the transaction log fills up because a large number of records is modified before the backup transaction completes.
Options such as --lock-tables block the use of the tables during the dump, causing queries against those tables to block.
The restore process overloads the CPU, disk and memory of the server, because every SQL statement has to be parsed and validated and the keys have to be checked on each insert, which gets more expensive the more records have been inserted (although the impact can be reduced by using options like --disable-keys).
(*) From the documentation:
Using ALTER TABLE, CREATE TABLE, DROP TABLE, RENAME TABLE, TRUNCATE TABLE during a backup might get incorrect results or cause it to fail.
Proposed solution
If you can use another tool (completely free, unlike MySQL Enterprise Backup), I recommend the backup tool developed by Percona, called XtraBackup.
During the backup the performance of the database is slightly degraded, but no tables are locked and the normal work of the server is not disturbed.
In the documentation on incremental backups you can see an example of use:
To make a complete backup, just run:
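A minimal sketch of that step, assuming Percona XtraBackup is installed and that the connection options come from your MySQL option files. The directory /data/backups/base and the function name full_backup are only illustrative:

```shell
# Hypothetical location for the full (base) backup; adjust to your layout.
BASE_DIR=/data/backups/base

# Take a full physical backup of the running server.
# --user/--password are omitted on purpose and assumed to be configured
# in an option file such as ~/.my.cnf.
full_backup() {
    xtrabackup --backup --target-dir="$BASE_DIR"
}
```

Calling full_backup leaves a copy of the data files in $BASE_DIR, which later incremental backups can use as their starting point.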
To perform an incremental backup from the previous one:
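As a sketch, and assuming a full backup already exists in the hypothetical directory /data/backups/base, the incremental step could look like this (the directory names and the function name are mine):

```shell
# Hypothetical directories: the existing full backup and the new increment.
BASE_DIR=/data/backups/base
INC_DIR=/data/backups/inc1

# Copy only what changed since the backup stored in $BASE_DIR.
# For a chain of increments, point --incremental-basedir at the
# previous increment instead of the full backup.
incremental_backup() {
    xtrabackup --backup --target-dir="$INC_DIR" --incremental-basedir="$BASE_DIR"
}
```

Always pointing --incremental-basedir at the full backup gives you differential backups; pointing it at the previous increment gives you an incremental chain.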
How it works
The process is detailed on the product's website.
It makes a binary copy of the files on disk, which minimizes the impact on the server, and analyzes the transaction log to mark the exact starting point (the LSN, or log sequence number). This lets it provide a complete, consistent backup while the log data keeps being updated in the background.
Differential backups work by copying only the differences stored in the transaction log, reducing disk access and backup time.
The only drawback is that MyISAM tables cannot be saved incrementally because they are not transactional, so they are copied in full under a table lock.
Nowadays there is no compelling reason to maintain tables in MyISAM, so this should not be a problem.
Restoring the backup forces a crash recovery that brings the tables up to date with the transaction log, so the restore is consistent at the level of the whole server (and not only at the database level, as with the --single-transaction mode).
Also, since the restore is done in a different directory (which then has to be moved to the production directory), a MySQL server can be started on a different port to access the restored data.
This is useful for testing the backup, for looking up some specific deleted data, or for doing a selective export or recovery.
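The restore described above can be sketched like this, assuming one full backup and one increment in the hypothetical directories used here. The port 3307 and the socket path are also just examples, and the prepared files must be readable by the user that runs mysqld:

```shell
# Hypothetical directories holding the full backup and one increment.
BASE_DIR=/data/backups/base
INC_DIR=/data/backups/inc1

# Replay the log on the full backup without the rollback phase
# (--apply-log-only), then merge the increment and finish the recovery.
# With several increments, keep --apply-log-only on all but the last one.
prepare_backup() {
    xtrabackup --prepare --apply-log-only --target-dir="$BASE_DIR" &&
    xtrabackup --prepare --target-dir="$BASE_DIR" --incremental-dir="$INC_DIR"
}

# Start a second server on the prepared copy, on another port and socket,
# so the production instance is not touched.
start_inspection_server() {
    mysqld --datadir="$BASE_DIR" --port=3307 --socket=/tmp/mysql-restore.sock
}
```

After prepare_backup, $BASE_DIR holds a consistent data directory that can be inspected with start_inspection_server or moved to the production location.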
Disadvantages
If the tables are not optimized (they have "residue to purge", that is, modified or deleted records whose old values still take up unused space), that residue will be copied into the full backup (but not into the incremental one). It is therefore advisable to run OPTIMIZE TABLE on the largest tables, or on those that tend to accumulate the most residue, before a full backup.
Backup via LVM
Another way to perform a consistent backup is through LVM snapshots.
All MySQL content should be stored on the same logical volume for this solution to be applicable (transaction logs, replication logs if any, tablespaces, etc).
During the recovery process, a crash recovery has to be forced so that all transactions that had not completed at the time of the backup are rolled back.
The process would be carried out as follows:
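A minimal sketch of the snapshot creation, assuming the data lives on a logical volume called mysqldatos inside a volume group that I call vg0 here (the volume group name is hypothetical):

```shell
# Hypothetical volume group that contains the MySQL logical volume.
VG=vg0

# Create a 1 GB snapshot named mysqlbackup of the logical volume mysqldatos.
create_snapshot() {
    lvcreate -L1G -n mysqlbackup -s "/dev/$VG/mysqldatos"
}
```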
Where:
-L1G allows up to 1 GB of changes to the original volume before the snapshot fills up. If database activity is high, a larger snapshot may be required.
-n mysqlbackup names the new logical volume mysqlbackup.
-s mysqldatos indicates that it will be a snapshot of the logical volume mysqldatos.
To mount the snapshot and access it, either to store it or to compute an incremental backup:
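As a sketch, assuming the snapshot is called mysqlbackup and lives in a volume group I call vg0 (hypothetical, as is the mount point):

```shell
# Hypothetical volume group and mount point.
VG=vg0
MOUNT_POINT=/mnt/mysqlbackup

# Mount the snapshot so its files can be copied elsewhere.
# Some filesystems need extra options here; XFS, for example, needs
# "-o nouuid" while the original volume is still mounted.
mount_snapshot() {
    mkdir -p "$MOUNT_POINT" &&
    mount "/dev/$VG/mysqlbackup" "$MOUNT_POINT"
}

# Unmount and drop the snapshot once the copy is finished, so it stops
# consuming space as the original volume keeps changing.
release_snapshot() {
    umount "$MOUNT_POINT" &&
    lvremove -f "/dev/$VG/mysqlbackup"
}
```

Between mount_snapshot and release_snapshot the files under $MOUNT_POINT stay frozen at the moment the snapshot was taken.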
The tool to manage full and incremental backups from the snapshot could be duplicity, which supports incremental backup of large files, storing only the parts modified since the previous backup.
Some time ago I was looking into exactly this, for school and for a project I had.
Incremental backup of your database with Git
A really interesting way to back up your databases (because it is trivial and very powerful) is to use Git. The process is simple: dump the database so that each row of each table is a separate INSERT statement. That way each commit stores only the differences with respect to the last state (deletes and inserts as well as updates).
In the specific case of MySQL, we would initially do something like this:
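A minimal sketch of that first step. The database name mydatabase, the repository path and the function name are assumptions of mine, and credentials are expected to come from an option file:

```shell
# Hypothetical database name and repository location.
DB_NAME=mydatabase
REPO_DIR=/var/backups/mydatabase

# Create the repository and commit a first dump.
# --skip-extended-insert writes one INSERT per row, so Git diffs stay small;
# --skip-dump-date drops the timestamp comment that would change every run.
# The body runs in a subshell so the caller's directory is left alone.
init_backup_repo() (
    mkdir -p "$REPO_DIR" && cd "$REPO_DIR" || exit 1
    git init -q || exit 1
    mysqldump --skip-extended-insert --skip-dump-date "$DB_NAME" > data.sql || exit 1
    git add data.sql &&
    git commit -q -m "Initial dump of $DB_NAME"
)
```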
From then on we can automate the process with a script as simple as this:
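Something along these lines, assuming the repository already exists in the hypothetical path below and that the dump file is called data.sql (both names are mine):

```shell
# Hypothetical database name and repository location.
DB_NAME=mydatabase
REPO_DIR=/var/backups/mydatabase

# Dump the database again and commit only if something changed.
# The dump goes to a temporary file first, so a failed mysqldump
# never overwrites the last good data.sql.
backup_to_git() (
    cd "$REPO_DIR" || exit 1
    mysqldump --skip-extended-insert --skip-dump-date "$DB_NAME" > data.sql.tmp &&
        mv data.sql.tmp data.sql || exit 1
    git add data.sql
    # "git diff --cached --quiet" succeeds when nothing is staged,
    # in which case the commit is skipped.
    git diff --cached --quiet || git commit -q -m "Backup $(date '+%Y-%m-%d %H:%M:%S')"
)
```

Each run of backup_to_git adds at most one commit, and that commit contains only the rows that changed since the previous dump.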
Depending on the volume of queries your database receives, you will want to run it from cron more or less frequently. It is also advisable to run $ git gc to optimize the repository. For example, a backup twice a day and maintenance once a week:
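One possible schedule, sketched under the assumption that the backup script is saved at the hypothetical path /usr/local/bin/backup_to_git.sh and the repository is in /var/backups/mydatabase. The exact hours are arbitrary:

```shell
# Hypothetical script path and repository location.
SCRIPT=/usr/local/bin/backup_to_git.sh
REPO_DIR=/var/backups/mydatabase

# Print the crontab lines: backup at 00:00 and 12:00 every day,
# repository maintenance on Sundays at 03:00.
cron_entries() {
    cat <<EOF
0 0,12 * * * $SCRIPT
0 3 * * 0 cd $REPO_DIR && git gc --quiet
EOF
}

# Append the lines to the current user's crontab.
# Running this twice adds the entries twice, so run it only once.
install_cron_entries() {
    { crontab -l 2>/dev/null; cron_entries; } | crontab -
}
```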
Also, from another computer, nothing prevents you from running
$ git clone ssh://equipo:path/to/mydatabase
and having the whole history of the database in a flash (well, that's relative, it will take up its share of space...), or even scheduling a $ git pull to keep several backups on different machines. In short, endless options open up.
As mentioned in the comments, I think there is an option that has not been considered so far, which is to let the platform do the work instead of looking for a more "manual" solution.
Your problem is that the database grows a lot, and I understand that what you want to avoid is running backups that constantly lock the database.
I can give you 3 solutions; which one I recommend depends on what you need or can afford at the platform level. I list them here in order of complexity/cost for the application.
MySQL Master/Slave
In this case we have one MySQL master and one slave of that machine. This way, while you keep writing to the master, the slave holds a complete copy of the database that is read-only, but that you can lock as many times as you want and hammer without remorse, since it will pick up the updates as soon as it can.
Setting it up is relatively simple; I leave you a tutorial on how to install a MySQL master/slave. Even if you only know a little English, following the commands is enough. I have it in production in a very large application without any problems.
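To illustrate the idea of backing up from the slave, here is a sketch meant to be run on the slave itself. The database name and function name are hypothetical, credentials are assumed to come from an option file, and the STOP SLAVE / START SLAVE spelling is the one used by the MySQL versions of that time (newer releases call it STOP REPLICA / START REPLICA):

```shell
# Hypothetical database name.
DB_NAME=mydatabase

# Pause the thread that applies replicated changes, dump the now static
# data into the file given as first argument, then resume replication.
backup_from_replica() {
    mysql -e "STOP SLAVE SQL_THREAD;" || return 1
    mysqldump --single-transaction "$DB_NAME" > "$1"
    status=$?
    # Resume even if the dump failed, so the slave does not fall behind.
    mysql -e "START SLAVE SQL_THREAD;" || return 1
    return "$status"
}
```

For example, backup_from_replica /backups/mydatabase.sql produces the dump while the master keeps accepting writes.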
The good thing is that it can be set up with two machines, so the cost of the platform is quite low, and you can expand it as much as you want. The read capacity is very high, and the write capacity is whatever the master can handle. Likewise, if the slave crashes nothing happens, and if the master crashes you can promote the slave to master and keep the application running.
mysql-cluster
It is a new feature of version 5.7 that I have only been able to try on a test project, so I cannot give you very solid feedback. I can only tell you that it seems to have been tested a lot and the opinions I have seen are quite favorable.
Here you have the tutorial to install a MySQL cluster. It is not as simple as a master/slave, but it gives you much more. For example, it has automatic sharding, which lets you greatly increase the number of writes. If you don't know what sharding is, it is discussed on StackOverflow in English; reading it will surely help you.
Galera Cluster
It is a master/master/master cluster, that is, you can read and write on all three servers at the same time. I think it is more than you need, but if you expect the application to grow a lot, it is a very good option.
The problem is that there must be at least 3 servers for the cluster to be sufficiently consistent, since there has to be what is called a quorum. The number of reads does not increase as much as in a MySQL cluster because it lacks automatic sharding, but unlike master/slave a server can go down without any problem because the quorum can still be reached.
Here you have how to install Galera on CentOS 7 and how to install Galera on Ubuntu 16.04; since I don't know which system you use, I leave you both. I always prefer CentOS or Debian for anything related to servers.
Differences between Galera and MySQL Cluster
There are quite a few differences between Galera and MySQL clusters. I recommend that you read this slideshare, since listing them here would make the answer almost endless, and then everyone's opinion on which is better or worse would come into play.
I think this gives you an idea of the options you have at the platform level, without having to resort to "stranger" things like putting a MySQL dump file under git.
If you need more information, leave a comment and I'll help you as much as I can.