Here’s some of the lessons I had to learn in production environment so you don’t have to pay the price of making the same mistakes.
Lesson 1 : A multi column index performs way better than several indexes on each column
For example lets assume we have a users table including first_name and last_name columns , now if we’re querying both columns together most of the time, having one multi column index on both columns would perform much better than separate indexes on each column.
Things to consider before jumping into using this:
- You can notice the big difference when you have a lot of records and a lot of read queries, and after some point the multi column index might be the only possible solution to keep the database responsive.
- Your conditions in the read queries MUST be with the same order as the index, for example if your index is (first_name,last_name) and the query is something like “SELECT * FROM users WHERE last_name=‘Potter’ AND first_name = ‘Harry’ ” the index will NOT be used, the conditions should be in the same order as the index, so the query should be like “SELECT * FROM users WHERE first_name = ‘Harry’ AND last_name=‘Potter’”.
- This index also will be used for queries like “SELECT * FROM users WHERE first_name = ‘Harry’ ” (because it’s the most left column of index).
- You might need the single column indexes in addition to the multi column one, now this case might not make sense for my example but that’s just an example and in your case you might have a ton of data and you still need to answer a lot of queries with both singular and multiple conditions.
Lesson 2: You can back up and restore the entire database way faster by copying the data files instead of dumping SQL
As your database grows larger dumping sql files gets slower and restoring it would even take longer, for example a 100GB database the restoring part might take around 1 day or even more which is not ideal.
Things to consider before doing this:
- You MUST stop the database before copying, and it SHOULD be gracefully, for example if you’re using docker consider using the
-toption for increasing the timeout, for example
docker stop -t 8000
- you SHOULD have your database configuration files and the exact version of MySQL backed up as well for restoring, docker also helps a lot with this as you can easily have them as code.
Lesson 3: Creating a replica/follower and achieving zero downtime backups is easier than I thought
After following the “Lesson 2” all you need to do is to restore one backup and configure the leader and follower with down time of just one regular backup, and after that you can achieve zero downtime backups by doing the periodic backups from “Lesson 2” on the follower instead of the leader.
You might think what about the writes that is happening after you created the backup? should we keep everything down while we’re moving the data for the follower? The answer is NO, if you configure your leader correctly the leader would log the changes after the backup and whenever your follower start it’s process, it’ll catch up with those logged changes as long as those changes are not lost because of the leader’s configured maximum capacity for changes.
Lesson 4: Tuning database has huge impact on the performance
Tuning your configurations might not be important as you’re pressure on database is low, but as it grows the configuration becomes critical, I suggest using https://github.com/major/MySQLTuner-perl that examines your instance and gives you suggestions for configuration.
Lesson 5: Be aware of race conditions
My personal experience was about a situation that according to my software was impossible to happen and took a LONG time for me to find the reason, but if I was familiar with race conditions beforehand I would at least knew the cause way earlier.
The solution for race conditions depends on the situation and there is no one fit all solution, for my case I managed to fix the issue using a unique constraint.
One popular example is incrementing view counters, when two process try to increment the value at the same time and they both have the same old value, and both increment that value by 1 at the same time and as the result the counter would be increase by 1 instead of 2.
One solution for this problem is to use the column name inside the update query instead of the old value.