When to use and not to use Galera Cluster for scaling MySQL

Saeid Raei
3 min readOct 25, 2022

Software products often don’t alert you with their limitations upfront and most of the time you need to have the bigger picture while digging deep into their documentation to find them out or you’re gonna have to pay the price of using them and finding those limitations out which often can be very expensive to do.

For example take a look at the MariaDB documentation , it states that galera is highly scalable, in fact it talks about near linear scalability.

In order to understand the case we need to know the difference between synchronous and asynchronous replication, so lets briefly check them out.

Synchronous replication

Basically it means that every write in a leader node (active node or writable node) needs to wait for other nodes to do the same write in order to be considered as a successful write, in this way we are sure that all the nodes are in the same consistent state and none of them have different or inconsistent data.

Asynchronous replication

Every write on a leader (active or writable node) doesn’t wait for any other node to do the write and as long as it’s successful on that node it would be considered successful, in this way the write happens faster and the replication happens in a different process, but other nodes would be in a inconsistent state before the replication completes (known as eventual consistency). this is the default replication mode in MySQL or PostgreSQL.

How does Galera cluster works?

Each node in galera cluster would have copy of all the data and the replication happens synchronously, this means that every write needs to wait for other nodes to do the same write as well. This makes your write queries not scalable as by adding more nodes adds extra overhead that happens over network!

Wait, why did we added multiple leaders (active or writable nodes) to begin with? Distributing writes load, right? But here not only we didn’t do that, also we’ve added extra overhead to write queries, so we did the opposite of what we assume to be the case when we have multiple nodes accepting write queries.

Best way to describe the the nodes being writable is that we have all the nodes synchronized anyway so might as well make them writable.

When Galera cluster can be useful

If your application needs the data to be the most up to date on each node at any time, then you need to have synchronous replication, but MySQL by default doesn’t provide that.
Now if you have way more read queries than writes and don’t need to have write scalability then galera would be an option alongside other options like group replication.

Galera vs default replication

In galera cluster by adding nodes to cluster your writes gets worst while improving reads, that’s the price of getting the true synchronization or consistency, in order to improve your write queries, you need to have your nodes to communicate in a very low latency network in which a normal replication also would happen relatively fast.

If you can accept possible several milliseconds inconsistency, having a load-balancer like ProxySQL in front of a leader and followers nodes would scale much better.

Also there are a lot of cases that true consistency is required like finance related applications or an online quiz that the result would impact the person taking the quiz.

Thanks for reading.

--

--