Master‑Master Replication with MariaDB: A Quick‑Start Guide
If your production database keeps flipping between “up” and “down,” or you need a failover that doesn’t involve any downtime, master‑master replication can be the fix. In this post I’ll walk through how to set it up on two Linux servers, point out where things usually go wrong, and show you the real‑world payoff of having both nodes writeable.
Why you might need a true master‑master pair
A single‑source (master‑slave) setup is fine for read‑heavy workloads. But when your application writes constantly, any hiccup on the sole master can halt everything until it’s back online. With two masters that both accept writes, if one node fails you just keep using the other – no manual failover scripts required, and as long as replication was caught up, no data loss either. I’ve seen small SaaS startups lose hours of traffic because their backup never caught up; switching to a true dual‑master setup saved them days of recovery work.
Prerequisites and quick sanity checks
- Two servers running the same MariaDB version (10.3 or newer).
- A distinct gtid_domain_id in my.cnf on each machine – MariaDB’s GTIDs are always on (there is no MySQL-style gtid_mode flag), and separate domains keep each node’s transaction stream cleanly identifiable.
- A firewall that allows TCP port 3306 between them.
- Sufficient disk space for binlogs, because with two masters each will keep a copy of the other’s events.
If any of those fail, you’ll get cryptic “cannot read event” errors later. So start there.
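The version requirement is the easiest one to check mechanically. Here’s a minimal shell sketch – the version strings are placeholders for whatever `mariadb --version` reports on each host:

```shell
ver_a="10.6.16"   # placeholder: version reported by `mariadb --version` on server A
ver_b="10.6.18"   # placeholder: version reported by `mariadb --version` on server B

# compare the major.minor series only; patch-level drift within a series is fine
if [ "${ver_a%.*}" = "${ver_b%.*}" ]; then
    echo "versions compatible: ${ver_a%.*} series on both nodes"
else
    echo "version mismatch: $ver_a vs $ver_b"
fi
```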
Step 1: Prepare MariaDB configs
On both servers edit /etc/mysql/mariadb.conf.d/50-server.cnf (the Debian/Ubuntu path; adjust for your distro):
```ini
[mysqld]
server-id                   = 1    # change to 2 on the second server
log-bin                     = /var/log/mysql/mysql-bin.log
binlog_format               = ROW
log_slave_updates           = 1    # replicated events also land in this node's binlog
gtid_domain_id              = 1    # change to 2 on the second server
gtid_strict_mode            = 1    # refuse out-of-order GTIDs instead of silently diverging
auto_increment_increment    = 2    # both nodes count AUTO_INCREMENT in steps of 2...
auto_increment_offset       = 1    # ...starting from 1 here and from 2 on the second server
replicate_wild_ignore_table = mysql.%
```
Why these lines?
- server-id differentiates the nodes for replication.
- Row‑based binlog gives precise changes, which is safer when two masters write simultaneously.
- GTIDs tag every transaction with a unique ID, so each node knows exactly which of the peer’s events it has already applied. They do not resolve write conflicts, though – giving the two nodes distinct auto‑increment offsets (1 and 2, with an increment of 2) is what keeps simultaneous inserts from colliding.
After editing reload MariaDB with systemctl restart mariadb.
Step 2: Create a replication user
Pick a username you’ll use on both machines, e.g., replica_user. On both servers run:
```sql
CREATE USER 'replica_user'@'%' IDENTIFIED BY 'StrongPass!23';
GRANT REPLICATION SLAVE ON *.* TO 'replica_user'@'%';
```
The % host spec lets the other node connect from any IP – convenient in a lab, though in production you’d restrict it to the peer’s address. In my own labs, I once forgot to grant REPLICATION SLAVE and ended up with “Access denied for replication” errors that took an hour to debug.
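If you’d rather not invent a password by hand, here’s a small sketch that generates one with openssl (assumed installed). Base64 output contains no quote characters, so it’s safe to paste into IDENTIFIED BY:

```shell
# 18 random bytes base64-encode to exactly 24 characters, with no '=' padding
repl_pass=$(openssl rand -base64 18)
echo "CREATE USER 'replica_user'@'%' IDENTIFIED BY '$repl_pass';"
```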
Step 3: Capture each server’s current GTID set
On both servers:
SELECT @@gtid_current_pos;
Copy the output. It looks like 0-1-12345,1-2-67890 – each triplet is domain-server_id-sequence. This string tells MariaDB which transactions have already run on this node.
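To make the format concrete, here’s a small POSIX-shell sketch that splits one GTID triplet into its parts (the value itself is illustrative):

```shell
# a MariaDB GTID is domain-server_id-sequence; this value is illustrative
gtid="0-1-12345"
domain=${gtid%%-*}      # replication domain (first field)
rest=${gtid#*-}
server_id=${rest%%-*}   # server-id of the master that wrote the transaction
seq=${rest#*-}          # sequence number within the domain
echo "domain=$domain server_id=$server_id sequence=$seq"
```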
Step 4: Configure the opposite master
On Server A, point replication at Server B, using the GTID position you captured there:

```sql
-- seed the starting point with Server B's position from step 3 (value is illustrative)
SET GLOBAL gtid_slave_pos = '0-2-67890';
CHANGE MASTER TO
  MASTER_HOST = 'server-b-ip',
  MASTER_USER = 'replica_user',
  MASTER_PASSWORD = 'StrongPass!23',
  MASTER_USE_GTID = slave_pos;
START SLAVE;
```

MASTER_USE_GTID=slave_pos tells MariaDB to replicate by GTID rather than a binlog file/position pair, which is essential for master‑master. (MySQL’s MASTER_AUTO_POSITION=1 does not exist in MariaDB.) Repeat the same on Server B, swapping server-b-ip for server-a-ip and seeding gtid_slave_pos with Server A’s position.
Now each server is simultaneously a master and a slave of the other. The first START SLAVE begins pulling binlogs from the peer.
Step 5: Verify the link
Run:
SHOW SLAVE STATUS\G
Look for Slave_IO_Running: Yes and Slave_SQL_Running: Yes. Also check Seconds_Behind_Master. It should be 0 or a very small number. If you see “Connection to master lost” or “Fatal error in slave relay log”, double‑check the firewall and user privileges.
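If you check this often, a grep filter keeps only the fields that matter. The heredoc below is sample output standing in for the real command – in practice you’d pipe `mysql -e "SHOW SLAVE STATUS\G"` into the same grep:

```shell
# sample SHOW SLAVE STATUS output; replace the heredoc with the real mysql pipe
status=$(grep -E 'Slave_IO_Running|Slave_SQL_Running|Seconds_Behind_Master' <<'EOF'
                Slave_IO_State: Waiting for master to send event
              Slave_IO_Running: Yes
             Slave_SQL_Running: Yes
         Seconds_Behind_Master: 0
EOF
)
echo "$status"
```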
Common pitfalls and how I avoided them
- Conflicting GTIDs: If the two nodes already had diverging transactions before you linked them, the slave threads will stop with a GTID error (the slave requests a position that isn’t in the master’s binlog). In that case dump all data from one node, import it into the other, then restart replication from a clean, matching GTID position.
- Different versions: Replicating across major releases is only supported from an older master to a newer slave – and in a master‑master pair both directions run at once, so mixing, say, 10.2 and 10.3 invites “cannot read event” errors. Keep both servers on the same major release.
- Unbounded binlog growth: With two masters each generating logs, you can fill up /var/lib/mysql quickly. Set expire_logs_days=7 or give binlogs a dedicated partition.
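For the binlog pitfall, a minimal my.cnf fragment – the seven-day window is just a starting point, so size it to your write volume and backup cadence:

```ini
[mysqld]
expire_logs_days = 7      # purge binary logs older than a week
max_binlog_size  = 100M   # smaller files make purging more granular
```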
Real‑world benefit in action
After setting this up on my own small ecommerce site, one of the nodes went offline for a routine update. The traffic just kept flowing to the other master without any manual intervention. We never lost an order and there was zero downtime—just another example that true dual‑master replication isn’t a fancy feature; it’s a safety net.