Photo by Vlad Shapochnikov / Unsplash

Replicate MySQL database in ClickHouse with [experimental] MaterializedMySQL Database Engine

Data Dec 25, 2023

In this guide, we'll walk through the seamless process of replicating a MySQL database into ClickHouse, leveraging the experimental MaterializedMySQL Database Engine. By replicating MySQL data into ClickHouse, we harness the real-time analytical capabilities of ClickHouse while preserving data integrity and consistency.

Introduction

MySQL stands as a stalwart in the realm of relational database management systems, renowned for its open-source nature and robust features. On the other hand, ClickHouse shines as a lightning-fast, column-oriented database management system, empowering users to generate analytical data reports in real-time through SQL queries.

Minimum Software Requirements

Ensure you have the following components installed and configured:

mysql.cnf MySQL server configuration for Replication

The following are mandatory MySQL server configuration which must be set.

  1. Stop MySQL service.
systemctl status mysql
systemctl stop mysql.service
  1. Update the configuration.

Edit the MySQL configuration file mysqld.cnf:

sudo nano /etc/mysql/mysql.conf.d/mysqld.cnf

Add/Update the following parameters:

[mysqld]
server-id                     = 223344
log_bin                       = mysql-bin
expire_logs_days              = 1
binlog_format                 = row
binlog_row_image              = FULL
default_authentication_plugin = mysql_native_password
gtid_mode                     = on
enforce_gtid_consistency      = on

ClickHouse reads binlog and performs DDL and DML queries.

  1. Start MySQL service.
systemctl start mysql.service

Reference: https://clickhouse.com/docs/en/engines/database-engines/materialized-mysql

MySQL user creation for replication

Create a MySQL user specifically for replication:

create user clickhouse_replication@'localhost' identified with mysql_native_password by 'ChRep$316';
grant replication slave, replication client, reload, select on *.* to clickhouse_replication@'localhost';
flush privileges;

Sample MySQL Data Setup

Setup a sample table and insert data into MySQL:

  1. Create a table.
CREATE TABLE `user` (
  `id` bigint(20) NOT NULL AUTO_INCREMENT,
  `created_date` datetime NOT NULL DEFAULT CURRENT_TIMESTAMP,
  `last_modified_date` datetime NOT NULL DEFAULT CURRENT_TIMESTAMP,
  `username` varchar(255) DEFAULT NULL,
  `email` varchar(255) DEFAULT NULL,
  PRIMARY KEY (`id`),
  UNIQUE KEY `UK_sb8bbouer5wak8vyiiy4pf2bx` (`username`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4;
  1. Insert sample data.
INSERT INTO `user`(`username`,`email`)VALUES('John Doe','johndoe@example.com');

INSERT INTO `user`(`username`,`email`)VALUES('Jane Doe','janedoe@example.com');

CLickHouse server configuration

sudo systemctl status clickhouse-server
sudo systemctl stop clickhouse-server

Update ClickHouse server configuration to allow experimental MaterializedMySQL engine:

sudo nano /etc/clickhouse-server/users.xml

Add/Update the following value within <profiles>:

    <profiles>
        <default>                           <allow_experimental_database_materialized_mysql>1</allow_experimental_database_materialized_mysql>
        </default>
    </profiles>
sudo systemctl start clickhouse-server

Login to ClickHouse server using terminal or any GUI tool and verify that the changes have been saved.

clickhouse-client --password   
SELECT
    name,
    value,
    changed,
    description
FROM system.settings
WHERE name = 'allow_experimental_database_materialized_mysql'

carbon

Activating the replication in ClickHouse

Create a ClickHouse database with MaterializedMySQL engine for replication:

CREATE DATABASE mysqlCH
ENGINE = MaterializeMySQL('127.0.0.1', 'ch', 'clickhouse_replication', 'ChRep$316')
SETTINGS allows_query_when_mysql_lost = 1, max_wait_time_when_mysql_unavailable = 10000, materialized_mysql_tables_list = 'user'

carbon--1-

Verify Replication

Check if the database and the table with data got replicated into ClickHouse:

select * from mysqlCH.user;

carbon--2-

Additional Operations

Perform additional operations in MySQL and verify replication status in ClickHouse:

INSERT additional rows in MySQL table and check the replication status.

INSERT INTO `user`(`username`,`email`)VALUES('Alice','alice@example.com');
INSERT INTO `user`(`username`,`email`)VALUES('Bob','alice@example.com');

carbon--3-

UPDATE Alice's email in MySQL user table.

UPDATE `ch`.`user` SET `email` = 'alice@domain.com' WHERE `username` = 'Alice';

carbon--4-

DELETE Bob in MySQL user table.

DELETE FROM `ch`.`user` WHERE (`id` = '4');

carbon--5-

With these steps, you've successfully set up MySQL replication into ClickHouse using the MaterializedMySQL engine. Harness the power of ClickHouse's real-time analytical capabilities while ensuring data consistency and integrity. Explore the vast possibilities of real-time analytics with ClickHouse and unlock insights from your replicated MySQL data effortlessly.

Tags

Anantha Raju C

| Poetry | Music | Cinema | Books | Visual Art | Software Engineering |