
Watch the Replay: MySQL on Docker - Understanding the Basics

Thanks to everyone who joined us this week as we broadcast our MySQL on Docker webinar live from the Percona Live Conference in Dublin!

Our colleague Ashraf Sharif walked through how Docker containers work, running a simple MySQL container, and the ClusterControl Docker image (amongst other things).

If you missed the session or would like to watch it again, it’s now online for viewing.

Watch replay

Here’s the full agenda of the topics that were covered during the webinar. The content is aimed at MySQL users who are Docker beginners and who would like to understand the basics of running a MySQL container on Docker.

  • Docker and its components
  • Concept and terminology
  • How a Docker container works
  • Advantages and disadvantages
  • Stateless vs stateful
  • Docker images for MySQL
  • Running a simple MySQL container
  • The ClusterControl Docker image
  • Severalnines on Docker Hub

Watch replay

And if you’re not following our Docker blog series yet, we encourage you to do so: MySQL on Docker.


MySQL on Docker: Multi-Host Networking for MySQL Containers (Part 2 - Calico)

In the previous post, we looked into the basics of running MySQL containers on multiple Docker hosts managed by Swarm Mode, the native orchestration tool that comes with Docker 1.12. However, at the time of writing, Docker Engine Swarm Mode does not support other networking plugins like Calico, Weave or Flannel. If we’d like to run any of these, we must run them outside of Docker Swarm Mode and use other tools for orchestration, e.g., Kubernetes, Mesos or standalone Docker Swarm.

In this blog post, we are going to look into other networking drivers that support multi-host networking and best fit our MySQL setups. We are going to deploy MySQL Replication on top of three Docker hosts using Calico’s multi-host networking driver. Weave and Flannel will be covered in upcoming blog posts.

Calico is not an “overlay network” - it does not encapsulate one packet inside another packet. It uses a pure Layer 3 approach and avoids the packet encapsulation associated with Layer 2 solutions, which simplifies diagnostics, reduces transport overhead and improves performance. Calico also implements the BGP protocol for routing, combined with a pure IP network, which allows Internet-scale virtual networks.

Consider having 3 physical hosts installed with Docker Engine v1.12.1, all running CentOS 7.1. The following is the content of /etc/hosts on each host:

192.168.55.111    docker1.local docker1
192.168.55.112    docker2.local docker2
192.168.55.113    docker3.local docker3

Key-Value Store (etcd)

Etcd is a popular open-source distributed key value store that provides shared configuration and service discovery. A simple use-case is to store database connection details or feature flags in etcd as key value pairs.
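
For example, storing and retrieving a key/value pair is a one-liner with the etcdctl client (a minimal illustration using the etcd v2 API; the key name is arbitrary):

$ etcdctl set /config/mysql/host 192.168.55.111
192.168.55.111
$ etcdctl get /config/mysql/host
192.168.55.111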

Calico requires etcd to operate, and etcd can be clustered with many instances. In this example, we are going to install etcd on each of the Docker hosts and form a three-node etcd cluster for better availability.

  1. Install etcd packages:

    $ yum install etcd
  2. Modify the configuration file accordingly depending on the Docker hosts:

    $ vim /etc/etcd/etcd.conf

    For docker1 with IP address 192.168.55.111:

    ETCD_NAME=etcd1
    ETCD_DATA_DIR="/var/lib/etcd/default.etcd"
    ETCD_LISTEN_PEER_URLS="http://0.0.0.0:2380"
    ETCD_LISTEN_CLIENT_URLS="http://0.0.0.0:2379"
    ETCD_INITIAL_ADVERTISE_PEER_URLS="http://192.168.55.111:2380"
    ETCD_INITIAL_CLUSTER="etcd1=http://192.168.55.111:2380,etcd2=http://192.168.55.112:2380,etcd3=http://192.168.55.113:2380"
    ETCD_INITIAL_CLUSTER_STATE="new"
    ETCD_INITIAL_CLUSTER_TOKEN="etcd-cluster-1"
    ETCD_ADVERTISE_CLIENT_URLS="http://0.0.0.0:2379"

    For docker2 with IP address 192.168.55.112:

    ETCD_NAME=etcd2
    ETCD_DATA_DIR="/var/lib/etcd/default.etcd"
    ETCD_LISTEN_PEER_URLS="http://0.0.0.0:2380"
    ETCD_LISTEN_CLIENT_URLS="http://0.0.0.0:2379"
    ETCD_INITIAL_ADVERTISE_PEER_URLS="http://192.168.55.112:2380"
    ETCD_INITIAL_CLUSTER="etcd1=http://192.168.55.111:2380,etcd2=http://192.168.55.112:2380,etcd3=http://192.168.55.113:2380"
    ETCD_INITIAL_CLUSTER_STATE="new"
    ETCD_INITIAL_CLUSTER_TOKEN="etcd-cluster-1"
    ETCD_ADVERTISE_CLIENT_URLS="http://0.0.0.0:2379"

    For docker3 with IP address 192.168.55.113:

    ETCD_NAME=etcd3
    ETCD_DATA_DIR="/var/lib/etcd/default.etcd"
    ETCD_LISTEN_PEER_URLS="http://0.0.0.0:2380"
    ETCD_LISTEN_CLIENT_URLS="http://0.0.0.0:2379"
    ETCD_INITIAL_ADVERTISE_PEER_URLS="http://192.168.55.113:2380"
    ETCD_INITIAL_CLUSTER="etcd1=http://192.168.55.111:2380,etcd2=http://192.168.55.112:2380,etcd3=http://192.168.55.113:2380"
    ETCD_INITIAL_CLUSTER_STATE="new"
    ETCD_INITIAL_CLUSTER_TOKEN="etcd-cluster-1"
    ETCD_ADVERTISE_CLIENT_URLS="http://0.0.0.0:2379"
  3. Start the service on docker1, followed by docker2 and docker3:

    $ systemctl start etcd
  4. Verify our cluster status:

    [root@docker3 ~]$ etcdctl cluster-health
    member 2f8ec0a21c11c189 is healthy: got healthy result from http://0.0.0.0:2379
    member 589a7883a7ee56ec is healthy: got healthy result from http://0.0.0.0:2379
    member fcacfa3f23575abe is healthy: got healthy result from http://0.0.0.0:2379
    cluster is healthy
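
    Optionally, you can also list the individual members together with their peer and client URLs (etcdctl v2 syntax):

    [root@docker3 ~]$ etcdctl member list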

That’s it. Our etcd is now running as a cluster on three nodes. Our setup now looks like this:

Calico Installation

Ensure etcd is installed as mentioned in the “Key-Value Store (etcd)” section. The following commands should be performed on each Docker host unless specified otherwise:

  1. Download Calico and make it executable:

    $ wget http://www.projectcalico.org/builds/calicoctl -P /usr/local/bin
    $ chmod +x /usr/local/bin/calicoctl
  2. Create the calico node for docker1. Specify the Docker host’s IP address as below:

    [root@docker1 ~]$ calicoctl node --ip=192.168.55.111 --libnetwork
    Running Docker container with the following command:
    
    docker run -d --restart=always --net=host --privileged --name=calico-node -e HOSTNAME=docker1.local -e IP=192.168.55.111 -e IP6= -e CALICO_NETWORKING_BACKEND=bird -e AS= -e NO_DEFAULT_POOLS= -e ETCD_AUTHORITY=127.0.0.1:2379 -e ETCD_SCHEME=http -v /var/log/calico:/var/log/calico -v /lib/modules:/lib/modules -v /var/run/calico:/var/run/calico calico/node:latest
    
    Calico node is running with id: 37cf85e600e3fc04b16cdad7dc1c9f3bb7e8ace80b5fef4dbc21155d60440a78
    Waiting for successful startup
    Waiting for etcd connection...
    Calico node started successfully
    Calico libnetwork driver is running with id: 4c2c4b3fd5b8155622a656440513680c8da051ed6881a94a33fbdc1e8748c060
  3. The same goes for docker2, where the host IP address is 192.168.55.112:

    [root@docker2 ~]$ calicoctl node --ip=192.168.55.112 --libnetwork
    Running Docker container with the following command:
    
    docker run -d --restart=always --net=host --privileged --name=calico-node -e HOSTNAME=docker2.local -e IP=192.168.55.112 -e IP6= -e CALICO_NETWORKING_BACKEND=bird -e AS= -e NO_DEFAULT_POOLS= -e ETCD_AUTHORITY=127.0.0.1:2379 -e ETCD_SCHEME=http -v /var/log/calico:/var/log/calico -v /lib/modules:/lib/modules -v /var/run/calico:/var/run/calico calico/node:latest
    
    Calico node is running with id: 37cf85e600e3fc04b16cdad7dc1c9f3bb7e8ace80b5fef4dbc21155d60440a78
    Waiting for successful startup
    Waiting for etcd connection...
    Calico node started successfully
    Calico libnetwork driver is running with id: 4c2c4b3fd5b8155622a656440513680c8da051ed6881a94a33fbdc1e8748c060
  4. Then, on docker3:

    [root@docker3 ~]$ calicoctl node --ip=192.168.55.113 --libnetwork
    Running Docker container with the following command:
    
    docker run -d --restart=always --net=host --privileged --name=calico-node -e HOSTNAME=docker3.local -e IP=192.168.55.113 -e IP6= -e CALICO_NETWORKING_BACKEND=bird -e AS= -e NO_DEFAULT_POOLS= -e ETCD_AUTHORITY=127.0.0.1:2379 -e ETCD_SCHEME=http -v /var/log/calico:/var/log/calico -v /lib/modules:/lib/modules -v /var/run/calico:/var/run/calico calico/node:latest
    
    Calico node is running with id: 37cf85e600e3fc04b16cdad7dc1c9f3bb7e8ace80b5fef4dbc21155d60440a78
    Waiting for successful startup
    Waiting for etcd connection...
    Calico node started successfully
    Calico libnetwork driver is running with id: 4c2c4b3fd5b8155622a656440513680c8da051ed6881a94a33fbdc1e8748c060

    A calico-node service (“calico/node:latest”) runs on each Docker host and handles all of the necessary IP routing, installation of policy rules, and distribution of routes across the cluster of nodes.

  5. Verify the Calico status:

    [root@docker1 ~]$ calicoctl node show
    +---------------+----------------+-----------+-------------------+--------------+--------------+
    |    Hostname   |   Bird IPv4    | Bird IPv6 |       AS Num      | BGP Peers v4 | BGP Peers v6 |
    +---------------+----------------+-----------+-------------------+--------------+--------------+
    | docker1.local | 192.168.55.111 |           | 64511 (inherited) |              |              |
    | docker2.local | 192.168.55.112 |           | 64511 (inherited) |              |              |
    | docker3.local | 192.168.55.113 |           | 64511 (inherited) |              |              |
    +---------------+----------------+-----------+-------------------+--------------+--------------+
  6. Configure an IP pool for our Calico network:

    [root@docker1 ~]$ calicoctl pool add 192.168.0.0/16

    Verify the address pool is there:

    [root@docker1 ~]$ calicoctl pool show --ipv4
    +----------------+---------+
    |   IPv4 CIDR    | Options |
    +----------------+---------+
    | 192.168.0.0/16 |         |
    +----------------+---------+
  7. Create a new profile so we can group all MySQL Replication containers under the same roof:

    [root@docker1 ~]$ calicoctl profile add mysql-replication
    Created profile mysql-replication

With the profile created, we can illustrate our architecture as per below:

Deploying Multi-Host MySQL Replication Containers with Calico

Calico can be configured without using the Docker networking commands. Rather than have Docker configure the network, we are going to use the “calicoctl” command line tool to add a container into a Calico network - this adds the required interface and routes into the container, and configures Calico with the correct endpoint information.

Calico uses profiles to manage container isolation. You can create profiles and assign containers on the Calico network to different profiles. Only containers in the same profile are able to talk to each other; containers from different profiles cannot access each other, even if they are in the same CIDR subnet. As shown in step #7 in the previous section, we are going to append all containers to the same profile, called mysql-replication.
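
As a quick sanity check, you can list the profiles known to Calico (at this point, only the profile created in step #7 of the previous section should show up):

[root@docker1 ~]$ calicoctl profile show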

Host docker1 will run the mysql-master container, while the mysql-slave1 and mysql-slave2 containers will run on hosts docker2 and docker3 respectively.

MySQL Master

The following commands should be performed on docker1.

  1. Firstly, create a directory to be used by the master:

    [root@docker1 ~]$ mkdir -p /opt/Docker/mysql-master/data
  2. Create the MySQL master container:

    [root@docker1 ~]$ docker run -d \
    --name mysql-master \
    -v /opt/Docker/mysql-master/data:/var/lib/mysql \
    -e MYSQL_ROOT_PASSWORD=mypassword \
     mysql \
    --gtid_mode=ON \
    --log-bin \
    --log-slave-updates \
    --enforce-gtid-consistency \
    --server-id=1
  3. Create the slave user to be used by our slaves:

    [root@docker1 ~]$ docker exec -ti mysql-master 'mysql' -uroot -pmypassword -vvv -e "GRANT REPLICATION SLAVE ON *.* TO repl@'%' IDENTIFIED BY 'slavepass';"
  4. Add Calico’s network interface into the container and assign it an IP address from the Calico pool:

    [root@docker1 ~]$ calicoctl container add mysql-master 192.168.0.50
    IP 192.168.0.50 added to mysql-master
  5. Add the container into the profile:

    [root@docker1 ~]$ calicoctl container mysql-master profile append mysql-replication
    Profile(s) mysql-replication appended.

When each container is added to Calico, an "endpoint" is registered for each container's interface. Containers are only allowed to communicate with one another when both of their endpoints are assigned the same profile.

MySQL Slave #1

The following commands should be performed on docker2.

  1. Create a directory to be used by slave1:

    [root@docker2 ~]$ mkdir -p /opt/Docker/mysql-slave1/data
  2. Create the MySQL slave1 container:

    [root@docker2 ~]$ docker run -d \
    --name mysql-slave1 \
    -v /opt/Docker/mysql-slave1/data:/var/lib/mysql \
    -e MYSQL_ROOT_PASSWORD=mypassword \
    mysql \
    --gtid_mode=ON \
    --log-bin \
    --log-slave-updates \
    --enforce-gtid-consistency \
    --server-id=101
  3. Add Calico’s network interface into the container and assign it an IP address from the Calico pool:

    [root@docker2 ~]$ calicoctl container add mysql-slave1 192.168.0.51
    IP 192.168.0.51 added to mysql-slave1
  4. Add the container into the profile:

    [root@docker2 ~]$ calicoctl container mysql-slave1 profile append mysql-replication
    Profile(s) mysql-replication appended.
  5. Configure replication with the master, supplying the slave credentials created on mysql-master:

    [root@docker2 ~]$ docker exec -ti mysql-slave1 'mysql' -uroot -pmypassword -e 'CHANGE MASTER TO master_host="192.168.0.50", master_user="repl", master_password="slavepass", master_auto_position=1;' -vvv
  6. Start the replication:

    [root@docker2 ~]$ docker exec -ti mysql-slave1 'mysql' -uroot -pmypassword -e "START SLAVE;" -vvv
  7. Verify if mysql-slave1 is running:

    [root@docker2 ~]$ docker exec -ti mysql-slave1 'mysql' -uroot -pmypassword -e "SHOW SLAVE STATUS \G"
    ...
                 Slave_IO_Running: Yes
                Slave_SQL_Running: Yes
    ...
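
    As an optional sanity check of the Calico networking itself, you can connect from mysql-slave1 to the master’s Calico IP address with the replication user; this simply verifies cross-host connectivity within the mysql-replication profile:

    [root@docker2 ~]$ docker exec -ti mysql-slave1 'mysql' -urepl -pslavepass -h192.168.0.50 -e "SELECT 1;"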

MySQL Slave #2

The following commands should be performed on docker3.

  1. Create a directory to be used by slave2:

    [root@docker3 ~]$ mkdir -p /opt/Docker/mysql-slave2/data
  2. Create the MySQL slave2 container:

    [root@docker3 ~]$ docker run -d \
    --name mysql-slave2 \
    -v /opt/Docker/mysql-slave2/data:/var/lib/mysql \
    -e MYSQL_ROOT_PASSWORD=mypassword \
    mysql \
    --gtid_mode=ON \
    --log-bin \
    --log-slave-updates \
    --enforce-gtid-consistency \
    --server-id=102
  3. Add Calico’s network interface into the container and assign it an IP address from the Calico pool:

    [root@docker3 ~]$ calicoctl container add mysql-slave2 192.168.0.52
    IP 192.168.0.52 added to mysql-slave2
  4. Add the container into the profile:

    [root@docker3 ~]$ calicoctl container mysql-slave2 profile append mysql-replication
    Profile(s) mysql-replication appended.
  5. Configure replication with the master, supplying the slave credentials created on mysql-master:

    [root@docker3 ~]$ docker exec -ti mysql-slave2 'mysql' -uroot -pmypassword -e 'CHANGE MASTER TO master_host="192.168.0.50", master_user="repl", master_password="slavepass", master_auto_position=1;'
  6. Start the replication:

    [root@docker3 ~]$ docker exec -ti mysql-slave2 'mysql' -uroot -pmypassword -e "START SLAVE;"
  7. Verify if mysql-slave2 is running:

    [root@docker3 ~]$ docker exec -ti mysql-slave2 'mysql' -uroot -pmypassword -e "SHOW SLAVE STATUS \G"
    ...
                 Slave_IO_Running: Yes
                Slave_SQL_Running: Yes
    ...

    We can get a summary of all endpoints created by Calico with the “--detailed” flag:

    [root@docker1 ~]$ calicoctl endpoint show --detailed
    +---------------+-----------------+------------------------------------------------------------------+----------------------------------+-----------------+-------------------+-------------------+--------+
    |    Hostname   | Orchestrator ID |                           Workload ID                            |           Endpoint ID            |    Addresses    |        MAC        |      Profiles     | State  |
    +---------------+-----------------+------------------------------------------------------------------+----------------------------------+-----------------+-------------------+-------------------+--------+
    | docker1.local |      docker     | 89d2ef40918100037e250911e782f71129dd38d7253df274c70d4a31b281de0f | bb3b9fb4870e11e6a88a000c29d498bb | 192.168.0.50/32 | 6a:d1:42:30:05:9a | mysql-replication | active |
    | docker2.local |      docker     | ef379f4d46f957165513f86e7859613be9971e82364dc81f1641fd4faae1ec5d | 96aef76c870f11e6b31a000c2903c574 | 192.168.0.51/32 | 3a:d5:08:0c:d9:4a | mysql-replication | active |
    | docker3.local |      docker     | 61e9f51a6fb5b36a30e35d2779e1064267604fad0ee28562567cf166a2d90727 | 387c06c0886411e6b2f1000c29bb2913 | 192.168.0.52/32 | da:18:9f:42:68:9c | mysql-replication | active |
    +---------------+-----------------+------------------------------------------------------------------+----------------------------------+-----------------+-------------------+-------------------+--------+

Our architecture is now looking like this:

Exposing to the Public

Now we are ready to expose our MySQL Replication setup to the outside world. To do this, add an inbound rule allowing TCP port 3306:

[root@docker1 ~]$ calicoctl profile mysql-replication rule add inbound allow tcp to ports 3306

Verify the profile’s inbound and outbound rules:

[root@docker1 ~]$ calicoctl profile mysql-replication rule show
Inbound rules:
   1 allow from tag mysql-replication
   2 allow tcp to ports 3306
Outbound rules:
   1 allow

Then, on each of the Docker hosts, add the respective DNAT iptables rules. Assume the public IP is listening on interface eth0 of the Docker host.

Docker1 (mysql-master):

[root@docker1 ~]$ iptables -A PREROUTING -t nat -i eth0 -p tcp --dport 3306 -j DNAT --to 192.168.0.50:3306
[root@docker1 ~]$ iptables -A OUTPUT -t nat -p tcp -o lo --dport 3306 -j DNAT --to-destination 192.168.0.50:3306

Docker2 (mysql-slave1):

[root@docker2 ~]$ iptables -A PREROUTING -t nat -i eth0 -p tcp --dport 3306 -j DNAT --to 192.168.0.51:3306
[root@docker2 ~]$ iptables -A OUTPUT -t nat -p tcp -o lo --dport 3306 -j DNAT --to-destination 192.168.0.51:3306

Docker3 (mysql-slave2):

[root@docker3 ~]$ iptables -A PREROUTING -t nat -i eth0 -p tcp --dport 3306 -j DNAT --to 192.168.0.52:3306
[root@docker3 ~]$ iptables -A OUTPUT -t nat -p tcp -o lo --dport 3306 -j DNAT --to-destination 192.168.0.52:3306

Verify that the outside world can reach the MySQL master on docker1:

[host-in-another-galaxy]$ mysql -uroot -pmypassword -h192.168.55.111 -P3306
Welcome to the MariaDB monitor.  Commands end with ; or \g.
Your MySQL connection id is 11
Server version: 5.7.15-log MySQL Community Server (GPL)

Copyright (c) 2000, 2016, Oracle, MariaDB Corporation Ab and others.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

MySQL [(none)]> show master status\G
*************************** 1. row ***************************
             File: 89d2ef409181-bin.000001
         Position: 429
     Binlog_Do_DB:
 Binlog_Ignore_DB:
Executed_Gtid_Set: 4f25482e-870d-11e6-bd5b-0242ac110002:1

With port 3306 exposed, our final architecture can be illustrated as per below:

That sums up this blog post. Calico is well-known for its performance. If you are wondering how well Calico performs with multi-host containers, you can read this blog post written by Percona’s Vadim Tkachenko. In the next blog post, we are going to look into Weave.

MySQL on Docker: Deploy a Homogeneous Galera Cluster with etcd

In the previous blog post, we looked into Docker’s multi-host networking capabilities with the native overlay network and Calico. In this blog post, our journey to make Galera Cluster run smoothly on Docker containers continues. Deploying Galera Cluster on Docker is tricky when using orchestration tools. Due to the nature of the scheduler in container orchestration tools and the assumption of homogeneous images, the scheduler will just fire up the respective containers according to the run command and leave the bootstrapping process to the container’s entrypoint logic when starting up. And you do not want to do that for Galera - starting all nodes at once means each node will form a “1-node cluster”, and you’ll end up with a disjointed system.

“Homogeneousing” Galera Cluster

That might be a new word, but it holds true for stateful services like MySQL Replication and Galera Cluster. As one might know, the bootstrapping process for Galera Cluster usually requires manual intervention; you have to decide which node is the most advanced one to bootstrap from. There is nothing wrong with this step - you need to be aware of the state of each database node before deciding on the sequence in which to start them up. Galera Cluster is a distributed system, and its redundancy model works like that.

However, container orchestration tools like Docker Engine Swarm Mode and Kubernetes are not aware of the redundancy model of Galera. The orchestration tool presumes containers are independent from each other. If they are dependent, then you have to have an external service that monitors the state. The best way to achieve this is to use a key/value store as a reference point for other containers when starting up.

This is where a service discovery tool like etcd comes into the picture. The basic idea is that each node reports its state periodically to the service. This simplifies the decision process when starting up. For Galera Cluster, the node that has wsrep_local_state_comment equal to Synced shall be used as the reference node when constructing the Galera communication address (gcomm://) during joining. Otherwise, the most updated node has to be bootstrapped first.
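
The state a node reports is simply what MySQL already exposes through its wsrep status variables, for example:

$ mysql -uroot -pmypassword -e "SHOW STATUS LIKE 'wsrep_local_state_comment'"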

Etcd has a very nice feature called TTL, where you can expire a key after a certain amount of time. This is useful to determine the state of a node: a key/value entry only exists as long as an alive node keeps reporting to it. As a result, nodes won’t have to connect to each other to determine state when forming a cluster (which is very troublesome in a dynamic environment). For example, consider the following keys:

    {"createdIndex": 10074,"expiration": "2016-11-29T10:55:35.218496083Z","key": "/galera/my_wsrep_cluster/10.255.0.7/wsrep_last_committed","modifiedIndex": 10074,"ttl": 10,"value": "2881"
    },
    {"createdIndex": 10072,"expiration": "2016-11-29T10:55:34.650574629Z","key": "/galera/my_wsrep_cluster/10.255.0.7/wsrep_local_state_comment","modifiedIndex": 10072,"ttl": 10,"value": "Synced"
    }

After 10 seconds (the ttl value), those keys will be removed from the store. Basically, all nodes should report to etcd periodically with an expiring key: a container should report every N seconds while it’s alive (wsrep_local_state_comment=Synced and wsrep_last_committed=#value) via a background process. If a container goes down, it will no longer send updates to etcd, so the keys are removed after expiration. This simply indicates that the node was registered but is no longer synced with the cluster, and it will be skipped when constructing the Galera communication address at a later point.
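
The following is a simplified sketch of what such a background reporter could look like (illustrative only - the actual report_status.sh shipped with the image may differ; $NODE_IP and the credentials are assumptions):

while true; do
    # Read the current Galera state from MySQL
    state=$(mysql -uroot -p$MYSQL_ROOT_PASSWORD -NBe \
        "SHOW STATUS LIKE 'wsrep_local_state_comment'" | awk '{print $2}')
    committed=$(mysql -uroot -p$MYSQL_ROOT_PASSWORD -NBe \
        "SHOW STATUS LIKE 'wsrep_last_committed'" | awk '{print $2}')
    # Report to etcd with a 10-second TTL, so the keys expire if this node dies
    etcdctl set "/galera/my_wsrep_cluster/$NODE_IP/wsrep_local_state_comment" "$state" --ttl 10
    etcdctl set "/galera/my_wsrep_cluster/$NODE_IP/wsrep_last_committed" "$committed" --ttl 10
    sleep 5
done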

The overall flow of the joining procedure is illustrated in the following flow chart:

We have built a Docker image that follows the above flow. It is specifically built for running Galera Cluster using Docker’s orchestration tools, and it is available on Docker Hub and in our Github repository. It requires an etcd cluster as the discovery service (multiple etcd hosts are supported) and is based on Percona XtraDB Cluster 5.6. The image includes Percona Xtrabackup, jq (a JSON processor) and a shell script tailored for Galera health checks called report_status.sh.

You are welcome to fork or contribute to the project. Any bugs can be reported via Github or via our support page.

Deploying etcd Cluster

etcd is a distributed key value store that provides a simple and efficient way to store data across a cluster of machines. It’s open-source and available on GitHub. It provides shared configuration and service discovery. A simple use-case is to store database connection details or feature flags in etcd as key value pairs. It gracefully handles leader elections during network partitions and will tolerate machine failures, including the leader.

Since etcd is the brain of the setup, we are going to deploy it as a cluster daemon, on three nodes, instead of using containers. In this example, we are going to install etcd on each of the Docker hosts and form a three-node etcd cluster for better availability.

We used CentOS 7 as the operating system, with Docker v1.12.3, build 6b644ec. The deployment steps in this blog post are basically similar to the one used in our previous blog post.

  1. Install etcd packages:

    $ yum install etcd
  2. Modify the configuration file accordingly depending on the Docker hosts:

    $ vim /etc/etcd/etcd.conf

    For docker1 with IP address 192.168.55.111:

    ETCD_NAME=etcd1
    ETCD_DATA_DIR="/var/lib/etcd/default.etcd"
    ETCD_LISTEN_PEER_URLS="http://0.0.0.0:2380"
    ETCD_LISTEN_CLIENT_URLS="http://0.0.0.0:2379"
    ETCD_INITIAL_ADVERTISE_PEER_URLS="http://192.168.55.111:2380"
    ETCD_INITIAL_CLUSTER="etcd1=http://192.168.55.111:2380,etcd2=http://192.168.55.112:2380,etcd3=http://192.168.55.113:2380"
    ETCD_INITIAL_CLUSTER_STATE="new"
    ETCD_INITIAL_CLUSTER_TOKEN="etcd-cluster-1"
    ETCD_ADVERTISE_CLIENT_URLS="http://0.0.0.0:2379"

    For docker2 with IP address 192.168.55.112:

    ETCD_NAME=etcd2
    ETCD_DATA_DIR="/var/lib/etcd/default.etcd"
    ETCD_LISTEN_PEER_URLS="http://0.0.0.0:2380"
    ETCD_LISTEN_CLIENT_URLS="http://0.0.0.0:2379"
    ETCD_INITIAL_ADVERTISE_PEER_URLS="http://192.168.55.112:2380"
    ETCD_INITIAL_CLUSTER="etcd1=http://192.168.55.111:2380,etcd2=http://192.168.55.112:2380,etcd3=http://192.168.55.113:2380"
    ETCD_INITIAL_CLUSTER_STATE="new"
    ETCD_INITIAL_CLUSTER_TOKEN="etcd-cluster-1"
    ETCD_ADVERTISE_CLIENT_URLS="http://0.0.0.0:2379"

    For docker3 with IP address 192.168.55.113:

    ETCD_NAME=etcd3
    ETCD_DATA_DIR="/var/lib/etcd/default.etcd"
    ETCD_LISTEN_PEER_URLS="http://0.0.0.0:2380"
    ETCD_LISTEN_CLIENT_URLS="http://0.0.0.0:2379"
    ETCD_INITIAL_ADVERTISE_PEER_URLS="http://192.168.55.113:2380"
    ETCD_INITIAL_CLUSTER="etcd1=http://192.168.55.111:2380,etcd2=http://192.168.55.112:2380,etcd3=http://192.168.55.113:2380"
    ETCD_INITIAL_CLUSTER_STATE="new"
    ETCD_INITIAL_CLUSTER_TOKEN="etcd-cluster-1"
    ETCD_ADVERTISE_CLIENT_URLS="http://0.0.0.0:2379"
  3. Start the service on docker1, followed by docker2 and docker3:

    $ systemctl enable etcd
    $ systemctl start etcd
  4. Verify our cluster status using etcdctl:

    [root@docker3 ~]$ etcdctl cluster-health
    member 2f8ec0a21c11c189 is healthy: got healthy result from http://0.0.0.0:2379
    member 589a7883a7ee56ec is healthy: got healthy result from http://0.0.0.0:2379
    member fcacfa3f23575abe is healthy: got healthy result from http://0.0.0.0:2379
    cluster is healthy

That’s it. Our etcd is now running as a cluster on three nodes. The below illustrates our architecture:

Deploying Galera Cluster

A minimum of 3 containers is recommended for a high availability setup, so we are going to create 3 replicas to start with; the service can be scaled up and down afterwards. Running standalone is also possible with the standard "docker run" command, as shown further down.

Before we start, it’s a good idea to remove any existing keys related to our cluster name in etcd:

$ etcdctl rm /galera/my_wsrep_cluster --recursive

Ephemeral Storage

This is the recommended way if you plan on scaling the cluster out to more nodes (or scaling back by removing nodes). To create a three-node Galera Cluster with ephemeral storage (the MySQL datadir will be lost if the container is removed), use the following command:

$ docker service create \
--name mysql-galera \
--replicas 3 \
-p 3306:3306 \
--network galera-net \
--env MYSQL_ROOT_PASSWORD=mypassword \
--env DISCOVERY_SERVICE=192.168.55.111:2379,192.168.55.112:2379,192.168.55.113:2379 \
--env XTRABACKUP_PASSWORD=mypassword \
--env CLUSTER_NAME=my_wsrep_cluster \
severalnines/pxc56

Persistent Storage

To create a three-node Galera Cluster with persistent storage (MySQL datadir persists if the container is removed), add the mount option with type=volume:

$ docker service create \
--name mysql-galera \
--replicas 3 \
-p 3306:3306 \
--network galera-net \
--mount type=volume,source=galera-vol,destination=/var/lib/mysql \
--env MYSQL_ROOT_PASSWORD=mypassword \
--env DISCOVERY_SERVICE=192.168.55.111:2379,192.168.55.112:2379,192.168.55.113:2379 \
--env XTRABACKUP_PASSWORD=mypassword \
--env CLUSTER_NAME=my_wsrep_cluster \
severalnines/pxc56

Custom my.cnf

If you would like to include a customized MySQL configuration file, create a directory on the physical host beforehand:

$ mkdir /mnt/docker/mysql-config # repeat on all Docker hosts
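
For example, a minimal custom configuration file could look like this (the values are illustrative only):

$ cat /mnt/docker/mysql-config/my-custom.cnf
[mysqld]
max_connections = 500
max_allowed_packet = 64M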

Then, use the mount option with “type=bind” to map the path into the container. In the following example, the custom my.cnf is located at /mnt/docker/mysql-config/my-custom.cnf on each Docker host:

$ docker service create \
--name mysql-galera \
--replicas 3 \
-p 3306:3306 \
--network galera-net \
--mount type=volume,source=galera-vol,destination=/var/lib/mysql \
--mount type=bind,src=/mnt/docker/mysql-config,dst=/etc/my.cnf.d \
--env MYSQL_ROOT_PASSWORD=mypassword \
--env DISCOVERY_SERVICE=192.168.55.111:2379,192.168.55.112:2379,192.168.55.113:2379 \
--env XTRABACKUP_PASSWORD=mypassword \
--env CLUSTER_NAME=my_wsrep_cluster \
severalnines/pxc56

Wait for a couple of minutes and verify the service is running (CURRENT STATE = Running):

$ docker service ps mysql-galera
ID                         NAME            IMAGE               NODE           DESIRED STATE  CURRENT STATE           ERROR
2vw40cavru9w4crr4d2fg83j4  mysql-galera.1  severalnines/pxc56  docker1.local  Running        Running 5 minutes ago
1cw6jeyb966326xu68lsjqoe1  mysql-galera.2  severalnines/pxc56  docker3.local  Running        Running 12 seconds ago
753x1edjlspqxmte96f7pzxs1  mysql-galera.3  severalnines/pxc56  docker2.local  Running        Running 5 seconds ago

External applications/clients can connect to any Docker host IP address or hostname on port 3306; requests will be load balanced between the Galera containers. The connection gets NATed to a virtual IP address for each service "task" (container, in this case) using the Linux kernel's built-in load balancing functionality, IPVS. If the application containers reside in the same overlay network (galera-net), use the assigned virtual IP address instead. You can retrieve it using the inspect option:

$ docker service inspect mysql-galera -f "{{ .Endpoint.VirtualIPs }}"
[{89n5idmdcswqqha7wcswbn6pw 10.255.0.2/16} {1ufbr56pyhhbkbgtgsfy9xkww 10.0.0.2/24}]
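
You can also browse the discovery keys the containers have registered in etcd (etcdctl v2 syntax; the key layout follows the CLUSTER_NAME, as shown in the TTL example earlier):

$ etcdctl ls /galera/my_wsrep_cluster --recursive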

Our architecture is now looking like this:

As a side note, you can also run Galera in standalone mode. This is probably useful for testing purposes like backup and restore, testing the impact of queries and so on. To run it just like a standalone MySQL container, use the standard docker run command:

$ docker run -d \
-p 3306 \
--name=galera-single \
-e MYSQL_ROOT_PASSWORD=mypassword \
-e DISCOVERY_SERVICE=192.168.55.111:2379,192.168.55.112:2379,192.168.55.113:2379 \
-e CLUSTER_NAME=my_wsrep_cluster \
-e XTRABACKUP_PASSWORD=mypassword \
severalnines/pxc56

Scaling the Cluster

There are two ways you can do scaling:

  1. Use the “docker service scale” command.
  2. Create a new service with the same CLUSTER_NAME using the “docker service create” command.

Docker’s “scale” Command

The scale command enables you to scale one or more services up or down to the desired number of replicas. The command returns immediately, but the actual scaling of the service may take some time. Galera needs to run on an odd number of nodes to avoid network partitioning.

So a good number to scale to would be 5, 7 and so on:

$ docker service scale mysql-galera=5

Wait for a couple of minutes to let the new containers reach the desired state. Then, verify the running service:

$ docker service ls
ID            NAME          REPLICAS  IMAGE               COMMAND
bwvwjg248i9u  mysql-galera  5/5       severalnines/pxc56

One drawback of this method is that you have to use ephemeral storage, because Docker will likely schedule the new containers on a Docker host that already has a Galera container running. If this happens, the new volume would overlap with the existing Galera container’s volume. If you would like to use persistent storage and scale in Docker Swarm mode, you should create another new service with a couple of different options, as described in the next section.

At this point, our architecture looks like this:

Another Service with Same Cluster Name

Another way to scale is to create another service with the same CLUSTER_NAME and network. However, you can’t really use the exact same command as the first one, due to the following reasons:

  • The service name should be unique.
  • The port mapping must be other than 3306, since this port has been assigned to the mysql-galera service.
  • The volume name should be different to distinguish them from the existing Galera containers.

A benefit of doing this is that you will get another virtual IP address assigned to the “scaled” service. This gives your application or client an additional endpoint to connect to for various tasks, e.g., performing a full backup in desync mode, a database consistency check or server auditing.

The following example shows the command to add two more nodes to the cluster in a new service called mysql-galera-scale:

$ docker service create \
--name mysql-galera-scale \
--replicas 2 \
-p 3307:3306 \
--network galera-net \
--mount type=volume,source=galera-scale-vol,destination=/var/lib/mysql \
--env MYSQL_ROOT_PASSWORD=mypassword \
--env DISCOVERY_SERVICE=192.168.55.111:2379,192.168.55.112:2379,192.168.55.113:2379 \
--env XTRABACKUP_PASSWORD=mypassword \
--env CLUSTER_NAME=my_wsrep_cluster \
severalnines/pxc56

If we look at the service list, here is what we see:

$ docker service ls
ID            NAME                REPLICAS  IMAGE               COMMAND
0ii5bedv15dh  mysql-galera-scale  2/2       severalnines/pxc56
71pyjdhfg9js  mysql-galera        3/3       severalnines/pxc56

And when you look at the cluster size on one of the containers, you should see 5:

[root@docker1 ~]# docker exec -it $(docker ps | grep mysql-galera | awk '{print $1}') mysql -uroot -pmypassword -e 'show status like "wsrep_cluster_size"'
Warning: Using a password on the command line interface can be insecure.
+--------------------+-------+
| Variable_name      | Value |
+--------------------+-------+
| wsrep_cluster_size | 5     |
+--------------------+-------+

At this point, our architecture looks like this:

To get a clearer view of the process, we can simply look at the MySQL error log file (located under Docker’s data volume) on one of the running containers, for example:

$ tail -f /var/lib/docker/volumes/galera-vol/_data/error.log

Scale Down

Scaling down is simple: reduce the number of replicas, or remove the service that holds the minority of the containers, to ensure that Galera stays in quorum. For example, if you have fired up two groups of nodes with 3 + 2 containers, reaching a total of 5, the majority needs to survive, so you can only remove the second group with 2 containers. If you have three groups with 3 + 2 + 2 containers, you can lose a maximum of 3 containers. This is due to the fact that the Docker Swarm scheduler simply terminates and removes the containers corresponding to the service; this makes Galera think that nodes are failing, as they are not shut down gracefully.

If you scaled up using the “docker service scale” command, you should scale down using the same method, by reducing the number of replicas:

$ docker service scale mysql-galera=3

Otherwise, if you chose to create another service to scale up, then simply remove the respective service to scale down:

$ docker service rm mysql-galera-scale

Known Limitations

There will be no automatic recovery if a split-brain happens (where all nodes end up in the Non-Primary state). This is because the MySQL service is still running, yet it will refuse to serve any data and will return an error to the client. Docker has no capability to detect this, since all it cares about is the foreground MySQL process, which is not terminated, killed or stopped. Automating this process is risky, especially if the service discovery is co-located with the Docker host (etcd would also lose contact with the other members). Even if the service discovery is healthy somewhere else, it might be unreachable from the Galera containers’ perspective, preventing them from seeing each other’s status correctly during the glitch.

In this case, you will need to intervene manually.

Choose the most advanced node to bootstrap from, then run the following command to promote the node to Primary (other nodes will rejoin automatically if the network recovers):

$ docker exec -it [container ID] mysql -uroot -pyoursecret -e 'set global wsrep_provider_option="pc.bootstrap=1"'

Also, there is no automatic cleanup for the discovery service registry. You can remove all entries using either of the following commands (assuming the CLUSTER_NAME is my_wsrep_cluster):

$ curl http://192.168.55.111:2379/v2/keys/galera/my_wsrep_cluster?recursive=true -XDELETE # or
$ etcdctl rm /galera/my_wsrep_cluster --recursive

Conclusion

This combination of technologies opens the door to a more reliable database setup in the Docker ecosystem. Working with service discovery to store state makes it possible for stateful containers to achieve a homogeneous setup.

In the next blog post, we are going to look into how to manage Galera Cluster on Docker.

MySQL on Docker - How to Containerize Your Database - New Whitepaper

Severalnines is happy to announce that our new whitepaper “MySQL on Docker - How to Containerize Your Database” is now available to download for free!

While the idea of containers has been around since the early days of Unix, Docker made waves in 2013 when it hit the market with its innovative solution. Docker allows you to add your stacks and applications to containers where they share a common operating system kernel. This lets you have a lightweight virtualized system with almost zero overhead. Docker also lets you bring containers up or down in seconds, making for rapid deployment of your stack.

Download whitepaper

Severalnines has been experimenting with and writing about how to utilize Docker for MySQL in our MySQL on Docker Blog Series since 2014.

This new white paper is the culmination of years of work by our team trying to understand how best to deploy and manage MySQL on Docker while utilizing the advanced monitoring and management features found in ClusterControl.

The topics covered in this white paper are...

  • An Introduction to Docker
  • MySQL Docker Images
  • Networking in Docker
  • Understanding MySQL Containers & Volumes
  • Monitoring and Management of MySQL in Docker
    • Docker Security
    • Backup and Restores
  • Running ClusterControl on Docker

If your organization uses, or plans on taking advantage of, the latest in Docker container technology in conjunction with its open source MySQL databases, this whitepaper will help you better understand what you need to do to get started.

ClusterControl on Docker

ClusterControl provides advanced management and monitoring functionality to get your MySQL replication and clustered instances up-and-running using proven methodologies that you can depend on to work. Used in conjunction with other orchestration tools for deployment to the containers, ClusterControl makes managing your open source databases easy with point-and-click interfaces and no need to have specialized knowledge about the technology.

ClusterControl delivers on an array of features to help manage and monitor your open source database environments:

  • Management & Monitoring: ClusterControl provides management features to repair and recover broken nodes, as well as test and automate MySQL upgrades.
  • Advanced Monitoring: ClusterControl provides a unified view of all MySQL nodes and clusters across all your data centers and lets you drill down into individual nodes for more detailed statistics.
  • Automatic Failure Detection and Handling: ClusterControl takes care of your replication clusters’ health. If a master failure is detected, ClusterControl automatically promotes one of the available slaves to ensure your cluster is always up.

Learn more about how ClusterControl can enhance performance here or pull the Docker Image here.

MySQL on Docker: Running a MariaDB Galera Cluster without Container Orchestration Tools - Part 1

Container orchestration tools simplify the running of a distributed system by deploying and redeploying containers and handling any failures that occur. One might need to move applications around, e.g., to handle updates, scaling, or underlying host failures. While this sounds great, it does not always work well with a strongly consistent database cluster like Galera. You can’t just move database nodes around; they are not stateless applications. Also, the order in which you perform operations on a cluster has high significance. For instance, restarting a Galera cluster has to start from the most advanced node, or else you will lose data. Therefore, we’ll show you how to run Galera Cluster on Docker without a container orchestration tool, so you have total control.

In this blog post, we are going to look into how to run a MariaDB Galera Cluster on Docker containers using the standard Docker image on multiple Docker hosts, without the help of orchestration tools like Swarm or Kubernetes. This approach is similar to running a Galera Cluster on standard hosts, but the process management is configured through Docker.

Before we jump further into details, we assume you have installed Docker, disabled SElinux/AppArmor and cleared up the rules inside iptables, firewalld or ufw (whichever you are using). The following are three dedicated Docker hosts for our database cluster:

  • host1.local - 192.168.55.161
  • host2.local - 192.168.55.162
  • host3.local - 192.168.55.163

Multi-host Networking

First of all, the default Docker networking is bound to the local host. Docker Swarm introduces another networking layer called the overlay network, which extends container internetworking to multiple Docker hosts in a cluster called a Swarm. Long before this integration came into place, many network plugins were developed to support this - Flannel, Calico and Weave are some of them.

Here, we are going to use Weave as the Docker network plugin for multi-host networking. This is mainly due to its simplicity to get installed and running, and its support for a DNS resolver (containers running on this network can resolve each other’s hostnames). There are two ways to get Weave running - via systemd or through Docker. We are going to install it as a systemd unit, so it’s independent of the Docker daemon (otherwise, we would have to start Docker first before Weave gets activated).

  1. Download and install Weave:

    $ curl -L git.io/weave -o /usr/local/bin/weave
    $ chmod a+x /usr/local/bin/weave
  2. Create a systemd unit file for Weave:

    $ cat > /etc/systemd/system/weave.service << EOF
    [Unit]
    Description=Weave Network
    Documentation=http://docs.weave.works/weave/latest_release/
    Requires=docker.service
    After=docker.service
    [Service]
    EnvironmentFile=-/etc/sysconfig/weave
    ExecStartPre=/usr/local/bin/weave launch --no-restart $PEERS
    ExecStart=/usr/bin/docker attach weave
    ExecStop=/usr/local/bin/weave stop
    [Install]
    WantedBy=multi-user.target
    EOF
  3. Define IP addresses or hostname of the peers inside /etc/sysconfig/weave:

    $ echo 'PEERS="192.168.55.161 192.168.55.162 192.168.55.163"'> /etc/sysconfig/weave
  4. Start and enable Weave on boot:

    $ systemctl start weave
    $ systemctl enable weave

Repeat the above 4 steps on all Docker hosts. Verify with the following command once done:

$ weave status

The number of peers is what we are looking for. It should be 3:

          ...
          Peers: 3 (with 6 established connections)
          ...

Running a Galera Cluster

Now that the network is ready, it’s time to fire up our database containers and form a cluster. The basic rules are:

  • Containers must be created under --net=weave to have multi-host connectivity.
  • Container ports that need to be published are 3306, 4444, 4567, 4568.
  • The Docker image must support Galera. If you'd like to use Oracle MySQL, then get the Codership version. If you'd like Percona's, use this image instead. In this blog post, we are using MariaDB's.

The reasons we chose MariaDB as the Galera cluster vendor are:

  • Galera is embedded into MariaDB, starting from MariaDB 10.1.
  • The MariaDB image is maintained by the Docker and MariaDB teams.
  • One of the most popular Docker images out there.

Bootstrapping a Galera Cluster has to be performed in sequence. First, the most up-to-date node must be started with "wsrep_cluster_address=gcomm://". Then, start the remaining nodes with a full address consisting of all nodes in the cluster, e.g., "wsrep_cluster_address=gcomm://node1,node2,node3". To accomplish these steps using containers, we have to do some extra work to ensure all containers are running homogeneously. So the plan is:

  1. We would need to start with 4 containers in this order - mariadb0 (bootstrap), mariadb2, mariadb3, mariadb1.
  2. Container mariadb0 will use the same datadir and configdir as mariadb1.
  3. Use mariadb0 on host1 for the first bootstrap, then start mariadb2 on host2, mariadb3 on host3.
  4. Remove mariadb0 on host1 to give way for mariadb1.
  5. Lastly, start mariadb1 on host1.

At the end of the day, you would have a three-node Galera Cluster (mariadb1, mariadb2, mariadb3). The first container (mariadb0) is a transient container for bootstrapping purposes only, using the cluster address "gcomm://". It shares the same datadir and configdir as mariadb1 and will be removed once the cluster is formed (mariadb2 and mariadb3 are up) and the nodes are synced.

By default, Galera is turned off in MariaDB and needs to be enabled with the wsrep_on flag (set to ON) and wsrep_provider (set to the Galera library path), plus a number of Galera-related parameters. Thus, we need to define a custom configuration file for the container to configure Galera correctly.

Let's start with the first container, mariadb0. Create a file under /containers/mariadb0/conf.d/my.cnf and add the following lines:

$ mkdir -p /containers/mariadb0/conf.d
$ cat /containers/mariadb0/conf.d/my.cnf
[mysqld]

default_storage_engine          = InnoDB
binlog_format                   = ROW

innodb_flush_log_at_trx_commit  = 0
innodb_flush_method             = O_DIRECT
innodb_file_per_table           = 1
innodb_autoinc_lock_mode        = 2
innodb_lock_schedule_algorithm  = FCFS # MariaDB >10.1.19 and >10.2.3 only

wsrep_on                        = ON
wsrep_provider                  = /usr/lib/galera/libgalera_smm.so
wsrep_sst_method                = xtrabackup-v2

Since the image doesn’t come with MariaDB Backup (the preferred SST method for MariaDB 10.1 and MariaDB 10.2), we are going to stick with xtrabackup-v2 for the time being.

To perform the first bootstrap for the cluster, run the bootstrap container (mariadb0) on host1:

$ docker run -d \
        --name mariadb0 \
        --hostname mariadb0.weave.local \
        --net weave \
        --publish "3306" \
        --publish "4444" \
        --publish "4567" \
        --publish "4568" \
        $(weave dns-args) \
        --env MYSQL_ROOT_PASSWORD="PM7%cB43$sd@^1" \
        --env MYSQL_USER=proxysql \
        --env MYSQL_PASSWORD=proxysqlpassword \
        --volume /containers/mariadb1/datadir:/var/lib/mysql \
        --volume /containers/mariadb1/conf.d:/etc/mysql/mariadb.conf.d \
        mariadb:10.2.15 \
        --wsrep_cluster_address=gcomm:// \
        --wsrep_sst_auth="root:PM7%cB43$sd@^1" \
        --wsrep_node_address=mariadb0.weave.local

The parameters used in the above command are:

  • --name, creates the container named "mariadb0",
  • --hostname, assigns the container a hostname "mariadb0.weave.local",
  • --net, places the container in the weave network for multi-host networking support,
  • --publish, exposes ports 3306, 4444, 4567, 4568 on the container to the host,
  • $(weave dns-args), configures DNS resolver for this container. This command can be translated into Docker run as "--dns=172.17.0.1 --dns-search=weave.local.",
  • --env MYSQL_ROOT_PASSWORD, the MySQL root password,
  • --env MYSQL_USER, creates "proxysql" user to be used later with ProxySQL for database routing,
  • --env MYSQL_PASSWORD, the "proxysql" user password,
  • --volume /containers/mariadb1/datadir:/var/lib/mysql, creates /containers/mariadb1/datadir if it does not exist and maps it to /var/lib/mysql (the MySQL datadir) of the container (for the bootstrap node, this could be skipped),
  • --volume /containers/mariadb1/conf.d:/etc/mysql/mariadb.conf.d, mounts the files under directory /containers/mariadb1/conf.d of the Docker host, into the container at /etc/mysql/mariadb.conf.d.
  • mariadb:10.2.15, uses MariaDB 10.2.15 image from here,
  • --wsrep_cluster_address, Galera connection string for the cluster. "gcomm://" means bootstrap. For the rest of the containers, we are going to use a full address instead.
  • --wsrep_sst_auth, authentication string for SST user. Use the same user as root,
  • --wsrep_node_address, the node hostname, in this case we are going to use the FQDN provided by Weave.

The bootstrap container has several key characteristics:

  • The name, hostname and wsrep_node_address are mariadb0, but it uses the volumes of mariadb1.
  • The cluster address is "gcomm://".
  • There are two additional --env parameters - MYSQL_USER and MYSQL_PASSWORD. These parameters create an additional user for our ProxySQL monitoring purposes.

Verify with the following command:

$ docker ps
$ docker logs -f mariadb0

Once you see the following line, it indicates the bootstrap process is completed and Galera is active:

2018-05-30 23:19:30 139816524539648 [Note] WSREP: Synchronized with group, ready for connections

Create the directory to hold our custom configuration file on the remaining hosts:

$ mkdir -p /containers/mariadb2/conf.d # on host2
$ mkdir -p /containers/mariadb3/conf.d # on host3

Then, copy the my.cnf that we've created for mariadb0 and mariadb1 over to mariadb2 and mariadb3 respectively (the remote paths assume SSH access between the hosts):

$ scp /containers/mariadb1/conf.d/my.cnf host2:/containers/mariadb2/conf.d/ # on host1
$ scp /containers/mariadb1/conf.d/my.cnf host3:/containers/mariadb3/conf.d/ # on host1

Next, create another 2 database containers (mariadb2 and mariadb3) on host2 and host3 respectively:

$ docker run -d \
        --name ${NAME} \
        --hostname ${NAME}.weave.local \
        --net weave \
        --publish "3306:3306" \
        --publish "4444" \
        --publish "4567" \
        --publish "4568" \
        $(weave dns-args) \
        --env MYSQL_ROOT_PASSWORD="PM7%cB43$sd@^1" \
        --volume /containers/${NAME}/datadir:/var/lib/mysql \
        --volume /containers/${NAME}/conf.d:/etc/mysql/mariadb.conf.d \
        mariadb:10.2.15 \
        --wsrep_cluster_address=gcomm://mariadb0.weave.local,mariadb1.weave.local,mariadb2.weave.local,mariadb3.weave.local \
        --wsrep_sst_auth="root:PM7%cB43$sd@^1" \
        --wsrep_node_address=${NAME}.weave.local

** Replace ${NAME} with mariadb2 or mariadb3 respectively.

However, there is a catch: the entrypoint script checks the mysqld service in the background after the database initialization, using the MySQL root user without a password. Since Galera automatically performs synchronization through SST or IST when starting up, the MySQL root user password will change to mirror the bootstrapped node. Thus, you would see the following error during the first startup:

2018-05-30 23:27:13 140003794790144 [Warning] Access denied for user 'root'@'localhost' (using password: NO)
MySQL init process in progress…
MySQL init process failed.

The trick is to restart the failed containers once more, because this time the MySQL datadir already exists (created in the first run attempt), so the database initialization part will be skipped:

$ docker start mariadb2 # on host2
$ docker start mariadb3 # on host3

Once started, verify by looking for the following line:

$ docker logs -f mariadb2
…
2018-05-30 23:28:39 139808069601024 [Note] WSREP: Synchronized with group, ready for connections

At this point, there are three containers running: mariadb0, mariadb2 and mariadb3. Take note that mariadb0 was started using the bootstrap command (gcomm://), which means that if the container is automatically restarted by Docker in the future, it could potentially become disjointed from the primary component. Thus, we need to remove this container and replace it with mariadb1, using the same Galera connection string as the rest, and the same datadir and configdir as mariadb0.

First, stop mariadb0 by sending it SIGTERM (to ensure the node shuts down gracefully):

$ docker kill -s 15 mariadb0

Then, start mariadb1 on host1 using a similar command as for mariadb2 or mariadb3:

$ docker run -d \
        --name mariadb1 \
        --hostname mariadb1.weave.local \
        --net weave \
        --publish "3306:3306" \
        --publish "4444" \
        --publish "4567" \
        --publish "4568" \
        $(weave dns-args) \
        --env MYSQL_ROOT_PASSWORD="PM7%cB43$sd@^1" \
        --volume /containers/mariadb1/datadir:/var/lib/mysql \
        --volume /containers/mariadb1/conf.d:/etc/mysql/mariadb.conf.d \
        mariadb:10.2.15 \
        --wsrep_cluster_address=gcomm://mariadb0.weave.local,mariadb1.weave.local,mariadb2.weave.local,mariadb3.weave.local \
        --wsrep_sst_auth="root:PM7%cB43$sd@^1" \
        --wsrep_node_address=mariadb1.weave.local

This time, you don't need to do the restart trick, because the MySQL datadir already exists (created by mariadb0). Once the container is started, verify that the cluster size is 3, the status is Primary and the local state is Synced:

$ docker exec -it mariadb3 mysql -uroot "-pPM7%cB43$sd@^1" -e 'select variable_name, variable_value from information_schema.global_status where variable_name in ("wsrep_cluster_size", "wsrep_local_state_comment", "wsrep_cluster_status", "wsrep_incoming_addresses")'
+---------------------------+-------------------------------------------------------------------------------+
| variable_name             | variable_value                                                                |
+---------------------------+-------------------------------------------------------------------------------+
| WSREP_CLUSTER_SIZE        | 3                                                                             |
| WSREP_CLUSTER_STATUS      | Primary                                                                       |
| WSREP_INCOMING_ADDRESSES  | mariadb1.weave.local:3306,mariadb3.weave.local:3306,mariadb2.weave.local:3306 |
| WSREP_LOCAL_STATE_COMMENT | Synced                                                                        |
+---------------------------+-------------------------------------------------------------------------------+

At this point, our architecture is looking something like this:

Although the run command is pretty long, it describes the container's characteristics well. It's probably a good idea to wrap the command in a script to simplify the execution steps, or use a compose file instead.
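
For example, a minimal wrapper script could look like the following (a sketch only, mirroring the run command used for mariadb2 and mariadb3 above):

$ cat run_mariadb.sh
#!/bin/bash
# Start a Galera node container on this host, e.g.: ./run_mariadb.sh mariadb2
NAME=${1:?usage: $0 <container name>}
docker run -d \
        --name ${NAME} \
        --hostname ${NAME}.weave.local \
        --net weave \
        --publish "3306:3306" \
        --publish "4444" \
        --publish "4567" \
        --publish "4568" \
        $(weave dns-args) \
        --env MYSQL_ROOT_PASSWORD="PM7%cB43$sd@^1" \
        --volume /containers/${NAME}/datadir:/var/lib/mysql \
        --volume /containers/${NAME}/conf.d:/etc/mysql/mariadb.conf.d \
        mariadb:10.2.15 \
        --wsrep_cluster_address=gcomm://mariadb0.weave.local,mariadb1.weave.local,mariadb2.weave.local,mariadb3.weave.local \
        --wsrep_sst_auth="root:PM7%cB43$sd@^1" \
        --wsrep_node_address=${NAME}.weave.local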

Database Routing with ProxySQL

Now we have three database containers running. The only way to access the cluster is via the published MySQL port on each Docker host, which is 3306 (mapped to port 3306 of the container). So what happens if one of the database containers fails? You have to manually fail over the client's connections to the next available node. Depending on the application connector, you could also specify a list of nodes and let the connector do the failover and query routing for you (Connector/J, PHP mysqlnd). Otherwise, it would be a good idea to unify the database resources into a single resource that can be called a service.
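
For instance, Connector/J accepts a list of hosts in the JDBC URL and fails over between them (a standard Connector/J failover URL; the database name "mydb" is just an example):

jdbc:mysql://host1.local:3306,host2.local:3306,host3.local:3306/mydb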

This is where ProxySQL comes into the picture. ProxySQL can act as the query router, load balancing the database connections similar to what "Service" in Swarm or Kubernetes world can do. We have built a ProxySQL Docker image for this purpose and will maintain the image for every new version with our best effort.

Before we run the ProxySQL container, we have to prepare the configuration file. The following is what we have configured for proxysql1. We create a custom configuration file under /containers/proxysql1/proxysql.cnf on host1:

$ cat /containers/proxysql1/proxysql.cnf
datadir="/var/lib/proxysql"
admin_variables=
{
        admin_credentials="admin:admin"
        mysql_ifaces="0.0.0.0:6032"
        refresh_interval=2000
}
mysql_variables=
{
        threads=4
        max_connections=2048
        default_query_delay=0
        default_query_timeout=36000000
        have_compress=true
        poll_timeout=2000
        interfaces="0.0.0.0:6033;/tmp/proxysql.sock"
        default_schema="information_schema"
        stacksize=1048576
        server_version="5.1.30"
        connect_timeout_server=10000
        monitor_history=60000
        monitor_connect_interval=200000
        monitor_ping_interval=200000
        ping_interval_server=10000
        ping_timeout_server=200
        commands_stats=true
        sessions_sort=true
        monitor_username="proxysql"
        monitor_password="proxysqlpassword"
}
mysql_servers =
(
        { address="mariadb1.weave.local" , port=3306 , hostgroup=10, max_connections=100 },
        { address="mariadb2.weave.local" , port=3306 , hostgroup=10, max_connections=100 },
        { address="mariadb3.weave.local" , port=3306 , hostgroup=10, max_connections=100 },
        { address="mariadb1.weave.local" , port=3306 , hostgroup=20, max_connections=100 },
        { address="mariadb2.weave.local" , port=3306 , hostgroup=20, max_connections=100 },
        { address="mariadb3.weave.local" , port=3306 , hostgroup=20, max_connections=100 }
)
mysql_users =
(
        { username = "sbtest" , password = "password" , default_hostgroup = 10 , active = 1 }
)
mysql_query_rules =
(
        {
                rule_id=100
                active=1
                match_pattern="^SELECT .* FOR UPDATE"
                destination_hostgroup=10
                apply=1
        },
        {
                rule_id=200
                active=1
                match_pattern="^SELECT .*"
                destination_hostgroup=20
                apply=1
        },
        {
                rule_id=300
                active=1
                match_pattern=".*"
                destination_hostgroup=10
                apply=1
        }
)
scheduler =
(
        {
                id = 1
                filename = "/usr/share/proxysql/tools/proxysql_galera_checker.sh"
                active = 1
                interval_ms = 2000
                arg1 = "10"
                arg2 = "20"
                arg3 = "1"
                arg4 = "1"
                arg5 = "/var/lib/proxysql/proxysql_galera_checker.log"
        }
)

The above configuration will:

  • configure two host groups, the single-writer and multi-writer group, as defined under "mysql_servers" section,
  • send reads to all Galera nodes (hostgroup 20) while write operations will go to a single Galera server (hostgroup 10),
  • schedule the proxysql_galera_checker.sh,
  • use monitor_username and monitor_password as the monitoring credentials created when we first bootstrapped the cluster (mariadb0).

Copy the configuration file to host2, for ProxySQL redundancy:

$ mkdir -p /containers/proxysql2/ # on host2
$ scp /containers/proxysql1/proxysql.cnf /containers/proxysql2/ # on host1

Then, run the ProxySQL containers on host1 and host2 respectively:

$ docker run -d \
        --name=${NAME} \
        --publish 6033 \
        --publish 6032 \
        --restart always \
        --net=weave \
        $(weave dns-args) \
        --hostname ${NAME}.weave.local \
        -v /containers/${NAME}/proxysql.cnf:/etc/proxysql.cnf \
        -v /containers/${NAME}/data:/var/lib/proxysql \
        severalnines/proxysql

** Replace ${NAME} with proxysql1 or proxysql2 respectively.

We specified --restart=always to make the container always available regardless of its exit status, and to start it automatically when the Docker daemon starts. This makes the ProxySQL containers act like a daemon.

Verify the MySQL servers status monitored by both ProxySQL instances (OFFLINE_SOFT is expected for the single-writer host group):

$ docker exec -it proxysql1 mysql -uadmin -padmin -h127.0.0.1 -P6032 -e 'select hostgroup_id,hostname,status from mysql_servers'
+--------------+----------------------+--------------+
| hostgroup_id | hostname             | status       |
+--------------+----------------------+--------------+
| 10           | mariadb1.weave.local | ONLINE       |
| 10           | mariadb2.weave.local | OFFLINE_SOFT |
| 10           | mariadb3.weave.local | OFFLINE_SOFT |
| 20           | mariadb1.weave.local | ONLINE       |
| 20           | mariadb2.weave.local | ONLINE       |
| 20           | mariadb3.weave.local | ONLINE       |
+--------------+----------------------+--------------+

At this point, our architecture is looking something like this:

All connections coming in on port 6033 (either from host1, host2 or the container network) will be load balanced to the backend database containers by ProxySQL. If you would like to access an individual database server, use port 3306 of the physical host instead. There is no virtual IP address configured as a single endpoint for the ProxySQL service yet, but we could have that by using Keepalived, which is explained in the next section.
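
To quickly confirm the load balancing behaviour, one can send a few test queries through port 6033 using the sbtest credentials defined in the ProxySQL configuration above (a sketch that reuses the mysql client bundled inside the ProxySQL image); plain SELECTs match query rule 200 and should be spread across the reader host group (20):

$ for i in 1 2 3; do docker exec -it proxysql1 mysql -usbtest -ppassword -h127.0.0.1 -P6033 -N -e 'SELECT @@hostname'; done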

Virtual IP Address with Keepalived

Since we configured the ProxySQL containers to run on host1 and host2, we are going to use Keepalived containers to tie these hosts together and provide a virtual IP address via the host network. This allows a single endpoint for applications or clients to connect to the load balancing layer backed by ProxySQL.

As usual, create a custom configuration file for our Keepalived service. Here is the content of /containers/keepalived1/keepalived.conf:

vrrp_instance VI_DOCKER {
   interface ens33               # interface to monitor
   state MASTER
   virtual_router_id 52          # Assign one ID for this route
   priority 101
   unicast_src_ip 192.168.55.161
   unicast_peer {
      192.168.55.162
   }
   virtual_ipaddress {
      192.168.55.160             # the virtual IP
   }
}

Copy the configuration file to host2 for the second instance:

$ mkdir -p /containers/keepalived2/ # on host2
$ scp /containers/keepalived1/keepalived.conf /containers/keepalived2/ # on host1

Change the priority from 101 to 100 inside the copied configuration file on host2:

$ sed -i 's/101/100/g' /containers/keepalived2/keepalived.conf

** The higher priority instance will hold the virtual IP address (in this case, host1), until the VRRP communication is interrupted (in case host1 goes down).

Then, run the following command on host1 and host2 respectively:

$ docker run -d \
        --name=${NAME} \
        --cap-add=NET_ADMIN \
        --net=host \
        --restart=always \
        --volume /containers/${NAME}/keepalived.conf:/usr/local/etc/keepalived/keepalived.conf \
        osixia/keepalived:1.4.4

** Replace ${NAME} with keepalived1 and keepalived2.

The run command tells Docker to:

  • --name, create a container with the given name,
  • --cap-add=NET_ADMIN, add Linux capabilities for network admin scope
  • --net=host, attach the container into the host network. This will provide virtual IP address on the host interface, ens33
  • --restart=always, always keep the container running,
  • --volume=/containers/${NAME}/keepalived.conf:/usr/local/etc/keepalived/keepalived.conf, map the custom configuration file for container's usage.

After both containers are started, verify the virtual IP address existence by looking at the physical network interface of the MASTER node:

$ ip a | grep ens33
2: ens33: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000
    inet 192.168.55.161/24 brd 192.168.55.255 scope global ens33
    inet 192.168.55.160/32 scope global ens33

The clients and applications may now use the virtual IP address, 192.168.55.160, to access the database service. This virtual IP address exists on host1 at this moment. If host1 goes down, keepalived2 will take over the IP address and bring it up on host2. Take note that this Keepalived configuration does not monitor the ProxySQL containers; it only monitors the VRRP advertisements of the Keepalived peers.

At this point, our architecture is looking something like this:

Summary

So, now we have a MariaDB Galera Cluster fronted by a highly available ProxySQL service, all running on Docker containers.

In part two, we are going to look into how to manage this setup. We’ll look at how to perform operations like graceful shutdown, bootstrapping, detecting the most advanced node, failover, recovery, scaling up/down, upgrades, backup and so on. We will also discuss the pros and cons of having this setup for our clustered database service.

Happy containerizing!

Deploying PostgreSQL on a Docker Container

Introduction

Docker modernized the way we build and deploy applications. It allows us to create lightweight, portable, self-sufficient containers that can easily run any application.

This blog is intended to explain how to use Docker to run a PostgreSQL database. It doesn't cover the installation or configuration of Docker itself; please refer to the Docker installation instructions here. Some additional background can be found in our previous blog on MySQL and Docker.

Before going into the details, let’s review some terminology.

  • Dockerfile
    It contains the set of instructions/commands to install or configure the application/software.
  • Docker Image
    Docker image is built up from series of layers which represent instructions from the Dockerfile. Docker image is used as a template to create a container.
  • Linking of containers and user defined networking
    Docker uses bridge as the default networking mechanism, and the --link option to link containers to each other. To access a PostgreSQL container from an application container, one should link both containers at creation time. In this article we use user-defined networks instead, as the link feature will soon be deprecated.
  • Data persistence in Docker
    By default, data inside a container is ephemeral: whenever the container gets removed, its data is lost. Volumes are the preferred mechanism to persist data generated and used by a Docker container. Here, we are mounting a host directory inside the container where all the data is stored.

Let’s start to build our PostgreSQL image and use it to run a container.

PostgreSQL Dockerfile

# example Dockerfile for https://docs.docker.com/engine/examples/postgresql_service/


FROM ubuntu:14.04

# Add the PostgreSQL PGP key to verify their Debian packages.
# It should be the same key as https://www.postgresql.org/media/keys/ACCC4CF8.asc
RUN apt-key adv --keyserver hkp://p80.pool.sks-keyservers.net:80 --recv-keys B97B0AFCAA1A47F044F244A07FCC7D46ACCC4CF8

# Add PostgreSQL's repository. It contains the most recent stable release
#     of PostgreSQL, ``9.3``.
RUN echo "deb http://apt.postgresql.org/pub/repos/apt/ precise-pgdg main"> /etc/apt/sources.list.d/pgdg.list

# Install ``python-software-properties``, ``software-properties-common`` and PostgreSQL 9.3
#  There are some warnings (in red) that show up during the build. You can hide
#  them by prefixing each apt-get statement with DEBIAN_FRONTEND=noninteractive
RUN apt-get update && apt-get install -y python-software-properties software-properties-common postgresql-9.3 postgresql-client-9.3 postgresql-contrib-9.3

# Note: The official Debian and Ubuntu images automatically ``apt-get clean``
# after each ``apt-get``

# Run the rest of the commands as the ``postgres`` user created by the ``postgres-9.3`` package when it was ``apt-get installed``
USER postgres

# Create a PostgreSQL role named ``postgresondocker`` with ``postgresondocker`` as the password and
# then create a database `postgresondocker` owned by the ``postgresondocker`` role.
# Note: here we use ``&&\`` to run commands one after the other - the ``\``
#       allows the RUN command to span multiple lines.
RUN    /etc/init.d/postgresql start &&\
    psql --command "CREATE USER postgresondocker WITH SUPERUSER PASSWORD 'postgresondocker';"&&\
    createdb -O postgresondocker postgresondocker

# Adjust PostgreSQL configuration so that remote connections to the
# database are possible.
RUN echo "host all  all    0.0.0.0/0  md5">> /etc/postgresql/9.3/main/pg_hba.conf

# And add ``listen_addresses`` to ``/etc/postgresql/9.3/main/postgresql.conf``
RUN echo "listen_addresses='*'">> /etc/postgresql/9.3/main/postgresql.conf

# Expose the PostgreSQL port
EXPOSE 5432

# Add VOLUMEs to allow backup of config, logs and databases
VOLUME  ["/etc/postgresql", "/var/log/postgresql", "/var/lib/postgresql"]

# Set the default command to run when starting the container
CMD ["/usr/lib/postgresql/9.3/bin/postgres", "-D", "/var/lib/postgresql/9.3/main", "-c", "config_file=/etc/postgresql/9.3/main/postgresql.conf"]

If you look at the Dockerfile closely, it consists of the commands used to install PostgreSQL and perform some configuration changes on the Ubuntu OS.

Building PostgreSQL Image

We can build a PostgreSQL image from this Dockerfile using the docker build command.

# sudo docker build -t postgresondocker:9.3 .

Here, we can specify a tag (-t) for the image, i.e. a name and version. The dot (.) at the end specifies the current directory, and the Dockerfile present in the current directory is used. The Docker file should be named "Dockerfile" by default; if you want to use a custom name for your Dockerfile, pass -f <your_dockerfile_name> to the docker build command.

# sudo docker build -t postgresondocker:9.3 -f <your_docker_file_name>

Output:

Sending build context to Docker daemon  4.096kB
Step 1/11 : FROM ubuntu:14.04
14.04: Pulling from library/ubuntu
324d088ce065: Pull complete 
2ab951b6c615: Pull complete 
9b01635313e2: Pull complete 
04510b914a6c: Pull complete 
83ab617df7b4: Pull complete 
Digest: sha256:b8855dc848e2622653ab557d1ce2f4c34218a9380cceaa51ced85c5f3c8eb201
Status: Downloaded newer image for ubuntu:14.04
 ---> 8cef1fa16c77
Step 2/11 : RUN apt-key adv --keyserver hkp://p80.pool.sks-keyservers.net:80 --recv-keys B97B0AFCAA1A47F044F244A07FCC7D46ACCC4CF8
 ---> Running in ba933d07e226
.
.
.
fixing permissions on existing directory /var/lib/postgresql/9.3/main ... ok
creating subdirectories ... ok
selecting default max_connections ... 100
selecting default shared_buffers ... 128MB
creating configuration files ... ok
creating template1 database in /var/lib/postgresql/9.3/main/base/1 ... ok
initializing pg_authid ... ok
initializing dependencies ... ok
creating system views ... ok
loading system objects' descriptions ... ok
creating collations ... ok
creating conversions ... ok
creating dictionaries ... ok
setting privileges on built-in objects ... ok
creating information schema ... ok
loading PL/pgSQL server-side language ... ok
vacuuming database template1 ... ok
copying template1 to template0 ... ok
copying template1 to postgres ... ok
syncing data to disk ... ok

Success. You can now start the database server using:

    /usr/lib/postgresql/9.3/bin/postgres -D /var/lib/postgresql/9.3/main
or
    /usr/lib/postgresql/9.3/bin/pg_ctl -D /var/lib/postgresql/9.3/main -l logfile start

Ver Cluster Port Status Owner    Data directory               Log file
9.3 main    5432 down   postgres /var/lib/postgresql/9.3/main /var/log/postgresql/postgresql-9.3-main.log
update-alternatives: using /usr/share/postgresql/9.3/man/man1/postmaster.1.gz to provide /usr/share/man/man1/postmaster.1.gz (postmaster.1.gz) in auto mode
invoke-rc.d: policy-rc.d denied execution of start.
Setting up postgresql-contrib-9.3 (9.3.22-0ubuntu0.14.04) ...
Setting up python-software-properties (0.92.37.8) ...
Setting up python3-software-properties (0.92.37.8) ...
Setting up software-properties-common (0.92.37.8) ...
Processing triggers for libc-bin (2.19-0ubuntu6.14) ...
Processing triggers for ca-certificates (20170717~14.04.1) ...
Updating certificates in /etc/ssl/certs... 148 added, 0 removed; done.
Running hooks in /etc/ca-certificates/update.d....done.
Processing triggers for sgml-base (1.26+nmu4ubuntu1) ...
Removing intermediate container fce692f180bf
 ---> 9690b681044b
Step 5/11 : USER postgres
 ---> Running in ff8864c1147d
Removing intermediate container ff8864c1147d
 ---> 1f669efeadfa
Step 6/11 : RUN    /etc/init.d/postgresql start &&    psql --command "CREATE USER postgresondocker WITH SUPERUSER PASSWORD 'postgresondocker';"&&    createdb -O postgresondocker postgresondocker
 ---> Running in 79042024b5e8
 * Starting PostgreSQL 9.3 database server
   ...done.
CREATE ROLE
Removing intermediate container 79042024b5e8
 ---> 70c43a9dd5ab
Step 7/11 : RUN echo "host all  all    0.0.0.0/0  md5">> /etc/postgresql/9.3/main/pg_hba.conf
 ---> Running in c4d03857cdb9
Removing intermediate container c4d03857cdb9
 ---> 0cc2ed249aab
Step 8/11 : RUN echo "listen_addresses='*'">> /etc/postgresql/9.3/main/postgresql.conf
 ---> Running in fde0f721c846
Removing intermediate container fde0f721c846
 ---> 78263aef9a56
Step 9/11 : EXPOSE 5432
 ---> Running in a765f854a274
Removing intermediate container a765f854a274
 ---> d205f9208162
Step 10/11 : VOLUME  ["/etc/postgresql", "/var/log/postgresql", "/var/lib/postgresql"]
 ---> Running in ae0b9f30f3d0
Removing intermediate container ae0b9f30f3d0
 ---> 0de941f8687c
Step 11/11 : CMD ["/usr/lib/postgresql/9.3/bin/postgres", "-D", "/var/lib/postgresql/9.3/main", "-c", "config_file=/etc/postgresql/9.3/main/postgresql.conf"]
 ---> Running in 976d283ea64c
Removing intermediate container 976d283ea64c
 ---> 253ee676278f
Successfully built 253ee676278f
Successfully tagged postgresondocker:9.3

Container Network Creation

Use the command below to create a user-defined network with the bridge driver.

# sudo docker network create --driver bridge postgres-network

Confirm Network Creation

# sudo docker network ls
NETWORK ID          NAME                DRIVER              SCOPE
a553e5727617        bridge              bridge              local
0c6e40305851        host                host                local
4cca2679d3c0        none                null                local
83b23e0af641        postgres-network    bridge              local

Container Creation

We need to use the "docker run" command to create a container from the Docker image. We run the Postgres container in detached (daemonized) mode with the help of the -d option.

# sudo docker run --name postgresondocker --network postgres-network -d postgresondocker:9.3

Use the command below to confirm the container creation.

# sudo docker container ls 
CONTAINER ID        IMAGE                  COMMAND                  CREATED              STATUS              PORTS               NAMES
06a5125f5e11        postgresondocker:9.3   "/usr/lib/postgresql…"   About a minute ago   Up About a minute   5432/tcp            postgresondocker

We have not published any port, so the container only exposes the default Postgres port 5432 for internal use. PostgreSQL is available only from inside the Docker network; we will not be able to access this Postgres container on a host port.

We will see how to access the Postgres container on a host port in a later section of this article.

Connecting to PostgreSQL container inside Docker network

Let's try to connect to the Postgres container from another container within the same Docker network which we created earlier. Here, we use the psql client to connect to Postgres, with the Postgres container name as the hostname, and the user and password defined in the Dockerfile.

# docker run -it --rm --network postgres-network postgresondocker:9.3 psql -h postgresondocker -U postgresondocker --password
Password for user postgresondocker: 
psql (9.3.22)
SSL connection (cipher: DHE-RSA-AES256-GCM-SHA384, bits: 256)
Type "help" for help.

postgresondocker=# 

The --rm option in the run command will remove the container once we terminate the psql process.

# sudo docker container ls 
CONTAINER ID        IMAGE                  COMMAND                  CREATED              STATUS              PORTS               NAMES
2fd91685d1ea        postgresondocker:9.3   "psql -h postgresond…"   29 seconds ago       Up 30 seconds       5432/tcp            brave_spence
06a5125f5e11        postgresondocker:9.3   "/usr/lib/postgresql…"   About a minute ago   Up About a minute   5432/tcp            postgresondocker

Data persistence

Docker containers are ephemeral in nature, i.e. data which is used or generated by the container is not stored anywhere implicitly. We lose the data whenever the container gets removed. Docker provides volumes on which we can store persistent data. This is a useful feature by which we can provision another container using the same volume or data in case of disaster.

Let's create a data volume and confirm its creation.

# sudo docker volume create pgdata
pgdata

# sudo docker volume ls
DRIVER              VOLUME NAME
local                   pgdata

Now we have to use this data volume while running the Postgres container. Make sure you delete the older Postgres container first, which was running without a volume.

# sudo docker container rm postgresondocker -f 
postgresondocker

# sudo docker run --name postgresondocker --network postgres-network -v pgdata:/var/lib/postgresql/9.3/main -d postgresondocker:9.3

We have now run the Postgres container with a data volume attached to it.

Create a new table in Postgres to check data persistence.

# docker run -it --rm --network postgres-network postgresondocker:9.3 psql -h postgresondocker -U postgresondocker --password
Password for user postgresondocker: 
psql (9.3.22)
SSL connection (cipher: DHE-RSA-AES256-GCM-SHA384, bits: 256)
Type "help" for help.

postgresondocker=# \dt
No relations found.
postgresondocker=# create table test(id int);
CREATE TABLE
postgresondocker=# \dt 
            List of relations
 Schema | Name | Type  |      Owner       
--------+------+-------+------------------
 public | test | table | postgresondocker
(1 row)

Delete the Postgres container.

# sudo docker container rm postgresondocker -f 
postgresondocker

Create a new Postgres container and confirm whether the test table is still present.

# sudo docker run --name postgresondocker --network postgres-network -v pgdata:/var/lib/postgresql/9.3/main -d postgresondocker:9.3


# docker run -it --rm --network postgres-network postgresondocker:9.3 psql -h postgresondocker -U postgresondocker --password
Password for user postgresondocker: 
psql (9.3.22)
SSL connection (cipher: DHE-RSA-AES256-GCM-SHA384, bits: 256)
Type "help" for help.

postgresondocker=# \dt
            List of relations
 Schema | Name | Type  |      Owner       
--------+------+-------+------------------
 public | test | table | postgresondocker
(1 row)

Expose PostgreSQL service to the host

You may have noticed that we have not exposed any port of the PostgreSQL container earlier. This means that PostgreSQL is only accessible to the containers that are in the postgres-network we created earlier.

To use the PostgreSQL service from the host, we need to publish the container port using the -p (--publish) option. Here, we have published the Postgres container port 5432 on port 5432 of the host.

# sudo docker run --name postgresondocker --network postgres-network -v pgdata:/var/lib/postgresql/9.3/main -p 5432:5432 -d postgresondocker:9.3
# sudo docker container ls
CONTAINER ID        IMAGE                  COMMAND                  CREATED             STATUS              PORTS                    NAMES
997580c86188        postgresondocker:9.3   "/usr/lib/postgresql…"   8 seconds ago       Up 10 seconds       0.0.0.0:5432->5432/tcp   postgresondocker

Now you can connect to PostgreSQL on localhost directly.

# psql -h localhost -U postgresondocker --password
Password for user postgresondocker: 
psql (9.3.22)
SSL connection (cipher: DHE-RSA-AES256-GCM-SHA384, bits: 256)
Type "help" for help.

postgresondocker=#

Container Deletion

To delete the container, we need to stop the running container first and then delete it using the rm command.

# sudo docker container stop postgresondocker 

# sudo docker container rm postgresondocker
postgresondocker

Use the -f (--force) option to delete the running container directly.

# sudo docker container rm postgresondocker -f
postgresondocker

Hopefully, you now have your own dockerized local environment for PostgreSQL.

Note: This article provides an overview of how to use PostgreSQL on Docker for a development/POC environment. Running PostgreSQL in a production environment may require additional changes to the PostgreSQL or Docker configurations.

Conclusion

Running a PostgreSQL database inside a Docker container is simple. Docker effectively encapsulates deployment, configuration and certain administration procedures, and is a good choice to deploy PostgreSQL with minimal effort. All you need to do is start a pre-built Docker container and you will have a PostgreSQL database ready for your service.

MySQL on Docker: Running a MariaDB Galera Cluster without Orchestration Tools - DB Container Management - Part 2

As we saw in the first part of this blog, a strongly consistent database cluster like Galera does not play well with container orchestration tools like Kubernetes or Swarm. We showed you how to deploy Galera and configure process management for Docker, so you retain full control of the behaviour. This blog post is the continuation of that; we are going to look into the operation and maintenance of the cluster.

To recap some of the main points from part 1 of this blog, we deployed a three-node Galera cluster with ProxySQL and Keepalived on three different Docker hosts, where all MariaDB instances run as Docker containers. The following diagram illustrates the final deployment:

Graceful Shutdown

To perform a graceful MySQL shutdown, the best way is to send SIGTERM (signal 15) to the container:

$ docker kill -s 15 {db_container_name}

If you would like to shut down the cluster, repeat the above command on all database containers, one node at a time. The above is similar to performing "systemctl stop mysql" for a systemd-managed MariaDB service. Using the "docker stop" command is pretty risky for a database service, because it only waits for a 10-second timeout, after which Docker will force a SIGKILL (unless you use a proper --time value).
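
If you still prefer "docker stop", increase the timeout so MariaDB has enough time to shut down cleanly, for example:

$ docker stop --time=600 mariadb2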

The last node that shuts down gracefully will have a seqno not equal to -1 and the safe_to_bootstrap flag set to 1 in /{datadir volume}/grastate.dat on the Docker host, for example on host2:

$ cat /containers/mariadb2/datadir/grastate.dat
# GALERA saved state
version: 2.1
uuid:    e70b7437-645f-11e8-9f44-5b204e58220b
seqno:   7099
safe_to_bootstrap: 1

Detecting the Most Advanced Node

If the cluster didn't shut down gracefully, or the node that you are trying to bootstrap wasn't the last node to leave the cluster, you probably won't be able to bootstrap one of the Galera nodes and might encounter the following error:

2016-11-07 01:49:19 5572 [ERROR] WSREP: It may not be safe to bootstrap the cluster from this node.
It was not the last one to leave the cluster and may not contain all the updates.
To force cluster bootstrap with this node, edit the grastate.dat file manually and set safe_to_bootstrap to 1 .

Galera honours the node that has safe_to_bootstrap flag set to 1 as the first reference node. This is the safest way to avoid data loss and ensure the correct node always gets bootstrapped.

If you get this error, we have to find out the most advanced node first, before picking it as the first node to be bootstrapped. Create a transient container (with the --rm flag), map it to the same datadir and configuration directory as the actual database container, and add two MySQL command flags, --wsrep_recover and --wsrep_cluster_address. For example, if we want to know mariadb1's last committed number, we need to run:

$ docker run --rm --name mariadb-recover \
        --env MYSQL_ROOT_PASSWORD="PM7%cB43$sd@^1" \
        --volume /containers/mariadb1/datadir:/var/lib/mysql \
        --volume /containers/mariadb1/conf.d:/etc/mysql/conf.d \
        mariadb:10.2.15 \
        --wsrep_recover \
        --wsrep_cluster_address=gcomm://
2018-06-12  4:46:35 139993094592384 [Note] mysqld (mysqld 10.2.15-MariaDB-10.2.15+maria~jessie) starting as process 1 ...
2018-06-12  4:46:35 139993094592384 [Note] InnoDB: Mutexes and rw_locks use GCC atomic builtins
...
2018-06-12  4:46:35 139993094592384 [Note] Plugin 'FEEDBACK' is disabled.
2018-06-12  4:46:35 139993094592384 [Note] Server socket created on IP: '::'.
2018-06-12  4:46:35 139993094592384 [Note] WSREP: Recovered position: e70b7437-645f-11e8-9f44-5b204e58220b:7099

The last line is what we are looking for. MariaDB prints out the cluster UUID and the sequence number of the most recently committed transaction. The node which holds the highest number is deemed as the most advanced node. Since we specified --rm, the container will be removed automatically once it exits. Repeat the above step on every Docker host by replacing the --volume path to the respective database container volumes.

Once you have compared the value reported by all database containers and decided which container is the most up-to-date node, change the safe_to_bootstrap flag to 1 inside /{datadir volume}/grastate.dat manually. Let's say all nodes are reporting the same exact sequence number, we can just pick mariadb3 to be bootstrapped by changing the safe_to_bootstrap value to 1:

$ vim /containers/mariadb3/datadir/grastate.dat
...
safe_to_bootstrap: 1

Save the file and start bootstrapping the cluster from that node, as described in the next section.

Bootstrapping the Cluster

Bootstrapping the cluster is similar to the first docker run command we used when starting up the cluster for the first time. If mariadb1 is the chosen bootstrap node, we can simply re-run the created bootstrap container:

$ docker start mariadb0 # on host1

Otherwise, if the bootstrap container does not exist on the chosen node, let's say on host2, run the bootstrap container command and map the existing mariadb2's volumes. We are using mariadb0 as the container name on host2 to indicate it is a bootstrap container:

$ docker run -d \
        --name mariadb0 \
        --hostname mariadb0.weave.local \
        --net weave \
        --publish "3306" \
        --publish "4444" \
        --publish "4567" \
        --publish "4568" \
        $(weave dns-args) \
        --env MYSQL_ROOT_PASSWORD="PM7%cB43$sd@^1" \
        --volume /containers/mariadb2/datadir:/var/lib/mysql \
        --volume /containers/mariadb2/conf.d:/etc/mysql/mariadb.conf.d \
        mariadb:10.2.15 \
        --wsrep_cluster_address=gcomm:// \
        --wsrep_sst_auth="root:PM7%cB43$sd@^1" \
        --wsrep_node_address=mariadb0.weave.local

You may notice that this command is slightly shorter compared to the previous bootstrap command described in this guide. Since we already have the proxysql user created in our first bootstrap command, we can skip these two environment variables:

  • --env MYSQL_USER=proxysql
  • --env MYSQL_PASSWORD=proxysqlpassword

Then, start the remaining MariaDB containers, stop the bootstrap container, and start the existing MariaDB container on the bootstrapped host. Basically, the order of commands would be:

$ docker start mariadb1 # on host1
$ docker start mariadb3 # on host3
$ docker stop mariadb0 # on host2
$ docker start mariadb2 # on host2

At this point, the cluster is started and is running at full capacity.

Resource Control

Memory is a very important resource in MySQL. This is where the buffers and caches are stored, and it's critical for MySQL to reduce the impact of hitting the disk too often. On the other hand, swapping is bad for MySQL performance. By default, there are no resource constraints on running containers: containers use as much of a given resource as the host's kernel allows. Another important thing is the file descriptor limit. You can increase the limit of open file descriptors, or "nofile", to something higher to cater for the number of files the MySQL server can open simultaneously. Setting this to a high value won't hurt.

To cap memory allocation and increase the file descriptor limit to our database container, one would append --memory, --memory-swap and --ulimit parameters into the "docker run" command:

$ docker kill -s 15 mariadb1
$ docker rm -f mariadb1
$ docker run -d \
        --name mariadb1 \
        --hostname mariadb1.weave.local \
        --net weave \
        --publish "3306:3306" \
        --publish "4444" \
        --publish "4567" \
        --publish "4568" \
        $(weave dns-args) \
        --memory 16g \
        --memory-swap 16g \
        --ulimit nofile:16000:16000 \
        --env MYSQL_ROOT_PASSWORD="PM7%cB43$sd@^1" \
        --volume /containers/mariadb1/datadir:/var/lib/mysql \
        --volume /containers/mariadb1/conf.d:/etc/mysql/mariadb.conf.d \
        mariadb:10.2.15 \
        --wsrep_cluster_address=gcomm://mariadb0.weave.local,mariadb1.weave.local,mariadb2.weave.local,mariadb3.weave.local \
        --wsrep_sst_auth="root:PM7%cB43$sd@^1" \
        --wsrep_node_address=mariadb1.weave.local

Take note that if --memory-swap is set to the same value as --memory, and --memory is set to a positive integer, the container will not have access to swap. This is because --memory-swap is the amount of combined memory and swap that can be used, while --memory is only the amount of physical memory. If --memory-swap is not set, container swap defaults to twice the --memory value.
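
To verify that the limits were applied, inspect the container's HostConfig; the values are reported in bytes, so 16g should show up as 17179869184:

$ docker inspect --format '{{ .HostConfig.Memory }} {{ .HostConfig.MemorySwap }}' mariadb1
17179869184 17179869184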

Some of the container resources like memory and CPU can be controlled dynamically through "docker update" command, as shown in the following example to upgrade the memory of container mariadb1 to 32G on-the-fly:

$ docker update \
    --memory 32g \
    --memory-swap 32g \
    mariadb1

Do not forget to tune the my.cnf accordingly to suit the new specs. Configuration management is explained in the next section.

Configuration Management

Most of the MySQL/MariaDB configuration parameters can be changed at runtime, which means you don't need to restart the server to apply the changes. Check out the MariaDB documentation page for details. A parameter listed with "Dynamic: Yes" is applied immediately upon change, without the need to restart the MariaDB server. Otherwise, set the parameters inside the custom configuration file on the Docker host. For example, on mariadb3, make the changes to the following file:

$ vim /containers/mariadb3/conf.d/my.cnf

And then restart the database container to apply the change:

$ docker restart mariadb3

Verify that the container starts the process properly by looking at the docker logs. Perform this operation on one node at a time if you would like to make cluster-wide changes.
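
For dynamic variables, the change can be applied immediately on the running container instead, for example (remember to also write the new value into the custom configuration file, otherwise it will be lost on the next restart):

$ docker exec -it mariadb3 mysql -uroot -p -e 'SET GLOBAL max_connections = 500'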

Backup

Taking a logical backup is pretty straightforward because the MariaDB image also comes with the mysqldump binary. You simply use the "docker exec" command to run mysqldump and send the output to a file relative to the host path. Avoid allocating a pseudo-terminal (-t) here, since it would mangle the dump output with carriage returns. The following command performs a mysqldump backup on mariadb2 and saves it to /backups/mariadb2 inside host2:

$ docker exec -i mariadb2 mysqldump -uroot -p --single-transaction > /backups/mariadb2/dump.sql

A binary backup tool like Percona Xtrabackup or MariaDB Backup requires direct access to the MariaDB data directory. You have to either install this tool inside the container, run it from the host machine, or use a dedicated image for this purpose, like the "perconalab/percona-xtrabackup" image, to create the backup and store it inside /tmp/backup on the Docker host:

$ docker run --rm -it \
    -v /containers/mariadb2/datadir:/var/lib/mysql \
    -v /tmp/backup:/xtrabackup_backupfiles \
    perconalab/percona-xtrabackup \
    --backup --host=mariadb2 --user=root --password=mypassword

You can also stop the container with innodb_fast_shutdown set to 0 and copy over the datadir volume to another location in the physical host:

$ docker exec -it mariadb2 mysql -uroot -p -e 'SET GLOBAL innodb_fast_shutdown = 0'
$ docker kill -s 15 mariadb2
$ cp -Rf /containers/mariadb2/datadir /backups/mariadb2/datadir_copied
$ docker start mariadb2

Restore

Restoring is pretty straightforward for mysqldump. You can simply redirect stdin into the container from the physical host (drop the -t flag here, since stdin is redirected from a file):

$ docker exec -i mariadb2 mysql -uroot -p < /backups/mariadb2/dump.sql

You can also use the standard mysql client command line remotely with proper hostname and port value instead of using this "docker exec" command:

$ mysql -uroot -p -h127.0.0.1 -P3306 < /backups/mariadb2/dump.sql

For Percona Xtrabackup and MariaDB Backup, we have to prepare the backup beforehand. This will roll forward the backup to the time when the backup was finished. Let's say our Xtrabackup files are located under /tmp/backup of the Docker host, to prepare it, simply:

$ docker run --rm -it \
    -v mysql-datadir:/var/lib/mysql \
    -v /tmp/backup:/xtrabackup_backupfiles \
    perconalab/percona-xtrabackup \
    --prepare --target-dir /xtrabackup_backupfiles

The prepared backup under /tmp/backup of the Docker host then can be used as the MariaDB datadir for a new container or cluster. Let's say we just want to verify restoration on a standalone MariaDB container, we would run:

$ docker run -d \
    --name mariadb-restored \
    --env MYSQL_ROOT_PASSWORD="PM7%cB43$sd@^1" \
    -v /tmp/backup:/var/lib/mysql \
    mariadb:10.2.15

If you performed a backup using stop and copy approach, you can simply duplicate the datadir and use the duplicated directory as a volume maps to MariaDB datadir to run on another container. Let's say the backup was copied over under /backups/mariadb2/datadir_copied, we can run a new container by running:

$ mkdir -p /containers/mariadb-restored/datadir
$ cp -Rf /backups/mariadb2/datadir_copied /containers/mariadb-restored/datadir
$ docker run -d \
    --name mariadb-restored \
    --env MYSQL_ROOT_PASSWORD="PM7%cB43$sd@^1" \
    -v /containers/mariadb-restored/datadir:/var/lib/mysql \
    mariadb:10.2.15

The MYSQL_ROOT_PASSWORD must match the actual root password for that particular backup.

Database Version Upgrade

There are two types of upgrade - in-place upgrade or logical upgrade.

An in-place upgrade involves shutting down the MariaDB server, replacing the old binaries with the new binaries and then starting the server on the old data directory. Once started, you have to run the mysql_upgrade script to check and upgrade all system tables, and also to check the user tables.

The logical upgrade involves exporting SQL from the current version using a logical backup utility such as mysqldump, running the new container with the upgraded version binaries, and then applying the SQL to the new MySQL/MariaDB version. It is similar to the backup and restore approach described in the previous section.
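
A minimal sketch of the logical approach (the target container name mariadb-new, the dump path /backups/all-databases.sql and the mariadb:10.3 target image are hypothetical; the root password is reused from this setup):

$ docker exec mariadb3 mysqldump -uroot -p'PM7%cB43$sd@^1' --single-transaction --all-databases > /backups/all-databases.sql
$ docker run -d --name mariadb-new --env MYSQL_ROOT_PASSWORD='PM7%cB43$sd@^1' mariadb:10.3
$ docker exec -i mariadb-new mysql -uroot -p'PM7%cB43$sd@^1' < /backups/all-databases.sql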

Nevertheless, it's good practice to always back up your database before performing any destructive operations. The following steps are required when upgrading from the current image, MariaDB 10.1.33, to another major version, MariaDB 10.2.15, on mariadb3 residing on host3:

  1. Backup the database. It doesn't matter whether it's a physical or logical backup, but the latter using mysqldump is recommended.

  2. Download the latest image that we would like to upgrade to:

    $ docker pull mariadb:10.2.15
  3. Set innodb_fast_shutdown to 0 for our database container:

    $ docker exec -it mariadb3 mysql -uroot -p -e 'SET GLOBAL innodb_fast_shutdown = 0'
  4. Gracefully shut down the database container:

    $ docker kill --signal=TERM mariadb3
  5. Create a new container with the new image for our database. Keep the rest of the parameters intact, except for the container name (otherwise it would conflict):

    $ docker run -d \
            --name mariadb3-new \
            --hostname mariadb3.weave.local \
            --net weave \
            --publish "3306:3306" \
            --publish "4444" \
            --publish "4567" \
            --publish "4568" \
            $(weave dns-args) \
            --env MYSQL_ROOT_PASSWORD="PM7%cB43$sd@^1" \
            --volume /containers/mariadb3/datadir:/var/lib/mysql \
            --volume /containers/mariadb3/conf.d:/etc/mysql/mariadb.conf.d \
            mariadb:10.2.15 \
            --wsrep_cluster_address=gcomm://mariadb0.weave.local,mariadb1.weave.local,mariadb2.weave.local,mariadb3.weave.local \
            --wsrep_sst_auth="root:PM7%cB43$sd@^1" \
            --wsrep_node_address=mariadb3.weave.local
  6. Run mysql_upgrade script:

    $ docker exec -it mariadb3-new mysql_upgrade -uroot -p
  7. If no errors occurred, remove the old container, mariadb3 (the new one is mariadb3-new):

    $ docker rm -f mariadb3
  8. Otherwise, if the upgrade process fails in between, we can fall back to the previous container:

    $ docker stop mariadb3-new
    $ docker start mariadb3

Major version upgrades can be performed similarly to minor version upgrades, except you have to keep in mind that MySQL/MariaDB only supports upgrading from the immediately previous major version. If you are on MariaDB 10.0 and would like to upgrade to 10.2, you have to upgrade to MariaDB 10.1 first, followed by another upgrade step to MariaDB 10.2.

Take note of the configuration changes being introduced and deprecated between major versions.

Failover

In Galera, all nodes are masters and hold the same role. With ProxySQL in the picture, connections that pass through this gateway will be failed over automatically as long as there is a primary component running for Galera Cluster (that is, a majority of nodes are up). The application won't notice any difference if one database node goes down because ProxySQL will simply redirect the connections to the other available nodes.

If the application connects directly to the MariaDB bypassing ProxySQL, failover has to be performed on the application-side by pointing to the next available node, provided the database node meets the following conditions:

  • Status wsrep_local_state_comment is Synced (The state "Desynced/Donor" is also possible, only if wsrep_sst_method is xtrabackup, xtrabackup-v2 or mariabackup).
  • Status wsrep_cluster_status is Primary.

In Galera, an available node isn't necessarily healthy until the above statuses are verified.
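
A simple way to check both conditions on a node before sending traffic to it (a sketch using the mysql client inside the container); on a healthy node, the output should look like this:

$ docker exec -it mariadb1 mysql -uroot -p -e "SHOW STATUS WHERE Variable_name IN ('wsrep_cluster_status','wsrep_local_state_comment')"
+---------------------------+---------+
| Variable_name             | Value   |
+---------------------------+---------+
| wsrep_cluster_status      | Primary |
| wsrep_local_state_comment | Synced  |
+---------------------------+---------+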

Scaling Out

To scale out, we can create a new container in the same network and reuse the custom configuration file of an existing container on that particular host. For example, let's say we want to add a fourth MariaDB container on host3; we can reuse the configuration file mounted for mariadb3, as illustrated in the following diagram:

Run the following command on host3 to scale out:

$ docker run -d \
        --name mariadb4 \
        --hostname mariadb4.weave.local \
        --net weave \
        --publish "3306:3307" \
        --publish "4444" \
        --publish "4567" \
        --publish "4568" \
        $(weave dns-args) \
        --env MYSQL_ROOT_PASSWORD="PM7%cB43$sd@^1" \
        --volume /containers/mariadb4/datadir:/var/lib/mysql \
        --volume /containers/mariadb3/conf.d:/etc/mysql/mariadb.conf.d \
        mariadb:10.2.15 \
        --wsrep_cluster_address=gcomm://mariadb1.weave.local,mariadb2.weave.local,mariadb3.weave.local,mariadb4.weave.local \
        --wsrep_sst_auth="root:PM7%cB43$sd@^1" \
        --wsrep_node_address=mariadb4.weave.local

Once the container is created, it will join the cluster and perform SST. It can be accessed on port 3307 externally (outside of the Weave network), or on port 3306 from within the host or the Weave network. It's no longer necessary to include mariadb0.weave.local in the cluster address. Once the cluster is scaled out, we need to add the new MariaDB container into the ProxySQL load balancing set via the admin console:

$ docker exec -it proxysql1 mysql -uadmin -padmin -P6032
mysql> INSERT INTO mysql_servers(hostgroup_id,hostname,port) VALUES (10,'mariadb4.weave.local',3306);
mysql> INSERT INTO mysql_servers(hostgroup_id,hostname,port) VALUES (20,'mariadb4.weave.local',3306);
mysql> LOAD MYSQL SERVERS TO RUNTIME;
mysql> SAVE MYSQL SERVERS TO DISK;

Repeat the above commands on the second ProxySQL instance.

Finally, for the last step (you may skip this part if you already ran the "SAVE .. TO DISK" statement in ProxySQL), add the following lines into proxysql.cnf to make the change persistent across container restarts on host1 and host2:

$ vim /containers/proxysql1/proxysql.cnf # host1
$ vim /containers/proxysql2/proxysql.cnf # host2

And append the mariadb4 related lines under the mysql_servers directive:

mysql_servers =
(
        { address="mariadb1.weave.local" , port=3306 , hostgroup=10, max_connections=100 },
        { address="mariadb2.weave.local" , port=3306 , hostgroup=10, max_connections=100 },
        { address="mariadb3.weave.local" , port=3306 , hostgroup=10, max_connections=100 },
        { address="mariadb4.weave.local" , port=3306 , hostgroup=10, max_connections=100 },
        { address="mariadb1.weave.local" , port=3306 , hostgroup=20, max_connections=100 },
        { address="mariadb2.weave.local" , port=3306 , hostgroup=20, max_connections=100 },
        { address="mariadb3.weave.local" , port=3306 , hostgroup=20, max_connections=100 },
        { address="mariadb4.weave.local" , port=3306 , hostgroup=20, max_connections=100 }
)

Save the file and we should be good on the next container restart.

Scaling Down

To scale down, simply shut down the container gracefully. The best commands would be:

$ docker kill -s 15 mariadb4
$ docker rm -f mariadb4

Remember, if a database node leaves the cluster ungracefully, it is not treated as a scale-down and will still affect the quorum calculation.

To remove the container from ProxySQL, run the following commands on both ProxySQL containers. For example, on proxysql1:

$ docker exec -it proxysql1 mysql -uadmin -padmin -P6032
mysql> DELETE FROM mysql_servers WHERE hostname="mariadb4.weave.local";
mysql> LOAD MYSQL SERVERS TO RUNTIME;
mysql> SAVE MYSQL SERVERS TO DISK;

You can then either remove the corresponding entry inside proxysql.cnf or just leave it as is. It will be detected as OFFLINE from ProxySQL's point of view anyway.

Summary

With Docker, things get a bit different from the conventional way on handling MySQL or MariaDB servers. Handling stateful services like Galera Cluster is not as easy as stateless applications, and requires proper testing and planning.

In our next blog on this topic, we will evaluate the pros and cons of running Galera Cluster on Docker without any orchestration tools.

Using Kubernetes to Deploy PostgreSQL

Introduction

Kubernetes is an open source container orchestration system for automating the deployment, scaling and management of containerized applications. Running a PostgreSQL database on Kubernetes is a hot topic nowadays, as Kubernetes provides ways to provision stateful containers using persistent volumes, StatefulSets, etc.

This blog is intended to provide the steps to run a PostgreSQL database on a Kubernetes cluster. It doesn't cover the installation or configuration of the Kubernetes cluster itself, although we previously wrote about that in this blog on MySQL Galera Cluster on Kubernetes.

Prerequisites

  • Working Kubernetes Cluster
  • Basic understanding of Docker

You can provision the Kubernetes cluster on any public cloud provider like AWS, Azure or Google Cloud. Refer to the Kubernetes cluster installation and configuration steps for CentOS here. You can also check the earlier blog post for the basics of deploying PostgreSQL on a Docker container.

To deploy PostgreSQL on Kubernetes, we need the following pieces:

  • Postgres Docker Image
  • Config Maps for storing Postgres configurations
  • Persistent Storage Volume
  • PostgreSQL Deployment
  • PostgreSQL Service

PostgreSQL Docker Image

We are using the PostgreSQL 10.4 Docker image from the public registry. This image supports custom configuration through environment variables, such as the username, password, database name and data path.

Config Maps for PostgreSQL Configurations

We will be using config maps for storing PostgreSQL related information. Here, we put the database name, user and password in the config map, which will be used by the PostgreSQL pod in the deployment template.

File: postgres-configmap.yaml

apiVersion: v1
kind: ConfigMap
metadata:
  name: postgres-config
  labels:
    app: postgres
data:
  POSTGRES_DB: postgresdb
  POSTGRES_USER: postgresadmin
  POSTGRES_PASSWORD: admin123

Create the Postgres config map resource:

$ kubectl create -f postgres-configmap.yaml 
configmap "postgres-config" created

Persistent Storage Volume

As you know, Docker containers are ephemeral in nature. All the data generated by or in the container will be lost once the container instance is terminated.

To save the data, we will be using the PersistentVolume and PersistentVolumeClaim resources in Kubernetes to store the data on persistent storage.

Here, we are using a local directory/path as the persistent storage resource (/mnt/data).

File: postgres-storage.yaml

kind: PersistentVolume
apiVersion: v1
metadata:
  name: postgres-pv-volume
  labels:
    type: local
    app: postgres
spec:
  storageClassName: manual
  capacity:
    storage: 5Gi
  accessModes:
    - ReadWriteMany
  hostPath:
    path: "/mnt/data"
---
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: postgres-pv-claim
  labels:
    app: postgres
spec:
  storageClassName: manual
  accessModes:
    - ReadWriteMany
  resources:
    requests:
      storage: 5Gi

Create the storage related resources:

$ kubectl create -f postgres-storage.yaml 
persistentvolume "postgres-pv-volume" created
persistentvolumeclaim "postgres-pv-claim" created

PostgreSQL Deployment

The PostgreSQL manifest for the deployment of the PostgreSQL container uses the PostgreSQL 10.4 image. It takes the PostgreSQL configuration, like username, password and database name, from the configmap that we created earlier. It also mounts the volume created from the persistent volume and claim to make the PostgreSQL container's data persistent.

File: postgres-deployment.yaml

apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: postgres
spec:
  replicas: 1
  template:
    metadata:
      labels:
        app: postgres
    spec:
      containers:
        - name: postgres
          image: postgres:10.4
          imagePullPolicy: "IfNotPresent"
          ports:
            - containerPort: 5432
          envFrom:
            - configMapRef:
                name: postgres-config
          volumeMounts:
            - mountPath: /var/lib/postgresql/data
              name: postgredb
      volumes:
        - name: postgredb
          persistentVolumeClaim:
            claimName: postgres-pv-claim

Create the Postgres deployment:

$ kubectl create -f postgres-deployment.yaml 
deployment "postgres" created

PostgreSQL Service

To access the deployment or container, we need to expose the PostgreSQL service. Kubernetes provides different types of services, like ClusterIP, NodePort and LoadBalancer.

With ClusterIP, we can access the PostgreSQL service only from within Kubernetes. NodePort exposes the service endpoint on each of the Kubernetes nodes, while a LoadBalancer service exposes it through an external load balancer. Here we use NodePort, so PostgreSQL can be reached from outside the cluster through a port on the nodes.

File: postgres-service.yaml

apiVersion: v1
kind: Service
metadata:
  name: postgres
  labels:
    app: postgres
spec:
  type: NodePort
  ports:
   - port: 5432
  selector:
   app: postgres

Create the Postgres Service:

$ kubectl create -f postgres-service.yaml 
service "postgres" created

Connect to PostgreSQL

To connect to PostgreSQL, we need to get the node port from the service deployment.

$ kubectl get svc postgres
NAME       TYPE       CLUSTER-IP      EXTERNAL-IP   PORT(S)          AGE
postgres   NodePort   10.107.71.253   <none>        5432:31070/TCP   5m

We need to use port 31070 to connect to PostgreSQL from a machine/node in the Kubernetes cluster, with the credentials given in the configmap earlier.

$ psql -h localhost -U postgresadmin --password -p 31070 postgresdb
Password for user postgresadmin:
psql (10.4)
Type "help" for help.
 
postgresdb=#

Delete PostgreSQL Deployments

To delete the PostgreSQL resources, we need to use the commands below.

# kubectl delete service postgres 
# kubectl delete deployment postgres
# kubectl delete configmap postgres-config
# kubectl delete persistentvolumeclaim postgres-pv-claim
# kubectl delete persistentvolume postgres-pv-volume

Hopefully, by using the above steps, you are able to provision a standalone PostgreSQL instance on a Kubernetes cluster.

Conclusion

Running PostgreSQL on Kubernetes helps utilize resources better than just using virtual machines. Kubernetes also provides isolation from other applications within the same virtual machine or Kubernetes cluster.

This article provides an overview of how we can use PostgreSQL on Kubernetes for a development/POC environment. You may also explore setting up a PostgreSQL cluster using Kubernetes StatefulSets.

Are StatefulSets Required?

In Kubernetes, StatefulSets are the recommended way to run and scale stateful applications. A PostgreSQL StatefulSet can be scaled with a single command, as shown below.
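
For example, assuming the deployment above were converted to a StatefulSet named postgres (an assumption; we only created a Deployment here), it could be scaled out with:

$ kubectl scale statefulset postgres --replicas=3

Note that this only scales the number of pods; streaming replication between the PostgreSQL instances still has to be configured separately.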

MySQL on Docker: How to Monitor MySQL Containers with Prometheus - Part 1 - Deployment on Standalone and Swarm

Monitoring is a concern for containers, as the infrastructure is dynamic. Containers can be routinely created and destroyed, and are ephemeral. So how do you keep track of your MySQL instances running on Docker?

As with any software component, there are many options out there that can be used. We'll look at Prometheus, a solution built for distributed infrastructure that works very well with Docker.

This is a two-part blog. In this part 1, we are going to cover the deployment aspect of our MySQL containers with Prometheus and its components, running as standalone Docker containers and as Docker Swarm services. In part 2, we will look at the important metrics to monitor from our MySQL containers, as well as integration with paging and notification systems.

Introduction to Prometheus

Prometheus is a full monitoring and trending system that includes built-in and active scraping, storing, querying, graphing, and alerting based on time series data. Prometheus collects metrics through a pull mechanism from configured targets at given intervals, evaluates rule expressions, displays the results, and can trigger alerts if some condition is observed to be true. It supports all the target metrics that we want to measure when running MySQL as Docker containers: physical host metrics, Docker container metrics and MySQL server metrics.

Take a look at the following diagram which illustrates Prometheus architecture (taken from Prometheus official documentation):

We are going to deploy some MySQL containers (standalone and Docker Swarm), complete with a Prometheus server, a MySQL exporter (i.e., a Prometheus agent that exposes MySQL metrics, which the Prometheus server can then scrape) and also Alertmanager to handle alerts based on the collected metrics.

For more details check out the Prometheus documentation. In this example, we are going to use the official Docker images provided by the Prometheus team.

Standalone Docker

Deploying MySQL Containers

Let's run two standalone MySQL servers on Docker to simplify our deployment walkthrough. One container will be using the latest MySQL 8.0 and the other one is MySQL 5.7. Both containers are in the same Docker network called "db_network":

$ docker network create db_network
$ docker run -d \
--name mysql80 \
--publish 3306 \
--network db_network \
--restart unless-stopped \
--env MYSQL_ROOT_PASSWORD=mypassword \
--volume mysql80-datadir:/var/lib/mysql \
mysql:8 \
--default-authentication-plugin=mysql_native_password

MySQL 8 defaults to a new authentication plugin called caching_sha2_password. For compatibility with the Prometheus MySQL exporter container, let's use the widely-used mysql_native_password plugin whenever we create a new MySQL user on this server.
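For example, instead of relying on the server-wide default flag above, the plugin can also be chosen explicitly per user (a sketch; the user name and password are placeholders):

mysql> CREATE USER 'app'@'%' IDENTIFIED WITH mysql_native_password BY 'mypassword';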

For the second MySQL container running 5.7, we execute the following:

$ docker run -d \
--name mysql57 \
--publish 3306 \
--network db_network \
--restart unless-stopped \
--env MYSQL_ROOT_PASSWORD=mypassword \
--volume mysql57-datadir:/var/lib/mysql \
mysql:5.7

Verify if our MySQL servers are running OK:

[root@docker1 mysql]# docker ps | grep mysql
cc3cd3c4022a        mysql:5.7           "docker-entrypoint.s…"   12 minutes ago      Up 12 minutes       0.0.0.0:32770->3306/tcp   mysql57
9b7857c5b6a1        mysql:8             "docker-entrypoint.s…"   14 minutes ago      Up 14 minutes       0.0.0.0:32769->3306/tcp   mysql80

At this point, our architecture is looking something like this:

Let's start monitoring them.

Exposing Docker Metrics to Prometheus

Docker has built-in support as a Prometheus target, which we can use to monitor the Docker engine statistics. We can simply enable it by creating a text file called "daemon.json" on the Docker host:

$ vim /etc/docker/daemon.json

And add the following lines:

{
  "metrics-addr" : "12.168.55.161:9323",
  "experimental" : true
}

Where 192.168.55.161 is the Docker host primary IP address. Then, restart Docker daemon to load the change:

$ systemctl restart docker

Since we have defined --restart=unless-stopped in our MySQL containers' run command, the containers will be automatically started after Docker is running.
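To quickly confirm that the Docker engine metrics endpoint is responding (assuming the daemon.json above), we can query it with curl:

$ curl http://192.168.55.161:9323/metrics

It should return a long list of engine metrics in the Prometheus exposition format.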

Deploying MySQL Exporter

Before we move further, the mysqld exporter requires a MySQL user to be used for monitoring purposes. On our MySQL containers, create the monitoring user:

$ docker exec -it mysql80 mysql -uroot -p
Enter password:
mysql> CREATE USER 'exporter'@'%' IDENTIFIED BY 'exporterpassword' WITH MAX_USER_CONNECTIONS 3;
mysql> GRANT PROCESS, REPLICATION CLIENT, SELECT ON *.* TO 'exporter'@'%';

Take note that it is recommended to set a max connection limit for the user to avoid overloading the server with monitoring scrapes under heavy load. Repeat the above statements onto the second container, mysql57:

$ docker exec -it mysql57 mysql -uroot -p
Enter password:
mysql> CREATE USER 'exporter'@'%' IDENTIFIED BY 'exporterpassword' WITH MAX_USER_CONNECTIONS 3;
mysql> GRANT PROCESS, REPLICATION CLIENT, SELECT ON *.* TO 'exporter'@'%';

Let's run the mysqld exporter container called "mysql80-exporter" to expose the metrics for our MySQL 8.0 instance as below:

$ docker run -d \
--name mysql80-exporter \
--publish 9104 \
--network db_network \
--restart always \
--env DATA_SOURCE_NAME="exporter:exporterpassword@(mysql80:3306)/" \
prom/mysqld-exporter:latest \
--collect.info_schema.processlist \
--collect.info_schema.innodb_metrics \
--collect.info_schema.tablestats \
--collect.info_schema.tables \
--collect.info_schema.userstats \
--collect.engine_innodb_status

And also another exporter container for our MySQL 5.7 instance:

$ docker run -d \
--name mysql57-exporter \
--publish 9104 \
--network db_network \
--restart always \
-e DATA_SOURCE_NAME="exporter:exporterpassword@(mysql57:3306)/" \
prom/mysqld-exporter:latest \
--collect.info_schema.processlist \
--collect.info_schema.innodb_metrics \
--collect.info_schema.tablestats \
--collect.info_schema.tables \
--collect.info_schema.userstats \
--collect.engine_innodb_status

We enabled a bunch of collector flags for the container to expose the MySQL metrics. You can also enable --collect.slave_status and --collect.slave_hosts if you have MySQL replication running on the containers.

We should be able to retrieve the MySQL metrics via curl from the Docker host directly (port 32771 is the published port assigned automatically by Docker for container mysql80-exporter):

$ curl 127.0.0.1:32771/metrics
...
mysql_info_schema_threads_seconds{state="waiting for lock"} 0
mysql_info_schema_threads_seconds{state="waiting for table flush"} 0
mysql_info_schema_threads_seconds{state="waiting for tables"} 0
mysql_info_schema_threads_seconds{state="waiting on cond"} 0
mysql_info_schema_threads_seconds{state="writing to net"} 0
...
process_virtual_memory_bytes 1.9390464e+07

At this point, our architecture is looking something like this:

We are now good to set up the Prometheus server.

Deploying Prometheus Server

Firstly, create the Prometheus configuration file at ~/prometheus.yml and add the following lines:

$ vim ~/prometheus.yml
global:
  scrape_interval:     5s
  scrape_timeout:      3s
  evaluation_interval: 5s

# Our alerting rule files
rule_files:
  - "alert.rules"

# Scrape endpoints
scrape_configs:
  - job_name: 'prometheus'
    static_configs:
      - targets: ['localhost:9090']

  - job_name: 'mysql'
    static_configs:
      - targets: ['mysql57-exporter:9104','mysql80-exporter:9104']

  - job_name: 'docker'
    static_configs:
      - targets: ['192.168.55.161:9323']

From the Prometheus configuration file, we have defined three jobs - "prometheus", "mysql" and "docker". The first one is the job to monitor the Prometheus server itself. The next one is the job to monitor our MySQL containers, named "mysql". We define the endpoints on our MySQL exporters on port 9104, which expose the Prometheus-compatible metrics from the MySQL 8.0 and 5.7 instances respectively. "alert.rules" is the rule file that we will include later, in the next blog post, for alerting purposes.

We can then map the configuration with the Prometheus container. We also need to create a Docker volume for Prometheus data for persistency and also expose port 9090 publicly:

$ docker run -d \
--name prometheus-server \
--publish 9090:9090 \
--network db_network \
--restart unless-stopped \
--mount type=volume,src=prometheus-data,target=/prometheus \
--mount type=bind,src="$(pwd)"/prometheus.yml,target=/etc/prometheus/prometheus.yml \
--mount type=bind,src="$(pwd)
prom/prometheus

Now our Prometheus server is already running and can be accessed directly on port 9090 of the Docker host. Open a web browser and go to http://192.168.55.161:9090/ to access the Prometheus web UI. Verify the target status under Status -> Targets and make sure they are all green:
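As a quick sanity check, we can also run an expression under the Graph tab, for example the mysql_up metric exposed by the mysqld exporter, which should return 1 for every reachable MySQL instance:

mysql_up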

At this point, our container architecture is looking something like this:

Our Prometheus monitoring system for the standalone MySQL containers is now deployed.

Docker Swarm

Deploying a 3-node Galera Cluster

Suppose we want to deploy a three-node Galera Cluster in Docker Swarm. We would have to create 3 different services, each service representing one Galera node. Using this approach, we can keep a static, resolvable hostname for our Galera containers, together with the MySQL exporter containers that will accompany each of them. We will be using the MariaDB 10.2 image maintained by the Docker team to run our Galera cluster.

Firstly, create a MySQL configuration file to be used by our Swarm service:

$ vim ~/my.cnf
[mysqld]

default_storage_engine          = InnoDB
binlog_format                   = ROW

innodb_flush_log_at_trx_commit  = 0
innodb_flush_method             = O_DIRECT
innodb_file_per_table           = 1
innodb_autoinc_lock_mode        = 2
innodb_lock_schedule_algorithm  = FCFS # MariaDB >10.1.19 and >10.2.3 only

wsrep_on                        = ON
wsrep_provider                  = /usr/lib/galera/libgalera_smm.so
wsrep_sst_method                = mariabackup

Create a dedicated database network in our Swarm called "db_swarm":

$ docker network create --driver overlay db_swarm

Import our MySQL configuration file into Docker config so we can load it into our Swarm service when we create it later:

$ cat ~/my.cnf | docker config create my-cnf -

Create the first Galera bootstrap service called "galera0", with "gcomm://" as the cluster address. This is a transient service for the bootstrapping process only. We will delete it once we have gotten the 3 other Galera services running:

$ docker service create \
--name galera0 \
--replicas 1 \
--hostname galera0 \
--network db_swarm \
--publish 3306 \
--publish 4444 \
--publish 4567 \
--publish 4568 \
--config src=my-cnf,target=/etc/mysql/mariadb.conf.d/my.cnf \
--env MYSQL_ROOT_PASSWORD=mypassword \
--mount type=volume,src=galera0-datadir,dst=/var/lib/mysql \
mariadb:10.2 \
--wsrep_cluster_address=gcomm:// \
--wsrep_sst_auth="root:mypassword" \
--wsrep_node_address=galera0

At this point, our database architecture can be illustrated as below:

Then, repeat the following command 3 times to create 3 different Galera services. Replace {name} with galera1, galera2 and galera3 respectively:

$ docker service create \
--name {name} \
--replicas 1 \
--hostname {name} \
--network db_swarm \
--publish 3306 \
--publish 4444 \
--publish 4567 \
--publish 4568 \
--config src=my-cnf,target=/etc/mysql/mariadb.conf.d/my.cnf \
--env MYSQL_ROOT_PASSWORD=mypassword \
--mount type=volume,src={name}-datadir,dst=/var/lib/mysql \
mariadb:10.2 \
--wsrep_cluster_address=gcomm://galera0,galera1,galera2,galera3 \
--wsrep_sst_auth="root:mypassword" \
--wsrep_node_address={name}

Verify our current Docker services:

$ docker service ls 
ID                  NAME                MODE                REPLICAS            IMAGE               PORTS
wpcxye3c4e9d        galera0             replicated          1/1                 mariadb:10.2        *:30022->3306/tcp, *:30023->4444/tcp, *:30024-30025->4567-4568/tcp
jsamvxw9tqpw        galera1             replicated          1/1                 mariadb:10.2        *:30026->3306/tcp, *:30027->4444/tcp, *:30028-30029->4567-4568/tcp
otbwnb3ridg0        galera2             replicated          1/1                 mariadb:10.2        *:30030->3306/tcp, *:30031->4444/tcp, *:30032-30033->4567-4568/tcp
5jp9dpv5twy3        galera3             replicated          1/1                 mariadb:10.2        *:30034->3306/tcp, *:30035->4444/tcp, *:30036-30037->4567-4568/tcp

Our architecture is now looking something like this:

We need to remove the Galera bootstrap Swarm service, galera0, to stop it from running, because if its container is rescheduled by Docker Swarm, a new replica would be started with a fresh new volume and bootstrap a brand new cluster. Since --wsrep_cluster_address on the other Galera nodes (or Swarm services) still contains "galera0", they could join that empty cluster and we would run the risk of data loss. So, let's remove it:

$ docker service rm galera0

At this point, we have our three-node Galera Cluster:

We are now ready to deploy our MySQL exporter and Prometheus Server.

MySQL Exporter Swarm Service

Login to one of the Galera containers (replace {galera1} below with the actual container name of the galera1 task) and create the exporter user with proper privileges:

$ docker exec -it {galera1} mysql -uroot -p
Enter password:
mysql> CREATE USER 'exporter'@'%' IDENTIFIED BY 'exporterpassword' WITH MAX_USER_CONNECTIONS 3;
mysql> GRANT PROCESS, REPLICATION CLIENT, SELECT ON *.* TO 'exporter'@'%';
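Since Docker Swarm appends a task slug to the container name, one way to look up the actual container name on the node where the task runs is:

$ docker ps --filter name=galera1 --format '{{.Names}}'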

Then, create the exporter service for each of the Galera services (replace {name} with galera1, galera2 and galera3 respectively):

$ docker service create \
--name {name}-exporter \
--network db_swarm \
--replicas 1 \
-p 9104 \
-e DATA_SOURCE_NAME="exporter:exporterpassword@({name}:3306)/" \
prom/mysqld-exporter:latest \
--collect.info_schema.processlist \
--collect.info_schema.innodb_metrics \
--collect.info_schema.tablestats \
--collect.info_schema.tables \
--collect.info_schema.userstats \
--collect.engine_innodb_status

At this point, our architecture is looking something like this with exporter services in the picture:

Prometheus Server Swarm Service

Finally, let's deploy our Prometheus server. Similar to the Galera deployment, we have to prepare the Prometheus configuration file first before importing it into Swarm using the docker config command:

$ vim ~/prometheus.yml
global:
  scrape_interval:     5s
  scrape_timeout:      3s
  evaluation_interval: 5s

# Our alerting rule files
rule_files:
  - "alert.rules"

# Scrape endpoints
scrape_configs:
  - job_name: 'prometheus'
    static_configs:
      - targets: ['localhost:9090']

  - job_name: 'galera'
    static_configs:
      - targets: ['galera1-exporter:9104','galera2-exporter:9104', 'galera3-exporter:9104']

From the Prometheus configuration file, we have defined two jobs - "prometheus" and "galera". The first one is the job to monitor the Prometheus server itself. The next one is the job to monitor our MySQL containers, named "galera". We define the endpoints on our MySQL exporters on port 9104, which expose the Prometheus-compatible metrics from the three Galera nodes respectively. "alert.rules" is the rule file that we will include later, in the next blog post, for alerting purposes.

Import the configuration file into Docker config to be used with Prometheus container later:

$ cat ~/prometheus.yml | docker config create prometheus-yml -

Let's run the Prometheus server container, and publish port 9090 of all Docker hosts for the Prometheus web UI service:

$ docker service create \
--name prometheus-server \
--publish 9090:9090 \
--network db_swarm \
--replicas 1 \
--config src=prometheus-yml,target=/etc/prometheus/prometheus.yml \
--mount type=volume,src=prometheus-data,dst=/prometheus \
prom/prometheus

Verify with the Docker service command that we have 3 Galera services, 3 exporter services and 1 Prometheus service:

$ docker service ls
ID                  NAME                MODE                REPLICAS            IMAGE                         PORTS
jsamvxw9tqpw        galera1             replicated          1/1                 mariadb:10.2                  *:30026->3306/tcp, *:30027->4444/tcp, *:30028-30029->4567-4568/tcp
hbh1dtljn535        galera1-exporter    replicated          1/1                 prom/mysqld-exporter:latest   *:30038->9104/tcp
otbwnb3ridg0        galera2             replicated          1/1                 mariadb:10.2                  *:30030->3306/tcp, *:30031->4444/tcp, *:30032-30033->4567-4568/tcp
jq8i77ch5oi3        galera2-exporter    replicated          1/1                 prom/mysqld-exporter:latest   *:30039->9104/tcp
5jp9dpv5twy3        galera3             replicated          1/1                 mariadb:10.2                  *:30034->3306/tcp, *:30035->4444/tcp, *:30036-30037->4567-4568/tcp
10gdkm1ypkav        galera3-exporter    replicated          1/1                 prom/mysqld-exporter:latest   *:30040->9104/tcp
gv9llxrig30e        prometheus-server   replicated          1/1                 prom/prometheus:latest        *:9090->9090/tcp

Now our Prometheus server is already running and can be accessed directly on port 9090 from any Docker node. Open a web browser and go to http://192.168.55.161:9090/ to access the Prometheus web UI. Verify the target status under Status -> Targets and make sure they are all green:

At this point, our Swarm architecture is looking something like this:

To be continued...

We now have our database and monitoring stack deployed on Docker. In part 2 of the blog, we will look into the different MySQL metrics to keep an eye on. We’ll also see how to configure alerting with Prometheus.

MySQL on Docker: Running ProxySQL as a Helper Container on Kubernetes


ProxySQL commonly sits between the application and database tiers, in the so-called reverse proxy tier. When your application containers are orchestrated and managed by Kubernetes, you might want to use ProxySQL in front of your database servers.

In this post, we’ll show you how to run ProxySQL on Kubernetes as a helper container in a pod. We are going to use Wordpress as an example application. The data service is provided by our two-node MySQL Replication, deployed using ClusterControl and sitting outside of the Kubernetes network on a bare-metal infrastructure, as illustrated in the following diagram:

ProxySQL Docker Image

In this example, we are going to use ProxySQL Docker image maintained by Severalnines, a general public image built for multi-purpose usage. The image comes with no entrypoint script and supports Galera Cluster (in addition to built-in support for MySQL Replication), where an extra script is required for health check purposes.

Basically, to run a ProxySQL container, simply execute the following command:

$ docker run -d -v /path/to/proxysql.cnf:/etc/proxysql.cnf severalnines/proxysql

With this image, it is recommended to bind a ProxySQL configuration file to the mount point /etc/proxysql.cnf, although you can skip this and configure it later using the ProxySQL Admin console. Example configurations are provided on the Docker Hub page and the Github page.

ProxySQL on Kubernetes

Designing the ProxySQL architecture is a subjective topic and highly dependent on the placement of the application and database containers as well as the role of ProxySQL itself. ProxySQL does not only route queries, it can also be used to rewrite and cache queries. Efficient cache hits might require a custom configuration tailored specifically for the application database workload.

Ideally, we can configure ProxySQL to be managed by Kubernetes with two configurations:

  1. ProxySQL as a Kubernetes service (centralized deployment).
  2. ProxySQL as a helper container in a pod (distributed deployment).

The first option is pretty straightforward, where we create a ProxySQL pod and attach a Kubernetes service to it. Applications will then connect to the ProxySQL service via networking on the configured ports. These default to 6033 for the MySQL load-balanced port and 6032 for the ProxySQL administration port. This deployment will be covered in the upcoming blog post.

The second option is a bit different. Kubernetes has a concept called "pod". You can have one or more containers per pod, and these are relatively tightly coupled. A pod’s contents are always co-located and co-scheduled, and run in a shared context. A pod is the smallest manageable container unit in Kubernetes.

Both deployments can be distinguished easily by looking at the following diagram:

The primary reason that pods can have multiple containers is to support helper applications that assist a primary application. Typical examples of helper applications are data pullers, data pushers, and proxies. Helper and primary applications often need to communicate with each other. Typically this is done through a shared filesystem, as shown in this exercise, or through the loopback network interface, localhost. An example of this pattern is a web server along with a helper program that polls a Git repository for new updates.

This blog post will cover the second configuration - running ProxySQL as a helper container in a pod.

ProxySQL as Helper in a Pod

In this setup, we run ProxySQL as a helper container to our Wordpress container. The following diagram illustrates our high-level architecture:

In this setup, the ProxySQL container is tightly coupled with the Wordpress container, and we named the pod "blog". If rescheduling happens, e.g., the Kubernetes worker node goes down, these two containers will always be rescheduled together as one logical unit on the next available host. To keep the application containers' content persistent across multiple nodes, we have to use a clustered or remote file system, which in this case is NFS.

ProxySQL's role is to provide a database abstraction layer to the application container. Since we are running a two-node MySQL Replication as the backend database service, read-write splitting is vital to maximize resource consumption on both MySQL servers. ProxySQL excels at this and requires minimal to no changes to the application.

There are a number of other benefits to running ProxySQL in this setup:

  • Bring query caching capability closest to the application layer running in Kubernetes.
  • Secure implementation by connecting through the ProxySQL UNIX socket file, which acts like a pipe that the server and the clients can use to connect and exchange requests and data (see the example after this list).
  • Distributed reverse proxy tier with shared nothing architecture.
  • Less network overhead due to "skip-networking" implementation.
  • Stateless deployment approach by utilizing Kubernetes ConfigMaps.
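As an example of the socket connection, the severalnines/proxysql image ships with a mysql client (we use it in the Monitoring section below), so once the pod is deployed the socket can be verified from inside the ProxySQL container (the pod name below is illustrative):

$ kubectl exec -it blog-54755cbcb5-t4cb7 -c proxysql -- mysql -uwordpress -p -S /tmp/proxysql.sock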

Preparing the Database

Create the wordpress database and user on the master and assign with correct privilege:

mysql-master> CREATE DATABASE wordpress;
mysql-master> CREATE USER wordpress@'%' IDENTIFIED BY 'passw0rd';
mysql-master> GRANT ALL PRIVILEGES ON wordpress.* TO wordpress@'%';

Also, create the ProxySQL monitoring user:

mysql-master> CREATE USER proxysql@'%' IDENTIFIED BY 'proxysqlpassw0rd';

Then, reload the grant table:

mysql-master> FLUSH PRIVILEGES;

Preparing the Pod

Now, copy and paste the following lines into a file called blog-deployment.yml on the host where kubectl is configured:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: blog
  labels:
    app: blog
spec:
  replicas: 1
  selector:
    matchLabels:
      app: blog
      tier: frontend
  strategy:
    type: RollingUpdate
  template:
    metadata:
      labels:
        app: blog
        tier: frontend
    spec:

      restartPolicy: Always

      containers:
      - image: wordpress:4.9-apache
        name: wordpress
        env:
        - name: WORDPRESS_DB_HOST
          value: localhost:/tmp/proxysql.sock
        - name: WORDPRESS_DB_USER
          value: wordpress
        - name: WORDPRESS_DB_PASSWORD
          valueFrom:
            secretKeyRef:
              name: mysql-pass
              key: password
        ports:
        - containerPort: 80
          name: wordpress
        volumeMounts:
        - name: wordpress-persistent-storage
          mountPath: /var/www/html
        - name: shared-data
          mountPath: /tmp

      - image: severalnines/proxysql:1.4.12
        name: proxysql
        volumeMounts:
        - name: proxysql-config
          mountPath: /etc/proxysql.cnf
          subPath: proxysql.cnf
        - name: shared-data
          mountPath: /tmp
        ports:
        - containerPort: 6033
          name: proxysql

      volumes:
      - name: wordpress-persistent-storage
        persistentVolumeClaim:
          claimName: wp-pvc
      - name: proxysql-config
        configMap:
          name: proxysql-configmap
      - name: shared-data
        emptyDir: {}

The YAML file has many lines, so let's look at the interesting parts only. The first section:

apiVersion: apps/v1
kind: Deployment

The first line is the apiVersion. Our Kubernetes cluster is running on v1.12, so we should refer to the Kubernetes v1.12 API documentation and follow the resource declaration according to this API. The next one is the kind, which tells Kubernetes what type of resource we want to deploy. Deployment, Service, ReplicaSet, DaemonSet and PersistentVolume are some examples.

The next important section is the "containers" section. Here we define all containers that we would like to run together in this pod. The first part is the Wordpress container:

      - image: wordpress:4.9-apache
        name: wordpress
        env:
        - name: WORDPRESS_DB_HOST
          value: localhost:/tmp/proxysql.sock
        - name: WORDPRESS_DB_USER
          value: wordpress
        - name: WORDPRESS_DB_PASSWORD
          valueFrom:
            secretKeyRef:
              name: mysql-pass
              key: password
        ports:
        - containerPort: 80
          name: wordpress
        volumeMounts:
        - name: wordpress-persistent-storage
          mountPath: /var/www/html
        - name: shared-data
          mountPath: /tmp

In this section, we are telling Kubernetes to deploy Wordpress 4.9 using the Apache web server, and we gave the container the name "wordpress". We also want Kubernetes to pass a number of environment variables:

  • WORDPRESS_DB_HOST - The database host. Since our ProxySQL container resides in the same Pod with the Wordpress container, it's more secure to use a ProxySQL socket file instead. The format to use socket file in Wordpress is "localhost:{path to the socket file}". By default, it's located under /tmp directory of the ProxySQL container. This /tmp path is shared between Wordpress and ProxySQL containers by using "shared-data" volumeMounts as shown further down. Both containers have to mount this volume to share the same content under /tmp directory.
  • WORDPRESS_DB_USER - Specify the wordpress database user.
  • WORDPRESS_DB_PASSWORD - The password for WORDPRESS_DB_USER. Since we do not want to expose the password in this file, we can hide it using Kubernetes Secrets. Here we instruct Kubernetes to read the "mysql-pass" Secret resource instead. Secrets have to be created in advance of the pod deployment, as explained further down.

We also want to publish port 80 of the container for the end user. The Wordpress content stored inside /var/www/html in the container will be mounted into our persistent storage running on NFS.

Next, we define the ProxySQL container:

      - image: severalnines/proxysql:1.4.12
        name: proxysql
        volumeMounts:
        - name: proxysql-config
          mountPath: /etc/proxysql.cnf
          subPath: proxysql.cnf
        - name: shared-data
          mountPath: /tmp
        ports:
        - containerPort: 6033
          name: proxysql

In the above section, we are telling Kubernetes to deploy ProxySQL using the severalnines/proxysql image, version 1.4.12. We also want Kubernetes to mount our custom, pre-configured configuration file and map it to /etc/proxysql.cnf inside the container. There is also a volume called "shared-data" which maps to the /tmp directory, to be shared with the Wordpress container - a temporary directory that shares a pod's lifetime. This allows the ProxySQL socket file (/tmp/proxysql.sock) to be used by the Wordpress container when connecting to the database, bypassing TCP/IP networking.

The last part is the "volumes" section:

      volumes:
      - name: wordpress-persistent-storage
        persistentVolumeClaim:
          claimName: wp-pv-claim
      - name: proxysql-config
        configMap:
          name: proxysql-configmap
      - name: shared-data
        emptyDir: {}

Kubernetes will have to create three volumes for this pod:

  • wordpress-persistent-storage - Use the PersistentVolumeClaim resource to map NFS export into the container for persistent data storage for Wordpress content.
  • proxysql-config - Use the ConfigMap resource to map the ProxySQL configuration file.
  • shared-data - Use the emptyDir resource to mount a shared directory for our containers inside the Pod. emptyDir resource is a temporary directory that shares a pod's lifetime.

Therefore, based on our YAML definition above, we have to prepare a number of Kubernetes resources before we can begin to deploy the "blog" pod:

  1. PersistentVolume and PersistentVolumeClaim - To store the web contents of our Wordpress application, so when the pod is being rescheduled to other worker node, we won't lose the last changes.
  2. Secrets - To hide the Wordpress database user password inside the YAML file.
  3. ConfigMap - To map the configuration file to ProxySQL container, so when it's being rescheduled to other node, Kubernetes can automatically remount it again.

PersistentVolume and PersistentVolumeClaim

A good persistent storage for Kubernetes should be accessible by all Kubernetes nodes in the cluster. For the sake of this blog post, we used NFS as the PersistentVolume (PV) provider because it's easy and supported out-of-the-box. The NFS server is located somewhere outside of our Kubernetes network and we have configured it to allow all Kubernetes nodes with the following line inside /etc/exports:

/nfs    192.168.55.*(rw,sync,no_root_squash,no_all_squash)

Take note that the NFS client package must be installed on all Kubernetes nodes. Otherwise, Kubernetes wouldn't be able to mount the NFS export correctly. On all nodes:

$ sudo apt-get install nfs-common # Ubuntu/Debian
$ yum install nfs-utils #RHEL/CentOS
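It is also worth verifying from one of the Kubernetes nodes that the export is visible (assuming the NFS server address 192.168.55.200 that we use later in this post):

$ showmount -e 192.168.55.200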

Also, make sure on the NFS server, the target directory exists:

(nfs-server)$ mkdir /nfs/kubernetes/wordpress

Then, create a file called wordpress-pv-pvc.yml and add the following lines:

apiVersion: v1
kind: PersistentVolume
metadata:
  name: wp-pv
  labels:
    app: blog
spec:
  accessModes:
    - ReadWriteOnce
  capacity:
    storage: 3Gi
  mountOptions:
    - hard
    - nfsvers=4.1
  nfs:
    path: /nfs/kubernetes/wordpress
    server: 192.168.55.200
---
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: wp-pvc
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 3Gi
  selector:
    matchLabels:
      app: blog
      tier: frontend

In the above definition, we would like Kubernetes to allocate 3GB of volume space on the NFS server for our Wordpress container. Take note that for production usage, NFS should be configured with an automatic provisioner and a storage class.

Create the PV and PVC resources:

$ kubectl create -f wordpress-pv-pvc.yml

Verify that those resources are created; the status must be "Bound":

$ kubectl get pv,pvc
NAME                     CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS   CLAIM            STORAGECLASS   REASON   AGE
persistentvolume/wp-pv   3Gi        RWO            Recycle          Bound    default/wp-pvc                           22h

NAME                           STATUS   VOLUME   CAPACITY   ACCESS MODES   STORAGECLASS   AGE
persistentvolumeclaim/wp-pvc   Bound    wp-pv    3Gi        RWO                           22h

Secrets

The first step is to create a secret to be used by the Wordpress container for the WORDPRESS_DB_PASSWORD environment variable, simply because we don't want to expose the password in clear text inside the YAML file.

Create a secret resource called mysql-pass and pass the password accordingly:

$ kubectl create secret generic mysql-pass --from-literal=password=passw0rd

Verify that our secret is created:

$ kubectl get secrets mysql-pass
NAME         TYPE     DATA   AGE
mysql-pass   Opaque   1      7h12m
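If you ever need to double-check the stored value, the secret can be decoded back (the data is merely base64-encoded, not encrypted):

$ kubectl get secret mysql-pass -o jsonpath='{.data.password}' | base64 --decode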

ConfigMap

We also need to create a ConfigMap resource for our ProxySQL container. A Kubernetes ConfigMap file holds key-value pairs of configuration data that can be consumed in pods or used to store configuration data. ConfigMaps allow you to decouple configuration artifacts from image content to keep containerized applications portable.

Since our database server is already running on bare-metal servers with a static hostname and IP address plus static monitoring username and password, in this use case the ConfigMap file will store pre-configured configuration information about the ProxySQL service that we want to use.

First create a text file called proxysql.cnf and add the following lines:

datadir="/var/lib/proxysql"
admin_variables=
{
        admin_credentials="admin:adminpassw0rd"
        mysql_ifaces="0.0.0.0:6032"
        refresh_interval=2000
}
mysql_variables=
{
        threads=4
        max_connections=2048
        default_query_delay=0
        default_query_timeout=36000000
        have_compress=true
        poll_timeout=2000
        interfaces="0.0.0.0:6033;/tmp/proxysql.sock"
        default_schema="information_schema"
        stacksize=1048576
        server_version="5.1.30"
        connect_timeout_server=10000
        monitor_history=60000
        monitor_connect_interval=200000
        monitor_ping_interval=200000
        ping_interval_server_msec=10000
        ping_timeout_server=200
        commands_stats=true
        sessions_sort=true
        monitor_username="proxysql"
        monitor_password="proxysqlpassw0rd"
}
mysql_servers =
(
        { address="192.168.55.171" , port=3306 , hostgroup=10, max_connections=100 },
        { address="192.168.55.172" , port=3306 , hostgroup=10, max_connections=100 },
        { address="192.168.55.171" , port=3306 , hostgroup=20, max_connections=100 },
        { address="192.168.55.172" , port=3306 , hostgroup=20, max_connections=100 }
)
mysql_users =
(
        { username = "wordpress" , password = "passw0rd" , default_hostgroup = 10 , active = 1 }
)
mysql_query_rules =
(
        {
                rule_id=100
                active=1
                match_pattern="^SELECT .* FOR UPDATE"
                destination_hostgroup=10
                apply=1
        },
        {
                rule_id=200
                active=1
                match_pattern="^SELECT .*"
                destination_hostgroup=20
                apply=1
        },
        {
                rule_id=300
                active=1
                match_pattern=".*"
                destination_hostgroup=10
                apply=1
        }
)
mysql_replication_hostgroups =
(
        { writer_hostgroup=10, reader_hostgroup=20, comment="MySQL Replication 5.7" }
)

Pay extra attention to the "mysql_servers" and "mysql_users" sections, where you might need to modify the values to suit your database cluster setup. In this case, we have two database servers running in MySQL Replication as summarized in the following Topology screenshot taken from ClusterControl:

All writes should go to the master node, while reads are forwarded to hostgroup 20, as defined under the "mysql_query_rules" section. That's the basics of read/write splitting, and we want to take full advantage of it.

Then, import the configuration file into ConfigMap:

$ kubectl create configmap proxysql-configmap --from-file=proxysql.cnf
configmap/proxysql-configmap created

Verify if the ConfigMap is loaded into Kubernetes:

$ kubectl get configmap
NAME                 DATA   AGE
proxysql-configmap   1      45s

Deploying the Pod

Now we should be good to deploy the blog pod. Send the deployment job to Kubernetes:

$ kubectl create -f blog-deployment.yml

Verify the pod status:

$ kubectl get pods
NAME                           READY   STATUS              RESTARTS   AGE
blog-54755cbcb5-t4cb7          2/2     Running             0          100s

It must show 2/2 under the READY column, indicating there are two containers running inside the pod. Use the -c option flag to check the logs of the Wordpress and ProxySQL containers inside the blog pod:

$ kubectl logs blog-54755cbcb5-t4cb7 -c wordpress
$ kubectl logs blog-54755cbcb5-t4cb7 -c proxysql

From the ProxySQL container log, you should see the following lines:

2018-10-20 08:57:14 [INFO] Dumping current MySQL Servers structures for hostgroup ALL
HID: 10 , address: 192.168.55.171 , port: 3306 , weight: 1 , status: ONLINE , max_connections: 100 , max_replication_lag: 0 , use_ssl: 0 , max_latency_ms: 0 , comment:
HID: 10 , address: 192.168.55.172 , port: 3306 , weight: 1 , status: OFFLINE_HARD , max_connections: 100 , max_replication_lag: 0 , use_ssl: 0 , max_latency_ms: 0 , comment:
HID: 20 , address: 192.168.55.171 , port: 3306 , weight: 1 , status: ONLINE , max_connections: 100 , max_replication_lag: 0 , use_ssl: 0 , max_latency_ms: 0 , comment:
HID: 20 , address: 192.168.55.172 , port: 3306 , weight: 1 , status: ONLINE , max_connections: 100 , max_replication_lag: 0 , use_ssl: 0 , max_latency_ms: 0 , comment:

HID 10 (the writer hostgroup) must have only one ONLINE node (indicating a single master), and the other host must be at least in OFFLINE_HARD status. For HID 20, all nodes are expected to be ONLINE (indicating multiple read sources).
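We can also cross-check the backend states from the ProxySQL admin console itself, using the standard runtime_mysql_servers table (pod name as per the example above):

$ kubectl exec -it blog-54755cbcb5-t4cb7 -c proxysql -- mysql -uadmin -p -h127.0.0.1 -P6032 -e 'SELECT hostgroup_id, hostname, status FROM runtime_mysql_servers'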

To get a summary of the deployment, use the describe flag:

$ kubectl describe deployments blog

Our blog is now running, however we can't access it from outside of the Kubernetes network without configuring the service, as explained in the next section.

Creating the Blog Service

The last step is to attach a service to our pod. This is to ensure that our Wordpress blog pod is accessible from the outside world. Create a file called blog-svc.yml and paste the following lines:

apiVersion: v1
kind: Service
metadata:
  name: blog
  labels:
    app: blog
    tier: frontend
spec:
  type: NodePort
  ports:
  - name: blog
    nodePort: 30080
    port: 80
  selector:
    app: blog
    tier: frontend

Create the service:

$ kubectl create -f blog-svc.yml

Verify if the service is created correctly:

root@kube1:~/proxysql-blog# kubectl get svc
NAME         TYPE        CLUSTER-IP     EXTERNAL-IP   PORT(S)        AGE
blog         NodePort    10.96.140.37   <none>        80:30080/TCP   26s
kubernetes   ClusterIP   10.96.0.1      <none>        443/TCP        43h

Port 80 published by the blog pod is now mapped to the outside world via port 30080. We can access our blog post at http://{any_kubernetes_host}:30080/ and should be redirected to the Wordpress installation page. If we proceed with the installation, it would skip the database connection part and directly show this page:

It indicates that our MySQL and ProxySQL connection settings are correctly configured inside the wp-config.php file. Otherwise, you would be redirected to the database configuration page.

Our deployment is now complete.

Managing ProxySQL Container inside a Pod

Failover and recovery are expected to be handled automatically by Kubernetes. For example, if a Kubernetes worker node goes down, the pod will be recreated on the next available node after --pod-eviction-timeout (defaults to 5 minutes). If the container crashes or is killed, Kubernetes will replace it almost instantly.

Some common management tasks are expected to be different when running within Kubernetes, as shown in the next sections.

Scaling Up and Down

In the above configuration, we deployed one replica in our Deployment. To scale up, simply change the spec.replicas value accordingly by using the kubectl edit command:

$ kubectl edit deployment blog

It will open up the deployment definition in a default text editor; simply change the spec.replicas value to something higher, for example, "replicas: 3". Then, save the file and immediately check the rollout status by using the following command:

$ kubectl rollout status deployment blog
Waiting for deployment "blog" rollout to finish: 1 of 3 updated replicas are available...
Waiting for deployment "blog" rollout to finish: 2 of 3 updated replicas are available...
deployment "blog" successfully rolled out

At this point, we have three blog pods (Wordpress + ProxySQL) running simultaneously in Kubernetes:

$ kubectl get pods
NAME                             READY   STATUS              RESTARTS   AGE
blog-54755cbcb5-6fnqn            2/2     Running             0          11m
blog-54755cbcb5-cwpdj            2/2     Running             0          11m
blog-54755cbcb5-jxtvc            2/2     Running             0          22m

At this point, our architecture is looking something like this:

Take note that it might require more customization than our current configuration to run Wordpress smoothly in a horizontally-scaled production environment (think about static content, session management and others). Those are actually beyond the scope of this blog post.

The procedure for scaling down is similar.

Configuration Management

Configuration management is important in ProxySQL. This is where the magic happens: you can define your own set of query rules to do query caching, firewalling and rewriting. Contrary to the common practice, where ProxySQL is configured via the Admin console and persisted by using "SAVE .. TO DISK", we will stick with configuration files only to make things more portable in Kubernetes. That's the reason we are using ConfigMaps.

Since we are relying on our centralized configuration stored by Kubernetes ConfigMaps, there are a number of ways to perform configuration changes. Firstly, by using the kubectl edit command:

$ kubectl edit configmap proxysql-configmap

It will open up the configuration in a default text editor, and you can directly make changes to it and save the text file once done. Otherwise, recreating the ConfigMap also works:

$ vi proxysql.cnf # edit the configuration first
$ kubectl delete configmap proxysql-configmap
$ kubectl create configmap proxysql-configmap --from-file=proxysql.cnf

After the configuration is pushed into the ConfigMap, restart the pod or container as shown in the Service Control section. Note that configuring the container via the ProxySQL admin interface (port 6032) won't make the changes persistent after pod rescheduling by Kubernetes.

Service Control

Since the two containers inside a pod are tightly coupled, the best way to apply the ProxySQL configuration changes is to force Kubernetes to do a pod replacement. Recall that we now have three blog pods after scaling up:

$ kubectl get pods
NAME                             READY   STATUS              RESTARTS   AGE
blog-54755cbcb5-6fnqn            2/2     Running             0          31m
blog-54755cbcb5-cwpdj            2/2     Running             0          31m
blog-54755cbcb5-jxtvc            2/2     Running             1          22m

Use the following command to replace one pod at a time:

$ kubectl get pod blog-54755cbcb5-6fnqn -n default -o yaml | kubectl replace --force -f -
pod "blog-54755cbcb5-6fnqn" deleted
pod/blog-54755cbcb5-6fnqn

Then, verify with the following:

$ kubectl get pods
NAME                             READY   STATUS              RESTARTS   AGE
blog-54755cbcb5-6fnqn            2/2     Running             0          31m
blog-54755cbcb5-cwpdj            2/2     Running             0          31m
blog-54755cbcb5-qs6jm            2/2     Running             1          2m26s

You will notice that the most recent pod has been replaced by looking at the AGE and RESTARTS columns - it came up with a different pod name. Repeat the same steps for the remaining pods. Otherwise, you can also use the "docker kill" command to kill the ProxySQL container manually on the Kubernetes worker node. For example:

(kube-worker)$ docker kill $(docker ps | grep -i proxysql_blog | awk '{print $1}')

Kubernetes will then replace the killed ProxySQL container with a new one.

Monitoring

Use the kubectl exec command to execute SQL statements via the mysql client. For example, to monitor the query digest:

$ kubectl exec -it blog-54755cbcb5-29hqt -c proxysql -- mysql -uadmin -p -h127.0.0.1 -P6032
mysql> SELECT * FROM stats_mysql_query_digest;

Or with a one-liner:

$ kubectl exec -it blog-54755cbcb5-29hqt -c proxysql -- mysql -uadmin -p -h127.0.0.1 -P6032 -e 'SELECT * FROM stats_mysql_query_digest'

By changing the SQL statement, you can monitor other ProxySQL components or perform any administration tasks via this Admin console. Again, such changes will only persist during the ProxySQL container's lifetime and won't be preserved if the pod is rescheduled.

Final Thoughts

ProxySQL holds a key role if you want to scale your application containers and have an intelligent way to access a distributed database backend. There are a number of ways to deploy ProxySQL on Kubernetes to support our application growth when running at scale. This blog post only covers one of them.

In an upcoming blog post, we are going to look at how to run ProxySQL in a centralized approach by using it as a Kubernetes service.

MySQL on Docker: Running ProxySQL as Kubernetes Service


When running distributed database clusters, it is quite common to front them with load balancers. The advantages are clear - load balancing, connection failover and decoupling of the application tier from the underlying database topologies. For more intelligent load balancing, a database-aware proxy like ProxySQL or MaxScale would be the way to go. In our previous blog, we showed you how to run ProxySQL as a helper container in Kubernetes. In this blog post, we’ll show you how to deploy ProxySQL as a Kubernetes service. We’ll use Wordpress as an example application and the database backend is running on a two-node MySQL Replication deployed using ClusterControl. The following diagram illustrates our infrastructure:

Since we are going to deploy a similar setup as in this previous blog post, do expect duplication in some parts of the blog post to keep the post more readable.

ProxySQL on Kubernetes

Let’s start with a bit of recap. Designing a ProxySQL architecture is a subjective topic and highly dependent on the placement of the application, database containers as well as the role of ProxySQL itself. Ideally, we can configure ProxySQL to be managed by Kubernetes with two configurations:

  1. ProxySQL as a Kubernetes service (centralized deployment)
  2. ProxySQL as a helper container in a pod (distributed deployment)

Both deployments can be distinguished easily by looking at the following diagram:

This blog post will cover the first configuration - running ProxySQL as a Kubernetes service. The second configuration is already covered here. In contrast to the helper container approach, running as a service makes ProxySQL pods live independently from the applications and can be easily scaled and clustered together with the help of Kubernetes ConfigMap. This is definitely a different clustering approach than ProxySQL native clustering support which relies on configuration checksum across ProxySQL instances (a.k.a proxysql_servers). Check out this blog post if you want to learn about ProxySQL clustering made easy with ClusterControl.

In Kubernetes, ProxySQL's multi-layer configuration system makes pod clustering possible with ConfigMap. However, there are a number of shortcomings and workarounds to make it work smoothly as what ProxySQL's native clustering feature does. At the moment, signalling a pod upon ConfigMap update is a feature in the works. We will cover this topic in much greater detail in an upcoming blog post.

Basically, we need to create ProxySQL pods and attach a Kubernetes service to be accessed by the other pods within the Kubernetes network or externally. Applications will then connect to the ProxySQL service via TCP/IP networking on the configured ports. These default to 6033 for MySQL load-balanced connections and 6032 for the ProxySQL administration console. With more than one replica, the connections to the pods will be load balanced automatically by the Kubernetes kube-proxy component running on every Kubernetes node.

ProxySQL as Kubernetes Service

In this setup, we run both ProxySQL and Wordpress as pods and services. The following diagram illustrates our high-level architecture:

In this setup, we will deploy two pods and services - "wordpress" and "proxysql". We will merge Deployment and Service declaration in one YAML file per application and manage them as one unit. To keep the application containers' content persistent across multiple nodes, we have to use a clustered or remote file system, which in this case is NFS.

Deploying ProxySQL as a service brings a couple of good things over the helper container approach:

  • Using Kubernetes ConfigMap approach, ProxySQL can be clustered with immutable configuration.
  • Kubernetes handles ProxySQL recovery and balances the connections to the instances automatically.
  • Single endpoint with Kubernetes Virtual IP address implementation called ClusterIP.
  • Centralized reverse proxy tier with shared nothing architecture.
  • Can be used with external applications outside Kubernetes.

We will start the deployment as two replicas for ProxySQL and three for Wordpress to demonstrate running at scale and load-balancing capabilities that Kubernetes offers.

Preparing the Database

Create the wordpress database and user on the master and assign with correct privilege:

mysql-master> CREATE DATABASE wordpress;
mysql-master> CREATE USER wordpress@'%' IDENTIFIED BY 'passw0rd';
mysql-master> GRANT ALL PRIVILEGES ON wordpress.* TO wordpress@'%';

Also, create the ProxySQL monitoring user:

mysql-master> CREATE USER proxysql@'%' IDENTIFIED BY 'proxysqlpassw0rd';

Then, reload the grant table:

mysql-master> FLUSH PRIVILEGES;

ProxySQL Pod and Service Definition

The next step is to prepare our ProxySQL deployment. Create a file called proxysql-rs-svc.yml and add the following lines:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: proxysql
  labels:
    app: proxysql
spec:
  replicas: 2
  selector:
    matchLabels:
      app: proxysql
      tier: frontend
  strategy:
    type: RollingUpdate
  template:
    metadata:
      labels:
        app: proxysql
        tier: frontend
    spec:
      restartPolicy: Always
      containers:
      - image: severalnines/proxysql:1.4.12
        name: proxysql
        volumeMounts:
        - name: proxysql-config
          mountPath: /etc/proxysql.cnf
          subPath: proxysql.cnf
        ports:
        - containerPort: 6033
          name: proxysql-mysql
        - containerPort: 6032
          name: proxysql-admin
      volumes:
      - name: proxysql-config
        configMap:
          name: proxysql-configmap
---
apiVersion: v1
kind: Service
metadata:
  name: proxysql
  labels:
    app: proxysql
    tier: frontend
spec:
  type: NodePort
  ports:
  - nodePort: 30033
    port: 6033
    name: proxysql-mysql
  - nodePort: 30032
    port: 6032
    name: proxysql-admin
  selector:
    app: proxysql
    tier: frontend

Let's see what those definitions are all about. The YAML consists of two resources combined in one file, separated by the "---" delimiter. The first resource is the Deployment, for which we define the following specification:

spec:
  replicas: 2
  selector:
    matchLabels:
      app: proxysql
      tier: frontend
  strategy:
    type: RollingUpdate

The above means we would like to deploy two ProxySQL pods as a ReplicaSet that matches containers labelled with "app=proxysql,tier=frontend". The deployment strategy specifies the strategy used to replace old pods by new ones. In this deployment, we picked RollingUpdate which means the pods will be updated in a rolling update fashion, one pod at a time.

The next part is the container's template:

      - image: severalnines/proxysql:1.4.12
        name: proxysql
        volumeMounts:
        - name: proxysql-config
          mountPath: /etc/proxysql.cnf
          subPath: proxysql.cnf
        ports:
        - containerPort: 6033
          name: proxysql-mysql
        - containerPort: 6032
          name: proxysql-admin
      volumes:
      - name: proxysql-config
        configMap:
          name: proxysql-configmap

In the spec.template.spec.containers.* section, we are telling Kubernetes to deploy ProxySQL using the severalnines/proxysql image, version 1.4.12. We also want Kubernetes to mount our custom, pre-configured configuration file and map it to /etc/proxysql.cnf inside the container. The running pods will publish two ports - 6033 and 6032. We also define the "volumes" section, where we instruct Kubernetes to mount the ConfigMap as a volume inside the ProxySQL pods, to be mounted via volumeMounts.

The second resource is the service. A Kubernetes service is an abstraction layer which defines the logical set of pods and a policy by which to access them. In this section, we define the following:

apiVersion: v1
kind: Service
metadata:
  name: proxysql
  labels:
    app: proxysql
    tier: frontend
spec:
  type: NodePort
  ports:
  - nodePort: 30033
    port: 6033
    name: proxysql-mysql
  - nodePort: 30032
    port: 6032
    name: proxysql-admin
  selector:
    app: proxysql
    tier: frontend

In this case, we want our ProxySQL to be accessed from the external network, thus NodePort is the chosen service type. This will publish the nodePort on every Kubernetes node in the cluster. The range of valid ports for a NodePort resource is 30000-32767. We chose port 30033 for MySQL load-balanced connections, which is mapped to port 6033 of the ProxySQL pods, and port 30032 for the ProxySQL administration port, mapped to 6032.
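Once the service is up, we should be able to reach the load-balanced MySQL port from outside the cluster through any Kubernetes node, for example (following the placeholder convention used earlier in this series):

$ mysql -uwordpress -p -h{any_kubernetes_host} -P30033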

Therefore, based on our YAML definition above, we have to prepare the following Kubernetes resource before we can begin to deploy the "proxysql" pod:

  • ConfigMap - To store ProxySQL configuration file as a volume so it can be mounted to multiple pods and can be remounted again if the pod is being rescheduled to the other Kubernetes node.

Preparing ConfigMap for ProxySQL

Similar to the previous blog post, we are going to use the ConfigMap approach to decouple the configuration file from the container, and also for scalability purposes. Take note that in this setup, we consider our ProxySQL configuration immutable.

Firstly, create the ProxySQL configuration file, proxysql.cnf and add the following lines:

datadir="/var/lib/proxysql"
admin_variables=
{
        admin_credentials="proxysql-admin:adminpassw0rd"
        mysql_ifaces="0.0.0.0:6032"
        refresh_interval=2000
}
mysql_variables=
{
        threads=4
        max_connections=2048
        default_query_delay=0
        default_query_timeout=36000000
        have_compress=true
        poll_timeout=2000
        interfaces="0.0.0.0:6033;/tmp/proxysql.sock"
        default_schema="information_schema"
        stacksize=1048576
        server_version="5.1.30"
        connect_timeout_server=10000
        monitor_history=60000
        monitor_connect_interval=200000
        monitor_ping_interval=200000
        ping_interval_server_msec=10000
        ping_timeout_server=200
        commands_stats=true
        sessions_sort=true
        monitor_username="proxysql"
        monitor_password="proxysqlpassw0rd"
}
mysql_replication_hostgroups =
(
        { writer_hostgroup=10, reader_hostgroup=20, comment="MySQL Replication 5.7" }
)
mysql_servers =
(
        { address="192.168.55.171" , port=3306 , hostgroup=10, max_connections=100 },
        { address="192.168.55.172" , port=3306 , hostgroup=10, max_connections=100 },
        { address="192.168.55.171" , port=3306 , hostgroup=20, max_connections=100 },
        { address="192.168.55.172" , port=3306 , hostgroup=20, max_connections=100 }
)
mysql_users =
(
        { username = "wordpress" , password = "passw0rd" , default_hostgroup = 10 , active = 1 }
)
mysql_query_rules =
(
        {
                rule_id=100
                active=1
                match_pattern="^SELECT .* FOR UPDATE"
                destination_hostgroup=10
                apply=1
        },
        {
                rule_id=200
                active=1
                match_pattern="^SELECT .*"
                destination_hostgroup=20
                apply=1
        },
        {
                rule_id=300
                active=1
                match_pattern=".*"
                destination_hostgroup=10
                apply=1
        }
)

Pay attention to the admin_variables.admin_credentials variable, where we used a non-default user, "proxysql-admin". ProxySQL reserves the default "admin" user for local connections via localhost only. Therefore, we have to use another user to access the ProxySQL instance remotely. Otherwise, you would get the following error:

ERROR 1040 (42000): User 'admin' can only connect locally
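With the non-default user defined above, connecting remotely to the admin console through the published NodePort should work, for example:

$ mysql -uproxysql-admin -p -h{any_kubernetes_host} -P30032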

Our ProxySQL configuration is based on our two database servers running in MySQL Replication as summarized in the following Topology screenshot taken from ClusterControl:

All writes should go to the master node, while reads are forwarded to hostgroup 20, as defined under the "mysql_query_rules" section. That's the basics of read/write splitting, and we want to take full advantage of it.

Then, import the configuration file into ConfigMap:

$ kubectl create configmap proxysql-configmap --from-file=proxysql.cnf
configmap/proxysql-configmap created

Verify if the ConfigMap is loaded into Kubernetes:

$ kubectl get configmap
NAME                 DATA   AGE
proxysql-configmap   1      45s

Wordpress Pod and Service Definition

Now, paste the following lines into a file called wordpress-rs-svc.yml on the host where kubectl is configured:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: wordpress
  labels:
    app: wordpress
spec:
  replicas: 3
  selector:
    matchLabels:
      app: wordpress
      tier: frontend
  strategy:
    type: RollingUpdate
  template:
    metadata:
      labels:
        app: wordpress
        tier: frontend
    spec:
      restartPolicy: Always
      containers:
      - image: wordpress:4.9-apache
        name: wordpress
        env:
        - name: WORDPRESS_DB_HOST
          value: proxysql:6033 # proxysql.default.svc.cluster.local:6033
        - name: WORDPRESS_DB_USER
          value: wordpress
        - name: WORDPRESS_DB_NAME
          value: wordpress
        - name: WORDPRESS_DB_PASSWORD
          valueFrom:
            secretKeyRef:
              name: mysql-pass
              key: password
        ports:
        - containerPort: 80
          name: wordpress
        volumeMounts:
        - name: wordpress-persistent-storage
          mountPath: /var/www/html
      volumes:
      - name: wordpress-persistent-storage
        persistentVolumeClaim:
          claimName: wp-pv-claim # PVC created under "Preparing Persistent Storage for Wordpress"
---
apiVersion: v1
kind: Service
metadata:
  name: wordpress
  labels:
    app: wordpress
    tier: frontend
spec:
  type: NodePort
  ports:
  - name: wordpress
    nodePort: 30088
    port: 80
  selector:
    app: wordpress
    tier: frontend

Similar to our ProxySQL definition, the YAML consists of two resources separated by the "---" delimiter, combined in one file. The first one is the Deployment resource, which will be deployed as a ReplicaSet, as shown under the "spec.*" section:

spec:
  replicas: 3
  selector:
    matchLabels:
      app: wordpress
      tier: frontend
  strategy:
    type: RollingUpdate

This section provides the Deployment specification - 3 pods to start, matching label "app=wordpress,tier=frontend". The deployment strategy is RollingUpdate, which means Kubernetes will replace the pods in a rolling update fashion, same as with our ProxySQL deployment.

The next part is the "spec.template.spec.*" section:

      restartPolicy: Always
      containers:
      - image: wordpress:4.9-apache
        name: wordpress
        env:
        - name: WORDPRESS_DB_HOST
          value: proxysql:6033
        - name: WORDPRESS_DB_USER
          value: wordpress
        - name: WORDPRESS_DB_PASSWORD
          valueFrom:
            secretKeyRef:
              name: mysql-pass
              key: password
        ports:
        - containerPort: 80
          name: wordpress
        volumeMounts:
        - name: wordpress-persistent-storage
          mountPath: /var/www/html


In this section, we are telling Kubernetes to deploy Wordpress 4.9 using the Apache web server, and we gave the container the name "wordpress". The container will always be restarted if it goes down, regardless of the exit status. We also want Kubernetes to pass a number of environment variables:

  • WORDPRESS_DB_HOST - The MySQL database host. Since we are using ProxySQL as a service, the service name will be the value of metadata.name which is "proxysql". ProxySQL listens on port 6033 for MySQL load balanced connections while ProxySQL administration console is on 6032.
  • WORDPRESS_DB_USER - Specify the wordpress database user that has been created under the "Preparing the Database" section.
  • WORDPRESS_DB_PASSWORD - The password for WORDPRESS_DB_USER. Since we do not want to expose the password in this file, we can hide it using Kubernetes Secrets. Here we instruct Kubernetes to read the "mysql-pass" Secret resource instead. The Secret has to be created in advance of the pod deployment, as explained further down.

We also want to publish port 80 of the pod for the end user. The Wordpress content stored inside /var/www/html in the container will be mounted into our persistent storage running on NFS. We will use the PersistentVolume and PersistentVolumeClaim resources for this purpose as shown under "Preparing Persistent Storage for Wordpress" section.

After the "---" break line, we define another resource called Service:

apiVersion: v1
kind: Service
metadata:
  name: wordpress
  labels:
    app: wordpress
    tier: frontend
spec:
  type: NodePort
  ports:
  - name: wordpress
    nodePort: 30088
    port: 80
  selector:
    app: wordpress
    tier: frontend

In this configuration, we would like Kubernetes to create a service called "wordpress", listen on port 30088 on all nodes (a.k.a. NodePort) for the external network and forward the traffic to port 80 on all pods labelled with "app=wordpress,tier=frontend".
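
Once both resources are created, a quick way to test the NodePort mapping from outside the cluster is a plain HTTP request against any Kubernetes node (the hostname below is just an example from our environment):

$ curl -I http://kube2.local:30088/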

Therefore, based on our YAML definition above, we have to prepare a number of Kubernetes resources before we can begin to deploy the "wordpress" pod and service:

  • PersistentVolume and PersistentVolumeClaim - To store the web contents of our Wordpress application, so when the pod is rescheduled to another worker node, we won't lose the latest changes.
  • Secrets - To hide the Wordpress database user password inside the YAML file.

Preparing Persistent Storage for Wordpress

A good persistent storage for Kubernetes should be accessible by all Kubernetes nodes in the cluster. For the sake of this blog post, we used NFS as the PersistentVolume (PV) provider because it's easy and supported out-of-the-box. The NFS server is located somewhere outside of our Kubernetes network (as shown in the first architecture diagram) and we have configured it to allow all Kubernetes nodes with the following line inside /etc/exports:

/nfs    192.168.55.*(rw,sync,no_root_squash,no_all_squash)

Take note that the NFS client package must be installed on all Kubernetes nodes. Otherwise, Kubernetes wouldn't be able to mount the NFS share correctly. On all nodes:

$ sudo apt-get install nfs-common #Ubuntu/Debian
$ yum install nfs-utils #RHEL/CentOS
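
As a quick sanity check (the showmount utility ships with the NFS client packages above), verify that the export is visible from any Kubernetes node:

$ showmount -e 192.168.55.200
Export list for 192.168.55.200:
/nfs 192.168.55.*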

Also, make sure on the NFS server, the target directory exists:

(nfs-server)$ mkdir /nfs/kubernetes/wordpress

Then, create a file called wordpress-pv-pvc.yml and add the following lines:

apiVersion: v1
kind: PersistentVolume
metadata:
  name: wp-pv
  labels:
    app: wordpress
spec:
  accessModes:
    - ReadWriteOnce
  capacity:
    storage: 3Gi
  mountOptions:
    - hard
    - nfsvers=4.1
  nfs:
    path: /nfs/kubernetes/wordpress
    server: 192.168.55.200
---
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: wp-pvc
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 3Gi
  selector:
    matchLabels:
      app: wordpress
      tier: frontend

In the above definition, we are telling Kubernetes to allocate 3GB of volume space on the NFS server for our Wordpress container. Take note that for production usage, NFS should be configured with an automatic provisioner and a storage class.

Create the PV and PVC resources:

$ kubectl create -f wordpress-pv-pvc.yml

Verify that those resources are created; the status must be "Bound":

$ kubectl get pv,pvc
NAME                     CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS   CLAIM            STORAGECLASS   REASON   AGE
persistentvolume/wp-pv   3Gi        RWO            Recycle          Bound    default/wp-pvc                           22h


NAME                           STATUS   VOLUME   CAPACITY   ACCESS MODES   STORAGECLASS   AGE
persistentvolumeclaim/wp-pvc   Bound    wp-pv    3Gi        RWO                           22h

Preparing Secrets for Wordpress

Create a secret to be used by the Wordpress container for the WORDPRESS_DB_PASSWORD environment variable. The reason is simply that we don't want to expose the password in clear text inside the YAML file.

Create a secret resource called mysql-pass and pass the password accordingly:

$ kubectl create secret generic mysql-pass --from-literal=password=passw0rd

Verify that our secret is created:

$ kubectl get secrets mysql-pass
NAME         TYPE     DATA   AGE
mysql-pass   Opaque   1      7h12m
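
Kubernetes stores Secret values base64-encoded. If you ever need to double-check what was stored, you can decode it back (a handy sketch, not required for the deployment):

$ kubectl get secret mysql-pass -o jsonpath='{.data.password}' | base64 --decode
passw0rd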

Deploying ProxySQL and Wordpress

Finally, we can begin the deployment. Deploy ProxySQL first, followed by Wordpress:

$ kubectl create -f proxysql-rs-svc.yml
$ kubectl create -f wordpress-rs-svc.yml

We can then list out all pods and services that have been created under "frontend" tier:

$ kubectl get pods,services -l tier=frontend -o wide
NAME                             READY   STATUS    RESTARTS   AGE   IP          NODE          NOMINATED NODE
pod/proxysql-95b8d8446-qfbf2     1/1     Running   0          12m   10.36.0.2   kube2.local   <none>
pod/proxysql-95b8d8446-vljlr     1/1     Running   0          12m   10.44.0.6   kube3.local   <none>
pod/wordpress-59489d57b9-4dzvk   1/1     Running   0          37m   10.36.0.1   kube2.local   <none>
pod/wordpress-59489d57b9-7d2jb   1/1     Running   0          30m   10.44.0.4   kube3.local   <none>
pod/wordpress-59489d57b9-gw4p9   1/1     Running   0          30m   10.36.0.3   kube2.local   <none>

NAME                TYPE       CLUSTER-IP       EXTERNAL-IP   PORT(S)                         AGE   SELECTOR
service/proxysql    NodePort   10.108.195.54    <none>        6033:30033/TCP,6032:30032/TCP   10m   app=proxysql,tier=frontend
service/wordpress   NodePort   10.109.144.234   <none>        80:30088/TCP                    37m   app=wordpress,tier=frontend

The above output verifies our deployment architecture, where we currently have three Wordpress pods exposed on port 30088 publicly, as well as our two ProxySQL instances, exposed on ports 30033 and 30032 externally plus 6033 and 6032 internally.

At this point, our architecture is looking something like this:

Port 80 published by the Wordpress pods is now mapped to the outside world via port 30088. We can access our blog post at http://{any_kubernetes_host}:30088/ and should be redirected to the Wordpress installation page. If we proceed with the installation, it would skip the database connection part and directly show this page:

It indicates that our MySQL and ProxySQL connection settings are correctly configured inside the wp-config.php file. Otherwise, you would be redirected to the database configuration page.

Our deployment is now complete.

ProxySQL Pods and Service Management

Failover and recovery are expected to be handled automatically by Kubernetes. For example, if a Kubernetes worker goes down, the pod will be recreated on the next available node after --pod-eviction-timeout (defaults to 5 minutes). If the container crashes or is killed, Kubernetes will replace it almost instantly.

Some common management tasks are expected to be different when running within Kubernetes, as shown in the next sections.

Connecting to ProxySQL

While ProxySQL is exposed externally on port 30033 (MySQL) and 30032 (Admin), it is also accessible internally via the published ports, 6033 and 6032 respectively. Thus, to access the ProxySQL instances within the Kubernetes network, use the CLUSTER-IP, or the service name "proxysql" as the host value. For example, within Wordpress pod, you may access the ProxySQL admin console by using the following command:

$ mysql -uproxysql-admin -p -hproxysql -P6032

If you want to connect externally, use the port defined under the nodePort value in the service YAML and pick any of the Kubernetes nodes as the host value:

$ mysql -uproxysql-admin -p -hkube3.local -P30032

The same applies to the MySQL load-balanced connection on port 30033 (external) and 6033 (internal).

Scaling Up and Down

Scaling up is easy with Kubernetes:

$ kubectl scale deployment proxysql --replicas=5
deployment.extensions/proxysql scaled

Verify the rollout status:

$ kubectl rollout status deployment proxysql
deployment "proxysql" successfully rolled out

Scaling down is similar. Here we want to revert from 5 to 2 replicas:

$ kubectl scale deployment proxysql --replicas=2
deployment.extensions/proxysql scaled
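
If you would rather let Kubernetes scale the deployment automatically, a Horizontal Pod Autoscaler can be attached to it. This is a sketch only - it assumes a metrics server is running in the cluster and CPU resource requests are defined on the ProxySQL pods:

$ kubectl autoscale deployment proxysql --min=2 --max=5 --cpu-percent=80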

We can also look at the deployment events for ProxySQL to get a better picture of what has happened for this deployment by using the "describe" option:

$ kubectl describe deployment proxysql
...
Events:
  Type    Reason             Age    From                   Message
  ----    ------             ----   ----                   -------
  Normal  ScalingReplicaSet  20m    deployment-controller  Scaled up replica set proxysql-769895fbf7 to 1
  Normal  ScalingReplicaSet  20m    deployment-controller  Scaled down replica set proxysql-95b8d8446 to 1
  Normal  ScalingReplicaSet  20m    deployment-controller  Scaled up replica set proxysql-769895fbf7 to 2
  Normal  ScalingReplicaSet  20m    deployment-controller  Scaled down replica set proxysql-95b8d8446 to 0
  Normal  ScalingReplicaSet  7m10s  deployment-controller  Scaled up replica set proxysql-6c55f647cb to 1
  Normal  ScalingReplicaSet  7m     deployment-controller  Scaled down replica set proxysql-769895fbf7 to 1
  Normal  ScalingReplicaSet  7m     deployment-controller  Scaled up replica set proxysql-6c55f647cb to 2
  Normal  ScalingReplicaSet  6m53s  deployment-controller  Scaled down replica set proxysql-769895fbf7 to 0
  Normal  ScalingReplicaSet  54s    deployment-controller  Scaled up replica set proxysql-6c55f647cb to 5
  Normal  ScalingReplicaSet  21s    deployment-controller  Scaled down replica set proxysql-6c55f647cb to 2

The connections to the pods will be load balanced automatically by Kubernetes.

Configuration Changes

One way to make configuration changes on our ProxySQL pods is by versioning our configuration using another ConfigMap name. Firstly, modify our configuration file directly via your favourite text editor:

$ vim /root/proxysql.cnf

Then, load it up into Kubernetes ConfigMap with a different name. In this example, we append "-v2" in the resource name:

$ kubectl create configmap proxysql-configmap-v2 --from-file=proxysql.cnf

Verify if the ConfigMap is loaded correctly:

$ kubectl get configmap
NAME                    DATA   AGE
proxysql-configmap      1      3d15h
proxysql-configmap-v2   1      19m

Open the ProxySQL deployment file, proxysql-rs-svc.yml and change the following line under configMap section to the new version:

      volumes:
      - name: proxysql-config
        configMap:
          name: proxysql-configmap-v2 #change this line

Then, apply the changes to our ProxySQL deployment:

$ kubectl apply -f proxysql-rs-svc.yml
deployment.apps/proxysql configured
service/proxysql configured

Verify the rollout by looking at the ReplicaSet events using the "describe" command:

$ kubectl describe deployment proxysql
...
Pod Template:
  Labels:  app=proxysql
           tier=frontend
  Containers:
   proxysql:
    Image:        severalnines/proxysql:1.4.12
    Ports:        6033/TCP, 6032/TCP
    Host Ports:   0/TCP, 0/TCP
    Environment:  <none>
    Mounts:
      /etc/proxysql.cnf from proxysql-config (rw)
  Volumes:
   proxysql-config:
    Type:      ConfigMap (a volume populated by a ConfigMap)
    Name:      proxysql-configmap-v2
    Optional:  false
Conditions:
  Type           Status  Reason
  ----           ------  ------
  Available      True    MinimumReplicasAvailable
  Progressing    True    NewReplicaSetAvailable
OldReplicaSets:  <none>
NewReplicaSet:   proxysql-769895fbf7 (2/2 replicas created)
Events:
  Type    Reason             Age   From                   Message
  ----    ------             ----  ----                   -------
  Normal  ScalingReplicaSet  53s   deployment-controller  Scaled up replica set proxysql-769895fbf7 to 1
  Normal  ScalingReplicaSet  46s   deployment-controller  Scaled down replica set proxysql-95b8d8446 to 1
  Normal  ScalingReplicaSet  46s   deployment-controller  Scaled up replica set proxysql-769895fbf7 to 2
  Normal  ScalingReplicaSet  41s   deployment-controller  Scaled down replica set proxysql-95b8d8446 to 0

Pay attention on the "Volumes" section with the new ConfigMap name. You can also see the deployment events at the bottom of the output. At this point, our new configuration has been loaded into all ProxySQL pods, where Kubernetes scaled down the ProxySQL ReplicaSet to 0 (obeying RollingUpdate strategy) and bring them back to the desired state of 2 replicas.

Final Thoughts

Up until this point, we have covered a possible deployment approach for ProxySQL in Kubernetes. Running ProxySQL with the help of Kubernetes ConfigMap opens up a new possibility for ProxySQL clustering, which is somewhat different from the native clustering support built into ProxySQL.

In the upcoming blog post, we will explore ProxySQL Clustering using Kubernetes ConfigMap and how to do it the right way. Stay tuned!

MySQL on Docker: Multiple Delayed Replication Slaves for Disaster Recovery with Low RTO


Delayed replication allows a replication slave to deliberately lag behind the master by at least a specified amount of time. Before executing an event, the slave will first wait, if necessary, until the given time has passed since the event was created on the master. The result is that the slave will reflect the state of the master some time back in the past. This feature is supported since MySQL 5.6 and MariaDB 10.2.3. It can come in handy in case of accidental data deletion, and should be part of your disaster recovery plan.

The problem when setting up a delayed replication slave is deciding how much delay to use. Too short, and you risk the bad query reaching your delayed slave before you can react, defeating the purpose of having the delayed slave. Too long, and it could take hours for your delayed slave to catch up to where the master was at the time of the error.

Luckily, container isolation is one of Docker's strengths, and running multiple MySQL instances is pretty convenient with Docker. It allows us to have multiple delayed slaves within a single physical host to improve our recovery time and save hardware resources. If you think a 15-minute delay is too short, we can have another instance with a 1-hour or 6-hour delay for an even older snapshot of our database.

In this blog post, we are going to deploy multiple MySQL delayed slaves on a single physical host with Docker, and walk through some recovery scenarios. The following diagram illustrates the final architecture that we want to build:

Our architecture consists of an already deployed 2-node MySQL Replication setup running on physical servers (blue), and we would like to set up another three MySQL slaves (green) with the following behaviour:

  • 15 minutes delay
  • 1 hour delay
  • 6 hours delay

Take note that we are going to have 3 copies of the exact same data on the same physical server. Ensure our Docker host has the storage required, so do allocate sufficient disk space beforehand.

MySQL Master Preparation

Firstly, login to the master server and create the replication user:

mysql> GRANT REPLICATION SLAVE ON *.* TO rpl_user@'%' IDENTIFIED BY 'YlgSH6bLLy';

Then, create a PITR-compatible backup on the master:

$ mysqldump -uroot -p --flush-privileges --hex-blob --opt --master-data=1 --single-transaction --skip-lock-tables --triggers --routines --events --all-databases | gzip -6 -c > mysqldump_complete.sql.gz

If you are using ClusterControl, you can make a PITR-compatible backup easily. Go to Backups -> Create Backup and pick "Complete PITR-compatible" under the "Dump Type" dropdown:

Finally, transfer this backup to the Docker host:

$ scp mysqldump_complete.sql.gz root@192.168.55.200:~

This backup file will be used by the MySQL slave containers during the slave bootstrapping process, as shown in the next section.

Delayed Slave Deployment

Prepare our Docker container directories. Create 3 directories (mysql.conf.d, datadir and sql) for every MySQL container that we are going to launch (you can use a loop to simplify the commands below, as sketched after this block):

$ mkdir -p /storage/mysql-slave-15m/mysql.conf.d
$ mkdir -p /storage/mysql-slave-15m/datadir
$ mkdir -p /storage/mysql-slave-15m/sql
$ mkdir -p /storage/mysql-slave-1h/mysql.conf.d
$ mkdir -p /storage/mysql-slave-1h/datadir
$ mkdir -p /storage/mysql-slave-1h/sql
$ mkdir -p /storage/mysql-slave-6h/mysql.conf.d
$ mkdir -p /storage/mysql-slave-6h/datadir
$ mkdir -p /storage/mysql-slave-6h/sql
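
For example, the nine commands above can be collapsed into a short shell loop:

$ for d in 15m 1h 6h; do mkdir -p /storage/mysql-slave-${d}/{mysql.conf.d,datadir,sql}; done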

"mysql.conf.d" directory will store our custom MySQL configuration file and will be mapped into the container under /etc/mysql.conf.d. "datadir" is where we want Docker to store the MySQL data directory, which maps to /var/lib/mysql of the container and "sql" directory stores our SQL files - backup files in .sql or .sql.gz format to stage the slave before replicating and also .sql files to automate the replication configuration and startup.

15-minute Delayed Slave

Prepare the MySQL configuration file for our 15-minute delayed slave:

$ vim /storage/mysql-slave-15m/mysql.conf.d/my.cnf

And add the following lines:

[mysqld]
server_id=10015
binlog_format=ROW
log_bin=binlog
log_slave_updates=1
gtid_mode=ON
enforce_gtid_consistency=1
relay_log=relay-bin
expire_logs_days=7
read_only=ON

** The server-id value we used for this slave is 10015.

Next, under the /storage/mysql-slave-15m/sql directory, create two SQL files: one to RESET MASTER (1reset_master.sql) and another to establish the replication link using the CHANGE MASTER statement (3setup_slave.sql).

Create a text file 1reset_master.sql and add the following line:

RESET MASTER;

Create a text file 3setup_slave.sql and add the following lines:

CHANGE MASTER TO MASTER_HOST = '192.168.55.171', MASTER_USER = 'rpl_user', MASTER_PASSWORD = 'YlgSH6bLLy', MASTER_AUTO_POSITION = 1, MASTER_DELAY=900;
START SLAVE;

MASTER_DELAY=900 is equal to 15 minutes (in seconds). Then, copy the backup file taken from our master (that has been transferred into our Docker host) to the "sql" directory and rename it to 2mysqldump_complete.sql.gz:

$ cp ~/mysqldump_complete.sql.gz /storage/mysql-slave-15m/sql/2mysqldump_complete.sql.gz

The final look of our "sql" directory should be something like this:

$ pwd
/storage/mysql-slave-15m/sql
$ ls -1
1reset_master.sql
2mysqldump_complete.sql.gz
3setup_slave.sql

Take note that we prefix the SQL filename with an integer to determine the execution order when Docker initializes the MySQL container.

Once everything is in place, run the MySQL container for our 15-minute delayed slave:

$ docker run -d \
--name mysql-slave-15m \
-e MYSQL_ROOT_PASSWORD=password \
--mount type=bind,source=/storage/mysql-slave-15m/datadir,target=/var/lib/mysql \
--mount type=bind,source=/storage/mysql-slave-15m/mysql.conf.d,target=/etc/mysql/mysql.conf.d \
--mount type=bind,source=/storage/mysql-slave-15m/sql,target=/docker-entrypoint-initdb.d \
mysql:5.7

** The MYSQL_ROOT_PASSWORD value must be the same as the MySQL root password on the master.

The following lines are what we are looking for to verify if MySQL is running correctly and connected as a slave to our master (192.168.55.171):

$ docker logs -f mysql-slave-15m
...
2018-12-04T04:05:24.890244Z 0 [Note] mysqld: ready for connections.
Version: '5.7.24-log'  socket: '/var/run/mysqld/mysqld.sock'  port: 3306  MySQL Community Server (GPL)
2018-12-04T04:05:25.010032Z 2 [Note] Slave I/O thread for channel '': connected to master 'rpl_user@192.168.55.171:3306',replication started in log 'FIRST' at position 4

You can then verify the replication status with the following statement:

$ docker exec -it mysql-slave-15m mysql -uroot -p -e 'show slave status\G'
...
             Slave_IO_Running: Yes
            Slave_SQL_Running: Yes
                    SQL_Delay: 900
                Auto_Position: 1
...

At this point, our 15-minute delayed slave container is replicating correctly and our architecture is looking something like this:

1-hour Delayed Slave

Prepare the MySQL configuration file for our 1-hour delayed slave:

$ vim /storage/mysql-slave-1h/mysql.conf.d/my.cnf

And add the following lines:

[mysqld]
server_id=10060
binlog_format=ROW
log_bin=binlog
log_slave_updates=1
gtid_mode=ON
enforce_gtid_consistency=1
relay_log=relay-bin
expire_logs_days=7
read_only=ON

** The server-id value we used for this slave is 10060.

Next, under the /storage/mysql-slave-1h/sql directory, create two SQL files: one to RESET MASTER (1reset_master.sql) and another to establish the replication link using the CHANGE MASTER statement (3setup_slave.sql).

Create a text file 1reset_master.sql and add the following line:

RESET MASTER;

Create a text file 3setup_slave.sql and add the following lines:

CHANGE MASTER TO MASTER_HOST = '192.168.55.171', MASTER_USER = 'rpl_user', MASTER_PASSWORD = 'YlgSH6bLLy', MASTER_AUTO_POSITION = 1, MASTER_DELAY=3600;
START SLAVE;

MASTER_DELAY=3600 is equal to 1 hour (in seconds). Then, copy the backup file taken from our master (that has been transferred into our Docker host) to the "sql" directory and rename it to 2mysqldump_complete.sql.gz:

$ cp ~/mysqldump_complete.sql.gz /storage/mysql-slave-1h/sql/2mysqldump_complete.sql.gz

The final look of our "sql" directory should be something like this:

$ pwd
/storage/mysql-slave-1h/sql
$ ls -1
1reset_master.sql
2mysqldump_complete.sql.gz
3setup_slave.sql

Take note that we prefix the SQL filename with an integer to determine the execution order when Docker initializes the MySQL container.

Once everything is in place, run the MySQL container for our 1-hour delayed slave:

$ docker run -d \
--name mysql-slave-1h \
-e MYSQL_ROOT_PASSWORD=password \
--mount type=bind,source=/storage/mysql-slave-1h/datadir,target=/var/lib/mysql \
--mount type=bind,source=/storage/mysql-slave-1h/mysql.conf.d,target=/etc/mysql/mysql.conf.d \
--mount type=bind,source=/storage/mysql-slave-1h/sql,target=/docker-entrypoint-initdb.d \
mysql:5.7

** The MYSQL_ROOT_PASSWORD value must be the same as the MySQL root password on the master.

The following lines are what we are looking for to verify if MySQL is running correctly and connected as a slave to our master (192.168.55.171):

$ docker logs -f mysql-slave-1h
...
2018-12-04T04:05:24.890244Z 0 [Note] mysqld: ready for connections.
Version: '5.7.24-log'  socket: '/var/run/mysqld/mysqld.sock'  port: 3306  MySQL Community Server (GPL)
2018-12-04T04:05:25.010032Z 2 [Note] Slave I/O thread for channel '': connected to master 'rpl_user@192.168.55.171:3306',replication started in log 'FIRST' at position 4

You can then verify the replication status with the following statement:

$ docker exec -it mysql-slave-1h mysql -uroot -p -e 'show slave status\G'
...
             Slave_IO_Running: Yes
            Slave_SQL_Running: Yes
                    SQL_Delay: 3600
                Auto_Position: 1
...

At this point, our 15-minute and 1-hour MySQL delayed slave containers are replicating from the master and our architecture is looking something like this:

6-hour Delayed Slave

Prepare the MySQL configuration file for our 6-hour delayed slave:

$ vim /storage/mysql-slave-6h/mysql.conf.d/my.cnf

And add the following lines:

[mysqld]
server_id=10006
binlog_format=ROW
log_bin=binlog
log_slave_updates=1
gtid_mode=ON
enforce_gtid_consistency=1
relay_log=relay-bin
expire_logs_days=7
read_only=ON

** The server-id value we used for this slave is 10006.

Next, under the /storage/mysql-slave-6h/sql directory, create two SQL files: one to RESET MASTER (1reset_master.sql) and another to establish the replication link using the CHANGE MASTER statement (3setup_slave.sql).

Create a text file 1reset_master.sql and add the following line:

RESET MASTER;

Create a text file 3setup_slave.sql and add the following lines:

CHANGE MASTER TO MASTER_HOST = '192.168.55.171', MASTER_USER = 'rpl_user', MASTER_PASSWORD = 'YlgSH6bLLy', MASTER_AUTO_POSITION = 1, MASTER_DELAY=21600;
START SLAVE;

MASTER_DELAY=21600 is equal to 6 hours (in seconds). Then, copy the backup file taken from our master (that has been transferred into our Docker host) to the "sql" directory and rename it to 2mysqldump_complete.sql.gz:

$ cp ~/mysqldump_complete.sql.gz /storage/mysql-slave-6h/sql/2mysqldump_complete.sql.gz

The final look of our "sql" directory should be something like this:

$ pwd
/storage/mysql-slave-6h/sql
$ ls -1
1reset_master.sql
2mysqldump_complete.sql.gz
3setup_slave.sql

Take note that we prefix the SQL filename with an integer to determine the execution order when Docker initializes the MySQL container.

Once everything is in place, run the MySQL container for our 6-hour delayed slave:

$ docker run -d \
--name mysql-slave-6h \
-e MYSQL_ROOT_PASSWORD=password \
--mount type=bind,source=/storage/mysql-slave-6h/datadir,target=/var/lib/mysql \
--mount type=bind,source=/storage/mysql-slave-6h/mysql.conf.d,target=/etc/mysql/mysql.conf.d \
--mount type=bind,source=/storage/mysql-slave-6h/sql,target=/docker-entrypoint-initdb.d \
mysql:5.7

** The MYSQL_ROOT_PASSWORD value must be the same as the MySQL root password on the master.

The following lines are what we are looking for to verify if MySQL is running correctly and connected as a slave to our master (192.168.55.171):

$ docker logs -f mysql-slave-6h
...
2018-12-04T04:05:24.890244Z 0 [Note] mysqld: ready for connections.
Version: '5.7.24-log'  socket: '/var/run/mysqld/mysqld.sock'  port: 3306  MySQL Community Server (GPL)
2018-12-04T04:05:25.010032Z 2 [Note] Slave I/O thread for channel '': connected to master 'rpl_user@192.168.55.171:3306',replication started in log 'FIRST' at position 4

You can then verify the replication status with the following statement:

$ docker exec -it mysql-slave-6h mysql -uroot -p -e 'show slave status\G'
...
             Slave_IO_Running: Yes
            Slave_SQL_Running: Yes
                    SQL_Delay: 21600
                Auto_Position: 1
...

At this point, our 15-minute, 1-hour and 6-hour delayed slave containers are replicating correctly and our architecture is looking something like this:

Disaster Recovery Scenario

Let's say a user has accidentally dropped a column from a big table. Consider the following statement, executed on the master:

mysql> ALTER TABLE settings DROP COLUMN status;

If you are lucky enough to realize it immediately, you could use the 15-minute delayed slave to catch up to the moment just before the disaster happened and promote it to become the master, or export the missing data out and restore it on the master.

Firstly, we have to find the binary log position before the disaster happened. Grab the time now() on the master:

mysql> SELECT now();
+---------------------+
| now()               |
+---------------------+
| 2018-12-04 14:55:41 |
+---------------------+

Then, get the active binary log file on the master:

mysql> SHOW MASTER STATUS;
+---------------+----------+--------------+------------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| File          | Position | Binlog_Do_DB | Binlog_Ignore_DB | Executed_Gtid_Set                                                                                                                                                                     |
+---------------+----------+--------------+------------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| binlog.000004 | 20260658 |              |                  | 1560665e-ed2b-11e8-93fa-000c29b7f985:1-12031,
1b235f7a-d37b-11e8-9c3e-000c29bafe8f:1-62519,
1d8dc60a-e817-11e8-82ff-000c29bafe8f:1-326575,
791748b3-d37a-11e8-b03a-000c29b7f985:1-374 |
+---------------+----------+--------------+------------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+

Using the same date format, extract the information that we want from the binary log, binlog.000004. We estimate the start time to read from the binlog around 20 minutes ago (2018-12-04 14:35:00) and filter the output to show 25 lines before the "drop column" statement:

$ mysqlbinlog --start-datetime="2018-12-04 14:35:00" --stop-datetime="2018-12-04 14:55:41" /var/lib/mysql/binlog.000004 | grep -i -B 25 "drop column"
# at 19379172
#181204 14:54:45 server id 1  end_log_pos 19379232 CRC32 0x0716e7a2     Table_map: `shop`.`settings` mapped to number 766
# at 19379232
#181204 14:54:45 server id 1  end_log_pos 19379460 CRC32 0xa6187edd     Write_rows: table id 766 flags: STMT_END_F

BINLOG '
tSQGXBMBAAAAPAAAACC0JwEAAP4CAAAAAAEABnNidGVzdAAHc2J0ZXN0MgAFAwP+/gME/nj+PBCi
5xYH
tSQGXB4BAAAA5AAAAAS1JwEAAP4CAAAAAAEAAgAF/+AYwwAAysYAAHc0ODYyMjI0NjI5OC0zNDE2
OTY3MjY5OS02MDQ1NTQwOTY1Ny01MjY2MDQ0MDcwOC05NDA0NzQzOTUwMS00OTA2MTAxNzgwNC05
OTIyMzM3NzEwOS05NzIwMzc5NTA4OC0yODAzOTU2NjQ2MC0zNzY0ODg3MTYzOTswMTM0MjAwNTcw
Ni02Mjk1ODMzMzExNi00NzQ1MjMxODA1OS0zODk4MDQwMjk5MS03OTc4MTA3OTkwNQEAAADdfhim
'/*!*/;
# at 19379460
#181204 14:54:45 server id 1  end_log_pos 19379491 CRC32 0x71f00e63     Xid = 622405
COMMIT/*!*/;
# at 19379491
#181204 14:54:46 server id 1  end_log_pos 19379556 CRC32 0x62b78c9e     GTID    last_committed=11507    sequence_number=11508   rbr_only=no
SET @@SESSION.GTID_NEXT= '1560665e-ed2b-11e8-93fa-000c29b7f985:11508'/*!*/;
# at 19379556
#181204 14:54:46 server id 1  end_log_pos 19379672 CRC32 0xc222542a     Query   thread_id=3162  exec_time=1     error_code=0
SET TIMESTAMP=1543906486/*!*/;
/*!\C utf8 *//*!*/;
SET @@session.character_set_client=33,@@session.collation_connection=33,@@session.collation_server=8/*!*/;
ALTER TABLE settings DROP COLUMN status

In the bottom few lines of the mysqlbinlog output, you can see that the erroneous command was executed at position 19379556. The position that we should restore to is one event before this, at position 19379491. This is the binlog position up to which we want our delayed slave to apply events.

Then, on the chosen delayed slave, stop the slave and start it again until the end position that we figured out above:

$ docker exec -it mysql-slave-15m mysql -uroot -p
mysql> STOP SLAVE;
mysql> START SLAVE UNTIL MASTER_LOG_FILE = 'binlog.000004', MASTER_LOG_POS = 19379491;

Monitor the replication status and wait until Exec_Master_Log_Pos is equal to Until_Log_Pos value. This could take some time. Once caught up, you should see the following:

$ docker exec -it mysql-slave-15m mysql -uroot -p -e 'SHOW SLAVE STATUS\G'
... 
          Exec_Master_Log_Pos: 19379491
              Relay_Log_Space: 50552186
              Until_Condition: Master
               Until_Log_File: binlog.000004
                Until_Log_Pos: 19379491
...

Finally, verify that the missing data that we were looking for is there (the column "status" still exists):

mysql> DESCRIBE shop.settings;
+--------+------------------+------+-----+---------+----------------+
| Field  | Type             | Null | Key | Default | Extra          |
+--------+------------------+------+-----+---------+----------------+
| id     | int(10) unsigned | NO   | PRI | NULL    | auto_increment |
| sid    | int(10) unsigned | NO   | MUL | 0       |                |
| param  | varchar(100)     | NO   |     |         |                |
| value  | varchar(255)     | NO   |     |         |                |
| status | int(11)          | YES  |     | 1       |                |
+--------+------------------+------+-----+---------+----------------+

Then export the table from our slave container and transfer it to the master server:

$ docker exec -it mysql-slave-15m mysqldump -uroot -ppassword --single-transaction shop settings > shop_settings.sql

Drop the problematic table and restore it back on the master:

$ mysql -uroot -p -e 'DROP TABLE shop.settings'
$ mysql -uroot -p shop < shop_settings.sql

We have now recovered our table back to its original state before the disastrous event. To summarize, delayed replication can be used for several purposes:

  • To protect against user mistakes on the master. A DBA can roll back a delayed slave to the time just before the disaster (and remove the delay again afterwards, as sketched after this list).
  • To test how the system behaves when there is a lag. For example, in an application, a lag might be caused by a heavy load on the slave. However, it can be difficult to generate this load level. Delayed replication can simulate the lag without having to simulate the load. It can also be used to debug conditions related to a lagging slave.
  • To inspect what the database looked like in the past, without having to reload a backup. For example, if the delay is one week and the DBA needs to see what the database looked like before the last few days' worth of development, the delayed slave can be inspected.
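
As a complement to the first point, once a delayed slave has served its recovery purpose (or you want it to catch up in real time before promoting it), the delay can simply be cleared again - a minimal sketch:

mysql> STOP SLAVE;
mysql> CHANGE MASTER TO MASTER_DELAY = 0;
mysql> START SLAVE;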

Final Thoughts

With Docker, running multiple MySQL instances on the same physical host can be done efficiently. You may use Docker orchestration tools like Docker Compose and Swarm to simplify the multi-container deployment, as opposed to the steps shown in this blog post.
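
For illustration, a minimal docker-compose.yml sketch for the 15-minute slave could look like the following (the other two slaves follow the same pattern with their own /storage directories; treat this as a starting point rather than a complete recipe):

version: '3'
services:
  mysql-slave-15m:
    image: mysql:5.7
    container_name: mysql-slave-15m
    environment:
      MYSQL_ROOT_PASSWORD: password   # must match the MySQL root password on the master
    volumes:
      - /storage/mysql-slave-15m/datadir:/var/lib/mysql
      - /storage/mysql-slave-15m/mysql.conf.d:/etc/mysql/mysql.conf.d
      - /storage/mysql-slave-15m/sql:/docker-entrypoint-initdb.d
  # mysql-slave-1h and mysql-slave-6h are defined the same way,
  # pointing at /storage/mysql-slave-1h and /storage/mysql-slave-6h respectively.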

Deploying MongoDB Using Docker


The main advantage of using MongoDB is that it’s easy to use. One can easily install MongoDB and start working on it in minutes. Docker makes this process even easier.

One cool thing about Docker is that, with very little effort and some configuration, we can spin up a container and start working on any technology. In this article, we will spin up a MongoDB container using Docker and learn how to attach the storage volume from a host system to a container.

Prerequisites for Deploying MongoDB on Docker

We will only need Docker installed in the system for this tutorial.

Creating a MongoDB Image

First create a folder and create a file with the name Dockerfile inside that folder:

$ mkdir mongo-with-docker
$ cd mongo-with-docker
$ vi Dockerfile

Paste this content in your Dockerfile:

FROM debian:jessie-slim
RUN apt-get update && \
apt-get install -y ca-certificates && \
rm -rf /var/lib/apt/lists/*
RUN gpg --keyserver ha.pool.sks-keyservers.net --recv-keys 0C49F3730359A14518585931BC711F9BA15703C6 && \
gpg --export 0C49F3730359A14518585931BC711F9BA15703C6 > /etc/apt/trusted.gpg.d/mongodb.gpg
ARG MONGO_PACKAGE=mongodb-org
ARG MONGO_REPO=repo.mongodb.org
ENV MONGO_PACKAGE=${MONGO_PACKAGE} MONGO_REPO=${MONGO_REPO}
ENV MONGO_MAJOR 3.4
ENV MONGO_VERSION 3.4.18
RUN echo "deb http://$MONGO_REPO/apt/debian jessie/${MONGO_PACKAGE%-unstable}/$MONGO_MAJOR main" | tee "/etc/apt/sources.list.d/${MONGO_PACKAGE%-unstable}.list"
RUN echo "/etc/apt/sources.list.d/${MONGO_PACKAGE%-unstable}.list"
RUN apt-get update
RUN apt-get install -y ${MONGO_PACKAGE}=$MONGO_VERSION
VOLUME ["/data/db"]
WORKDIR /data
EXPOSE 27017
CMD ["mongod", "--smallfiles"]

Then run this command to build your own MongoDB Docker image:

docker build -t hello-mongo:latest .

Understanding the Docker File Content

The structure of each line in the Dockerfile is as follows:

INSTRUCTION arguments
  • FROM: The base image from which we'll start building the container.
  • RUN: Executes the instructions to install MongoDB in the base image.
  • ARG: Stores some default values for the Docker build. These values are not available to the container. They can be overridden during the build process of the image using the --build-arg argument.
  • ENV: These values are available during the build phase as well as after launching the container. They can be overridden by passing the -e argument to the docker run command.
  • VOLUME: Attaches the /data/db volume to the container.
  • WORKDIR: Sets the working directory in which any RUN or CMD commands are executed.
  • EXPOSE: Exposes the container's port to the host system (outside world).
  • CMD: Starts the mongod instance in the container.

Starting the MongoDB Container From the Image

You can start the MongoDB container by issuing the following command:

docker run --name my-mongo -d -v /tmp/mongodb:/data/db -p 27017:27017 hello-mongo
  • --name: Name of the container.
  • -d: Starts the container as a background (daemon) process. Don't specify this argument to run the container as a foreground process.
  • -v: Attaches the /tmp/mongodb directory of the host system to the /data/db directory of the container.
  • -p: Maps the host port to the container port.
  • The last argument is the name/id of the image.

To check whether the container is running or not, issue the following command:

docker ps

Output of this command should look like the following:

CONTAINER ID        IMAGE               COMMAND                 CREATED             STATUS              PORTS                      NAMES
a7e04bae0c53        hello-mongo         "mongod --smallfiles"   7 seconds ago       Up 6 seconds        0.0.0.0:27017->27017/tcp   my-mongo

Accessing MongoDB From the Host

Once the container is up and running, we can access it the same way as a remote MongoDB instance. You can use any utility like Compass or Robomongo to connect to this instance. For now, I'll use the mongo command to connect. Run the following command in your terminal:

mongo localhost:27017

It will open mongo shell where you can execute any mongo commands. Now we’ll create one database and add some data in it.

use mydb
db.myColl.insert({"name": "severalnines"})
quit()

Now to check whether our volume mapping is correct or not, we will restart the container and check whether it has our data or not.

docker restart <container_id>

Now connect to the mongo shell again, switch to our database and run this command:

use mydb
db.myColl.find().pretty()

You should see this result:

{ "_id" : ObjectId("5be7e05d20aab8d0622adf46"), "name" : "severalnines" }

This means our container is persisting the database data even after restarting it. This is possible because of the volume mapping. The container stores all our data in the /tmp/mongodb directory on the host system. So even if the container is removed and a new one is created, the new container will pick up the data from the host's /tmp/mongodb directory.
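
To double-check the volume mapping of a running container, you can inspect its mounts and look at the data files on the host (a quick sanity check):

$ docker inspect --format '{{ json .Mounts }}' my-mongo
$ ls /tmp/mongodb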

Accessing MongoDB Container Shell

$ docker exec -it <container-name> /bin/bash

Accessing MongoDB Container Logs

$ docker logs <container-name>

Connecting to the MongoDB Container From Another Container

You can connect to the MongoDB container from any other container using the --link argument, which has the following structure:

--link <Container Name/Id>:<Alias>

Where Alias is an alias for the link name. Run this command to link our Mongo container with the mongo-express container:

docker run --link my-mongo:mongo -p 8081:8081 mongo-express

This command will pull the mongo-express image from Docker Hub and start a new container. Mongo-express is an admin UI for MongoDB. Now go to http://localhost:8081 to access this interface.

Mongo-express Admin UI

Conclusion

In this article, we learned how to deploy a MongoDB image from scratch and how to create a MongoDB container using Docker. We also went through some important concepts like volume mapping and connecting to a MongoDB container from another container using links.

Docker eases the process of deploying multiple MongoDB instances. We can use the same MongoDB image to build any number of containers, which can be used for creating Replica Sets. To make this process even smoother, we can write a YAML file (configuration file) and use the docker-compose utility to deploy all the containers with a single command, as sketched below.
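
A minimal docker-compose.yml sketch for the setup in this article could look like the following (assuming the hello-mongo image built earlier; mongo-express connects to the service named "mongo" by default):

version: '3'
services:
  mongo:
    image: hello-mongo:latest   # the image we built earlier
    ports:
      - "27017:27017"
    volumes:
      - /tmp/mongodb:/data/db
  mongo-express:
    image: mongo-express
    ports:
      - "8081:8081"
    depends_on:
      - mongo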

How to Run and Configure ProxySQL 2.0 for MySQL Galera Cluster on Docker


ProxySQL is an intelligent and high-performance SQL proxy which supports MySQL, MariaDB and ClickHouse. Recently, ProxySQL 2.0 has become GA and it comes with new exciting features such as GTID consistent reads, frontend SSL, Galera and MySQL Group Replication native support.

It is relatively easy to run ProxySQL as a Docker container. We have previously written about how to run ProxySQL on Kubernetes as a helper container or as a Kubernetes service, which is based on ProxySQL 1.x. In this blog post, we are going to use the new version, ProxySQL 2.x, which uses a different approach for Galera Cluster configuration.

ProxySQL 2.x Docker Image

We have released a new ProxySQL 2.0 Docker image and it's available on Docker Hub. The README provides a number of configuration examples, particularly for Galera and MySQL Replication, pre and post v2.x. The configuration lines can be defined in a text file and mapped into the container's path at /etc/proxysql.cnf to be loaded into the ProxySQL service.

The image "latest" tag still points to 1.x until ProxySQL 2.0 officially becomes GA (we haven't seen any official release blog/article from ProxySQL team yet). Which means, whenever you install ProxySQL image using latest tag from Severalnines, you will still get version 1.x with it. Take note the new example configurations also enable ProxySQL web stats (introduced in 1.4.4 but still in beta) - a simple dashboard that summarizes the overall configuration and status of ProxySQL itself.

ProxySQL 2.x Support for Galera Cluster

Let's talk about Galera Cluster native support in greater detail. The new mysql_galera_hostgroups table consists of the following fields:

  • writer_hostgroup: ID of the hostgroup that will contain all the members that are writers (read_only=0).
  • backup_writer_hostgroup: If the cluster is running in multi-writer mode (i.e. there are multiple nodes with read_only=0) and max_writers is set to a smaller number than the total number of nodes, the additional nodes are moved to this backup writer hostgroup.
  • reader_hostgroup: ID of the hostgroup that will contain all the members that are readers (i.e. nodes that have read_only=1)
  • offline_hostgroup: When ProxySQL monitoring determines a host to be OFFLINE, the host will be moved to the offline_hostgroup.
  • active: a boolean value (0 or 1) to activate a hostgroup
  • max_writers: Controls the maximum number of allowable nodes in the writer hostgroup, as mentioned previously, additional nodes will be moved to the backup_writer_hostgroup.
  • writer_is_also_reader: When 1, a node in the writer_hostgroup will also be placed in the reader_hostgroup so that it will be used for reads. When set to 2, the nodes from backup_writer_hostgroup will be placed in the reader_hostgroup, instead of the node(s) in the writer_hostgroup.
  • max_transactions_behind: determines the maximum number of writesets a node in the cluster can have queued before the node is SHUNNED to prevent stale reads (this is determined by querying the wsrep_local_recv_queue Galera variable).
  • comment: Text field that can be used for any purposes defined by the user

Here is an example configuration for mysql_galera_hostgroups in table format:

Admin> select * from mysql_galera_hostgroups\G
*************************** 1. row ***************************
       writer_hostgroup: 10
backup_writer_hostgroup: 20
       reader_hostgroup: 30
      offline_hostgroup: 9999
                 active: 1
            max_writers: 1
  writer_is_also_reader: 2
max_transactions_behind: 20
                comment: 

ProxySQL performs Galera health checks by monitoring the following MySQL status variables and system variables (you can inspect them manually as sketched after this list):

  • read_only - If ON, then ProxySQL will group the defined host into reader_hostgroup unless writer_is_also_reader is 1.
  • wsrep_desync - If ON, ProxySQL will mark the node as unavailable, moving it to offline_hostgroup.
  • wsrep_reject_queries - If this variable is ON, ProxySQL will mark the node as unavailable, moving it to the offline_hostgroup (useful in certain maintenance situations).
  • wsrep_sst_donor_rejects_queries - If this variable is ON, ProxySQL will mark the node as unavailable while the Galera node is serving as an SST donor, moving it to the offline_hostgroup.
  • wsrep_local_state - If this status returns other than 4 (4 means Synced), ProxySQL will mark the node as unavailable and move it into offline_hostgroup.
  • wsrep_local_recv_queue - If this status is higher than max_transactions_behind, the node will be shunned.
  • wsrep_cluster_status - If this status returns other than Primary, ProxySQL will mark the node as unavailable and move it into offline_hostgroup.
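
For reference, these are the same values you can query manually on any Galera node:

mysql> SHOW GLOBAL STATUS LIKE 'wsrep_local_state';
mysql> SHOW GLOBAL STATUS LIKE 'wsrep_local_recv_queue';
mysql> SHOW GLOBAL STATUS LIKE 'wsrep_cluster_status';
mysql> SHOW GLOBAL VARIABLES LIKE 'read_only';
mysql> SHOW GLOBAL VARIABLES LIKE 'wsrep_desync';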

Having said that, by combining these new parameters in mysql_galera_hostgroups together with mysql_query_rules, ProxySQL 2.x has the flexibility to fit many more Galera use cases. For example, one can have single-writer, multi-writer and multi-reader hostgroups defined as the destination hostgroups of query rules, with the ability to limit the number of writers and finer control over the stale-read behaviour.

Contrast this to ProxySQL 1.x, where the user had to explicitly define a scheduler to call an external script to perform the backend health checks and update the database servers' state. This requires some customization of the script (the user has to update the ProxySQL admin user/password/port), plus it depends on an additional tool (MySQL client) to connect to the ProxySQL admin interface.

Here is an example configuration of Galera health check script scheduler in table format for ProxySQL 1.x:

Admin> select * from scheduler\G
*************************** 1. row ***************************
         id: 1
     active: 1
interval_ms: 2000
   filename: /usr/share/proxysql/tools/proxysql_galera_checker.sh
       arg1: 10
       arg2: 20
       arg3: 1
       arg4: 1
       arg5: /var/lib/proxysql/proxysql_galera_checker.log
    comment:

Besides, since the ProxySQL scheduler thread executes any script independently, there are many versions of health check scripts available out there. All ProxySQL instances deployed by ClusterControl use the default script provided by the ProxySQL installer package.

In ProxySQL 2.x, max_writers and writer_is_also_reader variables can determine how ProxySQL dynamically groups the backend MySQL servers and will directly affect the connection distribution and query routing. For example, consider the following MySQL backend servers:

Admin> select hostgroup_id, hostname, status, weight from mysql_servers;
+--------------+--------------+--------+--------+
| hostgroup_id | hostname     | status | weight |
+--------------+--------------+--------+--------+
| 10           | DB1          | ONLINE | 1      |
| 10           | DB2          | ONLINE | 1      |
| 10           | DB3          | ONLINE | 1      |
+--------------+--------------+--------+--------+

Together with the following Galera hostgroups definition:

Admin> select * from mysql_galera_hostgroups\G
*************************** 1. row ***************************
       writer_hostgroup: 10
backup_writer_hostgroup: 20
       reader_hostgroup: 30
      offline_hostgroup: 9999
                 active: 1
            max_writers: 1
  writer_is_also_reader: 2
max_transactions_behind: 20
                comment: 

Considering all hosts are up and running, ProxySQL will most likely group the hosts as below:

Let's look at them one by one:

writer_is_also_reader=0
  • Groups the hosts into 2 hostgroups (writer and backup_writer).
  • The writer is part of the backup_writer.
  • Since the writer is not a reader, there is nothing in hostgroup 30 (reader), because none of the hosts are set with read_only=1. It is not a common practice in Galera to enable the read-only flag.

writer_is_also_reader=1
  • Groups the hosts into 3 hostgroups (writer, backup_writer and reader).
  • Variable read_only=0 in Galera has no effect, thus the writer is also in hostgroup 30 (reader).
  • The writer is not part of backup_writer.

writer_is_also_reader=2
  • Similar to writer_is_also_reader=1; however, the writer is part of backup_writer.

With this configuration, one can have various choices for hostgroup destination to cater for specific workloads. "Hotspot" writes can be configured to go to only one server to reduce multi-master conflicts, non-conflicting writes can be distributed equally on the other masters, most reads can be distributed evenly on all MySQL servers or non-writers, critical reads can be forwarded to the most up-to-date servers and analytical reads can be forwarded to a slave replica.

ProxySQL Deployment for Galera Cluster

In this example, suppose we already have a three-node Galera Cluster deployed by ClusterControl as shown in the following diagram:

Our Wordpress applications are running on Docker, while the Wordpress database is hosted on our Galera Cluster running on bare-metal servers. We decided to run a ProxySQL container alongside our Wordpress containers to have better control over Wordpress database query routing and to fully utilize our database cluster infrastructure. Since the read-write ratio is around 80%-20%, we want to configure ProxySQL to:

  • Forward all writes to one Galera node (less conflict, focus on write)
  • Balance all reads to the other two Galera nodes (better distribution for the majority of the workload)

Firstly, create a ProxySQL configuration file inside the Docker host so we can map it into our container:

$ mkdir /root/proxysql
$ vim /root/proxysql/proxysql.cnf

Then, copy the following lines (we will explain the configuration lines further down):

datadir="/var/lib/proxysql"

admin_variables=
{
    admin_credentials="admin:admin"
    mysql_ifaces="0.0.0.0:6032"
    refresh_interval=2000
    web_enabled=true
    web_port=6080
    stats_credentials="stats:admin"
}

mysql_variables=
{
    threads=4
    max_connections=2048
    default_query_delay=0
    default_query_timeout=36000000
    have_compress=true
    poll_timeout=2000
    interfaces="0.0.0.0:6033;/tmp/proxysql.sock"
    default_schema="information_schema"
    stacksize=1048576
    server_version="5.1.30"
    connect_timeout_server=10000
    monitor_history=60000
    monitor_connect_interval=200000
    monitor_ping_interval=200000
    ping_interval_server_msec=10000
    ping_timeout_server=200
    commands_stats=true
    sessions_sort=true
    monitor_username="proxysql"
    monitor_password="proxysqlpassword"
    monitor_galera_healthcheck_interval=2000
    monitor_galera_healthcheck_timeout=800
}

mysql_galera_hostgroups =
(
    {
        writer_hostgroup=10
        backup_writer_hostgroup=20
        reader_hostgroup=30
        offline_hostgroup=9999
        max_writers=1
        writer_is_also_reader=1
        max_transactions_behind=30
        active=1
    }
)

mysql_servers =
(
    { address="db1.cluster.local" , port=3306 , hostgroup=10, max_connections=100 },
    { address="db2.cluster.local" , port=3306 , hostgroup=10, max_connections=100 },
    { address="db3.cluster.local" , port=3306 , hostgroup=10, max_connections=100 }
)

mysql_query_rules =
(
    {
        rule_id=100
        active=1
        match_pattern="^SELECT .* FOR UPDATE"
        destination_hostgroup=10
        apply=1
    },
    {
        rule_id=200
        active=1
        match_pattern="^SELECT .*"
        destination_hostgroup=20
        apply=1
    },
    {
        rule_id=300
        active=1
        match_pattern=".*"
        destination_hostgroup=10
        apply=1
    }
)

mysql_users =
(
    { username = "wordpress", password = "passw0rd", default_hostgroup = 10, transaction_persistent = 0, active = 1 },
    { username = "sbtest", password = "passw0rd", default_hostgroup = 10, transaction_persistent = 0, active = 1 }
)

Now, let's go through some of the most important configuration sections. Firstly, we define the Galera hostgroups configuration as below:

mysql_galera_hostgroups =
(
    {
        writer_hostgroup=10
        backup_writer_hostgroup=20
        reader_hostgroup=30
        offline_hostgroup=9999
        max_writers=1
        writer_is_also_reader=1
        max_transactions_behind=30
        active=1
    }
)

Hostgroup 10 will be the writer_hostgroup, hostgroup 20 the backup_writer and hostgroup 30 the reader. We set max_writers to 1 so we can have a single-writer hostgroup (hostgroup 10) where all writes are sent. Then, we set writer_is_also_reader to 1, which makes all Galera nodes readers as well, suitable for queries that can be equally distributed to all nodes. Hostgroup 9999 is reserved as the offline_hostgroup, in case ProxySQL detects non-operational Galera nodes.

Then, we configure our MySQL servers, all defaulting to hostgroup 10:

mysql_servers =
(
    { address="db1.cluster.local" , port=3306 , hostgroup=10, max_connections=100 },
    { address="db2.cluster.local" , port=3306 , hostgroup=10, max_connections=100 },
    { address="db3.cluster.local" , port=3306 , hostgroup=10, max_connections=100 }
)

With the above configurations, ProxySQL will "see" our hostgroups as below:

Then, we define the query routing through query rules. Based on our requirement, all reads should be sent to all Galera nodes except the writer (hostgroup 20), and everything else is forwarded to hostgroup 10 for the single writer:

mysql_query_rules =
(
    {
        rule_id=100
        active=1
        match_pattern="^SELECT .* FOR UPDATE"
        destination_hostgroup=10
        apply=1
    },
    {
        rule_id=200
        active=1
        match_pattern="^SELECT .*"
        destination_hostgroup=20
        apply=1
    },
    {
        rule_id=300
        active=1
        match_pattern=".*"
        destination_hostgroup=10
        apply=1
    }
)

Finally, we define the MySQL users that will be passed through ProxySQL:

mysql_users =
(
    { username = "wordpress", password = "passw0rd", default_hostgroup = 10, transaction_persistent = 0, active = 1 },
    { username = "sbtest", password = "passw0rd", default_hostgroup = 10, transaction_persistent = 0, active = 1 }
)

We set transaction_persistent to 0 so all connections coming from these users will respect the query rules for read and write routing. Otherwise, the connections would end up hitting one hostgroup, which defeats the purpose of load balancing. Do not forget to create those users first on all MySQL servers (a minimal sketch follows below). For ClusterControl users, you may use the Manage -> Schemas and Users feature to create them.
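
As a rough sketch (adjust the grant scope to your own security requirements), the application users and the monitoring user referenced in the configuration above could be created on one Galera node like this - DDL statements are replicated to the rest of the cluster:

mysql> CREATE USER 'wordpress'@'%' IDENTIFIED BY 'passw0rd';
mysql> GRANT ALL PRIVILEGES ON wordpress.* TO 'wordpress'@'%';
mysql> CREATE USER 'sbtest'@'%' IDENTIFIED BY 'passw0rd';
mysql> GRANT ALL PRIVILEGES ON sbtest.* TO 'sbtest'@'%';
mysql> CREATE USER 'proxysql'@'%' IDENTIFIED BY 'proxysqlpassword';
mysql> GRANT USAGE ON *.* TO 'proxysql'@'%';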

We are now ready to start our container. We are going to map the ProxySQL configuration file as a bind mount when starting up the ProxySQL container. Thus, the run command will be:

$ docker run -d \
--name proxysql2 \
--hostname proxysql2 \
--publish 6033:6033 \
--publish 6032:6032 \
--publish 6080:6080 \
--restart=unless-stopped \
-v /root/proxysql/proxysql.cnf:/etc/proxysql.cnf \
severalnines/proxysql:2.0

Finally, change the WordPress database host to point to the ProxySQL container on port 6033, for instance:

$ docker run -d \
--name wordpress \
--publish 80:80 \
--restart=unless-stopped \
-e WORDPRESS_DB_HOST=proxysql2:6033 \
-e WORDPRESS_DB_USER=wordpress \
-e WORDPRESS_DB_PASSWORD=passw0rd \
wordpress

At this point, our architecture is looking something like this:

If you want the ProxySQL container to be persistent, map /var/lib/proxysql/ to a Docker volume or a bind mount, for example:

$ docker run -d \
--name proxysql2 \
--hostname proxysql2 \
--publish 6033:6033 \
--publish 6032:6032 \
--publish 6080:6080 \
--restart=unless-stopped \
-v /root/proxysql/proxysql.cnf:/etc/proxysql.cnf \
-v proxysql-volume:/var/lib/proxysql \
severalnines/proxysql:2.0

Keep in mind that running with persistent storage as above makes our /root/proxysql/proxysql.cnf obsolete from the second start onwards. This is due to ProxySQL's multi-layer configuration: if /var/lib/proxysql/proxysql.db exists, ProxySQL skips loading options from the configuration file and loads whatever is in the SQLite database instead (unless you start the proxysql service with the --initial flag). Having said that, any further ProxySQL configuration management has to be performed via the ProxySQL admin console on port 6032, instead of via the configuration file.
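
As a sketch of that workflow (the max_connections value is only an example), a change made in the admin console must be loaded to runtime and saved to disk to survive restarts:

Admin> UPDATE mysql_servers SET max_connections=150 WHERE hostname='db1.cluster.local';
Admin> LOAD MYSQL SERVERS TO RUNTIME;
Admin> SAVE MYSQL SERVERS TO DISK;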

Monitoring

The ProxySQL process logs to syslog by default, and you can view it by using the standard Docker commands:

$ docker ps
$ docker logs proxysql2

To verify the current hostgroup, query the runtime_mysql_servers table:

$ docker exec -it proxysql2 mysql -uadmin -padmin -h127.0.0.1 -P6032 --prompt='Admin> '
Admin> select hostgroup_id,hostname,status from runtime_mysql_servers;
+--------------+--------------+--------+
| hostgroup_id | hostname     | status |
+--------------+--------------+--------+
| 10           | 192.168.0.21 | ONLINE |
| 30           | 192.168.0.21 | ONLINE |
| 30           | 192.168.0.22 | ONLINE |
| 30           | 192.168.0.23 | ONLINE |
| 20           | 192.168.0.22 | ONLINE |
| 20           | 192.168.0.23 | ONLINE |
+--------------+--------------+--------+

If the selected writer goes down, it will be transferred to the offline_hostgroup (HID 9999):

Admin> select hostgroup_id,hostname,status from runtime_mysql_servers;
+--------------+--------------+--------+
| hostgroup_id | hostname     | status |
+--------------+--------------+--------+
| 10           | 192.168.0.22 | ONLINE |
| 9999         | 192.168.0.21 | ONLINE |
| 30           | 192.168.0.22 | ONLINE |
| 30           | 192.168.0.23 | ONLINE |
| 20           | 192.168.0.23 | ONLINE |
+--------------+--------------+--------+

The above topology changes can be illustrated in the following diagram:

We have also enabled the web stats UI with admin-web_enabled=true. To access the web UI, simply go to the Docker host on port 6080, for example http://192.168.0.200:6080, and you will be prompted with a username/password pop-up. Enter the credentials as defined under admin-stats_credentials and you should see the following page:

By monitoring the MySQL connection pool table, we can get an overview of the connection distribution across all hostgroups:

Admin> select hostgroup, srv_host, status, ConnUsed, MaxConnUsed, Queries from stats.stats_mysql_connection_pool order by srv_host;
+-----------+--------------+--------+----------+-------------+---------+
| hostgroup | srv_host     | status | ConnUsed | MaxConnUsed | Queries |
+-----------+--------------+--------+----------+-------------+---------+
| 20        | 192.168.0.23 | ONLINE | 5        | 24          | 11458   |
| 30        | 192.168.0.23 | ONLINE | 0        | 0           | 0       |
| 20        | 192.168.0.22 | ONLINE | 2        | 24          | 11485   |
| 30        | 192.168.0.22 | ONLINE | 0        | 0           | 0       |
| 10        | 192.168.0.21 | ONLINE | 32       | 32          | 9746    |
| 30        | 192.168.0.21 | ONLINE | 0        | 0           | 0       |
+-----------+--------------+--------+----------+-------------+---------+

The output above shows that hostgroup 30 does not process anything, because our query rules do not use this hostgroup as a destination hostgroup.
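
If you did want hostgroup 30 to serve part of the read traffic, a sketch of an additional rule could look like the following (the rule_id and match pattern are illustrative; rules are evaluated in rule_id order, so it must sit before rule 200 to match first):

    {
        rule_id=150
        active=1
        match_pattern="^SELECT .* FROM reporting\..*"
        destination_hostgroup=30
        apply=1
    },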

The statistics related to the Galera nodes can be viewed in the mysql_server_galera_log table:

Admin>  select * from mysql_server_galera_log order by time_start_us desc limit 3\G
*************************** 1. row ***************************
                       hostname: 192.168.0.23
                           port: 3306
                  time_start_us: 1552992553332489
                success_time_us: 2045
              primary_partition: YES
                      read_only: NO
         wsrep_local_recv_queue: 0
              wsrep_local_state: 4
                   wsrep_desync: NO
           wsrep_reject_queries: NO
wsrep_sst_donor_rejects_queries: NO
                          error: NULL
*************************** 2. row ***************************
                       hostname: 192.168.0.22
                           port: 3306
                  time_start_us: 1552992553329653
                success_time_us: 2799
              primary_partition: YES
                      read_only: NO
         wsrep_local_recv_queue: 0
              wsrep_local_state: 4
                   wsrep_desync: NO
           wsrep_reject_queries: NO
wsrep_sst_donor_rejects_queries: NO
                          error: NULL
*************************** 3. row ***************************
                       hostname: 192.168.0.21
                           port: 3306
                  time_start_us: 1552992553329013
                success_time_us: 2715
              primary_partition: YES
                      read_only: NO
         wsrep_local_recv_queue: 0
              wsrep_local_state: 4
                   wsrep_desync: NO
           wsrep_reject_queries: NO
wsrep_sst_donor_rejects_queries: NO
                          error: NULL

The resultset returns the related MySQL variable/status state for every Galera node at a particular timestamp. In this configuration, the Galera health check runs every 2 seconds (monitor_galera_healthcheck_interval=2000). Hence, the maximum failover time would be around 2 seconds if a topology change happens in the cluster.
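
To double-check the health check cadence at runtime, you can query the monitor-related variables from the admin console:

Admin> SELECT * FROM global_variables WHERE variable_name LIKE 'mysql-monitor_galera%';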

Monitoring & Ops Management of MySQL 8.0 with ClusterControl

$
0
0

Users of open source databases often have to use a mixture of tools and homegrown scripts to manage their production database environments. However, even with homegrown scripts in the solution, it's hard to maintain them and keep up with new database features, security requirements or upgrades. With each new major version of a database, including MySQL 8.0, this task becomes even harder.

At the heart of ClusterControl is its automation functionality that lets you automate the database tasks you have to perform regularly, like deploying new databases, adding and scaling new nodes, managing backups, high availability and failover, topology changes, upgrades, and more. Automated procedures are accurate, consistent, and repeatable so you can minimize the risk of changes on the production environments.

Moreover, with ClusterControl, MySQL users are no longer subject to vendor lock-in; something that was questioned by many recently. You can deploy and import a variety of MySQL versions and vendors from a single console for free.

In this article, we will show you how to deploy MySQL 8.0 with a battle-tested configuration and manage it in an automated way. We will cover:

  • ClusterControl installation
  • MySQL deployment process
    • Deploy a new cluster
    • Import existing cluster
  • Scaling MySQL
  • Securing MySQL
  • Monitoring and Trending
  • Backup and Recovery
  • Node and Cluster autorecovery (auto failover)

ClusterControl installation

To start with ClusterControl, you need a dedicated virtual machine or host. The VM and supported system requirements are described here. The base VM can start from 2 GB of RAM, 2 CPU cores and 20 GB of disk space, either on-prem or in the cloud.

The installation is well described in the documentation; basically, you download an installation script which walks you through the steps. The wizard script sets up the internal database, installs the necessary packages and repositories, and applies other necessary tweaks. For environments without internet access, you can use the offline installation process.

ClusterControl requires SSH access to the database hosts, and monitoring can be agent-based or agentless. Management is agentless.

To setup passwordless SSH to all target nodes (ClusterControl and all database hosts), run the following commands on the ClusterControl server:

$ ssh-keygen -t rsa # press enter on all prompts
$ ssh-copy-id -i ~/.ssh/id_rsa [ClusterControl IP address]
$ ssh-copy-id -i ~/.ssh/id_rsa [Database nodes IP address] # repeat this to all target database nodes

One of the most convenient ways to try out ClusterControl may be to run it in a Docker container:

docker run -d --name clustercontrol \
--network db-cluster \
--ip 192.168.10.10 \
-h clustercontrol \
-p 5000:80 \
-p 5001:443 \
-v /storage/clustercontrol/cmon.d:/etc/cmon.d \
-v /storage/clustercontrol/datadir:/var/lib/mysql \
-v /storage/clustercontrol/sshkey:/root/.ssh \
-v /storage/clustercontrol/cmonlib:/var/lib/cmon \
-v /storage/clustercontrol/backups:/root/backups \
severalnines/clustercontrol

After successful deployment, you should be able to access the ClusterControl Web UI at {host's IP address}:{host's port}, for example:

HTTP: http://192.168.10.100:5000/clustercontrol
HTTPS: https://192.168.10.100:5001/clustercontrol

Deployment and Scaling

Deploy MySQL 8.0

Once we enter the ClusterControl interface, the first thing to do is to deploy a new database or import an existing one. Version 1.7.2 introduces support for version 8.0 of Oracle MySQL Community Edition and Percona Server. At the time of writing this blog, the current versions are Oracle MySQL Server 8.0.15 and Percona Server for MySQL 8.0.15. Select the option “Deploy Database Cluster” and follow the instructions that appear.

ClusterControl: Deploy Database Cluster

When choosing MySQL, we must specify User, Key or Password and port to connect by SSH to our servers. We also need a name for our new cluster and if we want ClusterControl to install the corresponding software and configurations for us.

After setting up the SSH access information, we must enter the data to access our database. We can also specify which repository to use. Repository configuration is an important aspect for database servers and clusters. You can have three types of repositories when deploying database server/cluster using ClusterControl:

  • Use Vendor Repository
    Provision software by setting up and using the database vendor’s preferred software repository. ClusterControl will install the latest version of what is provided by the database vendor repository.
  • Do Not Setup Vendor Repositories
    Provision software by using the pre-existing software repository already set up on the nodes. The user has to set up the software repository manually on each database node and ClusterControl will use this repository for deployment. This is good if the database nodes are running without internet connection.
  • Use Mirrored Repositories (Create new repository)
    Create and mirror the current database vendor’s repository and then deploy using the local mirrored repository. This allows you to “freeze” the current versions of the software packages.

In the next step, we need to add our servers to the cluster that we are going to create. When adding our servers, we can enter IP or hostname then choose network interface. For the latter, we must have a DNS server or have added our MySQL servers to the local resolution file (/etc/hosts) of our ClusterControl, so it can resolve the corresponding name that you want to add.

On the screen we can see an example deployment with one master and two slave servers. The server list is dynamic and allows you to create sophisticated topologies which can be extended after the initial installation.

ClusterControl: Define Topology

When all is set, hit the deploy button. You can monitor the status of the creation of our new replication setup from the ClusterControl activity monitor. The deployment process will also take care of the installation of popular MySQL tools like Percona Toolkit and Percona XtraBackup.

ClusterControl: Deploy Cluster Details

Once the task is finished, we can see our cluster in the main ClusterControl screen and on the topology view. Note that we also added a load balancer (ProxySQL) in front of the database instances.

ClusterControl: Topology

As we can see in the image, once we have our cluster created, we can perform several tasks on it, directly from the topology section.

ClusterControl: Topology Management

Import a New Cluster

We also have the option to manage an existing setup by importing it into ClusterControl. Such an environment can be created by ClusterControl or other methods (puppet, chef, ansible, docker …). The process is simple and doesn't require specialized knowledge.

ClusterControl: Import Existing Cluster

First, we must enter the SSH access credentials to our servers. Then we enter the access credentials to our database, the server data directory, and the version. We add the nodes by IP or hostname, in the same way as when we deploy, and press on Import. Once the task is finished, we are ready to manage our cluster from ClusterControl. At this point we can also define the options for node or cluster auto recovery.

Scaling MySQL

With ClusterControl, adding more servers to the cluster is an easy step. You can do that from the GUI or CLI. For more advanced users, you can use ClusterControl Developer Studio and write a resource-based condition to expand your cluster automatically.

When adding a new node to the setup, you have the option to use an existing backup, so there is no need to overwhelm the production master node with additional work.

ClusterControl Scaling MySQL

With the built-in support for load balancers (ProxySQL, Maxscale, HAProxy), you can add and remove MySQL nodes dynamically. If you wish to know more in-depth about how best to manage MySQL replication and clustering, please read the MySQL replication for HA replication whitepaper.

Securing MySQL

MySQL comes with very little security out of the box. This has been improved in recent versions; however, production-grade systems still require tweaks to the default my.cnf configuration.

ClusterControl removes human error and provides access to a suite of security features, to automatically protect your databases from hacks and other threats.

ClusterControl enables SSL support for MySQL connections. Enabling SSL adds another level of security for communication between the applications (including ClusterControl) and database. MySQL clients open encrypted connections to the database servers and verify the identity of those servers before transferring any sensitive information.

ClusterControl will execute all necessary steps, including creating certificates on all database nodes. Such certificates can be maintained later on in the Key Management tab.

ClusterControl: Manage SSL Keys

Percona Server installations come with additional support for an audit plugin. Continuous auditing is an imperative task for monitoring your database environment. By auditing your database, you can achieve accountability for actions taken or content accessed. Moreover, the audit may include some critical system components, such as the ones associated with financial data, to support a precise set of regulations like SOX or the EU GDPR. The guided process lets you choose what should be audited and how to maintain the audit log files.

ClusterControl: Enable Audit Log for Percona Server 8.0

Monitoring

When working with database systems, you should be able to monitor them. That will enable you to identify trends, plan for upgrades or improvements or react effectively to any problems or errors that may arise.

The new ClusterControl 1.7.2 comes with updated high-resolution monitoring for MySQL 8.0. It's using Prometheus as the data store with PromQL query language. The list of dashboards includes MySQL Server General, MySQL Server Caches, MySQL InnoDB Metrics, MySQL Replication Master, MySQL Replication Slave, System Overview, and Cluster Overview Dashboards.

ClusterControl installs Prometheus agents, configures metrics and maintains access to Prometheus exporters configuration via its GUI, so you can better manage parameter configuration like collector flags for the exporters (Prometheus). We described in details what can be monitored recently in the article How to Monitor MySQL with Prometheus & ClusterControl.

ClusterControl: Dashboard

Alerting

As database operators, we need to be informed whenever something critical occurs in our databases. The three main methods in ClusterControl to get an alert include:

  • email notifications
  • integrations
  • advisors

You can set the email notifications on a user level. Go to Settings > Email Notifications, where you can choose the criticality and type of alerts to be sent.

ClusterControl: Notification

The next method is to use integration services. These pass a specific category of events to other services like ServiceNow, Slack, PagerDuty etc., so you can create advanced notification methods and integrations within your organization.

ClusterControl: Integration

The last one is to involve sophisticated metrics analysis in the Advisors section, where you can build intelligent checks and triggers.

ClusterControl: Automatic Advisors

Backup and Recovery

Now that you have your MySQL up and running, and have your monitoring in place, it is time for the next step: ensure you have a backup of your data.

ClusterControl: Create Backup

ClusterControl provides an interface for MySQL backup management with support for scheduling and reporting. It gives you two options for backup methods:

  • Logical: mysqldump
  • Binary: xtrabackup/mariabackup
ClusterControl: Create Backup Options

A good backup strategy is a critical part of any database management system. ClusterControl offers many options for backups and recovery/restore.

ClusterControl: Backup Schedule and Backup Repository

ClusterControl backup retention is configurable; you can choose to retain your backups for any time period or to never delete them. AES-256 encryption is employed to secure your backups against rogue elements. For rapid recovery, backups can be restored directly into a new cluster - ClusterControl handles the full restore process from the launch of a new database setup to the recovery of data, removing error-prone manual steps from the process.

Backups can be automatically verified upon completion, and then uploaded to cloud storage services (AWS, Azure and Google). Different retention policies can be defined for local backups in the datacenter as well as backups that are uploaded in the cloud.

Node and cluster autorecovery

ClusterControl provides advanced support for failure detection and handling. It also allows you to deploy different proxies to integrate with your HA stack, so there is no need to adjust the application connection string or DNS entry to redirect the application to the new master node.

When the master server is down, ClusterControl will create a job to perform an automatic failover. ClusterControl does all the background work to elect a new master, deploy failover slave servers and configure load balancers.

ClusterControl: Node autorecovery

ClusterControl automatic failover was designed with the following principles:

  • Make sure the master is really dead before you failover
  • Failover only once
  • Do not failover to an inconsistent slave
  • Only write to the master
  • Do not automatically recover the failed master

With the built-in algorithms, failover can often be performed pretty quickly, so you can assure the highest SLAs for your database environment.

The process is highly configurable. It comes with multiple parameters which you can use to adapt recovery to the specifics of your environment. Among the different options you can find replication_stop_on_error, replication_auto_rebuild_slave, replication_failover_blacklist, replication_failover_whitelist, replication_skip_apply_missing_txs, replication_onfail_failover_script and many others.
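
As a sketch, these parameters live in the cmon configuration file of the respective cluster; the path, addresses and values below are purely illustrative:

# /etc/cmon.d/cmon_<cluster_id>.cnf
replication_auto_rebuild_slave=1
replication_failover_whitelist=10.0.0.11,10.0.0.12
replication_onfail_failover_script=/usr/local/bin/notify_failover.sh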


How to Deploy PostgreSQL to a Docker Container Using ClusterControl

$
0
0

Docker has become the most common tool to create, deploy, and run applications by using containers. It allows us to package up an application with all of the parts it needs, such as libraries and other dependencies, and ship it all out as one package. Docker could be considered as a virtual machine, but instead of creating a whole virtual operating system, Docker allows applications to use the same Linux kernel as the system that they're running on and only requires applications to be shipped with things not already running on the host computer. This gives a significant performance boost and reduces the size of the application.

In this blog, we’ll see how we can easily deploy a PostgreSQL setup via Docker, and how to turn our setup in a primary/standby replication setup by using ClusterControl.

How to Deploy PostgreSQL with Docker

First, let’s see how to deploy PostgreSQL with Docker manually by using a PostgreSQL Docker Image.

The image is available on Docker Hub and you can find it from the command line:

$ docker search postgres
NAME                                         DESCRIPTION                                     STARS               OFFICIAL            AUTOMATED
postgres                                     The PostgreSQL object-relational database sy…   6519                [OK]

We'll take the first result - the official one. So, we need to pull the image:

$ docker pull postgres

And run the node containers mapping a local port to the database port into the container:

$ docker run -d --name node1 -p 6551:5432 postgres
$ docker run -d --name node2 -p 6552:5432 postgres
$ docker run -d --name node3 -p 6553:5432 postgres

After running these commands, you should have this Docker environment created:

$ docker ps
CONTAINER ID        IMAGE                         COMMAND                  CREATED             STATUS                 PORTS                                                                                     NAMES
51038dbe21f8        postgres                      "docker-entrypoint.s…"   About an hour ago   Up About an hour       0.0.0.0:6553->5432/tcp                                                                    node3
b7a4211744e3        postgres                      "docker-entrypoint.s…"   About an hour ago   Up About an hour       0.0.0.0:6552->5432/tcp                                                                    node2
229c6bd23ff4        postgres                      "docker-entrypoint.s…"   About an hour ago   Up About an hour       0.0.0.0:6551->5432/tcp                                                                    node1

Now, you can access each node with the following command:

$ docker exec -ti [db-container] bash
$ su postgres
$ psql
psql (11.2 (Debian 11.2-1.pgdg90+1))
Type "help" for help.
postgres=#

Then, you can create a database user, change the configuration according to your requirements or configure replication between the nodes manually.
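
As a minimal sketch of the manual replication route (the user name, password and values are illustrative; the data directory is the official image's default), the primary side needs a replication user, a couple of postgresql.conf entries and a pg_hba.conf rule allowing that user to connect from the standbys:

postgres=# CREATE USER repl_user WITH REPLICATION PASSWORD 'repl_passw0rd';

# /var/lib/postgresql/data/postgresql.conf
wal_level = replica
max_wal_senders = 5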

How to Import Your PostgreSQL Containers into ClusterControl

Now that you've set up your PostgreSQL cluster, you still need to monitor it, get alerted in case of performance issues, manage backups, detect failures and automatically fail over to a healthy server.

If you already have a PostgreSQL cluster running on Docker and you want ClusterControl to manage it, you can simply run the ClusterControl container in the same Docker network as the database containers. The only requirement is to ensure the target containers have SSH related packages installed (openssh-server, openssh-clients). Then allow passwordless SSH from ClusterControl to the database containers. Once done, use the “Import Existing Server/Cluster” feature and the cluster should be imported into ClusterControl.

First, let's install the OpenSSH related packages on the database containers, allow root login, start the SSH service and set the root password:

$ docker exec -ti [db-container] apt-get update
$ docker exec -ti [db-container] apt-get install -y openssh-server openssh-client
$ docker exec -it [db-container] sed -i 's|^PermitRootLogin.*|PermitRootLogin yes|g' /etc/ssh/sshd_config
$ docker exec -it [db-container] sed -i 's|^#PermitRootLogin.*|PermitRootLogin yes|g' /etc/ssh/sshd_config
$ docker exec -ti [db-container] service ssh start
$ docker exec -it [db-container] passwd

Start the ClusterControl container (if it’s not started) and forward port 80 on the container to port 5000 on the host:

$ docker run -d --name clustercontrol -p 5000:80 severalnines/clustercontrol

Verify the ClusterControl container is up:

$ docker ps | grep clustercontrol
7eadb6bb72fb        severalnines/clustercontrol   "/entrypoint.sh"         4 hours ago         Up 4 hours (healthy)   22/tcp, 443/tcp, 3306/tcp, 9500-9501/tcp, 9510-9511/tcp, 9999/tcp, 0.0.0.0:5000->80/tcp   clustercontrol

Open a web browser, go to http://[Docker_Host]:5000/clustercontrol and create a default admin user and password. You should now see the ClusterControl main page.

The last step is setting up passwordless SSH to all database containers. For this, we need to know the IP address of each database node, which we can get by running the following command for each node:

$ docker inspect [db-container] |grep IPAddress
            "IPAddress": "172.17.0.6",

Then, attach to the ClusterControl container interactive console:

$ docker exec -it clustercontrol bash

Copy the SSH key to all database containers:

$ ssh-copy-id 172.17.0.6
$ ssh-copy-id 172.17.0.7
$ ssh-copy-id 172.17.0.8

Now, we can start to import the cluster into ClusterControl. Open a web browser, go to the Docker physical host's IP address with the mapped port, e.g. http://192.168.100.150:5000/clustercontrol, click on “Import Existing Server/Cluster”, and then add the following information.

We must specify User, Key or Password and port to connect by SSH to our servers. We also need a name for our new cluster.

After setting up the SSH access information, we must define the database user, version, basedir and the IP Address or Hostname for each database node.

Make sure you get the green tick when entering the hostname or IP address, indicating ClusterControl is able to communicate with the node. Then, click the Import button and wait until ClusterControl finishes its job. You can monitor the process in the ClusterControl Activity Section.

The database cluster will be listed under the ClusterControl dashboard once imported.

Note that, if you only have a PostgreSQL master node, you could add it into ClusterControl. Then you can add the standby nodes from the ClusterControl UI to allow ClusterControl to configure them for you.

How to Deploy Your PostgreSQL Containers with ClusterControl

Now, let’s see how to deploy PostgreSQL with Docker by using a CentOS Docker Image (severalnines/centos-ssh) and a ClusterControl Docker Image (severalnines/clustercontrol).

First, we’ll deploy a ClusterControl Docker Container using the latest version, so we need to pull the severalnines/clustercontrol Docker Image.

$ docker pull severalnines/clustercontrol

Then, we’ll run the ClusterControl container and publish the port 5000 to access it.

$ docker run -d --name clustercontrol -p 5000:80 severalnines/clustercontrol

Now you can open the ClusterControl UI at http://[Docker_Host]:5000/clustercontrol and create a default admin user and password.

The severalnines/centos-ssh image comes with, in addition to the SSH service enabled, an auto-deployment feature, but it's only valid for Galera Cluster. PostgreSQL is not supported yet, so we'll set the AUTO_DEPLOYMENT variable to 0 in the docker run command to create the database nodes:

$ docker run -d --name postgres1 -p 5551:5432 --link clustercontrol:clustercontrol -e AUTO_DEPLOYMENT=0 severalnines/centos-ssh
$ docker run -d --name postgres2 -p 5552:5432 --link clustercontrol:clustercontrol -e AUTO_DEPLOYMENT=0 severalnines/centos-ssh
$ docker run -d --name postgres3 -p 5553:5432 --link clustercontrol:clustercontrol -e AUTO_DEPLOYMENT=0 severalnines/centos-ssh

After running these commands, we should have the following Docker environment:

$ docker ps
CONTAINER ID        IMAGE                         COMMAND             CREATED             STATUS                    PORTS                                                                                     NAMES
0df916b918a9        severalnines/centos-ssh       "/entrypoint.sh"    4 seconds ago       Up 3 seconds              22/tcp, 3306/tcp, 9999/tcp, 27107/tcp, 0.0.0.0:5553->5432/tcp                             postgres3
4c1829371b5e        severalnines/centos-ssh       "/entrypoint.sh"    11 seconds ago      Up 10 seconds             22/tcp, 3306/tcp, 9999/tcp, 27107/tcp, 0.0.0.0:5552->5432/tcp                             postgres2
79d4263dd7a1        severalnines/centos-ssh       "/entrypoint.sh"    32 seconds ago      Up 31 seconds             22/tcp, 3306/tcp, 9999/tcp, 27107/tcp, 0.0.0.0:5551->5432/tcp                             postgres1
7eadb6bb72fb        severalnines/clustercontrol   "/entrypoint.sh"    38 minutes ago      Up 38 minutes (healthy)   22/tcp, 443/tcp, 3306/tcp, 9500-9501/tcp, 9510-9511/tcp, 9999/tcp, 0.0.0.0:5000->80/tcp   clustercontrol

We need to know the IP Address for each database node. To know it, we can run the following command for each node:

$ docker inspect [db-container] |grep IPAddress
            "IPAddress": "172.17.0.3",

Now that we have the server nodes up and running, we need to deploy our database cluster. To make this easy, we'll use ClusterControl.

To perform a deployment from ClusterControl, open the ClusterControl UI at http://[Docker_Host]:5000/clustercontrol, then select the option “Deploy” and follow the instructions that appear.

When selecting PostgreSQL, we must specify User, Key or Password and port to connect by SSH to our servers. We also need a name for our new cluster and if we want ClusterControl to install the corresponding software and configurations for us.

After setting up the SSH access information, we must define the database user, version and datadir (optional). We can also specify which repository to use.

In the next step, we need to add our servers to the cluster that we are going to create.

When adding our servers, we can enter IP or hostname. Here we must use the IP Address that we got from each container previously.

In the last step, we can choose if our replication will be Synchronous or Asynchronous.

We can monitor the status of the creation of our new cluster from the ClusterControl activity monitor.

Once the task is finished, we can see our cluster in the main ClusterControl screen.

Conclusion

As we could see, deploying PostgreSQL with Docker is easy at the beginning, but it requires a bit more work to configure replication. Finally, you should monitor your cluster to see what is happening. With ClusterControl, you can import or deploy your PostgreSQL cluster with Docker, as well as automate the monitoring and management tasks like backup and automatic failover/recovery. Try it out.

Popular Docker Images for MySQL and MariaDB Server

$
0
0

A Docker image can be built by anyone who has the ability to write a script. That is why there are many similar images being built by the community, with minor differences but really serving a common purpose. A good (and popular) container image must have well-written documentation with clear explanations, an actively maintained repository and with regular updates. Check out this blog post if you want to learn how to build and publish your own Docker image for MySQL, or this blog post if you just want to learn the basics of running MySQL on Docker.

In this blog post, we are going to look at some of the most popular Docker images to run our MySQL or MariaDB server. The images we have chosen are general-purpose public images that can at least run a MySQL service. Some of them include non-essential MySQL-related applications, while others just serve as a plain mysqld instance. The listing here is based on the result of Docker Hub, the world's largest library and community for container images.

TLDR

The following table summarizes the different options:

Aspect                      | MySQL (Docker)     | MariaDB (Docker)                             | Percona (Docker) | MySQL (Oracle)     | MySQL/MariaDB (CentOS)                                                                                           | MariaDB (Bitnami)
Downloads*                  | 10M+               | 10M+                                         | 10M+             | 10M+               | 10M+                                                                                                             | 10M+
Docker Hub                  | mysql              | mariadb                                      | percona          | mysql/mysql-server | mysql-80-centos7, mysql-57-centos7, mysql-56-centos7, mysql-55-centos7, mariadb-102-centos7, mariadb-101-centos7 | bitnami/mariadb
Project page                | mysql              | mariadb                                      | percona-docker   | mysql-docker       | mysql-container                                                                                                  | bitnami-docker-mariadb
Base image                  | Debian 9           | Ubuntu 18.04 (bionic), Ubuntu 14.04 (trusty) | CentOS 7         | Oracle Linux 7     | RHEL 7, CentOS 7                                                                                                 | Debian 9 (minideb), Oracle Linux 7
Supported database versions | 5.5, 5.6, 5.7, 8.0 | 5.5, 10.0, 10.1, 10.2, 10.3, 10.4            | 5.6, 5.7, 8.0    | 5.5, 5.6, 5.7, 8.0 | 5.5, 5.6, 5.7, 8.0, 10.1, 10.2                                                                                   | 10.1, 10.2, 10.3
Supported platforms         | x86_64             | x86, x86_64, arm64v8, ppc64le                | x86, x86_64      | x86_64             | x86_64                                                                                                           | x86_64
Image size (tag: latest)    | 129 MB             | 120 MB                                       | 193 MB           | 99 MB              | 178 MB                                                                                                           | 87 MB
First commit                | May 18, 2014       | Nov 16, 2014                                 | Jan 3, 2016      | May 18, 2014**     | Feb 15, 2015                                                                                                     | May 17, 2015
Contributors                | 18                 | 9                                            | 15               | 14                 | 30                                                                                                               | 20
Github Stars                | 1267               | 292                                          | 113              | 320                | 89                                                                                                               | 152
Github Forks                | 1291               | 245                                          | 121              | 1291**             | 146                                                                                                              | 71

* Taken from Docker Hub page.
** Forked from MySQL docker project.

mysql (Docker)

The images are built and maintained by the Docker community with the help of the MySQL team. It can be considered the most popular publicly available MySQL server image on Docker Hub and one of the earliest on the market (the first commit was May 18, 2014). It has been forked ~1300 times and has 18 active contributors. It supports Docker versions down to 1.6 on a best-effort basis. At the time of writing, all the MySQL major versions are supported - 5.5, 5.6, 5.7 and 8.0 - on x86_64 architecture only.

Most of the MySQL images built by others are inspired by the way this image was built. The MariaDB, Percona and MySQL Server (Oracle) images follow a similar set of environment variables, configuration file structure and container initialization process flow.

The following environment variables are available on most of the MySQL container images on Docker Hub:

  • MYSQL_ROOT_PASSWORD
  • MYSQL_DATABASE
  • MYSQL_USER
  • MYSQL_PASSWORD
  • MYSQL_ALLOW_EMPTY_PASSWORD
  • MYSQL_RANDOM_ROOT_PASSWORD
  • MYSQL_ONETIME_PASSWORD

The image size (tag: latest) is relatively small (129 MB), and the image is easy to use, well maintained and updated regularly by the maintainer. If your application requires the latest MySQL database container, this is the most recommended public image you can use.
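
For instance, a quick sketch of launching the latest MySQL 8.0 image with some of those variables (the container, database and user names are illustrative):

$ docker run -d --name mysql1 \
-p 3306:3306 \
-e MYSQL_ROOT_PASSWORD=passw0rd \
-e MYSQL_DATABASE=appdb \
-e MYSQL_USER=app_user \
-e MYSQL_PASSWORD=passw0rd \
mysql:8.0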

mariadb (Docker)

The images are maintained by the Docker community with the help of the MariaDB team. It uses the same build structure as the mysql (Docker) image, but comes with multi-architecture support:

  • Linux x86-64 (amd64)
  • ARMv8 64-bit (arm64v8)
  • x86/i686 (i386)
  • IBM POWER8 (ppc64le)

At the time of this writing, the images support MariaDB version 5.5 up until 10.4, and the image with the "latest" tag is around 120 MB in size. This image serves as a general-purpose image and follows the same instructions, environment variables and configuration file structure as mysql (Docker). Most applications that require MySQL as the database backend are commonly compatible with MariaDB, since both speak the same protocol.

MariaDB server used to be a fork of MySQL, but it has since diverged from it. In terms of database architecture design, some MariaDB versions are not 100% compatible with, and no longer drop-in replacements for, their respective MySQL versions. Check out this page for details. However, there are ways to migrate between the two by using logical backup. Simply put, once you are in the MariaDB ecosystem, you probably have to stick with it. Mixing or switching between MariaDB and MySQL in a cluster is not recommended.

If you would like to set up a more advanced MariaDB setup (replication, Galera, sharding), there are other images built to achieve that objective much more easily, e.g. bitnami/mariadb as explained further down.

percona (Docker)

Percona Server is a fork of MySQL created by Percona. These are the only official Percona Server Docker images, created and maintained by the Percona team. They support both x86 and x86_64 architectures, and the image is based on CentOS 7. Percona only maintains container images for the latest 3 major MySQL versions: 5.6, 5.7 and 8.0.

The code repository shows that the first commit was on Jan 3, 2016, with 15 active contributors, mostly from the Percona development team. Percona Server for MySQL comes with the XtraDB storage engine (a drop-in replacement for InnoDB) and follows the upstream Oracle MySQL releases very closely (including all their bug fixes), with some additional features like the MyRocks storage engine, TokuDB and Percona's own bug fixes. In a way, you can think of it as an improved version of Oracle's MySQL. You can easily switch between the MySQL and Percona Server images, provided you are running compatible versions.

The images recognize two additional environment variables for TokuDB and RocksDB for MySQL (available since v5.6):

  • INIT_TOKUDB - Set to 1 to allow the container to be started with enabled TOKUDB storage engine.
  • INIT_ROCKSDB - Set to 1 to allow the container to be started with enabled ROCKSDB storage engine.
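
For example, a sketch of starting a Percona Server 8.0 container with RocksDB enabled (the container name and password are illustrative):

$ docker run -d --name percona1 \
-e MYSQL_ROOT_PASSWORD=passw0rd \
-e INIT_ROCKSDB=1 \
percona:8.0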

mysql-server (Oracle)

The repository is forked from mysql by the Docker team. The images are created, maintained and supported by the MySQL team at Oracle, built on top of the Oracle Linux 7 base image. The MySQL 8.0 image comes with MySQL Community Server (minimal) and MySQL Shell, and the server is configured to expose the X protocol on port 33060. The minimal package was designed for use by the official Docker images for MySQL. It cuts out some of the non-essential pieces of MySQL like innochecksum, myisampack and mysql_plugin, but is otherwise the same product. As a result, it has a very small image footprint of around 99 MB.

One important point to note is that the images have a built-in health check script, which is very handy for those who need accurate availability logic. Otherwise, you would have to write a custom Docker HEALTHCHECK command (or script) to check the container's health.

mysql-xx-centos7 & mariadb-xx-centos7 (CentOS)

The container images are built and maintained by the CentOS team and include the MySQL database server for OpenShift and general usage. For RHEL-based images, you can pull them from Red Hat's Container Catalog, while the CentOS-based images are hosted publicly on Docker Hub on a different page for every major version (only images with 10M+ downloads are listed):

The image structure is a bit different and it doesn't make use of image tags like the others do, so the image name becomes a bit longer instead. That also means you have to go to the correct Docker Hub page to get the major version you want to pull.

According to the code repository page, 30 contributors have collaborated on the project since February 15, 2015. It supports MySQL 5.5 up until 8.0 and MariaDB 5.5 until 10.2, for x86_64 architecture only. If you rely heavily on Red Hat containerization infrastructure like OpenShift, these are probably the most popular and well-maintained images for MySQL and MariaDB.

The following environment variables influence the MySQL/MariaDB configuration file and they are all optional:

  • MYSQL_LOWER_CASE_TABLE_NAMES (default: 0)
  • MYSQL_MAX_CONNECTIONS (default: 151)
  • MYSQL_MAX_ALLOWED_PACKET (default: 200M)
  • MYSQL_FT_MIN_WORD_LEN (default: 4)
  • MYSQL_FT_MAX_WORD_LEN (default: 20)
  • MYSQL_AIO (default: 1)
  • MYSQL_TABLE_OPEN_CACHE (default: 400)
  • MYSQL_KEY_BUFFER_SIZE (default: 32M or 10% of available memory)
  • MYSQL_SORT_BUFFER_SIZE (default: 256K)
  • MYSQL_READ_BUFFER_SIZE (default: 8M or 5% of available memory)
  • MYSQL_INNODB_BUFFER_POOL_SIZE (default: 32M or 50% of available memory)
  • MYSQL_INNODB_LOG_FILE_SIZE (default: 8M or 15% of available memory)
  • MYSQL_INNODB_LOG_BUFFER_SIZE (default: 8M or 15% of available memory)
  • MYSQL_DEFAULTS_FILE (default: /etc/my.cnf)
  • MYSQL_BINLOG_FORMAT (default: statement)
  • MYSQL_LOG_QUERIES_ENABLED (default: 0)

The images support MySQL auto-tuning when the MySQL image is run with the --memory parameter set. If you don't specify values for the following parameters, they will be automatically calculated based on the available memory (see the sketch after this list):

  • MYSQL_KEY_BUFFER_SIZE (default: 10%)
  • MYSQL_READ_BUFFER_SIZE (default: 5%)
  • MYSQL_INNODB_BUFFER_POOL_SIZE (default: 50%)
  • MYSQL_INNODB_LOG_FILE_SIZE (default: 15%)
  • MYSQL_INNODB_LOG_BUFFER_SIZE (default: 15%)
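
A sketch of relying on that auto-tuning (the image tag, name and values are illustrative): with a 2 GB memory limit, the InnoDB buffer pool would be sized to roughly 1 GB automatically:

$ docker run -d --name mysql57 \
--memory=2g \
-e MYSQL_ROOT_PASSWORD=passw0rd \
centos/mysql-57-centos7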

bitnami/mariadb

The images are built and maintained by Bitnami, experts in software packaging for virtual and cloud deployments. The images are released daily with the latest distribution packages available and use a minimalist Debian-based image called minideb. Thus, the image size for the latest tag is the smallest of them all, around 87 MB. The project has 20 contributors, with the first commit on May 17, 2015. At the time of writing, it only supports MariaDB 10.1 up until 10.3.

One outstanding feature of this image is the ability to deploy a highly available MariaDB setup via Docker environment variables. A zero-downtime MariaDB master-slave replication cluster can easily be set up with the Bitnami MariaDB Docker image using the following environment variables:

  • MARIADB_REPLICATION_MODE: The replication mode. Possible values: master/slave. No defaults.
  • MARIADB_REPLICATION_USER: The replication user created on the master on first run. No defaults.
  • MARIADB_REPLICATION_PASSWORD: The replication user's password. No defaults.
  • MARIADB_MASTER_HOST: Hostname/IP of the replication master (slave parameter). No defaults.
  • MARIADB_MASTER_PORT_NUMBER: Server port of the replication master (slave parameter). Defaults to 3306.
  • MARIADB_MASTER_ROOT_USER: User on the replication master with access to MARIADB_DATABASE (slave parameter). Defaults to root.
  • MARIADB_MASTER_ROOT_PASSWORD: Password of the user on the replication master with access to MARIADB_DATABASE (slave parameter). No defaults.

In a replication cluster, you can have one master and zero or more slaves. When replication is enabled, the master node is in read-write mode, while the slaves are in read-only mode. For best performance, it's advisable to limit the reads to the slaves.
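
A sketch of such a master/slave pair (the container names and passwords are illustrative; MARIADB_ROOT_PASSWORD is an additional Bitnami variable that sets the root password):

$ docker run -d --name mariadb-master \
-e MARIADB_ROOT_PASSWORD=passw0rd \
-e MARIADB_REPLICATION_MODE=master \
-e MARIADB_REPLICATION_USER=repl_user \
-e MARIADB_REPLICATION_PASSWORD=repl_passw0rd \
bitnami/mariadb

$ docker run -d --name mariadb-slave \
--link mariadb-master:master \
-e MARIADB_REPLICATION_MODE=slave \
-e MARIADB_REPLICATION_USER=repl_user \
-e MARIADB_REPLICATION_PASSWORD=repl_passw0rd \
-e MARIADB_MASTER_HOST=master \
-e MARIADB_MASTER_ROOT_PASSWORD=passw0rd \
bitnami/mariadb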

In addition, these images also support deployment on Kubernetes as Helm Charts. You can read more about the installation steps in the Bitnami MariaDB Chart GitHub repository.

Conclusions

There are tons of MySQL server images that have been contributed by the community and we can't cover them all here. Keep in mind that these images are popular because they are built for general purpose usage. Some less popular images can do much more advanced stuff, like database container orchestration, automatic bootstrapping and automatic scaling. Different images provide different approaches that can be used to address other problems.

A Guide to Deploying TimescaleDB with Docker

$
0
0

Nowadays, Docker is the most common tool to create, deploy, and run applications by using containers. It allows us to package up an application with all of the parts it needs, such as libraries and other dependencies, and ship it all out as one package. It could be considered as a virtual machine, but instead of creating a whole virtual operating system, Docker allows applications to use the same Linux kernel as the system that they're running on and only requires applications to be shipped with things not already running on the host computer. This gives a significant performance boost and reduces the size of the application.

In the case of Docker images, they come with a predefined OS version and the packages are installed in a way that was decided by the person who created the image. It's possible that you want to use a different OS, or maybe you want to install the packages in a different way. For these cases, you should use a clean OS Docker image and install the software from scratch.

Replication is a common feature in a database environment, so after deploying the TimescaleDB Docker images, if you want to configure a replication setup, you'll need to do it manually from the container, by using a Dockerfile or even a script. This task can be complex if you don't have much Docker knowledge.

In this blog, we’ll see how we can deploy TimescaleDB via Docker by using a TimescaleDB Docker Image, and then, we’ll see how to install it from scratch by using a CentOS Docker Image and ClusterControl.

How to Deploy TimescaleDB with a Docker Image

First, let’s see how to deploy TimescaleDB by using a Docker Image available on Docker Hub.

$ docker search timescaledb
NAME                                       DESCRIPTION                                     STARS               OFFICIAL            AUTOMATED
timescale/timescaledb                      An open-source time-series database optimize…   52

We’ll take the first result. So, we need to pull this image:

$ docker pull timescale/timescaledb

And run the node containers mapping a local port to the database port into the container:

$ docker run -d --name timescaledb1 -p 7551:5432 timescale/timescaledb
$ docker run -d --name timescaledb2 -p 7552:5432 timescale/timescaledb

After running these commands, you should have this Docker environment created:

$ docker ps
CONTAINER ID        IMAGE                   COMMAND                  CREATED             STATUS              PORTS                    NAMES
6d3bfc75fe39        timescale/timescaledb   "docker-entrypoint.s…"   15 minutes ago      Up 15 minutes       0.0.0.0:7552->5432/tcp   timescaledb2
748d5167041f        timescale/timescaledb   "docker-entrypoint.s…"   16 minutes ago      Up 16 minutes       0.0.0.0:7551->5432/tcp   timescaledb1

Now, you can access each node with the following commands:

$ docker exec -ti [db-container] bash
$ su postgres
$ psql
psql (9.6.13)
Type "help" for help.
postgres=#

As you can see, this Docker image contains TimescaleDB on PostgreSQL 9.6 by default, installed on Alpine Linux v3.9. You can use a different PostgreSQL version by changing the tag:

$ docker pull timescale/timescaledb:latest-pg11

Then, you can create a database user, change the configuration according to your requirements or configure replication between the nodes manually.
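
For instance, a sketch of creating a user and a first hypertable inside the container (the database, user and table names are illustrative):

postgres=# CREATE USER app_user WITH PASSWORD 'passw0rd';
postgres=# CREATE DATABASE tsdb OWNER app_user;
postgres=# \c tsdb
tsdb=# CREATE EXTENSION IF NOT EXISTS timescaledb;
tsdb=# CREATE TABLE conditions (time TIMESTAMPTZ NOT NULL, device TEXT, temperature DOUBLE PRECISION);
tsdb=# SELECT create_hypertable('conditions', 'time');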

How to Deploy TimescaleDB with ClusterControl

Now, let’s see how to deploy TimescaleDB with Docker by using a CentOS Docker Image (centos) and a ClusterControl Docker Image (severalnines/clustercontrol).

First, we’ll deploy a ClusterControl Docker Container using the latest version, so we need to pull the severalnines/clustercontrol Docker Image.

$ docker pull severalnines/clustercontrol

Then, we’ll run the ClusterControl container and publish the port 5000 to access it.

$ docker run -d --name clustercontrol -p 5000:80 severalnines/clustercontrol

Now, we can open the ClusterControl UI at http://[Docker_Host]:5000/clustercontrol and create a default admin user and password.

The official CentOS Docker image comes without the SSH service, so we'll install it and allow passwordless connections from the ClusterControl node by using an SSH key.

$ docker search centos
NAME                               DESCRIPTION                                     STARS               OFFICIAL            AUTOMATED
centos                             The official build of CentOS.                   5378                [OK]

So, we’ll pull the CentOS Official Docker Image.

$ docker pull centos

And then, we’ll run two node containers, timescale1 and timescale2, linked with ClusterControl and we’ll map a local port to connect to the database (optional).

$ docker run -dt --privileged --name timescale1 -p 8551:5432 --link clustercontrol:clustercontrol centos /usr/sbin/init
$ docker run -dt --privileged --name timescale2 -p 8552:5432 --link clustercontrol:clustercontrol centos /usr/sbin/init

As we need to install and configure the SSH service, we must run the containers with the --privileged flag and /usr/sbin/init as the command, to be able to manage services inside the container.

After running these commands, we should have this Docker environment created:

$ docker ps
CONTAINER ID        IMAGE                         COMMAND             CREATED             STATUS                       PORTS                                                                                     NAMES
230686d8126e        centos                        "/usr/sbin/init"    4 seconds ago       Up 3 seconds                 0.0.0.0:8552->5432/tcp                                                                    timescale2
c0e7b245f7fe        centos                        "/usr/sbin/init"    23 seconds ago      Up 22 seconds                0.0.0.0:8551->5432/tcp                                                                    timescale1
7eadb6bb72fb        severalnines/clustercontrol   "/entrypoint.sh"    2 weeks ago         Up About an hour (healthy)   22/tcp, 443/tcp, 3306/tcp, 9500-9501/tcp, 9510-9511/tcp, 9999/tcp, 0.0.0.0:5000->80/tcp   clustercontrol

We can access each node with the following command:

$ docker exec -ti [db-container] bash

As we mentioned earlier, we need to install the SSH service, so let's install it, allow root access and set the root password on each database container:

$ docker exec -ti [db-container] yum update -y
$ docker exec -ti [db-container] yum install -y openssh-server openssh-clients
$ docker exec -it [db-container] sed -i 's|^#PermitRootLogin.*|PermitRootLogin yes|g' /etc/ssh/sshd_config
$ docker exec -it [db-container] systemctl start sshd
$ docker exec -it [db-container] passwd

The last step is setting up passwordless SSH to all database containers. For this, we need to know the IP address of each database node, which we can get by running the following command for each node:

$ docker inspect [db-container] |grep IPAddress
            "IPAddress": "172.17.0.5",

Then, attach to the ClusterControl container interactive console:

$ docker exec -it clustercontrol bash

And copy the SSH key to all database containers:

$ ssh-copy-id 172.17.0.5

Now that we have the server nodes up and running, we need to deploy our database cluster. To make this easy, we'll use ClusterControl.

To perform a deployment from ClusterControl, open the ClusterControl UI at http://[Docker_Host]:5000/clustercontrol, then select the option “Deploy” and follow the instructions that appear.

When selecting TimescaleDB, we must specify User, Key or Password and port to connect by SSH to our servers. We also need a name for our new cluster and if we want ClusterControl to install the corresponding software and configurations for us.

After setting up the SSH access information, we must define the database user, version and datadir (optional). We can also specify which repository to use.

In the next step, we need to add our servers to the cluster that we are going to create.

Here we must use the IP Address that we got from each container previously.

In the last step, we can choose if our replication will be Synchronous or Asynchronous.

We can monitor the status of the creation of our new cluster from the ClusterControl activity monitor.

Once the task is finished, we can see our cluster in the main ClusterControl screen.

Note that, if you want to add more standby nodes, you can do it from the ClusterControl UI in the Cluster Actions menu.

In the same way, if you have your TimescaleDB cluster running on Docker and you want ClusterControl to manage it to be able to use all the features of this system like monitoring, backing up, automatic failover, and even more, you can simply run the ClusterControl container in the same Docker network as the database containers. The only requirement is to ensure the target containers have SSH related packages installed (openssh-server, openssh-clients). Then allow passwordless SSH from ClusterControl to the database containers. Once done, use the “Import Existing Server/Cluster” feature and the cluster should be imported into ClusterControl.

One possible issue when running containers is the IP address or hostname assignment. Without an orchestration tool like Kubernetes, the IP address or hostname could be different if you stop the nodes and create new containers before starting them again. You'll end up with a different IP address for the old nodes, and ClusterControl assumes that all nodes are running in an environment with a dedicated IP address or hostname, so after the IP address changes, you must re-import the cluster into ClusterControl. There are many workarounds for this issue; you can check this link to use Kubernetes with StatefulSets, or this one for running containers without an orchestration tool.

Conclusion

As we could see, deploying TimescaleDB with Docker should be easy if you don't want to configure a replication or failover environment, and if you don't need to change the OS version or the way the database packages are installed.

With ClusterControl, you can import or deploy your TimescaleDB cluster with Docker by using the OS image that you prefer, as well as automate the monitoring and management tasks like backup and automatic failover/recovery.

How to Deploy MariaDB Server to a Docker Container

$
0
0

Nowadays, terms like Docker, Images or Containers are pretty common in all database environments, so it's normal to see a MariaDB server running on Docker in both production and non-production setups. It is possible, however, that while you may have heard the terms, you might not know the differences between them. In this blog, we provide an overview of these terms and how we can apply them in practice to deploy a MariaDB server.

What is Docker?

Docker is the most common tool to create, deploy, and run applications by using containers. It allows you to package up an application with all of the parts it needs (such as libraries and other dependencies) and ship it all out as one package, allowing for the portable sharing of containers across different machines.

Container vs Virtual Machine

What is an Image?

An Image is like a virtual machine template. It has all the required information to run the container. This includes the operating system, software packages, drivers, configuration files, and helper scripts… all packed into one bundle.

A Docker image can be built by anyone who has the ability to write a script. That is why there are many similar images being built by the community, each with minor differences...but serving a common purpose.

What is a Docker Container?

A Docker Container is an instance of a Docker Image. It runs completely isolated from the host environment by default, only accessing host files and ports if configured to do so.

A container could be considered as a virtual machine, but instead of creating a whole virtual operating system, it allows applications to use the same Linux kernel as the system that they're running on. It only requires applications to be shipped with parts not already running on the host computer. This gives you a significant performance boost and reduces the size of the application.

Keep in mind that any changes made to the container are recorded on a separate layer, not in the Docker Image itself. This means that if you delete the container, or if you create a new one based on the same Docker Image, the changes won’t be there. To preserve the changes you must commit them into a new Docker Image or describe them in a Dockerfile.
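
For instance, a modified container can be saved as a new image with docker commit (the container and image names here are hypothetical):

$ docker commit my_container my_repo/my_image:v2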

What is a Dockerfile?

A Dockerfile is a script that describes, step by step, how to build a Docker Image, including any modifications that you want to apply on top of a base image.

Docker Components

Let’s see a Dockerfile example.

$ vi Dockerfile
# MariaDB 10.3 with SSH
# Pull the mariadb latest image
FROM mariadb:latest
# List all the packages that we want to install
ENV PACKAGES openssh-server openssh-client
# Install Packages
RUN apt-get update && apt-get install -y $PACKAGES
# Allow SSH Root Login
RUN sed -i 's|^#PermitRootLogin.*|PermitRootLogin yes|g' /etc/ssh/sshd_config
# Configure root password
RUN echo "root:root123" | chpasswd

Now, we can build a new Docker Image from this Docker File:

$ docker build --rm=true -t severalnines/mariadb-ssh .

Check the new image created:

$ docker images
REPOSITORY                                 TAG                 IMAGE ID            CREATED             SIZE
severalnines/mariadb-ssh                   latest              a8022951f195        17 seconds ago      485MB

And now, we can use the new image as a common Docker Image as we’ll see in the next section.
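
As a quick, hypothetical usage sketch: since the image’s entrypoint only starts MariaDB, the SSH daemon still has to be started separately once the container is up:

$ docker run -d --name mariadb-ssh \
-p 2222:22 -p 3306:3306 \
-e MYSQL_ROOT_PASSWORD=root123 \
severalnines/mariadb-ssh
$ docker exec mariadb-ssh service ssh start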


How to Deploy MariaDB on Docker Without Dockerfile

Now that we know more about the Docker world, let’s see how to use it to create a MariaDB server. For this, we’ll assume you already have Docker installed.

We could use the image we just built from the Dockerfile, but for this walkthrough we’ll pull the official MariaDB Docker Image.

$ docker search mariadb
NAME                                   DESCRIPTION                                     STARS               OFFICIAL            AUTOMATED
mariadb                                MariaDB is a community-developed fork of MyS…   2804                [OK]

Without specifying a TAG, by default, it’ll pull the latest image version, in this case, MariaDB Server 10.3 on Ubuntu 18.04.

$ docker pull mariadb
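
If you prefer to pin a specific version instead of relying on latest, specify the tag explicitly, for example:

$ docker pull mariadb:10.3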

We can check the image downloaded.

$ docker images
REPOSITORY                                 TAG                 IMAGE ID            CREATED             SIZE
mariadb                                    latest              e07bb20373d8        2 weeks ago         349MB

Then, we’ll create two directories for our MariaDB container under our Docker working directory: one for the datadir and another one for the MariaDB configuration files. We’ll mount both into our MariaDB Docker Container.

$ cd ~/Docker
$ mkdir -p mariadb1/datadir
$ mkdir -p mariadb1/config

The startup configuration is specified in the file /etc/mysql/my.cnf, and it includes any files found in the /etc/mysql/conf.d directory that end with .cnf.

$ tail -1 /etc/mysql/my.cnf
!includedir /etc/mysql/conf.d/

The content of these files will override any repeated parameter configured in /etc/mysql/my.cnf, so you can create an alternative configuration here.
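
For example, we could drop a (hypothetical) custom.cnf into the config directory that we will mount on /etc/mysql/conf.d, and its settings will take precedence:

$ cat > ~/Docker/mariadb1/config/custom.cnf <<EOF
[mysqld]
max_connections = 250
EOF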

Let’s run our first MariaDB Docker Container:

$ docker run -d --name mariadb1 \
-p 33061:3306 \
-v ~/Docker/mariadb1/config:/etc/mysql/conf.d \
-v ~/Docker/mariadb1/datadir:/var/lib/mysql \
-e MYSQL_ROOT_PASSWORD=root123 \
-e MYSQL_DATABASE=dbtest \
mariadb

After this, we can check our containers running:

$ docker ps
CONTAINER ID        IMAGE               COMMAND                  CREATED              STATUS              PORTS                     NAMES
12805cc2d7b5        mariadb             "docker-entrypoint.s…"   About a minute ago   Up About a minute   0.0.0.0:33061->3306/tcp   mariadb1

The container log:

$ docker logs mariadb1
MySQL init process done. Ready for start up.
2019-06-03 23:18:01 0 [Note] mysqld (mysqld 10.3.15-MariaDB-1:10.3.15+maria~bionic) starting as process 1 ...
2019-06-03 23:18:01 0 [Note] InnoDB: Using Linux native AIO
2019-06-03 23:18:01 0 [Note] InnoDB: Mutexes and rw_locks use GCC atomic builtins
2019-06-03 23:18:01 0 [Note] InnoDB: Uses event mutexes
2019-06-03 23:18:01 0 [Note] InnoDB: Compressed tables use zlib 1.2.11
2019-06-03 23:18:01 0 [Note] InnoDB: Number of pools: 1
2019-06-03 23:18:01 0 [Note] InnoDB: Using SSE2 crc32 instructions
2019-06-03 23:18:01 0 [Note] InnoDB: Initializing buffer pool, total size = 256M, instances = 1, chunk size = 128M
2019-06-03 23:18:01 0 [Note] InnoDB: Completed initialization of buffer pool
2019-06-03 23:18:01 0 [Note] InnoDB: If the mysqld execution user is authorized, page cleaner thread priority can be changed. See the man page of setpriority().
2019-06-03 23:18:01 0 [Note] InnoDB: 128 out of 128 rollback segments are active.
2019-06-03 23:18:01 0 [Note] InnoDB: Creating shared tablespace for temporary tables
2019-06-03 23:18:01 0 [Note] InnoDB: Setting file './ibtmp1' size to 12 MB. Physically writing the file full; Please wait ...
2019-06-03 23:18:02 0 [Note] InnoDB: File './ibtmp1' size is now 12 MB.
2019-06-03 23:18:02 0 [Note] InnoDB: Waiting for purge to start
2019-06-03 23:18:02 0 [Note] InnoDB: 10.3.15 started; log sequence number 1630824; transaction id 21
2019-06-03 23:18:02 0 [Note] Plugin 'FEEDBACK' is disabled.
2019-06-03 23:18:02 0 [Note] InnoDB: Loading buffer pool(s) from /var/lib/mysql/ib_buffer_pool
2019-06-03 23:18:02 0 [Note] Server socket created on IP: '::'.
2019-06-03 23:18:02 0 [Note] InnoDB: Buffer pool(s) load completed at 190603 23:18:02
2019-06-03 23:18:02 0 [Note] Reading of all Master_info entries succeded
2019-06-03 23:18:02 0 [Note] Added new Master_info '' to hash table
2019-06-03 23:18:02 0 [Note] mysqld: ready for connections.
Version: '10.3.15-MariaDB-1:10.3.15+maria~bionic'  socket: '/var/run/mysqld/mysqld.sock'  port: 3306  mariadb.org binary distribution

And the content of our Docker datadir path (host):

$ ls -l ~/Docker/mariadb1/datadir/
total 249664
-rw-rw----   1 sinsausti  staff     16384 Jun  3 20:18 aria_log.00000001
-rw-rw----   1 sinsausti  staff        52 Jun  3 20:18 aria_log_control
drwx------   3 sinsausti  staff        96 Jun  3 20:18 dbtest
-rw-rw----   1 sinsausti  staff       976 Jun  3 20:18 ib_buffer_pool
-rw-rw----   1 sinsausti  staff  50331648 Jun  3 20:18 ib_logfile0
-rw-rw----   1 sinsausti  staff  50331648 Jun  3 20:17 ib_logfile1
-rw-rw----   1 sinsausti  staff  12582912 Jun  3 20:18 ibdata1
-rw-rw----   1 sinsausti  staff  12582912 Jun  3 20:18 ibtmp1
-rw-rw----   1 sinsausti  staff         0 Jun  3 20:17 multi-master.info
drwx------  92 sinsausti  staff      2944 Jun  3 20:18 mysql
drwx------   3 sinsausti  staff        96 Jun  3 20:17 performance_schema
-rw-rw----   1 sinsausti  staff     24576 Jun  3 20:18 tc.log

We can access the MariaDB container by running the following command and using the password specified in the MYSQL_ROOT_PASSWORD variable:

$ docker exec -it mariadb1 bash
root@12805cc2d7b5:/# mysql -p -e "SHOW DATABASES;"
Enter password:
+--------------------+
| Database           |
+--------------------+
| dbtest             |
| information_schema |
| mysql              |
| performance_schema |
+--------------------+

Here we can see our dbtest database was created.
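
Since we published container port 3306 on host port 33061, we can also connect from the host without entering the container, assuming a MySQL/MariaDB client is installed there and the root user was created with remote access (the default when MYSQL_ROOT_PASSWORD is set):

$ mysql -h 127.0.0.1 -P 33061 -uroot -p -e "SHOW DATABASES;"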

Docker Commands

Finally, let’s see some useful commands for managing Docker.

  • Search for an image
    $ docker search Image_Name
  • Download an image
    $ docker pull Image_Name
  • List the installed images
    $ docker images
  • List containers (with the -a flag we can also see the stopped containers)
    $ docker ps -a
  • Delete a Docker Image
    $ docker rmi Image_Name
  • Delete a Docker Container (the container must be stopped)
    $ docker rm Container_Name
  • Run a container from a Docker Image (with the -p flag we can map a container port to a host port)
    $ docker run -d --name Container_Name -p Host_Port:Guest_Port Image_Name
  • Stop a container
    $ docker stop Container_Name
  • Start a container
    $ docker start Container_Name
  • Check container logs
    $ docker logs Container_Name
  • Check container information
    $ docker inspect Container_Name
  • Create a linked container
    $ docker run -d --name Container_Name --link Container_Name:Image_Name Image_Name
  • Open a shell inside a container from the host
    $ docker exec -it Container_Name bash
  • Create a container with a volume attached
    $ docker run -d --name Container_Name --volume=/home/docker/Container_Name/conf.d:/etc/mysql/conf.d Image_Name
  • Commit the changes of a container to a new image
    $ docker commit Container_ID Image_Name:TAG

Conclusion

Docker is a really useful tool for sharing a development environment easily using a Dockerfile or publishing a Docker Image. By using it you can make sure that everyone is using the same environment. At the same time it’s also useful to recreate or clone an existing environment. Docker can share volumes, use private networks, map ports, and even more.

In this blog, we saw how to deploy MariaDB Server on Docker as a standalone server. If you want to use a more complex environment like Replication or Galera Cluster, you can use bitnami/mariadb to achieve this configuration.

MariaDB MaxScale Load Balancing on Docker: Deployment - Part 1


MariaDB MaxScale is an advanced, plug-in database proxy for MariaDB database servers. It sits between client applications and the database servers, routing client queries and server responses. MaxScale also monitors the servers, so it will quickly notice any changes in server status or replication topology. This makes MaxScale a natural choice for controlling failover and similar features.

In this two-part blog series we are going to give a complete walkthrough on how to run MariaDB MaxScale on Docker. This part covers the deployment as a standalone Docker container and MaxScale clustering via Docker Swarm for high availability.

MariaDB MaxScale on Docker

There are a number of MariaDB Docker images available in Docker Hub. In this blog, we are going to use the official image maintained and published by MariaDB called "mariadb/maxscale" (tag: latest). The image is around 71MB in size. At the time of writing, the image ships with MaxScale 2.3.4 pre-installed, together with its required packages.

Generally, the following steps are required to run MaxScale with this image in a container environment:

  1. A running MariaDB replication setup (master-slave or master-master), Galera Cluster or NDB Cluster
  2. Create and grant a database user dedicated to MaxScale monitoring
  3. Prepare the MaxScale configuration file
  4. Map the configuration file into the container, or load it into a Kubernetes ConfigMap or Docker Swarm Configs
  5. Start the container/pod/service/replicaset

Note that MaxScale is a product of MariaDB, which means it is tailored towards MariaDB server. Most of the features are still compatible with MySQL, except some parts like GTID handling, Galera Cluster configuration and internal data files. The version that we are going to use is 2.3.4, which is released under the Business Source License (BSL). The BSL keeps all the code open, and usage with fewer than three backend servers is free; beyond that, the company using it must pay for a commercial subscription. After a specific time period (2 years in the case of MaxScale) the release moves to GPL and all usage is free.

Just to be clear, since this is a test environment, we are fine to run more than two backend servers. As stated in the MariaDB BSL FAQ page:

Q: Can I use MariaDB products licensed under BSL in test and development environment?
A: Yes, In non-production test and development environment, you can use products licensed under BSL without needing a subscription from MariaDB

In this walkthrough, we already have a three-node MariaDB Replication deployed using ClusterControl. The following diagram illustrates the setup that we are going to deploy:

Our system architecture consists of:

  • mariadb1 - 192.168.0.91 (master)
  • mariadb2 - 192.168.0.92 (slave)
  • mariadb3 - 192.168.0.93 (slave)
  • docker1 - 192.168.0.200 (Docker host for containers - maxscale, app)

Preparing the MaxScale User

Firstly, create a MySQL database user for MaxScale and allow all hosts in the network 192.168.0.0/24:

MariaDB> CREATE USER 'maxscale'@'192.168.0.%' IDENTIFIED BY 'my_s3cret';

Then, grant the required privileges. If you just want to monitor the backend servers with load balancing, the following grants would suffice:

MariaDB> GRANT SHOW DATABASES ON *.* TO 'maxscale'@'192.168.0.%';
MariaDB> GRANT SELECT ON `mysql`.* TO 'maxscale'@'192.168.0.%';

However, MaxScale can do much more than routing queries. It has the ability to perform failover and switchover for example promoting a slave to a new master. This requires SUPER and REPLICATION CLIENT privileges. If you would like to use this feature, assign ALL PRIVILEGES to the user instead:

MariaDB> GRANT ALL PRIVILEGES ON *.* TO maxscale@'192.168.0.%';
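
You can double-check what the monitoring user ended up with:

MariaDB> SHOW GRANTS FOR 'maxscale'@'192.168.0.%';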

That's it for the user part.

Preparing MaxScale Configuration File

The image requires a working configuration file to be mapped into the container before it is started. The minimal configuration file provided in the container is not going to help us build the reverse proxy that we want. Therefore, the configuration file has to be prepared beforehand.

The following list can help us in collecting required basic information to construct our configuration file:

  • Cluster type - MaxScale supports MariaDB replication (master-slave, master-master), Galera Cluster, Amazon Aurora, MariaDB ColumnStore and NDB Cluster (aka MySQL Cluster).
  • Backend IP address and/or hostname - Reachable IP address or hostname for all backend servers.
  • Routing algorithm - MaxScale supports two types of query routing - read-write splitting and load balancing in round-robin.
  • Ports for MaxScale to listen on - By default, MaxScale uses port 4006 for round-robin connections and 4008 for read-write split connections. You may use a UNIX socket if you want.

In the current directory, create a text file called maxscale.cnf so we can map it into the container when starting up. Paste the following lines in the file:

########################
## Server list
########################

[mariadb1]
type            = server
address         = 192.168.0.91
port            = 3306
protocol        = MariaDBBackend
serv_weight     = 1

[mariadb2]
type            = server
address         = 192.168.0.92
port            = 3306
protocol        = MariaDBBackend
serv_weight     = 1

[mariadb3]
type            = server
address         = 192.168.0.93
port            = 3306
protocol        = MariaDBBackend
serv_weight     = 1

#########################
## MaxScale configuration
#########################

[maxscale]
threads                 = auto
log_augmentation        = 1
ms_timestamp            = 1
syslog                  = 1

#########################
# Monitor for the servers
#########################

[monitor]
type                    = monitor
module                  = mariadbmon
servers                 = mariadb1,mariadb2,mariadb3
user                    = maxscale
password                = my_s3cret
auto_failover           = true
auto_rejoin             = true
enforce_read_only_slaves = 1

#########################
## Service definitions for read/write splitting and read-only services.
#########################

[rw-service]
type            = service
router          = readwritesplit
servers         = mariadb1,mariadb2,mariadb3
user            = maxscale
password        = my_s3cret
max_slave_connections           = 100%
max_sescmd_history              = 1500
causal_reads                    = true
causal_reads_timeout            = 10
transaction_replay              = true
transaction_replay_max_size     = 1Mi
delayed_retry                   = true
master_reconnection             = true
master_failure_mode             = fail_on_write
max_slave_replication_lag       = 3

[rr-service]
type            = service
router          = readconnroute
servers         = mariadb1,mariadb2,mariadb3
router_options  = slave
user            = maxscale
password        = my_s3cret

##########################
## Listener definitions for the service
## Listeners represent the ports the service will listen on.
##########################

[rw-listener]
type            = listener
service         = rw-service
protocol        = MariaDBClient
port            = 4008

[ro-listener]
type            = listener
service         = rr-service
protocol        = MariaDBClient
port            = 4006

A bit of explanation for each section:

  • Server List - The backend servers. Define every MariaDB server of this cluster in its own stanza. The stanza name will be used when we specify the service definition further down. The component type must be "server".
  • MaxScale Configuration - Define all MaxScale related configurations there.
  • Monitor module - How MaxScale should monitor the backend servers. The component type must be "monitor" followed by either one of the monitoring modules. For the list of supported monitors, refer to MaxScale 2.3 Monitors.
  • Service - Where to route the query. The component type must be "service". For the list of supported routers, refer to MaxScale 2.3 Routers.
  • Listener - How MaxScale should listen to incoming connections. It can be port or socket file. The component type must be "listener". Commonly, listeners are tied to services.

So basically, we would like MaxScale to listen on two ports, 4006 and 4008. Port 4006 is specifically for round-robin connections, suitable for read-only workloads on our MariaDB Replication, while port 4008 is specifically for critical read and write workloads. We also want MaxScale to perform actions on our replication in case of a failover, switchover or slave rejoin, thus we use the monitor module called "mariadbmon".

Running the Container

We are now ready to run our standalone MaxScale container. Map the configuration file with -v and make sure to publish both listener ports, 4006 and 4008. Optionally, you can expose the MaxScale REST API interface on port 8989:

$ docker run -d \
--name maxscale \
--restart always \
-p 4006:4006 \
-p 4008:4008 \
-p 8989:8989 \
-v $PWD/maxscale.cnf:/etc/maxscale.cnf \
mariadb/maxscale

Verify with:

$ docker logs -f maxscale
...
2019-06-14 07:15:41.060   notice : (main): Started REST API on [127.0.0.1]:8989
2019-06-14 07:15:41.060   notice : (main): MaxScale started with 8 worker threads, each with a stack size of 8388608 bytes.
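
Note that the log shows the REST API bound to 127.0.0.1 inside the container, so publishing port 8989 alone won't make it reachable from outside. If you need external REST API access, one option is to bind it to all interfaces via the admin_host parameter in the [maxscale] section, something like:

[maxscale]
threads    = auto
admin_host = 0.0.0.0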

Ensure you see no errors in the logs above. Then verify that the docker-proxy processes are listening on the published ports - 4006, 4008 and 8989:

$ netstat -tulpn | grep docker-proxy
tcp6       0      0 :::8989                 :::*                    LISTEN      4064/docker-proxy
tcp6       0      0 :::4006                 :::*                    LISTEN      4092/docker-proxy
tcp6       0      0 :::4008                 :::*                    LISTEN      4078/docker-proxy

At this point, our MaxScale is running and capable of processing queries.

MaxCtrl

MaxCtrl is a command line administrative client for MaxScale which uses the MaxScale REST API for communication. It is intended to be the replacement software for the legacy MaxAdmin command line client.

To enter MaxCtrl console, execute the "maxctrl" command inside the container:

$ docker exec -it maxscale maxctrl
 maxctrl: list servers
┌──────────┬──────────────┬──────┬─────────────┬─────────────────┬─────────────┐
│ Server   │ Address      │ Port │ Connections │ State           │ GTID        │
├──────────┼──────────────┼──────┼─────────────┼─────────────────┼─────────────┤
│ mariadb1 │ 192.168.0.91 │ 3306 │ 0           │ Master, Running │ 0-5001-1012 │
├──────────┼──────────────┼──────┼─────────────┼─────────────────┼─────────────┤
│ mariadb2 │ 192.168.0.92 │ 3306 │ 0           │ Slave, Running  │ 0-5001-1012 │
├──────────┼──────────────┼──────┼─────────────┼─────────────────┼─────────────┤
│ mariadb3 │ 192.168.0.93 │ 3306 │ 0           │ Slave, Running  │ 0-5001-1012 │
└──────────┴──────────────┴──────┴─────────────┴─────────────────┴─────────────┘

To verify if everything is okay, simply run the following commands:

maxctrl: list servers
maxctrl: list services
maxctrl: list filters
maxctrl: list sessions

To get further info on a component, use the "show" command instead, for example:

maxctrl: show servers
┌──────────────────┬──────────────────────────────────────────┐
│ Server           │ mariadb3                                 │
├──────────────────┼──────────────────────────────────────────┤
│ Address          │ 192.168.0.93                             │
├──────────────────┼──────────────────────────────────────────┤
│ Port             │ 3306                                     │
├──────────────────┼──────────────────────────────────────────┤
│ State            │ Slave, Running                           │
├──────────────────┼──────────────────────────────────────────┤
│ Last Event       │ new_slave                                │
├──────────────────┼──────────────────────────────────────────┤
│ Triggered At     │ Mon, 17 Jun 2019 08:57:59 GMT            │
├──────────────────┼──────────────────────────────────────────┤
│ Services         │ rw-service                               │
│                  │ rr-service                               │
├──────────────────┼──────────────────────────────────────────┤
│ Monitors         │ monitor                                  │
├──────────────────┼──────────────────────────────────────────┤
│ Master ID        │ 5001                                     │
├──────────────────┼──────────────────────────────────────────┤
│ Node ID          │ 5003                                     │
├──────────────────┼──────────────────────────────────────────┤
│ Slave Server IDs │                                          │
├──────────────────┼──────────────────────────────────────────┤
│ Statistics       │ {                                        │
│                  │     "connections": 0,                    │
│                  │     "total_connections": 0,              │
│                  │     "persistent_connections": 0,         │
│                  │     "active_operations": 0,              │
│                  │     "routed_packets": 0,                 │
│                  │     "adaptive_avg_select_time": "0ns"│
│                  │ }                                        │
├──────────────────┼──────────────────────────────────────────┤
│ Parameters       │ {                                        │
│                  │     "address": "192.168.0.93",           │
│                  │     "protocol": "MariaDBBackend",        │
│                  │     "port": 3306,                        │
│                  │     "extra_port": 0,                     │
│                  │     "authenticator": null,               │
│                  │     "monitoruser": null,                 │
│                  │     "monitorpw": null,                   │
│                  │     "persistpoolmax": 0,                 │
│                  │     "persistmaxtime": 0,                 │
│                  │     "proxy_protocol": false,             │
│                  │     "ssl": "false",                      │
│                  │     "ssl_cert": null,                    │
│                  │     "ssl_key": null,                     │
│                  │     "ssl_ca_cert": null,                 │
│                  │     "ssl_version": "MAX",                │
│                  │     "ssl_cert_verify_depth": 9,          │
│                  │     "ssl_verify_peer_certificate": true, │
│                  │     "disk_space_threshold": null,        │
│                  │     "type": "server",                    │
│                  │     "serv_weight": "1"│
│                  │ }                                        │
└──────────────────┴──────────────────────────────────────────┘

Connecting to the Database

The application's database user must be granted with the MaxScale host since from MariaDB server point-of-view, it can only sees the MaxScale host. Consider the following example without MaxScale in the picture:

  • Database name: myapp
  • User: myapp_user
  • Host: 192.168.0.133 (application server)

To allow the user to access the database inside MariaDB server, one has to run the following statement:

MariaDB> CREATE USER 'myapp_user'@'192.168.0.133' IDENTIFIED BY 'mypassword';
MariaDB> GRANT ALL PRIVILEGES ON myapp.* to 'myapp_user'@'192.168.0.133';

With MaxScale in the picture, one has to run the following statement instead (replace the application server IP address with the MaxScale IP address, 192.168.0.200):

MariaDB> CREATE USER 'myapp_user'@'192.168.0.200' IDENTIFIED BY 'mypassword';
MariaDB> GRANT ALL PRIVILEGES ON myapp.* to 'myapp_user'@'192.168.0.200';

From the application, there are two ports you can use to connect to the database:

  • 4006 - Round-robin listener, suitable for read-only workloads.
  • 4008 - Read-write split listener, suitable for write workloads.

If your application is allowed to specify only one MySQL port (e.g., Wordpress, Joomla, etc), pick the RW port 4008 instead. This is the safest endpoint to connect to, regardless of the cluster type. However, if your application can handle connections to multiple MySQL ports, you may send the reads to the round-robin listener. This listener has less overhead and is much faster compared to the read-write split listener.

For our MariaDB replication setup, connect to either one of these endpoints as database host/port combination:

  • 192.168.0.200 port 4008 - MaxScale - read/write or write-only
  • 192.168.0.200 port 4006 - MaxScale - balanced read-only
  • 192.168.0.91 port 3306 - MariaDB Server (master) - read/write
  • 192.168.0.92 port 3306 - MariaDB Server (slave) - read-only
  • 192.168.0.93 port 3306 - MariaDB Server (slave) - read-only
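
A quick way to verify the routing is to check which backend answers through each listener (using the application user created earlier):

$ mysql -h 192.168.0.200 -P 4006 -umyapp_user -p -e "SELECT @@hostname;"
$ mysql -h 192.168.0.200 -P 4008 -umyapp_user -p -e "SELECT @@hostname;"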

Note that for multi-master cluster types like Galera Cluster and NDB Cluster, port 4006 can be used for balanced multi-write connections instead. With MaxScale you have many options to pick from when connecting to the database, each of them providing its own set of advantages.

MaxScale Clustering with Docker Swarm

With Docker Swarm, we can create a group of MaxScale instances via Swarm service with more than one replica together with Swarm Configs. Firstly, import the configuration file into Swarm:

$ cat maxscale.cnf | docker config create maxscale_config -

Verify with:

$ docker config inspect --pretty maxscale_config

Then, grant the MaxScale database user to connect from any Swarm hosts in the network:

MariaDB> CREATE USER 'maxscale'@'192.168.0.%' IDENTIFIED BY 'my_s3cret';
MariaDB> GRANT ALL PRIVILEGES ON *.* TO maxscale@'192.168.0.%';

When starting up the Swarm service for MaxScale, we can create multiple containers (called replicas) mapping to the same configuration file as below:

$ docker service create \
--name maxscale-cluster  \
--replicas=3 \
--publish published=4008,target=4008 \
--publish published=4006,target=4006 \
--config source=maxscale_config,target=/etc/maxscale.cnf \
mariadb/maxscale

The above will create three MaxScale containers spread across Swarm nodes. Verify with:

$ docker service ls
ID                  NAME                MODE                REPLICAS            IMAGE                     PORTS
yj6u2xcdj7lo        maxscale-cluster    replicated          3/3                 mariadb/maxscale:latest   *:4006->4006/tcp, *:4008->4008/tcp
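
Since all replicas share the same Swarm config, scaling the MaxScale layer up or down later is a one-liner:

$ docker service scale maxscale-cluster=5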

If the applications are running within the Swarm network, you can simply use the service name "maxscale-cluster" as the database host for your applications. Externally, you can connect to any of the Docker hosts on the published ports, and the Swarm network will route and balance the connections to the correct containers in round-robin fashion. At this point our architecture can be illustrated as below:

In the second part, we are going to look at advanced use cases of MaxScale on Docker like service control, configuration management, query processing, security and cluster reconciliation.
