Saturday, 20 May 2017

How to install MyRocks into MariaDB as a plugin?

What is MyRocks?
MyRocks is a storage engine that integrates RocksDB into MySQL, offering most features of InnoDB with the storage efficiency of RocksDB. It is said to provide 3-4x compression compared to uncompressed InnoDB tables and 10x less write amplification than InnoDB.

What is RocksDB?
RocksDB is an embedded key-value store (a fork of LevelDB) that persists data on disk, unlike Redis, which keeps data in memory. It uses a log-structured merge-tree (LSM-tree), which reduces write amplification.
https://en.wikipedia.org/wiki/Log-structured_merge-tree

Now let's come to the post topic. We are going to compile the MyRocks engine into MariaDB as a plugin.

Requirements

1) gcc 4.8.1 and above
The MyRocks documentation says it can be installed on CentOS 6.8 too, but that is a bit tedious since MyRocks requires gcc 4.8, which is not available through the default repositories on CentOS 6.8. So let's choose CentOS 7, which ships gcc 4.8.5.
2) gflags-devel
3) readline-devel
4) ncurses-devel
5) openssl-devel
6) lz4-devel
7) gdb
8) git

shell> rpm -qa | grep gcc
libgcc-4.8.5-11.el7.x86_64
gcc-4.8.5-11.el7.x86_64
gcc-gfortran-4.8.5-11.el7.x86_64
gcc-c++-4.8.5-11.el7.x86_64
shell> cat /etc/redhat-release
CentOS Linux release 7.3.1611 (Core)

Installation Steps


1) The first step in the installation process is to compile MariaDB with MyRocks support

shell> yum install -y gflags-devel readline-devel ncurses-devel openssl-devel lz4-devel gdb git
shell> cd /usr/local/
shell> git clone https://github.com/MariaDB/server.git mariadb-10.2
shell> cd mariadb-10.2
shell> git checkout bb-10.2-mariarocks
shell> git submodule init
shell> git submodule update
shell> cmake .
shell> make -j10

2) Next is installing the MySQL system tables. We'll use a different command instead of mysql_install_db

shell> cd /usr/local/mariadb-10.2
shell> cd mysql-test; ./mtr alias

The above step creates a data directory at the location below (named install.db). Nothing to be alarmed about with install.db; it's just a name, and its contents are not so different from a conventional MySQL data directory. We can rename it to whatever we wish.

shell> ls -ltrh mysql-test/var/install.db
total 23M
drwxr-xr-x. 2 root root 6 May 20 11:33 test
-rw-rw----. 1 root root 5.0M May 20 11:33 ib_logfile1
drwxrwx---. 2 root root 20 May 20 11:33 performance_schema
drwxr-xr-x. 2 root root 4.0K May 20 11:33 mysql
drwxrwx---. 2 root root 4.0K May 20 11:33 mtr
-rw-rw----. 1 root root 2.8K May 20 11:33 ib_buffer_pool
-rw-rw----. 1 root root 5.0M May 20 11:33 ib_logfile0
-rw-rw----. 1 root root 12M May 20 11:33 ibdata1

3) Now we can copy the data directory install.db to our location of interest

shell> cp -r mysql-test/var/install.db /data/mysqldata/install.db
shell> cp support-files/my-small.cnf /etc/my.cnf

4) Now add the below lines to /etc/my.cnf to enable MyRocks

plugin-dir=/usr/local/mariadb-10.2/storage/rocksdb
language=/usr/local/mariadb-10.2/sql/share/english
datadir=/data/mysqldata/install.db
plugin-load=ha_rocksdb
default-storage-engine=rocksdb
skip-innodb
default-tmp-storage-engine=MyISAM
collation-server=latin1_bin
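
Before starting the server it's worth confirming that the build actually produced the plugin library in the directory we pointed plugin-dir at. A quick sanity check (assuming the in-tree build left ha_rocksdb.so under storage/rocksdb):

shell> ls -l /usr/local/mariadb-10.2/storage/rocksdb/ha_rocksdb.so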


5) We have now got everything installed and ready. Let's start the server.

shell> ./sql/mysqld --defaults-file=/etc/my.cnf --user=mysql &

[2] 12245 [root@server1 mariadb-10.2]# 2017-05-20 12:52:52 140303005308992 [Note] ./sql/mysqld (mysqld 10.2.6-MariaDB-log) starting as process 12245 ...
2017-05-20 12:52:52 140303005308992 [Note] RocksDB: 2 column families found
2017-05-20 12:52:52 140303005308992 [Note] RocksDB: Column Families at start:
2017-05-20 12:52:52 140303005308992 [Note] cf=default
2017-05-20 12:52:52 140303005308992 [Note] write_buffer_size=67108864
2017-05-20 12:52:52 140303005308992 [Note] target_file_size_base=67108864
2017-05-20 12:52:52 140303005308992 [Note] cf=__system__
2017-05-20 12:52:52 140303005308992 [Note] write_buffer_size=67108864
2017-05-20 12:52:52 140303005308992 [Note] target_file_size_base=67108864
2017-05-20 12:52:52 140303005308992 [Note] RocksDB: Table_store: loaded DDL data for 0 tables
2017-05-20 12:52:52 140303005308992 [Note] RocksDB: global statistics using get_sched_indexer_t indexer
2017-05-20 12:52:52 140303005308992 [Note] RocksDB instance opened
2017-05-20 12:52:52 140303005308992 [Note] Plugin 'InnoDB' is disabled.
2017-05-20 12:52:52 140303005308992 [Note] Plugin 'FEEDBACK' is disabled.
2017-05-20 12:52:52 140303005308992 [Note] Server socket created on IP: '::'.
2017-05-20 12:52:52 140303005308992 [Warning] 'user' entry 'root@server1.ngl.com' ignored in --skip-name-resolve mode.
2017-05-20 12:52:52 140303005308992 [Warning] 'proxies_priv' entry '@% root@server1.ngl.com' ignored in --skip-name-resolve mode.
2017-05-20 12:52:52 140303005308992 [Note] Reading of all Master_info entries succeded
2017-05-20 12:52:52 140303005308992 [Note] Added new Master_info '' to hash table
2017-05-20 12:52:52 140303005308992 [Note] ./sql/mysqld: ready for connections.
Version: '10.2.6-MariaDB-log' socket: '/tmp/mysql3310.sock' port: 3310 Source distribution

Great. We have successfully started MariaDB. From the mysqld messages we can see that MariaDB has started with the MyRocks engine enabled.

So simple, isn't it?


Let's log in to the MySQL instance and query information_schema to see if it shows RocksDB

shell> ./sql/mysql -uroot -p
password:
mysql> select ENGINE,SUPPORT,TRANSACTIONS from information_schema.engines where engine='ROCKSDB'\G
*************************** 1. row ***************************
ENGINE: ROCKSDB
SUPPORT: DEFAULT
TRANSACTIONS: YES
1 row in set (0.00 sec)

Yes, RocksDB is available as a plugin with support for transactions.


Let's create a RocksDB table

mysql> create database ara;
mysql> \u ara
mysql> create table rocks_test( name varchar(10),age int(10));
mysql> show create table rocks_test\G
*************************** 1. row ***************************
Table: rocks_test
Create Table: CREATE TABLE `rocks_test` (
`name` varchar(10) COLLATE latin1_bin DEFAULT NULL,
`age` int(10) DEFAULT NULL
) ENGINE=ROCKSDB DEFAULT CHARSET=latin1 COLLATE=latin1_bin
1 row in set (0.00 sec)
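
Because my.cnf sets default-storage-engine=rocksdb, the table picked up ENGINE=ROCKSDB automatically. If you don't want to change the server default, you can also ask for the engine explicitly per table:

mysql> create table rocks_test2( name varchar(10), age int(10)) engine=rocksdb;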

Let's insert some data


mysql> insert into rocks_test values ('aravinth',30);
mysql> insert into rocks_test values ('raju',31);

mysql> select * from rocks_test\G
*************************** 1. row ***************************
name: aravinth
age: 30
*************************** 2. row ***************************
name: raju
age: 31
2 rows in set (0.00 sec)


That's it. All done.

In the next post we will see if MyRocks actually provides 3-4x compression compared to uncompressed InnoDB tables.

Thursday, 18 May 2017

MySQL Enterprise backup : How to setup a slave for a standalone server with zero downtime?

 

 

It's a well-known fact that we can take backups with almost zero downtime using the mysqldump tool provided with the community edition of MySQL, thanks to the magical option --single-transaction, on one condition: all the tables must be InnoDB. MyISAM tables are locked even with the --single-transaction option.

 

Now let us talk about the MySQL Enterprise Backup tool that is only available with the Enterprise edition. We are also offered a 30-day trial. It is said that MEB (let us just call MySQL Enterprise Backup "MEB") provides hot online backups with zero downtime.

 

So a hot backup is done online and doesn't disrupt the normal operations of the database; it even captures the database operations happening during the backup process and applies them during the restoration process. This is applicable only to InnoDB tables.

 

When it comes to tables in other engines like MyISAM and CSV, it has to read-lock the tables.

So this is pretty much similar to mysqldump with --single-transaction in terms of locking and disruption to database operations, ain't it? But mysqldump produces a SQL file whereas MEB produces a backup of the data directory.

 

Wait, this is not the only difference. There are several other features available in MEB like parallelism, incremental backup, differential backup, backup compression, backup encryption, etc. Let's not get into those now.

Let's first concentrate on creating a slave from a standalone server, with very minimal disruption to the database, in the simplest way possible using MEB.

 

I have a MySQL community edition installed on port 3306 of my test server. I have downloaded the trial version of MEB from the below link

https://www.mysql.com/trials/

There are several packages available. Let's choose the generic linux version.

How to install MEB ?

 

The downloaded package is in zip format.

Unzip the file and you'll get the files below

 

meb-3.12.3-linux-glibc2.5-x86-64bit.tar.gz
meb-3.12.3-linux-glibc2.5-x86-64bit.tar.gz.asc
meb-3.12.3-linux-glibc2.5-x86-64bit.tar.gz.md5

The .tar.gz file is the one we need.

Arav $] tar -xzvf meb-3.12.3-linux-glibc2.5-x86-64bit.tar.gz
Arav $] cd meb-3.12.3-linux-glibc2.5-x86-64bit
Arav $] cd bin/
Arav $] ls -ltrh
-rwxr-xr-x. 1 7161 31415 8.0M May  5  2016 mysqlbackup
 

That's it, we have just installed MEB. So easy, isn't it?

Now the steps to take a backup. We are interested in taking the backup into a single image file.

  

 
Arav $] ./mysqlbackup --user=root --password --host=127.0.0.1 --backup-image=/home/aravinth/ent_backup/my_ba.mbi --backup-dir=/home/aravinth/ent_backup/ backup-to-image
 

 

--backup-image= : location to place the backup image

--backup-dir= : stores temporary files created during the backup. It also holds the log file that records the course of the backup process.

The location would be /home/aravinth/ent_backup/meta/MEB_2017-05-17.12-22-52_image_backup.log.

This file also holds the binary log position needed to configure the slave.

 

To get the binary log position run the below command

  

 

Arav $] grep 'INFO: MySQL binlog position:'  /home/aravinth/ent_backup/meta/MEB_2017-05-17.12-22-52_image_backup.log

 

170517 12:40:27 mysqlbackup: INFO: MySQL binlog position: filename mysql-bin.000436, position 55853225

 

Now that the backup is complete and we have the exact binlog position, we can configure the slave.

 

Let's start by installing MySQL in the slave server. Login to the slave server and use the below commands to install MySQL

 

 

Arav $] wget https://dev.mysql.com/get/Downloads/MySQL-5.6/mysql-5.6.36-linux-glibc2.5-x86_64.tar.gz

Arav $] tar -xzvf mysql-5.6.36-linux-glibc2.5-x86_64.tar.gz

Arav $] mv mysql-5.6.36-linux-glibc2.5-x86_64 /usr/local/mysql

Arav $] cd /usr/local/mysql

Arav $] mkdir /mysqldata/

Arav $] chown -R mysql.mysql /mysqldata/

Arav $] cp support-files/mysql.server /etc/

Arav $] ./scripts/mysql_install_db --datadir=/mysqldata/ --user=mysql

Arav $] ./bin/mysqld_safe --datadir=/mysqldata/ --user=mysql &

 

 
Now shut down MySQL on the slave server and empty the data directory to start the restoration process. MEB needs an empty data directory for the restoration. You could also use the --force option, which replaces the files in the data directory.
 

 

Arav $ ] ./bin/mysqladmin -uroot -p shutdown

Arav $] rm -rf /mysqldata/*

 

 
 
Now run the below command on the master server to restore the backup into the empty data directory of the slave server.
 

 

Arav $] ./mysqlbackup --user=root --password --port=3310 --host=192.168.1.2 --datadir=/home/aravinth/ent_test/mysql/data/ --backup-image=/home/aravinth/ent_backup/my_ba.mbi --backup-dir=/tmp/ent_backup/ copy-back-and-apply-log

 

 

--host=192.168.1.2 : the IP of the slave server

 

--datadir=/home/aravinth/ent_test/mysql/data/ : the data directory path on the slave server, which must be empty. If it is not empty, MEB throws an error; we could also use the --force option to override it.

 

--backup-dir=/tmp/ent_backup/ : temporary directory to store the temporary files created during the restore process

 

copy-back-and-apply-log : this option tells MEB to restore the files to the data directory and apply the changes that happened during the backup process

 

Wait for the restoration to complete. It took around 5 minutes for a 30G data directory. Once the restoration process is complete, start MySQL using mysqld_safe or "service mysql start" and configure the slave as shown below.

 

Create the replication user on the master. Login to the master server and execute the below commands to create the replication user

Arav $] ./bin/mysql -u root -p

mysql> grant replication slave on *.* to replicator@'192.168.1.2' identified by 'replica';

 

 

Use the below commands on the slave server to configure the slave using the position that we got from the MEB log file.

 

Arav $] ./bin/mysql -u root -p

mysql> change master to master_host='192.168.1.1',master_user='replicator',master_password='replica',master_log_file='mysql-bin.000436',master_log_pos=55853225;

mysql> start slave;

 

 

Now run "Show slave status" check if the slave is running perfect. Slave should be behind by a few seconds depending on the hits in the master server. It should start catching up.

 

Once the slave has caught up with the master, it can be made available for read traffic from the application.

 

Now let's look at the backup and restoration speed

 

Backup Type            mysqldump          MEB
Data Directory Size    30G                30G
Backup Time            20 mins 40 secs    18 mins 20 secs
Restore Time           35 mins 11 secs    5 mins 41 secs

 

Restore time is much lower with MEB since the restore involves only disk-level operations. The backup times are comparable, yet overall it is a great improvement, and that too with no downtime.

Also, I believe that with compression enabled in the MEB backup, the time taken would go down further. Let's explore that another day.

Thursday, 18 December 2014

Drop Table slow


A few days back we were given a MySQL database hosted in EC2 with around 5TB of data, to move some of the tables to a new server. The intention was to share the tables between 2 EC2 instances. Since AWS has the option to take online snapshots, the plan was to take a snapshot, create a new machine from that snapshot, and drop the unwanted tables.
So everything went as planned up to creating the new machine from the snapshot. The real challenge was dropping the unwanted tables.
It took around 4 minutes to drop a table of size 20GB, and 20 minutes to drop a 100GB table. The time kept increasing for larger tables. mysqld even went "defunct" when we killed the drop query, and at times crashed.
To track down this issue we executed the drop table in one session and checked the processlist from another session, which gave the below output.

mysql> show processlist \G
*************************** 1. row ***************************
     Id: 2352
   User: dbuser
   Host: localhost
     db: db
Command: Query
   Time: 2573
  State: checking permissions
   Info: DROP TABLE `test`

I wondered if it was a MySQL user-related permission or an OS-related one.
Then I went on to check the "InnoDB status" and found something interesting in the "ROW OPERATIONS" section. Printing it below

--------------
ROW OPERATIONS
--------------
0 queries inside InnoDB, 0 queries in queue
1 read views open inside InnoDB
Main thread process no. 6117, id 47405873539392, state: doing insert buffer merge
Number of rows inserted 167872705, updated 1018100, deleted 55251906, read 19531441141
71.93 inserts/s, 0.00 updates/s, 0.00 deletes/s, 0.00 reads/s

The InnoDB status output also showed lots of semaphore waits on the data dictionary.
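
The pending insert buffer work itself shows up in the "INSERT BUFFER AND ADAPTIVE HASH INDEX" section of the same output; a minimal way to watch the merge progress:

mysql> show engine innodb status\G

The Ibuf: size and merges counters in that section indicate how much buffered index work is still waiting to be merged to disk.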


InnoDB Change Buffering :

From the MySQL reference manual: "Index changes resulting from SQL statements, which could normally involve random I/O operations, are held back and performed periodically by a background thread. This sequence of operations can write the disk blocks for a series of index values more efficiently than if each value were written to disk immediately."

So here is what had really happened. The given DB server was big in data, used to get frequent inserts, and didn't have enough RAM to hold all the pages in the InnoDB buffer pool. So it had to cache the secondary index changes in the insert buffer. These cached changes are merged to disk only when the pages are loaded into the buffer pool or when the server is idle. When we took a snapshot and mounted it on a new server, the database was idle, so the InnoDB main thread started merging all the changes cached in the insert buffer to disk. It was the InnoDB main thread that held the lock on those tables, and the drop statements we executed had to wait for the InnoDB main thread to complete the insert buffer merge process.

So we waited for the buffer merge process to complete and then we executed the drop statements.
All those tables got dropped in seconds. Even a 700GB table got dropped in 5 to 10 seconds.
Also, setting "innodb_fast_shutdown=0" and shutting down MySQL performs the buffer merge operation.
But what I feel is that buffer merging by keeping the server idle, or by setting "innodb_fast_shutdown=0" and restarting, takes about the same amount of time.
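
A minimal sketch of that shutdown route; the variable is dynamic, so it can be set just before stopping the server:

mysql> set global innodb_fast_shutdown = 0;

and then, from the shell: mysqladmin -u root -p shutdown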

Also, the status "checking permissions" that we got in the processlist output is very misleading. It is natural for everybody to think that the drop statement is waiting for a MySQL user permission or an OS-related permission. But actually it was waiting for the InnoDB main thread.

I still wonder why we need to merge buffered changes for a table that is about to be dropped. I need to think about that further.

Note: There might be some other reasons for DROP TABLE slowness too, but this is the one I have faced to date.

--Aravinth C


Tuesday, 23 September 2014

How to backup MySQL database with zero downtime


Choosing the right backup solution is most important for any database. People always choose a backup solution that has the below qualities

 a) Minimal Downtime
 b) Consistent data
 c) Less restoration time

There are several backup solutions like mysqldump, mysqlhotcopy, Xtrabackup, raw copy using cp/rsync/scp (and of course MySQL Enterprise Backup, which we are not going to talk about here), etc. I always prefer mysqldump when the data size is small (a few GBs), even if it locks the tables. But what if the data size is huge, in TBs? You cannot bring down the DB for a whole day; no client is comfortable with this, they need minimal or no downtime. Percona's Xtrabackup actually comes in handy in such situations, since it locks the tables only while dumping the metadata.

                But what we are really interested in here is the LVM snapshot. With LVM snapshots we can actually take backups with zero downtime.

Requirements

 a) Disk should be partitioned with LVM
 b) All contents of the data directory should reside in the same LVM partition; no symbolic links
 c) There should be sufficient free extents (unallocated space) available in the same volume group as the partition containing the MySQL data directory. Practically, you can create an LVM snapshot with as little as 32M, but if your database is a busy system with many changes, the snapshot will get invalidated. So it is always better to keep extra space in the volume group

What are LVM snapshots?

Logical volumes store data in two essential parts:
 a) metadata pointers (similar to metadata in MySQL)
 b) data blocks (the actual data)
When you create a snapshot of a logical volume, LVM just creates metadata pointers, and these pointers occupy very little space; the data blocks are the actual data. I was able to keep 600 GB of data in a 32 MB LVM snapshot. Whenever changes are made in the original partition, the original data blocks have to be copied into the snapshot, and this consumes snapshot space. The snapshot gets invalidated when these blocks no longer fit in the space assigned for the snapshot. So I would always recommend giving the snapshot at least 20% of the space occupied by the data directory.
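
While the backup copy is running you can watch how full the snapshot is getting; a minimal check:

lvs

Watch the Data% column (older releases label it Snap%) for the snapshot volume. If it approaches 100, grow the snapshot before it is invalidated, e.g. lvextend -L +1G <snapshot LV>.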

Now type "vgdispay" list out the available volume groups

vgdisplay 
--- Volume group ---
  VG Name               vg_www
  System ID
  Format                lvm2
  Metadata Areas        2
  Metadata Sequence No  22
  VG Access             read/write
  VG Status             resizable
  MAX LV                0
  Cur LV                6
  Open LV               6
  Max PV                0
  Cur PV                2
  Act PV                2
  VG Size               1.09 TiB
  PE Size               4.00 MiB
  Total PE              285944
  Alloc PE / Size       285652 / 1.09 TiB
  Free  PE / Size       292 / 1.14 GiB
  VG UUID               37jef7-3q3E-FyZS-lMPG-5Jzi-djdO-Bgopa
 

I have about 1.14 GiB free, which is enough for me to demonstrate.


Now type lvdisplay to list all the LVM logical volumes

lvdisplay 
 --- Logical volume ---
  LV Path                /dev/volume_1/vol
  LV Name                lv_data
  VG Name                vg_www
  LV UUID                ECue1P-lZGa-qeAf-yD6l-eGqH-BtXG-ot5oXr
  LV Write Access        read/write
  LV Creation host, time  2014-07-18 10:49:57 
  LV Status              available
  # open                 1
  LV Size                500.00 GiB
  Current LE             128000
  Segments               1
  Allocation             inherit
  Read ahead sectors     auto
  - currently set to     256
  Block device           253:4




"/dev/vg_www/lv_data" is my logical volume that has the MySQL data directory.

Now we are ready to take the snapshot. It takes seconds to create snapshots




 lvcreate -s -L 1G -n vol_snapshot /dev/volume_1/vol
  Logical volume "vol_snapshot" created
Now the snapshot is ready; let's create a mount point and mount the snapshot.


mkdir /mnt/vol
mount /dev/volume_1/vol_snapshot /mnt/vol

Now you can copy the MySQL data directory from the snapshot partition "/mnt/vol" to a NAS or anywhere you want. This copy process never locks MySQL or hinders the original data directory. If you intend to create a new slave from this backup, you can just "rsync" the files to the slave server, change the server id and the relay log/binary log names, and start MySQL. Then reset the slave and configure it to start from the exact position.
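
A minimal sketch of that copy, assuming the slave is reachable as slave-host and uses /var/lib/mysql as its data directory (both are placeholders):

rsync -av /mnt/vol/mysql/ slave-host:/var/lib/mysql/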

LVM snapshots can also be used as a substitute for the original partition in case you lose the data in the original partition. This can be done easily with LVM commands like lvconvert and lvchange. Perhaps we will talk about this another time.


Monday, 15 September 2014

Configuring multiple Galera cluster nodes in a single machine


What I love about Galera cluster is its simplicity, its SST/IST protocol, and the native MySQL feel.
SST (State Snapshot Transfer) is the mechanism Galera uses to transfer data between the nodes. Galera supports 3 types of SST:
               1)      mysqldump
               2)      rsync
               3)      xtrabackup (Xtrabackup should be installed separately)
Xtrabackup is the most preferred one, with the least downtime, since it locks only while the schema is fetched.

There are lots of superb features in Galera that we could talk about, but we shall skip those and concentrate on what the topic is about.

Practically speaking, I wouldn't recommend running multiple Galera nodes on a single machine. But you might run into a situation where you want to create a development environment and you have only a single machine; the procedure I am going to explain here can be handy in such situations.
Every Galera reference manual instructs us to go for a minimum of 3 nodes even though it is possible with 2. So we are also going for a 3-node setup on a single machine.

This procedure might not be useful for those who do not have any basic idea of Galera cluster. I assume that you already have a basic idea of Galera and its working mechanism.

The first thing to understand while setting up multiple nodes on a single machine is that each Galera node needs its own set of ports and paths, not used by any other process/daemon:
              1)      MySQL port (default 3306) should be different
              2)      Port for state transfer SST/IST (default 4567) should be different
              3)      my.cnf files (/etc/my.cnf) should be different
              4)      Data directories should be different
              5)      Sockets must be different (/tmp/mysql.sock)

I am listing the details of the 3 nodes below:

Node     MySQL port   cnf file           Data directory    Galera address (wsrep_node_address)
node1    3310         /etc/my_3310.cnf   /data/new/gal1    10.0.0.20:4567
node2    3311         /etc/my_3311.cnf   /data/new/gal2    10.0.0.20:4568
node3    3312         /etc/my_3312.cnf   /data/new/gal3    10.0.0.20:4569

Now download the binary tarball of either Percona XtraDB Cluster or MariaDB Galera Cluster (both have the Galera cluster libraries included). I used Percona XtraDB Cluster 5.6 (PXC).

The first step is to install 3 separate MySQL instances with different cnfs, ports (3310, 3311, and 3312), data directories and sockets. Once you are done with installing and configuring the 3 MySQL instances, keep them stopped.

      1)   Now follow the below steps to configure cluster in Node1.

Edit the cnf of Node1 and add the below changes

[root@cluster1 ~]# vim /etc/my_3310.cnf

datadir=/data/new/gal1
basedir=/usr/local/maria_cluster/
wsrep_provider=/usr/lib64/galera/libgalera_smm.so
query_cache_size=0
binlog_format=row
default_storage_engine=innodb
innodb_autoinc_lock_mode=2
wsrep_cluster_address=gcomm://10.0.0.20:4568,10.0.0.20:4569
wsrep_cluster_name=cluster1
wsrep_node_name=node1
wsrep_sst_method=rsync
wsrep_node_address=10.0.0.20:4567
   


Now start the cluster Node1 using the below command


[root@cluster1 ~]# mysqld_safe --defaults-file=/etc/my_3310.cnf  --wsrep-new-cluster --user=mysql &

 


The option "--wsrep-new-cluster" is a must when starting the first node. If this option is not used galera cluster will look for all the nodes mentioned in “wsrep_cluster_address”, but those nodes are down.
   


      2)  Now to add Node2 to the cluster follow the below steps

[root@cluster1 ~]# vim /etc/my_3311.cnf

datadir=/data/new/gal2
basedir=/usr/local/maria_cluster/
wsrep_provider=/usr/lib64/galera/libgalera_smm.so
query_cache_size=0
binlog_format=row
default_storage_engine=innodb
innodb_autoinc_lock_mode=2
wsrep_cluster_address=gcomm://10.0.0.20:4567,10.0.0.20:4569
wsrep_cluster_name=cluster1
wsrep_node_name=node2
wsrep_sst_method=rsync
wsrep_node_address=10.0.0.20:4568 
 Now use the below command to start Node2 


[root@cluster1 ~]# mysqld_safe --defaults-file=/etc/my_3311.cnf  --user=mysql &




      3)  Now add Node3 to the cluster.

[root@cluster1 ~]# vim /etc/my_3312.cnf

datadir=/data/new/gal3
basedir=/usr/local/maria_cluster/
wsrep_provider=/usr/lib64/galera/libgalera_smm.so
query_cache_size=0
binlog_format=row
default_storage_engine=innodb
innodb_autoinc_lock_mode=2
wsrep_cluster_address=gcomm://10.0.0.20:4567,10.0.0.20:4568
wsrep_cluster_name=cluster1
wsrep_node_name=node3
wsrep_sst_method=rsync
wsrep_node_address=10.0.0.20:4569
 

Now use the below command to start Node3


[root@cluster1 ~]# mysqld_safe --defaults-file=/etc/my_3312.cnf --user=mysql &

 

Now to test the setup, login to each of the nodes, create some databases/tables, and check if you are able to see those databases/tables on the other nodes; if so, you are good to go.
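
A quicker sanity check is the wsrep status counters; with all three nodes joined, the cluster size should read 3 and each node should report itself as Synced:

mysql> show status like 'wsrep_cluster_size';
mysql> show status like 'wsrep_local_state_comment';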




Friday, 25 July 2014

Cost of enabling general log in MySQL

At some point in time every DBA must have come across a situation where he has to enable the general log to debug an issue, or even to capture all the queries hitting the DB. In such situations what most DBAs do is inform the stakeholders that they are enabling the general log and that this might cause a drop in performance, and in turn the stakeholders ask a question like
"What will be the percentage of the performance drop? Can you give us a number?"

The easy answer is “It is difficult to provide a number”.
I had wanted to address this, but never really made an effort until I saw the blog post "PERFORMANCE_SCHEMA VS SLOW QUERY LOG": http://www.mysqlperformanceblog.com/2014/02/11/performance_schema-vs-slow-query-log/

In that particular blog Peter explains why he chose the slow query log to capture slow queries instead of the PERFORMANCE SCHEMA. At some point he talks about the overhead of enabling the PERFORMANCE SCHEMA; he used the sysbench tool to calculate the overhead and provides an approximate value. I wanted to use the same technique to calculate the overhead of the general log.
I did some benchmarks that gave me some interesting numbers.

The machine I used is a dedicated box with 12 cores. I am mentioning it here because it plays an important role in my benchmarks.

sysbench --test=oltp --oltp-test-mode=simple --num-threads=8 --max-requests=0 --max-time=100 --mysql-db=test --db-driver=mysql --mysql-user=root --oltp-table-size=1000000 --mysql-password='' run
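
Between runs, the general log can be toggled at runtime without a restart; a minimal sketch (the log file path is just an example):

mysql> set global general_log_file = '/tmp/general.log';
mysql> set global general_log = 'ON';
-- run the benchmark, then:
mysql> set global general_log = 'OFF';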

I did a series of benchmarks by gradually increasing the number of threads from 4 up to 25.


From the chart it is evident that increasing the number of threads increases the "queries per second" until the number of threads matches the number of CPU cores. The machine has 12 cores, and it is obvious that increasing the number of threads beyond 12 has little effect on "queries per sec".
Now the overhead:


With the number of threads = 4, the overhead was around 14%. Increasing the number of threads gradually increases the overhead up to 42%, and it stays around 42% regardless of increasing the number of threads above 12 (which is equal to the number of CPU cores).
But this doesn't mean that enabling the general log results in an overhead of 40% unless the system is very busy serving many parallel threads at the same instant.
This number could also vary depending on the disk IO. I am interested to see the results of this benchmark on different hardware environments. I would be happy if somebody could do the same benchmark on their machines and share the results here.
                                                                                                                                                  --Aravinth

Friday, 7 June 2013

Why is my MySQL server using a single core while my processor has multiple cores ?

I am writing my first ever blog post, four and a half years after I started my career as a MySQL DBA. I used to read lots of blog posts, especially from mysqlperformanceblog, but never really had the interest to write one. But this interesting question from a client made me write this blog post.

“Why is my MySQL consuming only “core 0”, while I have 8 cores and there is hardly any usage in the other cores?”

This is a question put forward by one of our clients. I started my investigation by looking into the CPU usage stats. It was true that only "core 0" was getting used and the other cores were rarely getting used.

 Usage of “core 0” was like below


Whereas there is hardly any usage in the other cores, as shown below






But if we look at the overall CPU usage, it rarely crossed 30%. Still the client questioned, "why can't MySQL share the usage on 'core 0' with the other cores?"

On first look I found that the machine was dedicated to MySQL with multi-thread support, all tables were InnoDB, and InnoDB had been configured to use multiple threads. I even checked the CPU affinity, which was set to all cores. So there shouldn't have been any issue with multi-threading.

I started digging further. I issued "show processlist" and the "top" command in different terminals. In the top command I enabled the "show threads" option and ignored idle threads.
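
A minimal sketch of that per-thread view (pressing H inside top toggles the threads display; -p limits it to the mysqld process, assuming a single mysqld is running):

top -H -p $(pidof mysqld)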









From the above analysis I found there were no parallel active threads. Most of the time MySQL gets a single read query (all the other threads are in sleep state). Since "core 0" is logically the first choice, that single query was obviously handled by "core 0". That's the reason for the spike in "core 0". There is no possibility of multiple cores getting used unless there is more than one query running in parallel, since a single query (connection) can use only one core.

By prolonged monitoring of the processlist and top, I also found that at times there were parallel queries that used multiple cores.

Also, I felt that "core 0" usage reaching 60% was not a big deal. But whenever there is room for improvement we should take it; that 60% usage was caused by a bad query which could be tuned.
So I conclude that we need multiple connections running in parallel to make MySQL use multiple cores.

Suggestions and discussions are greatly appreciated.