Difference between revisions of "MAN06 Failover for MySQL grid based services"
Line 2: | Line 2: | ||
<br /> | <br /> | ||
__TOC__ | |||
= Introduction = | |||
* Several critical grid services such as the Logical File Catalogue (LFC) or the VO Management Service (VOMS) server represent single points of failure in a grid infrastructure. When unavailable, a user can no longer access to the infrastructure since it is prevented from issuing new proxies, or is no longer able to access to the physical location of his data. | |||
* However, those services rely on MySQL backends which opens a window to replicate the relevant databases to different / backup services which could be used when the primary instances are unavailable. The MySQL DB replication process is one of the ways to get scability and higher availability. | |||
<br /> | |||
= Arquitecture = | |||
In this document we propose to follow a Master-Slave arquitecture for the MySQL replication, consisting in keeping DB copies of a main host (MASTER) in a secondary host (SLAVE). The slave host will only have read access to the Database entries. | |||
<br /> | |||
= Security = | |||
* For better availability, it is preferable to deploy the Master and the Slave services in different geographical locations, which normally means exposing the generated traffic to the internet. In that case, you will have to find a mechanims to encrypt the communication between the two hosts. | |||
* In this document, we propose to use '''Stunnel''': | |||
# Stunnel is a free multi-platform computer program, used to provide universal TLS/SSL tunneling service. | |||
# Stunnel can be used to provide secure encrypted connections for clients or servers that do not speak TLS or SSL natively. It runs on a variety of operating systems, including most Unix-like operating systems and Windows. | |||
# Stunnel relies on a separate library such as OpenSSL or SSLeay to implement the underlying TLS or SSL protocol. | |||
# Stunnel uses Public-key cryptography with X.509 digital certificates to secure the SSL connection. Clients can optionally be authenticated via a certificate too | |||
# For more references, please checl [http://www.stunnel.org/ www.stunnel.org] | |||
= The LFC example = | |||
= The VOMS server example = | |||
Revision as of 15:59, 15 June 2011
Introduction
- Several critical grid services such as the Logical File Catalogue (LFC) or the VO Management Service (VOMS) server represent single points of failure in a grid infrastructure. When unavailable, a user can no longer access to the infrastructure since it is prevented from issuing new proxies, or is no longer able to access to the physical location of his data.
- However, those services rely on MySQL backends which opens a window to replicate the relevant databases to different / backup services which could be used when the primary instances are unavailable. The MySQL DB replication process is one of the ways to get scability and higher availability.
Arquitecture
In this document we propose to follow a Master-Slave arquitecture for the MySQL replication, consisting in keeping DB copies of a main host (MASTER) in a secondary host (SLAVE). The slave host will only have read access to the Database entries.
Security
- For better availability, it is preferable to deploy the Master and the Slave services in different geographical locations, which normally means exposing the generated traffic to the internet. In that case, you will have to find a mechanims to encrypt the communication between the two hosts.
- In this document, we propose to use Stunnel:
- Stunnel is a free multi-platform computer program, used to provide universal TLS/SSL tunneling service.
- Stunnel can be used to provide secure encrypted connections for clients or servers that do not speak TLS or SSL natively. It runs on a variety of operating systems, including most Unix-like operating systems and Windows.
- Stunnel relies on a separate library such as OpenSSL or SSLeay to implement the underlying TLS or SSL protocol.
- Stunnel uses Public-key cryptography with X.509 digital certificates to secure the SSL connection. Clients can optionally be authenticated via a certificate too
- For more references, please checl www.stunnel.org
The LFC example
The VOMS server example
!!MySQL DB Replication
The DB replication process is one of the ways to get scability and higher availability. This process will consist in keeping DB copies of a main host (MASTER) in a secondary host (SLAVE). By architecture, the slave host will only have read access to the Database entries.
In this document we use 'stunnel' for security reasons but we are still searching information on mysql ssl connections which we can apply in the future.
As a case example, I'm following the method used to replicate a voms server DB.
!!!GENERIC SETUP info
MASTER hostname: server1.lip.pt MASTER username: root MASTER mysql superuser: root MASTER mysql replication user: db_rep SLAVE hostname: server2.lip.pt SLAVE username: root SLAVE mysql superuser: root DB to replicate: DB_*
!!!MASTER setup
__Step 1__: Configure stunnel to get secure connection between MASTER and SLAVE. Create /etc/stunnel.conf with the following contents:
# Authentication stuff verify = 2 CApath = /etc/grid-security/certificates cert = /etc/grid-security/hostcert.pem key = /etc/grid-security/hostkey.pem # Auth fails in chroot because CApath is not accessible #chroot = /var/run/stunnel #debug = 7 output = /var/log/stunnel.log pid = /var/run/stunnel/master.pid setuid = nobody setgid = nobody # Service-level configuration \[mysql\] accept = 3307 connect = 127.0.0.1:3306
__Step 2__: Start the service 'stunnel' and add it to rc.local
mkdir /var/run/stunnel chown nobody:nobody /var/run/stunnel stunnel /etc/stunnel.conf echo 'stunnel /etc/stunnel.conf' >> /etc/rc.local
__Step 3__: Setup the firewall to allow connections from SLAVE (ex.:server2.lip.pt) on port 3307 TCP
# MySQL replication -A INPUT -s 200.143.244.108 -m state --state NEW -m tcp -p tcp --dport 3307 -j ACCEPT
__Step 4__: Edit /etc/my.cnf
Setup de ID of the Master (usually 1), setup the binary log for the DB to be replicated.
[mysqld] server-id=1 # create replication log log-bin = /path_to_log_file/log_file.index # but only for this database binlog-do-db=DB_2 binlog-do-db=DB_1
OBS. Please ensure that the path declared under log-bin has the mysql:mysql ownerships, and that the binary log exists.
__Step 5__: Restart MySQL
/etc/init.d/mysqld restart
__Step 6__: Add a specific Replication User
GRANT REPLICATION SLAVE ON *.* TO 'db_rep'@'127.0.0.1' IDENTIFIED BY '16_char_password';
__Step 7__: Make a backup of the Databases
mysqldump -u root -p --default-character-set=latin1 --master-data=2 --databases DB_1 DB_2 > dump.sql
The option "--master-data=2" writes a comment on dump.sql that show the log file and the ID to be used on the Slave Setup. This option also lock all the tables while they are beeing copied, avoiding problems.
!!!SLAVE Setup
OBS.It must be set the same timezone on master and slave. This prevent problems with the script that runs to verify consistency between servers.
__Step 1__: Setup stunnel. Create /etc/stunnel.conf with the following contents:
# Authentication stuff verify = 2 CApath = /etc/grid-security/certificates cert = /etc/grid-security/hostcert.pem key = /etc/grid-security/hostkey.pem # Auth fails in chroot because CApath is not accessible #chroot = /var/run/stunnel #debug = 7 output = /var/log/stunnel.log pid = /var/run/stunnel/master.pid setuid = nobody setgid = nobody # Use it for client mode client = yes # Service-level configuration [mysql] accept = 127.0.0.1:3307 connect = server1.lip.pt:3307
__Step 2__: Start stunnel and add to rc.local
mkdir /var/run/stunnel chown nobody:nobody /var/run/stunnel stunnel /etc/stunnel.conf [root@voms root]# echo 'stunnel /etc/stunnel.conf' >> /etc/rc.local
__Step 3__: Edit /etc/my.cnf
Setup the ID of SLAVE (ID 2).
\[mysqld\] server-id=2 replicate-do-db=DB_1 replicate-do-db=DB_2
__Step 4__: Restart MySQL and insert dump.sql created on Master
/etc/init.d/mysql restart mysql -u root -p < dump.sql
__Step 5__: Start slave
login in mysql of slave
mysql -u root -p
Run the 2 queries above changing the values xxxxx and yyyyy by the values on top of dump.sql file.
CHANGE MASTER TO MASTER_HOST='127.0.0.1', MASTER_PORT=3307, MASTER_USER='db_rep', MASTER_PASSWORD='16_char_password', MASTER_LOG_FILE='xxxxx', MASTER_LOG_POS=yyyyy; SLAVE START;
__Step 6__: Check if everything is OK:
On mysql prompt run "show slave status\G", check the replication status.
Slave_IO_State: Waiting for master to send event
If status are diferent check if you can access to port 3307 of MASTER running
telnet server1.lip.pt 3307
If you can't reach it, you should also verify what is wrong on master, for example the firewall. Also you can see mysql errors on /var/lib/mysql
!!!TROUBLESHOOTING ===
* /var/log/stunnel.log * SHOW MASTER STATUS; * SHOW PROCESSLIST; * SHOW SLAVE STATUS;
http://dev.mysql.com/doc/refman/5.1/en/replication-options-slave.html#option_mysqld_read-only http://www.howtoforge.com/how-to-set-up-mysql-database-replication-with-ssl-encryption-on-centos-5.4