Sunday, April 22, 2012

Designing a Backup Server for Web Farms

Hello All,
In this last part of the series, I am going to explain how to configure the Backup server in our scenario. After designing the firewall, gateway, log server, and load balancer, it is the right time to design a backup server for disaster recovery. As a reminder of our network diagram, please look at the following picture:


In order to set up, configure, and automate the system backup, I took the following steps:
mkdir /backup     ---> This is the location of the backups on the Backup server.
To automate the backup process, I use an RSA public/private key pair so that the web servers can log in to the Backup server without a password. So, on both web servers, I ran this command:
ssh-keygen -t rsa
Then, I copied the public key from web servers 1 and 2 to the Backup server with these commands:
From web server 1: scp /root/.ssh/id_rsa.pub 192.168.56.104:/root/.ssh/authorized_keys2
From web server 2: scp /root/.ssh/id_rsa.pub 192.168.56.104:/root/.ssh/ 

For the second web server, after copying id_rsa.pub to the Backup server, I appended the content of the public key to authorized_keys2 on the Backup server.
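The append step for the second key could look roughly like the sketch below. The helper name append_key, the scratch directory, and the placeholder key are assumptions made for illustration; only the file names id_rsa.pub and authorized_keys2 come from the steps above. The idea is to add the key without overwriting what is already in the file, and to skip it if it is already there.

```shell
#!/bin/sh
# Sketch: append a public key to an authorized-keys file without overwriting
# or duplicating existing entries. append_key is a hypothetical helper name.
append_key() {
    pubkey_file=$1
    auth_file=$2
    key=$(cat "$pubkey_file")
    mkdir -p "$(dirname "$auth_file")"
    touch "$auth_file"
    # -x whole line, -F fixed string, -q quiet: is this exact key already listed?
    if ! grep -qxF -- "$key" "$auth_file"; then
        printf '%s\n' "$key" >> "$auth_file"
    fi
    # With StrictModes (the default), sshd rejects key files that other users can write to.
    chmod 600 "$auth_file"
}

# Demo in a scratch directory with a made-up key, so nothing under /root changes.
demo_dir=$(mktemp -d)
printf '%s\n' 'ssh-rsa AAAA-placeholder-key root@webserver2' > "$demo_dir/id_rsa.pub"
append_key "$demo_dir/id_rsa.pub" "$demo_dir/.ssh/authorized_keys2"
append_key "$demo_dir/id_rsa.pub" "$demo_dir/.ssh/authorized_keys2"   # second call changes nothing
key_count=$(grep -c 'root@webserver2' "$demo_dir/.ssh/authorized_keys2")
echo "keys stored: $key_count"
```

On the Backup server itself, the equivalent call would be append_key /root/.ssh/id_rsa.pub /root/.ssh/authorized_keys2, run after the scp from web server 2 has finished.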

Now, I can run the rsync command without being prompted for a password. To automate the process, I used cron jobs. I added two lines to the crontab of each web server, and here is the output of each crontab:

For web server 1: 
crontab -l
0 1 */2 * * /usr/bin/rsync -avz /var/www 192.168.56.104:/backup/
0 1 */2 * * /usr/bin/rsync -avz /etc/httpd 192.168.56.104:/backup/

For web server 2:
crontab -l
0 23 */2 * * /usr/bin/rsync -avz /var/www 192.168.56.104:/backup/webserver2/
0 23 */2 * * /usr/bin/rsync -avz /etc/httpd 192.168.56.104:/backup/webserver2/
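As a variation on the two crontab lines per server, both directories could be handled by one small wrapper script, so that a single cron entry covers them. This is only a sketch under assumptions: the function names run and backup_dirs and the DRY_RUN switch are invented for the example, while the rsync flags and the destination are taken from the crontab of web server 2 above.

```shell
#!/bin/sh
# Hypothetical wrapper script: one cron entry backs up several directories.
DEST=192.168.56.104:/backup/webserver2/

# Print the command when DRY_RUN=1, otherwise execute it.
run() {
    if [ "${DRY_RUN:-0}" = 1 ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

backup_dirs() {
    status=0
    for dir in "$@"; do
        # Same flags as the crontab entries: archive mode, verbose, compressed.
        run /usr/bin/rsync -avz "$dir" "$DEST" || status=1
    done
    return "$status"
}

# Demo: dry run only, so no connection to the Backup server is attempted.
planned=$(DRY_RUN=1; backup_dirs /var/www /etc/httpd)
printf '%s\n' "$planned"
```

To run it for real, the demo lines at the bottom would be replaced with a plain backup_dirs /var/www /etc/httpd, and the script saved under a path of your choosing (for example /root/backup.sh, a made-up name) that a single crontab line then points to.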

The backups run on every odd-numbered day of the month (roughly every other day): web server 1 at 1:00 AM and web server 2 at 11:00 PM. The first run copies everything under the two directories (“/var/www” and “/etc/httpd”) with rsync, and later runs transfer only the files that have changed. I also installed mail to receive the reports from the cron jobs: yum install mailx
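To see exactly which days the */2 in the day-of-month field selects, the small sketch below lists them. The function name matched_days is made up for the example; it simply steps through the range 1 to the last day of the month in steps of 2, which is how cron expands */2 in that field.

```shell
#!/bin/sh
# List the days of the month matched by a day-of-month field of */2.
# cron expands */2 as "first to last, step 2", and the range starts at day 1.
matched_days() {
    days_in_month=$1
    day=1
    out=""
    while [ "$day" -le "$days_in_month" ]; do
        out="$out $day"
        day=$((day + 2))
    done
    echo "${out# }"
}

days_31=$(matched_days 31)
days_30=$(matched_days 30)
echo "31-day month: $days_31"
echo "30-day month: $days_30"
```

One consequence is that after a 31-day month the job runs on the 31st and again on the 1st, so the interval is not always two days.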

The following shows the content of the first email that I received from the cron jobs on web server 1:
[root@f13-ws1 ~]# mail
Heirloom Mail version 12.5.  Type ? for help.
"/var/spool/mail/root": 2 messages
>   1 Cron Daemon          39/1336  "Cron <root@f13-ws1> /usr/bin/rsync -av"
    2 Cron Daemon           281/7066  "Cron <root@f13-ws1> /usr/bin/rsync -av"
& 1
Message  1:
From root@localhost.localdomain
Return-Path: <root@localhost.localdomain>
From: root@localhost.localdomain (Cron Daemon)
To: root@localhost.localdomain
Subject: Cron <root@f13-ws1> /usr/bin/rsync -avz /etc/httpd 192.168.56.104:/backup/webserver2/
Content-Type: text/plain; charset=UTF-8
Auto-Submitted: auto-generated
X-Cron-Env: <SHELL=/bin/sh>
X-Cron-Env: <HOME=/root>
X-Cron-Env: <PATH=/usr/bin:/bin>
X-Cron-Env: <LOGNAME=root>
X-Cron-Env: <USER=root>
Status: RO

sending incremental file list
httpd/
httpd/logs -> ../../var/log/httpd
httpd/modules -> ../../usr/lib64/httpd/modules
httpd/run -> ../../var/run/httpd
httpd/conf.d/
httpd/conf.d/README
httpd/conf.d/mod_dnssd.conf
httpd/conf.d/proxy_ajp.conf
httpd/conf.d/welcome.conf
httpd/conf/
httpd/conf/httpd.conf
httpd/conf/magic

sent 17796 bytes  received 147 bytes  11962.00 bytes/sec
total size is 48236  speedup is 2.69

& q
Held 2 messages in /var/spool/mail/root
[root@f13-ws1 ~]# 

The following shows the content of the second email that I received from the cron jobs on web server 1.
I skipped some lines:

[root@f13-ws1 ~]# mail
Heirloom Mail version 12.5.  Type ? for help.
"/var/spool/mail/root": 2 messages
>   1 Cron Daemon          39/1336  "Cron <root@f13-ws1> /usr/bin/rsync -av"
    2 Cron Daemon           281/7066  "Cron <root@f13-ws1> /usr/bin/rsync -av"
& 2
Message  2:
From root@localhost.localdomain
Return-Path: <root@localhost.localdomain>
From: root@localhost.localdomain (Cron Daemon)
To: root@localhost.localdomain
Subject: Cron <root@f13-ws1> /usr/bin/rsync -avz /var/www 192.168.56.104:/backup/webserver2/
Content-Type: text/plain; charset=UTF-8
Auto-Submitted: auto-generated
X-Cron-Env: <SHELL=/bin/sh>
X-Cron-Env: <HOME=/root>
X-Cron-Env: <PATH=/usr/bin:/bin>
X-Cron-Env: <LOGNAME=root>
X-Cron-Env: <USER=root>
Status: RO

sending incremental file list
created directory /backup/webserver2
www/
www/cgi-bin/
www/error/
www/error/HTTP_BAD_GATEWAY.html.var
www/error/HTTP_BAD_REQUEST.html.var
.
.
.
www/icons/small/unknown.png
www/icons/small/uu.gif
www/icons/small/uu.png

sent 148144 bytes  received 4698 bytes  43669.14 bytes/sec
total size is 291088  speedup is 1.90

& q
Held 2 messages in /var/spool/mail/root
[root@f13-ws1 ~]#

Conclusion
In short, you can build a load-balanced cluster for all web requests using any Linux box such as Fedora. Having a cluster handle incoming requests has many benefits. Requests are simply forwarded to the next computer in round-robin fashion, so you can maximize the performance of your web farm. For example, when I bombarded my web servers with the wget command, everything worked without any problems. Another advantage of this method is that it is easy to administer and, more importantly, easy to configure and set up.

There is only one big problem that I found while experimenting with this scenario. Computers cannot be removed from the cluster in real time, which makes it difficult to minimize downtime during upgrades or hardware failures. Any client that is redirected to a web server that is down for any reason will not receive a response, because the firewall has no ability to figure out which servers are available. The client's only chance is to refresh the page or request it again and hope to be redirected to another, available server rather than the same one, which makes our web farm unreliable.

Finally, having a log server in your web farm is very useful because you can centralize the log files in one location, where they are easy to analyze. It also makes it hard for attackers to access your log files, since they are moved to another machine.
Thanks all,
Khosro