Saturday, June 20, 2009

High Availability Cluster with DRBD & Heartbeat by Babar Zahoor Video Link

http://blip.tv/file/2264185/

Wednesday, June 17, 2009

High Availability Cluster with DRBD & Heartbeat

Created By Babar Zahoor (RHCE 5.0)



High Availability (HA) with DRBD & Heartbeat.


#### This how-to accompanies my video on High Availability with DRBD and Heartbeat ####


OS: CentOS 5.3 on both machines.

We will set up a transparent Squid proxy on a High Availability cluster.

The packages are available in the CentOS extras repository.


Our Scenario


We have two servers


baber 192.168.1.50 Primary server

farrukh 192.168.1.60 Secondary server



Basic Setup Configuration.

Set up IP-to-name resolution. We don't have DNS, so we need this step: add both hosts to /etc/hosts.


[root@baber ~]# vim /etc/hosts

192.168.1.50 baber
192.168.1.60 farrukh

save & exit


[root@baber ~]# ping baber
PING baber (192.168.1.50) 56(84) bytes of data.
64 bytes from baber (192.168.1.50): icmp_seq=1 ttl=64 time=4.15 ms
64 bytes from baber (192.168.1.50): icmp_seq=2 ttl=64 time=0.126 ms
64 bytes from baber (192.168.1.50): icmp_seq=3 ttl=64 time=1.88 ms

[1]+ Stopped ping baber
[root@baber ~]# ping farrukh
PING farrukh (192.168.1.60) 56(84) bytes of data.
64 bytes from farrukh (192.168.1.60): icmp_seq=1 ttl=64 time=1.32 ms
64 bytes from farrukh (192.168.1.60): icmp_seq=2 ttl=64 time=0.523 ms
64 bytes from farrukh (192.168.1.60): icmp_seq=3 ttl=64 time=1.79 ms

[2]+ Stopped ping farrukh
[root@baber ~]#





[root@baber ~]# scp /etc/hosts 192.168.1.60:/etc/hosts
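
Before moving on, it can help to confirm that both names really are in the hosts file on each node. The snippet below is only a sketch of such a check; the function name check_hosts is made up for this example and is not part of any package.

```shell
# Succeeds only if every given name appears as a host name or alias
# in the hosts file (comment lines are ignored).
check_hosts() {
    file=$1; shift
    for name in "$@"; do
        awk -v n="$name" '
            /^[ \t]*#/ { next }
            { for (i = 2; i <= NF; i++) if ($i == n) found = 1 }
            END { exit !found }
        ' "$file" || { echo "missing: $name" >&2; return 1; }
    done
}

# Usage on either node:
#   check_hosts /etc/hosts baber farrukh && echo "hosts file OK"
```

Running it on both servers after the scp should show whether the copy arrived intact.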

Stop unwanted services on both servers.



On baber:


[root@baber ~]# /etc/init.d/sendmail stop


[root@baber ~]# chkconfig --level 235 sendmail off


[root@baber ~]# iptables -F

[root@baber ~]# service iptables save



On farrukh:

[root@farrukh ~]# /etc/init.d/sendmail stop


[root@farrukh ~]# chkconfig --level 235 sendmail off


[root@farrukh ~]# iptables -F

[root@farrukh ~]# service iptables save



[root@baber ~]# rpm -qa | grep ntp
ntp-4.2.2p1-9.el5.centos.1

[root@baber ~]#

Next, open the NTP configuration file /etc/ntp.conf on baber, which will act as the time server for the cluster.


# Permit time synchronization with our time source, but do not
# permit the source to query or modify the service on this system.
restrict default kod nomodify notrap nopeer noquery

# Permit all access over the loopback interface. This could
# be tightened as well, but to do so would effect some of
# the administrative functions.
restrict 127.0.0.1

# Hosts on local network are less restricted.
#restrict 192.168.1.0 mask 255.255.255.0 nomodify notrap

# Use public servers from the pool.ntp.org project.
# Please consider joining the pool (http://www.pool.ntp.org/join.html).

### Edited By Babar Zahoor Jun 16 2009 ###
#server 0.centos.pool.ntp.org
#server 1.centos.pool.ntp.org
#server 2.centos.pool.ntp.org

#broadcast 192.168.1.255 key 42 # broadcast server
#broadcastclient # broadcast client
#broadcast 224.0.1.1 key 42 # multicast server
#multicastclient 224.0.1.1 # multicast client
#manycastserver 239.255.254.254 # manycast server
#manycastclient 239.255.254.254 key 42 # manycast client

# Undisciplined Local Clock. This is a fake driver intended for backup
# and when no outside source of synchronized time is available.


########## On the server use this line; on clients comment it out and use "server <serverIP>" instead ##########

server 127.127.1.0 # local clock



#fudge 127.127.1.0 stratum 10



# Drift file. Put this in a directory which the daemon can write to.
# No symbolic links allowed, either, since the daemon updates the file
# by creating a temporary in the same directory and then rename()'ing
# it to the file.
# driftfile /var/lib/ntp/drift

# Key file containing the keys and key identifiers used when operating
# with symmetric key cryptography.
# Specify the key identifiers which are trusted.
# trustedkey 4 8 42

# Specify the key identifier to use with the ntpdc utility.
# requestkey 8

# Specify the key identifier to use with the ntpq utility.
#controlkey 8
keys /etc/ntp/keys

save & exit


[root@baber ~]#
[root@baber ~]# /etc/init.d/ntpd start
[root@baber ~]# chkconfig --level 235 ntpd on



[root@farrukh ~]# vim /etc/ntp.conf
# Permit time synchronization with our time source, but do not
# permit the source to query or modify the service on this system.
restrict default kod nomodify notrap nopeer noquery

# Permit all access over the loopback interface. This could
# be tightened as well, but to do so would effect some of
# the administrative functions.
#restrict 127.0.0.1
#restrict -6 ::1

# Hosts on local network are less restricted.
#restrict 192.168.1.0 mask 255.255.255.0 nomodify notrap

# Use public servers from the pool.ntp.org project.
# Please consider joining the pool (http://www.pool.ntp.org/join.html).




server 192.168.1.50 ### add this line on second server ###





#server 0.centos.pool.ntp.org
#server 1.centos.pool.ntp.org
#server 2.centos.pool.ntp.org

#broadcast 192.168.1.255 key 42 # broadcast server
#broadcastclient # broadcast client
#broadcast 224.0.1.1 key 42 # multicast server
#multicastclient 224.0.1.1 # multicast client
#manycastserver 239.255.254.254 # manycast server
#manycastclient 239.255.254.254 key 42 # manycast client

# Undisciplined Local Clock. This is a fake driver intended for backup
# and when no outside source of synchronized time is available.




#server 127.127.1.0 # local clock (commented out on the second server)
#fudge 127.127.1.0 stratum 10





# Drift file. Put this in a directory which the daemon can write to.
# No symbolic links allowed, either, since the daemon updates the file
# by creating a temporary in the same directory and then rename()'ing
# it to the file.
driftfile /var/lib/ntp/drift

# Key file containing the keys and key identifiers used when operating
# with symmetric key cryptography.
keys /etc/ntp/keys

# Specify the key identifiers which are trusted.
#trustedkey 4 8 42

# Specify the key identifier to use with the ntpdc utility.
#requestkey 8

# Specify the key identifier to use with the ntpq utility.
#controlkey 8


save & exit



[root@farrukh ~]# /etc/init.d/ntpd start
[root@farrukh ~]# chkconfig --level 235 ntpd on



[root@farrukh ~]# ntpdate -u 192.168.1.50



[root@farrukh ~]# watch ntpq -p -n


[root@baber ~]# watch ntpq -p -n
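
Instead of watching the ntpq output by eye, the check can be scripted. This is a rough sketch that relies on ntpq marking the currently selected time source with a leading "*" in the first column; ntp_synced is a hypothetical helper name.

```shell
# Reads "ntpq -p -n" output on stdin and succeeds if the given peer
# is the selected time source (tally character "*").
ntp_synced() {
    awk -v peer="*$1" '$1 == peer { ok = 1 } END { exit !ok }'
}

# Usage on farrukh (it can take a few minutes after ntpd starts):
#   ntpq -p -n | ntp_synced 192.168.1.50 && echo "synced to baber"
```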




PARTITION SETUP On Both Servers.


Partition both servers identically with fdisk.


Both servers have a second disk, /dev/sdb (about 4 GB, as the fdisk output below shows).

Partition Setup for Cluster Servers

We need to create one partition of type Linux LVM (8e) on it.



[root@baber ~]# fdisk -l



[root@baber ~]# fdisk /dev/sdb

Command (m for help): m
Command action
a toggle a bootable flag
b edit bsd disklabel
c toggle the dos compatibility flag
d delete a partition
l list known partition types
m print this menu
n add a new partition
o create a new empty DOS partition table
p print the partition table
q quit without saving changes
s create a new empty Sun disklabel
t change a partition's system id
u change display/entry units
v verify the partition table
w write table to disk and exit
x extra functionality (experts only)

Command (m for help): p

Disk /dev/sdb: 4294 MB, 4294967296 bytes
255 heads, 63 sectors/track, 522 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes

Device Boot Start End Blocks Id System
/dev/sdb1 1 522 4192933+ 8e Linux LVM

Command (m for help): d
Selected partition 1

Command (m for help): n
Command action
e extended
p primary partition (1-4)
p
Partition number (1-4): 1
First cylinder (1-522, default 1):
Using default value 1
Last cylinder or +size or +sizeM or +sizeK (1-522, default 522): +4000M

Command (m for help): p

Disk /dev/sdb: 4294 MB, 4294967296 bytes
255 heads, 63 sectors/track, 522 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes

Device Boot Start End Blocks Id System
/dev/sdb1 1 487 3911796 83 Linux

Command (m for help): t
Selected partition 1
Hex code (type L to list codes): 8e
Changed system type of partition 1 to 8e (Linux LVM)

Command (m for help): p

Disk /dev/sdb: 4294 MB, 4294967296 bytes
255 heads, 63 sectors/track, 522 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes

Device Boot Start End Blocks Id System
/dev/sdb1 1 487 3911796 8e Linux LVM

Command (m for help):

Command (m for help): w


[root@baber ~]# partprobe



Create the physical volume; this is the second step of the LVM setup.


[root@baber ~]# pvcreate /dev/sdb1


Create Volume Group with this command


[root@baber ~]# vgcreate vgdrbd /dev/sdb1


Create the logical volume:


[root@baber ~]# lvcreate -n lvdrbd -L 4000M vgdrbd


Note: Create exactly the same LVM layout on both servers.
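
Because the LVM layout has to match on both nodes, one option is to keep the three commands in a small script and run the same script on each server. The sketch below defaults to a dry run that only prints the commands; the names run, setup_lvm and DRY_RUN are made up for this example, and the size is assumed to fit inside the partition.

```shell
# Print the command when DRY_RUN is 1 (the default), otherwise execute it.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Physical volume, volume group, logical volume, in that order.
setup_lvm() {
    part=$1 vg=$2 lv=$3 size=$4
    run pvcreate "$part" &&
    run vgcreate "$vg" "$part" &&
    run lvcreate -n "$lv" -L "$size" "$vg"
}

# Usage, first as a dry run and then for real on each node:
#   setup_lvm /dev/sdb1 vgdrbd lvdrbd 4000M
#   DRY_RUN=0 setup_lvm /dev/sdb1 vgdrbd lvdrbd 4000M
```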




Add these three values to /etc/sysctl.conf:


[root@baber ~]# vi /etc/sysctl.conf

net.ipv4.conf.eth0.arp_ignore = 1

net.ipv4.conf.all.arp_announce = 2

net.ipv4.conf.eth0.arp_announce = 2



save & quit




[root@baber ~]# sysctl -p
net.ipv4.ip_forward = 0
net.ipv4.conf.default.rp_filter = 1
net.ipv4.conf.eth0.arp_ignore = 1
net.ipv4.conf.all.arp_announce = 2
net.ipv4.conf.eth0.arp_announce = 2
net.ipv4.conf.default.accept_source_route = 0
kernel.sysrq = 0
kernel.core_uses_pid = 1
net.ipv4.tcp_syncookies = 1
kernel.msgmnb = 65536
kernel.msgmax = 65536
kernel.shmmax = 4294967295
kernel.shmall = 268435456
[root@baber ~]#
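
If you prefer not to edit the file by hand, the same settings can be added from a script. The function below is a sketch only (ensure_sysctl is a hypothetical name): it replaces an existing "key = value" line or appends one, so running it twice does not create duplicates.

```shell
# Set key = value in a sysctl-style file, replacing any existing
# uncommented entry for that key and appending it if it is absent.
ensure_sysctl() {
    file=$1 key=$2 value=$3
    tmp=$(mktemp) || return 1
    awk -v k="$key" -v v="$value" '
        BEGIN { FS = "=" }
        {
            name = $1
            gsub(/[ \t]/, "", name)
            if (name == k) { if (!done) print k " = " v; done = 1; next }
            print
        }
        END { if (!done) print k " = " v }
    ' "$file" > "$tmp" && cat "$tmp" > "$file"
    rm -f "$tmp"
}

# Usage:
#   ensure_sysctl /etc/sysctl.conf net.ipv4.conf.eth0.arp_ignore 1
#   ensure_sysctl /etc/sysctl.conf net.ipv4.conf.all.arp_announce 2
#   ensure_sysctl /etc/sysctl.conf net.ipv4.conf.eth0.arp_announce 2
#   sysctl -p
```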



DRBD Setup
==========


Install the drbd82 and kmod-drbd82 RPMs using yum.



[root@baber ~]# yum install -y drbd82 kmod-drbd82


Open /etc/drbd.conf:



[root@baber ~]# vim /etc/drbd.conf

global {
usage-count yes;
}



common {
syncer { rate 10M; }
}


resource r0 {
protocol C;
handlers {
pri-on-incon-degr "echo o > /proc/sysrq-trigger ; halt -f";
pri-lost-after-sb "echo o > /proc/sysrq-trigger ; halt -f";
local-io-error "echo o > /proc/sysrq-trigger ; halt -f";
outdate-peer "/usr/lib/heartbeat/drbd-peer-outdater -t 5";
}

startup {
}

disk {
on-io-error detach;
}

net {
after-sb-0pri disconnect;
after-sb-1pri disconnect;
after-sb-2pri disconnect;
rr-conflict disconnect;
}

syncer {
rate 10M;
al-extents 257;
}

on baber {
device /dev/drbd0;
disk /dev/vgdrbd/lvdrbd;
address 192.168.1.50:7788;
meta-disk internal;
}

on farrukh {
device /dev/drbd0;
disk /dev/vgdrbd/lvdrbd;
address 192.168.1.60:7788;
meta-disk internal;
}

}


save & exit

[root@baber ~]#
[root@baber ~]# scp /etc/drbd.conf farrukh:/etc/drbd.conf
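
DRBD matches each "on <name>" section against the output of uname -n, so a section name that is not the real host name is an easy mistake to make. The following sketch lists the section names and checks that the local host is among them; drbd_hosts and drbd_host_listed are hypothetical helper names.

```shell
# Print the host names used in "on <name> {" sections of a drbd.conf.
drbd_hosts() {
    awk '$1 == "on" && $3 == "{" { print $2 }' "$1"
}

# Succeed if the given host (default: this machine) has a section.
drbd_host_listed() {
    me=${2:-$(uname -n)}
    drbd_hosts "$1" | grep -Fxq -- "$me"
}

# Usage on each node:
#   drbd_host_listed /etc/drbd.conf || echo "no section for $(uname -n)"
```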


The DRBD kernel module must be loaded on both servers before DRBD can run.


Load the DRBD module on both nodes, and add it to /etc/rc.local so it is loaded at boot:



[root@baber ~]# modprobe drbd


[root@baber ~]# echo "modprobe drbd" >> /etc/rc.local

[root@farrukh ~]# modprobe drbd


[root@farrukh ~]# echo "modprobe drbd" >> /etc/rc.local



##### Run these on both servers ######

[root@baber ~]# drbdadm create-md r0

[root@farrukh ~]# drbdadm create-md r0


[root@baber ~]# drbdadm attach r0

[root@farrukh ~]# drbdadm attach r0


[root@baber ~]# drbdadm syncer r0

[root@farrukh ~]# drbdadm syncer r0

[root@baber ~]# drbdadm connect r0

[root@farrukh ~]# drbdadm connect r0



On Primary Node only


[root@baber ~]# drbdadm -- --overwrite-data-of-peer primary r0


On both nodes:


[root@baber ~]# drbdadm up all

[root@farrukh ~]# drbdadm up all

On the primary node only:



[root@baber ~]# drbdadm -- primary all #### on node one only ####


Watch the synchronization progress:

[root@baber ~]# watch cat /proc/drbd
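
The initial sync has to finish before the device is fully redundant. As a sketch, a script can wait for it by looking for cs:Connected and ds:UpToDate/UpToDate on the same line of /proc/drbd; this assumes the DRBD 8.x status line format, and drbd_in_sync is a made-up name.

```shell
# Succeed if the status file (default /proc/drbd) shows a resource that
# is connected and up to date on both sides.
drbd_in_sync() {
    grep -Eq 'cs:Connected .*ds:UpToDate/UpToDate' "${1:-/proc/drbd}"
}

# Usage: block until the sync is complete.
#   until drbd_in_sync; do sleep 5; done; echo "DRBD is in sync"
```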





Only on baber (primary node):


[root@baber ~]# mkfs.ext3 /dev/drbd0


[root@baber ~]# mkdir /data/

[root@baber ~]# mount /dev/drbd0 /data/

[root@baber ~]#
[root@baber ~]# df -hk
Filesystem 1K-blocks Used Available Use% Mounted on
/dev/mapper/VolGroup00-LogVol00
5967432 2625468 3033948 47% /
/dev/sda1 101086 12074 83793 13% /boot
tmpfs 257720 0 257720 0% /dev/shm
/dev/drbd0 4031516 107600 3719128 3% /data
[root@baber ~]#


On farrukh (secondary node):

[root@farrukh ~]# mkdir /data



Heartbeat Setup:
================

Install the heartbeat packages using yum (on both servers).

Note: This needs an internet connection, or a local yum repository that includes extras.


[root@baber ~]# yum install -y heartbeat heartbeat-pils heartbeat-stonith heartbeat-devel


[root@baber ~]# vim /etc/ha.d/ha.cf ## Create this file and copy this text ##

logfacility local0
keepalive 2
#deadtime 30 # USE THIS!!!
deadtime 10
# heartbeat link: broadcast on eth0 (a dedicated link such as eth1 is the better option)
bcast eth0
#serial /dev/ttyS0
baud 19200
# auto_failback on: resources move back to the preferred node (baber) when it returns
auto_failback on
node baber
node farrukh


save & quit.



On server baber:

[root@baber ~]# vi /etc/ha.d/haresources

baber IPaddr::192.168.1.190/24/eth0 drbddisk::r0 Filesystem::/dev/drbd0::/data::ext3 squid

On server farrukh:

The haresources file must be identical on both nodes. The first field names the preferred node (baber), not the local host, so copy the file across instead of editing it:

[root@baber ~]# scp /etc/ha.d/haresources farrukh:/etc/ha.d/haresources

On both servers:

[root@baber ~]# vi /etc/ha.d/authkeys

# Use a long, hard-to-guess string as the password instead of "redhat"
auth 3
3 md5 redhat

On both nodes the file must be readable by root only:


[root@baber ~]# chmod 600 /etc/ha.d/authkeys


[root@baber ~]# scp /etc/ha.d/authkeys farrukh:/etc/ha.d/authkeys
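
For a stronger shared secret than a dictionary word, the key can be generated from /dev/urandom. This is a sketch under a few assumptions: make_authkeys is a hypothetical helper, and it uses key index 1 with sha1 rather than the md5 entry shown above (Heartbeat accepts crc, md5 and sha1 in authkeys).

```shell
# Write an authkeys file with a random 64-hex-character sha1 key,
# readable and writable by the owner only.
make_authkeys() {
    file=$1
    key=$(head -c 32 /dev/urandom | od -An -tx1 | tr -d ' \n')
    ( umask 077 && printf 'auth 1\n1 sha1 %s\n' "$key" > "$file" ) || return 1
    chmod 600 "$file"
}

# Usage on baber, then copy the same file to farrukh:
#   make_authkeys /etc/ha.d/authkeys
#   scp -p /etc/ha.d/authkeys farrukh:/etc/ha.d/authkeys
```

Both nodes must end up with the same file, so generate it once and copy it rather than running the function on each server.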



[root@baber ~]# chkconfig --level 235 heartbeat on

[root@farrukh ~]# chkconfig --level 235 heartbeat on




Note: If you have a problem mounting /dev/drbd0 on /data, run these commands to check the status; if the drbddisk resource is stopped, start it.


[root@baber ~]# /etc/ha.d/resource.d/drbddisk r0 status
[root@baber ~]# /etc/ha.d/resource.d/drbddisk r0 start
[root@baber ~]# /etc/ha.d/resource.d/drbddisk r0 restart


[root@baber data]# service drbd status
drbd driver loaded OK; device status:
version: 8.0.13 (api:86/proto:86)
GIT-hash: ee3ad77563d2e87171a3da17cc002ddfd1677dbe build by buildsvn@c5-i386-build, 2008-10-02 13:31:44
m:res cs st ds p mounted fstype
0:r0 Connected Primary/Secondary UpToDate/UpToDate C /data ext3



We can see that the servers are in the Primary/Secondary state and working well, with the /data directory mounted.




To force a takeover, run hb_takeover on the node that should take over the resources. For example, to move the services from baber to farrukh:


[root@farrukh ~]# /usr/lib/heartbeat/hb_takeover


Transparent Squid Configuration on both servers.


[root@baber ~]# vim /etc/sysctl.conf

# Controls IP packet forwarding
# (if it is 0, change it to 1 to enable packet forwarding)

net.ipv4.ip_forward = 1



Save it, then copy it to farrukh and apply it on both servers:

[root@baber ~]# scp /etc/sysctl.conf farrukh:/etc/sysctl.conf


[root@baber ~]# sysctl -p


[root@farrukh ~]# sysctl -p


Install Squid (on farrukh as well):

[root@baber ~]# yum install -y squid


[root@baber ~]# vim /etc/squid/squid.conf

Search for these options using / and edit them as required:


http_port 3128 transparent

acl our_networks src 192.168.1.0/24 192.168.2.0/24

http_access allow our_networks


# the cache directory must be under /data, here /data/squid
cache_dir ufs /data/squid 1000 32 256

visible_hostname squid.ha-cluster.com

save & exit



[root@baber ~]# cd /data

[root@baber data]# mkdir squid

[root@baber data]# chown squid:squid squid

Note: This is required only on the primary server, i.e. baber, because /data is mounted only there.

[root@baber data]# scp /etc/squid/squid.conf farrukh:/etc/squid/squid.conf
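
A quick sanity check of the two settings that matter most for this cluster (the transparent port and a cache directory on the replicated /data volume) might look like the sketch below. squid_conf_ok is a hypothetical name, and the check only looks at uncommented http_port and cache_dir lines.

```shell
# Succeed if the config has an http_port line with the "transparent"
# option and a cache_dir located under the given directory.
squid_conf_ok() {
    conf=$1 dir=$2
    awk -v dir="$dir" '
        /^[ \t]*#/ { next }
        $1 == "http_port" { for (i = 3; i <= NF; i++) if ($i == "transparent") port = 1 }
        $1 == "cache_dir" && index($3, dir "/") == 1 { cache = 1 }
        END { exit !(port && cache) }
    ' "$conf"
}

# Usage on both servers:
#   squid_conf_ok /etc/squid/squid.conf /data && echo "squid.conf looks right"
```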

[root@baber ~]# iptables -F

[root@baber ~]# iptables -t nat -A PREROUTING -p tcp -i eth0 --dport 80 -j REDIRECT --to-port 3128

[root@baber ~]# iptables -t nat -A POSTROUTING -o eth0 -j MASQUERADE

[root@baber ~]# service iptables save

[root@farrukh ~]# iptables -F

[root@farrukh ~]# iptables -t nat -A PREROUTING -p tcp -i eth0 --dport 80 -j REDIRECT --to-port 3128

[root@farrukh ~]# iptables -t nat -A POSTROUTING -o eth0 -j MASQUERADE

[root@farrukh ~]# service iptables save
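
To confirm that the redirect rule is in place on a node, the saved rule set can be inspected. This sketch assumes that iptables-save prints the rule as a line starting with "-A PREROUTING" and ending with the target port; has_redirect is a made-up helper name.

```shell
# Reads iptables-save output on stdin and succeeds if a PREROUTING rule
# redirects destination port 80 to the given local port.
has_redirect() {
    awk -v p="$1" '
        /^-A PREROUTING/ && / --dport 80( |$)/ && /-j REDIRECT/ && $NF == p { ok = 1 }
        END { exit !ok }
    '
}

# Usage on both servers:
#   iptables-save -t nat | has_redirect 3128 && echo "redirect rule present"
```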



On both servers:

[root@baber ~]# /etc/init.d/heartbeat start

[root@baber ~]# ifconfig

[root@baber ~]# tail -f /var/log/squid/access.log

[root@farrukh ~]# /etc/init.d/heartbeat start

[root@farrukh ~]# ifconfig
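
The point of running ifconfig here is to see which node currently holds the service IP. That check can be scripted roughly as follows; has_vip is a hypothetical helper that accepts either ifconfig output ("inet addr:...") or ip addr output ("inet .../24") on stdin.

```shell
# Succeed if the given IPv4 address appears as an "inet" address
# in the interface listing read from stdin.
has_vip() {
    awk -v vip="$1" '
        {
            for (i = 2; i <= NF; i++) {
                addr = $i
                sub(/^addr:/, "", addr)
                sub(/\/.*$/, "", addr)
                if ($(i - 1) == "inet" && addr == vip) found = 1
            }
        }
        END { exit !found }
    '
}

# Usage:
#   ifconfig | has_vip 192.168.1.190 && echo "this node holds the service IP"
```

Only the active node should report the address; after a failover the result should flip between the two servers.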


Note: Clients must use the VIP/service IP defined in Heartbeat, i.e. 192.168.1.190, as their default gateway in order to access the internet transparently.


ALHAMDULILLAH We have Done it.............