Preface
Cluster: multiple hosts organized and scheduled as a unit to satisfy a specific requirement.
Cluster types:
1) LB: Load Balancing cluster;
2) HA: High Availability cluster;
3) HP: High Performance cluster;
4) DS: Distributed System;
A software implementation of LB clustering: lvs
lvs: Linux Virtual Server (layer-4 switching, layer-4 routing)
According to the scheduling algorithm, it forwards request packets, by their target IP and target PORT, to one of the servers in the backend host cluster;
lvs cluster terminology:
vs: Virtual Server
rs: Real Server
CIP: Client IP
VIP: Director Virtual IP
DIP: Director IP
RIP: Real Server IP
lvs implementation:
ipvsadm: a user-space command-line tool for managing cluster services and the RSs behind them;
ipvs: code running in the kernel, hooked on the netfilter INPUT hook;
(its clustering function depends on the cluster service rules defined with ipvsadm)
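For orientation, a minimal sketch of the ipvsadm grammar used throughout this article (VIP, PORT, RIP and SCHEDULER are placeholders; -g/-i/-m select the dr/tun/nat forwarding types described below):
ipvsadm -A -t VIP:PORT -s SCHEDULER     # -A: add a cluster (virtual) service
ipvsadm -a -t VIP:PORT -r RIP:PORT -g   # -a: add an RS to that service
ipvsadm -D -t VIP:PORT                  # -D: delete a cluster service
ipvsadm -L -n                           # -L -n: list the current rules numerically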
LVS-TYPE:
(1)lvs-nat: MASQUERADE
Multi-target DNAT: implemented by rewriting the request packet's target address and target port to the RIP and PORT of the selected RS.
①RIP and DIP should use private addresses, and each RS's gateway should point to the DIP;
②Both request and response packets are forwarded through the Director; under very high load the Director may become the system bottleneck;
③Port mapping is supported;
④The vs must be Linux; the RSs may run any OS;
⑤The RSs' RIPs and the Director's DIP must be in the same IP network;
(2)lvs-dr: GATEWAY
Forwards by rewriting the request packet's MAC address; the IP header is left untouched (source IP stays CIP, target IP stays VIP);
①Ensure the upstream router always delivers packets whose target IP is the VIP to the Director;
Solutions: static binding;
keeping the RSs from responding to ARP requests for the VIP:
a) arptables;
b) tuning kernel parameters on each RS and configuring the VIP on a particular interface so it is never announced (see the RS configuration sketch in the lvs-dr walkthrough below);
②An RS's RIP may be a private or a public address;
③The RSs and the Director must be on the same physical network;
④Request packets must be scheduled by the Director, but response packets must not pass through the Director;
⑤Most OSs are supported on the RSs; port mapping is not;
(3) lvs-tun: IPIP
The request packet's IP header is not modified; instead a second IP header (source DIP, target RIP) is encapsulated outside the original one;
①RIP, DIP and VIP must all be public addresses;
②The RSs' gateways must not point to the DIP;
③Request packets are scheduled through the Director, but responses are sent directly to the CIP;
④Port mapping is not supported;
⑤The RSs' OS must support IP tunneling;
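No lvs-tun walkthrough appears later in this article, so here is a minimal, untested sketch, assuming Linux RSs with the ipip module available (VIP and RIPs are placeholders; -i marks an RS as reachable by tunneling):
# On the Director:
ipvsadm -A -t VIP:80 -s rr
ipvsadm -a -t VIP:80 -r RIP1 -i
ipvsadm -a -t VIP:80 -r RIP2 -i
# On each RS: bring the VIP up on the tunnel device and accept decapsulated packets
modprobe ipip
ifconfig tunl0 VIP netmask 255.255.255.255 up
echo 0 > /proc/sys/net/ipv4/conf/tunl0/rp_filter   # relax reverse-path filtering on the tunnel
echo 0 > /proc/sys/net/ipv4/conf/all/rp_filter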
(4)lvs-fullnat
Forwards by rewriting both the source IP and the target IP of the request packet (CIP-->DIP, VIP-->RIP);
①The VIP is a public address; RIP and DIP are private addresses and need not be in the same IP network, but must be able to reach each other through routing;
②The source IP of the request an RS receives is the DIP, so its response packets are sent back to the DIP;
③Both request and response packets must pass through the Director;
④Port mapping is supported;
⑤The RSs may use any OS;
lvs scheduler (lvs scheduling methods): when a user request arrives it must be dispatched to one of the backend real servers
(1) Static methods: schedule purely by the algorithm itself, without considering the RSs' current load:
1)RR: round robin;
2)WRR: weighted rr; (schedules according to RS capacity: the larger the weight, the more load;)
3)SH: source ip hash; (session binding: the Director keeps a hash table keyed by source IP address, with the RS's IP as the value; when a request arrives, the table is looked up and the request is sent to the RS that source IP reached before; if there is no entry yet, an RS is picked by weighted round robin and recorded. Drawback: the granularity is too coarse, and over time it erodes the load-balancing effect)
4)DH: destination ip hash; (forward web proxying, balancing internal users' requests to the Internet; Client-->Director-->Web Cache Server)
(2) Dynamic methods: schedule by the algorithm plus an assessment of each RS's current load;
based on Overhead (a load value): the RS with the smallest value is scheduled first;
1)LC: least connection
Overhead = Active*256 + Inactive
2)WLC: weighted LC
Overhead = (Active*256 + Inactive)/weight
3)SED: Shortest Expected Delay
Overhead = (Active+1)*256/weight
4)NQ: Never Queue (an improvement on SED)
First dispatch one request to each RS in descending weight order, then schedule uniformly with SED
5)LBLC: Locality-Based LC
a dynamic DH algorithm;
Client-->Director-->Web Cache Server (forward proxy)
6)LBLCR: LBLC with Replication;
(each algorithm exists to compensate for a weakness of the one before it)
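As a hedged illustration of how weights feed the weighted algorithms above (service address and RIPs are placeholders): per-RS weights are set with -w, and an existing service can be switched to another algorithm with -E:
ipvsadm -A -t VIP:80 -s wrr               # weighted round robin
ipvsadm -a -t VIP:80 -r RIP1:80 -g -w 3   # this RS receives roughly three times the share
ipvsadm -a -t VIP:80 -r RIP2:80 -g -w 1
ipvsadm -E -t VIP:80 -s wlc               # -E edits the service in place, here switching to weighted least-connection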
lvs-nat architecture:
(1) Configure the load balancer
[root@Director ~]# ifconfig   # prepare a load balancer: one NIC models the public-facing VIP, the other carries the DIP; each is connected to a different switch
eth0      Link encap:Ethernet  HWaddr 00:0C:29:1D:BA:F1
inet addr:172.16.100.123 Bcast:172.16.255.255 Mask:255.255.0.0
inet6 addr: fe80::20c:29ff:fe1d:baf1/64 Scope:Link
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:7285 errors:0 dropped:0 overruns:0 frame:0
TX packets:456 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:588563 (574.7 KiB) TX bytes:52419 (51.1 KiB)
eth1      Link encap:Ethernet  HWaddr 00:0C:29:1D:BA:FB   # the backend nodes' switch must be connected to this NIC
inet addr:192.168.61.3 Bcast:192.168.61.255 Mask:255.255.255.0
inet6 addr: fe80::20c:29ff:fe1d:bafb/64 Scope:Link
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:7 errors:0 dropped:0 overruns:0 frame:0
TX packets:28 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:420 (420.0 b) TX bytes:1848 (1.8 KiB)
[root@Director ~]# echo 1 > /proc/sys/net/ipv4/ip_forward   # enable packet forwarding on the load balancer
[root@Director ~]# cat /proc/sys/net/ipv4/ip_forward
1
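Writing to /proc this way does not survive a reboot; as a hedged aside, the usual way to make it permanent on CentOS is via sysctl:
echo "net.ipv4.ip_forward = 1" >> /etc/sysctl.conf   # persist across reboots
sysctl -p                                            # apply the file immediately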
(2) Configure the two backend nodes
[root@node1 ~]# ifconfig   # first backend node's IP address
eno16777736: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet 192.168.61.2  netmask 255.255.255.0  broadcast 192.168.61.255
        inet6 fe80::20c:29ff:fe70:e227  prefixlen 64  scopeid 0x20<link>
        ether 00:0c:29:70:e2:27  txqueuelen 1000  (Ethernet)
        RX packets 4791  bytes 393497 (384.2 KiB)
        RX errors 0  dropped 10  overruns 0  frame 0
        TX packets 283  bytes 33580 (32.7 KiB)
        TX errors 0  dropped 0  overruns 0  carrier 0  collisions 0
[root@node1 ~]# route -n   # point the default route at the load balancer's DIP
Kernel IP routing table
Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
0.0.0.0         192.168.61.3    0.0.0.0         UG    100    0        0 eno16777736
[root@node2 ~]# ifconfig   # second backend node's IP address
eno16777736: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet 192.168.61.1  netmask 255.255.255.0  broadcast 192.168.61.255
        inet6 fe80::20c:29ff:fefe:9633  prefixlen 64  scopeid 0x20<link>
        ether 00:0c:29:fe:96:33  txqueuelen 1000  (Ethernet)
        RX packets 4718  bytes 387058 (377.9 KiB)
        RX errors 0  dropped 10  overruns 0  frame 0
        TX packets 195  bytes 24091 (23.5 KiB)
        TX errors 0  dropped 0  overruns 0  carrier 0  collisions 0
[root@node2 ~]# route -n   # point the default route at the load balancer's DIP
Kernel IP routing table
Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
0.0.0.0         192.168.61.3    0.0.0.0         UG    100    0        0 eno16777736
192.168.61.0    0.0.0.0         255.255.255.0   U     0      0        0 eno16777736
(3) Define the cluster service on the load balancer
[root@Director ~]# ipvsadm -A -t 172.16.100.123:80 -s rr   # define the cluster service on VIP 172.16.100.123 port 80 with the rr scheduler
[root@Director ~]# ipvsadm -a -t 172.16.100.123:80 -r 192.168.61.1:80 -m   # add the two backend nodes, type nat (-m)
[root@Director ~]# ipvsadm -a -t 172.16.100.123:80 -r 192.168.61.2:80 -m
[root@Director ~]# ipvsadm
IP Virtual Server version 1.2.1 (size=4096)
Prot LocalAddress:Port Scheduler Flags
  -> RemoteAddress:Port           Forward Weight ActiveConn InActConn
TCP  172.16.100.123:http rr
  -> 192.168.61.1:http            Masq    1      0          0
  -> 192.168.61.2:http            Masq    1      0          0
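These rules live only in kernel memory and are lost on reboot; a hedged aside: they can be dumped and reloaded with the save/restore pair, and /etc/sysconfig/ipvsadm is the path the CentOS ipvsadm service conventionally loads at boot:
ipvsadm -S -n > /etc/sysconfig/ipvsadm   # save the rules (equivalent to ipvsadm-save)
ipvsadm -R < /etc/sysconfig/ipvsadm      # restore them (equivalent to ipvsadm-restore)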
(4) Test:
[root@node1 ~]# systemctl start httpd   # start the http service on the backend nodes
[root@node1 ~]# vim /var/www/html/index.html   # create the httpd home page
<h1>192.168.61.1</h1>
[root@node2 ~]# systemctl start httpd
[root@node2 ~]# vim /var/www/html/index.html
<h1>192.168.61.2</h1>
[root@localhost ~]# curl 172.16.100.123   # request the http service from a test host
<h1>192.168.61.2</h1>
[root@localhost ~]# curl 172.16.100.123
<h1>192.168.61.1</h1>
[root@localhost ~]# curl 172.16.100.123
<h1>192.168.61.2</h1>
lvs-dr architecture:
(1) Configure the load balancer (VIP and RIPs in the same network segment):
Note: both the load balancer and the backend servers must be configured with the VIP in order to accept requests, but its netmask must be 255.255.255.255 and its broadcast address must be the VIP itself; additionally, set the kernel parameter arp_ignore to 1 and arp_announce to 2 on the backend servers' NICs, so that adding the VIP on a backend node does not announce the address and client requests cannot go straight to a backend node.
[root@Director ~]# ifconfig   # load balancer configuration
eno16777736: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500   # this is the DIP
        inet 172.16.61.100  netmask 255.255.0.0  broadcast 172.16.255.255
        inet6 fe80::20c:29ff:feb7:d79d  prefixlen 64  scopeid 0x20<link>
        ether 00:0c:29:b7:d7:9d  txqueuelen 1000  (Ethernet)
        RX packets 23828  bytes 1886786 (1.7 MiB)
        RX errors 0  dropped 38  overruns 0  frame 0
        TX packets 942  bytes 97487 (95.2 KiB)
        TX errors 0  dropped 0  overruns 0  carrier 0  collisions 0
eno16777736:0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500   # this is the VIP
        inet 172.16.61.9  netmask 255.255.255.255  broadcast 172.16.61.9
        ether 00:0c:29:b7:d7:9d  txqueuelen 1000  (Ethernet)
[root@Director ~]# ipvsadm -A -t 172.16.61.9:80 -s rr   # define the cluster service with the rr scheduler
[root@Director ~]# ipvsadm -a -t 172.16.61.9:80 -r 172.16.61.2:80 -g   # add the RSs, type gateway (dr)
[root@Director ~]# ipvsadm -a -t 172.16.61.9:80 -r 172.16.61.3:80 -g
(2) Configure the backend RSs
[root@node1 ~]# cat /proc/sys/net/ipv4/conf/all/arp_ignore   # verify both kernel parameters before adding the VIP
1
[root@node1 ~]# cat /proc/sys/net/ipv4/conf/all/arp_announce
2
[root@node1 ~]# ifconfig
eno16777736: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet 172.16.61.2  netmask 255.255.0.0  broadcast 172.16.255.255
        inet6 fe80::20c:29ff:fefe:9633  prefixlen 64  scopeid 0x20<link>
        ether 00:0c:29:fe:96:33  txqueuelen 1000  (Ethernet)
        RX packets 9373  bytes 784171 (765.7 KiB)
        RX errors 0  dropped 22  overruns 0  frame 0
        TX packets 856  bytes 132887 (129.7 KiB)
        TX errors 0  dropped 0  overruns 0  carrier 0  collisions 0
lo: flags=73<UP,LOOPBACK,RUNNING>  mtu 65536
        inet 127.0.0.1  netmask 255.0.0.0
        inet6 ::1  prefixlen 128  scopeid 0x10<host>
        loop  txqueuelen 0  (Local Loopback)
        RX packets 1228  bytes 107680 (105.1 KiB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 1228  bytes 107680 (105.1 KiB)
        TX errors 0  dropped 0  overruns 0  carrier 0  collisions 0
lo:0: flags=73<UP,LOOPBACK,RUNNING>  mtu 65536   # the VIP must be configured on the loopback interface
        inet 172.16.61.9  netmask 255.255.255.255
        loop  txqueuelen 0  (Local Loopback)
[root@node1 ~]# route -n   # add a host route for the VIP
Kernel IP routing table
Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
172.16.0.0      0.0.0.0         255.255.0.0     U     100    0        0 eno16777736
172.16.61.9     0.0.0.0         255.255.255.255 UH    0      0        0 lo
[root@node1 ~]# vim /var/www/html/index.html
<h1>172.16.61.2</h1>
[root@node1 ~]# systemctl start httpd
[root@node2 ~]# cat /proc/sys/net/ipv4/conf/all/arp_ignore
1
[root@node2 ~]# cat /proc/sys/net/ipv4/conf/all/arp_announce
2
[root@node2 ~]# ifconfig
eno16777736: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet 172.16.61.3  netmask 255.255.0.0  broadcast 172.16.255.255
        inet6 fe80::20c:29ff:fe70:e227  prefixlen 64  scopeid 0x20<link>
        ether 00:0c:29:70:e2:27  txqueuelen 1000  (Ethernet)
        RX packets 11479  bytes 948089 (925.8 KiB)
        RX errors 0  dropped 25  overruns 0  frame 0
        TX packets 1063  bytes 168820 (164.8 KiB)
        TX errors 0  dropped 0  overruns 0  carrier 0  collisions 0
lo: flags=73<UP,LOOPBACK,RUNNING>  mtu 65536
        inet 127.0.0.1  netmask 255.0.0.0
        inet6 ::1  prefixlen 128  scopeid 0x10<host>
        loop  txqueuelen 0  (Local Loopback)
        RX packets 1188  bytes 103452 (101.0 KiB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 1188  bytes 103452 (101.0 KiB)
        TX errors 0  dropped 0  overruns 0  carrier 0  collisions 0
lo:0: flags=73<UP,LOOPBACK,RUNNING>  mtu 65536
        inet 172.16.61.9  netmask 255.255.255.255
        loop  txqueuelen 0  (Local Loopback)
[root@node2 ~]# route -n
Kernel IP routing table
Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
172.16.0.0      0.0.0.0         255.255.0.0     U     100    0        0 eno16777736
172.16.61.9     0.0.0.0         255.255.255.255 UH    0      0        0 lo
[root@node2 ~]# vim /var/www/html/index.html
<h1>172.16.61.3</h1>
[root@node2 ~]# systemctl start httpd
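The transcript above only verifies the state; for completeness, a hedged sketch of the commands that would put an RS into it (run before adding the VIP, using the values from the note above):
echo 1 > /proc/sys/net/ipv4/conf/all/arp_ignore     # answer ARP only for addresses configured on the receiving interface
echo 1 > /proc/sys/net/ipv4/conf/lo/arp_ignore
echo 2 > /proc/sys/net/ipv4/conf/all/arp_announce   # announce only the best local address for the outgoing interface
echo 2 > /proc/sys/net/ipv4/conf/lo/arp_announce
ifconfig lo:0 172.16.61.9 netmask 255.255.255.255 broadcast 172.16.61.9 up   # VIP on loopback
route add -host 172.16.61.9 dev lo:0   # host route so replies carry the VIP as their source address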
(3) Test:
[root@CentOS_6 ~]# curl 172.16.61.9
<h1>172.16.61.3</h1>
[root@CentOS_6 ~]# curl 172.16.61.9
<h1>172.16.61.2</h1>
Defining a cluster service by firewall mark:
[root@Director ~]# iptables -t mangle -A PREROUTING -d 172.16.61.9 -p tcp --dport 80 -j MARK --set-mark 3   # first, in the mangle table's PREROUTING chain, mark requests arriving at the VIP
[root@Director ~]# ipvsadm -A -f 3 -s rr   # define the cluster service by firewall mark
[root@Director ~]# ipvsadm -a -f 3 -r 172.16.61.2 -g   # add the RSs
[root@Director ~]# ipvsadm -a -f 3 -r 172.16.61.3 -g
[root@Director ~]# ipvsadm -L -n
IP Virtual Server version 1.2.1 (size=4096)
Prot LocalAddress:Port Scheduler Flags
  -> RemoteAddress:Port           Forward Weight ActiveConn InActConn
FWM  3 rr
  -> 172.16.61.2:0                Route   1      0          0
  -> 172.16.61.3:0                Route   1      0          0
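The usual payoff of FWM services is bundling several ports into one scheduling unit; as a hedged example (assuming the RSs also served 443), tagging ports 80 and 443 with the same mark would make ipvs schedule them as a single service:
iptables -t mangle -A PREROUTING -d 172.16.61.9 -p tcp -m multiport --dports 80,443 -j MARK --set-mark 3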
lvs persistence: persistent connections
Function: regardless of which scheduling algorithm ipvs uses, requests coming from the same IP address are always sent to the same RS within a specified time window; this is implemented through the lvs persistence template (a persistence hash table) and is independent of the scheduling algorithm;
Modes:
per-port persistence (ppc)
per-client persistence (pcc)
per-FWM persistence (PFWMC)
[root@Director ~]# ipvsadm -A -t 172.16.61.9:80 -s rr -p   # per-port persistence; the default timeout here is 360s
[root@Director ~]# ipvsadm -a -t 172.16.61.9:80 -r 172.16.61.2 -g
[root@Director ~]# ipvsadm -a -t 172.16.61.9:80 -r 172.16.61.3 -g
[root@Director ~]# ipvsadm
IP Virtual Server version 1.2.1 (size=4096)
Prot LocalAddress:Port Scheduler Flags
  -> RemoteAddress:Port           Forward Weight ActiveConn InActConn
TCP  172.16.61.9:http rr persistent 360
  -> 172.16.61.2:http             Route   1      0          0
  -> 172.16.61.3:http             Route   1      0          0
[root@CentOS_6 ~]# curl 172.16.61.9   # test: the same client keeps hitting the same RS
<h1>172.16.61.3</h1>
[root@CentOS_6 ~]# curl 172.16.61.9
<h1>172.16.61.3</h1>
[root@CentOS_6 ~]# curl 172.16.61.9
<h1>172.16.61.3</h1>
[root@CentOS_6 ~]# curl 172.16.61.9
<h1>172.16.61.3</h1>
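A hedged sketch of the remaining persistence variants (the 600s timeout is illustrative): -p accepts an optional explicit timeout, port 0 approximates pcc by persisting a client across all ports, and combining -p with a firewall-mark service gives PFWMC:
ipvsadm -A -t 172.16.61.9:80 -s rr -p 600   # ppc with an explicit 600s timeout
ipvsadm -A -t 172.16.61.9:0 -s rr -p        # pcc: port 0 binds a client's requests to one RS for all ports
ipvsadm -A -f 3 -s rr -p                    # PFWMC: persistence on the firewall-mark service defined earlier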