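Troubleshooting a Check Point firewall cluster negotiation failure caused by a mismatch in the number of firewall instances activated by CoreXL.
Symptom: of the two firewalls forming the cluster, cp-246 reports the HA state Ready while the other, cp-248, reports Active, and neither member shows the other's state.
Cluster comparison check on NJZQ-CP-246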
[NJZQ-CP-246]# cphaprob -a if
Required interfaces: 3
Required secured interfaces: 1
eth0 UP non sync(non secured), multicast
eth1 UP non sync(non secured), multicast
eth2 UP sync(secured), multicast
Virtual cluster interfaces: 3
eth0 221.226.154.194
eth1 192.168.200.247
eth2 19.19.19.247
[NJZQ-CP-246]#
[NJZQ-CP-246]# cphaprob state
Cluster Mode: New High Availability (Active Up)
Number Unique Address Assigned Load State
2 (local) 19.19.19.246 0% Ready
[NJZQ-CP-246]#
[NJZQ-CP-246]# cphaprob list
Built-in Devices:
Device Name: Interface Active Check
Current state: OK
Registered Devices:
Device Name: Synchronization
Registration number: 0
Timeout: none
Current state: OK
Time since last report: 77483.5 sec
Device Name: Filter
Registration number: 1
Timeout: none
Current state: OK
Time since last report: 77477.4 sec
Device Name: cphad
Registration number: 2
Timeout: 2 sec
Current state: OK
Time since last report: 0.2 sec
Device Name: fwd
Registration number: 3
Timeout: 2 sec
Current state: OK
Time since last report: 0.5 sec
[NJZQ-CP-246]#
[NJZQ-CP-246]# cpstat ha -f all
Product name: High Availability
Major version: 6
Minor version: 0
Service pack: 1
Version string: N/A
Status code: 0
Status short: OK
Status long: Refer to the Notification and Interfaces tables for information about the problem
HA installed: 1
Working mode: High Availability (Active Up)
HA protocol version: 2
HA started: yes
HA state: ready
HA identifier: 2
Interface table
-------------------------------------------------------------
|Name|IP |Status|Verified|Trusted|Shared|Netmask|
-------------------------------------------------------------
|eth0|221.226.154.195|Up | 200| 0| 2|0.0.0.0|
|eth1|192.168.200.246|Up | 0| 0| 2|0.0.0.0|
|eth2| 19.19.19.246|Up | 0| 1| 2|0.0.0.0|
-------------------------------------------------------------
Problem Notification table
------------------------------------------------
|Name |Status|Priority|Verified|Descr|
------------------------------------------------
|Synchronization|OK | 0| 77531| |
|Filter |OK | 0| 77524| |
|cphad |OK | 0| 0| |
|fwd |OK | 0| 1| |
------------------------------------------------
Cluster IPs table
----------------------------------------------------------------------
|Name|IP |Netmask |Member Network |Member Netmask |
----------------------------------------------------------------------
|eth0|221.226.154.194|255.255.255.248|221.226.154.192|255.255.255.248|
|eth1|192.168.200.247| 255.255.255.0| 192.168.200.0| 255.255.255.0|
|eth2| 19.19.19.247| 255.255.255.0| 19.19.19.0| 255.255.255.0|
----------------------------------------------------------------------
Sync table
---------------------------------
|Name|IP |Netmask |
---------------------------------
|eth2|19.19.19.246|255.255.255.0|
---------------------------------
[NJZQ-CP-246]#
[NJZQ-CP-246]# fw ctl pstat
Machine Capacity Summary:
Memory used: 7% (126MB out of 1638MB) - below low watermark
Concurrent Connections: 0% (5 out of 24900) - below low watermark
Aggressive Aging is not active
Hash kernel memory (hmem) statistics:
Total memory allocated: 31457280 bytes in 7672 4KB blocks using 8 pools
Initial memory allocated: 20971520 bytes (Hash memory extended by 10485760 bytes)
Memory allocation limit: 31457280 bytes using 512 pools
Total memory bytes used: 15350072 unused: 16107208 (51.20%) peak: 26094340
Total memory blocks used: 4436 unused: 3236 (42%) peak: 6794
Allocations: 25663486 alloc, 402789 failed alloc, 25502424 free
System kernel memory (smem) statistics:
Total memory bytes used: 113440916 peak: 153201032
Blocking memory bytes used: 2041508 peak: 2602416
Non-Blocking memory bytes used: 111399408 peak: 150598616
Allocations: 415867 alloc, 0 failed alloc, 411131 free, 0 failed free
Kernel memory (kmem) statistics:
Total memory bytes used: 96995928 peak: 148937068
Allocations: 26073835 alloc, 0 failed alloc, 25909727 free, 0 failed free
External Allocations: 0 for packets, 0 for SXL
Kernel stacks:
0 bytes total, 0 bytes stack size, 0 stacks,
0 peak used, 0 max stack bytes used, 0 min stack bytes used,
0 failed stack calls
INSPECT:
0 packets, 0 operations, 0 lookups,
0 record, 0 extract
Cookies:
4739679 total, 0 alloc, 0 free,
11 dup, 346925 get, 77498 put,
4739829 len, 0 cached len, 0 chain alloc,
0 chain free
Connections:
464 total, 399 TCP, 50 UDP, 9 ICMP,
6 other, 0 anticipated, 30 recovered, 5 concurrent,
509 peak concurrent
Fragments:
0 fragments, 0 packets, 0 expired, 0 short,
0 large, 0 duplicates, 0 failures
NAT:
53/0 forw, 0/0 bckw, 52 tcpudp,
1 icmp, 40-39 alloc
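Sync: // The cluster sync interfaces are not exchanging packets normally: this member receives no sync packets (first confirm that the firewall policy is not blocking them!)
Version: new
Status: Able to Send/Receive sync packets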
Sync packets sent:
total : 50885, retransmitted : 0, retrans reqs : 0, acks : 0
Sync packets received:
total : 0, were queued : 0, dropped by net : 0
retrans reqs : 0, received 0 acks
retrans reqs for illegal seq : 0
dropped updates as a result of sync overload: 0
[NJZQ-CP-246]#
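The Sync section of the `fw ctl pstat` output above shows 50885 sync packets sent and 0 received. As a minimal sketch, assuming the output has been captured to a string, that one-way pattern could be detected as follows. The helper names `sync_counters` and `one_way_sync` are hypothetical and not part of any Check Point tool; the sample text is copied from the transcript.

```python
import re


def sync_counters(pstat_text):
    """Return (sent_total, received_total) from the Sync section of `fw ctl pstat` text."""
    sent = received = None
    section = None
    for line in pstat_text.splitlines():
        stripped = line.strip()
        if stripped.startswith("Sync packets sent"):
            section = "sent"
        elif stripped.startswith("Sync packets received"):
            section = "received"
        else:
            # The first "total : N" line after each header carries the counter.
            m = re.match(r"total\s*:\s*(\d+)", stripped)
            if m and section == "sent" and sent is None:
                sent = int(m.group(1))
            elif m and section == "received" and received is None:
                received = int(m.group(1))
    return sent, received


def one_way_sync(pstat_text):
    """True when the member sends sync packets but has received none."""
    sent, received = sync_counters(pstat_text)
    return sent is not None and received is not None and sent > 0 and received == 0


# Sample copied from the NJZQ-CP-246 transcript.
SAMPLE = """\
Sync:
Version: new
Status: Able to Send/Receive sync packets
Sync packets sent:
total : 50885, retransmitted : 0, retrans reqs : 0, acks : 0
Sync packets received:
total : 0, were queued : 0, dropped by net : 0
"""

if __name__ == "__main__":
    print(sync_counters(SAMPLE), one_way_sync(SAMPLE))
```

A result of `True` only says that nothing is arriving on the sync network. As the note in the transcript says, the firewall policy has to be ruled out before looking for other causes.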
[NJZQ-CP-246]# cpconfig
This program will let you re-configure
your Check Point products configuration.
Configuration Options:
----------------------
(1) Licenses and contracts
(2) SNMP Extension
(3) PKCS#11 Token
(4) Random Pool
(5) Secure Internal Communication
(6) Disable cluster membership for this gateway
(7) Configure Check Point CoreXL
(8) Automatic start of Check Point Products
(9) Exit
Enter your choice (1-9) :7
Configuring Configure Check Point CoreXL...
===========================================
CoreXL is currently enabled with 6 firewall instances.
(1) Change the number of firewall instances
(2) Disable Check Point CoreXL
(3) Exit
Enter your choice (1-3) : 1
This machine has 8 CPUs.
Note: All cluster members must have the same number of firewall instances enabled.
How many firewall instances would you like to enable (2 to 4) [3] ? 4
CoreXL was enabled successfully with 4 firewall instances.
Important: This change will take effect after reboot.
[NJZQ-CP-246]# reboot
Are you sure? (y/n) y
Broadcast message from root (pts/0) (Wed Jul 29 14:33:54 2015):
The system is going down for reboot NOW!
[NJZQ-CP-246]#
Cluster comparison check on NJZQ-CP-248
[NJZQ-CP-248]# cphaprob -a if
Required interfaces: 3
Required secured interfaces: 1
eth0 UP non sync(non secured), multicast
eth1 UP non sync(non secured), multicast
eth2 UP sync(secured), multicast
Virtual cluster interfaces: 3
eth0 221.226.154.194
eth1 192.168.200.247
eth2 19.19.19.247
[NJZQ-CP-248]# cphaprob state
Cluster Mode: New High Availability (Active Up)
Number Unique Address Assigned Load State
1 (local) 19.19.19.248 100% Active
[NJZQ-CP-248]#
[NJZQ-CP-248]#
[NJZQ-CP-248]# cphaprob list
Built-in Devices:
Device Name: Interface Active Check
Current state: OK
Registered Devices:
Device Name: Synchronization
Registration number: 0
Timeout: none
Current state: OK
Time since last report: 77425.4 sec
Device Name: Filter
Registration number: 1
Timeout: none
Current state: OK
Time since last report: 77419.4 sec
Device Name: cphad
Registration number: 2
Timeout: 2 sec
Current state: OK
Time since last report: 0.8 sec
Device Name: fwd
Registration number: 3
Timeout: 2 sec
Current state: OK
Time since last report: 0.8 sec
Device Name: FIB
Registration number: 4
Timeout: none
Current state: OK
Time since last report: 145126 sec
[NJZQ-CP-248]#
[NJZQ-CP-248]# cpstat ha -f all
Product name: High Availability
Major version: 6
Minor version: 0
Service pack: 1
Version string: N/A
Status code: 0
Status short: OK
Status long: Refer to the Notification and Interfaces tables for information about the problem
HA installed: 1
Working mode: High Availability (Active Up)
HA protocol version: 2
HA started: yes
HA state: active
HA identifier: 1
Interface table
-------------------------------------------------------------
|Name|IP |Status|Verified|Trusted|Shared|Netmask|
-------------------------------------------------------------
|eth0|221.226.154.196|Up | 300| 0| 2|0.0.0.0|
|eth1|192.168.200.248|Up | 0| 0| 2|0.0.0.0|
|eth2| 19.19.19.248|Up | 0| 1| 2|0.0.0.0|
-------------------------------------------------------------
Problem Notification table
------------------------------------------------
|Name |Status|Priority|Verified|Descr|
------------------------------------------------
|Synchronization|OK | 0| 77681| |
|Filter |OK | 0| 77675| |
|cphad |OK | 0| 0| |
|fwd |OK | 0| 0| |
|FIB |OK | 0| 145382| |
------------------------------------------------
Cluster IPs table
----------------------------------------------------------------------
|Name|IP |Netmask |Member Network |Member Netmask |
----------------------------------------------------------------------
|eth0|221.226.154.194|255.255.255.248|221.226.154.192|255.255.255.248|
|eth1|192.168.200.247| 255.255.255.0| 192.168.200.0| 255.255.255.0|
|eth2| 19.19.19.247| 255.255.255.0| 19.19.19.0| 255.255.255.0|
----------------------------------------------------------------------
Sync table
---------------------------------
|Name|IP |Netmask |
---------------------------------
|eth2|19.19.19.248|255.255.255.0|
---------------------------------
[NJZQ-CP-248]#
[NJZQ-CP-248]# fw ctl pstat
Machine Capacity Summary:
Memory used: 3% (56MB out of 1638MB) - below low watermark
Concurrent Connections: 0% (15 out of 24900) - below low watermark
Aggressive Aging is not active
Hash kernel memory (hmem) statistics:
Total memory allocated: 20971520 bytes in 5115 4KB blocks using 5 pools
Total memory bytes used: 5420960 unused: 15550560 (74.15%) peak: 9363424
Total memory blocks used: 1590 unused: 3525 (68%) peak: 2434
Allocations: 20398394 alloc, 0 failed alloc, 20341055 free
System kernel memory (smem) statistics:
Total memory bytes used: 58076812 peak: 74594452
Blocking memory bytes used: 1435484 peak: 1435484
Non-Blocking memory bytes used: 56641328 peak: 73158968
Allocations: 4509 alloc, 0 failed alloc, 3473 free, 0 failed free
Kernel memory (kmem) statistics:
Total memory bytes used: 42463860 peak: 65598912
Allocations: 20401060 alloc, 0 failed alloc, 20343252 free, 0 failed free
External Allocations: 0 for packets, 0 for SXL
Kernel stacks:
0 bytes total, 0 bytes stack size, 0 stacks,
0 peak used, 0 max stack bytes used, 0 min stack bytes used,
0 failed stack calls
INSPECT:
0 packets, 0 operations, 0 lookups,
0 record, 0 extract
Cookies:
8540948 total, 0 alloc, 0 free,
3288 dup, 4471698 get, 26365 put,
8614434 len, 0 cached len, 0 chain alloc,
0 chain free
Connections:
23178 total, 563 TCP, 17814 UDP, 3 ICMP,
4798 other, 0 anticipated, 52 recovered, 15 concurrent,
589 peak concurrent
Fragments:
0 fragments, 0 packets, 0 expired, 0 short,
0 large, 0 duplicates, 0 failures
NAT:
4312/0 forw, 74/0 bckw, 4369 tcpudp,
11 icmp, 14678-13878 alloc
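Sync: // The cluster sync interfaces are not exchanging packets normally: this member receives no sync packets either (first confirm that the firewall policy is not blocking them!)
Version: new
Status: Able to Send/Receive sync packets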
Sync packets sent:
total : 119178, retransmitted : 0, retrans reqs : 0, acks : 0
Sync packets received:
total : 0, were queued : 0, dropped by net : 0
retrans reqs : 0, received 0 acks
retrans reqs for illegal seq : 0
dropped updates as a result of sync overload: 0
[NJZQ-CP-248]#
[NJZQ-CP-248]# cpconfig
This program will let you re-configure
your Check Point products configuration.
Configuration Options:
----------------------
(1) Licenses and contracts
(2) SNMP Extension
(3) PKCS#11 Token
(4) Random Pool
(5) Secure Internal Communication
(6) Disable Advanced Routing
(7) Disable cluster membership for this gateway
(8) Configure Check Point CoreXL
(9) Automatic start of Check Point Products
(10) Exit
Enter your choice (1-10) :8
Configuring Configure Check Point CoreXL...
===========================================
CoreXL is currently enabled with 2 firewall instances.
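// Comparing the two members shows that the number of firewall instances activated by CoreXL on CP-248 clearly differs from the number on CP-246. For cluster HA this parameter must be identical on all members! The steps below change the number of active firewall instances on this firewall through cpconfig.
(1) Change the number of firewall instances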
(2) Disable Check Point CoreXL
(3) Exit
Enter your choice (1-3) : 1
This machine has 8 CPUs.
Note: All cluster members must have the same number of firewall instances enabled.
How many firewall instances would you like to enable (2 to 4) [3] ? 4
CoreXL was enabled successfully with 4 firewall instances.
Important: This change will take effect after reboot.
[NJZQ-CP-248]# reboot
Are you sure? (y/n) y
Broadcast message from root (pts/0) (Wed Jul 29 14:24:14 2015):
The system is going down for reboot NOW!
[NJZQ-CP-248]#
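The comparison that exposed the fault (6 firewall instances reported on NJZQ-CP-246, 2 on NJZQ-CP-248) can be sketched as a small check over captured `cpconfig` text. This is only an illustration: the helpers `instance_count` and `mismatched` are hypothetical, and the regular expression assumes the banner is worded exactly as in the transcripts.

```python
import re

# Banner line printed by the CoreXL menu of cpconfig, as seen in the transcripts.
BANNER = re.compile(r"CoreXL is currently enabled with (\d+) firewall instances")


def instance_count(cpconfig_text):
    """Return the number of firewall instances reported in cpconfig text, or None."""
    m = BANNER.search(cpconfig_text)
    return int(m.group(1)) if m else None


def mismatched(counts):
    """True when the members in a {member name: count} mapping do not all agree."""
    return len(set(counts.values())) > 1


# Banner lines copied from the two transcripts.
captured = {
    "NJZQ-CP-246": "CoreXL is currently enabled with 6 firewall instances.",
    "NJZQ-CP-248": "CoreXL is currently enabled with 2 firewall instances.",
}
counts = {name: instance_count(text) for name, text in captured.items()}

if __name__ == "__main__":
    print(counts, "MISMATCH" if mismatched(counts) else "consistent")
```

Once both members are set to 4 instances and rebooted, the same check would report a consistent configuration.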
After the reboot:
[NJZQ-CP-246]# cphaprob state
Cluster Mode: New High Availability (Active Up)
Number Unique Address Assigned Load State
1 19.19.19.248 100% Active
2 (local) 19.19.19.246 0% Standby
[NJZQ-CP-246]#
[NJZQ-CP-248]# cphaprob state
Cluster Mode: New High Availability (Active Up)
Number Unique Address Assigned Load State
1 (local) 19.19.19.248 100% Active
2 19.19.19.246 0% Standby
[NJZQ-CP-248]#
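The difference between the faulty and the recovered `cphaprob state` output is that each member now lists its peer, with one member Active and the other Standby. A minimal sketch of that check is shown below, assuming the table rows look like the ones in the transcripts. The functions `parse_state` and `healthy` are hypothetical names introduced for illustration.

```python
import re

# Row layout assumed from the transcripts: number, optional "(local)", address, load, state.
ROW = re.compile(
    r"^\s*(\d+)\s+(\(local\)\s+)?(\d+\.\d+\.\d+\.\d+)\s+(\d+)%\s+(\S.*?)\s*$"
)


def parse_state(text):
    """Parse the member rows of `cphaprob state` output into a list of dicts."""
    members = []
    for line in text.splitlines():
        m = ROW.match(line)
        if m:
            members.append({
                "id": int(m.group(1)),
                "local": m.group(2) is not None,
                "address": m.group(3),
                "load": int(m.group(4)),
                "state": m.group(5),
            })
    return members


def healthy(members, expected=2):
    """True when all expected members are listed: exactly one Active, the rest Standby."""
    states = [m["state"] for m in members]
    return (
        len(members) == expected
        and states.count("Active") == 1
        and states.count("Standby") == expected - 1
    )


BEFORE = """\
Cluster Mode: New High Availability (Active Up)
Number Unique Address Assigned Load State
2 (local) 19.19.19.246 0% Ready
"""

AFTER = """\
Cluster Mode: New High Availability (Active Up)
Number Unique Address Assigned Load State
1 19.19.19.248 100% Active
2 (local) 19.19.19.246 0% Standby
"""

if __name__ == "__main__":
    print(healthy(parse_state(BEFORE)), healthy(parse_state(AFTER)))
```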
This article comes from the "见贤思齐的蜗牛" blog; please retain this attribution: http://jeffsoung.blog.51cto.com/392776/1681227