MPP Part 1: Greenplum Cluster Installation

Date: 2023-03-09 19:28:21

Installing and Initializing a Greenplum Database System...

1 Installation Overview

1.1 Environment

Name              Version                   Download URL
Virtual machine   Oracle VirtualBox 4.3.10  http://www.virtualbox.org
Operating system  CentOS 6.7 64-bit         https://www.centos.org
greenplum         5.0.0-alpha.5             https://network.pivotal.io/products/pivotal-gpdb
File system       ext4

1.2 Cluster Layout

Role               Count  Hostname                   IP
Greenplum Master   1      gp-master                  192.168.56.10
Greenplum Standby
Greenplum Segment  3      gp-sdw1, gp-sdw2, gp-sdw3  192.168.56.12, 192.168.56.14, 192.168.56.16

2 Preparation

2.1 Linux User

Create the Greenplum administrator user on every node.

groupadd -g 530 gpadmin
useradd -g 530 -u 530 -m -d /home/gpadmin -s /bin/bash gpadmin
chown -R gpadmin:gpadmin /home/gpadmin
echo "gpadmin" | passwd --stdin gpadmin

2.2 Hostname and hosts Configuration

Configure the shared settings on one node first; section 2.6 copies them to the other nodes.

vi /etc/hosts

192.168.56.10 gp-master
192.168.56.12 gp-sdw1
192.168.56.14 gp-sdw2
192.168.56.16 gp-sdw3

Set the hostname on each host to match its entry above:

vi /etc/sysconfig/network

2.3 Firewall

Disable SELinux and the firewall:

vi /etc/selinux/config

SELINUX=disabled

service iptables stop
chkconfig iptables off

Check the firewall status with service iptables status.

2.4 System Resource Configuration

vi /etc/sysctl.conf

kernel.shmmni = 4096
kernel.shmall = 4000000000
kernel.sem = 250 512000 100 2048
kernel.sysrq = 1
kernel.core_uses_pid = 1
kernel.msgmnb = 65536
kernel.msgmax = 65536
kernel.msgmni = 2048
net.ipv4.tcp_syncookies = 1
net.ipv4.ip_forward = 0
net.ipv4.tcp_tw_recycle = 1
net.ipv4.tcp_max_syn_backlog = 4096
net.ipv4.conf.default.arp_filter = 1
net.ipv4.ip_local_port_range = 1025 65535
net.core.netdev_max_backlog = 10000
net.core.rmem_max = 2097152
net.core.wmem_max = 2097152
#vm.overcommit_memory = 2 ### keep this commented out in test environments, otherwise Oracle fails to start ### use a value of 1

Apply the settings:

sysctl -p
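
A mistyped key in /etc/sysctl.conf is easy to overlook among the output of sysctl -p. As a rough sketch, the check below relies on the fact that every sysctl key corresponds to a file under /proc/sys, with the dots replaced by slashes. The function name check_sysctl_keys and its optional second argument (an alternative lookup root) are assumptions made for this illustration, not part of any Greenplum tooling.

```shell
#!/usr/bin/env bash
# check_sysctl_keys FILE [ROOT]
# Print every key in FILE that has no matching entry under ROOT
# (ROOT defaults to /proc/sys). Hypothetical helper for illustration.
check_sysctl_keys() {
    local conf="$1" root="${2:-/proc/sys}" line key path missing=0
    while IFS= read -r line || [ -n "$line" ]; do
        # Skip blank lines and comment lines.
        case "$line" in
            ''|'#'*|';'*) continue ;;
        esac
        # The key is the text before '=', without surrounding whitespace.
        key="$(printf '%s' "${line%%=*}" | tr -d '[:space:]')"
        # sysctl resolves a key by turning its dots into path separators.
        path="$root/$(printf '%s' "$key" | tr . /)"
        if [ ! -e "$path" ]; then
            echo "unknown key: $key"
            missing=1
        fi
    done < "$conf"
    return "$missing"
}
```

Running check_sysctl_keys /etc/sysctl.conf prints nothing and returns 0 when every key resolves. Keys that belong to kernel modules which are not loaded yet are reported as unknown too, so treat the output as a hint rather than a verdict.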

Configure the process limits:

vi /etc/security/limits.d/90-nproc.conf

*          soft    nproc     131072
root soft nproc unlimited

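2.5 Temporarily Enable sudo for gpadmin

Installing Greenplum on the cluster nodes later involves creating directories and files, so enable sudo for gpadmin temporarily and revoke it once the installation succeeds.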

visudo

gpadmin    ALL=(ALL)       ALL
gpadmin ALL=(ALL) NOPASSWD:ALL

2.6 Copy the Configuration Files to All Nodes

scp /etc/hosts gp-sdw1:/etc
scp /etc/sysctl.conf gp-sdw1:/etc
scp /etc/security/limits.d/90-nproc.conf gp-sdw1:/etc/security/limits.d
scp /etc/selinux/config gp-sdw1:/etc/selinux
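
The commands above copy the files to gp-sdw1 only; the other segment hosts need the same files. One way to cover all of them is a loop such as the following sketch. The host and file lists are taken from this section, while the DRY_RUN switch and the copy_config function name are assumptions added for illustration.

```shell
#!/usr/bin/env bash
# Segment hosts and configuration files used in this guide.
SEG_HOSTS="gp-sdw1 gp-sdw2 gp-sdw3"
FILES="/etc/hosts /etc/sysctl.conf /etc/security/limits.d/90-nproc.conf /etc/selinux/config"

# copy_config: copy every file to the same directory on every segment host.
# With DRY_RUN=1 (hypothetical switch) the scp commands are only printed.
copy_config() {
    local host file
    for host in $SEG_HOSTS; do
        for file in $FILES; do
            if [ "${DRY_RUN:-0}" = "1" ]; then
                echo "scp $file $host:$(dirname "$file")"
            else
                scp "$file" "$host:$(dirname "$file")"
            fi
        done
    done
}
```

Calling DRY_RUN=1 copy_config first shows the twelve scp commands without touching the remote hosts; calling copy_config as root then performs the copies.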

Reboot the operating system.

3 Install Greenplum DB

3.1 Install Greenplum DB on the Master Node

Install on the master node first, using /opt/greenplum/greenplum-db-5.0.0-alpha.5 as the installation path:

cd /tmp
unzip greenplum-db-5.0.0-alpha.5-rhel6-x86_64.zip
/tmp/greenplum-db-5.0.0-alpha.5-rhel6-x86_64.bin
********************************************************************************
Do you accept the Pivotal Database license agreement? [yes|no]
********************************************************************************
yes
********************************************************************************
Provide the installation path for Greenplum Database or press ENTER to
accept the default installation path: /usr/local/greenplum-db-5.0.0-alpha.5
********************************************************************************
/opt/greenplum/greenplum-db-5.0.0-alpha.5
********************************************************************************
Install Greenplum Database into /opt/greenplum/greenplum-db-5.0.0-alpha.5? [yes|no]
********************************************************************************
yes
********************************************************************************
/opt/greenplum/greenplum-db-5.0.0-alpha.5 does not exist.
Create /opt/greenplum/greenplum-db-5.0.0-alpha.5 ? [yes|no]
(Selecting no will exit the installer)
********************************************************************************
yes
Extracting product to /opt/greenplum/greenplum-db-5.0.0-alpha.5
********************************************************************************
Installation complete.
Greenplum Database is installed in /opt/greenplum/greenplum-db-5.0.0-alpha.5
Pivotal Greenplum documentation is available
for download at http://gpdb.docs.pivotal.io
********************************************************************************

By default the installer creates a symbolic link (greenplum-db) that points to greenplum-db-5.0.0-alpha.5:

ls -ltr /opt/greenplum/
total 8
lrwxrwxrwx 1 gpadmin gpadmin 28 May 30 12:14 greenplum-db -> ./greenplum-db-5.0.0-alpha.5
drwxr-xr-x 11 gpadmin gpadmin 4096 May 30 12:19 greenplum-db-5.0.0-alpha.5

Change the owner of the directories to gpadmin:

chown -R gpadmin:gpadmin /opt/greenplum/
chown -R gpadmin:gpadmin /opt/greenplum/greenplum-db

3.2 Configure the Cluster Host Lists on the Master Node

su - gpadmin
mkdir -p /opt/greenplum/greenplum-db/conf
vi /opt/greenplum/greenplum-db/conf/hostlist

gp-master
gp-sdw1
gp-sdw2
gp-sdw3

Create a seg_hosts file that contains the hostnames of all segment hosts:

vi /opt/greenplum/greenplum-db/conf/seg_hosts

gp-sdw1
gp-sdw2
gp-sdw3

3.3 Configure Passwordless SSH

su - gpadmin
source /opt/greenplum/greenplum-db/greenplum_path.sh # without this the tools fail with: Error: unable to import module: No module named gppylib.commands
/opt/greenplum/greenplum-db/bin/gpssh-exkeys -f /opt/greenplum/greenplum-db/conf/hostlist
[STEP 1 of 5] create local ID and authorize on local host
... /home/gpadmin/.ssh/id_rsa file exists ... key generation skipped
[STEP 2 of 5] keyscan all hosts and update known_hosts file
[STEP 3 of 5] authorize current user on remote hosts
... send to gp-sdw1
... send to gp-sdw2
... send to gp-sdw3
[STEP 4 of 5] determine common authentication file content
[STEP 5 of 5] copy authentication files to all remote hosts
... finished key exchange with gp-sdw1
... finished key exchange with gp-sdw2
... finished key exchange with gp-sdw3
[INFO] completed successfully

Test with ssh gp-sdw1; the login should succeed without a password prompt.
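
To test every host in one pass rather than logging in by hand, a loop along these lines may help. It relies on the OpenSSH option BatchMode=yes, which makes ssh fail instead of prompting for a password. The check_ssh function and the SSH_CMD override are assumptions introduced for this sketch.

```shell
#!/usr/bin/env bash
# check_ssh HOSTFILE
# Try a non-interactive login to every host listed in HOSTFILE.
# SSH_CMD is a hypothetical override for the ssh binary (defaults to ssh).
check_ssh() {
    local hostfile="$1" host failed=0
    while IFS= read -r host || [ -n "$host" ]; do
        [ -z "$host" ] && continue
        # BatchMode=yes: fail rather than ask for a password.
        # stdin is redirected so ssh cannot swallow the rest of the host list.
        if ${SSH_CMD:-ssh} -o BatchMode=yes -o ConnectTimeout=5 "$host" true </dev/null; then
            echo "$host: ok"
        else
            echo "$host: FAILED"
            failed=1
        fi
    done < "$hostfile"
    return "$failed"
}
```

Run as gpadmin, check_ssh /opt/greenplum/greenplum-db/conf/hostlist returns a non-zero status if any host still asks for a password or cannot be reached.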

3.4 Install Greenplum DB on the Segment Nodes

From the master node, remotely create the directory the segment nodes need and change its owner to gpadmin:

su - gpadmin
source /opt/greenplum/greenplum-db/greenplum_path.sh
/opt/greenplum/greenplum-db/bin/gpssh -f /opt/greenplum/greenplum-db/conf/seg_hosts -e -v "sudo mkdir -p /opt/greenplum && sudo chown gpadmin:gpadmin -R /opt/greenplum"
[INFO] login gp-master
[INFO] login gp-sdw1
[INFO] login gp-sdw2
[INFO] login gp-sdw3
[ gp-sdw1] sudo mkdir -p /opt/greenplum && sudo chown gpadmin:gpadmin -R /opt/greenplum
[ gp-sdw2] sudo mkdir -p /opt/greenplum && sudo chown gpadmin:gpadmin -R /opt/greenplum
[ gp-sdw3] sudo mkdir -p /opt/greenplum && sudo chown gpadmin:gpadmin -R /opt/greenplum
[INFO] completed successfully
[Cleanup...]

Copy the Greenplum DB files installed on the master node to all segment nodes and install them there:

su - gpadmin
source /opt/greenplum/greenplum-db/greenplum_path.sh
/opt/greenplum/greenplum-db/bin/gpseginstall -f /opt/greenplum/greenplum-db/conf/hostlist -u gpadmin -p gpadmin
20170530:12:26:50:004409 gpseginstall:gp-master:gpadmin-[INFO]:-Installation Info:
link_name greenplum-db
binary_path /opt/greenplum/greenplum-db-5.0.0-alpha.5
binary_dir_location /opt/greenplum
binary_dir_name greenplum-db-5.0.0-alpha.5
20170530:12:26:50:004409 gpseginstall:gp-master:gpadmin-[INFO]:-check cluster password access
20170530:12:26:50:004409 gpseginstall:gp-master:gpadmin-[INFO]:-de-duplicate hostnames
20170530:12:26:50:004409 gpseginstall:gp-master:gpadmin-[INFO]:-master hostname: gp-master
20170530:12:26:51:004409 gpseginstall:gp-master:gpadmin-[INFO]:-rm -f /opt/greenplum/greenplum-db-5.0.0-alpha.5.tar; rm -f /opt/greenplum/greenplum-db-5.0.0-alpha.5.tar.gz
20170530:12:26:51:004409 gpseginstall:gp-master:gpadmin-[INFO]:-cd /opt/greenplum; tar cf greenplum-db-5.0.0-alpha.5.tar greenplum-db-5.0.0-alpha.5
20170530:12:26:54:004409 gpseginstall:gp-master:gpadmin-[INFO]:-gzip /opt/greenplum/greenplum-db-5.0.0-alpha.5.tar
20170530:12:27:22:004409 gpseginstall:gp-master:gpadmin-[INFO]:-remote command: mkdir -p /opt/greenplum
20170530:12:27:23:004409 gpseginstall:gp-master:gpadmin-[INFO]:-remote command: rm -rf /opt/greenplum/greenplum-db-5.0.0-alpha.5
20170530:12:27:24:004409 gpseginstall:gp-master:gpadmin-[INFO]:-scp software to remote location
20170530:12:27:40:004409 gpseginstall:gp-master:gpadmin-[INFO]:-remote command: gzip -f -d /opt/greenplum/greenplum-db-5.0.0-alpha.5.tar.gz
20170530:12:28:00:004409 gpseginstall:gp-master:gpadmin-[INFO]:-md5 check on remote location
20170530:12:28:05:004409 gpseginstall:gp-master:gpadmin-[INFO]:-remote command: cd /opt/greenplum; tar xf greenplum-db-5.0.0-alpha.5.tar
20170530:12:28:38:004409 gpseginstall:gp-master:gpadmin-[INFO]:-remote command: rm -f /opt/greenplum/greenplum-db-5.0.0-alpha.5.tar
20170530:12:28:39:004409 gpseginstall:gp-master:gpadmin-[INFO]:-remote command: cd /opt/greenplum; rm -f greenplum-db; ln -fs greenplum-db-5.0.0-alpha.5 greenplum-db
20170530:12:28:40:004409 gpseginstall:gp-master:gpadmin-[INFO]:-rm -f /opt/greenplum/greenplum-db-5.0.0-alpha.5.tar.gz
20170530:12:28:41:004409 gpseginstall:gp-master:gpadmin-[INFO]:-version string on master: gpssh version 5.0.0 alpha.5 build commit:2e87c5aa435c779b2f3837fa8c7273876497f6ba
20170530:12:28:41:004409 gpseginstall:gp-master:gpadmin-[INFO]:-remote command: . /opt/greenplum/greenplum-db/./greenplum_path.sh; /opt/greenplum/greenplum-db/./bin/gpssh --version
20170530:12:28:48:004409 gpseginstall:gp-master:gpadmin-[INFO]:-remote command: . /opt/greenplum/greenplum-db-5.0.0-alpha.5/greenplum_path.sh; /opt/greenplum/greenplum-db-5.0.0-alpha.5/bin/gpssh --version
20170530:12:28:49:004409 gpseginstall:gp-master:gpadmin-[INFO]:-SUCCESS -- Requested commands completed

Check the installation and directory layout on every node:

su - gpadmin
source /opt/greenplum/greenplum-db/greenplum_path.sh
/opt/greenplum/greenplum-db/bin/gpssh -f /opt/greenplum/greenplum-db/conf/hostlist -e ls -l $GPHOME
[gp-master] ls -l /opt/greenplum/greenplum-db/.
[gp-master] total 40
[gp-master] drwxr-xr-x 7 gpadmin gpadmin 4096 May 20 02:29 bin
[gp-master] drwxrwxr-x 2 gpadmin gpadmin 4096 May 30 12:21 conf
[gp-master] drwxr-xr-x 4 gpadmin gpadmin 4096 May 20 02:20 docs
[gp-master] drwxr-xr-x 2 gpadmin gpadmin 4096 May 20 02:20 etc
[gp-master] drwxr-xr-x 3 gpadmin gpadmin 4096 May 20 02:20 ext
[gp-master] -rw-r--r-- 1 gpadmin gpadmin 745 May 30 12:14 greenplum_path.sh
[gp-master] drwxr-xr-x 6 gpadmin gpadmin 4096 May 20 02:20 include
[gp-master] drwxr-xr-x 7 gpadmin gpadmin 4096 May 20 02:20 lib
[gp-master] drwxr-xr-x 2 gpadmin gpadmin 4096 May 20 02:27 sbin
[gp-master] drwxr-xr-x 4 gpadmin gpadmin 4096 May 20 02:16 share
[ gp-sdw2] ls -l /opt/greenplum/greenplum-db/.
[ gp-sdw2] total 40
[ gp-sdw2] drwxr-xr-x 7 gpadmin gpadmin 4096 May 20 02:29 bin
[ gp-sdw2] drwxrwxr-x 2 gpadmin gpadmin 4096 May 30 12:21 conf
[ gp-sdw2] drwxr-xr-x 4 gpadmin gpadmin 4096 May 20 02:20 docs
[ gp-sdw2] drwxr-xr-x 2 gpadmin gpadmin 4096 May 20 02:20 etc
[ gp-sdw2] drwxr-xr-x 3 gpadmin gpadmin 4096 May 20 02:20 ext
[ gp-sdw2] -rw-r--r-- 1 gpadmin gpadmin 745 May 30 12:14 greenplum_path.sh
[ gp-sdw2] drwxr-xr-x 6 gpadmin gpadmin 4096 May 20 02:20 include
[ gp-sdw2] drwxr-xr-x 7 gpadmin gpadmin 4096 May 20 02:20 lib
[ gp-sdw2] drwxr-xr-x 2 gpadmin gpadmin 4096 May 20 02:27 sbin
[ gp-sdw2] drwxr-xr-x 4 gpadmin gpadmin 4096 May 20 02:16 share
[ gp-sdw1] ls -l /opt/greenplum/greenplum-db/.
[ gp-sdw1] total 40
[ gp-sdw1] drwxr-xr-x 7 gpadmin gpadmin 4096 May 20 02:29 bin
[ gp-sdw1] drwxrwxr-x 2 gpadmin gpadmin 4096 May 30 12:21 conf
[ gp-sdw1] drwxr-xr-x 4 gpadmin gpadmin 4096 May 20 02:20 docs
[ gp-sdw1] drwxr-xr-x 2 gpadmin gpadmin 4096 May 20 02:20 etc
[ gp-sdw1] drwxr-xr-x 3 gpadmin gpadmin 4096 May 20 02:20 ext
[ gp-sdw1] -rw-r--r-- 1 gpadmin gpadmin 745 May 30 12:14 greenplum_path.sh
[ gp-sdw1] drwxr-xr-x 6 gpadmin gpadmin 4096 May 20 02:20 include
[ gp-sdw1] drwxr-xr-x 7 gpadmin gpadmin 4096 May 20 02:20 lib
[ gp-sdw1] drwxr-xr-x 2 gpadmin gpadmin 4096 May 20 02:27 sbin
[ gp-sdw1] drwxr-xr-x 4 gpadmin gpadmin 4096 May 20 02:16 share
[ gp-sdw3] ls -l /opt/greenplum/greenplum-db/.
[ gp-sdw3] total 40
[ gp-sdw3] drwxr-xr-x 7 gpadmin gpadmin 4096 May 20 02:29 bin
[ gp-sdw3] drwxrwxr-x 2 gpadmin gpadmin 4096 May 30 12:21 conf
[ gp-sdw3] drwxr-xr-x 4 gpadmin gpadmin 4096 May 20 02:20 docs
[ gp-sdw3] drwxr-xr-x 2 gpadmin gpadmin 4096 May 20 02:20 etc
[ gp-sdw3] drwxr-xr-x 3 gpadmin gpadmin 4096 May 20 02:20 ext
[ gp-sdw3] -rw-r--r-- 1 gpadmin gpadmin 745 May 30 12:14 greenplum_path.sh
[ gp-sdw3] drwxr-xr-x 6 gpadmin gpadmin 4096 May 20 02:20 include
[ gp-sdw3] drwxr-xr-x 7 gpadmin gpadmin 4096 May 20 02:20 lib
[ gp-sdw3] drwxr-xr-x 2 gpadmin gpadmin 4096 May 20 02:27 sbin
[ gp-sdw3] drwxr-xr-x 4 gpadmin gpadmin 4096 May 20 02:16 share

Create the data storage directory on all nodes:

su - gpadmin
source /opt/greenplum/greenplum-db/greenplum_path.sh
/opt/greenplum/greenplum-db/bin/gpssh -f /opt/greenplum/greenplum-db/conf/hostlist -e 'mkdir -p /opt/greenplum/data'

Create the master data storage area on the master:

su - gpadmin
source /opt/greenplum/greenplum-db/greenplum_path.sh
/opt/greenplum/greenplum-db/bin/gpssh -h gp-master -e 'mkdir -p /opt/greenplum/data/master'

Create the data storage areas on the segment nodes:

su - gpadmin
source /opt/greenplum/greenplum-db/greenplum_path.sh
/opt/greenplum/greenplum-db/bin/gpssh -f /opt/greenplum/greenplum-db/conf/seg_hosts -e 'mkdir -p /opt/greenplum/data/primary && mkdir -p /opt/greenplum/data/mirror'

3.5 Environment Variables

gpssh -f /opt/greenplum/greenplum-db/conf/hostlist -e -v "cat >> /home/gpadmin/.bash_profile <<EOF

source /opt/greenplum/greenplum-db/greenplum_path.sh
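# gpinitsystem creates the master instance in gpseg-1 under MASTER_DIRECTORY when the default SEG_PREFIX is used
export MASTER_DATA_DIRECTORY=/opt/greenplum/data/master/gpseg-1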
export GPPORT=5432
export PGDATABASE=gp_sydb
EOF"

3.6 NTP Configuration

Enable NTP on the master node, then configure and enable NTP on the segment nodes:

echo "server gp-master prefer" >>/etc/ntp.conf
/opt/greenplum/greenplum-db/bin/gpssh -f /opt/greenplum/greenplum-db/conf/hostlist -v -e 'sudo ntpd'
/opt/greenplum/greenplum-db/bin/gpssh -f /opt/greenplum/greenplum-db/conf/hostlist -v -e 'sudo /etc/init.d/ntpd start && sudo chkconfig --level 35 ntpd on'

4 Initialize Greenplum DB

4.1 Pre-Initialization Checks

Check the hostname configuration:

su gpadmin
source /opt/greenplum/greenplum-db/greenplum_path.sh
gpssh -f /opt/greenplum/greenplum-db/conf/hostlist -e hostname
[ gp-sdw3] hostname
[ gp-sdw3] gp-sdw3
[ gp-sdw1] hostname
[ gp-sdw1] gp-sdw1
[gp-master] hostname
[gp-master] gp-master
[ gp-sdw2] hostname
[ gp-sdw2] gp-sdw2

Check disk I/O and network performance across the nodes:

gpcheckperf -h gp-sdw1 -h gp-sdw2 -d /tmp -r d -D -v
gpcheckperf -f /opt/greenplum/greenplum-db/conf/hostlist -d /tmp -r d -D -v
$ gpcheckperf -f /opt/greenplum/greenplum-db/conf/hostlist -r N -d /tmp
/opt/greenplum/greenplum-db/./bin/gpcheckperf -f /opt/greenplum/greenplum-db/conf/hostlist -r N -d /tmp

-------------------
-- NETPERF TEST
-------------------

====================
== RESULT
====================
Netperf bisection bandwidth test
gp-master -> gp-sdw1 = 72.220000
gp-sdw2 -> gp-sdw3 = 21.470000
gp-sdw1 -> gp-master = 43.510000
gp-sdw3 -> gp-sdw2 = 44.200000

Summary:
sum = 181.40 MB/sec
min = 21.47 MB/sec
max = 72.22 MB/sec
avg = 45.35 MB/sec
median = 44.20 MB/sec

[Warning] connection between gp-sdw2 and gp-sdw3 is no good
[Warning] connection between gp-sdw1 and gp-master is no good
[Warning] connection between gp-sdw3 and gp-sdw2 is no good
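
The Summary figures follow directly from the per-pair results. As a small sketch, the awk script below recomputes sum, min, max and avg from lines of the form host -> host = value. The name summarize_bandwidth is chosen for this illustration, and the median is left out to keep it short.

```shell
#!/usr/bin/env bash
# summarize_bandwidth [FILE...]
# Read gpcheckperf-style "a -> b = value" lines (from FILE or stdin)
# and print sum, min, max and average in MB/sec.
summarize_bandwidth() {
    awk -F' = ' '
        / -> / {
            v = $2 + 0                       # numeric bandwidth value
            if (n == 0 || v < min) min = v   # track the slowest pair
            if (n == 0 || v > max) max = v   # track the fastest pair
            sum += v
            n++
        }
        END {
            if (n > 0)
                printf "sum = %.2f MB/sec\nmin = %.2f MB/sec\nmax = %.2f MB/sec\navg = %.2f MB/sec\n", sum, min, max, sum / n
        }
    ' "$@"
}
```

Fed the four results above, it reproduces sum = 181.40, min = 21.47, max = 72.22 and avg = 45.35 (72.22 + 21.47 + 43.51 + 44.20 = 181.40, divided by 4).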

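4.2 Initialization

The Greenplum initialization configuration templates are in /opt/greenplum/greenplum-db/docs/cli_help/gpconfigs. gpinitsystem_config is the template for initializing Greenplum, and its mirror segment settings are all commented out. Make a copy and edit it: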

cd /opt/greenplum/greenplum-db/docs/cli_help/gpconfigs
cp gpinitsystem_config initgp_config
vi initgp_config

declare -a DATA_DIRECTORY=(/opt/greenplum/data/primary /opt/greenplum/data/primary /opt/greenplum/data/primary)
MASTER_HOSTNAME=gp-master
MASTER_DIRECTORY=/opt/greenplum/data/master
declare -a MIRROR_DATA_DIRECTORY=(/opt/greenplum/data/mirror /opt/greenplum/data/mirror /opt/greenplum/data/mirror)
DATABASE_NAME=gp_sydb
MACHINE_LIST_FILE=/opt/greenplum/greenplum-db/conf/seg_hosts
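
gpinitsystem expects one mirror directory per primary directory, so the two arrays should have the same length. Since the lines above are plain bash assignments, a quick sanity check can source the file and compare the array sizes. This is only a sketch; check_initgp_config is a hypothetical helper, not a Greenplum utility.

```shell
#!/usr/bin/env bash
# check_initgp_config FILE
# Print the number of primary and mirror directories declared in FILE and
# return non-zero if the counts differ (or 2 if FILE cannot be sourced).
check_initgp_config() {
    # Run in a child bash so the sourced variables do not leak into the caller.
    bash -c '
        source "$1" || exit 2
        primaries=${#DATA_DIRECTORY[@]}
        mirrors=${#MIRROR_DATA_DIRECTORY[@]}
        echo "primaries=$primaries mirrors=$mirrors"
        [ "$primaries" -eq "$mirrors" ]
    ' _ "$1"
}
```

For the settings shown here, check_initgp_config ./initgp_config should report three primaries and three mirrors per segment host.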

Run the initialization:

gpinitsystem -c initgp_config -S

If initialization fails, delete the data directories and run the initialization again.
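
A failed run can leave partially created instance directories behind. As a hedged sketch, the function below empties the master, primary and mirror directories under the data root used in this guide while keeping the directories themselves. clean_data_dirs is a hypothetical helper; it acts on the local host only, so on a cluster it would have to run on every node, for example through gpssh.

```shell
#!/usr/bin/env bash
# clean_data_dirs [BASE]
# Remove everything inside BASE/master, BASE/primary and BASE/mirror,
# keeping the three directories. BASE defaults to /opt/greenplum/data.
clean_data_dirs() {
    local base="${1:-/opt/greenplum/data}" sub
    for sub in master primary mirror; do
        if [ -d "$base/$sub" ]; then
            # Delete only the direct children, i.e. the leftover instance directories.
            find "$base/$sub" -mindepth 1 -maxdepth 1 -exec rm -rf {} +
            echo "cleaned $base/$sub"
        fi
    done
}
```

Stop any half-started Greenplum processes before calling it, and double-check the base path first, since the deletion cannot be undone.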

5 Follow-Up Operations

5.1 Stop and Start the Cluster

gpstop -a
gpstart -a

5.2 Log In to the Database

$ psql -d postgres

postgres=# \l # list databases
                List of databases
   Name    |  Owner  | Encoding |  Access privileges
-----------+---------+----------+---------------------
 gp_sydb   | gpadmin | UTF8     |
 postgres  | gpadmin | UTF8     |
 template0 | gpadmin | UTF8     | =c/gpadmin
                                : gpadmin=CTc/gpadmin
 template1 | gpadmin | UTF8     | =c/gpadmin
                                : gpadmin=CTc/gpadmin
(4 rows)

postgres=# \dt # list tables

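5.3 Cluster Status

gpstate -e # show the status of the mirrors
gpstate -f # show the status of the standby master
gpstate -s # show the status of the whole Greenplum cluster
gpstate -i # show the Greenplum version
gpstate --help # help text listing more gpstate options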