Cluster Mode
I. Basic software installation (install the required items yourself)
1. MySQL (requires JDBC Driver 5.1.47+)
2. JDK (1.8+)
3. ZooKeeper (3.4.6+)
4. Hadoop (2.6+)
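A quick way to confirm the prerequisites on each node, assuming the tools are already on PATH (the nc probe targets one of the ZooKeeper addresses configured later in this guide, and nc itself may need to be installed):
java -version                        # expect 1.8+
mysql --version
hadoop version                       # expect 2.6+
echo ruok | nc 192.168.18.12 2181    # a healthy ZooKeeper answers "imok"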
II. Download the binary tar.gz package
Download URL:
https://dolphinscheduler.apache.org/zh-cn/download/download.html
mkdir -p /opt/dolphinscheduler
tar -zxvf apache-dolphinscheduler-incubating-1.3.5-dolphinscheduler-bin.tar.gz -C /opt/dolphinscheduler
cd /opt/dolphinscheduler
mv apache-dolphinscheduler-incubating-1.3.5-dolphinscheduler-bin dolphinscheduler-bin
III. Installation
1. Initialize the database
mysql> CREATE DATABASE dolphinscheduler DEFAULT CHARACTER SET utf8 DEFAULT COLLATE utf8_general_ci;
mysql> GRANT ALL PRIVILEGES ON dolphinscheduler.* TO '{user}'@'%' IDENTIFIED BY '{password}';
mysql> GRANT ALL PRIVILEGES ON dolphinscheduler.* TO '{user}'@'localhost' IDENTIFIED BY '{password}';
mysql> flush privileges;
2. Edit the configuration file conf/datasource.properties
spring.datasource.driver-class-name=com.mysql.jdbc.Driver
spring.datasource.url=jdbc:mysql://192.168.18.12:3306/dolphinscheduler?characterEncoding=UTF-8&allowMultiQueries=true
spring.datasource.username=root
spring.datasource.password=123qwe!@#QWE
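Note that, for license reasons, the DolphinScheduler binary package does not bundle the MySQL JDBC driver; the connector jar (5.1.47+, per the prerequisites) needs to be copied into the package's lib directory by hand. A sketch, assuming the jar has already been downloaded to the current directory:
cp mysql-connector-java-5.1.47.jar /opt/dolphinscheduler/dolphinscheduler-bin/lib/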
Run the command: sh script/create-dolphinscheduler.sh
Then check that the tables were created in the dolphinscheduler database in MySQL.
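For example, from the shell (DolphinScheduler's schema uses the t_ds_ table prefix):
mysql -uroot -p -e "SHOW TABLES;" dolphinscheduler   # should list the t_ds_* tables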
Edit the configuration file dolphinscheduler_env.sh (located under conf/env/ in the binary package):
export HADOOP_HOME=/usr/local/hadoop/hadoop-2.7.7
export HADOOP_CONF_DIR=/usr/local/hadoop/hadoop-2.7.7/etc/hadoop
export SPARK_HOME1=/usr/local/spark/spark-2.4.3-bin-hadoop2.7
export SPARK_HOME2=/opt/soft/spark2
export PYTHON_HOME=/usr/local/python3
export JAVA_HOME=/usr/local/src/jdk1.8.0_261
export HIVE_HOME=/usr/local/hive/apache-hive-2.3.7-bin
#export FLINK_HOME=/opt/soft/flink
#export DATAX_HOME=/opt/soft/datax/bin/datax.py
export PATH=$HADOOP_HOME/bin:$SPARK_HOME1/bin:$SPARK_HOME2/bin:$PYTHON_HOME:$JAVA_HOME/bin:$HIVE_HOME/bin:$PATH:$FLINK_HOME/bin:$DATAX_HOME:$PATH
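A quick sanity check is to source the file and confirm each configured home directory actually exists on this machine; a minimal sketch (adjust the path if the file lives elsewhere):
source conf/env/dolphinscheduler_env.sh
for d in "$HADOOP_HOME" "$SPARK_HOME1" "$SPARK_HOME2" "$PYTHON_HOME" "$JAVA_HOME" "$HIVE_HOME"; do
    [ -d "$d" ] || echo "missing: $d"
done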
Edit the configuration file conf/config/install_config.conf
dbtype="mysql"
# db config
# db address and port
dbhost="192.168.18.12:3306"
# db username
username="root"
# database name
dbname="dolphinscheduler"
# db password
# NOTICE: if there are special characters, use \ to escape them; for example, `[` escapes to `\[`
password="123qwe!@#QWE"
# zk cluster
zkQuorum="192.168.18.12:2181,192.168.18.14:2181,192.168.18.15:2181"
# Note: the target installation path for dolphinscheduler; do not set it to the current path (pwd)
installPath="/usr/local/dolphinscheduler"
# deployment user
# Note: the deployment user needs sudo privileges and permission to operate HDFS; if HDFS is enabled, the root directory needs to be created manually
deployUser="root"
# alert config
# mail server host
mailServerHost="smtp.mxhichina.com"
# mail server port
# note: different protocols and encryption methods use different ports; when SSL/TLS is enabled, make sure the port is correct
mailServerPort="465"
# sender
mailSender="dataservice@gongsibao.com"
# user
mailUser="xyxu@gongsibao.com"
# sender password
# note: mailPassword is the email service authorization code, not the email login password
mailPassword="Kefu2017"
# TLS mail protocol support
starttlsEnable="true"
# SSL mail protocol support
# only one of TLS and SSL can be in the true state.
sslEnable="false"
# note: sslTrust is the same as mailServerHost
sslTrust="smtp.mxhichina.com"
# resource storage type: HDFS, S3, NONE
resourceStorageType="HDFS"
# if resourceStorageType is HDFS, set defaultFS to the namenode address; for HA, put core-site.xml and hdfs-site.xml in the conf directory
# if S3, write the S3 address, for example: s3a://dolphinscheduler
# Note: for S3, be sure to create the root directory /dolphinscheduler
defaultFS="hdfs://master:9000"
# if resourceStorageType is S3, the following three configuration is required, otherwise please ignore
#s3Endpoint="http://192.168.18.12:9010"
#s3AccessKey="xxxxxxxxxx"
#s3SecretKey="xxxxxxxxxx"
# if ResourceManager HA is enabled, list the HA IPs; if there is a single ResourceManager, leave this value empty
yarnHaIps="192.168.18.12"
# if ResourceManager HA is enabled or ResourceManager is not used, skip this setting; if there is a single ResourceManager, replace yarnIp1 with the actual ResourceManager hostname
singleYarnIp="yarnIp1"
# resource store path on HDFS/S3; resource files will be stored under this path. Make sure the directory exists on HDFS and has read/write permissions. /dolphinscheduler is recommended
resourceUploadPath="/dolphinscheduler"
# who has permission to create directories under the HDFS/S3 root path
# Note: if kerberos is enabled, please config hdfsRootUser=
hdfsRootUser="hdfs"
# kerberos config
# whether kerberos is enabled; if it is, the following four items need to be configured, otherwise ignore them
kerberosStartUp="false"
# kdc krb5 config file path
krb5ConfPath="$installPath/conf/krb5.conf"
# keytab username
keytabUserName="hdfs-mycluster@ESZ.COM"
# username keytab path
keytabPath="$installPath/conf/hdfs.headless.keytab"
# api server port
apiServerPort="12345"
# install hosts
# Note: list of hostnames to install DolphinScheduler on; for a pseudo-distributed setup, just write that single hostname
ips="master,slave1,slave2"
# ssh port, default 22
# Note: if ssh port is not default, modify here
sshPort="22"
# run master machine
# Note: list of hostnames for deploying the master
masters="master"
# run worker machine
# note: specify the worker group name for each worker; the default value is "default"
workers="slave1:default,slave2:default"
# run alert machine
# note: list of machine hostnames for deploying alert server
alertServer="master"
# run api machine
# note: list of machine hostnames for deploying api server
apiServers="master"
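One more prerequisite before running the installer: it distributes the package to every host in ips over SSH, so the deployment user (root here) needs passwordless SSH from the install machine to all of them. A typical setup, using the hostnames from the config above:
ssh-keygen -t rsa -P '' -f ~/.ssh/id_rsa   # skip if a key pair already exists
ssh-copy-id root@master
ssh-copy-id root@slave1
ssh-copy-id root@slave2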
Running sh install.sh will start five processes, distributed across the cluster's nodes (verify with jps, as sketched after the list):
MasterServer ----- master service
WorkerServer ----- worker service
LoggerServer ----- logger service
ApiApplicationServer ----- api service
AlertServer ----- alert service
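Given the masters/workers/alertServer/apiServers values above, and assuming the 1.3.x layout where LoggerServer runs alongside each worker, jps on each node should show:
# on master (masters / apiServers / alertServer all point here)
jps   # expect MasterServer, ApiApplicationServer, AlertServer
# on slave1 and slave2 (the workers)
jps   # expect WorkerServer, LoggerServer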
Web UI: http://192.168.18.12:12345/dolphinscheduler
The initial username/password is admin/dolphinscheduler123