DolphinScheduler Documentation

Time: 2024-02-16 10:37:34

 

Cluster Mode

 

1. Basic software installation (required items; please install them yourself)

1. MySQL: requires JDBC Driver 5.1.47+

2. JDK (1.8+)

3. ZooKeeper (3.4.6+)

4. Hadoop (2.6+)

 

2. Download the binary tar.gz package

 

Download URL:

https://dolphinscheduler.apache.org/zh-cn/download/download.html

 

mkdir -p /opt/dolphinscheduler
tar -zxvf apache-dolphinscheduler-incubating-1.3.5-dolphinscheduler-bin.tar.gz -C /opt/dolphinscheduler

 

mv apache-dolphinscheduler-incubating-1.3.5-dolphinscheduler-bin dolphinscheduler-bin

 

3. Installation

 

1. Database initialization

mysql> CREATE DATABASE dolphinscheduler DEFAULT CHARACTER SET utf8 DEFAULT COLLATE utf8_general_ci;

mysql> GRANT ALL PRIVILEGES ON dolphinscheduler.* TO '{user}'@'%' IDENTIFIED BY '{password}';

mysql> GRANT ALL PRIVILEGES ON dolphinscheduler.* TO '{user}'@'localhost' IDENTIFIED BY '{password}';

mysql> flush privileges;

 

 

2. Modify the configuration file conf/datasource.properties

spring.datasource.driver-class-name=com.mysql.jdbc.Driver

spring.datasource.url=jdbc:mysql://192.168.18.12:3306/dolphinscheduler?characterEncoding=UTF-8&allowMultiQueries=true

spring.datasource.username=root

spring.datasource.password=123qwe!@#QWE

 

 

 

Run the command: sh script/create-dolphinscheduler.sh

Then check whether the tables were created in the dolphinscheduler database in MySQL.
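If the script succeeded, the dolphinscheduler schema should now contain tables. A hedged way to check from the shell (host, user, and password are the values used elsewhere in this document; adjust to your environment):

```shell
# Count the tables created in the dolphinscheduler schema.
# Host/user/password are the values assumed in this document; change them for your setup.
tables=$(mysql -h192.168.18.12 -uroot -p'123qwe!@#QWE' -N -e \
  "SELECT COUNT(*) FROM information_schema.tables WHERE table_schema='dolphinscheduler';" 2>/dev/null)
echo "dolphinscheduler table count: ${tables:-unknown (is MySQL reachable?)}"
```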

 

 

 

 

 

3. Modify the configuration file dolphinscheduler_env.sh

 

 

 

export HADOOP_HOME=/usr/local/hadoop/hadoop-2.7.7

export HADOOP_CONF_DIR=/usr/local/hadoop/hadoop-2.7.7/etc/hadoop

export SPARK_HOME1=/usr/local/spark/spark-2.4.3-bin-hadoop2.7

export SPARK_HOME2=/opt/soft/spark2

export PYTHON_HOME=/usr/local/python3

export JAVA_HOME=/usr/local/src/jdk1.8.0_261

export HIVE_HOME=/usr/local/hive/apache-hive-2.3.7-bin

#export FLINK_HOME=/opt/soft/flink

#export DATAX_HOME=/opt/soft/datax/bin/datax.py

 

export PATH=$HADOOP_HOME/bin:$SPARK_HOME1/bin:$SPARK_HOME2/bin:$PYTHON_HOME:$JAVA_HOME/bin:$HIVE_HOME/bin:$FLINK_HOME/bin:$DATAX_HOME:$PATH
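Before deploying, it can be worth sourcing dolphinscheduler_env.sh and confirming each configured directory actually exists; a sketch (the variable list mirrors the exports above, and any missing entry will break the corresponding task type at runtime):

```shell
# After sourcing dolphinscheduler_env.sh, verify each configured path points at a real directory.
for var in JAVA_HOME HADOOP_HOME HADOOP_CONF_DIR SPARK_HOME1 SPARK_HOME2 PYTHON_HOME HIVE_HOME; do
  dir="${!var}"                       # indirect expansion: the value of the variable named in $var
  if [ -n "$dir" ] && [ -d "$dir" ]; then
    echo "$var -> $dir"
  else
    echo "$var is not set or does not exist"
  fi
done
```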

 

 

 

 

4. Modify the configuration file conf/config/install_config.conf

 

dbtype="mysql"

 

# db config

# db address and port

dbhost="192.168.18.12:3306"

 

# db username

username="root"

 

# database name

dbname="dolphinscheduler"

 

# db password

# NOTICE: if there are special characters, please use the \ to escape, for example, `[` escape to `\[`

password="123qwe!@#QWE"
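To illustrate the escaping note above with a hypothetical password containing `[` (the install script substitutes these values with sed, which is why the character must be escaped):

```shell
# Hypothetical example: a password containing '[' must be written with '\[' in install_config.conf.
raw='my[secret]pwd'
escaped=${raw//\[/\\[}    # replace every '[' with '\['
echo "$escaped"
```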

 

# zk cluster

zkQuorum="192.168.18.12:2181,192.168.18.14:2181,192.168.18.15:2181"
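The value is a comma-separated host:port list; each entry can be split out like this, e.g. to probe every node with `echo ruok | nc host port` (sketch using the addresses above):

```shell
# Split zkQuorum into individual ZooKeeper nodes.
zkQuorum="192.168.18.12:2181,192.168.18.14:2181,192.168.18.15:2181"
for node in ${zkQuorum//,/ }; do
  host=${node%%:*}
  port=${node##*:}
  echo "zk node $host on port $port"
done
```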

 

# Note: the target installation path for dolphinscheduler, please not config as the same as the current path (pwd)

installPath="/usr/local/dolphinscheduler"

 

# deployment user

# Note: the deployment user needs to have sudo privileges and permissions to operate hdfs. If hdfs is enabled, the root directory needs to be created by itself

deployUser="root"

 

 

# alert config

# mail server host

mailServerHost="smtp.mxhichina.com"

 

# mail server port

# note: Different protocols and encryption methods correspond to different ports, when SSL/TLS is enabled, make sure the port is correct.

mailServerPort="465"

 

# sender

mailSender="dataservice@gongsibao.com"

 

# user

mailUser="xyxu@gongsibao.com"

 

# sender password

# note: The mail.passwd is email service authorization code, not the email login password.

mailPassword="Kefu2017"

 

# TLS mail protocol support

starttlsEnable="true"

 

# SSL mail protocol support

# only one of TLS and SSL can be in the true state.

sslEnable="false"

 

#note: sslTrust is the same as mailServerHost

sslTrust="smtp.mxhichina.com"

# resource storage type: HDFS, S3, NONE

resourceStorageType="HDFS"

 

# if resourceStorageType is HDFS,defaultFS write namenode address,HA you need to put core-site.xml and hdfs-site.xml in the conf directory.

# if S3,write S3 address,HA,for example :s3a://dolphinscheduler,

# Note,s3 be sure to create the root directory /dolphinscheduler

defaultFS="hdfs://master:9000"

 

# if resourceStorageType is S3, the following three configuration is required, otherwise please ignore

#s3Endpoint="http://192.168.18.12:9010"

#s3AccessKey="xxxxxxxxxx"

#s3SecretKey="xxxxxxxxxx"

 

# if resourcemanager HA is enabled, list the HA ips here; if resourcemanager is single, leave this value empty

yarnHaIps="192.168.18.12"

 

# if resourcemanager HA is enabled, or you are not using resourcemanager, skip this setting; if resourcemanager is single, replace yarnIp1 with the actual resourcemanager hostname.

singleYarnIp="yarnIp1"

 

# resource store on HDFS/S3 path; resource files will be stored under this path. Configure it yourself and make sure the directory exists on hdfs with read/write permissions. /dolphinscheduler is recommended

resourceUploadPath="/dolphinscheduler"
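A hedged sketch for making sure this root exists on HDFS ahead of time (requires the hdfs client on PATH and a reachable cluster; the chown target is the deployUser assumed in this document):

```shell
# Ensure the resource upload root exists on HDFS; skip gracefully if no hdfs client is available.
resourceUploadPath="/dolphinscheduler"
if command -v hdfs >/dev/null 2>&1; then
  hdfs dfs -mkdir -p "$resourceUploadPath"
  hdfs dfs -chown -R root "$resourceUploadPath"   # deployUser from this document; adjust as needed
  echo "ensured $resourceUploadPath on HDFS"
else
  echo "hdfs client not found; skipping HDFS check"
fi
```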

 

# who have permissions to create directory under HDFS/S3 root path

# Note: if kerberos is enabled, please config hdfsRootUser=

                        

# kerberos config

# whether kerberos starts, if kerberos starts, following four items need to config, otherwise please ignore

kerberosStartUp="false"

# kdc krb5 config file path

krb5ConfPath="$installPath/conf/krb5.conf"

# keytab username

keytabUserName="hdfs-mycluster@ESZ.COM"

# username keytab path

keytabPath="$installPath/conf/hdfs.headless.keytab"

 

 

# api server port

apiServerPort="12345"

 

 

# install hosts

# Note: list of hostnames where DolphinScheduler will be installed; for pseudo-distributed mode, just write the single hostname

ips="master,slave1,slave2"

 

# ssh port, default 22

# Note: if ssh port is not default, modify here

sshPort="22"
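Before running the installer, every host in ips must be reachable from the deploy node via passwordless SSH on sshPort. A dry-run sketch of that check (it prints the commands rather than executing them; drop the echo to actually probe each host):

```shell
# Print the ssh connectivity checks the deploy user should be able to run without a password prompt.
ips="master,slave1,slave2"
sshPort="22"
for host in ${ips//,/ }; do
  echo "ssh -p $sshPort -o BatchMode=yes $host hostname"
done
```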

 

# run master machine

# Note: list of hosts hostname for deploying master

masters="master"

# run worker machine

# note: need to write the worker group name of each worker, the default value is "default"

workers="slave1:default,slave2:default"

 

# run alert machine

# note: list of machine hostnames for deploying alert server

alertServer="master"

# run api machine

# note: list of machine hostnames for deploying api server

apiServers="master"

 

                                          

 

 

 

 

 

 

Running install_all.sh starts five processes, distributed across the nodes of the cluster:

 

MasterServer         ----- master service
WorkerServer         ----- worker service
LoggerServer         ----- logger service
ApiApplicationServer ----- api service
AlertServer          ----- alert service
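A quick way to confirm, on each node, which of the expected processes came up (jps ships with the JDK; which of the five appear depends on the node's role, per the masters/workers/alertServer/apiServers settings above):

```shell
# Report which of the expected DolphinScheduler JVM processes are running on this node.
expected="MasterServer WorkerServer LoggerServer ApiApplicationServer AlertServer"
running=$(jps 2>/dev/null | awk '{print $2}')
for svc in $expected; do
  if printf '%s\n' "$running" | grep -qx "$svc"; then
    echo "$svc is running"
  else
    echo "$svc not found on this node"
  fi
done
```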

 

 

 

Web UI: http://192.168.18.12:12345/dolphinscheduler

The initial username/password is admin/dolphinscheduler123