Automated deployment of Kettle files (run from shell scripts): passing command-line parameters

Time: 2023-03-09 04:31:47

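A shell script calls Kitchen to run job files (.kjb) and Pan to run transformation files (.ktr). Both launchers come in two variants: .bat for Windows and .sh for Linux/Unix.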
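
As a rough sketch of this split, a wrapper could pick the launcher from the file extension. The function name `kettle_launcher` is hypothetical; only `kitchen.sh` and `pan.sh` come from Kettle itself.

```shell
#!/bin/sh
# Hypothetical helper: print the launcher script that matches a Kettle file.
# .kjb (job) -> kitchen.sh, .ktr (transformation) -> pan.sh
kettle_launcher() {
    case "$1" in
        *.kjb) echo "kitchen.sh" ;;
        *.ktr) echo "pan.sh" ;;
        *)     echo "unknown file type: $1" >&2; return 1 ;;
    esac
}

kettle_launcher playdata_etl_day.kjb   # prints kitchen.sh
kettle_launcher foo.ktr                # prints pan.sh
```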

A simple example

Shell script:

export JAVA_HOME=/usr/local/java/jdk
export PATH=$JAVA_HOME/bin:$PATH
export CLASSPATH=.:$JAVA_HOME/lib/dt.jar:$JAVA_HOME/lib/tools.jar:$JAVA_HOME/lib/mysql-connector-java-5.1.18-bin.jar
export KETTLE_HOME=/home/www/allyes/a3tracker/bi/etl/kettle/kh_cloud/
export LC_ALL=en_US.UTF-8
echo "KETTLE_HOME=$KETTLE_HOME"
echo "starting..."
# Date to process, plus a timestamp that makes the log file name unique
yesterday='2014-02-24'
vardate=$(date +%Y%m%d%H%M%S)
yesterdayid=$(date -d "$yesterday" +%Y%m%d)
# Run the job with the named parameter Yesterday and write the output to a log file
/home/www/allyes/a3tracker/bi/etl/kettle/data-integration/kitchen.sh -param:Yesterday="$yesterday" -file /home/www/allyes/a3tracker/bi/etl/kettle/etlscript/playdata_etl_day.kjb > /home/www/allyes/a3tracker/bi/etl/kettle/logs/a3tracker_cloud_etl_"$yesterdayid"_"$vardate".txt

The complete script

#!/bin/sh

# Return 0 if $1 is a valid date in yyyy/mm/dd format, 1 otherwise.
check_date()
{
    [ $# -ne 1 ] && return 1
    # A yyyy/mm/dd date is exactly 10 characters long
    [ "${#1}" -ne 10 ] && return 1
    # GNU date must parse the input and print back the same string
    [ "$(date -d "$1" "+%Y/%m/%d" 2>/dev/null)" = "$1" ]
}

vardate=$(date +%Y%m%d%H%M%S)
echo "today is $(date +%Y/%m/%d)"

# Default to yesterday; override with -d yyyy/mm/dd
yesterday=$(date -d "yesterday" +%Y/%m/%d)

while [ -n "$1" ]; do
  case $1 in
    -d)
       shift
       yesterday=$1
       echo "your input is $yesterday"
       shift;;
     *)
       echo "$1 is not a valid parameter"
       break;;
  esac
done

if ! check_date "$yesterday"; then
    echo "date format error! date format:(<yyyy/mm/dd>)"
    exit 1
fi

echo "Data aggregation date : $yesterday"

export JAVA_HOME=/usr/local/java/jdk
export PATH=$JAVA_HOME/bin:$PATH
export CLASSPATH=.:$JAVA_HOME/lib/dt.jar:$JAVA_HOME/lib/tools.jar:$JAVA_HOME/lib/mysql-connector-java-5.1.18-bin.jar
export KETTLE_HOME=/home/www/allyes/a3tracker/bi/etl/kettle/kh_cloud/
export LC_ALL=en_US.UTF-8

echo "KETTLE_HOME=$KETTLE_HOME"
echo "starting..."

# Compact date (yyyymmdd) used in the log file name
yesterdayid=$(date -d "$yesterday" +%Y%m%d)

# Run the job with the named parameter Yesterday and write the output to a log file
/home/www/allyes/a3tracker/bi/etl/kettle/data-integration/kitchen.sh -param:Yesterday="$yesterday" -file /home/www/allyes/a3tracker/bi/etl/kettle/etlscript/playdata_etl_day.kjb > /home/www/allyes/a3tracker/bi/etl/kettle/logs/a3tracker_cloud_etl_"$yesterdayid"_"$vardate".txt

echo "done!"
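
Note that the script prints "done!" whether or not the job succeeded. Below is a minimal sketch of checking the result, assuming Kitchen exits with status 0 on success and non-zero on failure. `run_and_report` is a hypothetical wrapper, and `true`/`false` stand in for the real `kitchen.sh` command line so the example runs anywhere.

```shell
#!/bin/sh
# Hypothetical wrapper: run a command, report its exit status, and pass it on.
run_and_report() {
    "$@"
    status=$?
    if [ "$status" -eq 0 ]; then
        echo "done!"
    else
        echo "job failed with exit status $status" >&2
    fi
    return "$status"
}

# Stand-ins for: run_and_report /path/to/kitchen.sh -file ... -param:Yesterday="$yesterday"
run_and_report true
run_and_report false || echo "handle the failure here (alert, retry, exit 1)"
```

In a scheduled deployment, passing the status on lets cron or another scheduler notice a failed run instead of treating every run as successful.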

Passing command-line parameters:

Some articles that explain this:

http://blog.csdn.net/john_f_lau/article/details/9260863

http://forums.pentaho.com/showthread.php?54423-Passing-parameters-to-jobs-on-kitchen-command-line

http://wiki.pentaho.com/display/EAI/Named+Parameters

http://wiki.pentaho.com/display/EAI/Kitchen+User+Documentation

http://blog.csdn.net/qqzyb/article/details/8939517

http://blog.sina.com.cn/s/blog_543e73a80100k0vz.html

http://www.cnblogs.com/wxjnew/p/3620792.html

Two examples that pass multiple parameters:

/home/www/allyes/aso/kettle/kitchen.sh -file /home/www/allyes/aso/etl/test.kjb -param:os='1' -param:appstore='all' -param:dt='2014-02-24' >/home/www/allyes/aso/etl/log.txt 2>/home/www/allyes/aso/etl/error.txt

/home/www/allyes/aso/kettle/kitchen.sh -file /home/www/allyes/aso/etl/test.kjb -param:os=1 -param:appstore='all' -param:dt='2014-02-24' -level=Detailed >/home/www/allyes/aso/etl/log.txt
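
When the parameter list grows, building the `-param:` arguments in one place can keep the command readable. This is a small sketch, where `param_args` is a hypothetical helper and `echo` stands in for the real `kitchen.sh` call:

```shell
#!/bin/sh
# Hypothetical helper: prefix each NAME=VALUE pair with -param:
param_args() {
    for pair in "$@"; do
        printf '%s\n' "-param:$pair"
    done
}

# Values without spaces can be expanded unquoted into the command line.
# echo is a stand-in for /home/www/allyes/aso/kettle/kitchen.sh
echo kitchen.sh -file /home/www/allyes/aso/etl/test.kjb $(param_args os=1 appstore=all dt=2014-02-24)
```

The unquoted expansion relies on word splitting, so this form only suits values that contain no whitespace or glob characters.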

On the command line, an option name may be followed by "=", ":" or a space; all three work. For example: kitchen.bat /file d:\ or -file=D:\ or /file:D:\

kitchen.bat /norep -file=D:/kettledata/mysal2orcle.kjb >> kitchen_%date:~0,10%.log

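For a passed parameter to take effect, it must first be declared as a parameter in the transformation's settings. Its value can then be read with the Get Variables step.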

http://wiki.pentaho.com/display/EAI/Named+Parameters

http://type-exit.org/adventures-with-open-source-bi/2010/07/using-named-parameters-in-kettle/

There are two formats (on Linux the double quotes can be omitted; Windows requires each parameter to be wrapped in double quotes):

1:kitchen /file:"MyJob.kjb" /param:ServerName=MyServer

Multiple parameters:

 Linux: ./kitchen.sh -file:job.kjb -param:files.dir=/opt/files -param:max.date=2010-06-02

 Windows: Kitchen.bat -file:job.kjb "-param:files.dir=/opt/files" "-param:max.date=2010-06-02"

2:kitchen /file:"your job name.kjb" "command line argument 1" "command line argument 2" "command line argument 3"....

The -listparam option lists the parameters defined in a job or transformation, for example:

sh pan.sh -file:/tmp/foo.ktr -listparam

Parameter: MASTER_HOST=, default=localhost : The master slave server hostname to connect to

Parameter: MASTER_PORT=, default=8080 : The master slave server HTTP control port

Values for these parameters can then be supplied on the command line:

user@host:$ sh pan.sh -file:/tmp/foo.ktr -param:MASTER_HOST=192.168.1.3 -param:MASTER_PORT=8181

Windows requires you to use quotes around the parameter otherwise the equals sign is treated as a space by the command interpreter:

c:\> Pan.bat -file:/tmp/foo.ktr "-param:MASTER_HOST=192.168.1.3" "-param:MASTER_PORT=8181"

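Logging options and their settings:

-level sets the logging level. (In the GUI, the log panel has three small icons at its top-left corner; the last one, a wrench and hammer, sets the level.)

Rowlevel: prints every log message available in Kettle, including row-level detail from the many steps of a complex run.

Debugging: produces a large amount of log output, mainly for debugging, but not at row level.

Detailed: shows more information than the Basic level; the extra output includes SQL queries and generated DDL statements.

Basic: the default logging level; prints only the messages that report on steps or job entries.

Minimal: reports only information at the level of the job or transformation as a whole.

Error logging only: shows the error message if an error occurs; otherwise shows nothing.

Nothing at all: produces no log output, even when errors occur.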
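
On the command line, these levels are selected with `-level`. The sketch below maps the names used above to command-line values, assuming the spellings `Rowlevel`, `Debug`, `Detailed`, `Basic`, `Minimal`, `Error` and `Nothing` from the Kitchen documentation. `cli_level` is a hypothetical helper.

```shell
#!/bin/sh
# Hypothetical helper: map a GUI level name to the value passed to -level.
# Assumed CLI spellings: Rowlevel, Debug, Detailed, Basic, Minimal, Error, Nothing
cli_level() {
    case "$1" in
        "Rowlevel")           echo "Rowlevel" ;;
        "Debugging")          echo "Debug" ;;
        "Detailed")           echo "Detailed" ;;
        "Basic")              echo "Basic" ;;
        "Minimal")            echo "Minimal" ;;
        "Error logging only") echo "Error" ;;
        "Nothing at all")     echo "Nothing" ;;
        *) echo "unknown logging level: $1" >&2; return 1 ;;
    esac
}

echo "-level=$(cli_level "Error logging only")"   # prints -level=Error
```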