大数据之Apache Beam 使用Flink Runner管道参数

时间:2020-12-10 15:37:38

例子

Flink集群

mvn package exec:java -Dexec.mainClass=org.apache.beam.examples.WordCount -Dexec.args="--runner=FlinkRunner --flinkMaster=<flink master> --filesToStage=target/word-count-beam-bundled-0.1.jar --inputFile=/path/to/quickstart/pom.xml --output=/tmp/counts" -Pflink-runner

管道参数说明


Field Description Default Value
runner The pipeline runner to use. This option allows you to determine the pipeline runner at runtime. Set to FlinkRunnerto run using Flink.
streaming Whether streaming mode is enabled or disabled; true if enabled. Set to true if running pipelines with unbounded PCollections. false
flinkMaster The url of the Flink JobManager on which to execute pipelines. This can either be the address of a cluster JobManager, in the form "host:port" or one of the special Strings "[local]" or "[auto]""[local]" will start a local Flink Cluster in the JVM while "[auto]" will let the system decide where to execute the pipeline based on the environment. [auto]
filesToStage Jar Files to send to all workers and put on the classpath. Here you have to put the fat jar that contains your program along with all dependencies. empty
parallelism The degree of parallelism to be used when distributing operations onto workers. 1
checkpointingInterval The interval between consecutive checkpoints (i.e. snapshots of the current pipeline state used for fault tolerance). -1L, i.e. disabled
numberOfExecutionRetries Sets the number of times that failed tasks are re-executed. A value of 0 effectively disables fault tolerance. A value of -1 indicates that the system default value (as defined in the configuration) should be used. -1
executionRetryDelay Sets the delay between executions. A value of -1 indicates that the default value should be used. -1
stateBackend Sets the state backend to use in streaming mode. The default is to read this setting from the Flink config. empty, i.e. read from Flink config



本文出自 “无法言喻” 博客,请务必保留此出处http://limeixiong.blog.51cto.com/1888920/1975192