I use Dataflow at work to write data into Bigtable.
I now have a task that needs to read rows from Bigtable.
However, whenever I try to read rows with bigtable-hbase-dataflow, the job fails with the following error:
Error: (3218070e4dd208d3): java.lang.IllegalArgumentException: b <= a
at org.apache.hadoop.hbase.util.Bytes.iterateOnSplits(Bytes.java:1720)
at org.apache.hadoop.hbase.util.Bytes.split(Bytes.java:1683)
at org.apache.hadoop.hbase.util.Bytes.split(Bytes.java:1664)
at com.google.cloud.bigtable.dataflow.CloudBigtableIO$AbstractSource.split(CloudBigtableIO.java:512)
at com.google.cloud.bigtable.dataflow.CloudBigtableIO$AbstractSource.getSplits(CloudBigtableIO.java:358)
at com.google.cloud.bigtable.dataflow.CloudBigtableIO$Source.splitIntoBundles(CloudBigtableIO.java:593)
at com.google.cloud.dataflow.sdk.runners.worker.WorkerCustomSources.performSplit(WorkerCustomSources.java:413)
at com.google.cloud.dataflow.sdk.runners.worker.WorkerCustomSources.performSplitWithApiLimit(WorkerCustomSources.java:171)
at com.google.cloud.dataflow.sdk.runners.worker.WorkerCustomSources.performSplit(WorkerCustomSources.java:149)
at com.google.cloud.dataflow.sdk.runners.worker.SourceOperationExecutor.execute(SourceOperationExecutor.java:58)
at com.google.cloud.dataflow.sdk.runners.worker.DataflowWorker.executeWork(DataflowWorker.java:288)
at com.google.cloud.dataflow.sdk.runners.worker.DataflowWorker.doWork(DataflowWorker.java:221)
at com.google.cloud.dataflow.sdk.runners.worker.DataflowWorker.getAndPerformWork(DataflowWorker.java:173)
at com.google.cloud.dataflow.sdk.runners.worker.DataflowWorkerHarness$WorkerThread.doWork(DataflowWorkerHarness.java:193)
at com.google.cloud.dataflow.sdk.runners.worker.DataflowWorkerHarness$WorkerThread.call(DataflowWorkerHarness.java:173)
at com.google.cloud.dataflow.sdk.runners.worker.DataflowWorkerHarness$WorkerThread.call(DataflowWorkerHarness.java:160)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
I am currently using 'com.google.cloud.dataflow:google-cloud-dataflow-java-sdk-all:1.6.0' and 'com.google.cloud.bigtable:bigtable-hbase-dataflow:0.9.0'.
Here is my code:
CloudBigtableScanConfiguration config = new CloudBigtableScanConfiguration.Builder()
    .withProjectId("project-id")
    .withInstanceId("instance-id")
    .withTableId("table")
    .build();

pipeline.apply(Read.<Result>from(CloudBigtableIO.read(config)))
    .apply(ParDo.of(new Test()));
FYI, the pipeline only reads from Bigtable and counts the rows with an aggregator in the Test DoFn:
static class Test extends DoFn<Result, Result> {
    private static final long serialVersionUID = 0L;

    private final Aggregator<Long, Long> rowCount = createAggregator("row_count", new Sum.SumLongFn());

    @Override
    public void processElement(ProcessContext c) {
        rowCount.addValue(1L);
        c.output(c.element());
    }
}
I just followed the tutorial in the Dataflow documentation, but it fails. Can anyone help me out?
1 Answer
The root cause was a dependency issue.
Previously, our build file omitted this dependency:
compile 'io.netty:netty-tcnative-boringssl-static:1.1.33.Fork22'
Today I added the dependency, and it resolved all the issues. I double-checked that the problem comes back when the dependency is missing from the build file.
From https://github.com/GoogleCloudPlatform/cloud-bigtable-client/issues/912#issuecomment-249999380.
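If you want to confirm that the dependency actually ends up on the runtime classpath, a small probe like the one below may help. This is only a sketch and not part of the original fix: the class name `ClasspathCheck` and the helper `isOnClasspath` are made up for illustration, and the default probe class `org.apache.tomcat.jni.SSL` is an assumption about where the 1.1.33.x netty-tcnative artifacts keep their JNI bindings, so pass a different class name if your version differs.

```java
// Hypothetical helper, not part of Dataflow or the Bigtable client.
final class ClasspathCheck {

    // Returns true if the named class can be found by this class's loader.
    // initialize=false avoids running static initializers (such as native
    // library loading) just to answer the question.
    static boolean isOnClasspath(String className) {
        try {
            Class.forName(className, false, ClasspathCheck.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // ASSUMPTION: netty-tcnative 1.1.33.x exposes its bindings in this class.
        String probe = args.length > 0 ? args[0] : "org.apache.tomcat.jni.SSL";
        System.out.println(probe + " on classpath: " + isOnClasspath(probe));
    }
}
```

Run it with the same classpath the pipeline uses. A `false` result for the probe class would suggest the jar is still missing from the build.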