使用Dataflow Java SDK 2从Pubsub读取

时间:2022-07-09 15:21:50

A lot of the documentation for the Google Cloud Platform for Java SDK 2.x tell you to reference Beam documentation.

Google Cloud Platform for Java SDK 2.x的许多文档都会告诉您参考Beam文档。

When reading from PubSub using Dataflow, should I still be doing PubsubIO.Read.named("name").topic("");

当使用Dataflow从PubSub读取时,我是否还应该使用PubsubIO.Read.named(“name”)。topic(“”);

Or should I be doing something else?

或者我应该做别的事吗?

Also building off of that, is there a way to just print PubSub data received by the Dataflow to standard output or to a file?

另外,有没有办法将Dataflow收到的PubSub数据打印到标准输出或文件?

2 个解决方案

#1


1  

For Apache Beam 2.2.0, you can define the following transform to pull messages from a Pub/Sub subscription:

对于Apache Beam 2.2.0,您可以定义以下转换以从Pub / Sub订阅中提取消息:

PubsubIO.readMessages().fromSubscription("subscription_name")

This is one way to define a transform that will pull messages from Pub/Sub. However, the PubsubIO class contains different methods for pulling messages. Each method has slightly different functionality. See the PubsubIO documentation.

这是定义将从Pub / Sub中提取消息的转换的一种方法。但是,PubsubIO类包含用于提取消息的不同方法。每种方法的功能略有不同。请参阅PubsubIO文档。

You can write the Pub/Sub messages to a file using the TextIO class. See the examples in the TextIO documentation. See the Logging Pipeline Messages documentation for writing Pub/Sub messages to stdout.

您可以使用TextIO类将Pub / Sub消息写入文件。请参阅TextIO文档中的示例。有关将Pub / Sub消息写入stdout的信息,请参阅Logging Pipeline Messages文档。

#2


0  

Adding to what Adrew wrote above. Code to read strings from PubSubIO and write them to stdout (just for debugging) is below. That said, I will file internal bug to improve JavaDoc for PubsubIO, I think the current documentation is minimal.

添加到Adrew上面写的内容。从PubSubIO读取字符串并将它们写入stdout(仅用于调试)的代码如下所示。也就是说,我将提交内部错误来改进PubsubIO的JavaDoc,我认为当前的文档很少。

public static void main(String[] args) {

  Pipeline pipeline = Pipeline.create(PipelineOptionsFactory.fromArgs(args).create());
  pipeline
    .apply("ReadStrinsFromPubsub",
       PubsubIO.readStrings().fromTopic("/topics/my_project/my_topic"))
    .apply("PrintToStdout", ParDo.of(new DoFn<String, Void>() {
      @ProcessElement
      public void processElement(ProcessContext c) {
        System.out.printf("Received at %s : %s\n", Instant.now(), c.element()); // debug log
      }
    }));

  pipeline.run().waitUntilFinish();
}

#1


1  

For Apache Beam 2.2.0, you can define the following transform to pull messages from a Pub/Sub subscription:

对于Apache Beam 2.2.0,您可以定义以下转换以从Pub / Sub订阅中提取消息:

PubsubIO.readMessages().fromSubscription("subscription_name")

This is one way to define a transform that will pull messages from Pub/Sub. However, the PubsubIO class contains different methods for pulling messages. Each method has slightly different functionality. See the PubsubIO documentation.

这是定义将从Pub / Sub中提取消息的转换的一种方法。但是,PubsubIO类包含用于提取消息的不同方法。每种方法的功能略有不同。请参阅PubsubIO文档。

You can write the Pub/Sub messages to a file using the TextIO class. See the examples in the TextIO documentation. See the Logging Pipeline Messages documentation for writing Pub/Sub messages to stdout.

您可以使用TextIO类将Pub / Sub消息写入文件。请参阅TextIO文档中的示例。有关将Pub / Sub消息写入stdout的信息,请参阅Logging Pipeline Messages文档。

#2


0  

Adding to what Adrew wrote above. Code to read strings from PubSubIO and write them to stdout (just for debugging) is below. That said, I will file internal bug to improve JavaDoc for PubsubIO, I think the current documentation is minimal.

添加到Adrew上面写的内容。从PubSubIO读取字符串并将它们写入stdout(仅用于调试)的代码如下所示。也就是说,我将提交内部错误来改进PubsubIO的JavaDoc,我认为当前的文档很少。

public static void main(String[] args) {

  Pipeline pipeline = Pipeline.create(PipelineOptionsFactory.fromArgs(args).create());
  pipeline
    .apply("ReadStrinsFromPubsub",
       PubsubIO.readStrings().fromTopic("/topics/my_project/my_topic"))
    .apply("PrintToStdout", ParDo.of(new DoFn<String, Void>() {
      @ProcessElement
      public void processElement(ProcessContext c) {
        System.out.printf("Received at %s : %s\n", Instant.now(), c.element()); // debug log
      }
    }));

  pipeline.run().waitUntilFinish();
}