I'm building a Dataflow pipeline to process Stackdriver logs: the data is read from Pub/Sub and the results are written to BigQuery. When I read from Pub/Sub I get JSON strings of LogEntry objects, but what I'm really interested in are the protoPayload.line records, which contain the user log messages. To get those I need to parse the LogEntry JSON object, and I found a two-year-old Google example of how to do it:
import com.google.api.client.json.JsonParser;
import com.google.api.client.json.jackson2.JacksonFactory;
import com.google.api.services.logging.model.LogEntry;
try {
    JsonParser parser = new JacksonFactory().createJsonParser(entry);
    LogEntry logEntry = parser.parse(LogEntry.class);
    logString = logEntry.getTextPayload();
} catch (IOException e) {
    LOG.error("IOException parsing entry: " + e.getMessage());
} catch (NullPointerException e) {
    LOG.error("NullPointerException parsing entry: " + e.getMessage());
}
Unfortunately this doesn't work for me: logEntry.getTextPayload() returns null. I'm not even sure it's supposed to work, since the com.google.api.services.logging library isn't mentioned anywhere in the Google Cloud docs; the current logging library seems to be google-cloud-logging.
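One possible explanation for the null, offered as an assumption rather than a confirmed diagnosis: a LogEntry carries only one of textPayload, jsonPayload, or protoPayload, so an entry that keeps its data under protoPayload would have no textPayload to return. A minimal sketch for checking which payload field an entry actually has, assuming the gson library is on the classpath (the PayloadKind class and its method are hypothetical names, not part of any Google library):

```java
import com.google.gson.JsonElement;
import com.google.gson.JsonObject;
import com.google.gson.JsonParser;

final class PayloadKind {
    // Payload field names used by the LogEntry JSON representation.
    private static final String[] PAYLOAD_KEYS = {"textPayload", "jsonPayload", "protoPayload"};

    // Returns the name of the payload field present in the entry, or "none".
    static String of(String entryJson) {
        // new JsonParser().parse(...) matches older gson releases;
        // newer releases also offer the static JsonParser.parseString(...).
        JsonElement element = new JsonParser().parse(entryJson);
        if (!element.isJsonObject()) {
            return "none";
        }
        JsonObject root = element.getAsJsonObject();
        for (String key : PAYLOAD_KEYS) {
            if (root.has(key)) {
                return key;
            }
        }
        return "none";
    }
}
```

Logging the result of PayloadKind.of(entry) for a few sample messages should show whether getTextPayload() has anything to return for these entries.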
Could anyone suggest the right, or simplest, way of parsing LogEntry objects?
1 Solution
#1
1
I ended up manually parsing the LogEntry JSON with the gson library, using the tree traversal approach in particular. Here is a small snippet:
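import com.google.gson.JsonArray;
import com.google.gson.JsonElement;
import com.google.gson.JsonObject;
import com.google.gson.JsonParser;
// Assumes the Beam-based Dataflow SDK 2.x, where DoFn uses @ProcessElement.
import org.apache.beam.sdk.transforms.DoFn;

static class ProcessLogMessages extends DoFn<String, String> {
    @ProcessElement
    public void processElement(ProcessContext c) {
        // Each element is one LogEntry serialized as a JSON string.
        String entry = c.element();
        JsonParser parser = new JsonParser();
        JsonElement element = parser.parse(entry);
        if (element.isJsonNull()) {
            return;
        }
        JsonObject root = element.getAsJsonObject();
        // Walk the tree down to protoPayload.line, the array of user log lines.
        JsonArray lines = root.get("protoPayload").getAsJsonObject().get("line").getAsJsonArray();
        for (int i = 0; i < lines.size(); i++) {
            JsonObject line = lines.get(i).getAsJsonObject();
            String logMessage = line.get("logMessage").getAsString();
            // Do what you need with the logMessage here
            c.output(logMessage);
        }
    }
}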
This is simple enough and works fine for me, since I'm only interested in the protoPayload.line.logMessage values. But I guess it is not an ideal way of parsing LogEntry objects if you need to work with many attributes.
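The snippet above assumes every entry has a protoPayload with a line array, so an entry without one would fail with a NullPointerException. A defensive sketch of the same tree traversal, pulled out into a plain static method so it can be tested outside the pipeline (LogLineExtractor and extractLogMessages are hypothetical names; gson is assumed to be on the classpath):

```java
import com.google.gson.JsonElement;
import com.google.gson.JsonParseException;
import com.google.gson.JsonParser;
import java.util.ArrayList;
import java.util.List;

final class LogLineExtractor {
    // Returns every protoPayload.line[].logMessage string in the entry,
    // or an empty list when the entry is malformed or has no such field.
    static List<String> extractLogMessages(String entryJson) {
        List<String> messages = new ArrayList<>();
        JsonElement element;
        try {
            element = new JsonParser().parse(entryJson);
        } catch (JsonParseException e) {
            return messages; // skip malformed JSON instead of failing
        }
        if (!element.isJsonObject()) {
            return messages;
        }
        JsonElement payload = element.getAsJsonObject().get("protoPayload");
        if (payload == null || !payload.isJsonObject()) {
            return messages; // e.g. a textPayload or jsonPayload entry
        }
        JsonElement lines = payload.getAsJsonObject().get("line");
        if (lines == null || !lines.isJsonArray()) {
            return messages;
        }
        for (JsonElement line : lines.getAsJsonArray()) {
            if (!line.isJsonObject()) {
                continue;
            }
            JsonElement message = line.getAsJsonObject().get("logMessage");
            if (message != null && message.isJsonPrimitive()) {
                messages.add(message.getAsString());
            }
        }
        return messages;
    }
}
```

Inside processElement, the loop would then reduce to calling c.output(...) for each string returned by LogLineExtractor.extractLogMessages(c.element()).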
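If many attributes are needed, one alternative to tree traversal is to let gson bind the JSON to small classes that declare only the fields of interest; gson ignores JSON keys that have no matching field. This is a sketch under assumptions, not the official LogEntry model: the classes below are hypothetical, and the field names (insertId, timestamp, severity, logMessage) are assumed to match the keys in the entries you receive.

```java
import com.google.gson.Gson;
import java.util.ArrayList;
import java.util.List;

final class LogEntryMapping {
    // Hypothetical, partial mirrors of the JSON structure; add fields as needed.
    static class LogLine {
        String severity;
        String logMessage;
    }

    static class ProtoPayload {
        List<LogLine> line;
    }

    static class Entry {
        String insertId;
        String timestamp;
        ProtoPayload protoPayload;
    }

    private static final Gson GSON = new Gson();

    // Binds the JSON string to Entry; fields absent from the JSON stay null.
    static Entry parse(String entryJson) {
        return GSON.fromJson(entryJson, Entry.class);
    }

    // Collects the log messages from a bound entry, tolerating missing parts.
    static List<String> messages(Entry entry) {
        List<String> out = new ArrayList<>();
        if (entry == null || entry.protoPayload == null || entry.protoPayload.line == null) {
            return out;
        }
        for (LogLine line : entry.protoPayload.line) {
            if (line != null && line.logMessage != null) {
                out.add(line.logMessage);
            }
        }
        return out;
    }
}
```

The trade-off is that the classes have to be kept in sync with the fields you care about, but each attribute is then a typed field access instead of a chain of get(...) calls.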