Apache Beam on Cloud Dataflow - Failed to query CAdvisor

Date: 2020-12-15 15:34:08

I have a Cloud Dataflow job that reads from Pub/Sub and writes the data to BigQuery (BQ). Recently the job has been reporting the error below and is not writing any data to BQ.

{
 insertId:  "3878608796276796502:822931:0:1075"  
 jsonPayload: {
  line:  "work_service_client.cc:490"   
  message:  "gcpnoelevationcall-01211413-b90e-harness-n1wd Failed to query CAdvisor at URL=<IPAddress>:<PORT>/api/v2.0/stats?count=1, error: INTERNAL: Couldn't connect to server"   
  thread:  "231"   
 }
 labels: {
  compute.googleapis.com/resource_id:  "3878608796276796502"   
  compute.googleapis.com/resource_name:  "gcpnoelevationcall-01211413-b90e-harness-n1wd"   
  compute.googleapis.com/resource_type:  "instance"   
  dataflow.googleapis.com/job_id:  "2018-01-21_14_13_45"   
  dataflow.googleapis.com/job_name:  "gcpnoelevationcall"   
  dataflow.googleapis.com/region:  "global"   
 }
 logName:  "projects/poc/logs/dataflow.googleapis.com%2Fshuffler"  
 receiveTimestamp:  "2018-01-21T22:41:40.053806623Z"  
 resource: {
  labels: {
   job_id:  "2018-01-21_14_13_45"    
   job_name:  "gcpnoelevationcall"    
   project_id:  "poc"    
   region:  "global"    
   step_id:  ""    
  }
  type:  "dataflow_step"   
 }
 severity:  "ERROR"  
 timestamp:  "2018-01-21T22:41:39.524005Z"  
}

Any ideas on how I could fix this? Has anyone faced a similar issue before?

1 answer

#1


2  

If this happened only once, it can be attributed to a transient issue. The process running on the worker node can't reach cAdvisor. Either the cAdvisor container is not running, or there is a temporary problem on the worker that prevents it from contacting cAdvisor, and the job gets stuck.
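
To judge whether the error really is a one-off, it may help to count how often it shows up per worker. Below is a minimal sketch, assuming the log entries have already been exported as a list of dicts shaped like the entry in the question. The function names `count_cadvisor_errors` and `looks_transient`, and the threshold of one failure per worker, are hypothetical choices for illustration and not part of Dataflow or Cloud Logging.

```python
from collections import Counter

# Substring taken from the error message shown in the question.
CADVISOR_MARKER = "Failed to query CAdvisor"


def count_cadvisor_errors(entries):
    """Count cAdvisor query failures per worker in exported log entries."""
    counts = Counter()
    for entry in entries:
        message = entry.get("jsonPayload", {}).get("message", "")
        if CADVISOR_MARKER in message:
            worker = entry.get("labels", {}).get(
                "compute.googleapis.com/resource_name", "unknown"
            )
            counts[worker] += 1
    return counts


def looks_transient(counts, threshold=1):
    """Assumed rule of thumb: no worker failed more than `threshold` times."""
    return all(n <= threshold for n in counts.values())


# One entry, trimmed down from the log in the question.
sample_entries = [
    {
        "jsonPayload": {
            "message": "gcpnoelevationcall-01211413-b90e-harness-n1wd "
                       "Failed to query CAdvisor at URL=<IPAddress>:<PORT>"
                       "/api/v2.0/stats?count=1, error: INTERNAL: "
                       "Couldn't connect to server",
        },
        "labels": {
            "compute.googleapis.com/resource_name":
                "gcpnoelevationcall-01211413-b90e-harness-n1wd",
        },
        "severity": "ERROR",
    },
]

counts = count_cadvisor_errors(sample_entries)
print(dict(counts))
print("transient?", looks_transient(counts))
```

If the same worker keeps logging this failure, the transient explanation becomes less likely and it is worth checking whether the cAdvisor container on that worker is running at all.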
