GAE app gets socket errors when communicating with BigQuery

Our GAE Python application communicates with BigQuery using the Google API Client for Python (currently version 1.3.1) with the GAE-specific authentication helpers. We very often get a socket error while communicating with BigQuery.

More specifically, we build a Python Google API client as follows:

1. bq_scope = 'https://www.googleapis.com/auth/bigquery'
2. credentials = AppAssertionCredentials(scope=bq_scope)
3. http = credentials.authorize(httplib2.Http())
4. bq_service = build('bigquery', 'v2', http=http)

We then interact with the BQ service and get the following error:

File "/base/data/home/runtimes/python27/python27_dist/lib/python2.7/gae_override/httplib.py", line 536, in getresponse
    'An error occured while connecting to the server: %s' % e)
error: An error occured while connecting to the server: Unable to fetch URL: [api url...]

The error raised is of type google.appengine.api.remote_socket._remote_socket_error.error, not an exception that wraps the error.

Initially we thought it might be timeout-related, so we also tried setting a timeout by changing line 3 in the above snippet to

3. http = credentials.authorize(httplib2.Http(timeout=60))

However, according to the client library's log output, the API call fails in less than 1 second, and explicitly setting the timeout did not change the behavior.

Note that the error occurs in various API calls, not just a single one, and usually on very light operations. For example, we often see it while polling BQ for a job status and only rarely while fetching data. When we re-run the operation, it succeeds.

Any idea why this might happen and, perhaps, a best practice for handling it?

1 Answer

#1

All HTTP(s) requests will be routed through the urlfetch service.

Beneath that, the Google API Client for Python uses httplib2 to make HTTP(s) requests, and under the covers that library uses socket.

Since the error is coming from socket, you might try setting the timeout there.

import socket
timeout = 30
socket.setdefaulttimeout(timeout)
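
To check that the default is actually picked up, something like the sketch below might help. It only uses the standard socket module; timeout_seen_by_new_socket is a hypothetical helper name, and on App Engine the socket and httplib modules are replaced by GAE-specific versions, so treat the result there as something to verify rather than assume.

```python
import socket


def timeout_seen_by_new_socket(seconds):
    """Set the default timeout, then report what a freshly created socket sees.

    Hypothetical helper for illustration; the previous default is restored
    afterwards so the check leaves no side effects.
    """
    previous = socket.getdefaulttimeout()
    socket.setdefaulttimeout(seconds)
    try:
        sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        try:
            # New sockets inherit the module-level default timeout.
            return sock.gettimeout()
        finally:
            sock.close()
    finally:
        socket.setdefaulttimeout(previous)
```

Calling timeout_seen_by_new_socket(30) returns the timeout a new socket reports, which should match the value passed in if the default is being applied.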

Continuing up the stack, httplib2 falls back to the socket-level default timeout when no timeout parameter is passed to it.

http://httplib2.readthedocs.io/en/latest/libhttplib2.html

Moving further up the stack, you can set the timeout and retries for BigQuery.

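# Settings for the BigQuery query request
timeout = 30000  # timeoutMs: how long the query call waits for completion, in milliseconds
num_retries = 5  # passed to execute() when the request is sent
query_request = bigquery_service.jobs()
query_data = {
    'query': query_var,  # query_var holds the SQL string
    'timeoutMs': timeout,
}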

And finally, you can set the timeout for urlfetch.

from google.appengine.api import urlfetch
urlfetch.set_default_fetch_deadline(30)

If you believe it's timeout-related, you might want to test each library / level to make sure the timeout is being passed through correctly. You can also use a basic timer to see how long the call actually takes.

import logging
import time

start_query = time.time()
query_response = query_request.query(
    projectId='<project_name>',
    body=query_data).execute(num_retries=num_retries)
end_query = time.time()
logging.info(end_query - start_query)  # elapsed seconds
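
Since re-running the operation succeeds, wrapping the call in your own retry loop may also be worth a try, in case the library's num_retries does not cover this socket-level error (I have not verified that for version 1.3.1). Below is a minimal sketch with exponential backoff. call_with_retries and its parameters are hypothetical names, and the default of catching socket.error is an assumption: check that the remote_socket error you see is actually caught by it, or pass the right exception type in retriable.

```python
import logging
import random
import socket
import time


def call_with_retries(func, retriable=(socket.error,), max_attempts=5,
                      base_delay=0.5, sleep=time.sleep):
    """Call func(); on a retriable error, back off and try again.

    Hypothetical helper, not part of any Google library. `retriable` is the
    tuple of exception types worth retrying (an assumption to adjust).
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return func()
        except retriable as exc:
            if attempt == max_attempts:
                raise  # out of attempts: surface the last error unchanged
            # Exponential backoff plus a little jitter to avoid retry bursts.
            delay = base_delay * (2 ** (attempt - 1)) + random.uniform(0, base_delay)
            logging.warning('Attempt %d failed (%s); retrying in %.2fs',
                            attempt, exc, delay)
            sleep(delay)
```

You would then pass the API call as a function, for example call_with_retries(lambda: query_request.query(projectId='<project_name>', body=query_data).execute()). Only wrap calls that are safe to repeat, such as polling a job status.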

There are dozens of questions on this site about timeouts and exceeded deadlines with GAE and BigQuery, so I wouldn't be surprised if you're hitting something weird.

Good luck!
