Why do I constantly see "Resetting dropped connection" when uploading data to my database?

Date: 2022-04-09 18:21:26

I'm uploading hundreds of millions of items via a REST API from a cloud server on Heroku to a database on AWS EC2. I'm using Python, and I constantly see the following INFO message in the logs:

[requests.packages.urllib3.connectionpool] [INFO] Resetting dropped connection: <hostname>

This "resetting of the dropped connection" seems to take many seconds (sometimes 30+ seconds) before my code continues executing.

  • First, what exactly is happening here, and why?

  • Second, is there a way to stop the connection from dropping so that I can upload data faster?

Thanks for your help. Andrew.

3 Answers

#1


10  

Requests uses Keep-Alive by default. As I understand it, "Resetting dropped connection" means that a connection which was expected to still be alive had been dropped somehow. Possible reasons are:

  1. The server doesn't support Keep-Alive.

  2. No data was transferred over an established connection for a while, so the server dropped it.

See https://*.com/a/25239947/2142577 for more details.
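
Keep-Alive only helps when requests share a connection, which in Requests means reusing one Session object; the module-level helpers such as requests.post create a fresh session on every call. A hedged sketch, assuming the requests and urllib3 packages are installed. The helper name build_session, the retry count, and the backoff factor are illustrative choices, not something taken from this answer:

```python
import requests
from requests.adapters import HTTPAdapter
from urllib3.util.retry import Retry


def build_session(retries=3, backoff=0.5):
    """Create a Session that reuses pooled connections and retries failed connection attempts."""
    # Assumption: three attempts with a short exponential backoff is acceptable for this API.
    retry = Retry(total=retries, connect=retries, backoff_factor=backoff)
    adapter = HTTPAdapter(max_retries=retry)
    session = requests.Session()
    session.mount("http://", adapter)
    session.mount("https://", adapter)
    return session


session = build_session()
# Every call made through `session` can reuse an already open connection, e.g.:
# session.post("https://api.example.invalid/items", json={"id": 1})  # hypothetical URL
```

This does not stop the server from closing idle connections; it only keeps the client from opening more connections than it needs and lets it recover when one is dropped.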

#2


7  

The problem is really that the server has closed the connection even though the client has requested it be kept alive.

This is not necessarily because the server doesn't support keepalives; the server may instead be configured to allow only a certain number of requests per connection. This can help spread requests across different servers, but I think the practice is (or was) common as a practical defence against poorly written server-side code (e.g. PHP) that doesn't clean up after itself after serving a request (perhaps due to an error condition).
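
One way to check for such a limit is to look at the response headers: a server may send "Connection: close", or a Keep-Alive header such as "timeout=5, max=100" that advertises an idle timeout and a cap on requests per connection. Not every server sends these. A small sketch of parsing that header; parse_keep_alive is a hypothetical helper and the sample value is made up:

```python
def parse_keep_alive(header_value):
    """Parse a Keep-Alive header value like 'timeout=5, max=100' into a dict of ints."""
    params = {}
    for part in header_value.split(","):
        name, sep, value = part.strip().partition("=")
        # Keep only well-formed name=<integer> pairs; ignore anything else.
        if sep and value.strip().isdigit():
            params[name.strip().lower()] = int(value.strip())
    return params


limits = parse_keep_alive("timeout=5, max=100")  # sample value, not from a real server
print(limits)
```

With Requests, the value to parse would come from response.headers.get("Keep-Alive", "").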

If you think this is the case for you and you'd like to not see these logs (which are logged at INFO level), then you can add the following to quieten that part of the logging:

import logging

# No need to hear about connections being re-established after the server has closed them
logging.getLogger("requests.packages.urllib3.connectionpool").setLevel(logging.WARNING)
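
The logger name above matches the log line in the question, where urllib3 is bundled inside Requests. As far as I know, newer Requests releases depend on a standalone urllib3, whose messages come from the urllib3.connectionpool logger instead. A minimal sketch that quiets both names; the helper name quiet_connection_logs is made up for illustration:

```python
import logging

# The first name is the one shown in the question's log line;
# the second is an assumption for setups where urllib3 is installed separately.
CONNECTION_LOGGERS = (
    "requests.packages.urllib3.connectionpool",
    "urllib3.connectionpool",
)


def quiet_connection_logs(level=logging.WARNING):
    """Raise the threshold on the connection-pool loggers so INFO messages are hidden."""
    for name in CONNECTION_LOGGERS:
        logging.getLogger(name).setLevel(level)


quiet_connection_logs()
```

This only hides the message; the reconnects themselves still happen.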

#3


4  

This is common practice for services that expose RESTful APIs to avoid abuse (or DoS).
If you're stressing their API they'll drop your connection.
Try getting your script to sleep a bit every once in a while to avoid the drop.
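
A minimal sketch of that pacing idea: send the items in batches and pause between batches. The names upload_in_batches and send_batch, the batch size, and the pause length are all hypothetical; suitable values depend on the limits of the API being called.

```python
import time
from itertools import islice


def upload_in_batches(items, send_batch, batch_size=500, pause_seconds=1.0, sleep=time.sleep):
    """Send items in fixed-size batches, sleeping between batches to ease load on the server.

    send_batch is a caller-supplied function (hypothetical here) that uploads one list of items.
    Returns the number of items sent.
    """
    iterator = iter(items)
    sent = 0
    while True:
        batch = list(islice(iterator, batch_size))
        if not batch:
            break
        if sent:
            sleep(pause_seconds)  # pause before every batch except the first
        send_batch(batch)
        sent += len(batch)
    return sent
```

The sleep parameter is only there so the pause can be swapped out when testing; in normal use the default time.sleep applies.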
