In my app the state of a common object is changed by making requests, and the response depends on the state.
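    from flask import Flask, flash, render_template

    app = Flask(__name__)
    app.secret_key = 'dev'  # placeholder value; flash() needs a session key

    class SomeObj():
        def __init__(self, param):
            self.param = param

        def query(self):
            self.param += 1
            return self.param

    global_obj = SomeObj(0)

    @app.route('/')
    def home():
        flash(global_obj.query())
        return render_template('index.html')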
If I run this on my development server, I expect to get 1, 2, 3 and so on. If requests are made from 100 different clients simultaneously, can something go wrong? The expected result would be that the 100 different clients each see a unique number from 1 to 100. Or will something like this happen:
- Client 1 queries. self.param is incremented by 1.
- Before the return statement can be executed, the thread switches over to client 2. self.param is incremented again.
- The thread switches back to client 1, and the client is returned the number 2, say.
- Now the thread moves to client 2 and returns him/her the number 3.
Since there were only two clients, the expected results were 1 and 2, not 2 and 3. A number was skipped.
Will this actually happen as I scale up my application? What alternatives to a global variable should I look at?
1 Answer
You can't use global variables to hold this sort of data. Not only is it not thread safe, it's not process safe, and WSGI servers in production spawn multiple processes. Not only would your counts be wrong if you were using threads to handle requests, they would also vary depending on which process handled the request.
Use a data source outside of Flask to hold global data. A database, memcached, or redis are all appropriate separate storage areas, depending on your needs. If you need to load and access Python data, consider multiprocessing.Manager. You could also use the session for simple data that is per-user.
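As a rough illustration of the "database" option, here is a minimal sketch that keeps the counter in SQLite, chosen only because it ships with Python; a server database or redis would follow the same idea. The file path, the table layout, and the helper names init_db and next_value are all hypothetical, not part of Flask or of the question's code. The increment and the read happen inside one write transaction, so two workers should never be handed the same number.

```python
import os
import sqlite3
import tempfile

# Hypothetical location; any file that every worker process can reach would do.
DB_PATH = os.path.join(tempfile.gettempdir(), "counter_example.sqlite3")


def init_db(path=DB_PATH):
    """Create the single-row counter table if it does not exist yet."""
    conn = sqlite3.connect(path)
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "CREATE TABLE IF NOT EXISTS counter "
                "(id INTEGER PRIMARY KEY, value INTEGER NOT NULL)"
            )
            conn.execute("INSERT OR IGNORE INTO counter (id, value) VALUES (1, 0)")
    finally:
        conn.close()


def next_value(path=DB_PATH):
    """Increment the shared counter and return the new value."""
    # isolation_level=None lets us issue BEGIN/COMMIT explicitly.
    conn = sqlite3.connect(path, timeout=30, isolation_level=None)
    try:
        # BEGIN IMMEDIATE takes the write lock up front, so no other
        # connection can increment between our UPDATE and our SELECT.
        conn.execute("BEGIN IMMEDIATE")
        conn.execute("UPDATE counter SET value = value + 1 WHERE id = 1")
        (value,) = conn.execute("SELECT value FROM counter WHERE id = 1").fetchone()
        conn.execute("COMMIT")
        return value
    finally:
        conn.close()  # an uncommitted transaction is rolled back on close
```

In the view you would call init_db() once at startup and then flash(next_value()) instead of touching a module-level object.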
The development server may run in a single thread and process. In that case you won't see the behavior you describe, since each request is handled synchronously. Enable threads or processes and you will see it: app.run(threaded=True) or app.run(processes=10). (As of Flask 1.0 the development server is threaded by default.)
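To make the interleaving concrete without Flask, here is a small sketch that forces the unlucky ordering with two plain threads. The threading.Event objects and the client_1/client_2 functions are only scaffolding for the demonstration (assumptions, not anything a server does); in a real threaded server the switch happens whenever the scheduler decides.

```python
import threading


class SomeObj:
    def __init__(self, param):
        self.param = param


global_obj = SomeObj(0)
results = {}

# Events exist only to force the ordering described in the question.
first_incremented = threading.Event()
second_incremented = threading.Event()


def client_1():
    global_obj.param += 1                    # client 1 increments
    first_incremented.set()                  # "thread switch" to client 2
    second_incremented.wait()                # resume after client 2 has incremented
    results["client 1"] = global_obj.param   # client 1 finally "returns"


def client_2():
    first_incremented.wait()                 # run only after client 1's increment
    global_obj.param += 1                    # client 2 increments
    second_incremented.set()                 # let client 1 continue
    results["client 2"] = global_obj.param   # client 2 "returns"


threads = [threading.Thread(target=client_1), threading.Thread(target=client_2)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(results)  # both clients see 2; nobody sees 1
```

Wrapping the increment and the read in a single threading.Lock would fix this within one process, but it does nothing across processes, which is why an external store is the safer choice.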
Some WSGI servers may support gevent or another async worker. Global variables are still not thread safe because there's still no protection against most race conditions. You can still have a scenario where one worker gets a value, yields, another modifies it, yields, then the first worker also modifies it.
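The get, yield, modify pattern can be sketched with asyncio standing in for gevent (a substitution made here only because asyncio is in the standard library; gevent's scheduling differs in detail). The counter dict and the increment coroutine are hypothetical names for the illustration.

```python
import asyncio

counter = {"value": 0}


async def increment():
    current = counter["value"]       # worker gets the value
    await asyncio.sleep(0)           # worker yields; another worker runs here
    counter["value"] = current + 1   # write is based on a stale read


async def main():
    # Two "workers" run the same read-yield-write sequence concurrently.
    await asyncio.gather(increment(), increment())
    return counter["value"]


final_value = asyncio.run(main())
print(final_value)  # 1, not 2: one update was lost
```

There is only one thread here, yet an update is still lost, because the yield sits between the read and the write.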
If you need to store some global data during a request, you may use Flask's g object. Another common case is some top-level object that manages database connections. The distinction for this type of "global" is that it's unique to each request, not used between requests, and there's something managing the set up and teardown of the resource.
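A minimal sketch of that pattern, assuming an in-memory SQLite connection as the resource; get_db, close_db, and the trivial query are placeholder names and logic for the example. The connection is created on first use within a request, cached on g, and closed when the application context is torn down.

```python
import sqlite3

from flask import Flask, g

app = Flask(__name__)


def get_db():
    # Open at most one connection per application context and cache it on g.
    if "db" not in g:
        g.db = sqlite3.connect(":memory:")
    return g.db


@app.teardown_appcontext
def close_db(exc):
    # Runs when the context ends, whether or not the request raised.
    db = g.pop("db", None)
    if db is not None:
        db.close()


@app.route("/")
def home():
    (answer,) = get_db().execute("SELECT 1 + 1").fetchone()
    return str(answer)
```

Nothing stored on g survives into the next request, so it is not a place for a counter like the one in the question.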