
Time: 2022-04-21 19:37:02

I'm working with a somewhat large set (~30000 records) of data that my Django app needs to retrieve on a regular basis. This data doesn't really change often (maybe once a month or so), and the changes that are made are done in a batch, so the DB solution I'm trying to arrive at is pretty much read-only.


The total size of this dataset is about 20 MB, and my first thought is that I can load it into memory (possibly as a singleton on an object) and access it very quickly that way. I'm wondering, though, whether there are other, more efficient ways of reducing the fetch time by avoiding disk I/O. Would memcached be the best solution here? Would loading it into an in-memory SQLite DB be better? Or should I simply load it on app startup as an in-memory variable?


2 Solutions



The simplest solution, I think, is to load all the objects into memory with


cached_records = Record.objects.all()
list(cached_records)  # list() forces Django to evaluate the QuerySet and load all rows into memory

Then you are free to use cached_records in your app, and you can still call QuerySet methods such as filter() on it. Note, however, that calling filter() on the cached records returns a new QuerySet and will trigger another DB query.
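To avoid that extra query, one option is to filter the already loaded rows in plain Python instead of going back through the QuerySet. This is only a minimal sketch: the Record dataclass and its category field below are hypothetical stand-ins for the real model, and in the actual app the list would come from something like list(Record.objects.all()).

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Record:
    # Hypothetical stand-in for the Django model, used so the sketch
    # runs without a database.
    id: int
    category: str


# Assumption: in the real app this would be list(Record.objects.all()).
cached_records = [
    Record(1, "a"),
    Record(2, "b"),
    Record(3, "a"),
]

# A list comprehension filters in memory, so no DB query is issued.
only_a = [r for r in cached_records if r.category == "a"]

# For repeated lookups, build an index once; each lookup is then a dict access.
by_category = defaultdict(list)
for r in cached_records:
    by_category[r.category].append(r)
```

With about 30000 rows, a linear scan like the list comprehension is usually cheap, and the dict index mainly pays off when the same field is looked up many times.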


If you need to query these records by conditions, using a cache would be a good idea.




Is disk I/O really the bottleneck of your application's performance, and does it actually affect your user experience? If not, I don't think this kind of optimization is necessary.


Operating systems and RDBMSs (e.g. MySQL, PostgreSQL) are really smart nowadays. Data on disk is cached in memory automatically by the RDBMS and the OS.



