Handling large datasets in Web API & OData

Time: 2023-01-28 23:46:26

I have been working with ASP.NET Web API over recent weeks with great success. It has really helped me produce an interface for mobile clients to programme against over HTTP.


I reached a point where I need some assistance.


I have a new endpoint which will scan a database and could return 100K results. I am using OData to filter the data and return a paginated subset of it.


As this could happen for multiple requests, I am concerned about performance. Returning 100K records from the database every time is not ideal. So I have some ideas.


The first is to cache the 100K results and let OData do its magic on them every time. I am working with the AppFabric distributed cache, as it's a load-balanced environment. However, caching that much data in AppFabric could cause memory problems, so I think I am best avoiding this.


The next option is to forget about the magic of OData, send the filters I use to the database, and return only the required data each time. In other words, hit the DB every time.


I could look at using a caching handler like the one outlined in this article to cache in the HTTP cache: http://byterot.blogspot.ie/2012/06/aspnet-web-api-caching-handler.html The drawback is that if the data gets updated via another system, which it may, the cached data is not expired.


Any other tips on how I might handle this scenario: a large amount of data, filtered with OData in conjunction with Web API?


4 Solutions



This is a question that's likely to result in a wide variety of answers. That said, let me put on my pre-MSFT hat and give you my two cents.


A lot of architecture questions are best answered with the consultant's answer, "It depends." In your case, the answer depends on a few things specifically. Some developers have a problem with caching layers because there are additional things to think about. An ACID-compliant database buys you a lot of insurance that whatever eventual consistency you do have is tightly bounded.


If it were me making this decision, I would be considering a few things:


  • How many rows am I returning on a regular basis?
  • Are they the same rows over and over?
  • How big is that in memory? (100k is really not that many rows; you're right about not wanting those 100k rows to hit the disk every time, but it's probably not a problem to keep them all in memory; SQL Server would probably do this for you anyway.)
  • What am I willing to deal with re: eventual consistency? Do I want some other software to deal with it? (What frequently scares people about caches are things like ensuring that invalidation and insertion get done properly and consistently from different applications/different places in the application.)

Given the information you've already provided (tiered architecture, willingness to try a distributed cache), I think you should pursue a caching layer. There are lots of good caches out there. AppFabric worked fine for us before I worked at Microsoft, but I've also dealt with a variety of other caching layers.
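To make the invalidation concern a bit more concrete, here is a minimal cache-aside sketch. It is written in Python purely for illustration, with an in-process dictionary standing in for a distributed cache such as AppFabric; the `TtlCache` class, its method names, and the idea of keying entries by query string are my own assumptions, not anything from the original setup. The point is to cache one small page per query rather than all 100K rows, and to let a time-to-live bound how stale an entry can get when another system changes the data.

```python
import time


class TtlCache:
    """Hypothetical cache-aside helper; a dict stands in for the real cache."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock          # injectable so expiry can be tested
        self._entries = {}           # key -> (expires_at, value)

    def get_or_load(self, key, loader):
        """Return the cached value, calling loader() on a miss or after expiry."""
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]
        value = loader()             # e.g. run the filtered, paged DB query
        self._entries[key] = (now + self._ttl, value)
        return value

    def invalidate(self, key):
        """Drop an entry explicitly, e.g. after a write made by this application."""
        self._entries.pop(key, None)
```

A key could be the normalised query string of the request, so that identical filter and paging options share an entry. Writes made by other systems are only picked up when the TTL expires, so the TTL is effectively the staleness you are agreeing to tolerate.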




Assuming you use Entity Framework, the best option would be to return EF's IQueryable directly. This way the magic of OData will work directly on your database: $top and $skip will be mapped directly to your SQL query.
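The effect of pushing the query options down to the database can be sketched without Entity Framework. The example below is an illustration in Python with an in-memory SQLite database, not the Web API or EF code itself; the `products` table, the `fetch_page` and `make_demo_db` helpers, the `min_price` filter, and the `MAX_PAGE_SIZE` cap are all hypothetical. It shows that when $top and $skip become LIMIT and OFFSET, only one page of rows ever leaves the database.

```python
import sqlite3

MAX_PAGE_SIZE = 100  # hypothetical server-side cap on $top


def fetch_page(conn, top, skip, min_price=None):
    """Return one page of rows; filtering and paging run inside the database."""
    top = max(0, min(top, MAX_PAGE_SIZE))   # never hand back more than the cap
    skip = max(0, skip)
    sql = "SELECT id, name, price FROM products"
    params = []
    if min_price is not None:               # stands in for an OData $filter
        sql += " WHERE price >= ?"
        params.append(min_price)
    sql += " ORDER BY id LIMIT ? OFFSET ?"  # $top -> LIMIT, $skip -> OFFSET
    params += [top, skip]
    return conn.execute(sql, params).fetchall()


def make_demo_db(row_count=1000):
    """Build a throwaway in-memory table to run the sketch against."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE products (id INTEGER PRIMARY KEY, name TEXT, price REAL)"
    )
    conn.executemany(
        "INSERT INTO products (id, name, price) VALUES (?, ?, ?)",
        [(i, f"item-{i}", float(i)) for i in range(1, row_count + 1)],
    )
    return conn
```

For example, `fetch_page(make_demo_db(), top=10, skip=20)` returns ten rows starting at id 21. A stable ORDER BY matters here, because paging over an unordered result can repeat or skip rows between requests.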




The best way is to use a distributed cache, which you are already doing. But the cache provider you are using, AppFabric, has some limitations; by that I mean feature limitations. Check out NCache, which is a mature and feature-rich third-party distributed cache provider.


If you want to understand the differences between NCache and AppFabric, check the YouTube link below; this is FYI.






The caching I described in the blog post http://byterot.blogspot.ie/2012/06/aspnet-web-api-caching-handler.html applies to HTTP caching, also known as output caching. The data itself is not cached on the server but on the client or on mid-stream cache servers, so it is not suitable for what you have in mind.
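A small sketch may help show why HTTP caching does not take load off the database. This is an illustration in Python of a conditional GET using an ETag, not the caching handler from the blog post; `make_etag` and `conditional_get` are hypothetical names, and hashing the serialised payload is just one assumed way to derive a validator.

```python
import hashlib
import json


def make_etag(payload):
    """Derive a quoted validator from the serialised response payload."""
    body = json.dumps(payload, sort_keys=True).encode("utf-8")
    return '"' + hashlib.sha256(body).hexdigest() + '"'


def conditional_get(payload, if_none_match=None):
    """Return (status, headers, body) for a GET with an optional If-None-Match."""
    etag = make_etag(payload)
    if if_none_match == etag:
        # The client's copy is still current: send headers only, no body.
        return 304, {"ETag": etag}, None
    return 200, {"ETag": etag}, payload
```

Note that in this sketch the server still has to produce the payload in order to compute the ETag, so the saving is in bandwidth and client-side work rather than in database queries.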



