Handling large datasets in Web API & OData

Time: 2023-01-28 23:46:26

I have been working with ASP.NET Web API over recent weeks with great success. It has really helped me produce an interface for mobile clients to programme against over HTTP.

I reached a point where I need some assistance.

I have a new endpoint which will query a database and could return 100K results. I am using OData to filter the data and return a paginated subset of it.

As this could happen for multiple requests, I am concerned about performance. Returning 100K records from the database every time is not ideal, so I have some ideas.

The first is to cache the 100K results and let OData do its magic on them every time. I am working with the AppFabric distributed cache, as it's a load-balanced environment. However, caching that much data in AppFabric could cause memory problems, so I think I am best avoiding this.

The next option is to forget about the magic of OData, send the filters I use to the database, and return only the required data each time. In other words, hit the DB every time.

I could look at using a caching handler, like the one outlined in this article (http://byterot.blogspot.ie/2012/06/aspnet-web-api-caching-handler.html), to cache in the HTTP cache. The drawback is that if the data gets updated via another system, which it may, the cached data is not expired.

Any other tips on how I might handle this scenario: a large amount of data, filtered with OData in conjunction with Web API?

4 answers

#1


2  

This is a question that's likely to result in a wide variety of answers. That said, let me put on my pre-MSFT hat and give you my two cents.

A lot of architecture questions are best answered with the consultant's answer: "It depends." In your case, it depends on a few specific things. Some developers are wary of caching layers because they add things to think about. An ACID-compliant database buys you a lot of insurance that any eventual consistency you have is at least tightly bounded.

If it were me making this decision, I would be considering a few things:

  • How many rows am I returning on a regular basis?
  • Are they the same rows over and over?
  • How big is that in memory? (100k is really not that many rows; you're right about not wanting those 100k rows to hit the disk every time, but it's probably not a problem to keep them all in memory; SQL Server would probably do this for you anyway.)
  • What am I willing to deal with re: eventual consistency? Do I want some other software to deal with it? (What frequently scares people about caches are things like ensuring that invalidation and insertion get done properly and consistently from different applications/different places in the application.)
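
To put a rough number on the "how big is that in memory?" question, here is a back-of-envelope sketch in Python. The per-row sizes are assumptions chosen for illustration, not measurements of the asker's data, and `estimated_cache_size_mib` is a hypothetical helper name.

```python
def estimated_cache_size_mib(row_count: int, avg_row_bytes: int,
                             overhead_factor: float = 1.0) -> float:
    """Rough size of a cached result set in MiB.

    avg_row_bytes and overhead_factor are guesses you supply yourself;
    measure real serialized rows before trusting the result.
    """
    return row_count * avg_row_bytes * overhead_factor / (1024 * 1024)


if __name__ == "__main__":
    # Assumed average row sizes, purely for illustration.
    for avg_row_bytes in (200, 1024, 4096):
        size = estimated_cache_size_mib(100_000, avg_row_bytes)
        print(f"{avg_row_bytes} B/row -> {size:.1f} MiB")
```

Even at an assumed 1 KiB per row, 100K rows come to just under 100 MiB, which is the kind of figure to compare against the cache's available memory.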

Given the information you've already provided (tiered architecture, willingness to try a distributed cache) I think you should pursue a caching layer. There are lots of good caches out there. AppFabric worked fine for us before I worked at Microsoft, but I've also dealt with a variety of other caching layers as well.
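
As a minimal sketch of what such a caching layer has to get right, the Python below shows a cache-aside lookup with a time-to-live and an explicit invalidation hook. It uses an in-process dictionary as a stand-in for a distributed cache; `TtlCache`, `get_or_load`, and `invalidate` are hypothetical names, not AppFabric's API.

```python
import time
from typing import Any, Callable, Dict, Tuple


class TtlCache:
    """In-process stand-in for a distributed cache (illustrative only)."""

    def __init__(self, ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._ttl = ttl_seconds
        self._clock = clock
        # key -> (expiry time, cached value)
        self._entries: Dict[str, Tuple[float, Any]] = {}

    def get_or_load(self, key: str, loader: Callable[[], Any]) -> Any:
        """Return the cached value, calling loader only on a miss or expiry."""
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]
        value = loader()  # e.g. the database query
        self._entries[key] = (now + self._ttl, value)
        return value

    def invalidate(self, key: str) -> None:
        """Drop an entry; every code path that writes the data must call this."""
        self._entries.pop(key, None)
```

The TTL bounds how stale the data can get when another system writes to the database without calling `invalidate`, which is the consistency trade-off described above.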

#2


1  

Assuming you use Entity Framework, the best option is to return EF's IQueryable directly. That way the magic of OData works directly against your database: query options such as $top and $skip are translated into the SQL query, so only the requested page comes back from the database.
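
The idea of pushing the filter and paging down to the database can be sketched outside .NET as well. The Python below uses the standard-library `sqlite3` module as a stand-in for the real database; the `products` table, the `min_price` filter, and the `fetch_page` helper are all assumptions made up for this illustration, and `top`/`skip` simply mirror the OData option names.

```python
import sqlite3
from typing import List, Optional, Tuple


def fetch_page(conn: sqlite3.Connection, min_price: Optional[float],
               top: int, skip: int) -> List[Tuple[int, str, float]]:
    """Run the filter and paging in SQL so only one page leaves the database."""
    sql = "SELECT id, name, price FROM products"
    params: list = []
    if min_price is not None:
        sql += " WHERE price >= ?"  # parameterized, never string-formatted
        params.append(min_price)
    # A stable ORDER BY is needed for paging to be deterministic.
    sql += " ORDER BY id LIMIT ? OFFSET ?"
    params.extend([top, skip])
    return conn.execute(sql, params).fetchall()


def make_demo_db() -> sqlite3.Connection:
    """Build a small in-memory table for the demonstration."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE products (id INTEGER PRIMARY KEY, name TEXT, price REAL)")
    conn.executemany(
        "INSERT INTO products (id, name, price) VALUES (?, ?, ?)",
        [(i, f"item-{i}", float(i)) for i in range(1, 1001)])
    return conn


if __name__ == "__main__":
    page = fetch_page(make_demo_db(), min_price=500.0, top=10, skip=20)
    print(len(page), "rows fetched")
```

The application only ever materializes `top` rows per request, regardless of how many rows match the filter.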

#3


0  

The best way is to use a distributed cache, which you are already doing. However, the cache provider you are using, AppFabric, has some limitations, by which I mean feature limitations. Check out NCache, a mature, feature-rich third-party distributed cache provider.

If you want to understand the differences between NCache and AppFabric, see the YouTube link below (FYI):

http://www.youtube.com/watch?v=3CPi1QlskrU

#4


0  

The caching I described in the blog post http://byterot.blogspot.ie/2012/06/aspnet-web-api-caching-handler.html is HTTP caching, also known as output caching. The data itself is not cached on the server but on the client or on intermediate cache servers, so it is not suitable for what you have in mind.
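
To illustrate why HTTP caching keeps the data on the client rather than on the server, here is a simplified Python sketch of a conditional GET using an ETag. It is not the caching handler from the blog post; `make_etag` and `respond` are hypothetical helpers, and the sketch only handles a single exact `If-None-Match` value.

```python
import hashlib
from typing import Optional, Tuple


def make_etag(body: bytes) -> str:
    """Derive a strong ETag from the response body (quoted, as HTTP requires)."""
    return '"' + hashlib.sha256(body).hexdigest() + '"'


def respond(body: bytes, if_none_match: Optional[str]) -> Tuple[int, bytes, str]:
    """Return (status, payload, etag) for a GET with an optional If-None-Match.

    The server still has to produce the body (or some cheaper version stamp)
    to compute the ETag; only the transfer back to the client is saved.
    """
    etag = make_etag(body)
    if if_none_match == etag:
        return 304, b"", etag  # client copy is still valid, send no body
    return 200, body, etag
```

A client that stored the first response sends its ETag back on the next request and receives a 304 with an empty body if nothing changed, so this saves bandwidth but not necessarily the database query.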
