From: https://www.codeproject.com/articles/1076854/overcoming-the-list-view-threshold-in-sharepoint-c
Introduction
When your CAML queries start to hit the list view threshold, you'll think it will never work. It can work, but it's tough. This article brings together the tips and tricks for building CAML queries that I've gathered over the past year or so.
When using large lists in SharePoint, you will undoubtedly encounter the List View Threshold. This is a fixed limit of 5000 rows which can be returned in a single view. Now, that's a vast oversimplification - in reality there are ways to avoid seeing this limit. In this article, I will focus on methods of handling this limit in your CAML query code. Specifically, I will be using C# and the Client-Side Object Model (CSOM), although the JavaScript Object Model will be exactly the same and most of the issues are also relevant in the Server Object Model.
Do not be confused between the list view threshold (5000) and the limit of capacity of lists, which is somewhere in the region of 50 million items - or 50,000 items with unique permissions.
History
First a quick history of the 5000 items limit. It is a hard limit, and it's present in SharePoint 2010, 2013 and 2016, as well as SharePoint Online (Office 365). You can change the limit in your on-premise environment but that's not recommended so I'm not even going to say how. You could change your limit from 5,000 to 20,000, for example, but what happens when your list grows to 20,000 items? You will be better served by changing your schema and writing queries to address this limit, using the techniques in this article.
Underlying a SharePoint list is an SQL Server table. When you perform a CAML query, the query results in an SQL query against SQL Server. Now, in SQL Server, locking items during a query execution is a small performance hit. When you lock a large enough number of items, the lock is escalated to the *entire* table - which, as you can imagine, causes a general performance hit with other queries against that table. So, SharePoint prevents this from happening by enforcing a threshold of 5000 items returned in a single query. This way, as developers, we're forced to improve our schema and querying skills to avoid this situation.
In SharePoint 2016, this problem is mitigated slightly in a few ways:
- List View Auto-Indexing
This causes columns to be indexed automatically if SharePoint detects that it would result in a performance improvement. - Allows retrospective creation of indices
In SP2013, you cannot add an index to a column of a list containing more than 5000 items. In SP2016, this will be allowed. - Smarter list-view-threshold violation detection
It will more reliably detect when a query should be throttled. -
Improving default Document Library views
The out-of-the-box document library view will no longer sort folders first, avoiding a potential list view threshold error.We can see from the above points that some progress has been made in managing large lists. However, the list view threshold remains - so from a querying perspective nothing has changed.
For more information, see Bill Baer's blog post on the topic: http://blogs.technet.com/b/wbaer/archive/2015/08/27/navigating-list-view-thresholds-in-sharepoint-server-2016-it-preview.aspx
SharePoint UI
If you have more than 5000 items in a list you'll get a warning in the list settings - "The number of items in this list exceeds the list view threshold". This means that many UI functions will no longer work, and your custom views will probably no longer function.
The list above has about three quarters of a million items, and is a test list for Repstor custodian - so this proves that yes, you can use large lists with some smart querying!
Column indexing
Sorting will no longer work except on indexed columns. Unfortunately, you can't even add an index to a column while the list contains more than 5000 items, so if your list may grow to this size, you need to prepare in advance. This will be improved in SharePoint 2016, though.
The ID column is automatically indexed, so by default, you can sort on the ID column with 5k+ items present. You can have up to 20 columns indexed. As described above, in SharePoint 2016, column indices can be automatically managed - however, if you're planning to do some querying then you will want to explicitly specify your indices.
Filtered views
Even when all relevant columns are indexed, you can't present a filtered view when that view would display more than 5000 items, even when it is paged. Unfortunately paging doesn't really help at all when navigating the issue, since you're still forcing an underlying scan of more than 5000 items. One of the tough things to understand is that the query, excluding paging, must never exceed 5000 results except in some trivial circumstances.
CAML
In these examples, I'll use a few conventions. My table has, let's say, a million items. It has the following columns: ID, IndexedCol, and NonIndexedCol, which should be fairly self explanatory; IndexedCol is indexed, NonIndexedCol is not. All of the following are completely valid CAML and will always work if you have fewer than 5k items.
This simple CAML query will work:
Hide Copy Code
<Query>
<View>
<RowLimit>10</RowLimit>
</View>
</Query>
Now, even though it's not including a filter, only the start of the table is being scanned: just the first 10 items are being picked up. However, if we don't restrict it to 10 items, we'll get an error - this query will not work:
Hide Copy Code
<Query>
<View>
</View>
</Query>
Let's assume there are only 1000 rows where IndexedCol equals 'match1k'. This query will work, even though we don't include a <RowLimit> tag:
Hide Copy Code
<Query>
<Where>
<Eq>
<FieldRef Name='IndexedCol'
/><Value Type='Text'>match1k</Value>
</Eq>
</Where>
</Query>
That makes sense - in SQL, only 1000 rows are matched by the WHERE clause. Let's now assume there are 6000 rows where IndexedCol equals 'match6k'. This query will not work:
Hide Copy Code
<Query>
<Where>
<Eq>
<FieldRef Name='IndexedCol'
/><Value Type='Text'>match6k</Value>
</Eq>
</Where>
</Query>
However, combining the queries using an AND operator will work in this instance:
Hide Copy Code
<Query>
<Where>
<And>
<Eq>
<FieldRef Name='IndexedCol'
/><Value Type='Text'>match1k</Value>
</Eq>
<Eq>
<FieldRef Name='IndexedCol'
/><Value Type='Text'>match6k</Value>
</Eq>
</And>
</Where>
</Query>
Seems obvious, doesn't it? However, confusingly, the following query will not work even though it appears to be the same as the query above:
Hide Copy Code
<Query>
<Where>
<And>
<Eq>
<FieldRef Name='IndexedCol'
/><Value Type='Text'>match6k</Value>
</Eq>
<Eq>
<FieldRef Name='IndexedCol'
/><Value Type='Text'>match1k</Value>
</Eq>
</And>
</Where>
</Query>
Why doesn't it work? Because 6000 matches are scanned from the first part of the query (IndexedCol = 'match6k'), and the threshold error occurs before hitting the second conditional of the WHERE clause. The lesson here is:
Order your WHERE conditionals with the most specific first.
Now, we'll try querying the non-indexed columns. This query will never work, even if it doesn't match any items:
Hide Copy Code
<Query>
<Where>
<Eq>
<FieldRef Name='NonIndexedCol'
/><Value Type='Text'>matchNone</Value>
</Eq>
</Where>
</Query>
This is because:
Non-indexed columns can never be used for filtering in a list with 5000+ items - regardless of how many matches there are.
OR
Now we move on to the use of 'OR'. Unfortunately, we're pretty much stuck here. Using 'OR' against a list with more than 5000 items will ALWAYS result in a list view threshold error! So, the OR section is pretty short...Don't use OR! Your only option here is to run multiple queries.
Ordering
You can order your results as long as you meet two requirements:
- Your query is valid according to the above rules and does not break the list view threshold (obviously),
-
The field you are filtering on is indexed.
Hence this very simple query will work:
Hide Copy Code
<Query>
<View>
<OrderBy>
<FieldRef Name='IndexedCol' Ascending='False'
/></OrderBy>
<RowLimit>10</RowLimit>
</View>
</Query>
This very simple query will not work as it's on a non-indexed column:
Hide Copy Code
<Query>
<View>
<OrderBy>
<FieldRef Name='NonIndexedCol' Ascending='False'
/></OrderBy>
<RowLimit>10</RowLimit>
</View>
</Query>
Remember, if you are including a WHERE clause with the above, your WHERE should match a maximum of 5000 results, regardless of your use of the RowLimit element. So, this will work:
Hide Copy Code
<Query>
<Where>
<Eq>
<FieldRef Name='IndexedCol'
/><Value Type='Text'>match1k</Value>
</Eq>
</Where>
<View>
<OrderBy>
<FieldRef Name='IndexedCol' Ascending='False'
/></OrderBy>
</View>
</Query>
Paging
If you have large lists then you will almost always want to take advantage of paging. Paging works brilliantly when you have no filter, or a filter that returns less than 5000 items. So, you can query the first "page" of most recent items with a simple query like this which will work:
Hide Copy Code
<Query>
<View>
<OrderBy>
<FieldRef Name='IndexedCol' Ascending='False'
/></OrderBy>
<RowLimit>10</RowLimit>
</View>
</Query>
This query, without RowLimit, does not break the view threshold.
To retrieve the next page following on from the 10th item returned, you then specify the value to continue on from via the ListItemCollectionPosition field on the CamlQuery object:
Hide Copy Code
CamlQuery camlQuery = new CamlQuery();
camlQuery.ListItemCollectionPosition = "Paged=TRUE&p_ID=1034";
camlQuery.ViewXml = "..."; //Query View element
ListItemCollection listItems = list.GetItems(camlQuery);
clientContext.Load(listItems);
clientContext.ExecuteQuery();
//Note listItems.ListItemCollectionPosition for the next page
The value of the ListItemCollectionPosition property comes from the ListItemCollection.ListItemCollectionPosition of the previous page.
Again, this works if there is not filter, or there's a filter that returns less than 5000 items.
Advanced paging techniques
The paged query above works because there is no WHERE clause in the query that can potentially cause a list view threshold error. For example, there's no way to retrieve all the items of this query that we saw earlier (this will not work):
Hide Copy Code
<Query>
<Where>
<Eq>
<FieldRef Name='IndexedCol'
/><Value Type='Text'>match6k</Value>
</Eq>
</Where>
</Query>
If you really need to execute a query like this that potentially exceeds the list view threshold, then you may be able to craft your queries to achieve the effect of paging by adding additional WHERE clauses. For example, by adding a filter on ID, this will work:
Hide Copy Code
<Query>
<Where>
<And>
<And>
<Gt><FieldRef Name='ID'></FieldRef><Value Type='Number'>0</Value></Gt>
<Lt><FieldRef Name='ID'></FieldRef><Value Type='Number'>5000</Value></Lt>
</And>
<Eq>
<FieldRef Name='IndexedCol'
/><Value Type='Text'>match6k</Value>
</Eq>
</And>
</Where>
<View>
<OrderBy>
<FieldRef Name='ID' Ascending='True'
/></OrderBy>
<RowLimit>60</RowLimit>
</View>
</Query>
In the above example, the first part of the query narrows the result set down to items with ID between 0 and 5000. This prevents any possibility of exceeding the list view threshold. Then, it filters those into the items where IndexedCol = match6k. Finally, the RowLimit ensures that only 60 of the items are returned.
There are a few implications of this technique:
- You cannot predict how many results are returned, only that it's less than or equal to the RowLimit (60, in this case).
-
You may need to re-run the query repeatedly to receive sufficient results
To retrieve the next page of results, you must get the last returned item's ID (the highest ID, assuming we're sorted ascending). Using that ID, form a new query - with ID greater than that value, and less than that value plus 5000.
For example, if the highest ID returned previously was 2074, the next query to execute looks like this:
Hide Copy Code
<Query>
<Where>
<And>
<And>
<Gt><FieldRef Name='ID'></FieldRef><Value Type='Number'>2074</Value></Gt>
<Lt><FieldRef Name='ID'></FieldRef><Value Type='Number'>7074</Value></Lt>
</And>
<Eq>
<FieldRef Name='IndexedCol'
/><Value Type='Text'>match6k</Value>
</Eq>
</And>
</Where>
<View>
<OrderBy>
<FieldRef Name='ID' Ascending='True'
/></OrderBy>
<RowLimit>60</RowLimit>
</View>
</Query>
The above query will reliably return the items without exceeding the list view threshold. Simply repeat until you have reached the list's Max ID (which you'll have to retrieve separately).
There's a potential problem with this method, though. Consider the following scenario:
- There are 1 million items in the list
-
The first item to match the 'IndexedCol' = 'match6' clause is item 900,000
In this case, the query will have to run 900,000 / 5000 = 180 times before it returns even one item!
There is a very effective enhancement to make to this technique, and that's to intelligently adjust the min and max IDs to span a range greater than 5000. You can follow the following rules:
- If no items are returned, then for the next query, double the ID span (eg. increase 5000 items to 10000)
-
If the list view threshold is exceeded, then repeat the same query but halve the ID span (eg. reduce 10000 to 5000)
In this way, the query will only have to run 8 times to start retrieving items. On the 8th iteration, when it hits items, it'll be attempting to retrieve items 635k to 1.2m (big numbers!). If it exceeds the list view threshold at this point, that's ok - the algorithm above will ensure that the range then scales down until it runs successfully.
Don't forget to cache the results, so that this doesn't need to happen too often.
You can start with a number less than 5000, if you wish to tweak performance. Likewise, you can triple instead of double the 'scale-up' factor, if that makes more sense for your data set.
Less than & Greater than
You can't use Less than or Equal (Lte) or Greater than or Equal (Gte) in CAML queries that involve large numbers of items. I don't know why, but it doesn't work. Stick to Less than or Greater than (Lt/Gt).
Indexable field types
Not all field types are indexable. For example, User fields can be indexed and used in queries involving large numbers of items. However, a Mult-User field cannot. Please see the following link for more information: Creating SharePoint indexed columns.
Other List Thresholds
The List View Threshold is not the only limit you need to be aware of! Another list limit is the item-level permissions limit of 50,000 items. Often, permissions are set at the list level. However, if you choose to set unique permissions for each individual list item, then you can only do this for 50k items within any given list. This is a hard and absolute list in SharePoint and if you need to exceed it, then you need to split your data across multiple lists.
See the following page for more information about boundaries and limits in SharePoint 2013
Search API
If all of this is too much to handle, you might want to consider using the Search API. I would absolutely recommend this any time over grappling with the nuances of CAML!
However, if you choose to persevere with CAML, hopefully this guide helps. Please let me know in the comments any other tips or tricks, errors or omissions. Meanwhile, I'm going to go and cry in a corner and attempt to come to terms with all this CAML horribleness....
Overcoming the List View Threshold in SharePoint CAML queries的更多相关文章
-
SHAREPOINT - CAML列表查询
首先要了解的是CAML(Collaboration Application Markup Language)不仅仅是用在对列表.文档库的查询,字段的定义,站点定义等处处使用的都是CAML. 简单的提一 ...
-
SharePoint CAML Query小结
CAML的结构. <View Type="HTML" Name="Summary"> <ViewBody ExpandXML="TR ...
-
SharePoint CAML In Action——Part I
在SharePoint中,我们经常要对List进行操作,比如要从List中取出相应的ListItem,利用CAML是个好办法.在没了解CAML之前,我是这样取数据的: MyList.Items.Cas ...
-
SharePoint - CAML
1. CAML是顺序操作,如果要实现类似 “A or B or C or D” 的结果,最好写成 “(((A or B) or C) or D)”的形式,但写成 “((A or B) or (C or ...
-
Sharepoint CAML 增删改查 List
Lists.UpdateListItems 方法 (websvcLists) Windows SharePoint Services 3 Adds, deletes, or updates the ...
-
SharePoint CAML In Action——Part II
在SharePoint中,相对于Linq to SharePoint而言,CAML是轻量化的.当然缺点也是显而易见的,"Hard Code"有时会让你抓狂.在实际场景中,经常会根据 ...
-
Build Tree View Structure for SharePoint List Data
博客地址 http://blog.csdn.net/foxdave 此文参考自->原文链接 版权归原作者所有,我只是进行一下大致的翻译 应坛友要求,帮助验证一下功能. SharePoint列表数 ...
-
深入浅出SharePoint——Caml快速开发
适用于Visual Studio 2010的Caml智能感知工具 http://visualstudiogallery.msdn.microsoft.com/15055544-fda0-42db-a6 ...
-
[总结]SHAREPOINT - CAML列表查询(上)
首先要了解的是CAML(Collaboration Application Markup Language)不仅仅是用在对列表.文档库的查询,字段的定义,站点定义等处处使用的都是CAML. 简单的提一 ...
随机推荐
-
aa4
// 初始化地图 var BMapExt = new BMapExtension(domMain, BMap, require('echarts'), require('zrender')); var ...
-
apache配置Options详解
http://www.365mini.com/page/apache-options-directive.htm Options指令是Apache配置文件中一个比较常见也比较重要的指令,Options ...
-
工作日志 jquery slideDown slideUp
jquery里面使用 slideDown 和 slideUp会有一个像素的偏差
-
并发容器之CopyOnWriteArrayList
原文链接: http://ifeve.com/java-copy-on-write/ Copy-On-Write简称COW,是一种用于程序设计中的优化策略.其基本思路是,从一开始大家都在共享同一个内容 ...
-
十三、mysql 分区之 RANGE &;&; LIST
.RANGE 分区 创建实例: CREATE TABLE EMP ( id int not null primary key auto_increment, name ) not null defau ...
-
IOS SDK相机的详细解释/画廊(默认+他们的高清摄像头接口)
原版的blog,转载请注明出处 blog.csdn.net/hello_hwc 前言: 新NSURLSession的UploadTask的,结果写那个Demo的时候想要写成拍照上传.然后就想到先写一个 ...
-
笔记整理——linux程序设计
数据库 (2013/2/27 16:07:11) 线程 (2013/2/27 15:47:51) 信号 (2013/2/27 15:31:28) 消息队列.共享内存 (2013/2 ...
-
NSUserDefaults standardUserDefaults使用注意事项
NSUserDefaults可以存储NSString,NSNumber, NSDate, NSArray, NSDictionary,自定义类可以通过NSData的方式进行存储,当然要实现NSCodi ...
-
最简单的基于librtmp的示例:发布(FLV通过RTMP发布)
===================================================== 最简单的基于libRTMP的示例系列文章列表: 最简单的基于librtmp的示例:接收(RT ...
-
Linux压缩打包tar命令总结
命令简介 在Linux系统的维护.管理中,tar命令是一个使用频率很高的命令,tar命令的功能主要是将众多文件打包成一个tar文件并压缩,并且能保持文件的权限属性.tar其实最开始是用来做磁带 ...