Why are SELECT queries on tables with BLOBs slow, even when the BLOB column is not selected?

Date: 2021-02-09 23:43:06

SELECT queries on tables with BLOBs are slow, even if I don't include the BLOB column. Can someone explain why, and maybe how to circumvent it? I am using SQL Server 2012, but this may be more of a conceptual problem that is common to other database systems as well.

I found this post: "SQL Server: select on a table that contains a blob", which shows the same problem, but the accepted answer neither explains why this happens nor offers a good suggestion for solving it.

2 answers

#1


2  

If you are asking how to fix the performance drag, there are a number of approaches you can take. Adding indexes to your table should help massively, provided you aren't simply selecting the entire recordset. Creating views over the table may also help. It's also worth checking the level of index fragmentation on the table, as fragmentation can cause poor performance and can be addressed with a regular maintenance job. The suggestion of creating a linked table to store the blob data is also a genuinely good one.

However, if you are asking why it happens, the reason lies in the fundamentals of how MS SQL Server stores data. Your database, like every database on the server, is split into pages: 8 KB chunks of data, each with a 96-byte header. A page represents what can be handled in a single I/O operation. Pages are grouped into extents, 64 KB collections of eight contiguous pages, so SQL Server uses sixteen extents per megabyte of data. There are several page types. A data page, for example, does not contain what are termed "Large Objects", which include the data types text, image, varbinary(max), xml, and so on. Those are stored on separate pages, which are also used for variable-length columns that do not fit within the 8 KB page (and don't forget the 96-byte header).
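
The arithmetic in that paragraph can be checked with a few lines of Python. This is only a sketch of the numbers quoted above (8 KB pages, a 96-byte header, eight pages per extent); the constant names are made up for illustration, and the real usable space per page is somewhat smaller than the figure computed here because of per-row overhead.

```python
# Sizes quoted in the answer above; the names are illustrative only.
PAGE_SIZE = 8 * 1024        # one page: 8 KB
PAGE_HEADER = 96            # bytes reserved for the page header
PAGES_PER_EXTENT = 8        # one extent: eight contiguous pages

extent_size = PAGE_SIZE * PAGES_PER_EXTENT
extents_per_mb = (1024 * 1024) // extent_size
bytes_after_header = PAGE_SIZE - PAGE_HEADER

print(extent_size)          # 65536 bytes, i.e. 64 KB
print(extents_per_mb)       # 16
print(bytes_after_header)   # 8096
```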

At the end of each page there is a small amount of free space. Database operations shift these pages around all the time, and in a database dealing with large amounts of I/O and random record access and modification, the free space allocations can grow massively. There are tools in the management suite that let you reduce or remove free space; essentially, they reorganize pages and extents.

Now, I may be making a leap here, but I'm guessing that the blobs in your table exceed 8 KB. Bear in mind that if they exceed 64 KB, they will span not only multiple pages but multiple extents. The net result is that a "normal" table read can cause massive numbers of I/O requests. Even if you're not interested in the BLOB data, the server may have to read through those pages and extents to get the other table data. This is only compounded as more transactions cause the pages and extents that make up the table to become non-contiguous.
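
To get a feel for the scale, here is a rough Python estimate of how many pages and extents a single blob would occupy. It is a deliberately simplified model, not SQL Server's actual allocation logic: it assumes every page carries 8 KB minus the 96-byte header of blob payload and ignores any other per-page overhead. The function name `pages_and_extents` is hypothetical.

```python
import math

PAGE_PAYLOAD = 8 * 1024 - 96   # assumed payload bytes per page (simplification)
PAGES_PER_EXTENT = 8

def pages_and_extents(blob_bytes):
    """Estimate (pages, extents) needed for one blob under the simple model."""
    pages = math.ceil(blob_bytes / PAGE_PAYLOAD)
    extents = math.ceil(pages / PAGES_PER_EXTENT)
    return pages, extents

print(pages_and_extents(5 * 1024))      # (1, 1)
print(pages_and_extents(64 * 1024))     # (9, 2)
print(pages_and_extents(1024 * 1024))   # (130, 17)
```

Under these assumptions, a table with a few thousand 1 MB blobs already accounts for hundreds of thousands of pages.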

Where "Large Objects" are used, SQL Server writes row-overflow values, which include a 24-byte pointer to where the data is actually stored. If your table has several columns that exceed the 8 KB page size, combined with blobs and subject to random transactions, you will find that most of the work your server does is I/O: moving pages in and out of memory, reading pointers, fetching the associated row data, and so on. All of this represents serious overhead.

#2


2  

I have a suggestion, then: keep all the blobs in a separate table with an identity ID, and store only that identity ID in your main table.
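
A minimal sketch of that layout, using Python's built-in `sqlite3` purely as a stand-in so the example runs anywhere. The table and column names (`documents`, `document_blobs`, `blob_id`) are hypothetical; on SQL Server the blob column would typically be `varbinary(max)` and the key an `IDENTITY` column, and SQLite's storage engine says nothing about SQL Server's I/O behaviour. The point is only the shape of the schema and the queries.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Blobs live on their own, keyed by an auto-generated ID.
    CREATE TABLE document_blobs (
        id      INTEGER PRIMARY KEY AUTOINCREMENT,
        content BLOB NOT NULL
    );
    -- The main table keeps only the small columns plus the blob's ID.
    CREATE TABLE documents (
        id      INTEGER PRIMARY KEY,
        title   TEXT NOT NULL,
        blob_id INTEGER NOT NULL REFERENCES document_blobs(id)
    );
""")

def add_document(title, content):
    """Store the blob first, then reference its generated ID from the main table."""
    cur = conn.execute("INSERT INTO document_blobs (content) VALUES (?)", (content,))
    conn.execute(
        "INSERT INTO documents (title, blob_id) VALUES (?, ?)",
        (title, cur.lastrowid),
    )

add_document("report", b"\x00" * 100_000)
add_document("photo", b"\xff" * 250_000)

# Everyday queries touch only the narrow main table.
titles = [row[0] for row in conn.execute("SELECT title FROM documents ORDER BY id")]

# The blob table is joined in only when the content is actually needed.
blob_size = conn.execute(
    "SELECT length(b.content) FROM documents d "
    "JOIN document_blobs b ON b.id = d.blob_id WHERE d.title = ?",
    ("photo",),
).fetchone()[0]

print(titles)      # ['report', 'photo']
print(blob_size)   # 250000
```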

It could be because SQL Server cannot cache the table pages as easily, so you have to go to disk more often. I'm no expert as to why, though.

A lot of people frown on BLOBs/images in databases. In SQL Server 2012 there is a compromise where you can configure the database to keep objects in a file structure rather than in the database itself (presumably FILESTREAM, or the FileTable feature built on it). You might want to look into that.
