Possible Duplicate:
is there an advantage to varchar(500) over varchar(8000)?
I understand that a VARCHAR(200) column containing 10 characters takes the same amount of space as a VARCHAR(20) column containing the same data.
I want to know whether changing a dozen VARCHAR(200) columns of a specific table to VARCHAR(20) would make queries run faster, especially when:
- These columns will never contain more than 20 characters
- These columns are often used in the ORDER BY clause
- These columns are often used in the WHERE clause
- Some of these columns are indexed so that they can be used in the WHERE clause
PS: I am using SQL Server 2000 but will upgrade to a later version of SQL Server soon.
4 Answers
#1
8
Yes, the declared length of a varchar column affects the optimizer's estimates for the query, the memory allocated for internal operations (for example, sorting) and, as a consequence, CPU usage. You can reproduce this with the following simple example.
1. Create two tables:
create table varLenTest1
(
a varchar(100)
)
create table varLenTest2
(
a varchar(8000)
)
2. Fill both of them with some data:
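declare @i int
declare @v varchar(36)
set @i = 20000
while (@i > 0)
begin
    -- generate one value and insert it into both tables so they hold identical data
    set @v = cast(NEWID() as varchar(36))
    insert into varLenTest1 (a) values (@v)
    insert into varLenTest2 (a) values (@v)
    set @i = @i - 1
end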
3. Execute the following queries with "Include Actual Execution Plan" enabled:
select a from varLenTest1 order by a OPTION (MAXDOP 1) ;
select a from varLenTest2 order by a OPTION (MAXDOP 1) ;
If you inspect the execution plans of these queries, you can see that the estimated I/O cost and the estimated CPU cost are very different.
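For a text-only view of the same comparison, one option is SET SHOWPLAN_ALL, which returns the estimated plan as rows instead of executing the queries. This is only a sketch and not part of the original answer; it assumes the two test tables from steps 1 and 2 exist and are populated. The EstimateIO, EstimateCPU and AvgRowSize columns of the output are the ones to compare between the two Sort operators.

```sql
-- SET SHOWPLAN_ALL must be the only statement in its batch, hence the GO separators.
SET SHOWPLAN_ALL ON;
GO

-- Neither query is executed; each returns one row per plan operator.
select a from varLenTest1 order by a OPTION (MAXDOP 1);
select a from varLenTest2 order by a OPTION (MAXDOP 1);
GO

-- Switch back to normal execution.
SET SHOWPLAN_ALL OFF;
GO
```

GO is the batch separator understood by Query Analyzer and Management Studio; other client tools may need the three batches sent separately.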
#2
2
It does matter to the query optimiser when it evaluates the best query path for your query. When more than one path is available, it calculates an I/O cost and various other parameters based on your query and, from these, chooses the path that appears to be the least costly.
This is not an exact calculation, only an approximation. It can therefore easily be thrown off if the apparent mean size required to manipulate the records of a table in memory is much bigger than what is really necessary, and the optimiser might choose a less efficient path based on what it thinks would be necessary for the other paths.
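As a rough illustration of how far the apparent size can drift from the real one: SQL Server is commonly described as assuming that a variable-length column is, on average, half full when it estimates row size. Treating that 50% figure as an assumption, and ignoring per-row overhead, the sketch below compares the estimated and the real data size for 20,000 rows of 36-character values, the shape of the test in answer #1.

```sql
declare @rows int
set @rows = 20000

select
    @rows * (100 / 2)  as est_bytes_varchar100,   -- assumed 50 bytes per row
    @rows * (8000 / 2) as est_bytes_varchar8000,  -- assumed 4000 bytes per row
    @rows * 36         as actual_bytes            -- 36 characters actually stored per row
```

Under that assumption the varchar(8000) estimate is 80 times the varchar(100) one for identical data, which is the kind of gap that can steer the optimiser toward a different plan or a larger memory reservation.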
Having a realistic maximum size is also useful to any other programmer who comes along and looks at your code. If I have a variable that I want to display in a GUI, I might allocate much more space than needed if I see that it is backed by something like nvarchar(200) or nvarchar(2000) instead of nvarchar(20), when its size is never greater than that.
#3
1
Here's a blog post that explains under what circumstances and why there are performance differences when using different column sizes (with tests and technical details):
Advanced TSQL Tuning: Why Internals Knowledge Matters
#4
-1
Size matters
Always use the smallest data size that will accommodate the largest possible value. If a column is going to store values between 1 and 5, use tinyint instead of int.
This rule also applies to character columns. The smaller the data size, the less there is to read, so performance benefits overall. In addition, a smaller size reduces network traffic. With newer technology, this tip seems less relevant, but don’t dismiss it out of hand. You won’t regret being efficient from the get-go.
For more info visit http://www.techrepublic.com/blog/10-things/10-plus-tips-for-getting-the-best-performance-out-of-your-sql-server-data-types/
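Before shrinking a character column as suggested above, it is worth confirming that the existing data actually fits. The following is a minimal sketch, not something from the original answer: the table SizeDemo and its column City are hypothetical names created only for the demonstration, and the nullability in the ALTER statement is an assumption that should match the real column.

```sql
-- Hypothetical demo table standing in for the real one.
create table SizeDemo
(
    City varchar(200) null
)

insert into SizeDemo (City) values ('Amsterdam')

-- 1. Find the longest stored value.
--    DATALENGTH counts bytes and includes trailing spaces, which LEN ignores.
select max(datalength(City)) as longest_value
from SizeDemo

-- 2. If the result is 20 or less, shrink the column.
--    Restate NULL / NOT NULL so the nullability does not change by accident.
--    SQL Server may refuse to shrink a column that is part of an index;
--    in that case the index has to be dropped first and recreated afterwards.
alter table SizeDemo
    alter column City varchar(20) null
```

Running the length check first matters because the ALTER fails if any existing value would be truncated.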