
Time: 2021-01-25 00:10:16

Let's say I have this query:


select * from table1 r where r.x = 5

Does the speed of this query depend on the number of rows that are present in table1?


6 Answers



There are many factors that affect the speed of a query, one of which can be the number of rows.


Others include:


  • index strategy (if you index column "x", you will see better performance than if it's not indexed)
  • server load
  • data caching - once you've executed a query, the data is added to the data cache, so subsequent reruns will be much quicker because the data comes from memory rather than disk. This lasts until the data is evicted from the cache
  • execution plan caching - to a lesser extent. Once a query has been executed for the first time, the execution plan SQL Server comes up with is cached for a period of time so that future executions can reuse it
  • server hardware
  • the way you've written the query (often one of the biggest contributors to poor performance!), e.g. writing something using a cursor instead of a set-based operation
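
To make the cursor versus set-based point concrete, here is a minimal sketch. It uses Python's built-in sqlite3 module with SQLite standing in for SQL Server, and the table layout (columns id, x, y) is a made-up assumption rather than anything from the question. Both functions apply the same change; the first issues one statement per matching row, the second a single statement.

```python
import sqlite3

# Hypothetical schema; SQLite is a stand-in for SQL Server in this sketch.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table1 (id INTEGER PRIMARY KEY, x INTEGER, y INTEGER)")
conn.executemany(
    "INSERT INTO table1 (x, y) VALUES (?, ?)",
    [(i % 10, 0) for i in range(1000)],
)


def update_row_by_row(conn):
    """Cursor-style: fetch the matching ids, then issue one UPDATE per row."""
    ids = [row[0] for row in conn.execute("SELECT id FROM table1 WHERE x = 5")]
    for row_id in ids:
        conn.execute("UPDATE table1 SET y = y + 1 WHERE id = ?", (row_id,))
    return len(ids)  # number of UPDATE statements issued


def update_set_based(conn):
    """Set-based: a single statement lets the engine handle every matching row."""
    cur = conn.execute("UPDATE table1 SET y = y + 1 WHERE x = 5")
    return cur.rowcount  # rows changed by that one statement


statements_issued = update_row_by_row(conn)
rows_changed = update_set_based(conn)
total_y = conn.execute("SELECT SUM(y) FROM table1").fetchone()[0]
print(statements_issued, rows_changed, total_y)
```

The row-by-row version needs as many statements as there are matching rows, so its cost grows with the data in a way the single set-based statement avoids. Against a real server, each of those statements would typically also pay a network round trip.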

For databases with a large number of rows in tables, partitioning is usually something to consider (from SQL Server 2005 onwards, Enterprise Edition has built-in support). The idea is to split the data into smaller units. Generally, smaller units = smaller tables = smaller indexes = better performance.



Yes, and it can be very significant.


If there are 100 million rows and no usable index, SQL Server has to go through each of them to see whether it matches. That takes far longer than it would with 10 rows.


You probably want an index on the 'x' column, in which case SQL Server can check the index rather than going through all the rows. This can be significantly faster, as SQL Server might not even need to check all the values in the index.

On the other hand, if 100 million rows match x = 5, the query will be slower than if only 10 rows match, because every matching row has to be returned.
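
One way to see the scan versus index difference is to ask the database for its plan. The sketch below runs the query from the question against SQLite (a stand-in, since SQL Server is not available from a plain script) and prints the plan before and after creating an index. The index name idx_table1_x and the table contents are assumptions made for illustration, and the exact plan wording varies between SQLite versions.

```python
import sqlite3

# SQLite stands in for SQL Server; the table contents are made up.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table1 (id INTEGER PRIMARY KEY, x INTEGER)")
conn.executemany(
    "INSERT INTO table1 (x) VALUES (?)",
    [(i % 100,) for i in range(10000)],
)

QUERY = "SELECT * FROM table1 r WHERE r.x = 5"


def plan(conn):
    """Return the planner's description of how it would run QUERY."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + QUERY).fetchall()
    # The fourth column of each plan row holds the human-readable detail.
    return " | ".join(row[3] for row in rows)


plan_before = plan(conn)  # no index on x yet: expect a full table scan

# Hypothetical index name, chosen for this example only.
conn.execute("CREATE INDEX idx_table1_x ON table1 (x)")
plan_after = plan(conn)   # with the index: expect an index search

matching_rows = len(conn.execute(QUERY).fetchall())

print("before:", plan_before)
print("after: ", plan_after)
print("rows matching x = 5:", matching_rows)
```

In SQL Server the equivalent check is to look at the actual execution plan and see whether it shows a table or clustered index scan, or an index seek.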



Almost always yes. The real question is: what is the rate at which the query slows down as the table size increases? And the answer is: by not much if r.x is indexed, and by a large amount if not.




Not the number of rows per se (though it matters to a degree, of course); the amount of data (columns) is what can make a query slow. The data also needs to be transferred from the backend to the frontend.




The answer is yes, but it is not the only factor. If you apply appropriate optimizations and tuning, the performance drop will be negligible. The main performance factors are:


  • Indexing (clustered or nonclustered)
  • Data caching
  • Table partitioning
  • Execution plan caching
  • Data distribution
  • Hardware specs

There are some other factors, but these are the main ones to consider. Even the way you design your schema has an effect on performance.




You should assume that your query's running time always depends on the number of rows. In fact, you should assume the worst case: linear, or O(N), for the example you provided, and exponential for more complex queries. There are database-specific manuals filled with tricks to help you avoid the worst case, but SQL itself is a language and doesn't specify how to execute your query. Instead, the database implementation decides how to execute any given query: if you have indexed a column or set of columns, you will get O(log(N)) performance for a simple lookup; if the system has effective query caching, you might get an O(1) response. Here is a good introductory article: High scalability: SQL and computational complexity
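
The gap between O(N) and O(log(N)) can be illustrated by counting steps. The sketch below is an analogy, not a model of any real database engine: a plain Python list plays the unindexed table, and a binary search over a sorted list plays the index lookup (real indexes are typically B-trees, but the logarithmic growth is the same idea). All function names are invented for this example.

```python
def linear_scan_steps(values, target):
    """Unindexed lookup: every row is examined, since x need not be unique."""
    steps = 0
    matches = 0
    for value in values:
        steps += 1
        if value == target:
            matches += 1
    return steps


def binary_search_steps(sorted_values, target):
    """Index-like lookup: halve the search range until the first match is located."""
    lo, hi, steps = 0, len(sorted_values), 0
    while lo < hi:
        mid = (lo + hi) // 2
        steps += 1
        if sorted_values[mid] < target:
            lo = mid + 1
        else:
            hi = mid
    return steps


for n in (1_000, 1_000_000):
    rows = list(range(n))  # already sorted, as an index would be
    print(n, linear_scan_steps(rows, 5), binary_search_steps(rows, 5))
```

Growing the table a thousandfold multiplies the scan's work by a thousand, while the search only needs about ten more steps, which is why an index keeps the slowdown small.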




