I have a use-case where I need to perform a very high number of SELECT SQL
I have two approaches at this moment:
- Query by a list of identifiers, using a WHERE IN clause:
- SELECT COL1, COL2, COL3, COL4 FROM MAIN_TABLE WHERE COL1 IN ( 1,2,3,8,11,78,59,65,74,25,36,54558,78854,558 )
- Create a table, say CACHE_TABLE, first INSERT the identifiers ( 1,2,3,8,11,78,59,65,74,25,36,54558,78854,558 ) into it under a unique key CACHEID, and then JOIN this CACHE_TABLE with MAIN_TABLE to get the desired result:
- SELECT MT.COL1, MT.COL2, MT.COL3, MT.COL4 FROM MAIN_TABLE MT JOIN CACHE_TABLE CT ON CT.IDENTIFIER = MT.COL1 WHERE CT.CACHEID =
Performance is really critical in my use-case, so I wanted to know whether approach #2 would yield better performance than approach #1. Also, is there a better alternative approach for this?
Thanks a ton in advance!
2 answers
#1
2
For best performance, go with approach #2. In my experience IN is a very slow operator, since SQL engines often evaluate it as a series of comparisons joined by OR (WHERE x=Y OR x=Z OR ...). As with all things SQL, though, your mileage may vary. The speed will depend a lot on indexes.
#2
0
You need to test the two approaches.
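As a rough starting point for such a test, here is a minimal sketch that times both approaches. It assumes Python with an in-memory SQLite database as a stand-in for your real engine; the table sizes, the `cache_id` value, and the helper names (`setup`, `query_with_in`, `query_with_join`, `time_it`) are all made up for illustration.

```python
import sqlite3
import time

# Identifier list taken from the question.
IDS = [1, 2, 3, 8, 11, 78, 59, 65, 74, 25, 36, 54558, 78854, 558]


def setup():
    """Build a hypothetical MAIN_TABLE with 100,000 rows and an empty CACHE_TABLE."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE MAIN_TABLE "
        "(COL1 INTEGER PRIMARY KEY, COL2 TEXT, COL3 TEXT, COL4 TEXT)"
    )
    conn.executemany(
        "INSERT INTO MAIN_TABLE VALUES (?, ?, ?, ?)",
        ((i, f"a{i}", f"b{i}", f"c{i}") for i in range(1, 100001)),
    )
    conn.execute("CREATE TABLE CACHE_TABLE (CACHEID INTEGER, IDENTIFIER INTEGER)")
    conn.commit()
    return conn


def query_with_in(conn, ids):
    """Approach #1: one SELECT with a parameterized IN list."""
    placeholders = ",".join("?" * len(ids))
    sql = (
        "SELECT COL1, COL2, COL3, COL4 FROM MAIN_TABLE "
        f"WHERE COL1 IN ({placeholders})"
    )
    return conn.execute(sql, ids).fetchall()


def query_with_join(conn, ids, cache_id):
    """Approach #2: insert the ids under a cache id, join, then clean up."""
    conn.executemany(
        "INSERT INTO CACHE_TABLE (CACHEID, IDENTIFIER) VALUES (?, ?)",
        [(cache_id, i) for i in ids],
    )
    rows = conn.execute(
        "SELECT MT.COL1, MT.COL2, MT.COL3, MT.COL4 "
        "FROM MAIN_TABLE MT JOIN CACHE_TABLE CT ON CT.IDENTIFIER = MT.COL1 "
        "WHERE CT.CACHEID = ?",
        (cache_id,),
    ).fetchall()
    conn.execute("DELETE FROM CACHE_TABLE WHERE CACHEID = ?", (cache_id,))
    return rows


def time_it(fn, repeats=200):
    """Average wall-clock seconds per call over `repeats` calls."""
    start = time.perf_counter()
    for _ in range(repeats):
        fn()
    return (time.perf_counter() - start) / repeats


conn = setup()
in_rows = query_with_in(conn, IDS)
join_rows = query_with_join(conn, IDS, cache_id=1)

t_in = time_it(lambda: query_with_in(conn, IDS))
t_join = time_it(lambda: query_with_join(conn, IDS, cache_id=1))
print(f"IN:   {t_in * 1e6:.1f} us per query")
print(f"JOIN: {t_join * 1e6:.1f} us per query")
```

Treat the numbers only as a template. An in-memory SQLite database has no network latency, so this sketch says nothing about round-trip cost, and your own engine, data volume, and list sizes may behave quite differently.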
For a single query, I would expect IN to win in most cases, simply because creating the table and then using it requires multiple round-trips to the database.
In addition, some databases optimize constant lists (for instance, MySQL does a binary search on values rather than a sequential search).
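To make the difference between the two search strategies concrete, here is a small Python sketch. It only illustrates the idea of looking up a value in a sorted constant list versus scanning it; it is not MySQL's actual implementation, and the function names are hypothetical.

```python
from bisect import bisect_left

# Identifier list taken from the question.
IDS = [1, 2, 3, 8, 11, 78, 59, 65, 74, 25, 36, 54558, 78854, 558]
SORTED_IDS = sorted(IDS)  # sorted once, reused for every row that is checked


def in_list_sequential(value, values):
    """Compare against each list element in turn: O(n) per lookup."""
    for v in values:
        if v == value:
            return True
    return False


def in_list_binary(value, sorted_values):
    """Binary search over a pre-sorted list: O(log n) per lookup."""
    i = bisect_left(sorted_values, value)
    return i < len(sorted_values) and sorted_values[i] == value


print(in_list_sequential(558, IDS), in_list_binary(558, SORTED_IDS))
print(in_list_sequential(4, IDS), in_list_binary(4, SORTED_IDS))
```

With 14 values the gap is negligible; it only starts to matter when the list holds thousands of entries and is checked against many rows.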
The one thing that will help either version is an index on (col1) or (col1, col2, col3, col4).
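A quick way to check whether such an index is actually picked up is to look at the query plan before and after creating it. The sketch below assumes SQLite via Python's `sqlite3` module; the index name `idx_main_col1` and the helper `plan_for` are made up, and other databases have their own EXPLAIN syntax and output.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical MAIN_TABLE without any key, so COL1 starts out unindexed.
conn.execute(
    "CREATE TABLE MAIN_TABLE (COL1 INTEGER, COL2 TEXT, COL3 TEXT, COL4 TEXT)"
)
conn.executemany(
    "INSERT INTO MAIN_TABLE VALUES (?, ?, ?, ?)",
    ((i, f"a{i}", f"b{i}", f"c{i}") for i in range(1, 10001)),
)

QUERY = (
    "SELECT COL1, COL2, COL3, COL4 FROM MAIN_TABLE "
    "WHERE COL1 IN (1,2,3,8,11,78,59,65,74,25,36,54558,78854,558)"
)


def plan_for(connection, sql):
    """Return the human-readable detail lines of SQLite's query plan."""
    rows = connection.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return [row[3] for row in rows]  # the fourth column holds the description


plan_before = plan_for(conn, QUERY)
conn.execute("CREATE INDEX idx_main_col1 ON MAIN_TABLE (COL1)")
plan_after = plan_for(conn, QUERY)

print("before:", plan_before)
print("after: ", plan_after)
```

The exact plan wording differs between SQLite versions, so compare the two plans rather than relying on a specific string. An index on (col1, col2, col3, col4) can be tried the same way by changing the column list in the CREATE INDEX statement.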