Is it particularly bad to have a very, very large SQL query with lots of (potentially redundant) WHERE clauses?
如果有一个非常非常大的SQL查询,其中有很多WHERE子句(可能是冗余的),是不是特别糟糕?
For example, here's a query I've generated from my web application with everything turned off, which should be the largest possible query for this program to generate:
例如,这是我从我的web应用程序中生成的一个查询,所有东西都关闭了,这应该是这个程序可以生成的最大的查询:
SELECT *
FROM 4e_magic_items
INNER JOIN 4e_magic_item_levels
ON 4e_magic_items.id = 4e_magic_item_levels.itemid
INNER JOIN 4e_monster_sources
ON 4e_magic_items.source = 4e_monster_sources.id
WHERE (itemlevel BETWEEN 1 AND 30)
AND source!=16 AND source!=2 AND source!=5
AND source!=13 AND source!=15 AND source!=3
AND source!=4 AND source!=12 AND source!=7
AND source!=14 AND source!=11 AND source!=10
AND source!=8 AND source!=1 AND source!=6
AND source!=9 AND type!='Arms' AND type!='Feet'
AND type!='Hands' AND type!='Head'
AND type!='Neck' AND type!='Orb'
AND type!='Potion' AND type!='Ring'
AND type!='Rod' AND type!='Staff'
AND type!='Symbol' AND type!='Waist'
AND type!='Wand' AND type!='Wondrous Item'
AND type!='Alchemical Item' AND type!='Elixir'
AND type!='Reagent' AND type!='Whetstone'
AND type!='Other Consumable' AND type!='Companion'
AND type!='Mount' AND (type!='Armor' OR (false ))
AND (type!='Weapon' OR (false ))
ORDER BY type ASC, itemlevel ASC, name ASC
It seems to work well enough, but it's also not particularly high traffic (a few hundred hits a day or so), and I wonder if it would be worth the effort to try and optimize the queries to remove redundancies and such.
它似乎运行得很好,但是它也不是特别高的流量(每天有几百次点击),我想知道是否值得尝试和优化查询以删除冗余等等。
6 个解决方案
#1
19
Reading your query makes me want to play an RPG.
阅读您的查询使我想要播放RPG。
This is definitely not too long. As long as they are well formatted, I'd say a practical limit is about 100 lines. After that, you're better off breaking subqueries into views just to keep your eyes from crossing.
这绝对不会太长。只要它们格式良好,我认为实际的限制是大约100行。在此之后,您最好将子查询拆分为视图,以防止您的眼睛交叉。
I've worked with some queries that are 1000+ lines, and that's hard to debug.
我曾经处理过一些1000多行的查询,这很难调试。
By the way, may I suggest a reformatted version? This is mostly to demonstrate the importance of formatting; I trust this will be easier to understand.
顺便问一下,我可以建议一个重新格式化的版本吗?这主要是为了说明格式化的重要性;我相信这会更容易理解。
select *
from
4e_magic_items mi
,4e_magic_item_levels mil
,4e_monster_sources ms
where mi.id = mil.itemid
and mi.source = ms.id
and itemlevel between 1 and 30
and source not in(16,2,5,13,15,3,4,12,7,14,11,10,8,1,6,9)
and type not in(
'Arms' ,'Feet' ,'Hands' ,'Head' ,'Neck' ,'Orb' ,
'Potion' ,'Ring' ,'Rod' ,'Staff' ,'Symbol' ,'Waist' ,
'Wand' ,'Wondrous Item' ,'Alchemical Item' ,'Elixir' ,
'Reagent' ,'Whetstone' ,'Other Consumable' ,'Companion' ,
'Mount'
)
and ((type != 'Armor') or (false))
and ((type != 'Weapon') or (false))
order by
type asc
,itemlevel asc
,name asc
/*
Some thoughts:
==============
0 - Formatting really matters, in SQL even more than most languages.
1 - consider selecting only the columns you need, not "*"
2 - use of table aliases makes it short & clear ("MI", "MIL" in my example)
3 - joins in the WHERE clause will un-clutter your FROM clause
4 - use NOT IN for long lists
5 - logically, the last two lines can be added to the "type not in" section.
I'm not sure why you have the "or false", but I'll assume some good reason
and leave them here.
*/
#2
16
Default MySQL 5.0 server limitation is "1MB", configurable up to 1GB.
默认的MySQL 5.0服务器限制是“1MB”,可配置为1GB。
This is configured via the max_allowed_packet setting on both client and server, and the effective limitation is the lessor of the two.
这是通过客户端和服务器上的max_allowed_packet设置进行配置的,有效的限制是这两者的不足。
Caveats:
警告:
- It's likely that this "packet" limitation does not map directly to characters in a SQL statement. Surely you want to take into account character encoding within the client, some packet metadata, etc.)
- 很可能这种“包”限制不会直接映射到SQL语句中的字符。当然,您需要考虑客户机中的字符编码、一些包元数据等等。
#3
3
SELECT @@global.max_allowed_packet
选择@@global.max_allowed_packet
this is the only real limit it's adjustable on a server so there is no real straight answer
这是服务器上唯一可调的真正限制,因此没有真正的直接答案
#4
1
From a practical perspective, I generally consider any SELECT that ends up taking more than 10 lines to write (putting each clause/condition on a separate line) to be too long to easily maintain. At this point, it should probably be done as a stored procedure of some sort, or I should try to find a better way to express the same concept--possibly by creating an intermediate table to capture some relationship I seem to be frequently querying.
从实际的角度来看,我通常认为任何最终需要写超过10行(将每个子句/条件放在一个单独的行上)的选择都太长,难以维护。此时,它可能应该作为某种存储过程来完成,或者我应该尝试找到一种更好的方式来表达相同的概念——可能是通过创建一个中间表来捕获一些我似乎经常查询的关系。
Your mileage may vary, and there are some exceptionally long queries that have a good reason to be. But my rule of thumb is 10 lines.
您的里数可能会有所不同,并且有一些非常长的查询有很好的理由。但我的经验法则是10行。
Example (mildly improper SQL):
例(轻度不当SQL):
SELECT x, y, z
FROM a, b
WHERE fiz = 1
AND foo = 2
AND a.x = b.y
AND b.z IN (SELECT q, r, s, t
FROM c, d, e
WHERE c.q = d.r
AND d.s = e.t
AND c.gar IS NOT NULL)
ORDER BY b.gonk
This is probably too large; optimizing, however, would depend largely on context.
这可能太大了;然而,优化将很大程度上取决于环境。
Just remember, the longer and more complex the query, the harder it's going to be to maintain.
记住,查询越长越复杂,维护起来就越困难。
#5
0
Most databases support stored procedures to avoid this issue. If your code is fast enough to execute and easy to read, you don't want to have to change it in order to get the compile time down.
大多数数据库支持存储过程以避免这个问题。如果您的代码执行速度足够快,并且易于阅读,您不希望为了降低编译时间而更改它。
An alternative is to use prepared statements so you get the hit only once per client connection and then pass in only the parameters for each call
另一种方法是使用准备好的语句,以便在每个客户端连接上只获得一次命中,然后在每次调用中只传递参数
#6
0
I'm assuming you mean by 'turned off' that a field doesn't have a value?
我假设你的意思是“关闭”一个字段没有值?
Instead of checking if something is not this, and it's also not that etc. can't you just check if the field is null? Or set the field to 'off', and check if type or whatever equals 'off'.
而不是检查某个东西是不是这个,也不是那个等等你能不能检查一下这个字段是不是空的?或者将字段设置为“off”,并检查输入或其他什么内容是否等于“off”。
#1
19
Reading your query makes me want to play an RPG.
阅读您的查询使我想要播放RPG。
This is definitely not too long. As long as they are well formatted, I'd say a practical limit is about 100 lines. After that, you're better off breaking subqueries into views just to keep your eyes from crossing.
这绝对不会太长。只要它们格式良好,我认为实际的限制是大约100行。在此之后,您最好将子查询拆分为视图,以防止您的眼睛交叉。
I've worked with some queries that are 1000+ lines, and that's hard to debug.
我曾经处理过一些1000多行的查询,这很难调试。
By the way, may I suggest a reformatted version? This is mostly to demonstrate the importance of formatting; I trust this will be easier to understand.
顺便问一下,我可以建议一个重新格式化的版本吗?这主要是为了说明格式化的重要性;我相信这会更容易理解。
select *
from
4e_magic_items mi
,4e_magic_item_levels mil
,4e_monster_sources ms
where mi.id = mil.itemid
and mi.source = ms.id
and itemlevel between 1 and 30
and source not in(16,2,5,13,15,3,4,12,7,14,11,10,8,1,6,9)
and type not in(
'Arms' ,'Feet' ,'Hands' ,'Head' ,'Neck' ,'Orb' ,
'Potion' ,'Ring' ,'Rod' ,'Staff' ,'Symbol' ,'Waist' ,
'Wand' ,'Wondrous Item' ,'Alchemical Item' ,'Elixir' ,
'Reagent' ,'Whetstone' ,'Other Consumable' ,'Companion' ,
'Mount'
)
and ((type != 'Armor') or (false))
and ((type != 'Weapon') or (false))
order by
type asc
,itemlevel asc
,name asc
/*
Some thoughts:
==============
0 - Formatting really matters, in SQL even more than most languages.
1 - consider selecting only the columns you need, not "*"
2 - use of table aliases makes it short & clear ("MI", "MIL" in my example)
3 - joins in the WHERE clause will un-clutter your FROM clause
4 - use NOT IN for long lists
5 - logically, the last two lines can be added to the "type not in" section.
I'm not sure why you have the "or false", but I'll assume some good reason
and leave them here.
*/
#2
16
Default MySQL 5.0 server limitation is "1MB", configurable up to 1GB.
默认的MySQL 5.0服务器限制是“1MB”,可配置为1GB。
This is configured via the max_allowed_packet setting on both client and server, and the effective limitation is the lessor of the two.
这是通过客户端和服务器上的max_allowed_packet设置进行配置的,有效的限制是这两者的不足。
Caveats:
警告:
- It's likely that this "packet" limitation does not map directly to characters in a SQL statement. Surely you want to take into account character encoding within the client, some packet metadata, etc.)
- 很可能这种“包”限制不会直接映射到SQL语句中的字符。当然,您需要考虑客户机中的字符编码、一些包元数据等等。
#3
3
SELECT @@global.max_allowed_packet
选择@@global.max_allowed_packet
this is the only real limit it's adjustable on a server so there is no real straight answer
这是服务器上唯一可调的真正限制,因此没有真正的直接答案
#4
1
From a practical perspective, I generally consider any SELECT that ends up taking more than 10 lines to write (putting each clause/condition on a separate line) to be too long to easily maintain. At this point, it should probably be done as a stored procedure of some sort, or I should try to find a better way to express the same concept--possibly by creating an intermediate table to capture some relationship I seem to be frequently querying.
从实际的角度来看,我通常认为任何最终需要写超过10行(将每个子句/条件放在一个单独的行上)的选择都太长,难以维护。此时,它可能应该作为某种存储过程来完成,或者我应该尝试找到一种更好的方式来表达相同的概念——可能是通过创建一个中间表来捕获一些我似乎经常查询的关系。
Your mileage may vary, and there are some exceptionally long queries that have a good reason to be. But my rule of thumb is 10 lines.
您的里数可能会有所不同,并且有一些非常长的查询有很好的理由。但我的经验法则是10行。
Example (mildly improper SQL):
例(轻度不当SQL):
SELECT x, y, z
FROM a, b
WHERE fiz = 1
AND foo = 2
AND a.x = b.y
AND b.z IN (SELECT q, r, s, t
FROM c, d, e
WHERE c.q = d.r
AND d.s = e.t
AND c.gar IS NOT NULL)
ORDER BY b.gonk
This is probably too large; optimizing, however, would depend largely on context.
这可能太大了;然而,优化将很大程度上取决于环境。
Just remember, the longer and more complex the query, the harder it's going to be to maintain.
记住,查询越长越复杂,维护起来就越困难。
#5
0
Most databases support stored procedures to avoid this issue. If your code is fast enough to execute and easy to read, you don't want to have to change it in order to get the compile time down.
大多数数据库支持存储过程以避免这个问题。如果您的代码执行速度足够快,并且易于阅读,您不希望为了降低编译时间而更改它。
An alternative is to use prepared statements so you get the hit only once per client connection and then pass in only the parameters for each call
另一种方法是使用准备好的语句,以便在每个客户端连接上只获得一次命中,然后在每次调用中只传递参数
#6
0
I'm assuming you mean by 'turned off' that a field doesn't have a value?
我假设你的意思是“关闭”一个字段没有值?
Instead of checking if something is not this, and it's also not that etc. can't you just check if the field is null? Or set the field to 'off', and check if type or whatever equals 'off'.
而不是检查某个东西是不是这个,也不是那个等等你能不能检查一下这个字段是不是空的?或者将字段设置为“off”,并检查输入或其他什么内容是否等于“off”。