JAVA开发人员在编写SQL时最容易犯的10个错误

原文网址： 10 More Common Mistakes Java Developers Make when Writing SQL 或 http://blog.jooq.org/2013/08/12/10-more-common-mistakes-java-developers-make-when-writing-sql/

JAVA开发人员在编写SQL时最容易犯的10个错误

<10 More Common Mistakes Java Developers Make whenWriting SQL>

I was positivelysurprised to see how popular my recent listing about10 Common Mistakes Java Developers Make when Writing SQLwas, bothon my own blogand on my syndication partner DZone. The popularity shows acouple of things:

（我十分惊讶的发现我最近列出的JAVA开发者在写SQL时最容易犯的10个错误有多受欢迎，都在我的博客和DZONE上，这说明了一些事情：）

l How important SQL isto the professional Java world.

（在JAVA专业开发领域中有多重要）

l How common it is toforget about some basic SQL things.

（忘记一些基本的SQL常识有多常见）

l How well SQL-centriclibraries such asjOOQ or MyBatisare responding to market needs, byembracing SQL.An amusing fact is that users have even mentioned my blog post onSLICK’s mailing list. SLICK is a non-SQL-centricdatabase access library in Scala. LikeLINQ (and LINQ-to-SQL) it focuses on language integration,not on SQL code generation.

（SQL中心库比如JOOQ和MyBatis对市场需求的反应有多么的好，byembracing SQL,一个有趣的事实是用户甚至提到了我博客上贴的一篇“SLICK'smailing list”，SLICK是一个基于Scala的NON-SQL-centric 数据库访问库，像LINQ(还有LNQ-TO-LINQ)，它侧重语言整合，而不是SQL语句的产生。）

Anyway, the commonmistakes I listed previously are far from complete, so I will treat you to asequel of 10 subtly less common, yet equally interesting mistakes Javadevelopers make when writing SQL.

（无论如何，我之前列出的常见错误还没有完成，我会和你们再探讨一下后续10个没有那么常见的，JAVA开发者在写SQL时的犯的同样有趣的错误。）

一、不使用PreparedStstements<Not usingPreparedStatements>

Interestingly, this mistake or misbelief still surfacesblogs,forums and mailing lists many years after the appearance of JDBC, even if it isabout a very simple thing to remember and to understand. It appears that somedevelopers refrain from using PreparedStatements for any of these reasons:

（在JDBC出现很多年后，这个错误仍然在很多的博客、论坛以及邮件中出现，即使这是一个十分容易记住和理解的问题。Java开发人员拒绝使用PreparedStatement的原因有以下几个：）

l They don’t know about PreparedStatements

（他们对Preparedstatement不了解。）

l They think that PreparedStatements are slower

（他们认为Preparedstatement更慢。）

l They think that writing a PreparedStatementtakes more effort

（他们认为写一个PreparedStatement更费劲。）

First off, let’s bust the above myths. In 96% of the cases, you’re betteroff writing a PreparedStatement rather than a static statement. Why? For simplereasons:

（首先，让我们来看看PreparedStatement的好处，在96%的情况下，使用PreparedStatements比使用静态语句要更好。为什么这么说呢？下面列出了几个简单的原因：）

1. 你可以不必再担心由于拼接字符串错误时而产生的语法错误。<You can omit syntax errors originating from badstring concatenation when inlining bind values.>

因为使用PreparedStatement不需要拼接字符串，而使用Statement就需要进行字符串的拼接。

例如：

stringsql = "select * from people pwhere p.id = ? and p.name = ?";

preparedstatement ps = connection.preparestatement(sql);

ps.setint(1,id);

ps.setstring(2,name);

如果使用拼接字符串就是这样写：

stringsql = "select * from people pwhere p.id = " + id + " and p.name = " + name;

这样就很容易出现拼接错误，如少写或者多写了引号！而且可读性还很差。

2. 你可以不必在担心由于拼接字符串产生一个不好的字符串连接导致的SQL注入。<You can omit SQLinjection vulnerabilities from bad string concatenation when inlining bindvalues.>

例如：

String sql= "select * from tb_name where name= '"+varname+"' and passwd='"+varpasswd+"'";

如果我们把[' or '1' ='1]作为varpasswd传入进来.用户名随意,看看会成为什么?

select *from tb_name = '随意' and passwd = '' or '1' = '1';

因为'1'='1'肯定成立,所以可以任何通过验证.更有甚者:把[';droptable tb_name;]作为varpasswd传入进来,则:

select *from tb_name = '随意' and passwd = '';drop table tb_name;（有些数据库是不会让你成功的,但也有很多数据库就可以使这些语句得到执行.）

而如果你使用预编译语句.你传入的任何内容就不会和原来的语句发生任何匹配的关系.只要全使用预编译语句,你就用不着对传入的数据做任何过虑.而如果使用普通的statement,有可能要对drop,;等做费尽心机的判断和过虑.

3. 当插入更多“复杂的”数据类型（如TIMESTAMP、binary data或者其他的）时，可以避免出现边缘现象（edge-cases）。<You can avoid edge-cases when inlining more“sophisticated” data types, such as TIMESTAMP, binary data, and others.>

（边缘现象：出现值为null或“”的情况<个人理解>）（在修复bug的过程中就会碰到很多这类的问题！）

4. 你可以保持开启PreparedStatement一段时间而不用马上关闭，这样就可以使用新的绑定值去复用它。<You can keep open PreparedStatements around for awhile, reusing them with new bind values instead of closing them immediately(useful in Postgres, for instance).>

（因为PreparedStatement是预编译语句，因为预编译语句有可能被重复调用.所以语句在被DB的编译器编译后的执行代码被缓存下来,那么下次调用时只要是相同的预编译语句就不需要编译,只要将参数直接传入编译过的语句执行代码中(相当于一个涵数)就会得到执行.这并不是说只有一个Connection中多次执行的预编译语句被缓存,而是对于整个DB中,只要预编译的语句语法和缓存中匹配.那么在任何时候就可以不需要再次编译而可以直接执行.而statement的语句中,即使是相同一操作,而由于每次操作的数据不同所以使整个语句相匹配的机会极小,几乎不太可能匹配.比如:

insertinto tb_name (col1,col2) values ('11','22');

insertinto tb_name (col1,col2) values ('11','23');

即使是相同操作但因为数据内容不一样,所以整个个语句本身不能匹配,没有缓存语句的意义.事实是没有数据库会对普通语句编译后的执行代码缓存.

当然并不是所以预编译语句都一定会被缓存,数据库本身会用一种策略,比如使用频度等因素来决定什么时候不再缓存已有的预编译结果.以保存有更多的空间存储新的预编译语句.）

5. 你可以在更多复杂的数据库中使用自适应游标共享（adaptive cursor sharing）（Oracle中提到），这对于新设定的绑定值，可以防止SQL语句的硬解析（hard-parsing）。<You can make use of adaptive cursor sharing (Oracle-speak) in moresophisticated databases. This helps prevent hard-parsing SQL statements forevery new set of bind values.>

（硬解析的弊端：硬解析即整个SQL语句的执行需要完完全全的解析，生成执行计划。而硬解析，生成执行计划需要耗用CPU资源，以及SGA资源。在此不得不提的是对库缓存中闩的使用。闩是锁的细化，可以理解为是一种轻量级的串行化设备。当进程申请到闩后，则这些闩用于保护共享内存的数在同一时刻不会被两个以上的进程修改。在硬解析时，需要申请闩的使用，而闩的数量在有限的情况下需要等待。大量的闩的使用由此造成需要使用闩的进程排队越频繁，性能则逾低下。------参考：Oracle硬解析与软解析以及Oracle SQL的硬解析和软解析）

Convinced? Yes. Note, there are some rare cases when you actually want toinline bind values in order to give your database’s cost-based optimiser someheads-up about what kind of data is really going to be affected by the query.Typically, this results in “constant” predicates such as:

（确定吗？是的，注意，当你真的想要通过内联绑定值（inline bind values：我个人理解是拼接字符串，就是Statement的那种SQL写法）来优化你的数据库的开销的话，会出现一些很罕见的问题，那就是哪些数据真的会影响你的查询。具有代表性的是常量的判断，比如：）

l DELETED = 1

l STATUS = 42

But it shouldn’t result in “variable” predicates such as:

（而不是一些变量的判断比如：）

l FIRST_NAME LIKE “Jon%”

l AMOUNT > 19.95

Note that modern databases implement bind-variable peeking. Hence, bydefault, you might as well use bind values for all your query parameters. Notealso that higher-level APIs such asJPA CriteriaQuery orjOOQ will help you generate PreparedStatements and bind values very easily andtransparently when writing embedded JPQL or embedded SQL.

（现代数据库实现了绑定变量窥探（bind-variable peeking）。因此，一般情况下，你也可以在所有的查询参数上使用绑定值。注意一些高级的API比如JPA、JOOQ可以帮助你在写嵌入式的JPQL和SQL时十分简单和明显的产生PreparedStatment和绑定变量）

More background info:

（更多信息可以参考：）

Caveats of bind value peeking: An interestingblog post by Oracle Guru Tanel Poder on the subject
Cursor sharing. An interestingStack Overflow question.

参考资料：

Oracle自适应共享游标

Oracle绑定变量窥探

bind variable peeking

The Cure（解决方法）：

By default, always use PreparedStatements instead ofstaticstatements. By default, never inline bind valuesinto your SQL

默认情况下，总是使用PreparedStatement来代替静态语句。绝不要内联绑定值（拼接字符串）到你的SQL语句上（后面这句话，我的理解是像使用Statement时的那种SQL的写法：如“select * from user_tabwhere username = ” + username + “ and password = ” + password ）。

二、返回太多列<Returning too many columns>

This mistake is quite frequent and can lead to very bad effects both inyour database’s execution plan and in your Java application.

（这个错误很频繁的出现在数据库的执行计划和java应用程序中。）

（执行计划可以参考：优化SQL查询：如何写出高性能SQL语句和SQL点滴27—性能分析之执行计划）

l 让我们先来看看对于java应用程序的影响：<Bad effects on the Javaapplication:>

If you’re selecting * (star) or a “default” set of 50 columns, which you’rereusing among various DAOs, you’re transferring lots of data from the databaseinto a JDBC ResultSet. Even if you’re not reading the data from the ResultSet,it has been transferred over the wire and loaded into your memory by the JDBCdriver. That’s quite a waste of IO and memory if youknowthat you’re only going to need 2-3 of those columns.

（如果你是为了将某些数据重用于不同的DAO层中，而查询出所有数据或者默认的50列的数据，这样，将会有大量的数据从数据库读入到JDBC结果集中。即使你可能不会读取到（使用到）所有结果集中的数据。但是，它已经通过JDBC驱动从线路中加载到了你的内存中。如果你知道你仅仅打算使用那些列中的2至3条数据，你会发现那将很浪费IO和内存。）

也就是说，如果你只是需要数据库中2-3列的数据，但是你通过使用(SELECT * …)查询出了所有数据。这将是一件很浪费IO和内存的事。

This was obvious, but beware also of…

这（问题的严重性）也就显而易见了，但是也要小心…

l 让我们再来看看对于数据库的执行计划的影响：<Bad effectson the database execution plan:>

Theseeffects may actually be much worse than the effects on the Java application.Sophisticated databases perform a lot of SQL transformation when calculatingthe bestexecution plan for yourquery.It may well be that some parts of your query can be“transformed away”, knowing that they won’t contribute to theprojection (SELECT clause) or to thefilteringpredicates. I’verecently blogged about this in the context of schema meta data:
How schema meta data impacts Oracle querytransformations

（事实上，这个错误对于数据库执行计划的影响比对JAVA应用程序的影响更严重。当为了给你的SQL语句计算最佳的执行计划时，复杂的数据库会进行大量的SQL转换。It may well be that some parts of your query can be “transformedaway”, knowing that they won’t contribute to the projection (SELECT clause) orto the filtering predicates. 相关内容参考：How schema meta data impacts Oracle querytransformations）

Now,this is quite a beast. Think about a sophisticated SELECT that will join twoviews:

（现在，这个是十分可怕的。想想一个加入两个视图的复杂的查询：）

1.SELECT*

2.FROMcustomer_view c

3.JOINorder_view o

4. ON c.cust_id= o.cust_id

Each of the views that are joined to the above joined table reference mightagain join data from dozens of tables, such as CUSTOMER_ADDRESS, ORDER_HISTORY,ORDER_SETTLEMENT, etc. Given the SELECT *projection,your database has no choice but to fully perform the loading of all thosejoined tables, when in fact, the only thing that you were interested in wasthis:

每个关联了上面关联的两张表的引用的视图也许还需要关联很多张表的数据。比如：CUSTOMER_ADDRESS, ORDER_HISTORY, ORDER_SETTLEMENT,等。使用select *查询，你的数据库除了只能够执行加载所有的那些关联的表，别无他法。但你实际上只需要这些数据：

1.SELECTc.first_name, c.last_name, o.amount

2.FROMcustomer_view c

3.JOINorder_view o

4.ON c.cust_id= o.cust_id

因此，要尽量避免使用SELECT *。

A good database will transform your SQL in a way that most of the “hidden”joins can be removed, which results in much less IO and memory consumptionwithin the database.

一个好的数据库会转换你的SQL，将大部分不需要的关联移除，来减少IO操作和数据库的内存消耗。

The Cure（解决方法）:

Never execute SELECT *. Never reuse the same projection for variousqueries.Always try to reduce the projection to the datathat you really need.

绝对不要使用SELECT *（这样的查询语句）。永远不要在不同的查询中复用同一个规则，总是尝试将查询减少为你真正需要的数据。

Note that this can be quite hard to achieve with ORMs.

注意：通过ORMs（对象-关系映射（Object/Relation Mapping，简称ORM）），这是很难实现的。

三、认为JOIN是SELECT的子句<Thinking that JOIN is aSELECT clause>

This isn’t a mistake with a lot of impact on performance or SQLcorrectness, but nevertheless, SQL developers should be aware of the fact thatthe JOIN clause is not part of the SELECT statement per se. TheSQL standard 1992 defines atable reference as such:

（这不是一个对性能或者SQL准确性带来很多影响的错误，但尽管如此，SQL开发人员应该意识到JOIN子句不是SELECT语句的一部分的事实。SQL standard 1992定义了一个表引用像这样（如下）：）

6.3 <tablereference>

<tablereference> ::= //表引用

<table name> [ [ AS ] <correlationname> //表名 AS相关名称

[ <left paren> <derived columnlist> <right paren> ] ] //（派生列列表）

| <derived table> [ AS ]<correlation name> //派生表 AS相关名称

[ <left paren> <derived columnlist> <right paren> ] //(派生列列表)

| <joined table> //连接表

The FROM clause and also thejoined table can then make use of suchtable references:

（FROM子句以及the joined table（被连接的表）可以利用这样的表引用：）

7.4 <fromclause>

<fromclause> ::=

FROM <table reference> [ {<comma> <table reference> }... ]

7.5 <joinedtable>

<joinedtable> ::=

| <qualified join>

| <left paren> <joined table><right paren>

<crossjoin> ::=

<table reference> CROSS JOIN<table reference>

<qualifiedjoin> ::=

<table reference> [ NATURAL ] [<join type> ] JOIN

<table reference> [ <joinspecification> ]

Relational databases are very table-centric. Many operations are performedon physical, joined or derived tables in one way or another.Towrite SQL effectively, it is important to understand that theSELECT .. FROM clause expects a comma-separated list oftable references in whatever form they may be provided.

（关系型数据库是非常以表为中心的。许多操作是在物理表、连接表或者派生表中以一种或另一种方式执行的。为了有效的编写SQL，重要的是要理解SELECT...FROM子句是一个以逗号分隔的表引用列表。）

Depending on the complexity of the table reference, some databases alsoaccept sophisticated table references in other statements, such as INSERT,UPDATE, DELETE, MERGE. SeeOracle’s manuals for instance, explaining how to create updatable views.

（根据表引用的复杂性，在其他的语句中，数据库也接收复杂的表引用。例如：INSERT, UPDATE, DELETE, MERGE。通过Oracle’smanuals for instance这篇文章，可以查看关于如何创建可更新的视图的解释说明。）

The Cure（解决方法）：

Always think of your FROM clause to expect a table reference as awhole. If you write aJOIN clause, think of thisJOIN clause to be part of a complex table reference:

（总的来说，总是把FROM子句认为是一个表引用。如果你写了一个JOIN子句，就把这个JOIN子句认为是一个复杂表引用的一部分：）

1.SELECTc.first_name, c.last_name,o.amount

2.FROM

3.

4. customer_view c

5. JOINorder_view o

6. ON c.cust_id= o.cust_id

四、使用以前的ANSI JOIN语法<Usingpre-ANSI JOIN syntax>

from T1, T2

where where_filter

这是ANSI SQL 1989引入的旧语法，使用逗号来分隔联接表，在where子句中指定联接字段。这种语法仅支持cross join和inner join，不支持outer join。

from T1 <join_type> join T2 on <on_filter>

where where_filter

这是ANSI SQL 1992引入的新语法，使用join关键字来分隔联接表，在on子句中指定联接字段。这种语法增加了对outer join的支持。

以上两种语法都是ANSI SQL标准支持的，因此SQL Server也都支持。

在SQL Server增加对ANSI SQL 1992语法的支持之前，已经有了对outer join的实际需求，因此T-SQL使用了专属的语法来支持outer join，*=和=*分别代表left outer join和right outer join。该专属语法并不支持full outer join，也不属于ANSI SQL标准。

SQL Server 2005中建议不使用该专属语法（"*=" 或"=*"），但为了向后兼容，允许在兼容级别设置为80或更低的情况下继续使用，但是在将来的SQL Server版本中，即使在向后兼容模式下，也不支持该语法。

上面一段来自：http://blog.sina.com.cn/s/blog_7d2afc6c0100q3vk.html（JOIN语句的不同ANSI版本）

Nowthat we’ve clarified how table references work (see the previous point), itshould become a bit more obvious that the pre-ANSI JOIN syntax should beavoided at all costs. To execution plans, it usually makes no difference if youspecify join predicates in the JOIN .. ON clauseor in theWHERE clause.But from a readability and maintenance perspective, using the WHERE clausefor both filtering predicates and join predicates is a major quagmire. Considerthis simple example:

（现在，我们已经讲清了表引用是如何工作的（详情看前面几点），以前的ANSI JOIN语法要不惜一切代价的避免应该变的很显然了。对于执行计划，如果你在JOIN...ON子句或者在WHERE子句中指定连接词，这通常是没有区别的。但是从可读性和可维护性的角度考虑，同时对过滤关键词（filtering predicates）和连接关键词（join predicates）使用WHERE子句是一个主要的困境。看一个简单的例子：）

1.SELECTc.first_name, c.last_name, o.amount

2.FROM customer_viewc,

3. order_view o

4.WHERE o.amount >100

5.ANDc.cust_id = o.cust_id

6.AND c.language='en'

Canyou spot the join predicate? What if we joined dozensof tables? This gets much worse when applying proprietary syntaxes for outerjoin, such asOracle’s (+) syntax.

（你能找出JOIN关键词吗？如果我们连接了几十张表呢？这对于应用外连接（outer join）专有语法将变得更糟糕，例如Oracle’s (+) syntax这篇文章讲到的。）

我对上面一段代码的理解：（意思是不要省略JOIN连接词的写法，正确的写法是：

SELECT c.first_name,c.last_name, o.amount

FROM customer_view c

JOIN order_view o ONc.cust_id = o.cust_id AND c.language = 'en'

WHERE o.amount > 100

虽然这两段代码具有相同的执行计划，但是正确的写法在可读性和维护性方面都比例子给的代码要好）

TheCure（解决方法：

Always use the ANSI JOIN syntax. Never put JOIN predicates into theWHERE clause. There is absolutely no advantage to using the pre-ANSI JOINsyntax.

（总是使用ANSI JOIN语法<我的理解是使用最新的、标准的JOIN语法，不要再使用以前的JOIN语法>。绝不要将JOIN放在WHERE子句里面。使用以前的ANSI JOIN语法是绝对无利的。）

五、写LIKE时忘记写escape<Forgetting to escape input to the LIKE predicate>

The SQL standard 1992 specifies the like predicate as such:

（92版的SQL标准提出的LIKE是这样的：）

8.5 <likepredicate>

<like predicate> ::= //LIKE谓词

<match value> [ NOT ] LIKE<pattern> //匹配值 [NOT]LIKE规则

[ ESCAPE <escape character> ] // ESCAPE 转义字符

The ESCAPE keywordshould be used almost always when allowing for user input to be used in yourSQL queries. While it may be rare that the percent sign (%) is actuallysupposed to be part of the data, the underscore (_) might well be:

（当在允许用户（进行）输入（操作）时，ESCAPE关键字都应该在你的查询语句中被使用。虽然以百分号“%”作为数据的一部分是不常见的（虽然数据中出现百分号“%”的情况是很少的），但是出现下划线“_”还是很常见的：）

1.SELECT*

2.FROM t

3.WHEREt.x LIKE'some!_prefix%'ESCAPE'!'

上面的含义就是查询出以some_prefix开头的x数据。

如果想在SQL LIKE里查询有下划线'_'或是'%'等值的记录，直接写成like'XXX_XX'，则会把'_'当成是like的通配符。SQL里提供了 escape子句来处理这种情况，escape可以指定like中使用的转义符是什么，而在转义符后的字符将被当成原始字符，这和C里的'\'很像，但是escape要求自定义一个转义符，而不是指定了'\'字符。如：

select *from user_all_tables where table_name like 'YW_%' escape '_'

上面的含义是查出当前用户表下所有以'YW_'开头的表，其中第一个'_'是转义符，第二个是被转义的字符，等效可以写成：

select *from user_all_tables where table_name like 'YW\_%' escape '\'

上面两段参考：http://www.cnblogs.com/Render/archive/2008/08/01/1258086.html

SQL Like 通配符特殊用法：Escape参考：http://www.cnblogs.com/icebutterfly/archive/2012/02/06/2340313.html（SQL Server：SQL Like通配符特殊用法：Escape）

The Cure（解决方法）:

Always think of proper escaping when using the LIKE predicate.

（当使用LIKE关键词时总是适当的选择ESCAPE。）

六、认为NOT(A IN(X,Y))和A IN(X,Y)的布尔值是相反的<Thinkingthat NOT(A IN(X,Y)) is the boolean inverse of A IN(X,Y)>

This one is subtlebut very important with respect toNULLs! Let’s review whatA IN (X,Y) really means:

（这个错误虽然很微妙，但是再处理NULL值时很重要。让我们回顾下AIN (X, Y)的真正含义是什么：）

A IN (X, Y)

is the same as A = ANY (X, Y)

is the same as A = X OR A = Y

When at the same time, NOT (A IN (X, Y)) really means:

（同时，NOT (A IN (X, Y))的真正含义：）

NOT (A IN(X, Y))

is the same as A NOT IN (X, Y)

is the same as A != ANY (X, Y)

is the same as A != X AND A != Y

That looks like the boolean inverse of the previous predicate, but itisn’t! If any ofX orY isNULL, theNOT IN predicate will result inUNKNOWN whereas theIN predicate might still return a boolean value.

（他们看起来是布尔相反的意思，但不是的。如果X和Y中有一个为NULL，则在NOT IN中的结果是UNKNOWN，然而在IN中的结果仍然是一个布尔值。）

Or in other words, when A IN (X, Y)yieldsTRUE or FALSE,NOT(AIN (X, Y)) may still yieldUNKNOWN instead ofFALSE or TRUE. Note, that this is also true if theright-hand side of theIN predicate is a subquery.

（或者换句话说，当A IN (X, Y)的值为TRUE或FALSE，NOT(A IN (X, Y))的值可能仍然是UNKNOWN而不是FALSE或TRUE。注意：如果右手边的IN里面是个子查询，这个也是TRUE。）

Don’t believe it? See this SQL Fiddle for yourself. It shows that the following query yields no result:

（不相信这些？自己去看这里：SQL Fiddle。它指出了下面的查询没有结果：）

1.SELECT1

2.WHERE 1IN (NULL)

3.UNIONALL

4.SELECT 2

5.WHERENOT(1 IN(NULL))

More details can be seen in my previous blog post on that subject, which also shows some SQL dialect incompatibilities in that area.

（详细信息请看我以前的博客：my previous blogpost on that subject，它也展示了一些SQL方言在哪方面不一致的问题。）

The Cure（解决方法）:

Beware of the NOT IN predicate when nullable columns are involved!

（当涉及到允许为NULL的列时当心NOT IN（的使用））

So in fact,this can be said:

IDNOT IN (1, NULL) is equivalent to

NOT(ID IN (1, NULL)), equivalent to

NOT(ID = ANY(1, NULL)), equivalent to

NOT(ID = 1 OR ID = NULL), equivalent to

NOT(ID = 1) AND NOT (ID = NULL), which is always UNKNOWN

七、认为NOT (A IS NULL) 和A IS NOT NULL)是相同的<Thinking that NOT (A IS NULL) is the same as A IS NOTNULL>

Right, so weremembered that SQL implements three-valued logic when it comes to handling NULL.That’s why we can use theNULL predicate to check for NULLvalues.Right? Right.

（没错，所以我们记得当出现NULL值情况的时候，SQL实现了三值逻辑（三值逻辑：http://blog.sina.com.cn/s/blog_a865607b01010cc2.html）。那就是我们能够用NULL检查NULL值的原因。对吗？对。）

But even the NULL predicate is subtle.Beware that the two following predicates are only equivalent for row valueexpressions of degree 1:

（但是甚至NULL也是不易察觉的（容易遗漏的）。注意以下两个行值表达式只等价于行值表达式的值为1（的情况）：）

NOT (A ISNULL) is not the same as A IS NOT NULL

（NOT (A IS NULL)和A IS NOT NULL是不等价的。）

If A is a row value expression with a degree of more than 1,then the truth table is transformed such that:

（如果A是一个大于1的行值表达式，那真实表被转换像这样：）

l A IS NULL yields true only if all values in AareNULL

只有A中所有值为NULL时，A ISNULL为TRUE。

l NOT(A IS NULL)yields false only if all values inA are NULL

只有A中所有值为NULL时，NOT(A IS NULL)为FALSE。

l A IS NOT NULL yields true only if all values in AareNOT NULL

只有A中所有值为NOT NULL时，A ISNOT NULL为TRUE。

l NOT(A IS NOT NULL)yields false only if all values in AareNOT NULL

只有A中所有值为NOT NULL时，NOT(AIS NOT NULL)为FALSE。

See more details in my previous blog post on that subject.

（详细内容请看我的前一篇博客：http://blog.jooq.org/2012/12/24/row-value-expressions-and-the-null-predicate/ ）

The Cure（解决方法）:

When using row value expressions, beware of the NULL predicate, which mightnot work as expected.（当使用行值表达式时，留意NULL这个词，它可能不按你预想的工作。）

八、在支持行值表达式的地方不使用行值表达式<Not using row value expressions where they are supported>

Row value expressions are an awesome SQL feature. When SQL is a verytable-centric language, tables are also very row-centric. Row value expressionslet you describe complex predicates much more easily, by creating localad-hoc rows that can be compared with other rows of the same degree and rowtype. A simple example is to query customers for first names andlast names at the same time.

（行值表达式是SQL中一个非常棒的特性。同时SQL是一个非常以表为中心的语言，表又是非常以行为中心的。行值表达式可以让你更容易的描述出复杂的谓词，by creating local ad-hoc rows that can becompared with other rows of the same degree and row type.一个简单的例子就是同时查询顾客的名字和姓氏。）

1.SELECTc.address

2.FROM customer c,

3.WHERE(c.first_name, c.last_name) = (?, ?)

As can be seen, this syntax is slightly more concise than the equivalentsyntax where each column from the predicate’s left-hand side is compared withthe corresponding column on the right-hand side. This is particularly true ifmany independent predicates are combined with AND. Using row value expressionsallows you to combine correlated predicates into one. This is most useful forjoin expressions on composite foreign keys:

（在通过谓词左边的每一列和相对应的右边的列比较可以看出这个语法比同等的语法要稍微简洁些。特别是许多独立的谓词和AND结合的时候。使用行值表达式允许你将相关联的谓词组合在一起。对于复合外键上的连接表达式，这是最有用的：）

1.SELECT c.first_name,c.last_name, a.street

2.FROM customer c

3.JOIN address a

4. ON (c.id,c.tenant_id) = (a.id, a.tenant_id)

Unfortunately, not all databases support row value expressions in the sameway. But the SQL standard had defined them already in1992, and if you use them, sophisticated databases like Oracle or Postgres canuse them for calculating better execution plans. This is explained on thepopularUse The Index, Luke page.

(不幸的是，不是所有的数据库都以相同的方式支持行值表达式。但是SQL标准已经在1992中对行值表达式进行了定义，如果你使用行值表达式,像Oracle或者PostgreSQL这些复杂的数据库就可以使用行值表达式来计算更好的执行计划。在Use The Index, Luke这个页面有相关的解释。)

The Cure（解决方法）:

Use row value expressions whenever you can. They will make your SQL moreconcise andpossiblyeven faster.（能够使用行值表达式的地方就用行值表达式。这会使你的SQL语句更加简洁甚至更快。）

九、不定义足够多的约束<Not defining enough constraints>

So, I’m going to cite Tom Kyte and Use The Index,Luke again. You cannot have enough constraints in your meta data. First off,constraints help you keep your data from corrupting, which is already veryuseful. But to me, more importantly, constraints will help the database performSQL transformations, as the database can decide that

（因此，我又要引用Tom Kyte和Use The Index, Luke。在你的元数据中没有足够的约束。首先，约束可以帮助防止你的数据变质，这已经很有用了。但是对于我来说，更重要的是约束可以有助于数据库执行SQL转换，正如数据库可以决定那些：）

l Some values are equivalent

有些值是等价的

l Some clauses are redundant

有些子句是多余的

l Some clauses are “void” (i.e. they will notreturn any values)

有些子句是“空”的（也就是：它们不会反回任何值）

Some developers may think that constraints are slow. The opposite is thecase, unless you insert lots and lots of data, in case of which you can eitherdisable constraints for a large operation, or use a temporary “load table”without constraints, transferring data offline to the real table.

（一些开发者可能认为（加）约束会（使运行速度）变慢。相反，除非你插入许多许多的数据，对于大型的操作你可以禁用约束，或者使用一个没有约束的临时“加载表”（简化SQL语句的重要方法就是采用临时表暂存中间结果，但是，临时表的好处远远不止这些，将临时结果暂存在临时表，后面的查询就在tempdb中了，这可以避免程序中多次扫描主表，也大大减少了程序执行中“共享锁”阻塞“更新锁”，减少了阻塞，提高了并发性能。<参考：优化SQL查询：如何写出高性能SQL语句>），线下再转移数据到真实的表。）

The Cure（解决方法）:

Define as many constraints as you can. They will help your database toperform better when querying.（尽可能多的定义约束。这样你的数据库会更好的执行查询（操作）。）

十、认为50毫秒是比较快的查询<Thinkingthat 50ms is fast query execution>

The NoSQL hype is still ongoing, and many companies still think they’reTwitter or Facebook in dire need of faster, more scalable solutions, escapingACID and relational models to scale horizontally. Some may succeed (e.g.Twitter or Facebook), others may run into this:

（NoSQL炒作仍然在进行中，许多公司仍然认为他们是Twitter或者Facebook，急需更快、更可扩展的解决办法，脱离ACID（ACID，指数据库事务正确执行的四个基本要素的缩写。包含：原子性(Atomicity）、一致性（Consistency）、隔离性（Isolation）、持久性（Durability）。一个支持事务（Transaction）的数据库系统，必需要具有这四种特性，否则在事务过程（Transactionprocessing）当中无法保证数据的正确性，交易过程极可能达不到交易方的要求。）和关系模式横向扩展（SQL横向扩展：http://www.microsoft.com/china/msdn/library/data/sqlserver/ScalOutSQL.mspx?mfr=true）。有些可能会成功（如：Twitter、Facebook），其他的可能像下面一样：)

Found here: https://twitter.com/codinghorror/status/347070841059692545

For the others who are forced (or chose) to stick with proven relationaldatabases, don’t be tricked into thinking that modern databases are slow.They’re hyper fast. In fact, they’re so fast, they can parse your 20kb querytext, calculate 2000-line execution plans, and actually execute that monster inless than a millisecond, if you and your DBA get along well and tune yourdatabase to the max.

（对于*（或者选择）坚持证明关系数据库的其他人，请不要认为现代数据库是很慢的。他们超级快。事实上，他们是如此的快，可以在不到一毫秒的时间内解析20kb的查询文本、计算出200行数据的执行计划并实际执行。如果你和你的数据库管理员（DBA：DatabaseAdministrator）相处的和谐，你的数据库将最大限度的运行好。）

They may be slow because of your application misusing apopular ORM, or because that ORM won’t be able to produce fast SQL for your complexquerying logic. In that case, you may want to chose a more SQL-centric API likeJDBC,jOOQ orMyBatis that will let you get back in control of your SQL.

（他们可能慢的原因是因为你的应用程序误用了流行的ORM框架，或者是因为ORM框架无法为你的复杂的查询逻辑产生出更快的SQL查询。对于这种情况，你可以选择一个以SQL为中的的API，如：JDBC、JOOQ、MyBatis，他们可以让你自己去编写SQL控制语句。）

So, don’t think that a query execution of 50ms is fast or even acceptable.It’s not.If youget these speeds at development time, make sure you investigate executionplans. Those monsters might explode in production, where you have more complexcontexts and data.

（所以，不要认为一个50毫秒的查询是很快的甚至（认为这个速度）是可接受的。它不是。如果你开发时（的SQL语句运行速度）是这个速度的话，（最好先）确保你研究过执行计划。因为它们（The Monsters）可能在你更复杂的上下文以及数据中爆发。）

Conclusion<结论>

SQL is a lot of fun,but also very subtle in various ways. It’s not easy to get it right as myprevious blog post about10 common mistakeshas shown. But SQL can be mastered and it’s worth the trouble. Data is yourmost valuable asset. Treat data with respect and write better SQL.

SQL是非常有趣的，但是也非常的微妙。它不容易掌握，正如我前面发表的一篇博客（10 commonmistakes）里面提到的一样。但是SQL还是可以被掌控的，而且它值得我们去花功夫、值得我们去研究。（对于SQL开发人员来说）数据是你最宝贵的财富。（让我们）重视数据并且写出更好的SQL语句吧！

部分翻译内容参考了这里：http://blog.csdn.net/u011687117/article/details/9981573

秒客网

JAVA开发人员在编写SQL时最容易犯的10个错误

相关文章