17章 组合查询
UNION
操作符将多条SELECT
语句组合
17.1 组合查询
SELECT
语句查询,结果作为单条查询返回。称为并(union)或者复合查询(compound query)。
可能需要的地方:
- 单个查询中从不同表返回类似结构的数据。
- 单表执行多次查询,按单个查询返回。
注意:多数情况下,组合相同的表的两个查询工作,与具有多个SELECT
子句条件的单条查询相同。
17.2 创建组合查询
UNION
来组合SQL查询。
17.2.1 UNION
使用
- 在每条
SELECT
语句之间放入UNION
语句。
例如,需要查询价格小于5的列表,和供应商1001、1002的所有物品。
1.使用多条SELECT
方式
mysql> SELECT vend_id, prod_id, prod_price FROM products WHERE prod_price <= 5;
+---------+---------+------------+
| vend_id | prod_id | prod_price |
+---------+---------+------------+
| 1003 | FC | 2.50 |
| 1002 | FU1 | 3.42 |
| 1003 | SLING | 4.49 |
| 1003 | TNT1 | 2.50 |
+---------+---------+------------+
mysql> SELECT vend_id,prod_id, prod_price FROM products WHERE vend_id IN (1001, 1002);
+---------+---------+------------+
| vend_id | prod_id | prod_price |
+---------+---------+------------+
| 1001 | ANV01 | 5.99 |
| 1001 | ANV02 | 9.99 |
| 1001 | ANV03 | 14.99 |
| 1002 | FU1 | 3.42 |
| 1002 | OL1 | 8.99 |
+---------+---------+------------+
2.使用组合查询
mysql> SELECT vend_id, prod_id, prod_price FROM products WHERE prod_price <= 5 UNION SELECT vend_id, prod_id, prod_price FROM products WHERE vend_id IN (1001, 1002);
+---------+---------+------------+
| vend_id | prod_id | prod_price |
+---------+---------+------------+
| 1003 | FC | 2.50 |
| 1002 | FU1 | 3.42 |
| 1003 | SLING | 4.49 |
| 1003 | TNT1 | 2.50 |
| 1001 | ANV01 | 5.99 |
| 1001 | ANV02 | 9.99 |
| 1001 | ANV03 | 14.99 |
| 1002 | OL1 | 8.99 |
+---------+---------+------------+
注意看:两条组合语句中SELECT的三个对象分别相同。若删除最后一个,结果如下。
SELECT vend_id, prod_id, prod_price FROM products WHERE prod_price <= 5 UNION SELECT vend_id, prod_id FROM products WHERE vend_id IN (1001, 1002);
ERROR 1222 (21000): The used SELECT statements have a different number of columns
若有相同的列,但两个组合之间,列的取值不同,同样可以显示结果。我们将第二个查询中,显示两次vend_id
。
mysql> SELECT vend_id, prod_id, prod_price FROM products WHERE prod_price <= 5 UNION SELECT vend_id, prod_id,vend_id FROM products WHERE vend_id IN (1001, 1002);
+---------+---------+------------+
| vend_id | prod_id | prod_price |
+---------+---------+------------+
| 1003 | FC | 2.50 |
| 1002 | FU1 | 3.42 |
| 1003 | SLING | 4.49 |
| 1003 | TNT1 | 2.50 |
| 1001 | ANV01 | 1001.00 |
| 1001 | ANV02 | 1001.00 |
| 1001 | ANV03 | 1001.00 |
| 1002 | FU1 | 1002.00 |
| 1002 | OL1 | 1002.00 |
+---------+---------+------------+
17.2.2 UNION
规则
-
UNION
必须由两条或以上的SELECT
语句组成。 -
UNION
中每次查询必须包含相同的列、表达式或聚集函数。 -
UNION
中,列数据类型必须兼容:类型不必完全相同,但可以在DBMS中转换。
17.2.3 包含或取消重复的行
UNION
自动排除重复目录。
若要返回所有值,使用UNION ALL
mysql> SELECT vend_id, prod_id, prod_price FROM products WHERE prod_price <= 5 UNION ALL SELECT vend_id, prod_id, prod_price FROM products WHERE vend_id IN (1001, 1002);
+---------+---------+------------+
| vend_id | prod_id | prod_price |
+---------+---------+------------+
| 1003 | FC | 2.50 |
| 1002 | FU1 | 3.42 |
| 1003 | SLING | 4.49 |
| 1003 | TNT1 | 2.50 |
| 1001 | ANV01 | 5.99 |
| 1001 | ANV02 | 9.99 |
| 1001 | ANV03 | 14.99 |
| 1002 | FU1 | 3.42 |
| 1002 | OL1 | 8.99 |
+---------+---------+------------+
17.2.4 结果排序
ORDER BY
子句应得放在末尾处。
mysql> SELECT vend_id, prod_id, prod_price FROM products WHERE prod_price <= 5 UNION ALL SELECT vend_id, prod_id, prod_price FROM products WHERE vend_id IN (1001, 1002) ORDER BY vend_id,prod_price;
+---------+---------+------------+
| vend_id | prod_id | prod_price |
+---------+---------+------------+
| 1001 | ANV01 | 5.99 |
| 1001 | ANV02 | 9.99 |
| 1001 | ANV03 | 14.99 |
| 1002 | FU1 | 3.42 |
| 1002 | FU1 | 3.42 |
| 1002 | OL1 | 8.99 |
| 1003 | FC | 2.50 |
| 1003 | TNT1 | 2.50 |
| 1003 | SLING | 4.49 |
+---------+---------+------------+
18章 全文本搜索
18.1 全文本搜索
MyISAM
和InnoDB
,前者支持全文本搜索,后者不支持。因此本系列中大部分例表使用InnoDB
,而productontes表使用MyISAM
。
LIKE
和REGEXP
正则表达两种匹配方式,但缺点是:
- 性能,通配符和正则表达式都是全文匹配,耗时间
- 明确控制,通配符和正则表达式很难明确控制匹配。
- 智能化,通配符和正则表达式都不能提供智能化选择结果。例如,一个特殊的词的搜索会返回包含改词的所有行,而不区分包含单个匹配的行和包含多个匹配的行。
18.2 使用全文检索
在索引之后,SELECT
使用Match()
、Against()
一起使用。
18.2.1 启动全文检索
CREATE TABLE
语句接受FULLTEXT
子句,它给出被索引列的一个逗号分割的列表。例如,
mysql> CREATE TABLE productnotes(note_id int NOT NULL AUTO_INCREMENT, prod_id char(10) NOT NULL, note_date datetime NOT NULL, note_text text NULL, PRIMARY KEY(note_id), FULLTEXT(note_text)) ENGINE = MyISAM;
- 注意:不要在导入数据时开启
FULLTEXT
,因为每次导入都需要更新索引,请在导入结束后,再修改表,定义FULLTEXT
。一切都是性能优化方向。
18.2.2 进行全文检索
SELECT
使用Match()
、Against()
一起使用。 Match()
指定被搜索的列,Against()
指定要使用的搜索表达式。例如,
mysql> SELECT note_text FROM productnotes WHERE Match(note_text) Against('rabbit');
+----------------------------------------------------------------------------------------------------------------------+
| note_text |
+----------------------------------------------------------------------------------------------------------------------+
| Customer complaint: rabbit has been able to detect trap, food apparently less effective now. |
| Quantity varies, sold by the sack load.
All guaranteed to be bright and orange, and suitable for use as rabbit bait. |
+----------------------------------------------------------------------------------------------------------------------+
- 注意:传递给
Match()
的值必须与FULLTEXT
定义中的相同。如果指定多个列,必须按照正确次序列出;同时搜索不区分大小写,除非使用BINARY
。
LIKE
语句同样可以检索,但返回的次序不同
mysql> SELECT note_text FROM productnotes WHERE note_text LIKE '%rabbit%';
+----------------------------------------------------------------------------------------------------------------------+
| note_text |
+----------------------------------------------------------------------------------------------------------------------+
| Quantity varies, sold by the sack load.
All guaranteed to be bright and orange, and suitable for use as rabbit bait. |
| Customer complaint: rabbit has been able to detect trap, food apparently less effective now. |
+----------------------------------------------------------------------------------------------------------------------+
虽然同样检索出,但是词rabbit作为第三个词的行的等级比作为第20个词的行高。同时由于全文本是经过索引的,全文搜索相当快。
18.2.3 使用查询扩展
- MySQL 4.1.1版本引入
- 首先进行基本全文搜索,找出匹配行。
- MySQL检查匹配行并选择所有有用的词。
- 再次进行全文本搜索,使用所有词。
简单全文检索,未扩展:
mysql> SELECT note_text FROM productnotes WHERE Match(note_text) Against('anvils');
+----------------------------------------------------------------------------------------------------------------------------------------------------------+
| note_text |
+----------------------------------------------------------------------------------------------------------------------------------------------------------+
| Multiple customer returns, anvils failing to drop fast enough or falling backwards on purchaser. Recommend that customer considers using heavier anvils. |
+----------------------------------------------------------------------------------------------------------------------------------------------------------+
增加全文查询扩展:
mysql> SELECT note_text FROM productnotes WHERE Match(note_text) Against('anvils' WITH QUERY EXPANSION);
+----------------------------------------------------------------------------------------------------------------------------------------------------------+
| note_text |
+----------------------------------------------------------------------------------------------------------------------------------------------------------+
| Multiple customer returns, anvils failing to drop fast enough or falling backwards on purchaser. Recommend that customer considers using heavier anvils. |
| Customer complaint:
Sticks not individually wrapped, too easy to mistakenly detonate all at once.
Recommend individual wrapping. |
| Customer complaint:
Not heavy enough to generate flying stars around head of victim. If being purchased for dropping, recommend ANV02 or ANV03 instead. |
| Please note that no returns will be accepted if safe opened using explosives. |
| Customer complaint: rabbit has been able to detect trap, food apparently less effective now. |
| Customer complaint:
Circular hole in safe floor can apparently be easily cut with handsaw. |
| Matches not included, recommend purchase of matches or detonator (item DTNTR). |
+----------------------------------------------------------------------------------------------------------------------------------------------------------+
分析:第一行包含anvils,因此等级最高。第二行与anvils无关,但与第一行其他词相关。查询扩展增加了返回的行数,但也增加了不想要的行的数目。
18.2.4 布尔文本搜索
FULLTEXT
索引也可以使用。细节,
1. 要匹配的词;
2. 要排斥的词(纵使包含其他指定的词);
3. 排序提示(指定某些词比其他词重要);
4. 表达式分组;
5. 其他。
演示IN BOOLEAN MODE的作用,
mysql> SELECT note_text FROM productnotes WHERE Match(note_text) Against('heavy' IN BOOLEAN MODE);
+---------------------------------------------------------------------------------------------------------------------------------------------------------+
| note_text |
+---------------------------------------------------------------------------------------------------------------------------------------------------------+
| Item is extremely heavy. Designed for dropping, not recommended for use with slings, ropes, pulleys, or tightropes. |
| Customer complaint:
Not heavy enough to generate flying stars around head of victim. If being purchased for dropping, recommend ANV02 or ANV03 instead. |
+---------------------------------------------------------------------------------------------------------------------------------------------------------+
而匹配heavy,且不包含rope开始的词:
mysql> SELECT note_text FROM productnotes WHERE Match(note_text) Against('heavy -rope*' IN BOOLEAN MODE);
+---------------------------------------------------------------------------------------------------------------------------------------------------------+
| note_text |
+---------------------------------------------------------------------------------------------------------------------------------------------------------+
| Customer complaint:
Not heavy enough to generate flying stars around head of victim. If being purchased for dropping, recommend ANV02 or ANV03 instead. |
+---------------------------------------------------------------------------------------------------------------------------------------------------------+
1 row in set (0.00 sec)
全文本布尔操作符:
布尔操作符 | 说明 |
---|---|
+ | 包含,词必须存在 |
- | 排除,词必须排除 |
> | 包含,且增加等级值 |
< | 包含,且减小等级值 |
() | 把词组成子表达式 |
~ | 取消一个词的排序值 |
* | 词尾的通配符 |
” “ | 定义一个短语(与单个词列表不同,它匹配整个短语) |
18.2.5 全文本搜索的使用说明
- 自动忽略3个或以下的词;
- MySQL带有内建的非用词(stopword)表,这些词总是被忽略;
- 许多词频过高,搜索无用。MySQL自动忽略50%以上的词。此规则不适用于IN BOOLEAN MODE;
- 行数小于3,全文本搜索不返回结果;
- 忽略单引号。例如don’t索引为dont;
- 不具有词分隔符的语言(日语)不能恰当返回结果;
- 仅在MyISAM引擎中支持全文本检索。
- 注意:MySQL暂不支持邻近操作符。