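Summary of JOIN Usage in MySQL

Date: 2023-01-23 19:33:31

JOIN is a basic MySQL keyword, generally used in queries that combine multiple tables. This post summarizes how it works.

1. JOIN syntax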

table_references:
    table_reference [, table_reference] ...

table_reference:
    table_factor
  | join_table

table_factor:
    tbl_name [[AS] alias]
        [{USE|IGNORE|FORCE} INDEX (key_list)]
  | ( table_references )
  | { OJ table_reference LEFT OUTER JOIN table_reference
        ON conditional_expr }

join_table:
    table_reference [INNER | CROSS] JOIN table_factor [join_condition]
  | table_reference STRAIGHT_JOIN table_factor
  | table_reference STRAIGHT_JOIN table_factor ON condition
  | table_reference LEFT [OUTER] JOIN table_reference join_condition
  | table_reference NATURAL [LEFT [OUTER]] JOIN table_factor
  | table_reference RIGHT [OUTER] JOIN table_reference join_condition
  | table_reference NATURAL [RIGHT [OUTER]] JOIN table_factor

join_condition:
    ON conditional_expr
  | USING (column_list)

2. How JOIN is parsed

First, set up the test data (MySQL version: mysql  Ver 14.12 Distrib 5.0.95, for redhat-linux-gnu (i686) using readline 5.1)

-- Three small test tables; only id is required, so the second column may be NULL.
CREATE TABLE `test_join_a` (
`id` bigint(20) NOT NULL DEFAULT '0',
`name` varchar(200),
PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;

CREATE TABLE `test_join_b` (
`id` bigint(20) NOT NULL DEFAULT '0',
`sex` varchar(200),
PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;

CREATE TABLE `test_join_c` (
`id` bigint(20) NOT NULL DEFAULT '0',
`class` varchar(200),
PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;

-- Row 1 of each table leaves the second column NULL on purpose.
insert into test_join_a (id) values (1);
insert into test_join_a (id, name) values (2, "abc");
insert into test_join_a (id, name) values (3, "abcd");
insert into test_join_a (id, name) values (4, "abcde");

insert into test_join_b (id) values (1);
insert into test_join_b (id, sex) values (2, "abc");
insert into test_join_b (id, sex) values (3, "abc");

insert into test_join_c (id) values (1);
insert into test_join_c (id, class) values (2, "abc");
insert into test_join_c (id, class) values (3, "abc");

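a. About LEFT JOIN

Consider A LEFT JOIN (B, C) ON join_condition, where table A has m rows, table B has n rows and table C has t rows. [Note: I could not find standard terms for the different roles of A, B and C, so here A is called the base table and B and C are called the additional tables.]
    * A is the base table. Rows of B and C that satisfy ON join_condition are attached to each row of A. If a row of A has no match, it is still returned once, with the columns of the additional tables (B, C) set to NULL.
    * The result of this kind of join has at most m * n * t rows and at least m rows.

    Similarly, for (A, B) LEFT JOIN C ON join_condition:
    * A and B together are the base tables. Rows of C that satisfy ON join_condition are attached to each combined row of A and B. If nothing matches, the columns of the additional table (C) are set to NULL for that row.
    * The result of this kind of join has at most m * n * t rows and at least m * n rows.

    In LEFT JOIN and RIGHT JOIN, a WHERE clause may follow the ON clause as a further filter. The differences are:
    1. The WHERE clause can be omitted, but the join condition (ON, or USING as in the grammar above) cannot. If no real join_condition is needed, use ON 1=1 or ON 1.
    2. ON join_condition works on the Cartesian product of the base tables: when the additional tables have no row that satisfies ON join_condition, the additional-table columns of that row are set to NULL.
    3. The WHERE clause filters the virtual table produced by ON join_condition, so after WHERE the result can contain as few as 0 rows.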

Note: NULL=NULL evaluates to NULL rather than true, so it satisfies neither a JOIN's ON condition nor the filter of a WHERE clause.
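The row-count bounds stated above can be checked with a small sketch. It is not the author's MySQL session: it assumes SQLite (through Python's standard sqlite3 module) as a stand-in, with simplified column types, and it replaces MySQL's parenthesized table list (test_join_b b, test_join_c c) with a derived table named bc, which is a hypothetical alias introduced only for this example.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for MySQL (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE test_join_a (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE test_join_b (id INTEGER PRIMARY KEY, sex TEXT);
CREATE TABLE test_join_c (id INTEGER PRIMARY KEY, class TEXT);
INSERT INTO test_join_a VALUES (1, NULL), (2, 'abc'), (3, 'abcd'), (4, 'abcde');
INSERT INTO test_join_b VALUES (1, NULL), (2, 'abc'), (3, 'abc');
INSERT INTO test_join_c VALUES (1, NULL), (2, 'abc'), (3, 'abc');
""")

# A LEFT JOIN (B, C): the derived table bc plays the role of "(B, C)".
A_LEFT_BC = """
SELECT COUNT(*)
FROM test_join_a a
LEFT JOIN (
    SELECT b.id AS bid, b.sex AS sex, c.id AS cid, c.class AS class
    FROM test_join_b b CROSS JOIN test_join_c c
) bc ON {condition}
"""

def count_a_left_bc(condition):
    """Return the number of rows of A LEFT JOIN (B, C) ON <condition>."""
    return conn.execute(A_LEFT_BC.format(condition=condition)).fetchone()[0]

upper = count_a_left_bc("1 = 1")             # everything matches: m * n * t
middle = count_a_left_bc("a.name = bc.sex")  # only a.id = 2 finds matches
lower = count_a_left_bc("1 = 0")             # nothing matches: m NULL-extended rows

# (A, B) LEFT JOIN C with a condition that never holds: m * n rows survive.
ab_lower = conn.execute("""
    SELECT COUNT(*)
    FROM test_join_a a CROSS JOIN test_join_b b
    LEFT JOIN test_join_c c ON b.id > 5
""").fetchone()[0]

print(upper, middle, lower, ab_lower)  # 36 9 4 12
```

With m = 4, n = 3 and t = 3, the four counts correspond to m * n * t, an intermediate case, m, and m * n.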
    
The experiments below use table a with 4 rows, table b with 3 rows and table c with 3 rows:

  1. With ON 1=1, the result has 4 * 3 * 3 = 36 rows

mysql> select a.id as aid, a.name, b.id as bid, b.sex, c.id as cid, c.class from test_join_a a left join (test_join_b b, test_join_c c) on 1=1;
+-----+-------+------+------+------+-------+
| aid | name | bid | sex | cid | class |
+-----+-------+------+------+------+-------+
| 1 | NULL | 1 | NULL | 1 | NULL |
| 1 | NULL | 1 | NULL | 2 | abc |
| 1 | NULL | 1 | NULL | 3 | abc |
| 1 | NULL | 2 | abc | 1 | NULL |
| 1 | NULL | 2 | abc | 2 | abc |
| 1 | NULL | 2 | abc | 3 | abc |
| 1 | NULL | 3 | abc | 1 | NULL |
| 1 | NULL | 3 | abc | 2 | abc |
| 1 | NULL | 3 | abc | 3 | abc |
| 2 | abc | 1 | NULL | 1 | NULL |
| 2 | abc | 1 | NULL | 2 | abc |
| 2 | abc | 1 | NULL | 3 | abc |
| 2 | abc | 2 | abc | 1 | NULL |
| 2 | abc | 2 | abc | 2 | abc |
| 2 | abc | 2 | abc | 3 | abc |
| 2 | abc | 3 | abc | 1 | NULL |
| 2 | abc | 3 | abc | 2 | abc |
| 2 | abc | 3 | abc | 3 | abc |
| 3 | abcd | 1 | NULL | 1 | NULL |
| 3 | abcd | 1 | NULL | 2 | abc |
| 3 | abcd | 1 | NULL | 3 | abc |
| 3 | abcd | 2 | abc | 1 | NULL |
| 3 | abcd | 2 | abc | 2 | abc |
| 3 | abcd | 2 | abc | 3 | abc |
| 3 | abcd | 3 | abc | 1 | NULL |
| 3 | abcd | 3 | abc | 2 | abc |
| 3 | abcd | 3 | abc | 3 | abc |
| 4 | abcde | 1 | NULL | 1 | NULL |
| 4 | abcde | 1 | NULL | 2 | abc |
| 4 | abcde | 1 | NULL | 3 | abc |
| 4 | abcde | 2 | abc | 1 | NULL |
| 4 | abcde | 2 | abc | 2 | abc |
| 4 | abcde | 2 | abc | 3 | abc |
| 4 | abcde | 3 | abc | 1 | NULL |
| 4 | abcde | 3 | abc | 2 | abc |
| 4 | abcde | 3 | abc | 3 | abc |
+-----+-------+------+------+------+-------+
36 rows in set (0.00 sec)

 

  2. Only rows that satisfy the condition are attached; where nothing matches, the columns are set to NULL. See aid = 1, 3 and 4 in the result below.

mysql> select a.id as aid, a.name, b.id as bid, b.sex, c.id as cid, c.class from test_join_a a left join (test_join_b b, test_join_c c) on a.name=b.sex;
+-----+-------+------+------+------+-------+
| aid | name | bid | sex | cid | class |
+-----+-------+------+------+------+-------+
| 1 | NULL | NULL | NULL | NULL | NULL |
| 2 | abc | 2 | abc | 1 | NULL |
| 2 | abc | 2 | abc | 2 | abc |
| 2 | abc | 2 | abc | 3 | abc |
| 2 | abc | 3 | abc | 1 | NULL |
| 2 | abc | 3 | abc | 2 | abc |
| 2 | abc | 3 | abc | 3 | abc |
| 3 | abcd | NULL | NULL | NULL | NULL |
| 4 | abcde | NULL | NULL | NULL | NULL |
+-----+-------+------+------+------+-------+
9 rows in set (0.00 sec)

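  3. For (A, B) LEFT JOIN C ON join_condition, the Cartesian product of A and B is the base. For each of its rows, the rows of C that satisfy ON join_condition are looked up; if there are none, the columns of C are set to NULL for that row.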

mysql> select a.id as aid, a.name, b.id as bid, b.sex, c.id as cid, c.class from (test_join_a a, test_join_b b) left join  test_join_c c on b.id>5;
+-----+-------+-----+------+------+-------+
| aid | name | bid | sex | cid | class |
+-----+-------+-----+------+------+-------+
| 1 | NULL | 1 | NULL | NULL | NULL |
| 1 | NULL | 2 | abc | NULL | NULL |
| 1 | NULL | 3 | abc | NULL | NULL |
| 2 | abc | 1 | NULL | NULL | NULL |
| 2 | abc | 2 | abc | NULL | NULL |
| 2 | abc | 3 | abc | NULL | NULL |
| 3 | abcd | 1 | NULL | NULL | NULL |
| 3 | abcd | 2 | abc | NULL | NULL |
| 3 | abcd | 3 | abc | NULL | NULL |
| 4 | abcde | 1 | NULL | NULL | NULL |
| 4 | abcde | 2 | abc | NULL | NULL |
| 4 | abcde | 3 | abc | NULL | NULL |
+-----+-------+-----+------+------+-------+
12 rows in set (0.00 sec)

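  4. Logically, the ON condition is evaluated before WHERE, and the ON condition produces a virtual table that WHERE then filters. In the plan below, the WHERE condition b.id>5 discards every NULL-extended row, which is why the optimizer is able to read table b first.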

mysql> explain select a.id as aid, a.name, b.id as bid, b.sex, c.id as cid, c.class from test_join_a a left join (test_join_b b, test_join_c c) on b.id>5 where b.id>5;
+----+-------------+-------+-------+---------------+---------+---------+------+------+-------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+-------+---------------+---------+---------+------+------+-------------------+
| 1 | SIMPLE | b | range | PRIMARY | PRIMARY | 8 | NULL | 1 | Using where |
| 1 | SIMPLE | c | ALL | NULL | NULL | NULL | NULL | 3 | Using join buffer |
| 1 | SIMPLE | a | ALL | NULL | NULL | NULL | NULL | 4 | Using join buffer |
+----+-------------+-------+-------+---------------+---------+---------+------+------+-------------------+
3 rows in set (0.00 sec)

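  5. WHERE filtering of the virtual table is not the same as the ON condition. The ON condition keeps every row of the base table and sets the additional-table columns to NULL where nothing matches, whereas WHERE simply keeps the rows that satisfy the WHERE clause. (The result of the third query below shows that WHERE operates on the virtual table produced by the ON condition.)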

mysql> select a.id as aid, a.name, b.id as bid, b.sex, c.id as cid, c.class from test_join_a a left join (test_join_b b, test_join_c c) on b.id>5;
+-----+-------+------+------+------+-------+
| aid | name | bid | sex | cid | class |
+-----+-------+------+------+------+-------+
| 1 | NULL | NULL | NULL | NULL | NULL |
| 2 | abc | NULL | NULL | NULL | NULL |
| 3 | abcd | NULL | NULL | NULL | NULL |
| 4 | abcde | NULL | NULL | NULL | NULL |
+-----+-------+------+------+------+-------+
4 rows in set (0.00 sec)

mysql> select a.id as aid, a.name, b.id as bid, b.sex, c.id as cid, c.class from test_join_a a left join (test_join_b b, test_join_c c) on b.id>5 where b.id>5;
Empty set (0.00 sec)

mysql> select a.id as aid, a.name, b.id as bid, b.sex, c.id as cid, c.class from test_join_a a left join (test_join_b b, test_join_c c) ON a.name=b.sex where b.sex is NULL;
+-----+-------+------+------+------+-------+
| aid | name  | bid  | sex  | cid  | class |
+-----+-------+------+------+------+-------+
|   1 | NULL  | NULL | NULL | NULL | NULL  |
|   3 | abcd  | NULL | NULL | NULL | NULL  |
|   4 | abcde | NULL | NULL | NULL | NULL  |
+-----+-------+------+------+------+-------+
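The difference between ON and WHERE in an outer join can be reproduced in a few lines. This is a minimal sketch under stated assumptions: it runs on SQLite through Python's sqlite3 module rather than on MySQL, and it joins only test_join_a and test_join_b (table c is left out to keep the output short), so the row counts differ from the three-table queries above while the behavior is the same.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for MySQL (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE test_join_a (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE test_join_b (id INTEGER PRIMARY KEY, sex TEXT);
INSERT INTO test_join_a VALUES (1, NULL), (2, 'abc'), (3, 'abcd'), (4, 'abcde');
INSERT INTO test_join_b VALUES (1, NULL), (2, 'abc'), (3, 'abc');
""")

def run(sql):
    """Execute a query and return all rows as a list of tuples."""
    return conn.execute(sql).fetchall()

# Condition in ON only: every row of a is kept, b columns become NULL.
on_only = run("""
    SELECT a.id, b.id
    FROM test_join_a a LEFT JOIN test_join_b b ON b.id > 5
    ORDER BY a.id
""")

# Same condition repeated in WHERE: the NULL-extended rows are filtered out.
on_and_where = run("""
    SELECT a.id, b.id
    FROM test_join_a a LEFT JOIN test_join_b b ON b.id > 5
    WHERE b.id > 5
""")

# WHERE ... IS NULL keeps exactly the rows of a that found no match in b.
unmatched = run("""
    SELECT a.id
    FROM test_join_a a LEFT JOIN test_join_b b ON a.name = b.sex
    WHERE b.sex IS NULL
    ORDER BY a.id
""")

print(on_only)       # [(1, None), (2, None), (3, None), (4, None)]
print(on_and_where)  # []
print(unmatched)     # [(1,), (3,), (4,)]
```

The last query is the usual way to find base-table rows without a partner: the ON condition guarantees that matched rows have a non-NULL b.sex, so b.sex IS NULL can only be true for the NULL-extended rows.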

  6. NULL=NULL satisfies neither a JOIN's ON condition nor the filter of a WHERE clause

mysql> select a.id as aid, a.name, b.id as bid, b.sex, c.id as cid, c.class from test_join_a a left join (test_join_b b, test_join_c c) ON a.name=b.sex;
+-----+-------+------+------+------+-------+
| aid | name | bid | sex | cid | class |
+-----+-------+------+------+------+-------+
| 1 | NULL | NULL | NULL | NULL | NULL |
| 2 | abc | 2 | abc | 1 | NULL |
| 2 | abc | 2 | abc | 2 | abc |
| 2 | abc | 2 | abc | 3 | abc |
| 2 | abc | 3 | abc | 1 | NULL |
| 2 | abc | 3 | abc | 2 | abc |
| 2 | abc | 3 | abc | 3 | abc |
| 3 | abcd | NULL | NULL | NULL | NULL |
| 4 | abcde | NULL | NULL | NULL | NULL |
+-----+-------+------+------+------+-------+
9 rows in set (0.00 sec)

mysql> select a.id as aid, a.name, b.id as bid, b.sex, c.id as cid, c.class from test_join_a a left join (test_join_b b, test_join_c c) ON a.name=b.sex where a.name=b.sex;
+-----+------+------+------+------+-------+
| aid | name | bid | sex | cid | class |
+-----+------+------+------+------+-------+
| 2 | abc | 2 | abc | 1 | NULL |
| 2 | abc | 3 | abc | 1 | NULL |
| 2 | abc | 2 | abc | 2 | abc |
| 2 | abc | 3 | abc | 2 | abc |
| 2 | abc | 2 | abc | 3 | abc |
| 2 | abc | 3 | abc | 3 | abc |
+-----+------+------+------+------+-------+
6 rows in set (0.00 sec)
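To see why the rows with NULL on both sides never pair up, it may help to evaluate the comparison directly. The sketch below is an illustration on SQLite (via Python's sqlite3), not on MySQL. One deliberate difference is flagged in the comments: SQLite spells a NULL-safe comparison as IS, whereas MySQL's counterpart is the <=> operator; MySQL's own IS only accepts NULL and boolean values on the right-hand side.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for MySQL (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE test_join_a (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE test_join_b (id INTEGER PRIMARY KEY, sex TEXT);
INSERT INTO test_join_a VALUES (1, NULL), (2, 'abc'), (3, 'abcd'), (4, 'abcde');
INSERT INTO test_join_b VALUES (1, NULL), (2, 'abc'), (3, 'abc');
""")

# NULL = NULL evaluates to NULL (None in Python), which is not true.
null_eq = conn.execute("SELECT NULL = NULL").fetchone()[0]

# Plain equality in ON: a.id = 1 and b.id = 1 both hold NULL, yet do not match.
plain = conn.execute("""
    SELECT a.id, b.id
    FROM test_join_a a JOIN test_join_b b ON a.name = b.sex
    ORDER BY a.id, b.id
""").fetchall()

# NULL-safe comparison. SQLite syntax: IS. In MySQL this would be a.name <=> b.sex.
null_safe = conn.execute("""
    SELECT a.id, b.id
    FROM test_join_a a JOIN test_join_b b ON a.name IS b.sex
    ORDER BY a.id, b.id
""").fetchall()

print(null_eq)    # None
print(plain)      # [(2, 2), (2, 3)]
print(null_safe)  # [(1, 1), (2, 2), (2, 3)]
```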

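b. About RIGHT JOIN

  RIGHT JOIN is similar to LEFT JOIN. The differences are:
  1. For A RIGHT JOIN (B, C) ON join_condition, B and C are the base tables and A is the additional table
  2. For (B, C) RIGHT JOIN A ON join_condition, A is the base table and B and C are the additional tables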

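c. About JOIN

JOIN differs from LEFT JOIN and RIGHT JOIN in that it does not single out a base table on the left or the right. Its ON condition can be thought of as degenerating into a WHERE clause.
For example, the following SQL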

select a.id as aid, a.name, b.id as bid, b.sex, c.id as cid, c.class from (test_join_a a ,test_join_b b) join test_join_c c ON a.name=b.sex;

produces the same result as

select a.id as aid, a.name, b.id as bid, b.sex, c.id as cid, c.class from (test_join_a a ,test_join_b b) join test_join_c c ON  1=1 where a.name=b.sex;

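Likewise, A JOIN (B, C) is equivalent to (A, B) JOIN C.

The parsing of JOIN can be understood as follows: take the Cartesian product of tables A, B and C to build a new virtual table, turn the ON condition into a WHERE condition, combine it with any WHERE clause that follows, and filter to get the result.

Let's look at the experiment: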

mysql> explain select a.id as aid, a.name, b.id as bid, b.sex, c.id as cid, c.class from (test_join_a a ,test_join_b b) join test_join_c c ON a.name=b.sex;
+----+-------------+-------+------+---------------+------+---------+------+------+--------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+------+---------------+------+---------+------+------+--------------------------------+
| 1 | SIMPLE | b | ALL | NULL | NULL | NULL | NULL | 3 | |
| 1 | SIMPLE | c | ALL | NULL | NULL | NULL | NULL | 3 | Using join buffer |
| 1 | SIMPLE | a | ALL | NULL | NULL | NULL | NULL | 4 | Using where; Using join buffer |
+----+-------------+-------+------+---------------+------+---------+------+------+--------------------------------+
3 rows in set (0.00 sec)

mysql> explain select a.id as aid, a.name, b.id as bid, b.sex, c.id as cid, c.class from (test_join_a a ,test_join_b b) join test_join_c c ON 1=1 where a.name=b.sex;
+----+-------------+-------+------+---------------+------+---------+------+------+--------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+------+---------------+------+---------+------+------+--------------------------------+
| 1 | SIMPLE | b | ALL | NULL | NULL | NULL | NULL | 3 | |
| 1 | SIMPLE | c | ALL | NULL | NULL | NULL | NULL | 3 | Using join buffer |
| 1 | SIMPLE | a | ALL | NULL | NULL | NULL | NULL | 4 | Using where; Using join buffer |
+----+-------------+-------+------+---------------+------+---------+------+------+--------------------------------+
3 rows in set (0.00 sec)

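As shown, the EXPLAIN output of the two statements is identical, so they share the same execution plan.

Now let's look at the results of the two statements.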

mysql> select a.id as aid, a.name, b.id as bid, b.sex, c.id as cid, c.class from (test_join_a a ,test_join_b b) join test_join_c c ON a.name=b.sex;
+-----+------+-----+------+-----+-------+
| aid | name | bid | sex | cid | class |
+-----+------+-----+------+-----+-------+
| 2 | abc | 2 | abc | 1 | NULL |
| 2 | abc | 3 | abc | 1 | NULL |
| 2 | abc | 2 | abc | 2 | abc |
| 2 | abc | 3 | abc | 2 | abc |
| 2 | abc | 2 | abc | 3 | abc |
| 2 | abc | 3 | abc | 3 | abc |
+-----+------+-----+------+-----+-------+
6 rows in set (0.00 sec)

mysql> select a.id as aid, a.name, b.id as bid, b.sex, c.id as cid, c.class from (test_join_a a ,test_join_b b) join test_join_c c ON 1=1 where a.name=b.sex;
+-----+------+-----+------+-----+-------+
| aid | name | bid | sex | cid | class |
+-----+------+-----+------+-----+-------+
| 2 | abc | 2 | abc | 1 | NULL |
| 2 | abc | 3 | abc | 1 | NULL |
| 2 | abc | 2 | abc | 2 | abc |
| 2 | abc | 3 | abc | 2 | abc |
| 2 | abc | 2 | abc | 3 | abc |
| 2 | abc | 3 | abc | 3 | abc |
+-----+------+-----+------+-----+-------+
6 rows in set (0.00 sec)
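The equivalence of the two inner-join forms can also be checked programmatically. The following is a sketch on SQLite (via Python's sqlite3), not the MySQL session shown above. As an assumption made for portability, MySQL's parenthesized list (test_join_a a ,test_join_b b) is written as an explicit CROSS JOIN, and an ORDER BY is added so the two result lists can be compared directly.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for MySQL (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE test_join_a (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE test_join_b (id INTEGER PRIMARY KEY, sex TEXT);
CREATE TABLE test_join_c (id INTEGER PRIMARY KEY, class TEXT);
INSERT INTO test_join_a VALUES (1, NULL), (2, 'abc'), (3, 'abcd'), (4, 'abcde');
INSERT INTO test_join_b VALUES (1, NULL), (2, 'abc'), (3, 'abc');
INSERT INTO test_join_c VALUES (1, NULL), (2, 'abc'), (3, 'abc');
""")

# Condition written in the ON clause of the inner join.
on_form = conn.execute("""
    SELECT a.id, b.id, c.id
    FROM test_join_a a CROSS JOIN test_join_b b
    JOIN test_join_c c ON a.name = b.sex
    ORDER BY 1, 2, 3
""").fetchall()

# Same condition moved to WHERE, with a trivially true ON clause.
where_form = conn.execute("""
    SELECT a.id, b.id, c.id
    FROM test_join_a a CROSS JOIN test_join_b b
    JOIN test_join_c c ON 1 = 1
    WHERE a.name = b.sex
    ORDER BY 1, 2, 3
""").fetchall()

print(on_form == where_form)  # True
print(len(on_form))           # 6
```

This holds for inner joins only. For LEFT JOIN and RIGHT JOIN, moving a condition from ON to WHERE can change the result, as experiment 5 in the LEFT JOIN section shows.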