Optimizing MySQL Queries with User-Defined Variables

Date: 2024-07-14 20:38:08

User-defined variables are an easily forgotten MySQL feature, but used well they make it possible to write very efficient queries in many situations.

1. Adding a row-number column alongside actor_id

mysql> set @rownum := 0;
Query OK, 0 rows affected (0.00 sec)

mysql> select actor_id, @rownum := @rownum + 1 as rownum
    -> from sakila.actor limit 3;
+----------+--------+
| actor_id | rownum |
+----------+--------+
| 58 | 1 |
| 92 | 2 |
| 182 | 3 |
+----------+--------+
3 rows in set (0.00 sec)
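
On MySQL 8.0 and later, the same numbering can also be expressed with a window function instead of a user variable. The query below is a minimal sketch of that alternative, not the technique this article describes. It assumes MySQL 8.0+ and orders explicitly by actor_id, which the variable-based query above does not do.

```sql
-- Alternative sketch (assumes MySQL 8.0+): ROW_NUMBER() assigns 1, 2, 3, ...
-- in the order given by the OVER clause, with no session state involved.
SELECT actor_id,
       ROW_NUMBER() OVER (ORDER BY actor_id) AS rownum
FROM sakila.actor
ORDER BY actor_id
LIMIT 3;
```

Because the ordering is stated explicitly, the numbering does not depend on which index the optimizer happens to scan.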

2. Extending the idea: rank the ten actors who have appeared in the most films by film count, giving actors with the same count the same rank

mysql> set @curr_cnt := 0, @prev_cnt := 0, @rank := 0;
Query OK, 0 rows affected (0.00 sec)

mysql> select actor_id,
    -> @prev_cnt := @curr_cnt as dummy,
    -> @curr_cnt := cnt as cnt,
    -> @rank := IF(@prev_cnt <> @curr_cnt, @rank + 1, @rank) as rank
    -> FROM (
    -> SELECT actor_id, count(*) as cnt
    -> FROM sakila.film_actor
    -> GROUP BY actor_id
    -> ORDER BY cnt DESC
    -> LIMIT 10
    -> ) as der;
+----------+-------+-----+------+
| actor_id | dummy | cnt | rank |
+----------+-------+-----+------+
| 107 | 0 | 42 | 1 |
| 102 | 42 | 41 | 2 |
| 198 | 41 | 40 | 3 |
| 181 | 40 | 39 | 4 |
| 23 | 39 | 37 | 5 |
| 81 | 37 | 36 | 6 |
| 158 | 36 | 35 | 7 |
| 144 | 35 | 35 | 7 |
| 37 | 35 | 35 | 7 |
| 106 | 35 | 35 | 7 |
+----------+-------+-----+------+
10 rows in set (0.00 sec)
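
The ranking rule used above (ties share a rank, and the next distinct count gets the next rank) is what SQL calls a dense rank. As a hedged alternative, assuming MySQL 8.0 or later, the same result can be sketched with the DENSE_RANK() window function. The alias cnt_rank is chosen here only because RANK is a reserved word in MySQL 8.0.

```sql
-- Alternative sketch (assumes MySQL 8.0+), not the variable-based method above.
-- DENSE_RANK() gives equal counts the same rank and leaves no gaps.
SELECT actor_id,
       cnt,
       DENSE_RANK() OVER (ORDER BY cnt DESC) AS cnt_rank
FROM (
    SELECT actor_id, COUNT(*) AS cnt
    FROM sakila.film_actor
    GROUP BY actor_id
    ORDER BY cnt DESC
    LIMIT 10
) AS der
ORDER BY cnt DESC;
```

This version needs no helper columns such as dummy, since no previous-row value has to be carried in a variable.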

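3. Avoiding a second query for data that was just updated

Suppose we want to update a row's timestamp efficiently and also get back the value that was written, without querying the table again: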

mysql> create table t2 (id int, lastUpdated datetime);
Query OK, 0 rows affected (0.03 sec)

mysql> insert into t2 (id, lastUpdated) values (1, sysdate());
Query OK, 1 row affected (0.02 sec)

mysql> select * from t2;
+------+---------------------+
| id | lastUpdated |
+------+---------------------+
| 1 | 2017-07-24 16:03:34 |
+------+---------------------+
1 row in set (0.01 sec)

mysql> update t2 set lastUpdated = NOW() WHERE id = 1 and @now := NOW();
Query OK, 1 row affected (0.02 sec)
Rows matched: 1 Changed: 1 Warnings: 0

mysql> select @now, sysdate();
+---------------------+---------------------+
| @now | sysdate() |
+---------------------+---------------------+
| 2017-07-24 16:05:42 | 2017-07-24 16:06:06 |
+---------------------+---------------------+
1 row in set (0.00 sec)

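4. Counting updated and inserted rows
When using INSERT ... ON DUPLICATE KEY UPDATE, find out how many rows were inserted and how many hit an existing key and were updated instead: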

mysql> set @x := 0;
Query OK, 0 rows affected (0.00 sec)

mysql> INSERT INTO t3(c1,c2) values(1,2),(1,3),(2,2)
    -> ON DUPLICATE KEY UPDATE
    -> c2=VALUES(c2)+(0*(@x:=@x+1));
Query OK, 4 rows affected (0.01 sec)
Records: 3 Duplicates: 1 Warnings: 0

mysql> select @x;
+------+
| @x |
+------+
| 1 |
+------+
1 row in set (0.00 sec)

mysql> select * from t3;
+----+------+
| c1 | c2 |
+----+------+
| 1 | 3 |
| 2 | 2 |
+----+------+
2 rows in set (0.00 sec)
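
The article does not show how t3 was created, and the transcript only reads back @x. The sketch below fills in both gaps under stated assumptions: the table definition (a primary key on c1) is hypothetical, chosen so that the second tuple collides with the first, and the insert count is derived by subtracting @x from the number of tuples supplied.

```sql
-- Assumed definition: the article does not show the real schema of t3.
-- A unique key on c1 is what makes (1,3) collide with (1,2).
CREATE TABLE t3 (
    c1 INT NOT NULL PRIMARY KEY,
    c2 INT
);

SET @x := 0;

-- The term 0 * (@x := @x + 1) adds nothing to c2; it exists only so that
-- @x is incremented each time the UPDATE branch runs for a duplicate key.
INSERT INTO t3 (c1, c2) VALUES (1, 2), (1, 3), (2, 2)
ON DUPLICATE KEY UPDATE
    c2 = VALUES(c2) + (0 * (@x := @x + 1));

-- Three tuples were supplied; @x counts those that hit an existing key,
-- so the remainder were plain inserts.
SELECT @x AS updated_rows, 3 - @x AS inserted_rows;
```

Counting from the number of supplied tuples avoids relying on the "rows affected" figure, which counts an updated row differently from an inserted one.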

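5. Controlling the order of evaluation
Suppose we want to fetch a single row from sakila.actor.

Incorrect query 1:
The query below looks as if it should return only one row, but in practice: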

mysql> set @row_num := 0;
Query OK, 0 rows affected (0.00 sec)

mysql> SELECT actor_id, @row_num := @row_num + 1 AS cnt
    -> FROM sakila.actor
    -> WHERE @row_num <= 1;
+----------+------+
| actor_id | cnt |
+----------+------+
| 58 | 1 |
| 92 | 2 |
+----------+------+
2 rows in set (0.00 sec)

The execution plan:
+----+-------------+-------+-------+---------------+---------------------+---------+------+------+--------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+-------+---------------+---------------------+---------+------+------+--------------------------+
| 1 | SIMPLE | actor | index | NULL | idx_actor_last_name | 137 | NULL | 200 | Using where; Using index |
+----+-------------+-------+-------+---------------+---------------------+---------+------+------+--------------------------+
1 row in set (0.00 sec)

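This happens because WHERE and SELECT are evaluated at different stages of query execution. When the WHERE clause checks the second row, @row_num is still 1 (set by the SELECT list for the first row), so the row passes, and only afterwards does the SELECT list raise the variable to 2.

Incorrect query 2:
What happens if we also sort by first_name?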

mysql> set @row_num := 0;
Query OK, 0 rows affected (0.00 sec)

mysql> SELECT actor_id, @row_num := @row_num + 1 AS cnt
    -> FROM sakila.actor
    -> WHERE @row_num <= 1
    -> order by first_name;
+----------+------+
| actor_id | cnt |
+----------+------+
| 71 | 1 |
| 132 | 2 |
| 165 | 3 |
| 173 | 4 |
| 125 | 5 |
| 146 | 6 |
| 29 | 7 |
| 65 | 8 |
| 144 | 9 |
| 76 | 10 |
| 49 | 11 |
| 34 | 12 |
| 190 | 13 |
| 196 | 14 |
| 83 | 15 |
...

Every row is returned. The execution plan:
+----+-------------+-------+------+---------------+------+---------+------+------+-----------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+------+---------------+------+---------+------+------+-----------------------------+
| 1 | SIMPLE | actor | ALL | NULL | NULL | NULL | NULL | 200 | Using where; Using filesort |
+----+-------------+-------+------+---------------+------+---------+------+------+-----------------------------+
1 row in set (0.00 sec)

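The plan shows why: the WHERE filter (Using where) is evaluated before the sort (Using filesort), while the assignment in the SELECT list runs only after it. @row_num is therefore still 0 for every row the WHERE clause checks, and all rows are output.

The fix is to make the assignment and the read of the variable happen in the same stage of query execution:

Correct query: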

mysql> set @row_num := 0;
Query OK, 0 rows affected (0.00 sec)

mysql> SELECT actor_id, @row_num AS cnt
    -> FROM sakila.actor
    -> WHERE (@row_num := @row_num + 1) <= 1;
+----------+------+
| actor_id | cnt |
+----------+------+
| 58 | 1 |
+----------+------+
1 row in set (0.00 sec)

The execution plan:
+----+-------------+-------+-------+---------------+---------------------+---------+------+------+--------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+-------+---------------+---------------------+---------+------+------+--------------------------+
| 1 | SIMPLE | actor | index | NULL | idx_actor_last_name | 137 | NULL | 200 | Using where; Using index |
+----+-------------+-------+-------+---------------+---------------------+---------+------+------+--------------------------+
1 row in set (0.00 sec)

Now consider: how should this be written if an ORDER BY is added?

mysql> set @row_num := 0;
Query OK, 0 rows affected (0.00 sec)

mysql> SELECT actor_id, first_name, @row_num AS row_num
    -> FROM sakila.actor
    -> WHERE @row_num <= 1
    -> ORDER BY first_name, least(0, @row_num := @row_num + 1);
+----------+------------+---------+
| actor_id | first_name | row_num |
+----------+------------+---------+
| 2 | NICK | 2 |
| 1 | PENELOPE | 1 |
+----------+------------+---------+
2 rows in set (0.00 sec)

mysql> select @row_num;
+----------+
| @row_num |
+----------+
| 2 |
+----------+
1 row in set (0.00 sec)

The execution plan:
+----+-------------+-------+------+---------------+------+---------+------+------+----------------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+------+---------------+------+---------+------+------+----------------------------------------------+
| 1 | SIMPLE | actor | ALL | NULL | NULL | NULL | NULL | 200 | Using where; Using temporary; Using filesort |
+----+-------------+-------+------+---------------+------+---------+------+------+----------------------------------------------+
1 row in set (0.00 sec)

SELECT actor_id,first_name ,@row_num:=@row_num+1 AS row_num
FROM sakila.actor
WHERE @row_num<=1
ORDER BY first_name , least(0, @row_num :=@row_num+1)
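
Since the result of the variable-based queries in this section depends on the stage at which each clause is evaluated, it can help to compare them with a formulation where the order is stated explicitly. The following is a minimal sketch of such an alternative, assuming MySQL 8.0 or later. It is not the approach this article describes, and the actor_id tiebreaker is an assumption added so that actors sharing a first_name are numbered deterministically.

```sql
-- Alternative sketch (assumes MySQL 8.0+). Window functions are computed
-- after WHERE, so the numbering is done in a derived table and the filter
-- is applied to its result in the outer query.
SELECT actor_id, first_name, row_num
FROM (
    SELECT actor_id,
           first_name,
           ROW_NUMBER() OVER (ORDER BY first_name, actor_id) AS row_num
    FROM sakila.actor
) AS numbered
WHERE row_num <= 1
ORDER BY row_num;
```

Here the numbering follows the sort order by construction, so the filter keeps the first row by first_name rather than the first row the scan happens to reach.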

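6. A clever rewrite of UNION

Suppose there are two user tables: a main table holding active users and an archive table holding users who have been inactive for a long time. We need to find the customer with id 123.
Consider this statement first: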

 select id from users where id= 123
union all
select id from users_archived where id =123

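The statement above works, but it is inefficient: both tables are always queried, even when the row has already been found in users.

A rewrite using a user-defined variable: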

 SELECT GREATEST(@found := -1, id) AS id, 'users' AS which_tbl
FROM users WHERE id = 123
UNION ALL
SELECT id, 'users_archived' FROM users_archived WHERE id = 123 AND @found IS NULL
UNION ALL
SELECT 1, 'reset' FROM DUAL WHERE (@found := NULL) IS NOT NULL

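This rewrite is quite clever:
In the first branch, if a row is found in users, @found is assigned a value and the second branch is not queried; if no row is found, @found remains NULL and the second branch runs.
The third branch produces no output; it simply resets @found to NULL. GREATEST(@found := -1, id) does not affect the output either, because it returns id whenever id is greater than -1.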
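
To try the rewrite out, a small test setup along the following lines may be useful. It is only a sketch: the two table definitions and the sample ids are hypothetical (the article does not show the real schema), and whether the second branch is actually skipped depends on the server evaluating the UNION branches in order, which is worth confirming on your own MySQL version.

```sql
-- Hypothetical minimal tables; the article does not show the real schema.
CREATE TABLE users (id INT NOT NULL PRIMARY KEY);
CREATE TABLE users_archived (id INT NOT NULL PRIMARY KEY);

-- id 123 exists in both tables, id 456 only in the archive.
INSERT INTO users VALUES (123);
INSERT INTO users_archived VALUES (123), (456);

-- The rewrite relies on @found being NULL before the query starts.
SET @found := NULL;

-- Lookup for an id present in users: if the rewrite behaves as described,
-- only the 'users' row should come back.
SELECT GREATEST(@found := -1, id) AS id, 'users' AS which_tbl
FROM users WHERE id = 123
UNION ALL
SELECT id, 'users_archived' FROM users_archived WHERE id = 123 AND @found IS NULL
UNION ALL
SELECT 1, 'reset' FROM DUAL WHERE (@found := NULL) IS NOT NULL;

-- Lookup for an id present only in the archive: the first branch matches
-- nothing, @found stays NULL, and the second branch is free to run.
SELECT GREATEST(@found := -1, id) AS id, 'users' AS which_tbl
FROM users WHERE id = 456
UNION ALL
SELECT id, 'users_archived' FROM users_archived WHERE id = 456 AND @found IS NULL
UNION ALL
SELECT 1, 'reset' FROM DUAL WHERE (@found := NULL) IS NOT NULL;
```

Running both lookups back to back also exercises the third branch, since the second lookup only works if the first one left @found reset to NULL.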