1,准备测试数据
1.1 建库建表
mysql> create database hwdb;
Query OK, 1 row affected (0.34 sec)
mysql>
mysql> show create table t_huawei;
+----------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Table | Create Table |
+----------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| t_huawei | CREATE TABLE `t_huawei` (
`hwid` int(11) DEFAULT NULL,
`content` varchar(4000) DEFAULT NULL,
`TYPE` char(2) DEFAULT NULL,
`creator_id` varchar(30) DEFAULT NULL,
`create_time` datetime DEFAULT NULL
) ENGINE=InnoDB DEFAULT CHARSET=utf8 |
+----------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
1 row in set (0.00 sec)
mysql>
1.2 准备存储过程
DELIMITER $$
USE `hwdb`$$
DROP PROCEDURE IF EXISTS `pro_insert_hw`$$
CREATE DEFINER=`root`@`localhost` PROCEDURE`pro_insert_hw`(num INT)
BEGIN
DECLARE var1 INT DEFAULT 0;
WHILE var1<num DO
SET var1=var1+1;
INSERT INTO hwdb.t_huawei(hwid,content,TYPE,creator_id,create_time)
VALUES (var1,CONCAT(var1,' - ','Players supposedly said they''d avoidteam facility this summer if Thibodeau remained the '),
0,10013,NOW());
END WHILE;
END$$
DELIMITER ;
mysql> show create table t_huawei;
+----------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Table | Create Table |
+----------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| t_huawei | CREATE TABLE `t_huawei` (
`hwid` int(11) DEFAULT NULL,
`content` varchar(4000) DEFAULT NULL,
`TYPE` char(2) DEFAULT NULL,
`creator_id`varchar(30) DEFAULT NULL,
`create_time` datetime DEFAULT NULL
) ENGINE=InnoDB DEFAULT CHARSET=utf8 |
+----------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
1 row in set (0.00 sec)
mysql>
1.3 开始录入数据
因为要录入千万级别数据,所以用后台mysql进程来录入,如下所示:
nohup /usr/local/mysql/bin/mysql -uroot--password="" -S /usr/local/mysql/mysql.sock -e "select now();truncate table hwdb.t_huawei; call hwdb.pro_insert_hw(10000000); select now();" > /data/call.log &
这里因为测试环境机器性能有限,在录入到7023850条的时候,我终止了进程,开始准备测试工作:
mysql> SELECT COUNT(1) FROM `t_huawei` ;
+----------+
| COUNT(1) |
+----------+
| 7023850 |
+----------+
1 row in set (5.52 sec)
mysql>
2,开始测试
2.1 普通的ORDER BY RAND()测试很慢很慢,如下所示,需要块1分钟的时间
mysql> SELECT SQL_NO_CACHE * FROM `t_huawei` AS t1 ORDER BY RAND() LIMIT 5;
+---------+-----------------------------------------------------------------------------------------------------+------+------------+---------------------+
| hwid | content | TYPE |creator_id | create_time |
+---------+-----------------------------------------------------------------------------------------------------+------+------------+---------------------+
| 3743323 | 3743323 - Players supposedlysaid they'd avoid team facility this summer if Thibodeau remained the | 0 | 10013 | 2015-06-08 20:43:01|
| 2418491 | 2418491 - Players supposedlysaid they'd avoid team facility this summer if Thibodeau remained the | 0 | 10013 | 2015-06-08 20:19:31|
| 1224667 | 1224667 - Players supposedlysaid they'd avoid team facility this summer if Thibodeau remained the | 0 | 10013 | 2015-06-08 19:58:35|
| 1639270 | 1639270 - Players supposedlysaid they'd avoid team facility this summer if Thibodeau remained the | 0 | 10013 | 2015-06-08 20:05:50|
| 2756470 | 2756470 - Players supposedlysaid they'd avoid team facility this summer if Thibodeau remained the | 0 | 10013 | 2015-06-08 20:25:28|
+---------+-----------------------------------------------------------------------------------------------------+------+------------+---------------------+
5 rows in set (54.92 sec)
mysql>
原因是:MYSQL的官方手册,里面针对RAND()的提示大概意思就是,在ORDER BY从句里面不能使用RAND()函数,因为这样会导致数据列被多次扫描。但是在MYSQL 3.23版本中,仍然可以通过ORDER BY RAND()来实现随机。
2.2 在JOIN中使用RAND()* (SELECTMAX(hwid) 的方式来实现
mysql> SELECT SQL_NO_CACHE t1.* FROM`t_huawei` AS t1 LEFT JOIN(SELECT ROUND(RAND()* (SELECT MAX(hwid) FROM `t_huawei` )) AS id ) AS t2 ONt1.`hwid`>=t2.id ORDER BY t1.`hwid` ASC LIMIT 5;
+------+-----------------------------------------------------------------------------------------------+------+------------+---------------------+
| hwid | content | TYPE | creator_id | create_time |
+------+-----------------------------------------------------------------------------------------------+------+------------+---------------------+
| 1 | 1 - Players supposedly said they'd avoid team facility this summerif Thibodeau remained the | 0 | 10013 | 2015-06-08 19:37:02 |
| 2 | 2 - Players supposedly said they'd avoid team facility this summerif Thibodeau remained the | 0 | 10013 | 2015-06-08 19:37:02 |
| 3 | 3 - Players supposedly said they'd avoid team facility this summerif Thibodeau remained the | 0 | 10013 | 2015-06-08 19:37:02 |
| 4 | 4 - Players supposedly said they'd avoid team facility this summerif Thibodeau remained the | 0 | 10013 | 2015-06-08 19:37:02 |
| 5 | 5 - Players supposedly said they'd avoid team facility this summerif Thibodeau remained the | 0 | 10013 | 2015-06-08 19:37:02 |
+------+-----------------------------------------------------------------------------------------------+------+------------+---------------------+
5 rows in set (50.04 sec)
mysql>
需要50面,比起直接的order by来说少了4秒,但是还是比较耗时的。
2.3 WHERE子句的方式来优化
mysql> SELECT SQL_NO_CACHE t1.* FROMhwdb.`t_huawei` t1 WHERE t1.`hwid` >=(SELECT FLOOR(RAND() * (SELECTMAX(t2.hwid) FROM hwdb.`t_huawei` t2 ) )) ORDER BY t1.hwid LIMIT 5;
+------+--------------------------------------------------------------------------------------------------+------+------------+---------------------+
| hwid | content | TYPE | creator_id | create_time |
+------+--------------------------------------------------------------------------------------------------+------+------------+---------------------+
| 1624 | 1624 - Players supposedly saidthey'd avoid team facility this summer if Thibodeau remained the | 0 | 10013 | 2015-06-08 19:37:05|
| 6076 | 6076 - Players supposedly saidthey'd avoid team facility this summer if Thibodeau remained the | 0 | 10013 | 2015-06-08 19:37:11|
| 7988 | 7988 - Players supposedly saidthey'd avoid team facility this summer if Thibodeau remained the | 0 | 10013 | 2015-06-08 19:37:14|
| 8016 | 8016 - Players supposedly saidthey'd avoid team facility this summer if Thibodeau remained the | 0 | 10013 | 2015-06-08 19:37:14|
| 8106 | 8106 - Players supposedly saidthey'd avoid team facility this summer if Thibodeau remained the | 0 | 10013 | 2015-06-08 19:37:14|
+------+--------------------------------------------------------------------------------------------------+------+------------+---------------------+
5 rows in set (7.25 sec)
mysql>
花了7.25秒,比上一种要快的多很多了。
2.4 MAX(hwid)- MIN(hwid)的方式来RAND来操作
mysql> SELECT SQL_NO_CACHE * FROMhwdb.`t_huawei` WHERE hwid >= (SELECT FLOOR(RAND() * ((SELECT MAX(hwid) FROMhwdb.`t_huawei`) - (SELECT MIN(hwid) FROM hwdb.`t_huawei`)) + (SELECT MIN(hwid)FROM hwdb.`t_huawei`))) ORDER BY hwid LIMIT 5;
+------+--------------------------------------------------------------------------------------------------+------+------------+---------------------+
| hwid | content |TYPE | creator_id | create_time |
+------+--------------------------------------------------------------------------------------------------+------+------------+---------------------+
| 5438 | 5438 - Players supposedly saidthey'd avoid team facility this summer if Thibodeau remained the | 0 | 10013 | 2015-06-08 19:37:10|
| 5883 | 5883 - Players supposedly saidthey'd avoid team facility this summer if Thibodeau remained the | 0 | 10013 | 2015-06-08 19:37:10|
| 6782 | 6782 - Players supposedly saidthey'd avoid team facility this summer if Thibodeau remained the | 0 | 10013 | 2015-06-08 19:37:12|
| 7797 | 7797 - Players supposedly saidthey'd avoid team facility this summer if Thibodeau remained the | 0 | 10013 | 2015-06-08 19:37:13|
| 8625 | 8625 - Players supposedly saidthey'd avoid team facility this summer if Thibodeau remained the | 0 | 10013 | 2015-06-08 19:37:15|
+------+--------------------------------------------------------------------------------------------------+------+------------+---------------------+
5 rows in set (13.50 sec)
mysql>
MAX(hwid)- MIN(hwid)的方式来RAND来操作需要13.5秒,比where子句花时间要多一些。
3,总结
WHERE子句的方式效率最高,样例sql:SELECTSQL_NO_CACHE t1.* FROM hwdb.`t_huawei` t1 WHERE t1.`hwid` >=(SELECTFLOOR(RAND() * (SELECT MAX(t2.hwid) FROM hwdb.`t_huawei` t2 ) )) ORDER BYt1.hwid LIMIT 5;