查询优化 - 耗时太长并停止服务器

时间:2022-07-13 09:39:50

My query generates some reports about speeding, last time, and average speed. This is my query:

我的查询会生成一些关于超速,上次和平均速度的报告。这是我的查询:

Select 
    r1 . *, r2.name, r2.notes, r2.serial
From
    (SELECT 
        k.idgps_unit,
            MIN(k.dt) AS DT_Start,
            MIN(CASE
                WHEN k.RowNumber = 1 THEN k.Lat
            END) AS Latitude_Start,
            MIN(CASE
                WHEN k.RowNumber = 1 THEN k.Long
            END) AS Longitude_Start,
            MIN(CASE
                WHEN k.RowNumber = 1 THEN k.Speed_kmh
            END) AS Speed_Start,
            MAX(k.dt) AS dt_end,
            MIN(CASE
                WHEN k.RowNumber = MaxRowNo THEN k.Lat
            END) AS Latitude_End,
            MIN(CASE
                WHEN k.RowNumber = MaxRowNo THEN k.Long
            END) AS Longitude_End,
            MIN(CASE
                WHEN k.RowNumber = MaxRowNo THEN k.Speed_kmh
            END) AS Speed_End,
            AVG(Speed_kmh) AS Average_Speed
    FROM
        (SELECT 
        gps_unit_location . *,
            @i:=CASE
                WHEN Speed_Kmh > 80 AND @b = 0 THEN @i + 1
                ELSE @i
            END AS IntervalID,
            @r:=CASE
                WHEN Speed_Kmh > 80 AND @b = 0 THEN 1
                ELSE @r + 1
            END AS RowNumber,
            @b:=CASE
                WHEN Speed_Kmh > 80 THEN 1
                ELSE 0
            END AS IntervalCheck
    FROM
        gps_unit_location, (SELECT @i:=0) i, (SELECT @r:=0) r, (SELECT @b:=0) b
    ORDER BY dt , idgps_unit_location) k
    INNER JOIN (SELECT 
        IntervalID, MAX(RowNumber) AS MaxRowNo
    FROM
        (SELECT 
        gps_unit_location . *,
            @i:=CASE
                WHEN Speed_Kmh > 80 AND @b = 0 THEN @i + 1
                ELSE @i
            END AS IntervalID,
            @r:=CASE
                WHEN Speed_Kmh > 80 AND @b = 0 THEN 1
                ELSE @r + 1
            END AS RowNumber,
            @b:=CASE
                WHEN Speed_Kmh > 80 THEN 1
                ELSE 0
            END AS IntervalCheck
    FROM
        gps_unit_location, (SELECT @i:=0) i, (SELECT @r:=0) r, (SELECT @b:=0) b
    ORDER BY dt , idgps_unit_location) d
    WHERE
        IntervalCheck = 1
    GROUP BY IntervalID) MaxInt ON MaxInt.IntervalID = k.IntervalID
    WHERE
        k.IntervalCheck = 1
            and k.idgps_unit in (SELECT 
                idgps_unit
            FROM
                instafleet.gps_unit
            where
                id_customer = (select 
                        idcustomer
                    from
                        user
                    where
                        iduser = 14))
    GROUP BY k.IntervalID , k.idgps_unit) r1
        Inner join
    gps_unit r2 ON r1.idgps_unit = r2.idgps_unit

Currently it takes 3 minutes for 783,723 records. I am thinking that proper indexes might help; although after some trial and error, I can't figure it out. If you think you can help, and need some additional info - I will be happy you provide it to you.

目前,783,723条记录需要3分钟。我认为适当的索引可能会有所帮助;虽然经过一些试验和错误,我无法弄清楚。如果您认为自己可以提供帮助,并需要一些额外的信息 - 我很乐意为您提供帮助。

Explain 查询优化 - 耗时太长并停止服务器

Result 查询优化 - 耗时太长并停止服务器

3 个解决方案

#1


2  

Adding an index helps in many cases, but you have a subquery joining another subquery, no index on your current table can help you speed up. The only way you can use indexes here is to create temporary table.

在许多情况下,添加索引会有所帮助,但是您有一个子查询加入另一个子查询,当前表上没有索引可以帮助您加快速度。您可以在此处使用索引的唯一方法是创建临时表。

So as Markus pointed you need to break your query into a couple of smaller ones which store their results in a temporary table. Than you can add indexes to them and hopefully speedup your query. Another good thing about breaking big query into couple of smaller ones is that you can better profile which part is the slower one and fix it.

因此,Markus指出,您需要将查询分解为几个较小的查询,这些查询将结果存储在临时表中。您可以为它们添加索引,并希望加快查询速度。将大查询分解为几个较小的查询的另一个好处是,您可以更好地分析哪个部分是较慢的并修复它。

You have also used one subquery two times which is bad for performance as the result was not cached.

您还使用了两次子查询,这对性能有害,因为结果未缓存。

Here is an example of how you could do this:

以下是如何执行此操作的示例:

DROP TEMPORARY TABLE IF EXISTS tmp_k;
CREATE TEMPORARY TABLE tmp_k
    ENGINE=Memory
SELECT 
    gps_unit_location.*,
    @i:= IF(((Speed_Kmh > 80) AND (@b = 0)), @i + 1, @i) AS IntervalID,
    @r:= IF(((Speed_Kmh > 80) AND (@b = 0)), 1, @r + 1) AS RowNumber,
    @b:= IF((Speed_Kmh > 80), 1, 0) AS IntervalCheck
FROM
    gps_unit_location,
    (SELECT @i:=0) i, 
    (SELECT @r:=0) r, 
    (SELECT @b:=0) b
ORDER BY
    dt,
    idgps_unit_location;

ALTER TABLE tmp_k ADD INDEX (IntervalID);

DROP TEMPORARY TABLE IF EXISTS tmp_max;
CREATE TEMPORARY TABLE tmp_max
    ENGINE=Memory
SELECT 
    IntervalID, 
    MAX(RowNumber) AS MaxRowNo
FROM
    temp_k
WHERE
    IntervalCheck = 1
GROUP BY 
    IntervalID;

ALTER TABLE tmp_max ADD INDEX (IntervalID);

SELECT 
    k.idgps_unit,
    MIN(k.dt) AS DT_Start,
    MIN(IF(k.RowNumber = 1, k.Lat, NULL)) AS Latitude_Start,
    MIN(IF(k.RowNumber = 1, k.Long, NULL)) AS Longitude_Start,
    MIN(IF(k.RowNumber = 1, k.Speed_kmh, NULL) AS Speed_Start,
    MAX(k.dt) AS DT_End,
    MIN(IF(k.RowNumber = m.MaxRowNo, k.Lat, NULL)) AS Latitude_End
    MIN(IF(k.RowNumber = m.MaxRowNo, k.Long, NULL)) AS Longitude_End
    MIN(IF(k.RowNumber = m.MaxRowNo, k.Speed_kmh, NULL)) AS Speed_End,
    AVG(Speed_kmh) AS Average_Speed,
    gu.name,
    gu.notes,
    gu.serial
FROM
    tmp_k AS k
    INNER JOIN tmp_max AS m
        USING(IntervalID)
    INNER JOIN gps_unit AS gu
        USING(idgps_unit)
    INNER JOIN user AS u
    ON (gu.idcustomer = u.idcustomer)
WHERE
    (k.IntervalCheck = 1) 
     AND (u.iduser = 14)
GROUP BY 
    k.IntervalID, 
    k.idgps_unit;

DROP TEMPORARY TABLE tmp_k;
DROP TEMPORARY TABLE tmp_max;

#2


0  

If the size in bytes of the nested query(s) exceeds the size of the buffer pool (check innodb_buffer_pool_size) your query will take an extremely long time due to i/o paging.

如果嵌套查询的大小(以字节为单位)超过缓冲池的大小(请检查innodb_buffer_pool_size),由于i / o分页,查询将花费很长时间。

That said you can improve your performance with the following tips:

也就是说,您可以通过以下提示改善您的表现:

  • Select as little data as possible in the nested query
  • 在嵌套查询中选择尽可能少的数据

  • Increase the size of your buffer pool.
  • 增加缓冲池的大小。

#3


0  

My personal experience has shown that MySQL is rather bad at handling subqueries. The query optimizer of a database is a very intricate and delicious part of the database and commercial database vendors put much effort into it, so it is IMHO no wonder that MySQL performs rather poor when it comes to handling crazy SQL statements invented by even crazier developers ;-).

我的个人经验表明MySQL在处理子查询方面相当糟糕。数据库的查询优化器是数据库的一个非常复杂和美味的部分,商业数据库供应商付出了很多努力,所以恕我直言,毫无疑问MySQL在处理由更疯狂的开发人员发明的疯狂SQL语句时表现相当差;-)。

See here: http://dev.mysql.com/doc/refman/5.6/en/subquery-restrictions.html :

请参见此处:http://dev.mysql.com/doc/refman/5.6/en/subquery-restrictions.html:

The optimizer is more mature for joins than for subqueries, so in many cases a statement that uses a subquery can be executed more efficiently if you rewrite it as a join.

优化器对于连接比对子查询更成熟,因此在许多情况下,如果将其重写为连接,则可以更有效地执行使用子查询的语句。

If the official mysql docs from Oracle state something like "more mature", then you can rest assured that it is actually something akin to crap (no pun intended, but I already had my issues with MySQL and most of the larger statements that run flawlessly with a commercial database, would rather kill mysql).

如果来自Oracle的官方mysql文档声明类似“更成熟”的东西,那么你可以放心它实际上类似于废话(没有双关语意图,但我已经遇到了MySQL的问题以及大多数运行完美的大型语句用商业数据库,宁可杀掉mysql)。

So the task is: rewrite it using JOINs....

所以任务是:使用JOIN重写它....

#1


2  

Adding an index helps in many cases, but you have a subquery joining another subquery, no index on your current table can help you speed up. The only way you can use indexes here is to create temporary table.

在许多情况下,添加索引会有所帮助,但是您有一个子查询加入另一个子查询,当前表上没有索引可以帮助您加快速度。您可以在此处使用索引的唯一方法是创建临时表。

So as Markus pointed you need to break your query into a couple of smaller ones which store their results in a temporary table. Than you can add indexes to them and hopefully speedup your query. Another good thing about breaking big query into couple of smaller ones is that you can better profile which part is the slower one and fix it.

因此,Markus指出,您需要将查询分解为几个较小的查询,这些查询将结果存储在临时表中。您可以为它们添加索引,并希望加快查询速度。将大查询分解为几个较小的查询的另一个好处是,您可以更好地分析哪个部分是较慢的并修复它。

You have also used one subquery two times which is bad for performance as the result was not cached.

您还使用了两次子查询,这对性能有害,因为结果未缓存。

Here is an example of how you could do this:

以下是如何执行此操作的示例:

DROP TEMPORARY TABLE IF EXISTS tmp_k;
CREATE TEMPORARY TABLE tmp_k
    ENGINE=Memory
SELECT 
    gps_unit_location.*,
    @i:= IF(((Speed_Kmh > 80) AND (@b = 0)), @i + 1, @i) AS IntervalID,
    @r:= IF(((Speed_Kmh > 80) AND (@b = 0)), 1, @r + 1) AS RowNumber,
    @b:= IF((Speed_Kmh > 80), 1, 0) AS IntervalCheck
FROM
    gps_unit_location,
    (SELECT @i:=0) i, 
    (SELECT @r:=0) r, 
    (SELECT @b:=0) b
ORDER BY
    dt,
    idgps_unit_location;

ALTER TABLE tmp_k ADD INDEX (IntervalID);

DROP TEMPORARY TABLE IF EXISTS tmp_max;
CREATE TEMPORARY TABLE tmp_max
    ENGINE=Memory
SELECT 
    IntervalID, 
    MAX(RowNumber) AS MaxRowNo
FROM
    temp_k
WHERE
    IntervalCheck = 1
GROUP BY 
    IntervalID;

ALTER TABLE tmp_max ADD INDEX (IntervalID);

SELECT 
    k.idgps_unit,
    MIN(k.dt) AS DT_Start,
    MIN(IF(k.RowNumber = 1, k.Lat, NULL)) AS Latitude_Start,
    MIN(IF(k.RowNumber = 1, k.Long, NULL)) AS Longitude_Start,
    MIN(IF(k.RowNumber = 1, k.Speed_kmh, NULL) AS Speed_Start,
    MAX(k.dt) AS DT_End,
    MIN(IF(k.RowNumber = m.MaxRowNo, k.Lat, NULL)) AS Latitude_End
    MIN(IF(k.RowNumber = m.MaxRowNo, k.Long, NULL)) AS Longitude_End
    MIN(IF(k.RowNumber = m.MaxRowNo, k.Speed_kmh, NULL)) AS Speed_End,
    AVG(Speed_kmh) AS Average_Speed,
    gu.name,
    gu.notes,
    gu.serial
FROM
    tmp_k AS k
    INNER JOIN tmp_max AS m
        USING(IntervalID)
    INNER JOIN gps_unit AS gu
        USING(idgps_unit)
    INNER JOIN user AS u
    ON (gu.idcustomer = u.idcustomer)
WHERE
    (k.IntervalCheck = 1) 
     AND (u.iduser = 14)
GROUP BY 
    k.IntervalID, 
    k.idgps_unit;

DROP TEMPORARY TABLE tmp_k;
DROP TEMPORARY TABLE tmp_max;

#2


0  

If the size in bytes of the nested query(s) exceeds the size of the buffer pool (check innodb_buffer_pool_size) your query will take an extremely long time due to i/o paging.

如果嵌套查询的大小(以字节为单位)超过缓冲池的大小(请检查innodb_buffer_pool_size),由于i / o分页,查询将花费很长时间。

That said you can improve your performance with the following tips:

也就是说,您可以通过以下提示改善您的表现:

  • Select as little data as possible in the nested query
  • 在嵌套查询中选择尽可能少的数据

  • Increase the size of your buffer pool.
  • 增加缓冲池的大小。

#3


0  

My personal experience has shown that MySQL is rather bad at handling subqueries. The query optimizer of a database is a very intricate and delicious part of the database and commercial database vendors put much effort into it, so it is IMHO no wonder that MySQL performs rather poor when it comes to handling crazy SQL statements invented by even crazier developers ;-).

我的个人经验表明MySQL在处理子查询方面相当糟糕。数据库的查询优化器是数据库的一个非常复杂和美味的部分,商业数据库供应商付出了很多努力,所以恕我直言,毫无疑问MySQL在处理由更疯狂的开发人员发明的疯狂SQL语句时表现相当差;-)。

See here: http://dev.mysql.com/doc/refman/5.6/en/subquery-restrictions.html :

请参见此处:http://dev.mysql.com/doc/refman/5.6/en/subquery-restrictions.html:

The optimizer is more mature for joins than for subqueries, so in many cases a statement that uses a subquery can be executed more efficiently if you rewrite it as a join.

优化器对于连接比对子查询更成熟,因此在许多情况下,如果将其重写为连接,则可以更有效地执行使用子查询的语句。

If the official mysql docs from Oracle state something like "more mature", then you can rest assured that it is actually something akin to crap (no pun intended, but I already had my issues with MySQL and most of the larger statements that run flawlessly with a commercial database, would rather kill mysql).

如果来自Oracle的官方mysql文档声明类似“更成熟”的东西,那么你可以放心它实际上类似于废话(没有双关语意图,但我已经遇到了MySQL的问题以及大多数运行完美的大型语句用商业数据库,宁可杀掉mysql)。

So the task is: rewrite it using JOINs....

所以任务是:使用JOIN重写它....