How can I keep a condition out of the having clause of a query?

Time: 2021-11-26 04:18:48

I am working on this query, which runs successfully:

select 
hash,
SUM(DATE(TIMESTAMP) = CURDATE()) as today,
sum(DATE(TIMESTAMP) between DATE_SUB(CURDATE( ), INTERVAL 7 DAY) and  DATE_SUB(CURDATE( ), INTERVAL 1 DAY)) as last_week

from behaviour

group by hash
having last_week > 0 and today > last_week
order by today desc

I am trying to optimize it.

I tried moving the last_week > 0 condition out of the having clause and into where, without any luck. I get an "invalid use of group function" error:

select 
hash,
SUM(DATE(TIMESTAMP) = CURDATE()) as today,
sum(DATE(TIMESTAMP) between DATE_SUB(CURDATE( ), INTERVAL 7 DAY) and  DATE_SUB(CURDATE( ), INTERVAL 1 DAY)) as last_week

from behaviour
where 
and (sum(DATE(TIMESTAMP) between DATE_SUB(CURDATE( ), INTERVAL 4 DAY) and  DATE_SUB(CURDATE( ), INTERVAL 1 DAY)) > 0)

group by hash
having today > last_week
order by today desc

How can I optimize it? On a big table it takes about 1 minute to execute.

1 solution

#1


3  

You want to filter before doing the aggregation:

select hash,
       sum(DATE(TIMESTAMP) = CURDATE()) as today,
       sum(DATE(TIMESTAMP) between DATE_SUB(CURDATE( ), INTERVAL 7 DAY) and DATE_SUB(CURDATE( ), INTERVAL 1 DAY)) as last_week
from behaviour
where timestamp >= curdate() - interval 7 day and
      timestamp < curdate() + interval 1 day
group by hash
having today > last_week and last_week > 0
order by today desc;

This reduces the volume of data that the group by has to process, which should significantly improve performance. You might be able to further improve performance with an index on (timestamp, hash).
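
As a rough sketch of the index suggestion, assuming MySQL and that hash is a CHAR or VARCHAR column (a TEXT column would need a prefix length), the index could be created as shown here. The index name idx_behaviour_ts_hash is made up for illustration, and the explain query is only a simplified stand-in for the real one.

```sql
-- Hypothetical index name; the column order follows the (timestamp, hash) suggestion.
-- The leading timestamp column lets the range filter in the where clause use the index.
create index idx_behaviour_ts_hash on behaviour (`timestamp`, hash);

-- Check whether the optimizer actually picks the index for the filtered query.
explain
select hash, count(*)
from behaviour
where `timestamp` >= curdate() - interval 7 day
  and `timestamp` < curdate() + interval 1 day
group by hash;
```

If the explain output does not list the new index under key, the optimizer has chosen a different plan, and the index may not help on that data.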

You still need the having clause because you want additional filters on the aggregated results. The performance gain, though, comes from filtering before the aggregation.
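
To see why the conditions on today and last_week cannot move into where, it may help to write the same query with a derived table. This is only an equivalent sketch for illustration (the alias t is made up), and not necessarily a faster alternative. The inner where is evaluated row by row before grouping, so it cannot refer to sum(...), which is what triggers "invalid use of group function". The outer where plays the role of having and sees the grouped results.

```sql
select hash, today, last_week
from (
    -- Row-level filter first, then one row per hash with both counts.
    select hash,
           sum(date(`timestamp`) = curdate()) as today,
           sum(date(`timestamp`) between date_sub(curdate(), interval 7 day)
                                     and date_sub(curdate(), interval 1 day)) as last_week
    from behaviour
    where `timestamp` >= curdate() - interval 7 day
      and `timestamp` < curdate() + interval 1 day
    group by hash
) t  -- hypothetical alias
-- Filters on aggregated values; equivalent to the having clause.
where last_week > 0 and today > last_week
order by today desc;
```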
