Combining SQL queries with shared logic

Date: 2021-05-08 23:51:00

I'm currently duplicating a rather large SQL aggregation query so that I can run it once to return the metrics for the entire data set, then again to group the metrics for each day.

Here's a simplified example of the query that calculates the overall metrics.

SELECT
  sum(sentiment) FILTER (WHERE user = :user) AS total_sentiment,
  avg(sentiment) FILTER (WHERE user = :user) AS average_sentiment,
  count(messages) FILTER (WHERE sender = :user) AS total_messages
FROM
  "Messages"
WHERE
  date >= :start AND date < :end;

And here's the one that calculates the same metrics, but once for each day.

SELECT
  date_trunc('day', date) AS date,
  sum(sentiment) FILTER (WHERE user = :user) AS total_sentiment,
  avg(sentiment) FILTER (WHERE user = :user) AS average_sentiment,
  count(messages) FILTER (WHERE sender = :user) AS total_messages
FROM
  "Messages"
WHERE
  date >= :start AND date < :end
GROUP BY 1
ORDER BY 1;

Is there a way to combine these two queries without having to duplicate most of the logic?

Building the query strings programmatically is an option, but I'd definitely rather not go down that path.

If the queries were actually as simple as the examples above, duplicating them wouldn't be as much of an issue, but they deal with more complex joins and statistical functions—keeping them in sync is already tricky.

Ideally, the output would be a table whose first row contained the overall metrics and the rest of the rows would be the per-day calculations.

2 answers

#1

The simplest method would be to use grouping sets:

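SELECT date_trunc('day', date) AS date,
       sum(sentiment) FILTER (WHERE user = :user) AS total_sentiment,
       avg(sentiment) FILTER (WHERE user = :user) AS average_sentiment,
       count(messages) FILTER (WHERE sender = :user) AS total_messages
FROM "Messages"
WHERE date >= :start AND date < :end
-- Group by the truncated day, not the raw "date" column: in PostgreSQL a bare
-- name in GROUP BY resolves to the input column before the output alias.
GROUP BY GROUPING SETS ( (), (date_trunc('day', date)) );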
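
To get the overall row first, as the question asks, one option is to flag it with GROUPING() and sort on that flag. This is a minimal sketch, assuming PostgreSQL 9.5 or later. The sample CTE, its three rows, and the msg_day and is_overall aliases are made up for illustration and stand in for the real "Messages" table and its filters.

```sql
-- Hypothetical sample rows standing in for the "Messages" table.
WITH sample(date, sentiment) AS (
  VALUES
    (timestamp '2021-05-01 09:00', 1.0),
    (timestamp '2021-05-01 17:30', 3.0),
    (timestamp '2021-05-02 12:00', 5.0)
)
SELECT
  date_trunc('day', date) AS msg_day,
  -- GROUPING() returns 1 when the expression is rolled up (the overall row)
  -- and 0 when the row belongs to a specific day.
  GROUPING(date_trunc('day', date)) = 1 AS is_overall,
  sum(sentiment) AS total_sentiment,
  avg(sentiment) AS average_sentiment,
  count(*)       AS total_messages
FROM sample
GROUP BY GROUPING SETS ((), (date_trunc('day', date)))
-- The overall row (is_overall = true) sorts first, then the days in order.
ORDER BY is_overall DESC, msg_day;
```

In the overall row msg_day is NULL, so the flag column is what distinguishes it from a day whose date is genuinely NULL in the data.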

#2

You could use windowed functions:

SELECT DISTINCT
  date_trunc('day', date) AS date,
  sum(sentiment) FILTER (WHERE user = :user) OVER() AS total_sentiment,
  avg(sentiment) FILTER (WHERE user = :user) OVER() AS average_sentiment,
  count(messages) FILTER (WHERE sender = :user) OVER() AS total_messages,
  sum(sentiment) FILTER (WHERE user = :user) OVER(PARTITION BY date_trunc('day', date)) 
     AS total_sentiment_per_day,
  ...
FROM "Messages"
WHERE  date >= :start AND date < :end
ORDER BY 1;
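
Note that this produces a different output shape from the one the question describes: the overall figures repeat as extra columns on every per-day row instead of appearing as a separate first row. A small sketch on made-up data, assuming PostgreSQL; the sample CTE and the msg_day alias are hypothetical.

```sql
-- Hypothetical sample rows standing in for the "Messages" table.
WITH sample(date, sentiment) AS (
  VALUES
    (timestamp '2021-05-01 09:00', 1.0),
    (timestamp '2021-05-01 17:30', 3.0),
    (timestamp '2021-05-02 12:00', 5.0)
)
SELECT DISTINCT
  date_trunc('day', date) AS msg_day,
  -- An empty OVER () spans every row, so this is the overall total.
  sum(sentiment) OVER () AS total_sentiment,
  -- Partitioning by day restricts the same aggregate to that day's rows.
  sum(sentiment) OVER (PARTITION BY date_trunc('day', date))
    AS total_sentiment_per_day
FROM sample
ORDER BY 1;
```

DISTINCT is what collapses the one-row-per-message result down to one row per day, so each metric has to be written twice, once per window.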
