Aggregating per hour, every hour

Time: 2021-06-23 22:55:27

I have a PostgreSQL 9.1 database with a table containing a timestamp and a measured value:

'2012-10-25 01:00'   2
'2012-10-25 02:00'   5
'2012-10-25 03:00'   12
'2012-10-25 04:00'   7
'2012-10-25 05:00'   1
...                  ...

I need to average the value over a range of 8 hours, every hour. In other words, I need the average of 1h-8h, 2h-9h, 3h-10h etc.

I have no idea how to write such a query. I have looked everywhere but have no clue which features to look for.

The closest I have found are hourly/daily averages or block averages (e.g. 1h-8h, 9h-16h etc.). But in these cases, the timestamp is simply converted using the date_trunc() function (as in the example below), which is of no use to me.

What I think I am looking for is a query similar to this:

SELECT    date_trunc('day', timestamp), max(value) 
FROM      table_name
GROUP BY  date_trunc('day', timestamp);

But then using some kind of 8-hour range for EVERY hour in the group-by clause. Is that even possible?

2 solutions

#1


6  

A window function with a custom frame makes this amazingly simple:

SELECT ts
      ,avg(val) OVER (ORDER BY ts
                      ROWS BETWEEN CURRENT ROW AND 7 FOLLOWING) AS avg_8h
FROM tbl;

Live demo on sqlfiddle.

The frame for each average is the current row plus the following 7. This assumes you have exactly one row for every hour. Your sample data seems to imply that, but you did not specify.
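To see why the one-row-per-hour assumption matters, here is a minimal sketch in Python. It uses the standard-library sqlite3 module as a stand-in for PostgreSQL, which is an assumption on my part (SQLite 3.25 or later accepts the same frame clause). The readings are hypothetical, with 04:00 and 05:00 deliberately missing. Because ROWS counts rows rather than elapsed time, the first frame stretches past 8 hours.

```python
import sqlite3

conn_gap = sqlite3.connect(":memory:")
conn_gap.execute("CREATE TABLE tbl (ts TEXT, val REAL)")

# Hypothetical hourly readings; 04:00 and 05:00 are missing on purpose.
hours = [1, 2, 3, 6, 7, 8, 9, 10]
conn_gap.executemany(
    "INSERT INTO tbl VALUES (?, ?)",
    [(f"2012-10-25 {h:02d}:00", float(h)) for h in hours],
)

# Same frame as in the answer, but reporting where each frame ends
# and how many rows it holds instead of the average.
gap_frames = conn_gap.execute(
    """
    SELECT ts,
           max(ts)  OVER (ORDER BY ts
                          ROWS BETWEEN CURRENT ROW AND 7 FOLLOWING) AS frame_end,
           count(*) OVER (ORDER BY ts
                          ROWS BETWEEN CURRENT ROW AND 7 FOLLOWING) AS n_rows
    FROM tbl
    ORDER BY ts
    """
).fetchall()

# The frame starting at 01:00 holds 8 rows but ends at 10:00, not 08:00.
print(gap_frames[0])
```

If gaps like this can occur in the real table, a frame based on row counts is not equivalent to a time range, and a join against a generated series of hours may be the safer choice.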

The way it is, avg_8h for the final (according to ts) 7 rows of the set is computed with fewer rows, until the value of the last row equals its own average. You did not specify how to deal with the special case.
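The shrinking frames at the end of the set can be sketched with the five sample rows from the question. As an assumption for illustration, this again runs the query through Python's sqlite3 module (SQLite 3.25 or later) rather than PostgreSQL. The extra count(*) column and the filter on full frames are my own additions, shown as one possible way to handle the special case, not something the answer prescribes.

```python
import sqlite3

conn_tail = sqlite3.connect(":memory:")
conn_tail.execute("CREATE TABLE tbl (ts TEXT, val REAL)")
conn_tail.executemany(
    "INSERT INTO tbl VALUES (?, ?)",
    [
        ("2012-10-25 01:00", 2),
        ("2012-10-25 02:00", 5),
        ("2012-10-25 03:00", 12),
        ("2012-10-25 04:00", 7),
        ("2012-10-25 05:00", 1),
    ],
)

# avg_8h as in the answer, plus the number of rows each frame actually holds.
tail_rows = conn_tail.execute(
    """
    SELECT ts,
           avg(val) OVER (ORDER BY ts
                          ROWS BETWEEN CURRENT ROW AND 7 FOLLOWING) AS avg_8h,
           count(*) OVER (ORDER BY ts
                          ROWS BETWEEN CURRENT ROW AND 7 FOLLOWING) AS n_rows
    FROM tbl
    ORDER BY ts
    """
).fetchall()

for ts, avg_8h, n_rows in tail_rows:
    print(ts, avg_8h, n_rows)

# One possible policy: keep only averages backed by a full 8-row frame.
full_frames = [(ts, avg_8h) for ts, avg_8h, n_rows in tail_rows if n_rows == 8]
```

With only five rows, every frame is incomplete: the row counts run 5, 4, 3, 2, 1 and the last average is simply the last value, so full_frames ends up empty.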

#2


1  

The key is to make a virtual table against which to join your result set. The generate_series function can help do that, in the following manner:

SELECT
    start
    , start + interval '8 hours' as end
FROM (
    SELECT generate_series(
        date'2012-01-01'
        , date'2012-02-02'
        , '1 hour'
    ) AS start
) x;

This produces output something like this:

         start          |          end           
------------------------+------------------------
 2012-01-01 00:00:00+00 | 2012-01-01 08:00:00+00
 2012-01-01 01:00:00+00 | 2012-01-01 09:00:00+00
 2012-01-01 02:00:00+00 | 2012-01-01 10:00:00+00
 2012-01-01 03:00:00+00 | 2012-01-01 11:00:00+00

This gives you something to join your data to. In this way, the following query:

SELECT
    y.start
    , round(avg(ts_val.v), 1)  -- one decimal place, as in the results shown for this query
FROM
    ts_val,
    (
        SELECT
            start
            , start + interval '8 hours' as end
        FROM (
            SELECT generate_series(
                date'2012-01-01'
                , date'2012-02-02'
                , '1 hour'
            ) AS start
        ) x
    ) y
WHERE
    ts BETWEEN y.start AND y.end  -- BETWEEN is inclusive at both ends
GROUP BY
    y.start
ORDER BY
    y.start
;

For the following data

         ts          | v 
---------------------+---
 2012-01-01 01:00:00 | 2
 2012-01-01 09:00:00 | 2
 2012-01-01 10:00:00 | 5
(3 rows)

Will produce the following results:

         start          | round 
------------------------+-------
 2012-01-01 00:00:00+00 |   2.0
 2012-01-01 01:00:00+00 |   2.0
 2012-01-01 02:00:00+00 |   3.5
 2012-01-01 03:00:00+00 |   3.5
 2012-01-01 04:00:00+00 |   3.5
 2012-01-01 05:00:00+00 |   3.5
 2012-01-01 06:00:00+00 |   3.5
 2012-01-01 07:00:00+00 |   3.5
 2012-01-01 08:00:00+00 |   3.5
 2012-01-01 09:00:00+00 |   3.5
 2012-01-01 10:00:00+00 |   5.0
(11 rows)
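The join logic can be checked by hand with a small Python sketch. This is only an illustration under stated assumptions: the helper names hourly_starts and windowed_avg are hypothetical, plain datetime values stand in for the timestamps, and the three sample rows are the ones listed above. One detail it makes visible is that BETWEEN includes both endpoints, so each window covers the start hour and the hour exactly 8 hours later; the closed_end flag shows a half-open alternative.

```python
from datetime import datetime, timedelta

# The three sample rows from the answer: (ts, v).
sample = [
    (datetime(2012, 1, 1, 1), 2),
    (datetime(2012, 1, 1, 9), 2),
    (datetime(2012, 1, 1, 10), 5),
]


def hourly_starts(first, last):
    """Yield one timestamp per hour from first to last, inclusive."""
    t = first
    while t <= last:
        yield t
        t += timedelta(hours=1)


def windowed_avg(rows, first, last, hours=8, closed_end=True):
    """Average v over [start, start + hours] for every hourly start.

    closed_end=True mirrors SQL BETWEEN (both ends inclusive);
    closed_end=False uses a half-open window [start, start + hours).
    Windows that match no rows are dropped, as an inner join would do.
    """
    out = []
    for start in hourly_starts(first, last):
        end = start + timedelta(hours=hours)
        vals = [
            v
            for ts, v in rows
            if start <= ts and (ts <= end if closed_end else ts < end)
        ]
        if vals:
            out.append((start, round(sum(vals) / len(vals), 1)))
    return out


closed = windowed_avg(sample, datetime(2012, 1, 1), datetime(2012, 2, 2))
half_open = windowed_avg(
    sample, datetime(2012, 1, 1), datetime(2012, 2, 2), closed_end=False
)

for start, avg in closed:
    print(start, avg)
```

The closed variant reproduces the 11 rows shown above. With the half-open window, the 02:00 row no longer sees the 10:00 reading and its average drops to 2.0, which may be closer to a strict 8-hour range.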
