Time series data in MySQL: sampling

Time: 2021-12-29 23:10:53

We have a MySQL database in which we store time series values.

+-------+-------+---------------------+
| Col A | Col B | Timestamp           |
+-------+-------+---------------------+
| 1.23  | 4.48  | 2013-09-03 10:45:27 |
| 1.23  | 4.48  | 2013-09-03 10:46:27 |
| 1.23  | 4.48  | 2013-09-03 10:47:27 |
+-------+-------+---------------------+

The data is unevenly spaced in time: some points are a minute apart, others only a few seconds.

Is there an efficient way I could query this database to pull data for every nth minute/second/hour? Ideally I would want the (linear) interpolated value at the nth minute, but the closest point to the nth minute or the last point just before or at the nth point would do too.
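
To make the options above concrete, here is a minimal Python sketch of linear interpolation at a target time, with "last point at or before" as the fallback. It assumes the rows have already been fetched in time order. The function names `interpolate_at` and `last_at_or_before` and the sample values are made up for illustration.

```python
from bisect import bisect_left, bisect_right
from datetime import datetime


def interpolate_at(points, target):
    """Linearly interpolate a value at `target`.

    `points` is a time-sorted list of (datetime, float) pairs.
    Returns None when `target` lies outside the sampled range.
    """
    times = [t for t, _ in points]
    i = bisect_left(times, target)
    if i < len(times) and times[i] == target:
        return points[i][1]  # exact hit, nothing to interpolate
    if i == 0 or i == len(times):
        return None  # a neighbour is missing on one side
    (t0, v0), (t1, v1) = points[i - 1], points[i]
    fraction = (target - t0) / (t1 - t0)  # timedelta / timedelta gives a float
    return v0 + fraction * (v1 - v0)


def last_at_or_before(points, target):
    """Fallback: value of the last point at or before `target`, else None."""
    times = [t for t, _ in points]
    i = bisect_right(times, target)
    return points[i - 1][1] if i > 0 else None


# Hypothetical sample rows, unevenly spaced like the table above.
points = [
    (datetime(2013, 9, 3, 10, 45, 27), 1.0),
    (datetime(2013, 9, 3, 10, 46, 27), 3.0),
    (datetime(2013, 9, 3, 10, 46, 57), 4.0),
]
target = datetime(2013, 9, 3, 10, 46, 0)
print(interpolate_at(points, target))     # about 2.1
print(last_at_or_before(points, target))  # 1.0
```

Calling either function once per grid point (every nth minute, say) would give the reduced series for plotting.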

The use case is that I want to plot this data in a graph, but without more points than necessary. So when plotting a year, I would prefer to query only a couple of points per day. When plotting a day, I would want a point every minute or so.

I can do all this in PHP, but is there a way to do it directly in the database? If not, I am contemplating a time series database, but budget constraints restrict me to the free ones. Is there any free time series database that offers sampling, and preferably interpolation, out of the box?

3 solutions

#1


1  

I've had a stab at this; I'm really interested to see how others would solve it.

I had a similar problem before and solved it by creating a time index table and then joining the data table to it, rewriting each timestamp to fit a time frame. The drawback is that you need a new time index table and a separate query or view for each time interval.

The advantage of joining the data this way is that time frames with no reading still show up, as nulls, which I needed because I was also interested in those gaps. The end data just needs a little extra work (i.e. taking out the placeholders).

The first thing I did was create a time index table. It looks something like this:

mysql> select * from ctb_time_idx  WHERE YEAR( ctb_datetime ) = 2013  LIMIT 10 ;
+---------------------+
| ctb_datetime        |
+---------------------+
| 2013-01-01 00:00:00 | 
| 2013-01-01 00:15:00 | 
| 2013-01-01 00:30:00 | 
| 2013-01-01 00:45:00 | 
| 2013-01-01 01:00:00 | 
| 2013-01-01 01:15:00 | 
| 2013-01-01 01:30:00 | 
| 2013-01-01 01:45:00 | 
| 2013-01-01 02:00:00 | 
| 2013-01-01 02:15:00 | 
+---------------------+
10 rows in set (0.07 sec)

I then union my data in

-- Placeholder rows: one per 15-minute slot, so empty slots still appear.
( select 
    ctb_datetime AS time1 , 
    'Placeholder' AS TimeInterval , 
    NULL AS `Col A` , 
    NULL AS `Col B` 
from ctb_time_idx 
    where YEAR( ctb_time_idx.ctb_datetime ) = 2013 )  
UNION 
-- Data rows: time1 (the timestamp column of my_data_table) is rewritten
-- to the start of the quarter hour it falls in.
( select DATE_FORMAT( time1 , '%Y-%m-%d %H:00' ) AS time1  , 
    '00min' AS TimeInterval , `Col A` , `Col B` from my_data_table  
    where MINUTE( time1 ) BETWEEN  00 AND 14  ) 
UNION 
( select DATE_FORMAT( time1 , '%Y-%m-%d %H:15' ) AS time1 , 
    '15min' AS TimeInterval, `Col A` , `Col B` from my_data_table 
    where MINUTE( time1 ) BETWEEN  15 AND 29  ) 
UNION 
( select DATE_FORMAT( time1 , '%Y-%m-%d %H:30' ) AS time1 , 
    '30min' AS TimeInterval, `Col A` , `Col B` from my_data_table 
    where MINUTE( time1 ) BETWEEN  30 AND 44  ) 
UNION 
( select DATE_FORMAT( time1 , '%Y-%m-%d %H:45' ) AS time1 , 
    '45min' AS TimeInterval, `Col A` , `Col B` from my_data_table 
    where MINUTE( time1 ) BETWEEN  45 AND 59  )     
order by time1 

I tested this on my old tables and it seems to work fine. I had to re-edit my code to suit your example, so hopefully I didn't break anything in the process.
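
For anyone who wants to see the same idea outside SQL, here is a rough Python sketch of the slot logic: build the 15-minute grid first, then drop each reading into the slot its timestamp rounds down to, leaving empty slots visible. The names `slot_start` and `fill_slots` are made up for this illustration, and the third reading is invented sample data.

```python
from datetime import datetime, timedelta

SLOT = timedelta(minutes=15)  # slot width, matching the 15-minute index table


def slot_start(ts):
    """Rewrite a timestamp to the start of its 15-minute slot."""
    return ts.replace(minute=ts.minute - ts.minute % 15, second=0, microsecond=0)


def fill_slots(readings, start, end):
    """Return {slot_start: [values]} for every slot in [start, end).

    `start` is assumed to be aligned to a slot boundary. Slots without
    readings keep an empty list, playing the role of the placeholder rows.
    """
    slots = {}
    t = start
    while t < end:
        slots[t] = []
        t += SLOT
    for ts, value in readings:
        key = slot_start(ts)
        if key in slots:
            slots[key].append(value)
    return slots


# Hypothetical readings; only the first two timestamps come from the question.
readings = [
    (datetime(2013, 9, 3, 10, 45, 27), 1.23),
    (datetime(2013, 9, 3, 10, 46, 27), 1.23),
    (datetime(2013, 9, 3, 11, 20, 0), 1.30),
]
slots = fill_slots(readings, datetime(2013, 9, 3, 10, 30), datetime(2013, 9, 3, 11, 30))
for slot, values in slots.items():
    print(slot, values if values else "no readings")
```

Here the 10:30 and 11:00 slots come back empty, which is the information the placeholder rows carry in the query.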

#2


1  

I've not used it myself, but I recently came across InfluxDB, which sounds like it could meet your criteria: an open source time series database with built-in aggregation queries. For example:

SELECT MEAN(column_name) FROM series_name group by time(10m)
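
To show what averaging per fixed time window amounts to, here is a small Python sketch. It only illustrates the general idea and makes no claim about how InfluxDB implements it. The function name `mean_per_window` and the sample points are hypothetical, and timestamps are assumed to be plain epoch seconds.

```python
from collections import defaultdict


def mean_per_window(points, window_seconds):
    """Average values per fixed window.

    `points` is an iterable of (epoch_seconds, value) pairs; the result maps
    each window's start (epoch seconds) to the mean of the values inside it.
    """
    totals = defaultdict(lambda: [0.0, 0])  # window start -> [sum, count]
    for epoch, value in points:
        window = epoch - epoch % window_seconds
        totals[window][0] += value
        totals[window][1] += 1
    return {w: s / n for w, (s, n) in sorted(totals.items())}


# Hypothetical points, grouped into 10-minute (600 s) windows.
points = [(600, 1.0), (900, 3.0), (1200, 5.0)]
print(mean_per_window(points, 600))  # {600: 2.0, 1200: 5.0}
```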

#3


-1  

select unix_timestamp(now());
select from_unixtime(unix_timestamp(now()));
select from_unixtime(unix_timestamp(now())-unix_timestamp(now())%300);
select from_unixtime(unix_timestamp(now())-unix_timestamp(now())%900);
select from_unixtime(unix_timestamp(now())-unix_timestamp(now())%1800);

+-----------------------+
| unix_timestamp(now()) |
+-----------------------+
|            1383077951 |
+-----------------------+
1 row in set (0.00 sec)

+--------------------------------------+
| from_unixtime(unix_timestamp(now())) |
+--------------------------------------+
| 2013-10-29 20:19:11                  |
+--------------------------------------+
1 row in set (0.00 sec)

+----------------------------------------------------------------+
| from_unixtime(unix_timestamp(now())-unix_timestamp(now())%300) |
+----------------------------------------------------------------+
| 2013-10-29 20:15:00                                            |
+----------------------------------------------------------------+
1 row in set (0.00 sec)

+----------------------------------------------------------------+
| from_unixtime(unix_timestamp(now())-unix_timestamp(now())%900) |
+----------------------------------------------------------------+
| 2013-10-29 20:15:00                                            |
+----------------------------------------------------------------+
1 row in set (0.00 sec)

+-----------------------------------------------------------------+
| from_unixtime(unix_timestamp(now())-unix_timestamp(now())%1800) |
+-----------------------------------------------------------------+
| 2013-10-29 20:00:00                                             |
+-----------------------------------------------------------------+
1 row in set (0.00 sec)
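
The queries above round a timestamp down to the start of its 5, 15 or 30 minute bucket by subtracting the remainder of a division by the bucket length in seconds. A minimal Python sketch of the same arithmetic follows, together with one possible way to use it for sampling: keep only the last point of each bucket. The names `floor_to_bucket` and `last_point_per_bucket` are made up for illustration, and the formatted times assume UTC, which matches the output shown above.

```python
from datetime import datetime, timezone


def floor_to_bucket(epoch_seconds, bucket_seconds):
    """Round an epoch timestamp down to the start of its bucket."""
    return epoch_seconds - epoch_seconds % bucket_seconds


def last_point_per_bucket(points, bucket_seconds):
    """Keep the last point of each bucket.

    `points` is a time-sorted list of (epoch_seconds, value) pairs.
    """
    picked = {}
    for epoch, value in points:
        # Later points overwrite earlier ones that fall in the same bucket.
        picked[floor_to_bucket(epoch, bucket_seconds)] = (epoch, value)
    return [picked[bucket] for bucket in sorted(picked)]


def fmt(epoch_seconds):
    """Format epoch seconds as a UTC date and time string."""
    moment = datetime.fromtimestamp(epoch_seconds, tz=timezone.utc)
    return moment.strftime("%Y-%m-%d %H:%M:%S")


now = 1383077951  # the unix_timestamp(now()) value from the output above
print(fmt(floor_to_bucket(now, 300)))   # 2013-10-29 20:15:00
print(fmt(floor_to_bucket(now, 900)))   # 2013-10-29 20:15:00
print(fmt(floor_to_bucket(now, 1800)))  # 2013-10-29 20:00:00

# Hypothetical points: three in the first 5-minute bucket, one in the next.
print(last_point_per_bucket([(0, 1.0), (10, 2.0), (299, 3.0), (300, 4.0)], 300))
# [(299, 3.0), (300, 4.0)]
```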
