I'm trying to select data from a price database. I want the rows that fall exactly on a whole minute, with one row per minute. So if a minute has two prices, I want the first one.
Here's what the data looks like with this query:
SELECT price, timestamp FROM [database] WHERE stock="appl" AND second(timestamp) = 0 ORDER BY timestamp
Result:
Row price timestamp
1 0.097947 2018-02-14 03:42:00.000 UTC
2 0.09796 2018-02-14 03:43:00.000 UTC
3 0.097959 2018-02-14 03:45:00.000 UTC
4 0.097969 2018-02-14 03:46:00.000 UTC
5 0.097984 2018-02-14 03:47:00.000 UTC
6 0.097986 2018-02-14 03:47:00.000 UTC (Duplicate time ^)
7 0.097899 2018-02-14 03:48:00.000 UTC
8 0.097927 2018-02-14 03:49:00.000 UTC
9 0.097984 2018-02-14 03:50:00.000 UTC
10 0.097995 2018-02-14 03:51:00.000 UTC
11 0.097972 2018-02-14 03:52:00.000 UTC
12 0.097924 2018-02-14 03:53:00.000 UTC
13 0.097935 2018-02-14 03:54:00.000 UTC
When I add "GROUP BY price, timestamp", the result doesn't change.
I want distinct timestamps. So, for this case the result should be:
Row price timestamp
1 0.097947 2018-02-14 03:42:00.000 UTC
2 0.09796 2018-02-14 03:43:00.000 UTC
3 0.097959 2018-02-14 03:45:00.000 UTC
4 0.097969 2018-02-14 03:46:00.000 UTC
5 0.097984 2018-02-14 03:47:00.000 UTC
6 0.097899 2018-02-14 03:48:00.000 UTC
7 0.097927 2018-02-14 03:49:00.000 UTC
8 0.097984 2018-02-14 03:50:00.000 UTC
9 0.097995 2018-02-14 03:51:00.000 UTC
10 0.097972 2018-02-14 03:52:00.000 UTC
11 0.097924 2018-02-14 03:53:00.000 UTC
12 0.097935 2018-02-14 03:54:00.000 UTC
3 Answers
#1
1
Below is for BigQuery Standard SQL (and assumes your ts field is of timestamp type)
SELECT
ARRAY_AGG(price ORDER BY ts LIMIT 1)[SAFE_OFFSET(0)] price,
TIMESTAMP_TRUNC(ts, MINUTE) ts
FROM `yourproject.yourdataset.yourtable`
WHERE stock = 'appl'
GROUP BY 2
ORDER BY 2
Note: I use ts instead of timestamp as I prefer not using keywords as column names
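To make the logic of this query concrete, here is a minimal Python sketch of the same idea, run on a few rows copied from the question: truncate each timestamp to its minute, then keep the price with the earliest timestamp in that minute. The function name `first_price_per_minute` is made up for illustration, and this is only a model of the query's intent, not how BigQuery executes it.

```python
from datetime import datetime

# Sample rows copied from the question: (price, timestamp text, UTC).
rows = [
    (0.097947, "2018-02-14 03:42:00.000"),
    (0.09796, "2018-02-14 03:43:00.000"),
    (0.097959, "2018-02-14 03:45:00.000"),
    (0.097969, "2018-02-14 03:46:00.000"),
    (0.097984, "2018-02-14 03:47:00.000"),
    (0.097986, "2018-02-14 03:47:00.000"),  # duplicate minute
    (0.097899, "2018-02-14 03:48:00.000"),
]


def first_price_per_minute(rows):
    """Return [(minute, price)], keeping the earliest row in each minute.

    Hypothetical helper: mirrors TIMESTAMP_TRUNC(ts, MINUTE) plus
    ARRAY_AGG(price ORDER BY ts LIMIT 1) in plain Python.
    """
    first = {}
    for price, ts_text in rows:
        ts = datetime.strptime(ts_text, "%Y-%m-%d %H:%M:%S.%f")
        minute = ts.replace(second=0, microsecond=0)  # truncate to the minute
        # Keep the smallest ts; on an exact tie the row seen first wins.
        if minute not in first or ts < first[minute][0]:
            first[minute] = (ts, price)
    # Sort by minute, like ORDER BY 2 in the query.
    return [(minute, price) for minute, (ts, price) in sorted(first.items())]


result = first_price_per_minute(rows)
for minute, price in result:
    print(price, minute)
```

The two rows stamped 03:47:00.000 have identical timestamps, so this sketch simply keeps the one that appears first in the input. I would not rely on which of the two a SQL engine returns for such a tie unless another column breaks it.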
#2
1
There is no such thing as a "first" price, unless another column specifies that value. You can get one price per timestamp with something like this:
SELECT MIN(price), timestamp
FROM [database]
WHERE stock = 'appl' AND second(timestamp) = 0
GROUP BY timestamp;
If you do have another column that defines the ordering, then you can use ARRAY_AGG and choose the first value.
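As a rough check of the MIN/GROUP BY approach, here is a small sketch that runs the same query shape against three rows from the question using Python's built-in sqlite3 module. The table name `prices` is hypothetical, and SQLite is only a stand-in: it has no SECOND() function, so `strftime('%S', ...)` is used in its place.

```python
import sqlite3

# Three rows copied from the question; `prices` is a hypothetical table name.
rows = [
    ("appl", 0.097984, "2018-02-14 03:47:00.000"),
    ("appl", 0.097986, "2018-02-14 03:47:00.000"),  # duplicate timestamp
    ("appl", 0.097899, "2018-02-14 03:48:00.000"),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE prices (stock TEXT, price REAL, timestamp TEXT)")
conn.executemany("INSERT INTO prices VALUES (?, ?, ?)", rows)

# strftime('%S', timestamp) = '00' stands in for second(timestamp) = 0.
query = """
    SELECT MIN(price) AS price, timestamp
    FROM prices
    WHERE stock = 'appl' AND strftime('%S', timestamp) = '00'
    GROUP BY timestamp
    ORDER BY timestamp
"""
result = conn.execute(query).fetchall()
conn.close()

for price, ts in result:
    print(price, ts)
```

The duplicate 03:47 rows collapse into one row carrying the lower price, 0.097984. In the question's sample that is also the row listed first, but MIN picks the smallest price, not the earliest one, so the two can differ on other data.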
#3
0
SELECT MIN(price), timestamp
FROM [database]
WHERE stock = 'appl' AND second(timestamp) = 0
GROUP BY timestamp
ORDER BY timestamp