How to count occurrences within several date ranges in PostgreSQL

Time: 2022-08-25 19:00:03

I can run a query that returns the number of customers aged 18 to 24 who visit a shop, per month and per shop. I'm doing it like this:

select year, month, shop_id, count(birthday) 
from customers 
where birthday 
BETWEEN '1992-01-01 00:00:00' AND '1998-01-01 00:00:00'
group by year, month, shop_id;

Now I'm having trouble writing this query for several ranges at the same time.

My table currently looks like this:

 shop_id |  birthday  | year | month
---------+------------+------+-------
     567 | 1998-10-10 | 2014 |    10
     567 | 1996-10-10 | 2014 |    10
     567 | 1985-10-10 | 2014 |    10
     234 | 1990-10-10 | 2014 |    10
     123 | 1970-01-10 | 2014 |    10
     123 | 1974-01-10 | 2014 |    11

And I would like to get something like this:

 shop_id | year | month | 18 < age < 25 | 26 < age < 35
---------+------+-------+---------------+---------------
     567 | 2014 |    10 |             2 |             1
     234 | 2014 |    10 |             1 |             0
     123 | 2014 |    10 |             0 |             0

The first query does not handle the case where a shop has no customers in the range. How can I get 0 in that case?

How can I query several date ranges at the same time?

2 answers

#1


Instead of filtering in the WHERE clause, use CASE expressions inside the aggregates:

select year, month, shop_id, 
count(case when birthday between <range1> then 1 end) RANGE1,
count(case when birthday between <range2> then 1 end) RANGE2,
count(case when birthday between <range3> then 1 end) RANGE3
from customers 
group by year, month, shop_id;
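
As a concrete illustration of the pattern above, here is a minimal sketch that assumes the customers table from the question and reuses its birth-date bounds. The column aliases are made up for the example. It uses the aggregate FILTER clause (PostgreSQL 9.4 and later), which behaves like the CASE form, and half-open ranges so that a birthday falling exactly on a boundary lands in only one bucket:

```sql
-- Sketch only: assumes customers(shop_id, birthday, year, month) as in the question.
-- The aliases born_1992_1997 and born_1982_1991 are illustrative names.
SELECT year,
       month,
       shop_id,
       COUNT(*) FILTER (WHERE birthday >= DATE '1992-01-01'
                          AND birthday <  DATE '1998-01-01') AS born_1992_1997,
       COUNT(*) FILTER (WHERE birthday >= DATE '1982-01-01'
                          AND birthday <  DATE '1992-01-01') AS born_1982_1991
FROM customers
GROUP BY year, month, shop_id
ORDER BY shop_id, year, month;
```

A bucket with no matching customers shows 0 here, but a shop with no rows at all in customers for a given month still produces no output row.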

#2


"No rows with zeros" is a common problem with GROUP BY queries. The solution is to select FROM whichever table holds the complete list (here, the shops) and then LEFT JOIN the table you are counting. Since you are grouping by year and month too, you also need the complete list of years and months. You can produce that with generate_series:

SELECT  t.t, s.id, COUNT(c.birthday) 
FROM    shops s
CROSS JOIN generate_series('2014-01-01 00:00:00', '2015-01-01 00:00:00', interval '1 month') t(t)
LEFT OUTER JOIN customers c
ON      c.shop_id = s.id
AND     c.birthday 
        BETWEEN '1992-01-01 00:00:00' AND '1998-01-01 00:00:00'
AND     c.year = EXTRACT(YEAR FROM t.t)
AND     c.month = EXTRACT(MONTH FROM t.t)
GROUP BY t.t, s.id
ORDER BY s.id, t.t;

To get counts for two date ranges, you could do what @mo2 suggests, or you could join to the customers table twice:

SELECT  t.t, s.id, COUNT(DISTINCT c1.id), COUNT(DISTINCT c2.id) 
FROM    shops s
CROSS JOIN generate_series('2014-01-01 00:00:00', '2015-01-01 00:00:00', interval '1 month') t(t)
LEFT OUTER JOIN customers c1
ON      c1.shop_id = s.id
AND     c1.birthday 
        BETWEEN '1992-01-01 00:00:00' AND '1998-01-01 00:00:00'
AND     c1.year = EXTRACT(YEAR FROM t.t)
AND     c1.month = EXTRACT(MONTH FROM t.t)
LEFT OUTER JOIN customers c2
ON      c2.shop_id = s.id
AND     c2.birthday 
        BETWEEN '1982-01-01 00:00:00' AND '1992-01-01 00:00:00'
AND     c2.year = EXTRACT(YEAR FROM t.t)
AND     c2.month = EXTRACT(MONTH FROM t.t)
GROUP BY t.t, s.id
ORDER BY s.id, t.t;
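
The two suggestions can also be combined: keep the shops and months grid, join customers once, and count each range conditionally. The following is a sketch under the same assumptions as the queries above (a shops table with an id column, and customers carrying shop_id, birthday, year and month); the aliases are illustrative. Because each customer row is joined only once, it needs neither COUNT(DISTINCT ...) nor an id column on customers:

```sql
-- Sketch only: shops(id) and customers(shop_id, birthday, year, month) are assumed.
SELECT  t.month_start,
        s.id AS shop_id,
        -- COUNT over a column from the outer-joined table yields 0 when nothing matched.
        COUNT(c.birthday) FILTER (WHERE c.birthday >= DATE '1992-01-01'
                                    AND c.birthday <  DATE '1998-01-01') AS born_1992_1997,
        COUNT(c.birthday) FILTER (WHERE c.birthday >= DATE '1982-01-01'
                                    AND c.birthday <  DATE '1992-01-01') AS born_1982_1991
FROM    shops s
CROSS JOIN generate_series(TIMESTAMP '2014-01-01',
                           TIMESTAMP '2015-01-01',
                           INTERVAL '1 month') AS t(month_start)
LEFT JOIN customers c
       ON c.shop_id = s.id
      AND c.year    = EXTRACT(YEAR  FROM t.month_start)
      AND c.month   = EXTRACT(MONTH FROM t.month_start)
GROUP BY t.month_start, s.id
ORDER BY s.id, t.month_start;
```

The birthday conditions sit in the FILTER clauses rather than in the join, so every visit row for the shop and month is joined and each aggregate picks out its own range.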

Note that both queries SELECT a full datetime rather than separate year and month columns. I think that is more flexible, but it is easy to change if you prefer.
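
If separate year and month columns are preferred, one possible way is to derive them from the generated timestamp with EXTRACT. This sketch reuses the first query's assumptions (shops(id) and the customers table from the question); the output aliases are illustrative:

```sql
-- Sketch only: same assumed tables as the queries above.
SELECT  EXTRACT(YEAR  FROM t.t)::int AS year,
        EXTRACT(MONTH FROM t.t)::int AS month,
        s.id                         AS shop_id,
        COUNT(c.birthday)            AS customers
FROM    shops s
CROSS JOIN generate_series(TIMESTAMP '2014-01-01',
                           TIMESTAMP '2015-01-01',
                           INTERVAL '1 month') AS t(t)
LEFT JOIN customers c
       ON c.shop_id = s.id
      AND c.birthday BETWEEN '1992-01-01' AND '1998-01-01'
      AND c.year    = EXTRACT(YEAR  FROM t.t)
      AND c.month   = EXTRACT(MONTH FROM t.t)
GROUP BY t.t, s.id
ORDER BY s.id, t.t;
```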

EDIT: I realized your year and month are not related to the birthday but to something else, I guess the visit date? So I updated my query. If you are only checking one month at a time, you could remove the generate_series and put the year and month integers directly into the join conditions.
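
For a single month, that simplification might look like the sketch below. The year 2014 and month 10 are taken from the sample data in the question, shops(id) is assumed as before, and the alias matching_customers is illustrative:

```sql
-- Sketch only: one fixed month, so no generate_series is needed.
SELECT  s.id              AS shop_id,
        2014              AS year,
        10                AS month,
        COUNT(c.birthday) AS matching_customers
FROM    shops s
LEFT JOIN customers c
       ON c.shop_id = s.id
      AND c.year    = 2014
      AND c.month   = 10
      AND c.birthday BETWEEN '1992-01-01' AND '1998-01-01'
GROUP BY s.id
ORDER BY s.id;
```

Every shop appears once in the result, with 0 for shops that had no matching customers that month.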
