I am using PostgreSQL and have two tables: users and usertasks.
users has the following fields: userid, username
usertasks has the following fields: id, taskdate, userid
userid and id are the primary keys of these two tables, respectively.
I want to find all users who have completed fewer than 3 tasks in the last 3 months. I cannot simply use WHERE taskdate > (last3months) here, because I need all users, not just those who completed tasks in the last 3 months. (Some users might have done their tasks 6 months ago but none in the last 3 months, and I need those users as well.)
My query is this:
select userid
from users
EXCEPT
select userid from usertasks
where usertasks.taskdate > CURRENT_DATE - INTERVAL '3 months'
group by usertasks.userid having count(id) >= 3
Problem: The above query returns the correct result. I have also tried NOT IN instead of EXCEPT, and that works too, but both versions have performance issues. Can this be done in a single query without a subquery, using joins or some other method? The subquery seems to be what makes it slow.
The test case has 100 thousand users and 1 million tasks, so I am looking for the fastest method.
1 solution
#1
1
You need to use HAVING with a CASE expression, so that only tasks from the last 3 months are counted while the LEFT JOIN still keeps every user.
select u.userid
from users u
left join usertasks ut
on ut.userid = u.userid
group by u.userid
having count(case when ut.taskdate > CURRENT_DATE - INTERVAL '3 months' then ut.id else null end) < 3 -- count of tasks in the last 3 months < 3
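Since the question is about speed, here is a sketch of a variant that may be worth comparing. It moves the date condition into the join condition, so only recent task rows are joined and count(ut.id) counts them directly; users with no recent tasks still appear, with a count of 0. The index name usertasks_taskdate_userid_idx is hypothetical, and whether the planner actually uses the index depends on your data, so check with EXPLAIN ANALYZE rather than taking this as a guaranteed improvement.

```sql
-- Hypothetical index (the name is an assumption) so that recent tasks
-- can be located without scanning the whole usertasks table.
CREATE INDEX usertasks_taskdate_userid_idx
    ON usertasks (taskdate, userid);

-- The date filter lives in the ON clause, not in WHERE, so the LEFT JOIN
-- still returns every user; unmatched users get NULL for ut.id.
SELECT u.userid
FROM users u
LEFT JOIN usertasks ut
       ON ut.userid = u.userid
      AND ut.taskdate > CURRENT_DATE - INTERVAL '3 months'
GROUP BY u.userid
-- count(ut.id) ignores NULLs, so users without recent tasks count as 0.
HAVING count(ut.id) < 3;
```

On PostgreSQL 9.4 or later, the CASE in the original HAVING clause can also be written as count(ut.id) FILTER (WHERE ut.taskdate > CURRENT_DATE - INTERVAL '3 months'), which reads more clearly but still joins all task rows before counting.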