I have a hierarchy of tables:
- GrandParentFoo, which has zero or more
  - ParentFoo, which has zero or more ChildFoo
ParentFoo and ChildFoo both have a Status column with a total of 4 possible statuses:
- Pending (1)
- Active (2)
- Paused (3)
- Complete (4)
I am trying to write a query that gives me a rollup for any particular GrandParentFoo along the following lines:
- GrandParentFooId
- Total ParentFoos
- Total ParentFoos Pending
- Total ParentFoos Active
- Total ParentFoos Paused
- Total ParentFoos Completed
- Total ChildFoos
- Total ChildFoos Pending
- Total ChildFoos Active
- Total ChildFoos Paused
- Total ChildFoos Completed
I was starting down the path of:
select
    gp.GrandParentFooId,
    count(distinct pf.ParentFooId) as TotalParentFoos,
    sum(case pf.Status
            when 1 then 1
            else 0 end) as TotalParentFoosPending
...when I realized this was going to give me an inflated count wherever multiple ChildFoo records existed on a ParentFoo record.
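The inflation can be reproduced on a tiny data set. The sketch below is only an illustration: it uses Python's built-in sqlite3 module as a stand-in database (the original engine is not stated), and the column names ChildId and the join keys are assumptions, since the schema is not shown in full. One pending parent with three children comes back as three pending parents once the child join fans the row out.

```python
import sqlite3

# In-memory SQLite database used purely as a stand-in engine (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table ParentFoo (ParentFooId integer primary key,
                            GrandParentFooId integer, Status integer);
    -- ChildId is an assumed key name; adjust to the real schema.
    create table ChildFoo  (ChildId integer primary key,
                            ParentFooId integer, Status integer);
    -- One pending parent (Status = 1) with three children.
    insert into ParentFoo values (10, 1, 1);
    insert into ChildFoo  values (100, 10, 1), (101, 10, 2), (102, 10, 4);
""")

# Joining to ChildFoo repeats the parent row once per child, so the
# sum(case ...) counts the same parent three times.
fanout_result = conn.execute("""
    select count(distinct pf.ParentFooId) as TotalParentFoos,
           sum(case pf.Status when 1 then 1 else 0 end) as TotalParentFoosPending
    from ParentFoo pf
    left join ChildFoo c on c.ParentFooId = pf.ParentFooId
    where pf.GrandParentFooId = 1
""").fetchone()

print(fanout_result)  # (1, 3): one parent, but "three" pending parents
```

The count(distinct) column stays correct because duplicates of the same ParentFooId collapse, while the plain sum sees every joined row.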
Do I have to write this out as a series of CTEs, or is there a cleaner, simpler way to do this? It seems like a pivot or windowed function of some kind would work here, but I can't conceptualize it.
1 solution
One relatively simple method uses conditional aggregation with count(distinct):
select gp.GrandParentFooId,
       count(distinct pf.ParentFooId) as TotalParentFoos,
       -- parents: count each ParentFooId once, however many child rows it joins to
       count(distinct case when pf.Status = 1 then pf.ParentFooId end) as parent_pending,
       count(distinct case when pf.Status = 2 then pf.ParentFooId end) as parent_active,
       count(distinct case when pf.Status = 3 then pf.ParentFooId end) as parent_paused,
       count(distinct case when pf.Status = 4 then pf.ParentFooId end) as parent_completed,
       count(distinct c.ChildId) as num_children,
       -- children: filter on the child's own Status column
       count(distinct case when c.Status = 1 then c.ChildId end) as child_pending,
       count(distinct case when c.Status = 2 then c.ChildId end) as child_active,
       count(distinct case when c.Status = 3 then c.ChildId end) as child_paused,
       count(distinct case when c.Status = 4 then c.ChildId end) as child_completed
from grandparentfoo gp left join
     parentfoo pf
     on gp.GrandParentFooId = pf.GrandParentFooId left join
     childfoo c
     on pf.ParentFooId = c.ParentFooId
group by gp.GrandParentFooId;
Notes:
- COUNT(DISTINCT) is probably not needed for the children. COUNT(c.ChildId) is probably sufficient.
- For larger data, I would suggest a more complex query to avoid the COUNT(DISTINCT)s.
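The answer does not spell out what the "more complex query" would look like. One possible shape, offered here only as a sketch, is to aggregate ChildFoo per parent in a derived table first, so the outer join produces exactly one row per parent and plain SUM/COUNT can replace COUNT(DISTINCT). The harness below runs it in Python's sqlite3 module as a stand-in engine; the derived-table alias cc, the sample rows, and the ChildId column name are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # stand-in engine (assumption)
conn.executescript("""
    create table GrandParentFoo (GrandParentFooId integer primary key);
    create table ParentFoo (ParentFooId integer primary key,
                            GrandParentFooId integer, Status integer);
    create table ChildFoo  (ChildId integer primary key,
                            ParentFooId integer, Status integer);
    -- Hypothetical sample data: grandparent 2 has no parents,
    -- parent 12 has no children.
    insert into GrandParentFoo values (1), (2);
    insert into ParentFoo values (10, 1, 1), (11, 1, 2), (12, 1, 2);
    insert into ChildFoo  values (100, 10, 1), (101, 10, 2),
                                 (102, 10, 4), (103, 11, 3);
""")

ROLLUP_SQL = """
select gp.GrandParentFooId,
       count(pf.ParentFooId) as TotalParentFoos,
       sum(case when pf.Status = 1 then 1 else 0 end) as parent_pending,
       sum(case when pf.Status = 2 then 1 else 0 end) as parent_active,
       sum(case when pf.Status = 3 then 1 else 0 end) as parent_paused,
       sum(case when pf.Status = 4 then 1 else 0 end) as parent_completed,
       -- child totals are NULL for parents without children, hence coalesce
       coalesce(sum(cc.num_children), 0)    as num_children,
       coalesce(sum(cc.child_pending), 0)   as child_pending,
       coalesce(sum(cc.child_active), 0)    as child_active,
       coalesce(sum(cc.child_paused), 0)    as child_paused,
       coalesce(sum(cc.child_completed), 0) as child_completed
from GrandParentFoo gp
left join ParentFoo pf on pf.GrandParentFooId = gp.GrandParentFooId
left join (
    -- one row per parent, so the outer join cannot fan out
    select ParentFooId,
           count(*) as num_children,
           sum(case when Status = 1 then 1 else 0 end) as child_pending,
           sum(case when Status = 2 then 1 else 0 end) as child_active,
           sum(case when Status = 3 then 1 else 0 end) as child_paused,
           sum(case when Status = 4 then 1 else 0 end) as child_completed
    from ChildFoo
    group by ParentFooId
) cc on cc.ParentFooId = pf.ParentFooId
group by gp.GrandParentFooId
order by gp.GrandParentFooId
"""

rollup_rows = conn.execute(ROLLUP_SQL).fetchall()
for row in rollup_rows:
    print(row)
```

Because each parent appears once in the joined result, the parent sums are not inflated, and the child figures are simply the per-parent subtotals added up. Whether this is actually faster than COUNT(DISTINCT) depends on the engine and the data, so it is worth comparing execution plans before adopting it.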