SQL层次结构 - 解析给定节点的所有祖先的完整路径

时间:2021-03-02 08:40:58

I have a hierarchy described by an adjacency list. There is not necessarily a single root element, but I do have data to identify the leaf (terminal) items in the hiearchy. So, a hierachy that looked like this ...

我有一个由邻接列表描述的层次结构。没有一个根元素,但我确实有数据来识别层次结构中的叶子(终端)项目。所以,一个看起来像这样的层次......

1
- 2
- - 4
- - - 7
- 3
- - 5
- - 6 
8
- 9

... would be described by a table, like this. NOTE: I don't have the ability to change this format.

......将用表格来描述,就像这样。注意:我无法更改此格式。

id  parentid isleaf
--- -------- ------
1   null     0
2   1        0
3   1        0
4   2        0
5   3        1
6   3        1
7   4        1
8   null     0
9   8        1

here is the sample table definition and data:

这是示例表定义和数据:

CREATE TABLE [dbo].[HiearchyTest](
    [id] [int] NOT NULL,
    [parentid] [int] NULL,
    [isleaf] [bit] NOT NULL
)
GO

INSERT [dbo].[HiearchyTest] ([id], [parentid], [isleaf]) VALUES (1, NULL, 0)
INSERT [dbo].[HiearchyTest] ([id], [parentid], [isleaf]) VALUES (2, 1, 0)
INSERT [dbo].[HiearchyTest] ([id], [parentid], [isleaf]) VALUES (3, 1, 0)
INSERT [dbo].[HiearchyTest] ([id], [parentid], [isleaf]) VALUES (4, 2, 0)
INSERT [dbo].[HiearchyTest] ([id], [parentid], [isleaf]) VALUES (5, 3, 1)
INSERT [dbo].[HiearchyTest] ([id], [parentid], [isleaf]) VALUES (6, 3, 1)
INSERT [dbo].[HiearchyTest] ([id], [parentid], [isleaf]) VALUES (7, 4, 1)
INSERT [dbo].[HiearchyTest] ([id], [parentid], [isleaf]) VALUES (8, NULL, 0)
INSERT [dbo].[HiearchyTest] ([id], [parentid], [isleaf]) VALUES (9, 8, 1)
GO

From this, I need to provide any id and get a list of all ancestors including all descendents of each. So, if I provided the input of id = 6, I would expect the following:

从这里,我需要提供任何id并获得所有祖先的列表,包括每个祖先的所有后代。所以,如果我提供id = 6的输入,我会期望以下内容:

id descendentid
-- ------------
1  1
1  3
1  6
3  3
3  6
6  6
  • id 6 just has itself
  • id 6就是它自己

  • its parent, id 3 would have decendents of 3 and 6
  • 它的父母,身份3将具有3和6的后嗣

  • its parent, id 1 would have decendents of 1, 3, and 6
  • 它的父,id 1将具有1,3和6的后瞻

I will be using this data to provide roll-up calculations at each level in the hierarchy. This works well, assuming I can get the dataset above.

我将使用此数据在层次结构中的每个级别提供汇总计算。假设我可以获得上面的数据集,这很有效。

I have accomplished this using two recusive ctes - one to get the "terminal" item for each node in the hiearchy. Then, a second one where I get the full ancestory of my selected node (so, 6 resolves to 6, 3, 1) to walk up and get the full set. I'm hoping that I'm missing something and that this can be accomplished in one round. Here is the example double-recursion code:

我使用两个recusive ctes完成了这个 - 一个用于获取hiearchy中每个节点的“terminal”项。然后,第二个,我得到我所选节点的完整祖先(所以,6解析为6,3,1),然后走上去获得全套。我希望我错过了一些东西,这可以在一轮中完成。这是示例双递归代码:

declare @test int = 6;

with cte as (

    -- leaf nodes
    select id, parentid, id as terminalid
    from HiearchyTest
    where isleaf = 1

    union all

    -- walk up - preserve "terminal" item for all levels
    select h.id, h.parentid, c.terminalid
    from HiearchyTest as h
    inner join
    cte as c on h.id = c.parentid

)

, cte2 as (

    -- get all ancestors of our test value
    select id, parentid, id as descendentid
    from cte
    where terminalid = @test 

    union all

    -- and walkup from each to complete the set
    select h.id, h.parentid, c.descendentid
    from HiearchyTest h
    inner join cte2 as c on h.id = c.parentid

)

-- final selection - order by is just for readability of this example
select id, descendentid 
from cte2
order by id, descendentid

Additional detail: the "real" hierarchy will be much larger than the example. It can technically have infinite depth, but realistically it would rarely go more than 10 levels deep.

其他细节:“真实”层次结构将比示例大得多。它在技术上可以具有无限深度,但实际上它很少会超过10级。

In summary, my question is if I can accomplish this with a single recursive cte instead of having to recurse over the hierarchy twice.

总而言之,我的问题是我是否可以使用单个递归cte完成此操作,而不必两次递归层次结构。

3 个解决方案

#1


1  

Okay this has been bothering me since I have read the question and I just came back to think of it again..... Anyway, why do you need to recurse back down to get all of the descendants? You have asked for ancestors not descendants and your result set is not trying to get other siblings, grand children, etc.. It is getting a parent and a grand parent in this case. Your First cte gives you everything you need to know except when an ancestor id is also the parentid. So with a union all, a little magic to setup the originating ancestor, and you have everything you need without a second recursion.

好吧,这一直困扰着我,因为我已经阅读了这个问题而且我刚回来再想一想.....无论如何,为什么你需要回归以获得所有的后代?你已经要求祖先不是后代,而你的结果集并没有试图让其他兄弟姐妹,大孩子等等。在这种情况下,它会得到一个父母和一个大父母。你的第一个cte为你提供了你需要知道的一切,除非祖先身份也是parentid。因此,使用联合所有,设置原始祖先有点神奇,并且您拥有所需的一切而无需第二次递归。

declare @test int = 6;

with cte as (

    -- leaf nodes
    select id, parentid, id as terminalid
    from HiearchyTest
    where isleaf = 1

    union all

    -- walk up - preserve "terminal" item for all levels
    select h.id, h.parentid, c.terminalid
    from HiearchyTest as h
    inner join
    cte as c on h.id = c.parentid

)

, cteAncestors AS (

    SELECT DISTINCT
       id = IIF(parentid IS NULL, @Test, id)
       ,parentid = IIF(parentid IS NULL,id,parentid)
    FROM
       cte
    WHERE
       terminalid = @test

    UNION

    SELECT DISTINCT
       id
       ,parentid = id
    FROM
       cte
    WHERE
       terminalid = @test
) 

SELECT
    id = parentid
    ,DecendentId = id
FROM
    cteAncestors
ORDER BY
    id
    ,DecendentId

Your result set from your first cte gives you your 2 ancestors and self related to their ancestor except in the case of the originating ancestors who's parentid is null. That null is a special case I will deal with in a minute.

你的第一个cte的结果集给你2个祖先和他们祖先的自我相关,除非原始祖先的parentid是空的。那个null是我将在一分钟内处理的一个特例。

SQL层次结构 - 解析给定节点的所有祖先的完整路径

Remember at this point your query is producing Ancestors not descendants, but what it doesn't give you is self references meaning grandparent = grandparent, parent = parent, self = self. But all you have to do to get that is to add rows for every id and make the parentid equal to their id. hence the union. Now your result set is almost totally shaped up:

请记住,此时你的查询是产生祖先而不是后代,但它没有给你的是自我引用意味着祖父母=祖父母,父母=父母,自我=自我。但是你需要做的就是为每个id添加行,并使parentid等于它们的id。因此工会。现在您的结果集几乎完全形成:

SQL层次结构 - 解析给定节点的所有祖先的完整路径

The special case of the null parentid. So the null parentid identifies the originating ancestor meaning that ancestor has no other ancestor in your dataset. And here is how you will use that to your advantage. Because you started your initial recursion at the leaf level there is no direct tie to the id that you started with to the originating ancestor, but there is at every other level, simply hijack that null parent id and flip the values around and you now have an ancestor for your leaf.

null parentid的特例。因此,null parentid标识了原始祖先,这意味着祖先在您的数据集中没有其他祖先。以下是您将如何利用这一优势。因为你在叶子级别开始你的初始递归,所以你没有直接绑定到你开始使用祖先的id,但是在其他每个级别上,只是劫持那个空的父id并翻转你周围的值,你现在有了你的叶子的祖先。

SQL层次结构 - 解析给定节点的所有祖先的完整路径

Then in the end if you want it to be a descendants table switch the columns and you are finished. One last note DISTINCTs are there in case the id is repeated with an additional parentid. E.g. 6 | 3 and another record for 6 | 4

然后最后如果你想让它成为一个后代表切换列,你就完成了。最后一个注意事项DISTINCT是存在的,以防id重复另一个parentid。例如。 6 | 3和另一个记录为6 | 4

SQL层次结构 - 解析给定节点的所有祖先的完整路径

#2


1  

I'm not sure if this performs better, or even produces the proper results in all cases, but you could capture a node list, then use xml functionality to parse it out and cross apply to the id list:

我不确定这是否表现更好,甚至在所有情况下都能产生正确的结果,但您可以捕获节点列表,然后使用xml功能将其解析出来并交叉应用于id列表:

declare @test int = 6;

;WITH cte AS (SELECT id, parentid, CAST(id AS VARCHAR(MAX)) as IDlist
              FROM HiearchyTest
              WHERE isleaf = 1
              UNION ALL
              SELECT h.id, h.parentid , CAST(CONCAT(c.IDlist,',',h.id) AS VARCHAR(MAX))
              FROM HiearchyTest as h
              JOIN cte as c 
                ON  h.id = c.parentid
            )
    ,cte2 AS (SELECT *, CAST ('<M>' + REPLACE(IDlist, ',', '</M><M>') + '</M>' AS XML) AS Data 
              FROM cte
              WHERE IDlist LIKE '%'+CAST(@test AS VARCHAR(50))+'%'
              )
SELECT id,Split.a.value('.', 'VARCHAR(100)') AS descendentid
FROM cte2 a
CROSS APPLY Data.nodes ('/M') AS Split(a); 

#3


1  

Because your data is a tree structure, we can use the hierarchyid data type to meet your needs (despite your saying that you can't in the comments). First, the easy part - generating the hierarchyid with a recursive cte

因为您的数据是树结构,所以我们可以使用hierarchyid数据类型来满足您的需求(尽管您说不能在评论中)。首先,简单的部分 - 使用递归cte生成hierarchyid

with cte as (

    select id, parentid, 
       cast(concat('/', id, '/') as varchar(max)) as [path]
    from [dbo].[HiearchyTest]
    where ParentID is null

    union all

    select child.id, child.parentid, 
       cast(concat(parent.[path], child.id, '/') as varchar(max))
    from [dbo].[HiearchyTest] as child
    join cte as parent
        on child.parentid = parent.id
)
select id, parentid, cast([path] as hierarchyid) as [path] 
into h
from cte;

Next, a little table-valued function I wrote:

接下来,我写了一个小的表值函数:

create function dbo.GetAllAncestors(@h hierarchyid, @ReturnSelf bit)
returns table
as return
   select @h.GetAncestor(n.n) as h
   from dbo.Numbers as n
   where n.n <= @h.GetLevel()
      or (@ReturnSelf = 1 and n.n = 0)

   union all

   select @h
   where @ReturnSelf = 1;

Armed with that, getting your desired result set isn't too bad:

有了这个,获得你想要的结果也不错:

declare @h hierarchyid;

set @h = (
    select path
    from h
    where id = 6
);

with cte as (
    select * 
    from h
    where [path].IsDescendantOf(@h) = 1
        or @h.IsDescendantOf([path]) = 1
)
select h.id as parent, c.id as descendentid
from cte as c
cross apply dbo.GetAllAncestors([path], 1) as a
join h
    on a.h = h.[path]
order by h.id, c.id;

Of course, you're missing out on a lot of the benefit of using a hierarchyid by not persisting it (you'll either have to keep it up to date in the side table or generate it every time). But there you go.

当然,你错过了使用hierarchyid而不是持久化的很多好处(你要么必须在边桌中保持最新,要么每次都生成它)。但是你去吧。

#1


1  

Okay this has been bothering me since I have read the question and I just came back to think of it again..... Anyway, why do you need to recurse back down to get all of the descendants? You have asked for ancestors not descendants and your result set is not trying to get other siblings, grand children, etc.. It is getting a parent and a grand parent in this case. Your First cte gives you everything you need to know except when an ancestor id is also the parentid. So with a union all, a little magic to setup the originating ancestor, and you have everything you need without a second recursion.

好吧,这一直困扰着我,因为我已经阅读了这个问题而且我刚回来再想一想.....无论如何,为什么你需要回归以获得所有的后代?你已经要求祖先不是后代,而你的结果集并没有试图让其他兄弟姐妹,大孩子等等。在这种情况下,它会得到一个父母和一个大父母。你的第一个cte为你提供了你需要知道的一切,除非祖先身份也是parentid。因此,使用联合所有,设置原始祖先有点神奇,并且您拥有所需的一切而无需第二次递归。

declare @test int = 6;

with cte as (

    -- leaf nodes
    select id, parentid, id as terminalid
    from HiearchyTest
    where isleaf = 1

    union all

    -- walk up - preserve "terminal" item for all levels
    select h.id, h.parentid, c.terminalid
    from HiearchyTest as h
    inner join
    cte as c on h.id = c.parentid

)

, cteAncestors AS (

    SELECT DISTINCT
       id = IIF(parentid IS NULL, @Test, id)
       ,parentid = IIF(parentid IS NULL,id,parentid)
    FROM
       cte
    WHERE
       terminalid = @test

    UNION

    SELECT DISTINCT
       id
       ,parentid = id
    FROM
       cte
    WHERE
       terminalid = @test
) 

SELECT
    id = parentid
    ,DecendentId = id
FROM
    cteAncestors
ORDER BY
    id
    ,DecendentId

Your result set from your first cte gives you your 2 ancestors and self related to their ancestor except in the case of the originating ancestors who's parentid is null. That null is a special case I will deal with in a minute.

你的第一个cte的结果集给你2个祖先和他们祖先的自我相关,除非原始祖先的parentid是空的。那个null是我将在一分钟内处理的一个特例。

SQL层次结构 - 解析给定节点的所有祖先的完整路径

Remember at this point your query is producing Ancestors not descendants, but what it doesn't give you is self references meaning grandparent = grandparent, parent = parent, self = self. But all you have to do to get that is to add rows for every id and make the parentid equal to their id. hence the union. Now your result set is almost totally shaped up:

请记住,此时你的查询是产生祖先而不是后代,但它没有给你的是自我引用意味着祖父母=祖父母,父母=父母,自我=自我。但是你需要做的就是为每个id添加行,并使parentid等于它们的id。因此工会。现在您的结果集几乎完全形成:

SQL层次结构 - 解析给定节点的所有祖先的完整路径

The special case of the null parentid. So the null parentid identifies the originating ancestor meaning that ancestor has no other ancestor in your dataset. And here is how you will use that to your advantage. Because you started your initial recursion at the leaf level there is no direct tie to the id that you started with to the originating ancestor, but there is at every other level, simply hijack that null parent id and flip the values around and you now have an ancestor for your leaf.

null parentid的特例。因此,null parentid标识了原始祖先,这意味着祖先在您的数据集中没有其他祖先。以下是您将如何利用这一优势。因为你在叶子级别开始你的初始递归,所以你没有直接绑定到你开始使用祖先的id,但是在其他每个级别上,只是劫持那个空的父id并翻转你周围的值,你现在有了你的叶子的祖先。

SQL层次结构 - 解析给定节点的所有祖先的完整路径

Then in the end if you want it to be a descendants table switch the columns and you are finished. One last note DISTINCTs are there in case the id is repeated with an additional parentid. E.g. 6 | 3 and another record for 6 | 4

然后最后如果你想让它成为一个后代表切换列,你就完成了。最后一个注意事项DISTINCT是存在的,以防id重复另一个parentid。例如。 6 | 3和另一个记录为6 | 4

SQL层次结构 - 解析给定节点的所有祖先的完整路径

#2


1  

I'm not sure if this performs better, or even produces the proper results in all cases, but you could capture a node list, then use xml functionality to parse it out and cross apply to the id list:

我不确定这是否表现更好,甚至在所有情况下都能产生正确的结果,但您可以捕获节点列表,然后使用xml功能将其解析出来并交叉应用于id列表:

declare @test int = 6;

;WITH cte AS (SELECT id, parentid, CAST(id AS VARCHAR(MAX)) as IDlist
              FROM HiearchyTest
              WHERE isleaf = 1
              UNION ALL
              SELECT h.id, h.parentid , CAST(CONCAT(c.IDlist,',',h.id) AS VARCHAR(MAX))
              FROM HiearchyTest as h
              JOIN cte as c 
                ON  h.id = c.parentid
            )
    ,cte2 AS (SELECT *, CAST ('<M>' + REPLACE(IDlist, ',', '</M><M>') + '</M>' AS XML) AS Data 
              FROM cte
              WHERE IDlist LIKE '%'+CAST(@test AS VARCHAR(50))+'%'
              )
SELECT id,Split.a.value('.', 'VARCHAR(100)') AS descendentid
FROM cte2 a
CROSS APPLY Data.nodes ('/M') AS Split(a); 

#3


1  

Because your data is a tree structure, we can use the hierarchyid data type to meet your needs (despite your saying that you can't in the comments). First, the easy part - generating the hierarchyid with a recursive cte

因为您的数据是树结构,所以我们可以使用hierarchyid数据类型来满足您的需求(尽管您说不能在评论中)。首先,简单的部分 - 使用递归cte生成hierarchyid

with cte as (

    select id, parentid, 
       cast(concat('/', id, '/') as varchar(max)) as [path]
    from [dbo].[HiearchyTest]
    where ParentID is null

    union all

    select child.id, child.parentid, 
       cast(concat(parent.[path], child.id, '/') as varchar(max))
    from [dbo].[HiearchyTest] as child
    join cte as parent
        on child.parentid = parent.id
)
select id, parentid, cast([path] as hierarchyid) as [path] 
into h
from cte;

Next, a little table-valued function I wrote:

接下来,我写了一个小的表值函数:

create function dbo.GetAllAncestors(@h hierarchyid, @ReturnSelf bit)
returns table
as return
   select @h.GetAncestor(n.n) as h
   from dbo.Numbers as n
   where n.n <= @h.GetLevel()
      or (@ReturnSelf = 1 and n.n = 0)

   union all

   select @h
   where @ReturnSelf = 1;

Armed with that, getting your desired result set isn't too bad:

有了这个,获得你想要的结果也不错:

declare @h hierarchyid;

set @h = (
    select path
    from h
    where id = 6
);

with cte as (
    select * 
    from h
    where [path].IsDescendantOf(@h) = 1
        or @h.IsDescendantOf([path]) = 1
)
select h.id as parent, c.id as descendentid
from cte as c
cross apply dbo.GetAllAncestors([path], 1) as a
join h
    on a.h = h.[path]
order by h.id, c.id;

Of course, you're missing out on a lot of the benefit of using a hierarchyid by not persisting it (you'll either have to keep it up to date in the side table or generate it every time). But there you go.

当然,你错过了使用hierarchyid而不是持久化的很多好处(你要么必须在边桌中保持最新,要么每次都生成它)。但是你去吧。