Hierarchical data in LINQ - options and performance

Time: 2020-12-18 16:31:30

I have some hierarchical data - each entry has an id and a (nullable) parent entry id. I want to retrieve all entries in the tree under a given entry. This is in a SQL Server 2005 database. I am querying it with LINQ to SQL in C# 3.5.

LINQ to SQL does not support Common Table Expressions directly. My choices are to assemble the data in code with several LINQ queries, or to make a view on the database that surfaces a CTE.

Which option (or another option) do you think will perform better when data volumes get large? Is SQL Server 2008's HierarchyId type supported in LINQ to SQL?

9 answers

#1


6  

I would set up a view and an associated table-valued function based on the CTE. My reasoning is that, while you could implement the logic on the application side, doing so would mean sending the intermediate data over the wire for computation in the application. In the DBML designer, the view translates into a Table entity. You can then associate the function with that Table entity and invoke the method created on the DataContext to get objects of the type defined by the view. Using the table-valued function lets the query engine take your parameters into account while it constructs the result set, rather than applying a condition after the fact to the result set defined by the view.

CREATE TABLE [dbo].[hierarchical_table](
    [id] [int] IDENTITY(1,1) NOT NULL,
    [parent_id] [int] NULL,
    [data] [varchar](255) NOT NULL,
 CONSTRAINT [PK_hierarchical_table] PRIMARY KEY CLUSTERED 
(
    [id] ASC
)WITH (PAD_INDEX  = OFF, STATISTICS_NORECOMPUTE  = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS  = ON, ALLOW_PAGE_LOCKS  = ON) ON [PRIMARY]
) ON [PRIMARY]
GO

CREATE VIEW [dbo].[vw_recursive_view]
AS
WITH hierarchy_cte(id, parent_id, data, lvl) AS
(
    -- Anchor: root rows (no parent) start at level 0.
    SELECT id, parent_id, data, 0 AS lvl
    FROM dbo.hierarchical_table
    WHERE parent_id IS NULL
    UNION ALL
    -- Recursive step: attach each row to its already-found parent.
    SELECT t1.id, t1.parent_id, t1.data, h.lvl + 1 AS lvl
    FROM dbo.hierarchical_table AS t1
    INNER JOIN hierarchy_cte AS h ON t1.parent_id = h.id
)
SELECT id, parent_id, data, lvl
FROM hierarchy_cte AS result
GO


CREATE FUNCTION [dbo].[fn_tree_for_parent]
(
    @parent int
)
RETURNS @result TABLE
(
    id int NOT NULL,
    parent_id int,
    data varchar(255) NOT NULL,
    lvl int NOT NULL
)
AS
BEGIN
    -- Anchor: the requested node, or every root row when @parent is NULL.
    WITH hierarchy_cte(id, parent_id, data, lvl) AS
    (
        SELECT id, parent_id, data, 0 AS lvl
        FROM dbo.hierarchical_table
        WHERE id = @parent OR (parent_id IS NULL AND @parent IS NULL)
        UNION ALL
        -- Recursive step: add the children of rows found so far.
        SELECT t1.id, t1.parent_id, t1.data, h.lvl + 1 AS lvl
        FROM dbo.hierarchical_table AS t1
        INNER JOIN hierarchy_cte AS h ON t1.parent_id = h.id
    )
    INSERT INTO @result
    SELECT id, parent_id, data, lvl
    FROM hierarchy_cte AS result
    RETURN
END
GO

ALTER TABLE [dbo].[hierarchical_table]  WITH CHECK ADD  CONSTRAINT [FK_hierarchical_table_hierarchical_table] FOREIGN KEY([parent_id])
REFERENCES [dbo].[hierarchical_table] ([id])

ALTER TABLE [dbo].[hierarchical_table] CHECK CONSTRAINT [FK_hierarchical_table_hierarchical_table]

To use it, you would do something like the following, assuming some reasonable naming scheme:

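// Declare the variable as the generated context type so that its
// table and function members are visible to the compiler.
using (HierarchicalDataContext dc = new HierarchicalDataContext())
{
    HierarchicalTableEntity h = (from e in dc.HierarchicalTableEntities
                                 select e).First();
    var query = dc.FnTreeForParent(h.ID);
    foreach (HierarchicalTableViewEntity entity in query)
    {
        // ...process the tree node...
    }
}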

#2


15  

This option might also prove useful:

LINQ AsHierarchy() extension method
http://www.scip.be/index.php?Page=ArticlesNET18

#3


8  

I am surprised nobody has mentioned an alternative database design. When a hierarchy needs to be flattened across multiple levels and retrieved with high performance (and storage space is not the main concern), it is better to use a separate entity-to-entity table to track the hierarchy instead of the parent_id approach.

This allows not only single-parent relations but also multi-parent relations, level indicators, and different types of relationships:

CREATE TABLE Person (
  Id INTEGER,
  Name TEXT
);

CREATE TABLE PersonInPerson (
  PersonId INTEGER NOT NULL,
  InPersonId INTEGER NOT NULL,
  Level INTEGER,
  RelationKind VARCHAR(1)
);
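
As a rough illustration of why this design makes retrieval cheap, here is a minimal in-memory sketch. It assumes (the answer does not say so explicitly) that PersonInPerson holds one row for every descendant/ancestor pair, that PersonId is the descendant and InPersonId the ancestor, and that Level is the distance between them. The class and method names are hypothetical stand-ins for LINQ to SQL entities.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-ins for rows of the Person and PersonInPerson tables.
public class Person
{
    public int Id;
    public string Name;
}

public class PersonInPerson
{
    public int PersonId;    // assumed: the descendant
    public int InPersonId;  // assumed: the ancestor that contains it
    public int Level;       // assumed: distance (1 = direct child)
}

public static class ClosureTableSketch
{
    // Returns everyone below ancestorId with a single join and no
    // recursion, ordered by level and then by id.
    public static List<Person> Descendants(
        IEnumerable<Person> people, IEnumerable<PersonInPerson> links, int ancestorId)
    {
        return (from link in links
                where link.InPersonId == ancestorId
                join p in people on link.PersonId equals p.Id
                orderby link.Level, p.Id
                select p).ToList();
    }

    public static void Main()
    {
        var people = new List<Person>
        {
            new Person { Id = 1, Name = "Root" },
            new Person { Id = 2, Name = "Child" },
            new Person { Id = 3, Name = "Grandchild" }
        };
        // One row per (descendant, ancestor) pair, including indirect ones.
        var links = new List<PersonInPerson>
        {
            new PersonInPerson { PersonId = 2, InPersonId = 1, Level = 1 },
            new PersonInPerson { PersonId = 3, InPersonId = 2, Level = 1 },
            new PersonInPerson { PersonId = 3, InPersonId = 1, Level = 2 }
        };
        foreach (Person p in Descendants(people, links, 1))
            Console.WriteLine(p.Name);
    }
}
```

The trade-off is on the write side: inserting or moving a node means maintaining one link row per ancestor, which is the storage cost the answer mentions.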

#4


3  

I have done this two ways:

  1. Drive the retrieval of each layer of the tree based on user input. Imagine a tree view control populated with the root node, the children of the root, and the grandchildren of the root. Only the root and the children are expanded (the grandchildren are loaded but hidden while their parents are collapsed). When the user expands a child node, the grandchildren of the root are displayed (they were previously retrieved and hidden), and a retrieval of all of the great-grandchildren is launched. Repeat the pattern for N layers deep. This pattern works very well for large trees (deep or wide) because it only retrieves the portion of the tree that is needed.
  2. Use a stored procedure with LINQ. Use something like a common table expression on the server to build your results in a flat table, or build an XML tree in T-SQL. Scott Guthrie has a great article about using stored procs in LINQ. If the results come back in a flat format, build your tree from them; if you return an XML tree, use that directly.
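
For the second approach, the "build your tree from the results" step might look something like the sketch below. The Row and TreeNode types are hypothetical; Row simply mirrors an (id, parent id, data) result row, and nothing here is specific to LINQ to SQL.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical shape of one flat row returned by the stored procedure.
public class Row
{
    public int Id;
    public int? ParentId;
    public string Data;
}

public class TreeNode
{
    public Row Row;
    public List<TreeNode> Children = new List<TreeNode>();
}

public static class TreeBuilder
{
    // Two passes: index every row by id, then attach each node to its
    // parent. Rows with no parent, or whose parent is not in the input,
    // become roots.
    public static List<TreeNode> Build(IEnumerable<Row> rows)
    {
        var byId = new Dictionary<int, TreeNode>();
        var nodes = new List<TreeNode>();
        foreach (Row row in rows)
        {
            var node = new TreeNode { Row = row };
            byId[row.Id] = node;
            nodes.Add(node);
        }

        var roots = new List<TreeNode>();
        foreach (TreeNode node in nodes)
        {
            TreeNode parent;
            if (node.Row.ParentId.HasValue &&
                byId.TryGetValue(node.Row.ParentId.Value, out parent))
                parent.Children.Add(node);
            else
                roots.Add(node);
        }
        return roots;
    }

    public static void Main()
    {
        var rows = new List<Row>
        {
            new Row { Id = 1, ParentId = null, Data = "root" },
            new Row { Id = 2, ParentId = 1, Data = "child a" },
            new Row { Id = 3, ParentId = 1, Data = "child b" },
            new Row { Id = 4, ParentId = 2, Data = "grandchild" }
        };
        foreach (TreeNode root in Build(rows))
            Console.WriteLine(root.Row.Data + ": " + root.Children.Count + " children");
    }
}
```

Because the lookup is a dictionary, the rows do not need to arrive in any particular order.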

#5


3  

This extension method could potentially be modified to use IQueryable. I've used it successfully in the past on a collection of objects. It may work for your scenario.

public static IEnumerable<T> ByHierarchy<T>(
    this IEnumerable<T> source, Func<T, bool> startWith, Func<T, T, bool> connectBy)
{
    if (source == null)
        throw new ArgumentNullException("source");

    if (startWith == null)
        throw new ArgumentNullException("startWith");

    if (connectBy == null)
        throw new ArgumentNullException("connectBy");

    // Depth-first: yield each matching node, then recurse into its children.
    foreach (T root in source.Where(startWith))
    {
        yield return root;
        foreach (T child in source.ByHierarchy(c => connectBy(root, c), connectBy))
        {
            yield return child;
        }
    }
}

Here is how I called it:

comments.ByHierarchy(comment => comment.ParentNum == parentNum, 
 (parent, child) => child.ParentNum == parent.CommentNum && includeChildren)

This code is an improved, bug-fixed version of the code found here.

#6


2  

In SQL Server 2008 you can use HierarchyID directly; in SQL Server 2005 you may have to implement something equivalent manually. The ParentID approach does not perform that well on large data sets. Also check this article for more discussion on the topic.

#7


1  

I got this approach from Rob Conery's blog (check around Pt. 6 for this code, also on CodePlex) and I love using it. It could be refashioned to support multiple "sub" levels.

var categories = from c in db.Categories
                 select new Category
                 {
                     CategoryID = c.CategoryID,
                     ParentCategoryID = c.ParentCategoryID,
                     // One level of children, loaded by a nested query.
                     SubCategories = new List<Category>(
                         from sc in db.Categories
                         where sc.ParentCategoryID == c.CategoryID
                         select new Category
                         {
                             CategoryID = sc.CategoryID,
                             ParentCategoryID = sc.ParentCategoryID
                         })
                 };

#8


0  

The trouble with fetching the data from the client side is that you can never be sure how deep you need to go. This method does one roundtrip per depth, and the per-depth queries could be unioned to cover depth 0 through a specified depth in one roundtrip.

public IQueryable<Node> GetChildrenAtDepth(int NodeID, int depth)
{
    IQueryable<Node> query = db.Nodes.Where(n => n.NodeID == NodeID);
    for (int i = 0; i < depth; i++)
    {
        query = query.SelectMany(n => n.Children);
        // Use this instead if the Children association has not been defined:
        // query = query.SelectMany(n => db.Nodes.Where(c => c.ParentID == n.NodeID));
    }
    return query;
}

It can't, however, do arbitrary depth. If you really do require arbitrary depth, you need to do that in the database - so you can make the correct decision to stop.
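
The "unioned" variant mentioned above could be sketched as follows. This is only an illustration run against an in-memory list via AsQueryable(); the Node class, the GetSubtreeToDepth name, and the nodes parameter (standing in for db.Nodes) are assumptions. Whether LINQ to SQL turns the Concat calls into a single UNION ALL statement is something to confirm by inspecting the generated SQL, for example through DataContext.Log.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for the mapped Node entity.
public class Node
{
    public int NodeID;
    public int? ParentID;
}

public static class DepthQuerySketch
{
    // Builds one query covering depths 0..maxDepth by concatenating the
    // per-depth queries. It still stops at maxDepth; it does not discover
    // the real depth of the tree.
    public static IQueryable<Node> GetSubtreeToDepth(
        IQueryable<Node> nodes, int nodeId, int maxDepth)
    {
        IQueryable<Node> level = nodes.Where(n => n.NodeID == nodeId);
        IQueryable<Node> all = level;
        for (int i = 0; i < maxDepth; i++)
        {
            // Children of the previous level.
            level = level.SelectMany(n => nodes.Where(c => c.ParentID == n.NodeID));
            all = all.Concat(level);
        }
        return all;
    }

    public static void Main()
    {
        IQueryable<Node> nodes = new List<Node>
        {
            new Node { NodeID = 1, ParentID = null },
            new Node { NodeID = 2, ParentID = 1 },
            new Node { NodeID = 3, ParentID = 1 },
            new Node { NodeID = 4, ParentID = 2 },
            new Node { NodeID = 5, ParentID = 4 }
        }.AsQueryable();

        // Node 5 sits at depth 3 and is therefore not part of this result.
        foreach (Node n in GetSubtreeToDepth(nodes, 1, 2))
            Console.WriteLine(n.NodeID);
    }
}
```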

但是,它不能做任意深度。如果你确实需要任意深度,你需要在数据库中这样做 - 这样你就可以做出正确的决定来停止。

#9


0  

Please read the following link.

http://support.microsoft.com/default.aspx?scid=kb;en-us;q248915
