Hierarchical filtering in a relational database

I have a bunch of items in my program that each belong to a specific category. I'd like to return only the items that belong to a given category. The problem is that categories can have parent categories. For example, let's say there's a category "Stuff" with the child category "Food", which in turn has the child category "Fruit". I have the items Apple, Pear, Chocolate, and Computer.

If I want to display all of the fruits, it's easy to do a database query with a "WHERE item.category = FRUIT_ID" clause. However, if I want all foods to be included, I need a way to get the fruits in there, too.

I know that some databases, like Oracle, have a notion of recursive queries, and that might be the right solution, but I don't have a lot of experience with hierarchical data and am looking for general suggestions. Assume I have unlimited control over the database schema, the category tree goes at most about 5 levels deep, and I need it to be as ridiculously fast as possible.

5 solutions

#1

Have a look at the nested set model (usually contrasted with the adjacency list model). It's not perfect (it's very slow to update), but for hierarchical queries it's a great representation, especially for problems like yours.
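
For reference, a layout with the trade-off described here (expensive updates, cheap subtree reads) is the nested set encoding, where every category stores a left and right bound and a subtree is just a range. Below is a minimal sketch using Python's sqlite3 as a stand-in for whatever RDBMS you use. The table names, the `lft`/`rgt` columns, and the `items_in_subtree` helper are my own assumptions, not something from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE category (id INTEGER PRIMARY KEY, name TEXT, lft INTEGER, rgt INTEGER);
CREATE TABLE item (id INTEGER PRIMARY KEY, name TEXT,
                   category_id INTEGER REFERENCES category(id));
-- Assumed bounds: Stuff(1,6) contains Food(2,5), which contains Fruit(3,4)
INSERT INTO category VALUES (1, 'Stuff', 1, 6), (2, 'Food', 2, 5), (3, 'Fruit', 3, 4);
INSERT INTO item VALUES (1, 'Apple', 3), (2, 'Pear', 3),
                        (3, 'Chocolate', 2), (4, 'Computer', 1);
""")

def items_in_subtree(conn, category_name):
    """Return item names in the named category or any of its descendants."""
    rows = conn.execute(
        """
        SELECT i.name
        FROM item i
        JOIN category c ON c.id = i.category_id
        JOIN category root ON c.lft BETWEEN root.lft AND root.rgt
        WHERE root.name = ?
        ORDER BY i.name
        """,
        (category_name,),
    ).fetchall()
    return [r[0] for r in rows]

print(items_in_subtree(conn, "Food"))
```

Reads need no recursion at all, but inserting or moving a category means renumbering the bounds of other rows, which is where the slow updates come from.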

#2

There's a whole book full of design strategies for representing trees in SQL. It's worth a look for the sheer cleverness alone.

#3

Assuming your category tree is small enough to be cached, you might be better off keeping the category tree in memory and having a function over that tree that generates the list of category IDs below a given category.

Then when you query the database, you just use an IN clause with the list of child IDs.
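
Something along these lines, as a rough sketch. I'm assuming a plain `parent_id` column on the category table that gets loaded into a dict once and cached. The names `PARENT`, `build_children`, `subtree_ids`, and `items_under` are made up for illustration, and sqlite3 is only standing in for the real database.

```python
import sqlite3
from collections import defaultdict

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE item (id INTEGER PRIMARY KEY, name TEXT, category_id INTEGER);
INSERT INTO item VALUES (1, 'Apple', 3), (2, 'Pear', 3),
                        (3, 'Chocolate', 2), (4, 'Computer', 1);
""")

# Hypothetical cached tree: category id -> parent id (1=Stuff, 2=Food, 3=Fruit)
PARENT = {1: None, 2: 1, 3: 2}

def build_children(parent_of):
    """Invert the parent map into parent id -> list of child ids."""
    children = defaultdict(list)
    for child, parent in parent_of.items():
        if parent is not None:
            children[parent].append(child)
    return children

def subtree_ids(children, root_id):
    """Collect root_id plus every category id below it (iterative DFS)."""
    result, stack = [], [root_id]
    while stack:
        node = stack.pop()
        result.append(node)
        stack.extend(children.get(node, []))
    return result

def items_under(conn, children, root_id):
    """Query items whose category is root_id or any descendant, via IN (...)."""
    ids = subtree_ids(children, root_id)
    placeholders = ",".join("?" * len(ids))  # one bound parameter per id
    sql = f"SELECT name FROM item WHERE category_id IN ({placeholders}) ORDER BY name"
    return [row[0] for row in conn.execute(sql, ids)]

children = build_children(PARENT)
print(items_under(conn, children, 2))
```

Note that the ID list includes the starting category itself, otherwise items filed directly under Food (like Chocolate) would be missed. With a tree at most 5 levels deep the list stays short, so the IN clause should be cheap on an indexed column.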

#4

One possible solution is to separate the hierarchy from the actual categorization. For instance, an apple could be categorized as both a fruit and a food. The categorization itself has no knowledge that a fruit is a food, but you could define that somewhere else. Then your query would be as simple as where category='food'.

Alternatively, you could walk the hierarchy before building your query, which would then need something like where category='food' or category='fruit'.
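
The first option (tagging each item with every category it falls under) could look roughly like this. It's only a sketch: the `item_category` junction table and the `add_item` / `items_tagged` helpers are names I'm assuming for illustration, with sqlite3 standing in for your database.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE item (id INTEGER PRIMARY KEY, name TEXT);
-- One row per (item, category) pair; the table knows nothing about the hierarchy
CREATE TABLE item_category (
    item_id  INTEGER REFERENCES item(id),
    category TEXT,
    PRIMARY KEY (item_id, category)
);
CREATE INDEX idx_item_category ON item_category(category);
""")

def add_item(conn, name, categories):
    """Insert an item and tag it with every category it belongs to."""
    cur = conn.execute("INSERT INTO item (name) VALUES (?)", (name,))
    conn.executemany(
        "INSERT INTO item_category VALUES (?, ?)",
        [(cur.lastrowid, c) for c in categories],
    )

def items_tagged(conn, category):
    """Flat lookup: no recursion, just where category = ?"""
    rows = conn.execute(
        """
        SELECT i.name FROM item i
        JOIN item_category ic ON ic.item_id = i.id
        WHERE ic.category = ?
        ORDER BY i.name
        """,
        (category,),
    )
    return [r[0] for r in rows]

# The caller (or some hierarchy definition kept elsewhere) supplies the full path
add_item(conn, "Apple", ["stuff", "food", "fruit"])
add_item(conn, "Pear", ["stuff", "food", "fruit"])
add_item(conn, "Chocolate", ["stuff", "food"])
add_item(conn, "Computer", ["stuff"])
print(items_tagged(conn, "food"))
```

The cost moves to write time: if a category is re-parented, the tags on every affected item have to be regenerated.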

#5

I think your database schema is quite fine, but the implementation of this search really depends on your specific RDBMS. A lot of them have ways to perform this sort of recursion. One example I can think of is SQL Server's support for Common Table Expressions, which are lightning-fast alternatives to those nasty cursors.
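
As an illustration of the recursive CTE idea, here is a small sketch run through Python's sqlite3, since SQLite also supports recursive CTEs. The schema (`category.parent_id`, `item.category_id`) and the `items_recursive` helper are assumptions on my part. SQL Server uses the same shape of query but writes plain `WITH` without the `RECURSIVE` keyword.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE category (id INTEGER PRIMARY KEY, name TEXT,
                       parent_id INTEGER REFERENCES category(id));
CREATE TABLE item (id INTEGER PRIMARY KEY, name TEXT,
                   category_id INTEGER REFERENCES category(id));
INSERT INTO category VALUES (1, 'Stuff', NULL), (2, 'Food', 1), (3, 'Fruit', 2);
INSERT INTO item VALUES (1, 'Apple', 3), (2, 'Pear', 3),
                        (3, 'Chocolate', 2), (4, 'Computer', 1);
""")

def items_recursive(conn, root_id):
    """Return item names in category root_id or any category below it."""
    sql = """
    WITH RECURSIVE subtree(id) AS (
        SELECT id FROM category WHERE id = ?          -- anchor: the starting category
        UNION ALL
        SELECT c.id FROM category c
        JOIN subtree s ON c.parent_id = s.id          -- step: children of rows found so far
    )
    SELECT i.name
    FROM item i
    JOIN subtree s ON i.category_id = s.id
    ORDER BY i.name
    """
    return [row[0] for row in conn.execute(sql, (root_id,))]

print(items_recursive(conn, 2))
```

With a tree only about 5 levels deep, the recursion terminates after a handful of iterations, so an index on `parent_id` is probably all the tuning it needs.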

If you specify which RDBMS you're using, you might get more specific answers.
