I need to implement Categorization and Sub-Categorization on something which is a bit similar to golden pages.
我需要对类似于黄金页面的东西实现分类和子分类。
Assume I have the following table:
假设我有下表:
Category Table
CategoryId, Title
10, Home
20, Business
30, Hobbies
I have two options to code the sub-categorization.
我有两个选项来编码子分类。
OPTION 1 - Subcategory Id is unique within Category ONLY:
Sub Category Table
CategoryId, SubCategoryId, Title
10, 100, Gardening
10, 110, Kitchen
10, 120, ...
20, 100, Development
20, 110, Marketing
20, 120, ...
30, 100, Soccer
30, 110, Reading
30, 120, ...
OPTION 2 - Subcategory Id is unique OVERALL:
Sub Category Table
CategoryId, SubCategoryId, Title
10, 100, Gardening
10, 110, Kitchen
10, 120, ...
20, 130, Development
20, 140, Marketing
20, 150, ...
30, 160, Soccer
30, 170, Reading
30, 180, ...
Option 2 sounds like it is easier to fetch rows from table For example: SELECT BizTitle FROM tblBiz WHERE SubCatId = 170
选项2听起来更容易从表中获取行例如:SELECT BizTitle FROM tblBiz WHERE SubCatId = 170
whereas using Option 1 I'd have to write something like this:
而使用选项1,我必须写这样的东西:
SELECT BizTitle FROM tblBiz WHERE CatId = 30 AND SubCatId = 170
i.e., containing an extra AND
即包含一个额外的AND
However, Option 1 is easier to maintain manually (when I need to update and insert new subcategories etc. and it is more pleasant to the eye in my opinion.
但是,选项1更容易手动维护(当我需要更新和插入新的子类别等时,在我看来它更令人愉悦。
Any thoughts about it? Does Option 2 worth the trouble in terms of efficiency? Is there any design patters related with this common issue?
有什么想法吗?选项2在效率方面是否值得麻烦?有没有与这个常见问题相关的设计模式?
3 个解决方案
#1
8
I would use this structure:
我会用这个结构:
ParentId, CategoryId, Title
null, 1, Home
null, 2, Business
null, 3, Hobbies
1, 4, Gardening
1, 5, Kitchen
1, 6, ...
2, 7, Development
2, 8, Marketing
2, 9, ...
3, 10, Soccer
3, 11, Reading
3, 12, ...
In detail:
详细:
- only use one table, which references itself, so that you can have unlimited depth of categories
- 只使用一个引用自身的表,这样你就可以拥有无限深度的类别
- use technical ids (using
IDENTITY
, or similar), so that you can have more than 10 subcategories - 使用技术ID(使用IDENTITY或类似),以便您可以拥有10个以上的子类别
- if required add a human readable column for category-numbers as separate field
- 如果需要,将类别编号的人类可读列添加为单独的字段
As long as you are only using two levels of categories you can still select like this:
只要您只使用两个级别的类别,您仍然可以选择如下:
SELECT BizTitle FROM tblBiz WHERE ParentId = 3 AND CategoryId = 11
The new hierarchyid
feature of SQL server also looks quite promising: https://msdn.microsoft.com/en-us/library/bb677173.aspx
SQL服务器的新hierarchyid功能看起来也很有希望:https://msdn.microsoft.com/en-us/library/bb677173.aspx
What I don't like about the Nested Set Model:
我不喜欢嵌套集模型:
- Inserting and deleting items in the Nested Set Model is a quite comlicated thing and requires expensive locks.
- 在嵌套集模型中插入和删除项是一个非常复杂的事情,需要昂贵的锁。
- One can easily create inconsistencies which is prohibited, if you use the
parent
field in combination with a foreign key constraint.- Inconsistencies can appear, if
rght
is lower thanlft
- 如果rght低于lft,则会出现不一致
- Inconsistencies can appear, if a value apprears in several
rght
orlft
fields - 如果某个值适用于几个rft或lft字段,则会出现不一致
- Inconsistencies can appear, if you create gaps
- 如果您创建差距,则可能会出现不一致
- Inconsistencies can appear, if you create overlaps
- 如果您创建重叠,则可能会出现不一致
- Inconsistencies can appear, if
- 如果将父字段与外键约束结合使用,则可以轻松地创建禁止的不一致。如果rght低于lft,则会出现不一致。如果某个值出现在几个rft或lft字段中,则会出现不一致。如果创建间隙,则会出现不一致性如果创建重叠,则会出现Inconsistencies
- The Nested Set Model is in my opinion more complex and therefore not as easy to understand. This is absolutely subjective, of course.
- 在我看来,嵌套集模型更复杂,因此不容易理解。当然,这绝对是主观的。
- The Nested Set Model requires two fields, instead of one - and so uses more disk space.
- 嵌套集模型需要两个字段,而不是一个 - 因此使用更多的磁盘空间。
#2
6
Managing hierarchical data has some ways. One of the most important one is Nested Set Model
.
管理分层数据有一些方法。其中最重要的一个是嵌套集模型。
See here for implementation.
请参阅此处以了解实施
Even some content management systems like joomla, use this structure.
甚至像joomla这样的内容管理系统也使用这种结构。
#3
1
I'd recommend going with option 1 - keep sub-category unique within category. Let's say I have two categories of unrelated items:
我建议使用选项1 - 保持子类别在类别中唯一。假设我有两类不相关的项目:
- Fruits
- 水果
- Colors
- 颜色
For each I want a subcategory
对于每个我想要一个子类别
-
Fruits
水果
- Apple
- 苹果
- Orange
- 橙子
-
Colors
颜色
- White
- 白色
- Orange
- 橙子
Notice that Orange is a sub-category within each category. Although the name is the same, it's function is very different. (Let's not get into the possibility that Orange fruit is orange in color)
请注意,Orange是每个类别中的子类别。虽然名称相同,但它的功能却截然不同。 (我们没有考虑到橙色水果是橙色的可能性)
With this design, if someone changes their mind and wants to rename Orange to Oranges, fine. It's easy to change without affecting Orange sub-category under Colors.
有了这个设计,如果有人改变主意并希望将Orange重命名为Oranges,那很好。在不影响颜色下的橙色子类别的情况下,可以轻松更改。
If your UI is built in such a way that Marketing can control subcategories of Colors whereas Production can control subcategories of Fruits, this design will allow Marketing to work with their subcategories without stepping over Production's subcategories.
如果您的UI构建方式使Marketing可以控制Colors的子类别而Production可以控制Fruits的子类别,则此设计将允许Marketing使用其子类别而不会超越Production的子类别。
#1
8
I would use this structure:
我会用这个结构:
ParentId, CategoryId, Title
null, 1, Home
null, 2, Business
null, 3, Hobbies
1, 4, Gardening
1, 5, Kitchen
1, 6, ...
2, 7, Development
2, 8, Marketing
2, 9, ...
3, 10, Soccer
3, 11, Reading
3, 12, ...
In detail:
详细:
- only use one table, which references itself, so that you can have unlimited depth of categories
- 只使用一个引用自身的表,这样你就可以拥有无限深度的类别
- use technical ids (using
IDENTITY
, or similar), so that you can have more than 10 subcategories - 使用技术ID(使用IDENTITY或类似),以便您可以拥有10个以上的子类别
- if required add a human readable column for category-numbers as separate field
- 如果需要,将类别编号的人类可读列添加为单独的字段
As long as you are only using two levels of categories you can still select like this:
只要您只使用两个级别的类别,您仍然可以选择如下:
SELECT BizTitle FROM tblBiz WHERE ParentId = 3 AND CategoryId = 11
The new hierarchyid
feature of SQL server also looks quite promising: https://msdn.microsoft.com/en-us/library/bb677173.aspx
SQL服务器的新hierarchyid功能看起来也很有希望:https://msdn.microsoft.com/en-us/library/bb677173.aspx
What I don't like about the Nested Set Model:
我不喜欢嵌套集模型:
- Inserting and deleting items in the Nested Set Model is a quite comlicated thing and requires expensive locks.
- 在嵌套集模型中插入和删除项是一个非常复杂的事情,需要昂贵的锁。
- One can easily create inconsistencies which is prohibited, if you use the
parent
field in combination with a foreign key constraint.- Inconsistencies can appear, if
rght
is lower thanlft
- 如果rght低于lft,则会出现不一致
- Inconsistencies can appear, if a value apprears in several
rght
orlft
fields - 如果某个值适用于几个rft或lft字段,则会出现不一致
- Inconsistencies can appear, if you create gaps
- 如果您创建差距,则可能会出现不一致
- Inconsistencies can appear, if you create overlaps
- 如果您创建重叠,则可能会出现不一致
- Inconsistencies can appear, if
- 如果将父字段与外键约束结合使用,则可以轻松地创建禁止的不一致。如果rght低于lft,则会出现不一致。如果某个值出现在几个rft或lft字段中,则会出现不一致。如果创建间隙,则会出现不一致性如果创建重叠,则会出现Inconsistencies
- The Nested Set Model is in my opinion more complex and therefore not as easy to understand. This is absolutely subjective, of course.
- 在我看来,嵌套集模型更复杂,因此不容易理解。当然,这绝对是主观的。
- The Nested Set Model requires two fields, instead of one - and so uses more disk space.
- 嵌套集模型需要两个字段,而不是一个 - 因此使用更多的磁盘空间。
#2
6
Managing hierarchical data has some ways. One of the most important one is Nested Set Model
.
管理分层数据有一些方法。其中最重要的一个是嵌套集模型。
See here for implementation.
请参阅此处以了解实施
Even some content management systems like joomla, use this structure.
甚至像joomla这样的内容管理系统也使用这种结构。
#3
1
I'd recommend going with option 1 - keep sub-category unique within category. Let's say I have two categories of unrelated items:
我建议使用选项1 - 保持子类别在类别中唯一。假设我有两类不相关的项目:
- Fruits
- 水果
- Colors
- 颜色
For each I want a subcategory
对于每个我想要一个子类别
-
Fruits
水果
- Apple
- 苹果
- Orange
- 橙子
-
Colors
颜色
- White
- 白色
- Orange
- 橙子
Notice that Orange is a sub-category within each category. Although the name is the same, it's function is very different. (Let's not get into the possibility that Orange fruit is orange in color)
请注意,Orange是每个类别中的子类别。虽然名称相同,但它的功能却截然不同。 (我们没有考虑到橙色水果是橙色的可能性)
With this design, if someone changes their mind and wants to rename Orange to Oranges, fine. It's easy to change without affecting Orange sub-category under Colors.
有了这个设计,如果有人改变主意并希望将Orange重命名为Oranges,那很好。在不影响颜色下的橙色子类别的情况下,可以轻松更改。
If your UI is built in such a way that Marketing can control subcategories of Colors whereas Production can control subcategories of Fruits, this design will allow Marketing to work with their subcategories without stepping over Production's subcategories.
如果您的UI构建方式使Marketing可以控制Colors的子类别而Production可以控制Fruits的子类别,则此设计将允许Marketing使用其子类别而不会超越Production的子类别。