What is the more efficient way to handle tags in a database?

Is it more efficient to use a single taglist field, with all the tags separated by spaces, or to use two more tables (tag: tagid, tagtext; tagitem: tagid, itemid)?

4 answers

#1

The efficiency largely depends on what you are doing. If you want to query by tag name, it is probably faster to have a separate tag table whose ID is keyed in both the tag table and the table that links tags to items (i.e. option #2). However, unless you have thousands of rows in either table, it probably won't make a difference. If you don't have many tags at all, the difference will be even smaller.

If you want to get tags by item IDs, though, the first method is ever so slightly faster. Again, I doubt you will notice.
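
To make the trade-off concrete, here is a minimal sketch of both layouts using Python's built-in sqlite3 module. The `tag` and `tagitem` tables follow the question; the `item_flat` table, the sample rows, and the helper names are assumptions made up for illustration. It shows that reading tags by item ID is a single-row lookup with the taglist field and a join with the two-table layout.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Option 1 (hypothetical table name): one space-separated taglist per item
    CREATE TABLE item_flat (itemid INTEGER PRIMARY KEY, taglist TEXT NOT NULL);
    INSERT INTO item_flat VALUES (1, 'sql database'), (2, 'database');

    -- Option 2: the tag and tagitem tables from the question
    CREATE TABLE tag (tagid INTEGER PRIMARY KEY, tagtext TEXT NOT NULL UNIQUE);
    CREATE TABLE tagitem (
        tagid  INTEGER NOT NULL REFERENCES tag(tagid),
        itemid INTEGER NOT NULL,
        PRIMARY KEY (tagid, itemid)
    );
    INSERT INTO tag VALUES (1, 'sql'), (2, 'database');
    INSERT INTO tagitem VALUES (1, 1), (2, 1), (2, 2);
""")


def tags_flat(itemid):
    """Option 1: read one row by primary key, then split the string."""
    row = conn.execute(
        "SELECT taglist FROM item_flat WHERE itemid = ?", (itemid,)
    ).fetchone()
    return sorted(row[0].split()) if row else []


def tags_normalized(itemid):
    """Option 2: join the link table to the tag table."""
    rows = conn.execute(
        """
        SELECT tag.tagtext
        FROM tagitem
        JOIN tag ON tag.tagid = tagitem.tagid
        WHERE tagitem.itemid = ?
        """,
        (itemid,),
    ).fetchall()
    return sorted(text for (text,) in rows)


print(tags_flat(1), tags_normalized(1))
```

Both helpers return the same tags; the difference is only in how much work the database does to get there, which is unlikely to be measurable on small tables.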

There are other considerations to make: data integrity and normalization. If you use two tables and foreign keys, it is much easier for you to have your set of tags be consistent with the items. If a tag is removed and you are only using one table, old items will still have the old tags. Additionally, it's much easier to get a list of unique tags and keep it consistent. If you have tags in another table, this opens up a whole new world of organization: you can make timestamps for tag creation and modification, mark tags as active or inactive (and possibly other statuses), etc.
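
The integrity argument can be sketched with foreign keys in SQLite via Python's sqlite3 module. The `item` table, the `active` and `created_at` columns, the `ON DELETE CASCADE` choice, and the sample rows are all assumptions for illustration, not something the question specifies. Note that SQLite only enforces foreign keys after `PRAGMA foreign_keys = ON`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite does not enforce FKs by default
conn.executescript("""
    CREATE TABLE item (itemid INTEGER PRIMARY KEY, title TEXT NOT NULL);
    CREATE TABLE tag (
        tagid      INTEGER PRIMARY KEY,
        tagtext    TEXT NOT NULL UNIQUE,               -- one row per distinct tag
        active     INTEGER NOT NULL DEFAULT 1,         -- hypothetical status flag
        created_at TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP
    );
    CREATE TABLE tagitem (
        tagid  INTEGER NOT NULL REFERENCES tag(tagid)   ON DELETE CASCADE,
        itemid INTEGER NOT NULL REFERENCES item(itemid) ON DELETE CASCADE,
        PRIMARY KEY (tagid, itemid)
    );
    INSERT INTO item (itemid, title) VALUES (1, 'First post'), (2, 'Second post');
    INSERT INTO tag (tagid, tagtext) VALUES (1, 'news'), (2, 'obsolete');
    INSERT INTO tagitem (tagid, itemid) VALUES (1, 1), (2, 1), (2, 2);
""")

# Removing a tag also removes its links, so no item keeps a stale tag.
conn.execute("DELETE FROM tag WHERE tagtext = ?", ("obsolete",))
links = conn.execute(
    "SELECT tagid, itemid FROM tagitem ORDER BY tagid, itemid"
).fetchall()

# The list of unique tags is just the tag table.
unique_tags = [t for (t,) in conn.execute("SELECT tagtext FROM tag ORDER BY tagtext")]

# A link to a tag that does not exist is rejected.
try:
    conn.execute("INSERT INTO tagitem (tagid, itemid) VALUES (99, 1)")
    dangling_rejected = False
except sqlite3.IntegrityError:
    dangling_rejected = True

print(links, unique_tags, dangling_rejected)
```

With a single taglist field, the same cleanup would mean rewriting the string on every affected item row.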

#2

The second option. Store the tags separately. You won't be able to write good queries to search on a specific tag if you store them in a single field. You don't want to use MATCH or LIKE to filter on tags. By storing them in a separate table, you can easily find the tags you need, and the related articles too. Your tables do need to be properly indexed, though.

Never store comma/space/otherwise separated values in a database if you need to query for those values. The whole essence of a database is to store the data in a structured way. This way the database can optimize the retrieval of that data to a great extent.
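
A small sketch of why filtering a separated field is awkward, using Python's sqlite3 module. The `item` table with a `taglist` column and the sample rows are assumptions for illustration. A plain substring `LIKE` also matches tags that merely contain the search term, and the usual workaround of padding with separators has to be applied to every row.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE item (
        itemid  INTEGER PRIMARY KEY,
        title   TEXT NOT NULL,
        taglist TEXT NOT NULL          -- space-separated tags (option 1)
    );
    INSERT INTO item VALUES
        (1, 'Java generics', 'java generics'),
        (2, 'JS closures',   'javascript closures'),
        (3, 'Build tools',   'maven java');
""")

# Naive substring match: also hits 'javascript'.
naive = [
    i for (i,) in conn.execute(
        "SELECT itemid FROM item WHERE taglist LIKE ? ORDER BY itemid",
        ("%java%",),
    )
]

# Workaround: pad both sides with the separator so only whole tags match.
padded = [
    i for (i,) in conn.execute(
        "SELECT itemid FROM item WHERE ' ' || taglist || ' ' LIKE ? ORDER BY itemid",
        ("% java %",),
    )
]

print(naive)   # item 2 is a false positive
print(padded)  # only items actually tagged 'java'
```

Both patterns start with a wildcard, so an ordinary index on `taglist` generally cannot be used to narrow the search; the database has to examine each row.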

#3

The second version, splitting the data into two additional tables, is a lot more efficient, as it allows the database to use indexes for the queries you will mostly need: get all texts with a certain tag, get a count of how often each tag is used (sorted by count, for the tag cloud), and get all tags for a given text.
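
The three queries named above might look like the following sketch, written against SQLite through Python's sqlite3 module. The `tag` and `tagitem` tables come from the question; the extra index name, the sample data, and the function names are assumptions. The primary key on `(tagid, itemid)` serves lookups by tag, and the additional index on `(itemid, tagid)` serves lookups by item.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tag (tagid INTEGER PRIMARY KEY, tagtext TEXT NOT NULL UNIQUE);
    CREATE TABLE tagitem (
        tagid  INTEGER NOT NULL REFERENCES tag(tagid),
        itemid INTEGER NOT NULL,
        PRIMARY KEY (tagid, itemid)                        -- lookups by tag
    );
    CREATE INDEX tagitem_by_item ON tagitem (itemid, tagid);  -- lookups by item

    INSERT INTO tag VALUES (1, 'python'), (2, 'sql'), (3, 'tips');
    INSERT INTO tagitem VALUES (1, 1), (3, 1), (2, 2), (3, 2), (3, 3);
""")


def items_with_tag(tagtext):
    """All item IDs carrying a given tag."""
    rows = conn.execute(
        """
        SELECT tagitem.itemid
        FROM tag JOIN tagitem ON tagitem.tagid = tag.tagid
        WHERE tag.tagtext = ?
        ORDER BY tagitem.itemid
        """,
        (tagtext,),
    )
    return [i for (i,) in rows]


def tag_cloud():
    """Usage count per tag, most used first."""
    return conn.execute(
        """
        SELECT tag.tagtext, COUNT(*) AS uses
        FROM tag JOIN tagitem ON tagitem.tagid = tag.tagid
        GROUP BY tag.tagid, tag.tagtext
        ORDER BY uses DESC, tag.tagtext
        """
    ).fetchall()


def tags_for_item(itemid):
    """All tags attached to one item."""
    rows = conn.execute(
        """
        SELECT tag.tagtext
        FROM tagitem JOIN tag ON tag.tagid = tagitem.tagid
        WHERE tagitem.itemid = ?
        ORDER BY tag.tagtext
        """,
        (itemid,),
    )
    return [t for (t,) in rows]


print(items_with_tag("sql"), tag_cloud(), tags_for_item(1))
```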

#4

One table will be more efficient, but having two tables is generally the proper way to store simple tags.
