Which DB design is better?

Time: 2022-12-16 12:48:21

I'm trying to understand some concepts of DB design.

I have three tables:

Movies (id,title)
1 - The godfather
2 - Matrix


Attribute (id,name)
1 - Country
2 - Type

Attribute Value(attribute_id,id,value)

1,1,USA
1,2,Japan
2,1,Thriller
2,2,Comedy

I would like to link each movie to attributes so that a movie has one, and only one, value for any given attribute.

E.g.: Godfather, Country: USA, Type: Crime

I'm trying to find out which of the following is the best solution for linking attributes to a movie. I can see 4 different options:

Schema A

[Schema A diagram: image not preserved]

The problem I see is that I can't restrict a movie to a single attribute_value per attribute. E.g. ("godfather", "USA", "JAPAN") is a valid statement. The restriction would have to be enforced by the application.

Schema B

[Schema B diagram: image not preserved]

It's almost the same as Schema A, but it makes Attribute Value a weak entity. I think this has no effect at the database level, but it would make fetching attribute values a bit harder, since you need the attribute key as well. This schema allows the same category to be repeated multiple times with different values, so I don't think it's a good option either. As with option A, the restriction would have to be enforced by the application.

("godfather","Country:USA","Country:JAPAN") is a valid statement

Schema C

[Schema C diagram: image not preserved]

I think this is the correct one, as now we can't add more than one attribute of the same type to a movie: ("Godfather", "USA", "JAPAN") is not a valid insertion!
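
To illustrate the constraint described here, a minimal sketch using Python's `sqlite3`. The original diagram is not available, so the table name `movie_attribute_value` and the key `(movie_id, attribute_id)` are assumptions taken from the names used later in the question, not a copy of Schema C.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE movie_attribute_value (
    movie_id           INTEGER NOT NULL,
    attribute_id       INTEGER NOT NULL,
    attribute_value_id INTEGER NOT NULL,
    -- Assumed key: at most one row per (movie, attribute).
    PRIMARY KEY (movie_id, attribute_id)
);
""")

# Godfather (movie 1), Country (attribute 1) -> USA (value 1): accepted.
conn.execute("INSERT INTO movie_attribute_value VALUES (1, 1, 1)")


def try_second_country():
    """Try to give the same movie a second Country value (Japan, value 2)."""
    try:
        conn.execute("INSERT INTO movie_attribute_value VALUES (1, 1, 2)")
    except sqlite3.IntegrityError:
        return False  # rejected by the primary key
    return True


second_country_accepted = try_second_country()
row_count = conn.execute(
    "SELECT COUNT(*) FROM movie_attribute_value"
).fetchone()[0]
```

With this key the database itself refuses the second Country row, so the rule no longer depends on application code.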

But I can't tell if making attribute_value a weak entity would be correct or not, better or worse :S

Schema D

[Schema D diagram: image not preserved]

As I said, same as C but with a composite key in attribute_value. I'm not sure if this breaks some database normalization rules. If this is OK, which table should the attribute_id field of movie_attribute_value reference: attribute_id from the Attribute table, or attribute_id from the AttributeValue table? Is it OK to have a composite foreign key and use only part of that key in the PK?

Could you please explain which option is better and why?

Thanks in advance!

EDIT

I understand the problems of a design like this, what an EAV schema is, and the need to avoid this type of schema except where the attribute table changes a lot. Unfortunately that is my scenario: the attributes of a movie are defined by users, so I have no way to know which attributes are going to be used. I have to read them and display them to other users to fill in. I think Schema C is correct, but I would like to know what the problem is with using schemas A and B and letting developers enforce the restriction (one attribute of a given type per movie) in code.

It would also be great if somebody could explain the benefits and pitfalls of using Schema D (composite key) instead of Schema C, and whether it's OK to have only some fields of a foreign key (attribute_value_id, attribute_id) in the PK (movie_id, attribute_id).
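
For concreteness, here is a hedged sketch of what Schema D might look like in SQLite (via Python's `sqlite3`). The diagram is missing, so the column layout is an assumption reconstructed from the text: `attribute_value` keyed by `(attribute_id, id)`, and `movie_attribute_value` with PK `(movie_id, attribute_id)` plus a composite foreign key. The sample rows are the ones listed at the top of the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when enabled
conn.executescript("""
CREATE TABLE movie (id INTEGER PRIMARY KEY, title TEXT NOT NULL);
CREATE TABLE attribute (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE attribute_value (
    attribute_id INTEGER NOT NULL REFERENCES attribute(id),
    id           INTEGER NOT NULL,
    value        TEXT NOT NULL,
    PRIMARY KEY (attribute_id, id)            -- weak entity: composite key
);
CREATE TABLE movie_attribute_value (
    movie_id           INTEGER NOT NULL REFERENCES movie(id),
    attribute_id       INTEGER NOT NULL,
    attribute_value_id INTEGER NOT NULL,
    PRIMARY KEY (movie_id, attribute_id),     -- one value per attribute per movie
    FOREIGN KEY (attribute_id, attribute_value_id)
        REFERENCES attribute_value (attribute_id, id)
);
INSERT INTO movie VALUES (1, 'The Godfather'), (2, 'Matrix');
INSERT INTO attribute VALUES (1, 'Country'), (2, 'Type');
INSERT INTO attribute_value VALUES
    (1, 1, 'USA'), (1, 2, 'Japan'), (2, 1, 'Thriller'), (2, 2, 'Comedy');
""")


def rejected(movie_id, attribute_id, value_id):
    """Return True if the database refuses the row."""
    try:
        conn.execute(
            "INSERT INTO movie_attribute_value VALUES (?, ?, ?)",
            (movie_id, attribute_id, value_id),
        )
    except sqlite3.IntegrityError:
        return True
    return False


first_country = rejected(1, 1, 1)     # Country = USA
second_country = rejected(1, 1, 2)    # second Country for the same movie
unknown_value = rejected(1, 2, 3)     # Type has no value with id 3
valid_type = rejected(1, 2, 1)        # Type = Thriller
```

In this sketch the primary key and the foreign key are independent constraints that happen to share the `attribute_id` column: the PK blocks a second Country, and the composite FK blocks a value that does not belong to the stated attribute. Pointing the FK at `attribute_value` is a design choice of the sketch; since `attribute_value.attribute_id` already references `attribute`, a separate FK to `attribute` would be redundant here.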

2 answers

#1


As Marc_s comments, EAV designs have a whole bunch of drawbacks. In the case of a movie collection, you know the schema, and it's unlikely to change randomly, and when it does change (e.g. you need to add a flag "available in 4K"), it's probably a big deal.

Ask yourself how you will retrieve all films for a given genre, or all films available in both the US and Japan, or all comedies available in the US but not Japan - you'll very quickly see the limits of EAV.
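
As a rough illustration of the last of those queries, here is a sketch against a simplified EAV table using Python's `sqlite3`. The table layout, the attribute names, and the three sample films are made up for the example; note that it also assumes a movie may have several Country rows, which the question's one-value-per-attribute rule would forbid.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE movies (id INTEGER PRIMARY KEY, title TEXT NOT NULL);
CREATE TABLE movie_attributes (
    movie_id  INTEGER NOT NULL REFERENCES movies(id),
    attribute TEXT NOT NULL,
    value     TEXT NOT NULL
);
-- Hypothetical sample data.
INSERT INTO movies VALUES (1, 'Comedy A'), (2, 'Comedy B'), (3, 'Thriller C');
INSERT INTO movie_attributes VALUES
    (1, 'Type', 'Comedy'),   (1, 'Country', 'USA'),
    (2, 'Type', 'Comedy'),   (2, 'Country', 'USA'), (2, 'Country', 'Japan'),
    (3, 'Type', 'Thriller'), (3, 'Country', 'USA');
""")

# Comedies available in the US but not in Japan: every condition needs its
# own pass over the same attribute table.
QUERY = """
SELECT m.title
FROM movies AS m
JOIN movie_attributes AS t
  ON t.movie_id = m.id AND t.attribute = 'Type' AND t.value = 'Comedy'
JOIN movie_attributes AS us
  ON us.movie_id = m.id AND us.attribute = 'Country' AND us.value = 'USA'
WHERE NOT EXISTS (
    SELECT 1 FROM movie_attributes AS jp
    WHERE jp.movie_id = m.id
      AND jp.attribute = 'Country' AND jp.value = 'Japan'
)
ORDER BY m.title
"""

titles = [row[0] for row in conn.execute(QUERY)]
```

Three conditions already require two self-joins and a subquery; with ordinary columns the same question would be a single `WHERE` clause.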

To answer your question - none of your designs work for me - there are too many tables that don't earn their keep. If you really must go EAV, I'd suggest:

MOVIES
---------
MovieID
.....


ATTRIBUTES
--------------
AttributeID
AttributeName

MOVIE_ATTRIBUTES
------------
MovieID
AttributeID
Value

If you want to provide the list of valid values, the easiest way is to query the "movie attributes" table and retrieve previous entries for that combination of movie and attribute - keeping your schema simple will make life MUCH easier.
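
A minimal sketch of the three tables above and of the "previous entries" lookup, using Python's `sqlite3`. The `Title` column, the primary key on `(MovieID, AttributeID)`, the helper name `known_values`, and the third film are assumptions added for the example; the lookup is done per attribute, which is one reading of the suggestion.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE MOVIES (MovieID INTEGER PRIMARY KEY, Title TEXT NOT NULL);
CREATE TABLE ATTRIBUTES (
    AttributeID   INTEGER PRIMARY KEY,
    AttributeName TEXT NOT NULL UNIQUE
);
CREATE TABLE MOVIE_ATTRIBUTES (
    MovieID     INTEGER NOT NULL REFERENCES MOVIES(MovieID),
    AttributeID INTEGER NOT NULL REFERENCES ATTRIBUTES(AttributeID),
    Value       TEXT NOT NULL,
    PRIMARY KEY (MovieID, AttributeID)   -- assumed: one value per attribute
);
INSERT INTO MOVIES VALUES
    (1, 'The Godfather'), (2, 'Matrix'), (3, 'Hypothetical Film');
INSERT INTO ATTRIBUTES VALUES (1, 'Country'), (2, 'Type');
INSERT INTO MOVIE_ATTRIBUTES VALUES
    (1, 1, 'USA'), (2, 1, 'USA'), (3, 1, 'Japan'), (1, 2, 'Crime');
""")


def known_values(attribute_name):
    """Values already entered for an attribute, usable as a suggestion list."""
    rows = conn.execute(
        """
        SELECT DISTINCT ma.Value
        FROM MOVIE_ATTRIBUTES AS ma
        JOIN ATTRIBUTES AS a ON a.AttributeID = ma.AttributeID
        WHERE a.AttributeName = ?
        ORDER BY ma.Value
        """,
        (attribute_name,),
    )
    return [row[0] for row in rows]


countries = known_values("Country")
types = known_values("Type")
```

The trade-off is that values are free text, so 'USA' and 'U.S.A.' would show up as two different suggestions unless the application normalizes them.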

If you really want to put the values in a separate table, schema D appears correct.

Schema C says:

  • for each movie, I have 0 or more movie_attribute_value records
  • for each movie_attribute_value record, I have 0 or more attribute_value records
  • for each attribute_value record, I have zero or more attribute/value combinations.

I believe the last statement is incorrect.

#2


One approach would be to lump all the attributes in the one table, along with a defined type of attribute. Thus:

Movies
------
MovieId


AttributeTypes
---------------
AttributeTypeId
Description


Attributes
---------
AttributeId
AttributeTypeId
Description


MovieAttributes
---------------
MovieId
AttributeId

It could make for awkward queries, but that really depends on how the stored data will be used.
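
To make this layout concrete, a small sketch in Python's `sqlite3`. The `Title` column, the key on `(MovieId, AttributeId)`, and the sample rows are assumptions added for illustration; they are not part of the outline above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE Movies (MovieId INTEGER PRIMARY KEY, Title TEXT NOT NULL);
CREATE TABLE AttributeTypes (
    AttributeTypeId INTEGER PRIMARY KEY,
    Description     TEXT NOT NULL
);
CREATE TABLE Attributes (
    AttributeId     INTEGER PRIMARY KEY,
    AttributeTypeId INTEGER NOT NULL REFERENCES AttributeTypes(AttributeTypeId),
    Description     TEXT NOT NULL
);
CREATE TABLE MovieAttributes (
    MovieId     INTEGER NOT NULL REFERENCES Movies(MovieId),
    AttributeId INTEGER NOT NULL REFERENCES Attributes(AttributeId),
    PRIMARY KEY (MovieId, AttributeId)
);
INSERT INTO Movies VALUES (1, 'The Godfather');
INSERT INTO AttributeTypes VALUES (1, 'Country'), (2, 'Type');
INSERT INTO Attributes VALUES
    (1, 1, 'USA'), (2, 1, 'Japan'), (3, 2, 'Thriller'), (4, 2, 'Comedy');
INSERT INTO MovieAttributes VALUES (1, 1), (1, 3);
""")

QUERY = """
SELECT t.Description, a.Description
FROM MovieAttributes AS ma
JOIN Attributes AS a ON a.AttributeId = ma.AttributeId
JOIN AttributeTypes AS t ON t.AttributeTypeId = a.AttributeTypeId
WHERE ma.MovieId = ?
ORDER BY t.Description, a.Description
"""


def attributes_of(movie_id):
    """Return (attribute type, attribute) pairs for one movie."""
    return conn.execute(QUERY, (movie_id,)).fetchall()


before = attributes_of(1)
# Nothing in this layout stops a second Country for the same movie.
conn.execute("INSERT INTO MovieAttributes VALUES (1, 2)")
after = attributes_of(1)
```

As sketched, reading a movie's attributes takes a three-table join, and the one-value-per-type rule from the question is not enforced by the keys, so it would still fall to the application.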

(In other words, yes, I agree with prior posts, and recommend avoiding EAV structures.)
