For example, I have two tables: 'customer' and 'staff'. They are almost the same; only two attributes differ. Should I create another table named 'person' that contains all the attributes shared by 'customer' and 'staff', and then create foreign keys pointing to 'person'? Something like inheritance in class design.
Is there any drawback to this method?
5 Solutions
#1
5
You're describing a pattern called Class Table Inheritance. It's a valid design, but like any other design, it must be used with good judgment. Read Martin Fowler's "Patterns of Enterprise Application Architecture" for more details on its advantages and disadvantages.
Some people caution against the use of joins, but you need a join only when you need the subclass-specific columns. When a given query needs only the common columns, you can avoid the extra join.
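A minimal sketch of the point about joins, using Python's built-in sqlite3 module. The table and column names follow the question and the other answers; the sample rows and the choice of SQLite are assumptions made only for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Common attributes live in the general table.
    CREATE TABLE person (
        id    INTEGER PRIMARY KEY,
        name  TEXT NOT NULL,
        email TEXT NOT NULL
    );
    -- Each subclass table holds only its own attributes.
    CREATE TABLE customer (
        id         INTEGER PRIMARY KEY REFERENCES person(id),
        order_date TEXT
    );
    CREATE TABLE staff (
        id        INTEGER PRIMARY KEY REFERENCES person(id),
        hire_date TEXT
    );
    -- Hypothetical sample data.
    INSERT INTO person VALUES (1, 'Ann', 'ann@example.com'),
                              (2, 'Bob', 'bob@example.com');
    INSERT INTO customer VALUES (1, '2024-01-15');
    INSERT INTO staff VALUES (2, '2023-06-01');
""")

# Only common columns are needed: query person alone, no join.
names = [row[0] for row in conn.execute("SELECT name FROM person ORDER BY id")]

# A subclass-specific column is needed: one join to that subclass table.
staff_rows = conn.execute(
    "SELECT p.name, s.hire_date FROM person p JOIN staff s ON s.id = p.id"
).fetchall()
```

Here `names` comes from a single-table query, while `staff_rows` pays for exactly one join because it asks for `hire_date`.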
#2
6
Yes, there is a drawback to that method. Joins increase query complexity (immensely so in some cases) and can increase query time if you're not careful.
Instead, the standard way to do this (i.e. simulate object inheritance when only a few attributes differ between the subclasses) is to do something called Single Table Inheritance. This method prevents database joins at the cost of a little bit of unused database space.
It works like this: you create one table that contains all the attributes, including the ones that only apply to one or the other, as well as a type attribute to specify the object type. For example, if customer has attributes:
id, name, email, password, order_date
and staff has attributes:
id, name, email, password, hire_date
then you create one table with columns for all the attributes and a type:
id, type, name, email, password, order_date, hire_date
The type column will always contain either "customer" or "staff". If type is "customer", then hire_date is always NULL and is meaningless. If type is "staff", then order_date is always NULL and is meaningless.
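The single-table layout described above could look roughly like this in SQLite, driven from Python's sqlite3 module. The table name `person`, the CHECK constraints, and the sample rows are assumptions added for illustration; the answer itself only lists the columns.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE person (
        id         INTEGER PRIMARY KEY,
        type       TEXT NOT NULL CHECK (type IN ('customer', 'staff')),
        name       TEXT NOT NULL,
        email      TEXT NOT NULL,
        password   TEXT NOT NULL,
        order_date TEXT,   -- only meaningful for customers
        hire_date  TEXT,   -- only meaningful for staff
        -- Assumption: let the schema enforce that the other type's column stays NULL.
        CHECK (type <> 'customer' OR hire_date IS NULL),
        CHECK (type <> 'staff' OR order_date IS NULL)
    );
""")

# Hypothetical sample rows; the password values are placeholders, not real hashes.
conn.execute(
    "INSERT INTO person (type, name, email, password, order_date) "
    "VALUES ('customer', 'Ann', 'ann@example.com', 'placeholder', '2024-01-15')"
)
conn.execute(
    "INSERT INTO person (type, name, email, password, hire_date) "
    "VALUES ('staff', 'Bob', 'bob@example.com', 'placeholder', '2023-06-01')"
)

# No join is needed; filter on the type column instead.
staff = conn.execute(
    "SELECT name, hire_date FROM person WHERE type = 'staff'"
).fetchall()

# A customer row that carries a hire_date violates the CHECK constraint.
try:
    conn.execute(
        "INSERT INTO person (type, name, email, password, hire_date) "
        "VALUES ('customer', 'Cy', 'cy@example.com', 'placeholder', '2022-03-01')"
    )
    rejected = False
except sqlite3.IntegrityError:
    rejected = True
```

Without the CHECK constraints, nothing in the database stops a row from carrying both dates, so the "always NULL" rule would have to be kept by application code.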
#3
4
Both Pranay Rana and Ben Lee are correct, and the ultimate answer is: "it depends".
You have to weigh the number of subclass-specific columns against the number of common columns to decide what's right for you. Single Table Inheritance doesn't scale well: what happens when you have to introduce a third subclass, such as suppliers?
For that matter, how are you going to treat staff who are also customers?
#4
3
Look up "generalization specialization relational modeling". You'll find some good articles on the subject. Most of the examples follow the same pattern as the Class Table Inheritance link that Bill gave you.
There's just one more little detail. The specialized tables (customer and staff in your case) do not autonumber their id field. Instead, when you populate them, the id field should get a copy of the id field in the generalized table (person in your case).
This makes the specialized ids do double duty. Each one is both a primary key and a foreign key reference to the corresponding row in the generalized table. This makes joins easier and faster.
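As a rough sketch of that shared-key detail, here is one way to populate the tables with Python's sqlite3 module. Only person autonumbers; customer.id is copied from it. The sample values are made up, and turning on `PRAGMA foreign_keys` is a SQLite-specific step that other databases do not need.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite does not enforce foreign keys unless this is switched on per connection.
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    CREATE TABLE person (
        id   INTEGER PRIMARY KEY AUTOINCREMENT,  -- the only autonumbered id
        name TEXT NOT NULL
    );
    CREATE TABLE customer (
        -- Primary key and foreign key at the same time.
        id         INTEGER PRIMARY KEY REFERENCES person(id),
        order_date TEXT
    );
""")

# Insert the general row first, then reuse its generated id for the specialized row.
cur = conn.execute("INSERT INTO person (name) VALUES ('Ann')")
person_id = cur.lastrowid
conn.execute(
    "INSERT INTO customer (id, order_date) VALUES (?, ?)",
    (person_id, "2024-01-15"),
)

# A specialized row without a matching person row is rejected by the foreign key.
try:
    conn.execute("INSERT INTO customer (id, order_date) VALUES (999, '2024-02-01')")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True

joined = conn.execute(
    "SELECT p.name, c.order_date FROM person p JOIN customer c ON c.id = p.id"
).fetchall()
```

Because the two tables share the key, the join condition is a plain primary-key match on both sides.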
It can be convenient to create views that have each specialized table joined with the generalized table. Or you can make one large view that generates the same data you would see in a single table inheritance pattern suggested by another response. It's basically a union of a bunch of joins.
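The views mentioned above might be sketched like this with Python's sqlite3 module. The view names (`customer_view`, `staff_view`, `person_flat`) and the sample rows are hypothetical; the password and email columns are left out to keep the sketch short.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE person   (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE customer (id INTEGER PRIMARY KEY REFERENCES person(id), order_date TEXT);
    CREATE TABLE staff    (id INTEGER PRIMARY KEY REFERENCES person(id), hire_date TEXT);

    -- One view per specialized table, each joined with the generalized table.
    CREATE VIEW customer_view AS
        SELECT p.id, p.name, c.order_date
        FROM person p JOIN customer c ON c.id = p.id;
    CREATE VIEW staff_view AS
        SELECT p.id, p.name, s.hire_date
        FROM person p JOIN staff s ON s.id = p.id;

    -- One large view shaped like the single-table layout: a union of the joins.
    CREATE VIEW person_flat AS
        SELECT id, 'customer' AS type, name, order_date, NULL AS hire_date
        FROM customer_view
        UNION ALL
        SELECT id, 'staff' AS type, name, NULL AS order_date, hire_date
        FROM staff_view;

    -- Hypothetical sample data.
    INSERT INTO person VALUES (1, 'Ann'), (2, 'Bob');
    INSERT INTO customer VALUES (1, '2024-01-15');
    INSERT INTO staff VALUES (2, '2023-06-01');
""")

rows = conn.execute(
    "SELECT id, type, name, order_date, hire_date FROM person_flat ORDER BY id"
).fetchall()
```

A person who is both a customer and a staff member would appear twice in `person_flat`, once per type, which is one way this layout differs from a true single table.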
#5
2
Well, I'd say it's a good design because you are not repeating data, and that's why data normalization exists.
The one thing to keep in mind is that the more you normalize, the more joins you will need.