I'm working on a personal project for timekeeping on various projects, but I'm not sure of the best way to structure my database.
A simplified breakdown of the structure is as follows:
- Each client can have multiple reports.
- Each report can have multiple line items.
- Each line item can have multiple time records.
There will ultimately be more relationships, but that's the basis of the application. As you can see, each item is related to the item beneath it in a one-to-many relationship.
My question is, should I relate each table to each "parent" table above it? Something like this:
clients
    id
reports
    id
    client_id
line_items
    id
    report_id
    client_id
time_records
    id
    report_id
    line_item_id
    client_id
As the hierarchy cascades down, more and more foreign keys would be added to each new table.
My initial reaction is that this is not the correct way to do it, but I would love to get some second (and third!) opinions.
5 Answers
#1
3
The advantage of the way you're doing it is that you could check all time records for, say, a specific client id without needing a join. But really, it isn't necessary. All you need is to store a reference back up one "level" so to speak. Here are some examples from the "client" perspective:
To get a specific client's reports: (simple; same as current schema you suggest)
SELECT * FROM `reports`
WHERE `client_id` = ?;
To get a specific client's line items: (new schema; don't need "client_id" in table)
SELECT `line_items`.* FROM `line_items`
JOIN `reports` ON `reports`.`id` = `line_items`.`report_id`
JOIN `clients` ON `clients`.`id` = `reports`.`client_id`
WHERE `clients`.`id` = ?;
To get a specific client's time entries: (new schema; don't need "client_id" or "report_id" in table)
SELECT `time_records`.* FROM `time_records`
JOIN `line_items` ON `line_items`.`id` = `time_records`.`line_item_id`
JOIN `reports` ON `reports`.`id` = `line_items`.`report_id`
JOIN `clients` ON `clients`.`id` = `reports`.`client_id`
WHERE `clients`.`id` = ?;
So, the revised schema would be:
clients
    id
reports
    id
    client_id
line_items
    id
    report_id
time_records
    id
    line_item_id
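As a rough illustration of the revised schema, here is a minimal sketch that builds it and runs the "time entries for a client" query. It assumes Python's built-in sqlite3 module with an in-memory SQLite database standing in for MySQL; the sample ids and the helper name time_record_ids_for_client are made up for the example.

```python
import sqlite3

# In-memory SQLite database standing in for the MySQL database in the question.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite only enforces FKs when this is on

conn.executescript("""
CREATE TABLE clients      (id INTEGER PRIMARY KEY);
CREATE TABLE reports      (id INTEGER PRIMARY KEY,
                           client_id INTEGER NOT NULL REFERENCES clients(id));
CREATE TABLE line_items   (id INTEGER PRIMARY KEY,
                           report_id INTEGER NOT NULL REFERENCES reports(id));
CREATE TABLE time_records (id INTEGER PRIMARY KEY,
                           line_item_id INTEGER NOT NULL REFERENCES line_items(id));

-- Hypothetical sample data: two clients, each with one report and one line item.
INSERT INTO clients      VALUES (1), (2);
INSERT INTO reports      VALUES (10, 1), (20, 2);
INSERT INTO line_items   VALUES (100, 10), (200, 20);
INSERT INTO time_records VALUES (1000, 100), (1001, 100), (2000, 200);
""")


def time_record_ids_for_client(client_id):
    """Walk time_records -> line_items -> reports, one level at a time."""
    rows = conn.execute(
        """
        SELECT time_records.id
        FROM time_records
        JOIN line_items ON line_items.id = time_records.line_item_id
        JOIN reports    ON reports.id    = line_items.report_id
        WHERE reports.client_id = ?
        ORDER BY time_records.id
        """,
        (client_id,),
    ).fetchall()
    return [row[0] for row in rows]


print(time_record_ids_for_client(1))
```

Since reports already carries client_id, the sketch filters on reports.client_id and skips the final join to clients; joining clients as well returns the same rows.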
EDIT:
Additionally, I would consider using views to simplify the queries (I assume you'll run them often), definitely create indexes on the join columns, and use foreign key constraints to enforce referential integrity (InnoDB only).
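To make the view and index suggestion concrete, here is a small sketch, again assuming Python's sqlite3 with an in-memory SQLite database in place of MySQL. The view name client_time_records, the index names, and the sample rows are all hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE clients      (id INTEGER PRIMARY KEY);
CREATE TABLE reports      (id INTEGER PRIMARY KEY,
                           client_id INTEGER REFERENCES clients(id));
CREATE TABLE line_items   (id INTEGER PRIMARY KEY,
                           report_id INTEGER REFERENCES reports(id));
CREATE TABLE time_records (id INTEGER PRIMARY KEY,
                           line_item_id INTEGER REFERENCES line_items(id));

-- Indexes on the columns used in the joins (names are made up).
CREATE INDEX idx_reports_client_id         ON reports(client_id);
CREATE INDEX idx_line_items_report_id      ON line_items(report_id);
CREATE INDEX idx_time_records_line_item_id ON time_records(line_item_id);

-- Hypothetical view that flattens the hierarchy, so callers can filter by
-- client_id without time_records storing that column itself.
CREATE VIEW client_time_records AS
SELECT reports.client_id         AS client_id,
       line_items.report_id      AS report_id,
       time_records.line_item_id AS line_item_id,
       time_records.id           AS time_record_id
FROM time_records
JOIN line_items ON line_items.id = time_records.line_item_id
JOIN reports    ON reports.id    = line_items.report_id;

-- Hypothetical sample data.
INSERT INTO clients      VALUES (1), (2);
INSERT INTO reports      VALUES (10, 1), (20, 2);
INSERT INTO line_items   VALUES (100, 10), (200, 20);
INSERT INTO time_records VALUES (1000, 100), (1001, 100), (2000, 200);
""")

rows = conn.execute(
    "SELECT time_record_id FROM client_time_records "
    "WHERE client_id = ? ORDER BY time_record_id",
    (1,),
).fetchall()
client_1_records = [row[0] for row in rows]
print(client_1_records)
```

The view gives the convenience of the "client_id everywhere" layout for reads, while the stored data keeps only one reference per level.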
#2
1
No. If there is no direct relationship between elements of the model, there should be no direct relationship between the corresponding tables. Otherwise your data will contain redundancies, and you will run into problems when updating it.
This is the right way:
clients
    id
reports
    id
    client_id
line_items
    id
    report_id
time_records
    id
    line_id
#3
1
You don't need to create client_id on the line_items table if you never join line items directly to clients, because you can get it through the reports table. The same applies to the other FKs.
I recommend thinking about the reports and queries you need over this data before creating redundant foreign keys that can complicate your development.
Creating redundant FKs later is not difficult if you need them in the future: a few ALTER TABLE and UPDATE ... SELECT statements will solve the problem.
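For what it's worth, adding a redundant column after the fact could look roughly like the sketch below. It assumes Python's sqlite3 with an in-memory SQLite database rather than MySQL, uses made-up sample rows, and leaves out the clients table and the new foreign key constraint to stay short.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE reports    (id INTEGER PRIMARY KEY, client_id INTEGER NOT NULL);
CREATE TABLE line_items (id INTEGER PRIMARY KEY,
                         report_id INTEGER NOT NULL REFERENCES reports(id));

-- Hypothetical existing data in the normalized schema.
INSERT INTO reports    VALUES (10, 1), (20, 2);
INSERT INTO line_items VALUES (100, 10), (101, 10), (200, 20);

-- Step 1: add the redundant column (nullable until it has been filled in).
ALTER TABLE line_items ADD COLUMN client_id INTEGER;

-- Step 2: backfill it from the parent table with a correlated subquery.
UPDATE line_items
SET client_id = (SELECT reports.client_id
                 FROM reports
                 WHERE reports.id = line_items.report_id);
""")

backfilled = conn.execute(
    "SELECT id, client_id FROM line_items ORDER BY id"
).fetchall()
print(backfilled)
```

After the backfill, the application still has to keep the copied client_id in sync with reports on every insert and update, which is the maintenance cost of the redundancy.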
If you don't have much information in line_items, you can denormalize and store that information in time_records.
#4
1
Anywhere there is a direct relationship between two tables, you should use foreign keys to keep the data integrity. Personally, I would look at a structure like this:
Client
    ClientId
Report
    ReportId
    ClientId
LineItem
    LineItemId
    ReportId
TimeRecord
    TimeRecordId
    LineItemId
In this example, you do not need ClientId in LineItem because you already have that relationship through the Report table. The major disadvantage of having ClientId in all of your tables is that if the business logic does not enforce consistency of these values (that is, there is a bug in the code), you can end up getting different values depending on which table your query goes through:
Report:
    ReportId = 3
    ClientId = 2
LineItem:
    LineItemId = 1
    ReportId = 3
    ClientId = 3
In the above situation, you would see ClientId = 2 if your query went through Report and ClientId = 3 if your query went through LineItem. Once this happens, it is difficult to determine which relationship is correct and where the bug is.
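The conflicting rows above can be reproduced with a short sketch. It assumes Python's sqlite3 with an in-memory SQLite database, and the redundant ClientId column on LineItem is added only to show the problem; the mismatch query is one possible way to audit for it, not something the schema requires.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Report   (ReportId   INTEGER PRIMARY KEY,
                       ClientId   INTEGER NOT NULL);
CREATE TABLE LineItem (LineItemId INTEGER PRIMARY KEY,
                       ReportId   INTEGER NOT NULL REFERENCES Report(ReportId),
                       ClientId   INTEGER NOT NULL);  -- redundant copy

-- The rows from the example: the report belongs to client 2,
-- but a bug wrote client 3 onto its line item.
INSERT INTO Report   VALUES (3, 2);
INSERT INTO LineItem VALUES (1, 3, 3);
""")

# Find line items whose stored ClientId disagrees with their report's ClientId.
mismatches = conn.execute(
    """
    SELECT li.LineItemId,
           r.ClientId  AS ClientViaReport,
           li.ClientId AS ClientOnLineItem
    FROM LineItem AS li
    JOIN Report   AS r ON r.ReportId = li.ReportId
    WHERE r.ClientId <> li.ClientId
    """
).fetchall()
print(mismatches)
```

Nothing in the table definitions rejects the bad row, so the database accepts it silently; without the redundant column there is no second value to disagree with.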
Also, I would advocate against generic id columns in favor of more explicit names that describe what each id is used for (ReportId or ClientId). In my opinion, this makes joins easier to read. As an example:
SELECT COUNT(1) AS NumberOfLineItems
FROM Client AS c
INNER JOIN Report AS r ON c.ClientId = r.ClientId
INNER JOIN LineItem AS li ON r.ReportId = li.ReportId
WHERE c.ClientId = 12
#5
0
As a personal opinion, I would have:
clients
    id
time_records
    id
    client_id
    report
    line_item
    report_id
That way all of your fields are in the time_records table. You can then do something like:
SELECT *
FROM `time_records`
WHERE `time_records`.`client_id` = 16542
AND `time_records`.`report` = 164652
ORDER BY `time_records`.`id` ASC