Need help with a database design for a Top 10

Time: 2022-10-07 02:06:13

I am trying to come up with a database design to hold the "Top 10" results of some calculations that are being done. When all is said and done, there will be 3 "Top 10" categories, and I am fine with each being a separate table. However, I need to be able to go back later and pull historical data about what was in the Top 10 at certain times, hence the need for a database. A flat file would work, but this has the potential to hold years' worth of data.

Now, it's been a while since I have done anything serious with a database, other than something with a couple of simple tables, so I am having some trouble thinking through this design. If someone could help me with the design, I know enough MySQL to get the rest done.

So, in essence, I need to store: a group of 10 names, the % of the total points each name had, the rank each held in the Top 10, and a time associated with that Top 10 (so I can later query by that time).

I would think I need a table for the Top 10 with 11 columns: one for the ID and 10 for foreign keys into a 'Names' table that holds every name ever used, with a PK, Name, %, and Rank. This seems clunky to me. Does anyone else have a suggestion?

Edit: The 'Top 10' is associated with a specific set of data for a 5-minute interval, and each interval is completely independent of the previous and future intervals.

2 solutions

#1


2  

I don't recommend your solution, because then if you want to ask the database "How often has Joe been in the top 10," you have to write 10 queries of the form

SELECT Date FROM Top10 WHERE FirstPlace = 'joe'
SELECT Date FROM Top10 WHERE SecondPlace = 'joe'
...
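
To make the cost concrete, here is a minimal sketch of the wide design. It uses Python's built-in sqlite3 module in place of MySQL so it runs without a server. The Top10 table and the FirstPlace/SecondPlace columns come from the queries above; ThirdPlace and the sample rows are made up for illustration, and only three of the ten place columns are shown.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical, truncated version of the wide design: a real table would
# have ten place columns; three are enough to show the problem.
conn.execute(
    "CREATE TABLE Top10 ("
    " Date TEXT PRIMARY KEY,"
    " FirstPlace TEXT, SecondPlace TEXT, ThirdPlace TEXT)"
)
conn.executemany(
    "INSERT INTO Top10 VALUES (?, ?, ?, ?)",
    [
        ("2022-10-07 02:00", "joe", "ann", "bob"),
        ("2022-10-07 02:05", "ann", "joe", "bob"),
        ("2022-10-07 02:10", "ann", "bob", "cat"),
    ],
)

# Every place column has to be named explicitly, so the predicate grows
# with the size of the list (ten OR terms for a real Top 10).
joe_dates = [
    row[0]
    for row in conn.execute(
        "SELECT Date FROM Top10"
        " WHERE FirstPlace = ? OR SecondPlace = ? OR ThirdPlace = ?"
        " ORDER BY Date",
        ("joe", "joe", "joe"),
    )
]
print(joe_dates)
```

Folding the ten queries into one OR chain works, but the query still has to change whenever the list size changes, and finding out which place Joe held takes yet more per-column logic.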

Instead, how about a Rankings table, with fields:

id
Date
Person
Rank

Then if you want the Top 10 list for a certain date, the query is

SELECT * FROM Rankings WHERE Date = ...

and if you want to know someone's historical ranking, the query is

SELECT * FROM Rankings WHERE Person = ...

and if you want to know all the historical leaders, the query is

SELECT * FROM Rankings WHERE Rank = 1

The downside to this is that you might accidentally make two different people 8th place, and your database would allow the anomaly. But I have good news for you -- people might actually tie for 8th place, so you might actually want that to be possible!
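
As a rough sketch of how the Rankings table and the three queries above might fit together, here is a small runnable example. It uses Python's built-in sqlite3 module rather than MySQL, the Percentage column is an assumption added to hold the "%" value from the question, and the function names and sample rows are made up. Note that RANK is a reserved word in MySQL 8.0, so there the column would need backticks or a different name.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Columns follow the field list above; Percentage is an added assumption.
# "Rank" is quoted because RANK is reserved in MySQL 8.0 (use backticks there).
conn.execute(
    """
    CREATE TABLE Rankings (
        id         INTEGER PRIMARY KEY,
        Date       TEXT NOT NULL,
        Person     TEXT NOT NULL,
        "Rank"     INTEGER NOT NULL,
        Percentage REAL
    )
    """
)

# There is no UNIQUE constraint on (Date, "Rank"), so ties are accepted.
conn.executemany(
    'INSERT INTO Rankings (Date, Person, "Rank", Percentage) VALUES (?, ?, ?, ?)',
    [
        ("2022-10-07 02:00", "joe", 1, 40.0),
        ("2022-10-07 02:00", "ann", 2, 30.0),
        ("2022-10-07 02:00", "bob", 2, 30.0),  # tie for 2nd place
        ("2022-10-07 02:05", "ann", 1, 55.0),
        ("2022-10-07 02:05", "joe", 2, 45.0),
    ],
)


def top_list(date):
    """Everyone ranked in the snapshot taken at `date`, best rank first."""
    return conn.execute(
        'SELECT Person, "Rank" FROM Rankings WHERE Date = ? ORDER BY "Rank", Person',
        (date,),
    ).fetchall()


def history(person):
    """Every (date, rank) pair recorded for one person, oldest first."""
    return conn.execute(
        'SELECT Date, "Rank" FROM Rankings WHERE Person = ? ORDER BY Date',
        (person,),
    ).fetchall()


def leaders():
    """Whoever held first place in each snapshot."""
    return conn.execute(
        'SELECT Date, Person FROM Rankings WHERE "Rank" = 1 ORDER BY Date'
    ).fetchall()


def appearances(person):
    """How often a person has been in the list: one query, not ten."""
    return conn.execute(
        "SELECT COUNT(*) FROM Rankings WHERE Person = ?", (person,)
    ).fetchone()[0]


print(top_list("2022-10-07 02:00"))
print(appearances("joe"))
```

If ties should be rejected after all, a UNIQUE constraint on (Date, "Rank") would make the database refuse the second 8th place. A UNIQUE constraint on (Date, Person) is probably worth having either way, so the same person cannot appear twice in one snapshot.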

#2


1  

I assume that your "Top 10" is a snapshot of the data at a certain time, and that your business logic runs "every 5 minutes", so the time is the parent entity in the table design:

top_10_history
    th_id - the primary key
    th_time - the time point when taking the snapshot data of "Top 10"
top_10_detail
    td_th_id - the FK to top_10_history
    td_name_id - the FK to name
    td_percentage - the "%"
    td_rank - the rank
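
One way the two tables above could look in practice is sketched below, again with Python's built-in sqlite3 module standing in for MySQL. The name table with its name_id and name columns is a guess at what td_name_id points to, the column types and constraints are assumptions, and save_snapshot/load_snapshot are hypothetical helpers.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when asked
conn.executescript(
    """
    -- Hypothetical lookup table for the names referenced by td_name_id.
    CREATE TABLE name (
        name_id INTEGER PRIMARY KEY,
        name    TEXT NOT NULL UNIQUE
    );
    CREATE TABLE top_10_history (
        th_id   INTEGER PRIMARY KEY,
        th_time TEXT NOT NULL UNIQUE   -- one snapshot per 5-minute interval
    );
    CREATE TABLE top_10_detail (
        td_th_id      INTEGER NOT NULL REFERENCES top_10_history (th_id),
        td_name_id    INTEGER NOT NULL REFERENCES name (name_id),
        td_percentage REAL NOT NULL,
        td_rank       INTEGER NOT NULL,
        PRIMARY KEY (td_th_id, td_name_id)
    );
    """
)


def save_snapshot(taken_at, entries):
    """Store one snapshot; `entries` is a list of (name, percentage, rank)."""
    with conn:  # commit the whole snapshot or nothing
        th_id = conn.execute(
            "INSERT INTO top_10_history (th_time) VALUES (?)", (taken_at,)
        ).lastrowid
        for name, percentage, rank in entries:
            # SQLite syntax; the MySQL spelling is INSERT IGNORE.
            conn.execute("INSERT OR IGNORE INTO name (name) VALUES (?)", (name,))
            name_id = conn.execute(
                "SELECT name_id FROM name WHERE name = ?", (name,)
            ).fetchone()[0]
            conn.execute(
                "INSERT INTO top_10_detail VALUES (?, ?, ?, ?)",
                (th_id, name_id, percentage, rank),
            )


def load_snapshot(taken_at):
    """Return (rank, name, percentage) rows for the snapshot at `taken_at`."""
    return conn.execute(
        """
        SELECT d.td_rank, n.name, d.td_percentage
        FROM top_10_history AS h
        JOIN top_10_detail AS d ON d.td_th_id = h.th_id
        JOIN name AS n ON n.name_id = d.td_name_id
        WHERE h.th_time = ?
        ORDER BY d.td_rank, n.name
        """,
        (taken_at,),
    ).fetchall()


save_snapshot("2022-10-07 02:05:00", [("ann", 55.0, 1), ("joe", 45.0, 2)])
print(load_snapshot("2022-10-07 02:05:00"))
```

With three "Top 10" categories, this layout could either be repeated per category or given a category column on top_10_history; which is better depends on how the categories are queried.
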
  1. If the order of the "Top 10" can be calculated from the columns in "top_10_detail", you don't need a column to store it. Otherwise, you need a column to persist the order.
  2. If you need more complicated queries, such as "the Top 10 at 12:00 AM over the last 30 days", using individual columns for "day", "hour", and "minute" would be a better idea for performance (with suitable indexes).
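
The "12:00 AM over the last 30 days" query from the second point might look roughly like the sketch below, using Python's built-in sqlite3 module in place of MySQL. The th_hour and th_minute columns, the index, and the helper names are all assumptions. Whether separate columns actually beat a function applied to th_time depends on the engine and the data volume, so it is worth measuring before committing to them.

```python
import sqlite3
from datetime import datetime, timedelta

FMT = "%Y-%m-%d %H:%M:%S"  # sorts chronologically when compared as text

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE top_10_history (
        th_id     INTEGER PRIMARY KEY,
        th_time   TEXT NOT NULL UNIQUE,
        th_hour   INTEGER NOT NULL,    -- hypothetical derived columns
        th_minute INTEGER NOT NULL
    );
    -- Equality columns first, then the range column.
    CREATE INDEX idx_history_time_of_day
        ON top_10_history (th_hour, th_minute, th_time);
    """
)


def add_snapshots(times):
    """Insert one history row per datetime, filling the derived columns."""
    conn.executemany(
        "INSERT INTO top_10_history (th_time, th_hour, th_minute) VALUES (?, ?, ?)",
        [(t.strftime(FMT), t.hour, t.minute) for t in times],
    )


def midnight_snapshots(now, days=30):
    """Times of the 12:00 AM snapshots taken within `days` days before `now`."""
    cutoff = (now - timedelta(days=days)).strftime(FMT)
    return [
        row[0]
        for row in conn.execute(
            "SELECT th_time FROM top_10_history"
            " WHERE th_hour = 0 AND th_minute = 0 AND th_time >= ?"
            " ORDER BY th_time",
            (cutoff,),
        )
    ]


# Made-up data: one snapshot every 5 minutes from 1 September onwards.
now = datetime(2022, 10, 7, 2, 5)
times = []
t = datetime(2022, 9, 1)
while t <= now:
    times.append(t)
    t += timedelta(minutes=5)
add_snapshots(times)

midnights = midnight_snapshots(now)
print(len(midnights), midnights[0], midnights[-1])
```

This only finds the matching snapshots; joining the result to top_10_detail on td_th_id would return the actual Top 10 rows for each of them.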
