I'm creating a site where every user has a score that is updated every day. I can easily create rankings from this score, but I'd also like to be able to create a "Hot" list for the week, the month, and so on.
My brute-force design would be to calculate each user's score every day and insert it into a "Scores" table, so the table would grow by one row per user per day. I could then rank users by their score deltas over whatever time period I choose.
While I believe this would technically work, I feel like there has to be a more sophisticated way of doing this, right? Or not? A Scores table that grows every day by the number of users can't be the way other sites are doing it.
2 solutions
#1
2
You get the most flexibility by not storing any snapshots of the score at all. Instead, record incremental scores as they happen.
If you have tables like this:
USER
- user_id
- name
- personal_high_score
- {anything else that you store once per user}
SCORE_LOG
- score_log_id
- user_id (FK to USER)
- date_time
- scored_points
Now you can get a cumulative score for a user as of any point in time with a simple query like:
select sum(scored_points)
from SCORE_LOG
where user_id = @UserID
and date_time <= @PointInTime
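To try that query out, here is a minimal sketch that runs it against an in-memory SQLite database from Python. The sample rows and the `cumulative_score` helper are made up for illustration, `?` placeholders stand in for `@UserID` and `@PointInTime`, and I'm assuming timestamps are stored as ISO-8601 text so that they compare correctly as strings.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE score_log (
        score_log_id  INTEGER PRIMARY KEY,
        user_id       INTEGER NOT NULL,
        date_time     TEXT    NOT NULL,  -- ISO-8601 text (assumption)
        scored_points INTEGER NOT NULL
    )
""")
conn.executemany(
    "INSERT INTO score_log (user_id, date_time, scored_points) VALUES (?, ?, ?)",
    [
        (1, "2024-01-01 09:00:00", 10),
        (1, "2024-01-02 09:00:00", 5),
        (1, "2024-01-05 09:00:00", 20),
        (2, "2024-01-02 12:00:00", 7),
    ],
)

def cumulative_score(conn, user_id, point_in_time):
    """Total points the user had earned up to and including point_in_time."""
    row = conn.execute(
        """
        SELECT COALESCE(SUM(scored_points), 0)
        FROM score_log
        WHERE user_id = ? AND date_time <= ?
        """,
        (user_id, point_in_time),
    ).fetchone()
    return row[0]

print(cumulative_score(conn, 1, "2024-01-03 00:00:00"))  # 10 + 5 = 15
```

The `COALESCE` is my addition: without it, a user with no log rows before the cut-off would get `NULL` rather than 0.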
You can also easily get the top-ranking scorers for a time period with something like:
select
    user_id
  , sum(scored_points) as period_points
from SCORE_LOG
-- filter rows to the period first; WHERE must come before GROUP BY
where date_time >= @StartOfPeriod
  and date_time <= @EndOfPeriod
group by
    user_id
order by
    period_points desc
limit 5
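Here is a rough, runnable version of that "hot list" query, again using SQLite from Python purely as a stand-in for whatever database you end up with. The `hot_list` function, the sample data and the date strings are all hypothetical. Rows are filtered to the period before grouping, so points earned outside the window don't count.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE score_log (user_id INTEGER, date_time TEXT, scored_points INTEGER)"
)
conn.executemany(
    "INSERT INTO score_log VALUES (?, ?, ?)",
    [
        (1, "2024-01-01 10:00:00", 50),  # before the period, must be ignored
        (1, "2024-01-08 10:00:00", 5),
        (2, "2024-01-09 10:00:00", 30),
        (3, "2024-01-10 10:00:00", 12),
        (2, "2024-01-11 10:00:00", 4),
    ],
)

def hot_list(conn, start, end, limit=5):
    """Users ranked by points earned between start and end (inclusive)."""
    return conn.execute(
        """
        SELECT user_id, SUM(scored_points) AS period_points
        FROM score_log
        WHERE date_time >= ? AND date_time <= ?
        GROUP BY user_id
        ORDER BY period_points DESC
        LIMIT ?
        """,
        (start, end, limit),
    ).fetchall()

week = hot_list(conn, "2024-01-08 00:00:00", "2024-01-14 23:59:59")
print(week)  # [(2, 34), (3, 12), (1, 5)]
```

User 1 has the most points overall but comes last for the week, which is exactly the difference between an all-time ranking and a "hot" ranking.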
If you get to production and find that you're having performance issues in practice, you could then consider denormalizing a snapshot of whatever statistics make sense. The problem with snapshot statistics is that they can get out of sync with your source data, so you'll need a strategy for recalculating them periodically.
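One possible recalculation strategy, sketched under my own assumptions: keep a small summary table and rebuild it wholesale from the log on a schedule. The `score_snapshot` table and the `rebuild_snapshot` function are names I invented for this example, and a full rebuild is only one option (incremental updates would be another).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE score_log (user_id INTEGER, date_time TEXT, scored_points INTEGER);
    CREATE TABLE score_snapshot (user_id INTEGER PRIMARY KEY, total_points INTEGER);
""")
conn.executemany(
    "INSERT INTO score_log VALUES (?, ?, ?)",
    [(1, "2024-01-01", 10), (2, "2024-01-01", 3), (1, "2024-01-02", 4)],
)

def rebuild_snapshot(conn):
    """Throw the snapshot away and recompute it from the log."""
    with conn:  # one transaction: commit on success, roll back on error
        conn.execute("DELETE FROM score_snapshot")
        conn.execute(
            """
            INSERT INTO score_snapshot (user_id, total_points)
            SELECT user_id, SUM(scored_points)
            FROM score_log
            GROUP BY user_id
            """
        )

rebuild_snapshot(conn)
# A new entry lands in the log, so the snapshot is now stale...
conn.execute("INSERT INTO score_log VALUES (2, '2024-01-03', 20)")
stale = dict(conn.execute("SELECT user_id, total_points FROM score_snapshot"))
rebuild_snapshot(conn)  # ...until the next scheduled rebuild
fresh = dict(conn.execute("SELECT user_id, total_points FROM score_snapshot"))
print(stale, fresh)  # {1: 14, 2: 3} {1: 14, 2: 23}
```

Because the log stays the source of truth, a stale or corrupted snapshot is never fatal: you can always drop it and rebuild.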
#2
2
Barranka was on the right track with his comment: wherever possible, you need to make sure you are not duplicating data.
However, if you want to be able to revert to a user's old score, or pick a day and see who was on top at that point (i.e. dynamic reporting), then you will need to store each score in its own row alongside a date. A separate table is useful for this, because you can derive the daily score from the existing user data via SQL and insert it into the table whenever you want.
The decision you have to make is how many users' records you want to keep in the history, and for how long. I have written the below on the assumption that the "hot list" is the top 5 users. You could have a cron job or scheduled task run each day or month to perform the inserts and also clean out very old data.
Users
- id
- username
- score
score_ranking
- id
- user_id (we normalise by using the id rather than all the user info)
- score_at_the_time
- date_of_ranking
So to generate the ranking for a single date you could insert into this table. Something like:
INSERT INTO
`score_ranking` (`user_id`, `score_at_the_time`, `date_of_ranking`)
SELECT
`id`, `score`, CURDATE()
FROM
`users`
ORDER BY
`score` DESC
LIMIT
5
To read the data for a specific date (or date range) you could then do:
SELECT * FROM score_ranking
WHERE date_of_ranking = 'somedate'
ORDER BY score_at_the_time DESC
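Putting the insert, the read and the clean-up together, here is a minimal sketch of what the scheduled job might do. It uses SQLite from Python rather than MySQL, so the date is passed in as a parameter instead of calling `CURDATE()`. The function names, usernames and dates are all made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY, username TEXT, score INTEGER);
    CREATE TABLE score_ranking (
        id INTEGER PRIMARY KEY,
        user_id INTEGER REFERENCES users(id),
        score_at_the_time INTEGER,
        date_of_ranking TEXT
    );
""")
conn.executemany(
    "INSERT INTO users (username, score) VALUES (?, ?)",
    [("ann", 90), ("bob", 75), ("cy", 60), ("di", 55), ("ed", 40), ("flo", 10)],
)

def take_snapshot(conn, day, top_n=5):
    """Copy the current top_n users into score_ranking, stamped with day."""
    with conn:
        conn.execute(
            """
            INSERT INTO score_ranking (user_id, score_at_the_time, date_of_ranking)
            SELECT id, score, ? FROM users ORDER BY score DESC LIMIT ?
            """,
            (day, top_n),
        )

def ranking_on(conn, day):
    """Read back the stored ranking for one date, best score first."""
    return conn.execute(
        """
        SELECT user_id, score_at_the_time FROM score_ranking
        WHERE date_of_ranking = ? ORDER BY score_at_the_time DESC
        """,
        (day,),
    ).fetchall()

def purge_before(conn, cutoff_day):
    """Delete snapshots older than cutoff_day (the clean-up half of the job)."""
    with conn:
        conn.execute(
            "DELETE FROM score_ranking WHERE date_of_ranking < ?", (cutoff_day,)
        )

take_snapshot(conn, "2024-01-01")
conn.execute("UPDATE users SET score = 95 WHERE username = 'flo'")
take_snapshot(conn, "2024-01-02")
day_two = ranking_on(conn, "2024-01-02")
purge_before(conn, "2024-01-02")
day_one_after_purge = ranking_on(conn, "2024-01-01")
print(day_two, day_one_after_purge)
```

Note that only the top 5 at the time are kept, so a user who was sixth on a given day has no history for that day. That is the trade-off behind the "how many users, for how long" decision.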