GROUP BY query selects the wrong column value within the group

Here's a real noobish MySQL query problem I'm having.

I have a high score table in a game I'm writing. The high score DB records a name, level, and score achieved. There are many near duplicates in the db. For example:

Name | Level | Score | Timestamp (key)
Bob  | 2     | 41    | 1234567.890
Bob  | 3     | 15    | 1234568.890
Bob  | 3     | 20    | 1234569.890
Joe  | 2     | 40    | 1234561.890
Bob  | 3     | 21    | 1234562.890
Bob  | 3     | 21    | 1234563.890

I want to return a "highest level achieved" high score list, with an output similar to:

Name | Level | Score
Bob  | 3     | 21
Joe  | 2     | 40

The SQL Query I currently use is:

SELECT *, MAX(level) as level 
FROM highscores 
GROUP BY name
ORDER BY level DESC, score DESC
LIMIT 5

However, this doesn't quite work. The "Score" field in the output always seems to be pulled at random from the group, instead of being the score that corresponds to the highest level achieved. For example:

Name | Level | Score
Bob  | 3     | 41
Joe  | 2     | 40

Bob never got 41 points on level 3! How can I fix this?

3 solutions

#1

You'll need to use a subquery to pull the score out.

select distinct
    name, 
    max(level) as level,
    (select max(score) from highscores h2 
        where h2.name = h1.name and h2.level = h1.level) as score
from highscores h1 
group by name 
order by level desc, score desc

Cheers,

Eric

It irks me that I didn't take the time to explain why this is the case when I posted the answer, so here goes:

When you pull back everything (*), and then the max level, what you'll get is each record sequentially, plus a column with the max level on it. Note that you're not grouping by score (which would have given you Bob 2 41, and Bob 3 21--two records for our friend Bob).

So, how the heck do we fix this? You need a subquery to additionally filter your results, which is what that (select max(score)...) is. Now, for each row that reads Bob, you will get his max level (3), and his max score at that level (21). But this still gives us however many rows Bob has (for example, if he has 5 rows, you'll get 5 rows of Bob 3 21). To limit this to only the top score, we need to use a DISTINCT clause in the select statement to only return unique rows.

UPDATE: Correct SQL (can't comment on le dorfier's post):

SELECT h1.Name, h1.Level, MAX(h1.Score)
    FROM highscores h1
    LEFT OUTER JOIN highscores h2 ON h1.name = h2.name AND h1.level < h2.level
    LEFT OUTER JOIN highscores h3 ON h1.name = h3.name AND h2.level = h3.level AND h1.score < h3.score
    WHERE h2.Name IS NULL AND h3.Name IS NULL
    GROUP BY h1.Name, h1.Level

#2

This is efficient.

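-- The sample table has no id column, so test a joined column (name) for NULL.
-- DISTINCT collapses tied top rows (e.g. the two identical Bob 3 21 entries).
SELECT DISTINCT h1.Name, h1.Level, h1.Score
FROM highscores h1
LEFT JOIN highscores h2 ON h1.name = h2.name AND h1.level < h2.level
LEFT JOIN highscores h3 ON h1.name = h3.name AND h1.level = h3.level AND h1.score < h3.score
WHERE h2.name IS NULL AND h3.name IS NULL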

You're looking for the level/score row for which that user has no higher level, and no higher score at that level.
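
To sanity-check the anti-join idea, here is a minimal sketch that runs it against the sample rows from the question. It uses Python's built-in sqlite3 module with an in-memory table as a stand-in for MySQL (an assumption, since the thread targets MySQL). The column name ts for the timestamp key and the function name are hypothetical. The sketch tests h2.name and h3.name for NULL because the sample table has no id column, and it adds DISTINCT because Bob has two identical top rows.

```python
import sqlite3

# Sample rows from the question; "ts" is a hypothetical name for the timestamp key.
ROWS = [
    ("Bob", 2, 41, 1234567.890),
    ("Bob", 3, 15, 1234568.890),
    ("Bob", 3, 20, 1234569.890),
    ("Joe", 2, 40, 1234561.890),
    ("Bob", 3, 21, 1234562.890),
    ("Bob", 3, 21, 1234563.890),
]

# Keep a row only if no row for the same player has a higher level (h2)
# and no row at the same level has a higher score (h3).
ANTI_JOIN = """
SELECT DISTINCT h1.name, h1.level, h1.score
FROM highscores h1
LEFT JOIN highscores h2
       ON h1.name = h2.name AND h1.level < h2.level
LEFT JOIN highscores h3
       ON h1.name = h3.name AND h1.level = h3.level AND h1.score < h3.score
WHERE h2.name IS NULL AND h3.name IS NULL
ORDER BY h1.level DESC, h1.score DESC
"""


def top_rows_anti_join():
    """Load the sample rows into an in-memory table and run the anti-join."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE highscores "
        "(name TEXT, level INTEGER, score INTEGER, ts REAL PRIMARY KEY)"
    )
    conn.executemany("INSERT INTO highscores VALUES (?, ?, ?, ?)", ROWS)
    result = conn.execute(ANTI_JOIN).fetchall()
    conn.close()
    return result


print(top_rows_anti_join())
```

Bob's level 2 row is dropped because h2 finds his level 3 rows, and his level 3 rows with scores 15 and 20 are dropped because h3 finds the 21s.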

#3

Interesting problem. Here's another solution:

SELECT hs.name, hs.level, MAX(score) AS score
FROM highscores hs
INNER JOIN (
  SELECT name, MAX(level) AS level FROM highscores GROUP BY name
) hl ON hl.name = hs.name AND hl.level = hs.level
GROUP BY hs.name, hs.level;

Personally, I find this the easiest to understand, and my hunch is that it will be relatively efficient for the database to execute.
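
For anyone who wants to try the derived-table query without a MySQL server, here is a rough sketch using Python's built-in sqlite3 module as a stand-in (an assumption, as is the helper name top_rows_derived_table). The timestamp column is left out because the query never touches it, and an ORDER BY is added so the result order is predictable.

```python
import sqlite3

DERIVED_TABLE_QUERY = """
SELECT hs.name, hs.level, MAX(hs.score) AS score
FROM highscores hs
INNER JOIN (
  -- one row per player: the highest level that player reached
  SELECT name, MAX(level) AS level FROM highscores GROUP BY name
) hl ON hl.name = hs.name AND hl.level = hs.level
GROUP BY hs.name, hs.level
ORDER BY hs.level DESC, MAX(hs.score) DESC
"""


def top_rows_derived_table(rows):
    """Run the derived-table query over (name, level, score) tuples."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE highscores (name TEXT, level INTEGER, score INTEGER)")
    conn.executemany("INSERT INTO highscores VALUES (?, ?, ?)", rows)
    result = conn.execute(DERIVED_TABLE_QUERY).fetchall()
    conn.close()
    return result


# Sample rows from the question, without the timestamp key.
sample = [
    ("Bob", 2, 41),
    ("Bob", 3, 15),
    ("Bob", 3, 20),
    ("Joe", 2, 40),
    ("Bob", 3, 21),
    ("Bob", 3, 21),
]
print(top_rows_derived_table(sample))
```

Because the outer query groups by name and level, Bob's two tied rows at level 3 collapse into a single result row without needing DISTINCT.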

I like the above query best, but just for kicks... I find the following one amusing in a kludgey sort of way. Assuming score can't exceed 99999...

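-- Columns are qualified with hs because name exists in both hs and hf.
-- DISTINCT collapses tied rows that share the same top level and score.
SELECT DISTINCT hs.name, hs.level, hs.score
FROM highscores hs
INNER JOIN (
  SELECT name, MAX(level * 100000 + score) AS hfactor
  FROM highscores GROUP BY name
) hf ON hf.hfactor = hs.level * 100000 + hs.score AND hf.name = hs.name;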
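
The packing trick is easier to see with the arithmetic written out. The sketch below is plain Python rather than SQL, and the names SCALE, hfactor and best_per_player are made up for illustration. It assumes, as the answer does, that scores are non-negative and never exceed 99999.

```python
SCALE = 100000  # assumes 0 <= score <= 99999, as stated in the answer


def hfactor(level, score):
    """Pack level and score into one number that sorts by level, then score."""
    return level * SCALE + score


def best_per_player(rows):
    """Return {name: (level, score)} for each player's largest packed value."""
    best = {}
    for name, level, score in rows:
        key = hfactor(level, score)
        if name not in best or key > best[name]:
            best[name] = key
    # divmod unpacks the number back into (level, score)
    return {name: divmod(key, SCALE) for name, key in best.items()}


rows = [
    ("Bob", 2, 41),
    ("Bob", 3, 15),
    ("Bob", 3, 20),
    ("Joe", 2, 40),
    ("Bob", 3, 21),
    ("Bob", 3, 21),
]

print(hfactor(3, 21), hfactor(2, 41))  # level dominates the comparison
print(best_per_player(rows))

# If a score breaks the assumption, a lower level can win the comparison:
print(hfactor(2, 100001) > hfactor(3, 0))
```

The last line shows why the score cap matters: once a score reaches SCALE, it spills into the digits reserved for the level and the ordering is no longer reliable.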
