MySQL - How to use GROUP BY / ORDER BY with a "nested" dataset?

Time: 2021-03-09 12:34:17

My (sub)query results in the following dataset:

+---------+------------+-----------+
| item_id | version_id | relevance |
+---------+------------+-----------+
|       1 |          1 |        30 |
|       1 |          2 |        30 |
|       2 |          3 |        22 |
|       3 |          4 |        30 |
|       4 |          5 |        18 |
|       3 |          6 |        30 |
|       2 |          7 |        22 |
|       1 |          8 |        30 |
|       5 |          9 |        48 |
|       4 |         10 |        18 |
|       5 |         11 |        48 |
|       3 |         12 |        30 |
|       3 |         13 |        31 |
|       4 |         14 |        19 |
|       2 |         15 |        22 |
|       1 |         16 |        30 |
|       5 |         17 |        49 |
|       2 |         18 |        22 |
+---------+------------+-----------+
18 rows in set (0.00 sec)

Items and versions are stored in separate InnoDB tables. Both tables have auto-incrementing primary keys. The versions table has a foreign key (item_id) referencing items.

My question: How do I get a subset based on relevance?

I would like to fetch the following subset containing the most relevant versions:

+---------+------------+-----------+
| item_id | version_id | relevance |
+---------+------------+-----------+
|       1 |         16 |        30 |
|       2 |         18 |        22 |
|       3 |         13 |        31 |
|       4 |         14 |        19 |
|       5 |         17 |        49 |
+---------+------------+-----------+

Ideally, when several versions of an item share the same highest relevance, the one with MAX(version_id) should be returned.

I tried grouping, joining, ordering, etcetera in many ways, but I'm not able to get the desired result. One of the things I tried is:

SELECT    item_id, version_id, relevance
FROM      (subquery) a
GROUP BY  item_id
ORDER BY  relevance DESC, version_id DESC

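But of course the ordering happens after the grouping, so by the time ORDER BY runs each item has already been collapsed to one arbitrary row, and both the highest relevance and the MAX(version_id) information are lost.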

Please advise.

1 solution

#1


This is how you can do this:

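-- t stands for the (sub)query result from the question.
-- The self-join keeps only rows for which no row of the same item has a higher relevance.
SELECT t1.item_id, MAX(t1.version_id) AS version_id, t1.relevance
FROM t t1
LEFT JOIN t t2 ON t1.item_id = t2.item_id AND t1.relevance < t2.relevance
WHERE t2.relevance IS NULL
-- relevance is listed too so the query is also accepted under ONLY_FULL_GROUP_BY
GROUP BY t1.item_id, t1.relevance
ORDER BY t1.item_id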

Output:

| ITEM_ID | VERSION_ID | RELEVANCE |
|---------|------------|-----------|
|       1 |         16 |        30 |
|       2 |         18 |        22 |
|       3 |         13 |        31 |
|       4 |         14 |        19 |
|       5 |         17 |        49 |

Fiddle here.
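
If you want to check the query without a MySQL server at hand, here is a minimal sketch that loads the sample rows from the question into an in-memory SQLite database and runs the same anti-join. Note the assumptions: SQLite is only a stand-in for MySQL here, the table name t is taken from the answer, and the function name most_relevant_versions is made up for this illustration.

```python
import sqlite3

# Sample rows copied from the question: (item_id, version_id, relevance).
ROWS = [
    (1, 1, 30), (1, 2, 30), (2, 3, 22), (3, 4, 30), (4, 5, 18), (3, 6, 30),
    (2, 7, 22), (1, 8, 30), (5, 9, 48), (4, 10, 18), (5, 11, 48), (3, 12, 30),
    (3, 13, 31), (4, 14, 19), (2, 15, 22), (1, 16, 30), (5, 17, 49), (2, 18, 22),
]

# Anti-join: keep a row of t1 only if no row t2 of the same item has a
# higher relevance, then take the largest version_id among the survivors.
QUERY = """
SELECT t1.item_id, MAX(t1.version_id) AS version_id, t1.relevance
FROM t AS t1
LEFT JOIN t AS t2
       ON t1.item_id = t2.item_id AND t1.relevance < t2.relevance
WHERE t2.relevance IS NULL
GROUP BY t1.item_id, t1.relevance
ORDER BY t1.item_id
"""


def most_relevant_versions(rows):
    """Hypothetical helper: return one (item_id, version_id, relevance) per item."""
    conn = sqlite3.connect(":memory:")  # SQLite used as a stand-in for MySQL
    try:
        conn.execute(
            "CREATE TABLE t (item_id INTEGER, version_id INTEGER, relevance INTEGER)"
        )
        conn.executemany("INSERT INTO t VALUES (?, ?, ?)", rows)
        return conn.execute(QUERY).fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    for row in most_relevant_versions(ROWS):
        print(row)
```

On the sample data this yields the five rows shown in the output above. Ties on relevance are resolved by MAX(version_id), which is why item 1 comes back with version 16.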
