My (sub)query returns the following dataset:
+---------+------------+-----------+
| item_id | version_id | relevance |
+---------+------------+-----------+
| 1 | 1 | 30 |
| 1 | 2 | 30 |
| 2 | 3 | 22 |
| 3 | 4 | 30 |
| 4 | 5 | 18 |
| 3 | 6 | 30 |
| 2 | 7 | 22 |
| 1 | 8 | 30 |
| 5 | 9 | 48 |
| 4 | 10 | 18 |
| 5 | 11 | 48 |
| 3 | 12 | 30 |
| 3 | 13 | 31 |
| 4 | 14 | 19 |
| 2 | 15 | 22 |
| 1 | 16 | 30 |
| 5 | 17 | 49 |
| 2 | 18 | 22 |
+---------+------------+-----------+
18 rows in set (0.00 sec)
Items and versions are stored in separate InnoDB tables. Both tables have auto-incrementing primary keys. Versions have a foreign key to items (item_id).
My question: How do I get a subset based on relevance?
I would like to fetch the following subset containing the most relevant versions:
+---------+------------+-----------+
| item_id | version_id | relevance |
+---------+------------+-----------+
| 1 | 16 | 30 |
| 2 | 18 | 22 |
| 3 | 13 | 31 |
| 4 | 14 | 19 |
| 5 | 17 | 49 |
+---------+------------+-----------+
Ideally, when several versions of an item have equal relevance, the one with MAX(version_id) should be returned.
I tried grouping, joining, ordering, etc. in many ways, but I'm not able to get the desired result. One of the things I tried is:
SELECT item_id, version_id, relevance
FROM (subquery) a
GROUP BY item_id
ORDER BY relevance DESC, version_id DESC
But of course the ordering happens after the grouping, so both the relevance and the MAX(version_id) information are lost.
Please advise.
1 solution
#1
1
This is how you can do it, using a self-join that keeps only the rows for which no more relevant version of the same item exists:
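-- t2 matches any row of the same item with a strictly higher relevance
SELECT t1.item_id, MAX(t1.version_id) AS version_id, t1.relevance
FROM t t1
LEFT JOIN t t2 ON t1.item_id = t2.item_id AND t1.relevance < t2.relevance
-- no match in t2 means t1 already has the highest relevance for its item
WHERE t2.relevance IS NULL
GROUP BY t1.item_id, t1.relevance
ORDER BY t1.item_id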
Output:
| ITEM_ID | VERSION_ID | RELEVANCE |
|---------|------------|-----------|
| 1 | 16 | 30 |
| 2 | 18 | 22 |
| 3 | 13 | 31 |
| 4 | 14 | 19 |
| 5 | 17 | 49 |
Fiddle here.
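The same anti-join idea can also carry the tie-break inside the join condition, so that no GROUP BY is needed at all. The sketch below is only an illustration under a few assumptions: it runs against an in-memory SQLite database through Python's sqlite3 module rather than MySQL, the table name t follows the query above, and the helper name most_relevant_versions is made up for this example.

```python
import sqlite3

# Sample rows from the question: (item_id, version_id, relevance).
ROWS = [
    (1, 1, 30), (1, 2, 30), (2, 3, 22), (3, 4, 30), (4, 5, 18), (3, 6, 30),
    (2, 7, 22), (1, 8, 30), (5, 9, 48), (4, 10, 18), (5, 11, 48), (3, 12, 30),
    (3, 13, 31), (4, 14, 19), (2, 15, 22), (1, 16, 30), (5, 17, 49), (2, 18, 22),
]

# Anti-join: keep a t1 row only if no t2 row of the same item "beats" it,
# i.e. has a higher relevance, or an equal relevance and a higher version_id.
QUERY = """
SELECT t1.item_id, t1.version_id, t1.relevance
FROM t AS t1
LEFT JOIN t AS t2
  ON t1.item_id = t2.item_id
 AND (t2.relevance > t1.relevance
      OR (t2.relevance = t1.relevance AND t2.version_id > t1.version_id))
WHERE t2.item_id IS NULL
ORDER BY t1.item_id
"""


def most_relevant_versions(rows):
    """Load rows into a throwaway table t (hypothetical helper) and run QUERY."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute(
            "CREATE TABLE t (item_id INTEGER, version_id INTEGER, relevance INTEGER)"
        )
        conn.executemany("INSERT INTO t VALUES (?, ?, ?)", rows)
        return conn.execute(QUERY).fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    for row in most_relevant_versions(ROWS):
        print(row)
```

Because every surviving row is the single best row of its item, the selected version_id and relevance always come from the same row, which the GROUP BY variant only guarantees as long as all remaining rows of an item share one relevance value.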