OK, so I have this table:
+----+--------------+------------------+----------+
| id | business_key | other columns... | creation |
+----+--------------+------------------+----------+
| 1 | 1 | ... | 01/01/14 |
| 2 | 1 | ... | 12/02/14 |
| 3 | 1 | ... | 13/03/14 | <--
| 4 | 2 | ... | 01/01/14 |
| 5 | 2 | ... | 12/02/14 | <--
| 6 | 8 | ... | 01/01/14 | <--
| 7 | 10 | ... | 01/01/14 |
| 8 | 10 | ... | 12/02/14 |
| 9 | 10 | ... | 13/03/14 |
| 10 | 10 | ... | 13/03/14 | <--
+----+--------------+------------------+----------+
For each business key, I want to return the most recent row, as determined by the "creation" column (see the arrows above). The simple answer would be:
SELECT business_key, MAX(creation) FROM mytable GROUP BY business_key;
The thing is, I need to return ALL the columns. I then learned about the greatest-n-per-group tag and found this topic: SQL Select only rows with Max Value on a Column. The top answer is great and provides this query:
SELECT mt1.*
FROM mytable mt1
LEFT OUTER JOIN mytable mt2
ON (mt1.business_key = mt2.business_key AND mt1.creation < mt2.creation)
WHERE mt2.business_key IS NULL;
Sadly, it doesn't work because my situation is a little trickier: if you look at the rows with id 9 and 10 in my table, you will see that they have the same business key and the same creation date. While this should be avoided in my application, I still have to handle it if it happens.
With the last query above, this is what I get:
+----+--------------+------------------+----------+
| id | business_key | other columns... | creation |
+----+--------------+------------------+----------+
| 3 | 1 | ... | 13/03/14 |
| 5 | 2 | ... | 12/02/14 |
| 6 | 8 | ... | 01/01/14 |
| 9 | 10 | ... | 13/03/14 | <--
| 10 | 10 | ... | 13/03/14 | <--
+----+--------------+------------------+----------+
Whereas I want this:
+----+--------------+------------------+----------+
| id | business_key | other columns... | creation |
+----+--------------+------------------+----------+
| 3 | 1 | ... | 13/03/14 |
| 5 | 2 | ... | 12/02/14 |
| 6 | 8 | ... | 01/01/14 |
| 10 | 10 | ... | 13/03/14 | <--
+----+--------------+------------------+----------+
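The difference between the two result sets above can be reproduced with a small script. This is only a sketch: it uses an in-memory SQLite database through Python's sqlite3 module, omits the "other columns", and stores the dates as ISO strings (an assumption, so that `<` compares them chronologically). The function name `latest_by_date_only` is hypothetical.

```python
import sqlite3

# Sample rows from the table above: (id, business_key, creation).
# Assumption: dates are rewritten as ISO strings so string comparison
# matches chronological order.
ROWS = [
    (1, 1, "2014-01-01"),
    (2, 1, "2014-02-12"),
    (3, 1, "2014-03-13"),
    (4, 2, "2014-01-01"),
    (5, 2, "2014-02-12"),
    (6, 8, "2014-01-01"),
    (7, 10, "2014-01-01"),
    (8, 10, "2014-02-12"),
    (9, 10, "2014-03-13"),
    (10, 10, "2014-03-13"),
]

# The anti-join from the linked topic, comparing on creation only.
DATE_ONLY_QUERY = """
SELECT mt1.*
FROM mytable mt1
LEFT OUTER JOIN mytable mt2
  ON mt1.business_key = mt2.business_key AND mt1.creation < mt2.creation
WHERE mt2.business_key IS NULL
ORDER BY mt1.id
"""


def latest_by_date_only():
    """Return the ids kept by the date-only anti-join."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE mytable ("
        "id INTEGER PRIMARY KEY, business_key INTEGER, creation TEXT)"
    )
    conn.executemany("INSERT INTO mytable VALUES (?, ?, ?)", ROWS)
    ids = [row[0] for row in conn.execute(DATE_ONLY_QUERY)]
    conn.close()
    return ids


# Rows 9 and 10 both survive: neither has a strictly later row in its group.
print(latest_by_date_only())  # [3, 5, 6, 9, 10]
```

Because the join condition uses a strict `<`, two rows that tie on the latest creation date never eliminate each other, which is why both id 9 and id 10 come back.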
I know it's a poor choice to rely on MAX() over a technical column like "id", but right now it's the only way for me to prevent duplicates when the business key AND the creation date are the same. The problem is, I have no idea how to do it. Any ideas? Keep in mind it must return all the columns (we have a lot of columns, so a SELECT * will be necessary).
Thanks a lot.
1 Solution
#1
3
My first thought is that your id seems to increase along with the date, so just use that:
SELECT mt1.*
FROM mytable mt1 LEFT OUTER JOIN
mytable mt2
ON mt1.business_key = mt2.business_key AND mt2.id > mt1.id
WHERE mt2.business_key IS NULL;
You can still apply the same idea with two columns:
SELECT mt1.*
FROM mytable mt1 LEFT OUTER JOIN
mytable mt2
ON mt1.business_key = mt2.business_key AND
(mt2.creation > mt1.creation OR
mt2.creation = mt1.creation AND
mt2.id > mt1.id
)
WHERE mt2.business_key IS NULL;
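As a rough check of the two queries above, here is a sketch that runs both against an in-memory SQLite database via Python's sqlite3 module. The dates are stored as ISO strings (an assumption, so that comparisons are chronological), the "other columns" are omitted, and `latest_ids` is a hypothetical helper. The row with id 11 is also hypothetical: it is not in the question's data and is added only to show what happens if ids ever stop following the creation date.

```python
import sqlite3

# Sample rows from the question: (id, business_key, creation as ISO string).
SAMPLE_ROWS = [
    (1, 1, "2014-01-01"), (2, 1, "2014-02-12"), (3, 1, "2014-03-13"),
    (4, 2, "2014-01-01"), (5, 2, "2014-02-12"),
    (6, 8, "2014-01-01"),
    (7, 10, "2014-01-01"), (8, 10, "2014-02-12"),
    (9, 10, "2014-03-13"), (10, 10, "2014-03-13"),
]

# Hypothetical row, not in the original table: a higher id with an OLDER date.
OUT_OF_ORDER_ROW = (11, 2, "2014-01-15")

# First query: keep the row with the highest id per business key.
ID_ONLY_QUERY = """
SELECT mt1.*
FROM mytable mt1
LEFT OUTER JOIN mytable mt2
  ON mt1.business_key = mt2.business_key AND mt2.id > mt1.id
WHERE mt2.business_key IS NULL
ORDER BY mt1.id
"""

# Second query: latest creation first, id only as a tie-breaker.
DATE_THEN_ID_QUERY = """
SELECT mt1.*
FROM mytable mt1
LEFT OUTER JOIN mytable mt2
  ON mt1.business_key = mt2.business_key
 AND (mt2.creation > mt1.creation
      OR (mt2.creation = mt1.creation AND mt2.id > mt1.id))
WHERE mt2.business_key IS NULL
ORDER BY mt1.id
"""


def latest_ids(query, rows):
    """Load rows into a fresh in-memory table and return the ids the query keeps."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE mytable ("
        "id INTEGER PRIMARY KEY, business_key INTEGER, creation TEXT)"
    )
    conn.executemany("INSERT INTO mytable VALUES (?, ?, ?)", rows)
    ids = [row[0] for row in conn.execute(query)]
    conn.close()
    return ids


# On the question's data, both queries agree and return one row per key.
print(latest_ids(ID_ONLY_QUERY, SAMPLE_ROWS))       # [3, 5, 6, 10]
print(latest_ids(DATE_THEN_ID_QUERY, SAMPLE_ROWS))  # [3, 5, 6, 10]

# With the hypothetical out-of-order row, only the second query
# still picks the most recent creation date for business key 2.
with_extra = SAMPLE_ROWS + [OUT_OF_ORDER_ROW]
print(latest_ids(ID_ONLY_QUERY, with_extra))        # [3, 6, 10, 11]
print(latest_ids(DATE_THEN_ID_QUERY, with_extra))   # [3, 5, 6, 10]
```

If ids are guaranteed to increase with the creation date, the first query is enough. The two-column version is the safer choice when that guarantee might not hold, since the id then only decides between rows that share the same latest date.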