I'm learning SQL and ran into a challenge.
I have the following table:
tbl <- data.frame(
id_name = c("a", "a", "b", "c", "d", "f", "b", "c", "d", "f"),
value = c(1, -1, 1, 1, 1, 1, -1, -1, -1, -1),
score = c(1, 0, 1, 2, 3, 4, 3, 2, 1, 0),
date = as.Date(c("2001-1-1", "2002-1-1", "2003-1-1", "2005-1-1",
"2005-1-1", "2007-1-1", "2008-1-1", "2010-1-1",
"2011-1-1", "2012-1-1"), "%Y-%m-%d")
)
+---------+-------+-------+----------+
| id_name | value | score | date     |
+---------+-------+-------+----------+
| a       |     1 |     1 | 2001-1-1 |
| a       |    -1 |     0 | 2002-1-1 |
| b       |     1 |     1 | 2003-1-1 |
| c       |     1 |     2 | 2005-1-1 |
| d       |     1 |     3 | 2005-1-1 |
| f       |     1 |     4 | 2007-1-1 |
| b       |    -1 |     3 | 2008-1-1 |
| c       |    -1 |     2 | 2010-1-1 |
| d       |    -1 |     1 | 2011-1-1 |
| f       |    -1 |     0 | 2012-1-1 |
+---------+-------+-------+----------+
My goal is this:
For each id_name, I'd like to get the date of the maximum score in tbl, looking only at rows dated between that id_name's first and last rows (inclusive). If the maximum occurs more than once, I want the first date.
For example, id_name 'a' should return '2001-1-1', since the maximum score between its rows is 1, and id_name 'b' should return '2007-1-1', since the maximum score between its rows is 4:
+---------+----------+
| id_name | date     |
+---------+----------+
| a       | 2001-1-1 |
| b       | 2007-1-1 |
+---------+----------+
This is what I have so far:
sqldf("
SELECT
id_name,
date,
score
FROM
tbl As d
WHERE
score = (
SELECT MAX(score)
FROM tbl As b
WHERE
date >= (
SELECT MIN(date)
FROM tbl
WHERE id_name = b.id_name
) AND
date <= (
SELECT MAX(date)
FROM tbl
WHERE id_name = b.id_name
)
)
")
The problem is that it returns the rows with the global maximum score, regardless of the current row's id_name.
Thanks!
1 Answer
I think a correlated subquery in the WHERE clause will fit the bill here:
SELECT id_name, date
FROM tbl as t1
WHERE score = (SELECT max(score) FROM tbl WHERE id_name = t1.id_name)
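Note that the query above takes the maximum over each id_name's own rows only. If the window is meant to cover every row in tbl dated between that id_name's first and last date, as the expected '2007-1-1' for 'b' suggests, something along these lines might be closer. This is only a sketch: it runs the SQL through Python's sqlite3 so it can be checked on its own, the names `bounds`, `first_date`, `last_date` and `first_date_of_window_max` are made up for illustration, and the dates are stored as zero-padded ISO strings (an assumption) so that plain text comparison sorts them chronologically. sqldf uses SQLite by default, so the query text should carry over, but I have not checked how the R Date column behaves there.

```python
import sqlite3

# Sample rows from the question: (id_name, value, score, date).
# Dates are zero-padded ISO strings (an assumption) so they sort chronologically as text.
ROWS = [
    ("a", 1, 1, "2001-01-01"), ("a", -1, 0, "2002-01-01"),
    ("b", 1, 1, "2003-01-01"), ("c", 1, 2, "2005-01-01"),
    ("d", 1, 3, "2005-01-01"), ("f", 1, 4, "2007-01-01"),
    ("b", -1, 3, "2008-01-01"), ("c", -1, 2, "2010-01-01"),
    ("d", -1, 1, "2011-01-01"), ("f", -1, 0, "2012-01-01"),
]

QUERY = """
WITH bounds AS (
    -- first and last date on which each id_name appears
    SELECT id_name, MIN(date) AS first_date, MAX(date) AS last_date
    FROM tbl
    GROUP BY id_name
)
SELECT b.id_name, MIN(t.date) AS date      -- MIN breaks ties: earliest date wins
FROM bounds AS b
JOIN tbl AS t
  ON t.date BETWEEN b.first_date AND b.last_date
WHERE t.score = (
    -- maximum score among ALL rows inside this id_name's date window
    SELECT MAX(score)
    FROM tbl
    WHERE date BETWEEN b.first_date AND b.last_date
)
GROUP BY b.id_name
ORDER BY b.id_name
"""


def first_date_of_window_max(rows):
    """Load rows into an in-memory SQLite table and run QUERY against it."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE tbl (id_name TEXT, value INTEGER, score INTEGER, date TEXT)"
    )
    conn.executemany("INSERT INTO tbl VALUES (?, ?, ?, ?)", rows)
    result = conn.execute(QUERY).fetchall()
    conn.close()
    return result


result = first_date_of_window_max(ROWS)
for id_name, date in result:
    print(id_name, date)
```

On the sample data this gives 2001-01-01 for 'a' and 2007-01-01 for 'b', 'c', 'd' and 'f'. The key difference from the attempt in the question is that the inner MAX(score) is tied to the outer id_name's date window through `bounds`, rather than to the inner table's own alias.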