So I have a table, my_table, with a primary key, id (INT), and further columns foo (VARCHAR) and bar (DOUBLE). Each foo should appear once in my table, with an associated bar value, but I know that I have several rows with identical foos associated with different bars. How do I get a list of those rows containing the same foo value, but which have different bars (say, different by more than 10.)? I tried:
SELECT t1.id, t1.bar, t2.id, t2.bar, t1.foo FROM my_table t1, my_table t2 WHERE t1.foo=t2.foo AND t1.bar - t2.bar > 10.;
But I get lots and lots of results (more than the total number of rows in my_table). I feel I must be doing something very obviously stupid, but can't see my mistake.
Ah - thanks SWeko: I think I understand why I'm getting so many results, then. Is there a way in SQL of counting, for each foo, the number of rows with that foo but bars differing by more than 10.?
3 solutions
#1
0
If, for example, you have 5 rows with foo='A' and 10 rows with foo='B', the self-join will join each A-row with every A-row (including itself) and each B-row with every B-row, so a simple
SELECT t1.id, t1.bar, t2.id, t2.bar, t1.foo
FROM my_table t1, my_table t2
WHERE t1.foo=t2.foo
will return 5*5+10*10=125 rows. Filtering the values will cut that number down, but you might still have (significantly) more rows than you started with. E.g. if we presume that the B-rows have bar values of 5 through 50 in steps of 5, that would mean that they will be matched with:
bar = 5 - 0 rows that have bar less than -5
bar = 10 - 0 rows that have bar less than 0
bar = 15 - 0 rows that have bar less than 5
bar = 20 - 1 row that has bar less than 10
bar = 25 - 2 rows that have bar less than 15
bar = 30 - 3 rows that have bar less than 20
bar = 35 - 4 rows that have bar less than 25
bar = 40 - 5 rows that have bar less than 30
bar = 45 - 6 rows that have bar less than 35
bar = 50 - 7 rows that have bar less than 40
so you will have 28 results for the B-rows alone, and that number rises with the square of the number of rows that have the same value of foo.
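To check the arithmetic above, here is a minimal sketch using Python's built-in sqlite3 module. The sample data is invented to match the example (5 A-rows, and 10 B-rows with bar = 5, 10, ..., 50), and SQLite is only a stand-in for whatever database you are actually running.

```python
import sqlite3

# Hypothetical sample data: 5 rows with foo='A' (all bar = 1.0) and
# 10 rows with foo='B' whose bar values are 5, 10, ..., 50.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE my_table (id INTEGER PRIMARY KEY, foo VARCHAR, bar DOUBLE)"
)
rows = [("A", 1.0)] * 5 + [("B", 5.0 * i) for i in range(1, 11)]
conn.executemany("INSERT INTO my_table (foo, bar) VALUES (?, ?)", rows)

# Unfiltered self-join: every row pairs with every row sharing its foo.
unfiltered = conn.execute(
    "SELECT COUNT(*) FROM my_table t1, my_table t2 WHERE t1.foo = t2.foo"
).fetchone()[0]

# Same join with the bar filter, counted for the B-rows only.
filtered_b = conn.execute(
    "SELECT COUNT(*) FROM my_table t1, my_table t2 "
    "WHERE t1.foo = t2.foo AND t1.bar - t2.bar > 10 AND t1.foo = 'B'"
).fetchone()[0]

print(unfiltered, filtered_b)  # prints: 125 28
```

So even after filtering, the 10 B-rows alone produce 28 result rows, which is how the result set can end up larger than the table.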
#2
2
To answer your latest question:
Is there a way in SQL of counting, for each foo, the number of rows with that foo but bars differing by more than 10.?
A query like this should work:
select t1.id, t1.foo, t1.bar, count(t2.id) as dupes
from my_table t1
left outer join my_table t2 on t1.foo=t2.foo and (t1.bar - t2.bar) > 10
group by t1.id, t1.foo, t1.bar;
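To see what this query returns, here is a small sketch that runs it through Python's built-in sqlite3 module. The four sample rows are invented for illustration, SQLite stands in for your actual database, and the ORDER BY is added only so the output order is predictable.

```python
import sqlite3

# Hypothetical sample data: three rows share foo='A', one row has foo='B'.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE my_table (id INTEGER PRIMARY KEY, foo VARCHAR, bar DOUBLE)"
)
conn.executemany(
    "INSERT INTO my_table (id, foo, bar) VALUES (?, ?, ?)",
    [(1, "A", 1.0), (2, "A", 2.0), (3, "A", 30.0), (4, "B", 7.0)],
)

# The query from the answer, plus ORDER BY for a deterministic result.
# The LEFT OUTER JOIN keeps rows without a match; COUNT(t2.id) skips the
# NULLs, so those rows report 0.
dupes = conn.execute(
    """
    select t1.id, t1.foo, t1.bar, count(t2.id) as dupes
    from my_table t1
    left outer join my_table t2 on t1.foo=t2.foo and (t1.bar - t2.bar) > 10
    group by t1.id, t1.foo, t1.bar
    order by t1.id
    """
).fetchall()

for row in dupes:
    print(row)
# (1, 'A', 1.0, 0)
# (2, 'A', 2.0, 0)
# (3, 'A', 30.0, 2)
# (4, 'B', 7.0, 0)
```

Note that the count is one-directional: a row is credited only with the rows whose bar is lower than its own by more than 10, so rows 1 and 2 show 0 even though they conflict with row 3.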
#3
-1
Have you tried the same thing with the "new" JOIN syntax?
SELECT t1.*,
t2.*
FROM my_table t1
JOIN my_table t2 ON t1.foo = t2.foo
WHERE (t1.bar - t2.bar) > 10
I don't expect that to fix your problem, but that's at least where I would start.
I might also try this:
SELECT t1.*,
t2.*
FROM my_table t1
JOIN my_table t2 ON t1.foo = t2.foo AND t1.id != t2.id
WHERE (t1.bar - t2.bar) > 10
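As a rough check, the sketch below (Python's built-in sqlite3, with invented sample rows) compares the comma-style join from the question with the two JOIN variants above. On this data all three return the same pairs, since a row never differs from itself by more than 10. The last query is not from any of the answers: it is one possible way to count conflicting pairs per foo, assuming you want each pair listed once (t1.id < t2.id) and in either direction (ABS).

```python
import sqlite3

# Hypothetical sample data: two foo groups with conflicting bar values.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE my_table (id INTEGER PRIMARY KEY, foo VARCHAR, bar DOUBLE)"
)
conn.executemany(
    "INSERT INTO my_table (id, foo, bar) VALUES (?, ?, ?)",
    [(1, "A", 1.0), (2, "A", 2.0), (3, "A", 30.0), (4, "B", 7.0), (5, "B", 20.0)],
)

def pairs(sql):
    """Run a query that selects (t1.id, t2.id) and return the sorted pairs."""
    return sorted(conn.execute(sql).fetchall())

comma_join = pairs(
    "SELECT t1.id, t2.id FROM my_table t1, my_table t2 "
    "WHERE t1.foo = t2.foo AND t1.bar - t2.bar > 10"
)
explicit_join = pairs(
    "SELECT t1.id, t2.id FROM my_table t1 "
    "JOIN my_table t2 ON t1.foo = t2.foo "
    "WHERE (t1.bar - t2.bar) > 10"
)
no_self_join = pairs(
    "SELECT t1.id, t2.id FROM my_table t1 "
    "JOIN my_table t2 ON t1.foo = t2.foo AND t1.id != t2.id "
    "WHERE (t1.bar - t2.bar) > 10"
)
print(comma_join == explicit_join == no_self_join)  # prints: True

# Assumed variant (not from the answers): count each conflicting pair once per foo.
per_foo = conn.execute(
    "SELECT t1.foo, COUNT(*) FROM my_table t1 "
    "JOIN my_table t2 ON t1.foo = t2.foo AND t1.id < t2.id "
    "WHERE ABS(t1.bar - t2.bar) > 10 "
    "GROUP BY t1.foo ORDER BY t1.foo"
).fetchall()
print(per_foo)  # prints: [('A', 2), ('B', 1)]
```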