
Date: 2022-04-28 15:59:47

I’d like to identify IDs whose values transition from a positive value to zero or a negative value.


So, if we have a table as follows:


ID  VALUES  ORDER
1   20      2
1   10      1
2   0       2
2   100     1
3   -10     2
3   5       1
4   0       2
4   0       1
5   -3      2
5   25      1
6   30      2
6   -50     1
7   -10     2
7   0       1
8   -100    3
8   50      2
8   100     1
9   -10     4
9   0       3
9   10      2
9   20      1

The results should be as follows, where ID 2’s values transitioned from 100 to 0, ID 3’s values transitioned from 5 to -10, ID 5’s values transitioned from 25 to -3. We are not interested in ID 6 since its values transitioned from negative to positive values -- we only care about IDs whose values went from positive to 0 or negative (and not vice versa). We can also have IDs that transition from 0 to negative but not 0 to 0:


2   0       2
2   100     1
3   -10     2
3   5       1
5   -3      2
5   25      1
7  -10      2
7   0       1
8  -100     3
8   50      2
9  -10      4
9   0       3
9   10      2

How would I achieve such a result?


4 solutions



You could find the previous and next value for each ID using analytic lag/lead functions:


select id, value, seq,
  lag(value) over (partition by id order by seq) as prev_val,
  lead(value) over (partition by id order by seq) as next_val
from your_table
order by id, seq desc;

        ID      VALUE        SEQ   PREV_VAL   NEXT_VAL
---------- ---------- ---------- ---------- ----------
         1         20          2         10           
         1         10          1                    20
         2          0          2        100           
         2        100          1                     0
         3        -10          2          5           
         3          5          1                   -10
         4          0          2          0           
         4          0          1                     0
         5         -3          2         25           
         5         25          1                    -3
         6         30          2        -50           
         6        -50          1                    30
         7        -10          2          0           
         7          0          1                   -10
         8       -100          3         50           
         8         50          2        100       -100
         8        100          1                    50
         9        -10          3          0           
         9          0          2         10        -10
         9         20          1                    10
         9         10          1         20          0

(I changed the column names, as both order and values are reserved). You can then use that as a subquery and filter the combinations you want to see:


select id, value, seq
from (
  select id, value, seq,
    lag(value) over (partition by id order by seq) as prev_val,
    lead(value) over (partition by id order by seq) as next_val
  from your_table
)
where (prev_val >= 0 and value <= 0 and not (prev_val = 0 and value = 0))
or (value >= 0 and next_val <= 0 and not (value = 0 and next_val = 0))
order by id, seq desc;

        ID      VALUE        SEQ
---------- ---------- ----------
         2          0          2
         2        100          1
         3        -10          2
         3          5          1
         5         -3          2
         5         25          1
         7        -10          2
         7          0          1
         8       -100          3
         8         50          2
         9        -10          3
         9          0          2
         9         10          1

As Gordon mentioned, if you only want the IDs - not the values involved - you only need the lead or lag value, not both.
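
If you want to try the lag/lead filter without an Oracle instance, here is a minimal sketch that runs the same query in SQLite through Python's sqlite3 module. Using SQLite (3.25 or later, for window functions) instead of Oracle is an assumption made for convenience only. The rows are the sample data from the question, loaded into your_table with the renamed columns id, value and seq. The helper name transition_rows is made up for this example.

```python
import sqlite3

# Sample rows from the question as (id, value, seq).
ROWS = [
    (1, 20, 2), (1, 10, 1), (2, 0, 2), (2, 100, 1), (3, -10, 2), (3, 5, 1),
    (4, 0, 2), (4, 0, 1), (5, -3, 2), (5, 25, 1), (6, 30, 2), (6, -50, 1),
    (7, -10, 2), (7, 0, 1), (8, -100, 3), (8, 50, 2), (8, 100, 1),
    (9, -10, 4), (9, 0, 3), (9, 10, 2), (9, 20, 1),
]

# Same filter as the Oracle query, with the subquery closed before the WHERE.
QUERY = """
select id, value, seq
from (
  select id, value, seq,
    lag(value) over (partition by id order by seq) as prev_val,
    lead(value) over (partition by id order by seq) as next_val
  from your_table
)
where (prev_val >= 0 and value <= 0 and not (prev_val = 0 and value = 0))
   or (value >= 0 and next_val <= 0 and not (value = 0 and next_val = 0))
order by id, seq desc
"""


def transition_rows(rows):
    """Return the rows on either side of a (positive or zero) -> (zero or negative) step."""
    conn = sqlite3.connect(":memory:")
    conn.execute("create table your_table (id integer, value integer, seq integer)")
    conn.executemany("insert into your_table values (?, ?, ?)", rows)
    result = conn.execute(QUERY).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    for row in transition_rows(ROWS):
        print(row)
```

With the question's data (where ID 9 has four rows, seq 1 to 4), this returns the 13 rows listed in the question's desired output, including the row 9, -10, 4.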




I think you can just do this using lead():


select distinct id
from (select t.*,
             lead(value) over (partition by id order by seq) as next_val
      from t
     ) t
where value > 0 and next_val <= 0;

Note that I used Alex's naming conventions.
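
One thing to be aware of: the filter value > 0 and next_val <= 0 only catches steps that start from a strictly positive value, so an ID whose only step is from 0 to a negative value (ID 7 in the sample data) is not returned. The sketch below compares the posted filter with a hypothetical variant that also accepts 0 to negative. It runs in SQLite via Python's sqlite3 module rather than Oracle, which is an assumption made for convenience. The names POSTED, WITH_ZERO_START and transition_ids are invented for the example.

```python
import sqlite3

# Sample rows from the question as (id, value, seq).
ROWS = [
    (1, 20, 2), (1, 10, 1), (2, 0, 2), (2, 100, 1), (3, -10, 2), (3, 5, 1),
    (4, 0, 2), (4, 0, 1), (5, -3, 2), (5, 25, 1), (6, 30, 2), (6, -50, 1),
    (7, -10, 2), (7, 0, 1), (8, -100, 3), (8, 50, 2), (8, 100, 1),
    (9, -10, 4), (9, 0, 3), (9, 10, 2), (9, 20, 1),
]

# Filter as posted: only steps that start from a strictly positive value.
POSTED = "value > 0 and next_val <= 0"
# Hypothetical variant: also accept steps from zero to a negative value.
WITH_ZERO_START = "(value > 0 and next_val <= 0) or (value = 0 and next_val < 0)"


def transition_ids(condition):
    """Return the distinct IDs with at least one step matching the condition."""
    conn = sqlite3.connect(":memory:")
    conn.execute("create table t (id integer, value integer, seq integer)")
    conn.executemany("insert into t values (?, ?, ?)", ROWS)
    query = f"""
        select distinct id
        from (select t.*,
                     lead(value) over (partition by id order by seq) as next_val
              from t
             ) t
        where {condition}
        order by id
    """
    ids = [row[0] for row in conn.execute(query)]
    conn.close()
    return ids


print(transition_ids(POSTED))           # [2, 3, 5, 8, 9]
print(transition_ids(WITH_ZERO_START))  # [2, 3, 5, 7, 8, 9]
```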




This should give you the same result, but with each transition on a single row:


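-- TABLE, VALUES and ORDER are reserved words, so this uses your_table, value and seq
select t1.id, t1.value as value1, t2.value as value2
from your_table t1
join your_table t2 on t1.id = t2.id and t1.seq = t2.seq + 1  -- t2 is the row just before t1
where t1.value < t2.value and t2.value >= 0                  -- dropped from zero or above
  and not (t1.value > 0 and t2.value > 0);                   -- and did not stay positive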

The result would look like this, if that is useful to you:


ID    Value1    Value2
2     0         100
3     -10       5
5     -3        25
7     -10       0
8     -100      50
9     -10       0
9     0         10



If you are using Oracle 12c, this is a great use case for MATCH_RECOGNIZE. In that case, it would be done like so:


SELECT id, value, ord FROM d
MATCH_RECOGNIZE ( partition by id
                  order by ord
                  all rows per match
                  after match skip past last row
                  pattern (posorzero neg | pos negorzero )
                  define
                    neg as neg.value < 0,
                    negorzero as negorzero.value <= 0,
                    pos as pos.value > 0,
                    posorzero as posorzero.value >= 0
                )
order by id, ord desc;

ID  VALUE   ORD
--  -----   ---
2       0     2
2     100     1
3     -10     2
3       5     1
5      -3     2
5      25     1
7     -10     2
7       0     1
8    -100     3
8      50     2
9       0     3
9      10     2

The only thing is, this doesn't include the following row from your desired output:


9  -10      4

In all other cases, your desired output seems to be "include only the rows from each id that contribute to the cross-over". This one row that you've included in your desired output would not meet that definition, since these rows:


9   0       3
9   10      2

... by themselves represent a cross-over as you've defined it.
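
One way to pin the definition down is to treat every pair of consecutive rows as a candidate cross-over. The sketch below is plain Python; the helper names is_crossover and crossovers are invented for illustration. It lists the qualifying pairs for ID 9. Under that reading, row 4 belongs to a second cross-over (0 to -10), which a non-overlapping match cannot report because row 3 was already consumed by the first match.

```python
def is_crossover(prev, curr):
    """True when prev -> curr goes from zero-or-positive to zero-or-negative, excluding 0 -> 0."""
    return prev >= 0 and curr <= 0 and not (prev == 0 and curr == 0)


def crossovers(values):
    """Return the (prev, curr) pairs of consecutive values that qualify as a cross-over."""
    return [(p, c) for p, c in zip(values, values[1:]) if is_crossover(p, c)]


# Values for ID 9 in ascending order of the ordering column (1 to 4).
id9 = [20, 10, 0, -10]
print(crossovers(id9))  # [(10, 0), (0, -10)]
```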


If you have a consistent and better definition of what the requested output should include, please let me know and I'll update my answer.




