I have a specific problem. I have a table which contains invalid values. I need to replace the invalid values (here 0
) with the previous value which is bigger than 0
.
我有一个特别的问题。我有一个包含无效值的表。我需要用大于0的前值替换无效值(这里0)。
The difficulty is, it is not appropiate for me to use an Update or an insert(Cursor and update would do it).Well my only way is to use a Select statement.
难点在于,我不赞成使用更新或插入(光标和更新就可以)。我唯一的方法就是使用Select语句。
When I use the lag(col1, 1)
- function with when case, I get only one column with the correct value.
当我使用时滞(col1, 1) -当情况发生时函数,我只得到正确值的一列。
select col1, col2 realcol2,
(case
when col2 = 0 then
lag(col2,1,1) over (partition by col1 order by col1 )
else
col2
end ) col2,
col3 realcol3,
(case
when col3 = 0 then
lag(col3,1,1) over (partition by col1 order by col1 )
else
col3
end ) col3
from test_table
Contents of TEST_TABLE
:
TEST_TABLE内容:
---------------------------
Col1 | Col2 | Col3 | Col4
---------------------------
A | 0 | 1 | 5
B | 0 | 4 | 0
C | 2 | 0 | 0
D | 0 | 0 | 0
E | 3 | 5 | 0
F | 0 | 3 | 0
G | 0 | 3 | 1
A | 0 | 1 | 5
E | 3 | 5 | 0
Expected query result:
将查询结果:
---------------------------
Col1 | Col2 | Col3 | Col4
---------------------------
A | 0 | 1 | 5
B | 0 | 4 | 5
C | 2 | 4 | 5
D | 2 | 4 | 5
E | 3 | 5 | 5
F | 3 | 3 | 5
G | 3 | 3 | 1
A | 3 | 1 | 5
E | 3 | 5 | 5
3 个解决方案
#1
3
SELECT col1,
CASE col2 WHEN 0 THEN NVL( LAG( CASE col2 WHEN 0 THEN NULL ELSE col2 END ) IGNORE NULLS OVER ( ORDER BY NULL ), 0 ) ELSE col2 END AS col2,
CASE col3 WHEN 0 THEN NVL( LAG( CASE col3 WHEN 0 THEN NULL ELSE col3 END ) IGNORE NULLS OVER ( ORDER BY NULL ), 0 ) ELSE col3 END AS col3,
CASE col4 WHEN 0 THEN NVL( LAG( CASE col4 WHEN 0 THEN NULL ELSE col4 END ) IGNORE NULLS OVER ( ORDER BY NULL ), 0 ) ELSE col4 END AS col4
FROM table_name;
Result:
结果:
COL1 COL2 COL3 COL4
---- ---------- ---------- ----------
A 0 1 5
B 0 4 5
C 2 4 5
D 2 4 5
E 3 5 5
F 3 3 5
G 3 3 1
A 3 1 5
E 3 5 5
#2
6
I'm assuming an additional column col0
that contains an obvious ordering criteria for your data, as your col1
example data isn't really ordered correctly (repeated, trailing values of A
and E
).
我假设有一个额外的列col0,它包含了数据的一个明显的排序标准,因为您的col1示例数据并没有被正确地排序(重复的、末尾值为A和E)。
I love the MODEL
clause for these kinds of purposes. The following query yields the expected result:
出于这些目的,我喜欢模型条款。下面的查询会得到预期的结果:
WITH t(col0, col1, col2, col3, col4) AS (
SELECT 1, 'A', 0, 1, 5 FROM DUAL UNION ALL
SELECT 2, 'B', 0, 4, 0 FROM DUAL UNION ALL
SELECT 3, 'C', 2, 0, 0 FROM DUAL UNION ALL
SELECT 4, 'D', 0, 0, 0 FROM DUAL UNION ALL
SELECT 5, 'E', 3, 5, 0 FROM DUAL UNION ALL
SELECT 6, 'F', 0, 3, 0 FROM DUAL UNION ALL
SELECT 7, 'G', 0, 3, 1 FROM DUAL UNION ALL
SELECT 8, 'A', 0, 1, 5 FROM DUAL UNION ALL
SELECT 9, 'E', 3, 5, 0 FROM DUAL
)
SELECT * FROM t
MODEL
DIMENSION BY (row_number() OVER (ORDER BY col0) rn)
MEASURES (col1, col2, col3, col4)
RULES (
col2[any] = DECODE(col2[cv(rn)], 0, NVL(col2[cv(rn) - 1], 0), col2[cv(rn)]),
col3[any] = DECODE(col3[cv(rn)], 0, NVL(col3[cv(rn) - 1], 0), col3[cv(rn)]),
col4[any] = DECODE(col4[cv(rn)], 0, NVL(col4[cv(rn) - 1], 0), col4[cv(rn)])
)
Result:
结果:
RN COL1 COL2 COL3 COL4
1 A 0 1 5
2 B 0 4 5
3 C 2 4 5
4 D 2 4 5
5 E 3 5 5
6 F 3 3 5
7 G 3 3 1
8 A 3 1 5
9 E 3 5 5
SQLFiddle
A note on the MODEL clause vs. window function-based approaches
While the above looks cool (or scary, depending on your point of view), you should certainly prefer using a window function based appraoch as exposed by the other elegant answers by nop77svk (using LAST_VALUE() IGNORE NULLS
) or MT0 (using LAG() IGNORE NULLS
). I've explained these answers more in detail in this blog post.
虽然上面的内容看起来很酷(或者很吓人,取决于您的观点),但是您当然应该更喜欢使用基于appraoch的窗口函数,就像nop77svk(使用LAST_VALUE()忽略NULLS)或MT0(使用LAG()忽略NULLS)的其他优雅答案一样。我在这篇博文中更详细地解释了这些答案。
#3
4
Assuming you want the previous values as per your original data order (whatever that might be), then your query might look like:
假设您希望按照原始数据顺序(无论它是什么)使用先前的值,那么您的查询可能看起来是:
with preserve_the_order$ as (
select X.*,
rownum as original_order$
from test_table X
)
select X.col1,
nvl(last_value(case when col2 > 0 then col2 end) ignore nulls over (order by original_order$ rows between unbounded preceding and current row), col2) as col2,
nvl(last_value(case when col3 > 0 then col3 end) ignore nulls over (order by original_order$ rows between unbounded preceding and current row), col3) as col3,
nvl(last_value(case when col4 > 0 then col4 end) ignore nulls over (order by original_order$ rows between unbounded preceding and current row), col4) as col4
from preserve_the_order$ X
order by original_order$
;
Result:
结果:
COL1 COL2 COL3 COL4
---- ---------- ---------- ----------
A 0 1 5
B 0 4 5
C 2 4 5
D 2 4 5
E 3 5 5
F 3 3 5
G 3 3 1
A 0 1 5
E 3 5 5
#1
3
SELECT col1,
CASE col2 WHEN 0 THEN NVL( LAG( CASE col2 WHEN 0 THEN NULL ELSE col2 END ) IGNORE NULLS OVER ( ORDER BY NULL ), 0 ) ELSE col2 END AS col2,
CASE col3 WHEN 0 THEN NVL( LAG( CASE col3 WHEN 0 THEN NULL ELSE col3 END ) IGNORE NULLS OVER ( ORDER BY NULL ), 0 ) ELSE col3 END AS col3,
CASE col4 WHEN 0 THEN NVL( LAG( CASE col4 WHEN 0 THEN NULL ELSE col4 END ) IGNORE NULLS OVER ( ORDER BY NULL ), 0 ) ELSE col4 END AS col4
FROM table_name;
Result:
结果:
COL1 COL2 COL3 COL4
---- ---------- ---------- ----------
A 0 1 5
B 0 4 5
C 2 4 5
D 2 4 5
E 3 5 5
F 3 3 5
G 3 3 1
A 3 1 5
E 3 5 5
#2
6
I'm assuming an additional column col0
that contains an obvious ordering criteria for your data, as your col1
example data isn't really ordered correctly (repeated, trailing values of A
and E
).
我假设有一个额外的列col0,它包含了数据的一个明显的排序标准,因为您的col1示例数据并没有被正确地排序(重复的、末尾值为A和E)。
I love the MODEL
clause for these kinds of purposes. The following query yields the expected result:
出于这些目的,我喜欢模型条款。下面的查询会得到预期的结果:
WITH t(col0, col1, col2, col3, col4) AS (
SELECT 1, 'A', 0, 1, 5 FROM DUAL UNION ALL
SELECT 2, 'B', 0, 4, 0 FROM DUAL UNION ALL
SELECT 3, 'C', 2, 0, 0 FROM DUAL UNION ALL
SELECT 4, 'D', 0, 0, 0 FROM DUAL UNION ALL
SELECT 5, 'E', 3, 5, 0 FROM DUAL UNION ALL
SELECT 6, 'F', 0, 3, 0 FROM DUAL UNION ALL
SELECT 7, 'G', 0, 3, 1 FROM DUAL UNION ALL
SELECT 8, 'A', 0, 1, 5 FROM DUAL UNION ALL
SELECT 9, 'E', 3, 5, 0 FROM DUAL
)
SELECT * FROM t
MODEL
DIMENSION BY (row_number() OVER (ORDER BY col0) rn)
MEASURES (col1, col2, col3, col4)
RULES (
col2[any] = DECODE(col2[cv(rn)], 0, NVL(col2[cv(rn) - 1], 0), col2[cv(rn)]),
col3[any] = DECODE(col3[cv(rn)], 0, NVL(col3[cv(rn) - 1], 0), col3[cv(rn)]),
col4[any] = DECODE(col4[cv(rn)], 0, NVL(col4[cv(rn) - 1], 0), col4[cv(rn)])
)
Result:
结果:
RN COL1 COL2 COL3 COL4
1 A 0 1 5
2 B 0 4 5
3 C 2 4 5
4 D 2 4 5
5 E 3 5 5
6 F 3 3 5
7 G 3 3 1
8 A 3 1 5
9 E 3 5 5
SQLFiddle
A note on the MODEL clause vs. window function-based approaches
While the above looks cool (or scary, depending on your point of view), you should certainly prefer using a window function based appraoch as exposed by the other elegant answers by nop77svk (using LAST_VALUE() IGNORE NULLS
) or MT0 (using LAG() IGNORE NULLS
). I've explained these answers more in detail in this blog post.
虽然上面的内容看起来很酷(或者很吓人,取决于您的观点),但是您当然应该更喜欢使用基于appraoch的窗口函数,就像nop77svk(使用LAST_VALUE()忽略NULLS)或MT0(使用LAG()忽略NULLS)的其他优雅答案一样。我在这篇博文中更详细地解释了这些答案。
#3
4
Assuming you want the previous values as per your original data order (whatever that might be), then your query might look like:
假设您希望按照原始数据顺序(无论它是什么)使用先前的值,那么您的查询可能看起来是:
with preserve_the_order$ as (
select X.*,
rownum as original_order$
from test_table X
)
select X.col1,
nvl(last_value(case when col2 > 0 then col2 end) ignore nulls over (order by original_order$ rows between unbounded preceding and current row), col2) as col2,
nvl(last_value(case when col3 > 0 then col3 end) ignore nulls over (order by original_order$ rows between unbounded preceding and current row), col3) as col3,
nvl(last_value(case when col4 > 0 then col4 end) ignore nulls over (order by original_order$ rows between unbounded preceding and current row), col4) as col4
from preserve_the_order$ X
order by original_order$
;
Result:
结果:
COL1 COL2 COL3 COL4
---- ---------- ---------- ----------
A 0 1 5
B 0 4 5
C 2 4 5
D 2 4 5
E 3 5 5
F 3 3 5
G 3 3 1
A 0 1 5
E 3 5 5