I have a table of transactions. I need to exclude any transaction that falls within 15 minutes of the previous transaction for the same USERID.
Example:
USERID TRANS_TIME
----------------------------------------
00000001 24-FEB-17 15.13.51.713000000
00000001 16-MAR-17 10.10.20.781000000
00000001 16-MAR-17 10.10.32.659000000
00000001 16-MAR-17 10.13.04.070000000
00000001 16-MAR-17 10.13.49.339000000
00000001 16-MAR-17 10.22.33.467000000
00000001 16-MAR-17 10.23.09.755000000
00000001 16-MAR-17 10.25.51.994000000
00000001 16-MAR-17 10.26.08.130000000
00000001 29-MAR-17 10.23.01.665000000
So I would end up with 4 rows:
USERID TRANS_TIME
----------------------------------------
00000001 24-FEB-17 15.13.51.713000000
00000001 16-MAR-17 10.10.20.781000000
00000001 16-MAR-17 10.25.51.994000000
00000001 29-MAR-17 10.23.01.665000000
Any ideas or tips on how to code this? Ideally without creating a function or a procedure.
Cheers.
3 solutions
#1
I interpret your required logic as follows:
Separately for each userid, include the row with the earliest transaction time. Then, for each subsequent row in time order, check whether it is within 15 minutes (<=) of the most recently included row. If it is, exclude the row you are examining. If it is not, include it, and it becomes the new reference row.
In other words, there are 15-minute sessions. A row opens a new session if it does not already fall in a session opened by an earlier row. In this arrangement, as your desired output demonstrates, it is not enough to compare a row to the one immediately preceding it.
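The rule above can be sketched outside the database to make it concrete. This is only an illustration in Python, not part of the SQL solution; the function name session_starts and the SAMPLE list are made up for the sketch, and the fractional seconds from the question are truncated to microseconds.

```python
from datetime import datetime, timedelta

WINDOW = timedelta(minutes=15)

# Sample rows from the question for userid 00000001 (assumed already filtered to one user).
SAMPLE = [
    datetime(2017, 2, 24, 15, 13, 51, 713000),
    datetime(2017, 3, 16, 10, 10, 20, 781000),
    datetime(2017, 3, 16, 10, 10, 32, 659000),
    datetime(2017, 3, 16, 10, 13, 4, 70000),
    datetime(2017, 3, 16, 10, 13, 49, 339000),
    datetime(2017, 3, 16, 10, 22, 33, 467000),
    datetime(2017, 3, 16, 10, 23, 9, 755000),
    datetime(2017, 3, 16, 10, 25, 51, 994000),
    datetime(2017, 3, 16, 10, 26, 8, 130000),
    datetime(2017, 3, 29, 10, 23, 1, 665000),
]


def session_starts(times, window=WINDOW):
    """Return the timestamps that open a new session (the rows to keep)."""
    kept = []
    for t in sorted(times):
        # Keep a row only if it lies outside the window opened by the last kept row.
        if not kept or t > kept[-1] + window:
            kept.append(t)
    return kept


for t in session_starts(SAMPLE):
    print(t)
```

On the sample data this keeps 4 rows: 24-FEB 15.13.51, 16-MAR 10.10.20, 16-MAR 10.25.51 and 29-MAR 10.23.01. The 10.25.51 row survives because it is more than 15 minutes after 10.10.20, the row that opened the previous session.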
This problem can be solved very easily with the MATCH_RECOGNIZE clause in Oracle 12.1 and above. Alas, it is not available in Oracle 11 or earlier.
with
test_data ( userid, trans_time ) as (
select '00000001', to_timestamp('24-FEB-17 15.13.51.713000000', 'dd-MON-yy hh24.mi.ss.ff') from dual union all
select '00000001', to_timestamp('16-MAR-17 10.10.20.781000000', 'dd-MON-yy hh24.mi.ss.ff') from dual union all
select '00000001', to_timestamp('16-MAR-17 10.10.32.659000000', 'dd-MON-yy hh24.mi.ss.ff') from dual union all
select '00000001', to_timestamp('16-MAR-17 10.13.04.070000000', 'dd-MON-yy hh24.mi.ss.ff') from dual union all
select '00000001', to_timestamp('16-MAR-17 10.13.49.339000000', 'dd-MON-yy hh24.mi.ss.ff') from dual union all
select '00000001', to_timestamp('16-MAR-17 10.22.33.467000000', 'dd-MON-yy hh24.mi.ss.ff') from dual union all
select '00000001', to_timestamp('16-MAR-17 10.23.09.755000000', 'dd-MON-yy hh24.mi.ss.ff') from dual union all
select '00000001', to_timestamp('16-MAR-17 10.25.51.994000000', 'dd-MON-yy hh24.mi.ss.ff') from dual union all
select '00000001', to_timestamp('16-MAR-17 10.26.08.130000000', 'dd-MON-yy hh24.mi.ss.ff') from dual union all
select '00000001', to_timestamp('29-MAR-17 10.23.01.665000000', 'dd-MON-yy hh24.mi.ss.ff') from dual
)
-- End of test data (not part of the solution). SQL query begins below this line.
select userid, session_start as trans_time
from   test_data
match_recognize (
  partition by userid
  order by trans_time
  measures a.trans_time as session_start
  pattern ( a b* )
  define b as b.trans_time <= a.trans_time + interval '15' minute
)
order by userid, trans_time -- if needed
;
USERID TRANS_TIME
-------- ------------------------------
00000001 24-FEB-2017 15.13.51.713000000
00000001 16-MAR-2017 10.10.20.781000000
00000001 16-MAR-2017 10.25.51.994000000
00000001 29-MAR-2017 10.23.01.665000000
#2
Just use lag():
select t.*
from (select t.*,
             lag(trans_time) over (partition by userid order by trans_time) as prev_tt
      from t
     ) t
where prev_tt is null or
      trans_time > prev_tt + (15 / (24 * 60));
Note: You can write the where clause using interval notation instead. That is actually the better approach, because adding a number to a timestamp makes Oracle convert the timestamp to a DATE, which drops the fractional seconds:
where prev_tt is null or
trans_time > prev_tt + interval '15' minute;
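One caveat is worth checking before relying on this: the filter compares each row only with the row immediately before it. The Python sketch below is not part of the original answer. The function name keep_by_previous_row is made up for illustration, and fractional seconds are truncated to microseconds. It replays that previous-row rule on the sample data from the question.

```python
from datetime import datetime, timedelta

WINDOW = timedelta(minutes=15)

# Sample rows from the question for userid 00000001.
SAMPLE = [
    datetime(2017, 2, 24, 15, 13, 51, 713000),
    datetime(2017, 3, 16, 10, 10, 20, 781000),
    datetime(2017, 3, 16, 10, 10, 32, 659000),
    datetime(2017, 3, 16, 10, 13, 4, 70000),
    datetime(2017, 3, 16, 10, 13, 49, 339000),
    datetime(2017, 3, 16, 10, 22, 33, 467000),
    datetime(2017, 3, 16, 10, 23, 9, 755000),
    datetime(2017, 3, 16, 10, 25, 51, 994000),
    datetime(2017, 3, 16, 10, 26, 8, 130000),
    datetime(2017, 3, 29, 10, 23, 1, 665000),
]


def keep_by_previous_row(times, window=WINDOW):
    """Mimic the lag() filter: keep a row if it has no predecessor
    or is more than `window` after its immediate predecessor."""
    kept = []
    prev = None
    for t in sorted(times):
        if prev is None or t > prev + window:
            kept.append(t)
        prev = t  # the comparison row always advances, kept or not
    return kept


for t in keep_by_previous_row(SAMPLE):
    print(t)
```

Under this rule the 16-MAR 10.25.51 row is dropped, because it is less than 15 minutes after 10.23.09, so 3 rows survive rather than the 4 shown in the question's desired output. Whether that matters depends on which reading of "previous transaction" is intended.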
#3
With the same assumptions I made in my other answer (the one using the MATCH_RECOGNIZE clause), here is another way to solve the problem.
This solution uses recursive subquery factoring (a recursive CTE), and therefore works in Oracle 11.2 (but, unfortunately, not in earlier versions).
with
-- Begin test data (not part of the solution)
test_data ( userid, trans_time ) as (
[ select ...... SAME AS IN THE OTHER ANSWER ]
),
-- End of test data (not part of the solution). SQL query begins below this line.
  prep ( userid, trans_time, rn ) as (
    select userid, trans_time,
           row_number() over (partition by userid order by trans_time)
    from   test_data
  ),
  rec ( userid, trans_time, rn, session_start ) as (
    select userid, min(trans_time), 1, min(trans_time)
    from   prep
    group by userid
    union all
    select p.userid, p.trans_time, p.rn,
           case when p.trans_time > r.session_start + interval '15' minute
                then p.trans_time
                else r.session_start
           end
    from   prep p join rec r on p.userid = r.userid and p.rn = r.rn + 1
  )
select distinct userid, trans_time
from   rec
where  trans_time = session_start
order by userid, trans_time -- if needed
;
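To show what the recursion computes, here is a rough Python analogue: it is an illustration only, not a translation of the query that Oracle would run. Rows are partitioned by userid, and the current session start is carried forward one row at a time, the way rec carries session_start from row rn to row rn + 1. The function name session_start_rows and the sample rows are hypothetical.

```python
from datetime import datetime, timedelta
from itertools import accumulate, groupby
from operator import itemgetter

WINDOW = timedelta(minutes=15)


def session_start_rows(rows, window=WINDOW):
    """rows: iterable of (userid, trans_time). Return the rows that open a session."""
    result = []
    # set() plays the role of SELECT DISTINCT; sorting orders by userid, then time.
    ordered = sorted(set(rows))
    for userid, group in groupby(ordered, key=itemgetter(0)):
        times = [t for _, t in group]
        # Carry the session start forward: replace it only when a row falls
        # outside the current 15-minute window (same CASE logic as in rec).
        starts = accumulate(
            times, lambda start, t: t if t > start + window else start
        )
        # A row is a session start when it equals the value carried for it.
        result.extend((userid, t) for t, start in zip(times, starts) if t == start)
    return result


# Hypothetical rows for two users, for illustration only.
rows = [
    ("A", datetime(2017, 3, 16, 10, 0)),
    ("A", datetime(2017, 3, 16, 10, 10)),
    ("A", datetime(2017, 3, 16, 10, 16)),
    ("A", datetime(2017, 3, 16, 10, 20)),
    ("B", datetime(2017, 3, 16, 10, 5)),
    ("B", datetime(2017, 3, 16, 10, 30)),
]

for userid, t in session_start_rows(rows):
    print(userid, t)
```

For these rows, user A keeps 10:00 and 10:16, and user B keeps 10:05 and 10:30.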