Ideas for improving a SAS SQL query

Time: 2021-04-01 03:53:38

This is based on a question I answered here about how to use SAS to solve the following problem. [NOTE: There is a working SAS data step solution, but I was just trying to see how it would work in SQL.]

The problem is a little hard to describe. For each of a customer's transactions, look back 90 days from that transaction's date and save two numbers with the original transaction: the total number of the customer's transactions in that 90-day window, and the number of distinct account managers who handled the customer's transactions in that window.
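
To make the window concrete, here is a minimal sketch of the date arithmetic for one pair of transactions. It assumes that "within the window" means the earlier transaction's calendar date is at most 90 days before the current one, ignoring time of day. The two timestamps are the first and last rows for customer 1111111111 in the test data below; the variable names (earlier, current, days_back, in_window) are made up for illustration.

```sas
data _null_;
    /* First and last transactions of customer 1111111111 in the test data */
    earlier = '08SEP2016:11:19:25'dt;
    current = '25NOV2016:13:04:16'dt;

    /* datepart() drops the time of day and returns a SAS date (a day count) */
    days_back = datepart(current) - datepart(earlier);

    /* 1 if the earlier transaction falls inside the 90-day look-back window */
    in_window = (0 <= days_back <= 90);

    put days_back= in_window=;
run;
```

The log should show days_back=78 and in_window=1, so even the oldest transaction for this customer still counts toward the newest one.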

Here is the data step to initialize a test transaction dataset and my SAS SQL solution:

data transaction;
    length customerid $ 12 accountmanager $7 transactionid $ 12;
    /* The colon modifier reads the datetime as a blank-delimited token */
    input CustomerID AccountManager TransactionID Transaction_Time :datetime.;
    format transaction_time datetime.;
    datalines;
1111111111  FA001          TR2016001      08SEP16:11:19:25
1111111111  FA001          TR2016002      26OCT16:08:22:49
1111111111  FA002          TR2016003      04NOV16:08:05:36
1111111111  FA003          TR2016004      04NOV16:17:15:52
1111111111  FA004          TR2016005      25NOV16:13:04:16
1231231234  FA005          TR2016006      25AUG15:08:03:29
1231231234  FA005          TR2016007      16SEP15:08:24:24
1231231234  FA008          TR2016008      18SEP15:14:42:29
;;;;
run;

proc sql;
    create table want as
        select mgrs.*, trans.tranct
        from
            (select t3.customerid, t3.accountmanager, t3.transactionid, t3.transaction_time, count(*) as mgrct
                from (select distinct t1.*, t2.accountmanager as m2
                      from transaction t1, transaction t2
                      where t1.customerid=t2.customerid
                            and
                          datepart(t2.transaction_time) >= datepart(t1.transaction_time)-90
                            and
                          t2.transaction_time <= t1.transaction_time) t3
                group by t3.customerid, t3.accountmanager, t3.transactionid, t3.transaction_time) mgrs,
            (select t4.customerid, t4.transactionid, count(*) as tranct
                from transaction t4, transaction t5
                where t4.customerid=t5.customerid
                        and
                      datepart(t5.transaction_time) >= datepart(t4.transaction_time)-90
                        and
                      t5.transaction_time <= t4.transaction_time
                group by t4.customerid, t4.transactionid) trans
        where mgrs.customerid=trans.customerid and mgrs.transactionid=trans.transactionid;
quit;

The result looks like this:

customerid  accountmanager transactionid  Transaction_Time mgrct tranct
1111111111  FA001          TR2016001      08SEP16:11:19:25   1     1 
1111111111  FA001          TR2016002      26OCT16:08:22:49   1     2
1111111111  FA002          TR2016003      04NOV16:08:05:36   2     3
1111111111  FA003          TR2016004      04NOV16:17:15:52   3     4
1111111111  FA004          TR2016005      25NOV16:13:04:16   4     5
1231231234  FA005          TR2016006      25AUG15:08:03:29   1     1
1231231234  FA005          TR2016007      16SEP15:08:24:24   1     2
1231231234  FA008          TR2016008      18SEP15:14:42:29   2     3

I haven't used SQL for a long time, so I would like to know if there is a more elegant SQL solution. I thought it would be simpler, but actually the SAS data step code seems simpler than this SQL query.

1 Answer

I think you just need to join the table with itself.

proc sql noprint ;
 create table want as
   select a.*
        , count(distinct b.accountmanager) as mgrct
        , count(*) as tranct
   from transaction a
   left join transaction b
   on a.customerid = b.customerid
    and b.transaction_time <= a.transaction_time
    and datepart(a.transaction_time)-datepart(b.transaction_time)
        between 0 and 90
   group by 1,2,3,4
 ;
quit;

Results

1111111111 FA001 TR2016001 08SEP16:11:19:25 1 1  
1111111111 FA001 TR2016002 26OCT16:08:22:49 1 2  
1111111111 FA002 TR2016003 04NOV16:08:05:36 2 3  
1111111111 FA003 TR2016004 04NOV16:17:15:52 3 4  
1111111111 FA004 TR2016005 25NOV16:13:04:16 4 5  
1231231234 FA005 TR2016006 25AUG15:08:03:29 1 1  
1231231234 FA005 TR2016007 16SEP15:08:24:24 1 2  
1231231234 FA008 TR2016008 18SEP15:14:42:29 2 3  
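
One thing the sample data cannot show is the window actually excluding anything, because every customer's transactions fall within 90 days of each other. The following is a rough sketch of a check for that case, using the same self-join. The table transaction_gap, its three rows, and the output table want_gap are invented for this example and are not part of the original question.

```sas
/* Hypothetical test table: the third transaction is more than 90 days
   after the other two, so it should not pick them up */
data transaction_gap;
    length customerid $ 12 accountmanager $ 7 transactionid $ 12;
    input customerid accountmanager transactionid transaction_time :datetime20.;
    format transaction_time datetime.;
    datalines;
9999999999 FA001 TR0001 01JAN2016:09:00:00
9999999999 FA002 TR0002 15FEB2016:09:00:00
9999999999 FA003 TR0003 01JUN2016:09:00:00
;
run;

proc sql;
    create table want_gap as
    select a.*
         , count(distinct b.accountmanager) as mgrct
         , count(*) as tranct
    from transaction_gap a
    left join transaction_gap b
      on a.customerid = b.customerid
     and b.transaction_time <= a.transaction_time
     /* keep only rows of b dated 0 to 90 days before the row of a */
     and datepart(a.transaction_time) - datepart(b.transaction_time)
         between 0 and 90
    group by 1,2,3,4
    ;
quit;
```

If the join behaves as intended, TR0002 should get mgrct=2 and tranct=2 (it is 45 days after TR0001), while TR0003 should get 1 and 1, since it is 107 days after TR0002 and each row always matches itself.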
