SQL: selecting lapsed customers at a 30-day frequency

Time: 2021-04-03 22:59:54

The goal is to select the count of distinct customer_ids who have not made a purchase in the rolling 30-day period prior to each day of calendar year 2016. I have created a calendar table in my database to join to.

Here is an example table for reference. Say customer orders are normalized as follows:

+-------------+------------+----------+
| customer_id | date       | order_id |
+-------------+------------+----------+
| 123         | 01/25/2016 | 1000     |
+-------------+------------+----------+
| 123         | 04/27/2016 | 1025     |
+-------------+------------+----------+
| 444         | 02/02/2016 | 1010     |
+-------------+------------+----------+
| 521         | 01/23/2016 | 998      |
+-------------+------------+----------+
| 521         | 01/24/2016 | 999      |
+-------------+------------+----------+  

The goal output is effectively a calendar with one row for every day of 2016 and, for each day, a count of how many customers "lapsed" on that day, meaning their last purchase was 30 days or more before that day. The final output will look like this:

+------------+--------------+
| date       | lapsed_count |
+------------+--------------+
| 01/01/2016 | 0            |
+------------+--------------+
| 01/02/2016 | 0            |
+------------+--------------+
| ...        | ...          |
+------------+--------------+
| 03/01/2016 | 12           |
+------------+--------------+
| 03/02/2016 | 9            |
+------------+--------------+
| 03/03/2016 | 7            |
+------------+--------------+  

No data exists for 2015, so Jan-01-2016 cannot have any lapsed customers: it is the first day on which a purchase could have been made.

Customer_id #123 purchased on 01/25/2016 and 04/27/2016. They should have 2 lapse counts because their purchases are more than 30 days apart: one lapse on 02/24/2016 and another on 05/27/2016.
Customer_id #444 purchased only once, so they should have one lapse count, 30 days after 02/02/2016, on 03/03/2016.
Customer_id #521 is tricky: since they purchased with a frequency of 1 day, we will not count the first purchase on 01/23/2016, so there is only one lapse, starting from their last purchase on 01/24/2016. The count for that lapse will occur on 02/23/2016 (+30 days).
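
The lapse rule can be checked by hand against the rows of the sample orders table. The sketch below is in Python rather than SQL only so the date arithmetic is easy to run outside the database. The helper name `lapse_dates` and the constant `WINDOW_DAYS` are illustrative and not part of the question. It assumes a purchase produces a lapse 30 days later unless the next purchase follows within 30 days, and that a customer's last purchase always produces one.

```python
from datetime import date, timedelta

WINDOW_DAYS = 30  # lapse threshold taken from the question

# Rows of the sample orders table: (customer_id, order date).
orders = [
    (123, date(2016, 1, 25)),
    (123, date(2016, 4, 27)),
    (444, date(2016, 2, 2)),
    (521, date(2016, 1, 23)),
    (521, date(2016, 1, 24)),
]

def lapse_dates(order_dates):
    """Return the dates on which one customer lapses.

    Assumption: a purchase lapses 30 days later unless the next
    purchase follows within 30 days; the last purchase always lapses.
    """
    ordered = sorted(order_dates)
    lapses = []
    for i, current in enumerate(ordered):
        is_last = i == len(ordered) - 1
        if is_last or (ordered[i + 1] - current).days > WINDOW_DAYS:
            lapses.append(current + timedelta(days=WINDOW_DAYS))
    return lapses

# Group the order dates per customer.
by_customer = {}
for customer_id, order_date in orders:
    by_customer.setdefault(customer_id, []).append(order_date)

for customer_id, dates in sorted(by_customer.items()):
    print(customer_id, [d.isoformat() for d in lapse_dates(dates)])
```

With the table's rows this prints 2016-02-24 and 2016-05-27 for customer 123, 2016-03-03 for customer 444 (February 2016 has 29 days), and 2016-02-23 for customer 521.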

3 solutions

#1


1  

Apologies, I didn't read your question properly the first time around. This query will give you all the lapses you have. It takes each order and uses an analytic function to work out the next order date; if the gap is greater than 30 days, a lapse is recorded.

WITH
 cust_orders (customer_id , order_date , order_id   )
 AS
  (SELECT 1, TO_DATE('01/01/2016','DD/MM/YYYY'), 1001 FROM dual UNION ALL
   SELECT 1, TO_DATE('29/01/2016','DD/MM/YYYY'), 1002 FROM dual UNION ALL
   SELECT 1, TO_DATE('01/03/2016','DD/MM/YYYY'), 1003 FROM dual UNION ALL
   SELECT 2, TO_DATE('01/01/2016','DD/MM/YYYY'), 1004 FROM dual UNION ALL
   SELECT 2, TO_DATE('29/01/2016','DD/MM/YYYY'), 1005 FROM dual UNION ALL
   SELECT 2, TO_DATE('01/04/2016','DD/MM/YYYY'), 1006 FROM dual UNION ALL
   SELECT 2, TO_DATE('01/06/2016','DD/MM/YYYY'), 1007 FROM dual UNION ALL
   SELECT 2, TO_DATE('01/08/2016','DD/MM/YYYY'), 1008 FROM dual UNION ALL
   SELECT 3, TO_DATE('01/09/2016','DD/MM/YYYY'), 1009 FROM dual UNION ALL
   SELECT 3, TO_DATE('01/12/2016','DD/MM/YYYY'), 1010 FROM dual UNION ALL
   SELECT 3, TO_DATE('02/12/2016','DD/MM/YYYY'), 1011 FROM dual UNION ALL
   SELECT 3, TO_DATE('03/12/2016','DD/MM/YYYY'), 1012 FROM dual UNION ALL
   SELECT 3, TO_DATE('04/12/2016','DD/MM/YYYY'), 1013 FROM dual UNION ALL
   SELECT 3, TO_DATE('05/12/2016','DD/MM/YYYY'), 1014 FROM dual UNION ALL
   SELECT 3, TO_DATE('06/12/2016','DD/MM/YYYY'), 1015 FROM dual UNION ALL
   SELECT 3, TO_DATE('07/12/2016','DD/MM/YYYY'), 1016 FROM dual 
  )
SELECT
 customer_id
,order_date
,order_id
,next_order_date
,order_date + 30   lapse_date
FROM
 (SELECT
   customer_id
  ,order_date
  ,order_id
  ,LEAD(order_date) OVER (PARTITION BY customer_id ORDER BY order_date) next_order_date
  FROM
   cust_orders
 )
WHERE NVL(next_order_date,sysdate) - order_date > 30
;
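
To make the LEAD/NVL logic easier to follow, here is a small Python sketch of the same idea (Python is used only so it can be run without a database). The function name `lapses_via_lead` and the `today` parameter, which stands in for `sysdate`, are assumptions made for illustration.

```python
from datetime import date, timedelta

def lapses_via_lead(order_dates, today):
    """Mimic LEAD(order_date) plus NVL(next_order_date, sysdate).

    Returns (order_date, next_order_date, lapse_date) tuples for every
    order whose gap to the next order (or to `today`) exceeds 30 days.
    """
    ordered = sorted(order_dates)
    # LEAD: pair each order with the following one; the last gets None.
    next_dates = ordered[1:] + [None]
    rows = []
    for order_date, next_date in zip(ordered, next_dates):
        compare_to = next_date if next_date is not None else today  # NVL
        if (compare_to - order_date).days > 30:
            rows.append((order_date, next_date, order_date + timedelta(days=30)))
    return rows

# Customer 1 from the sample data in the query above.
customer_1 = [date(2016, 1, 1), date(2016, 1, 29), date(2016, 3, 1)]
for row in lapses_via_lead(customer_1, today=date(2017, 1, 1)):
    print(row)
```

For customer 1 the first gap is 28 days, so no lapse; the second gap is 32 days, giving a lapse on 2016-02-28; the last order has no successor, so it is compared with `today` and lapses on 2016-03-31 once more than 30 days have passed.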

Now join that to a set of dates and run a COUNT function (enter the year parameter as YYYY):

WITH
 cust_orders (customer_id , order_date , order_id   )
 AS
  (SELECT 1, TO_DATE('01/01/2016','DD/MM/YYYY'), 1001 FROM dual UNION ALL
   SELECT 1, TO_DATE('29/01/2016','DD/MM/YYYY'), 1002 FROM dual UNION ALL
   SELECT 1, TO_DATE('01/03/2016','DD/MM/YYYY'), 1003 FROM dual UNION ALL
   SELECT 2, TO_DATE('01/01/2016','DD/MM/YYYY'), 1004 FROM dual UNION ALL
   SELECT 2, TO_DATE('29/01/2016','DD/MM/YYYY'), 1005 FROM dual UNION ALL
   SELECT 2, TO_DATE('01/04/2016','DD/MM/YYYY'), 1006 FROM dual UNION ALL
   SELECT 2, TO_DATE('01/06/2016','DD/MM/YYYY'), 1007 FROM dual UNION ALL
   SELECT 2, TO_DATE('01/08/2016','DD/MM/YYYY'), 1008 FROM dual UNION ALL
   SELECT 3, TO_DATE('01/09/2016','DD/MM/YYYY'), 1009 FROM dual UNION ALL
   SELECT 3, TO_DATE('01/12/2016','DD/MM/YYYY'), 1010 FROM dual UNION ALL
   SELECT 3, TO_DATE('02/12/2016','DD/MM/YYYY'), 1011 FROM dual UNION ALL
   SELECT 3, TO_DATE('03/12/2016','DD/MM/YYYY'), 1012 FROM dual UNION ALL
   SELECT 3, TO_DATE('04/12/2016','DD/MM/YYYY'), 1013 FROM dual UNION ALL
   SELECT 3, TO_DATE('05/12/2016','DD/MM/YYYY'), 1014 FROM dual UNION ALL
   SELECT 3, TO_DATE('06/12/2016','DD/MM/YYYY'), 1015 FROM dual UNION ALL
   SELECT 3, TO_DATE('07/12/2016','DD/MM/YYYY'), 1016 FROM dual 
  )
,calendar (date_value)
 AS
 (SELECT TO_DATE('01/01/'||:P_year,'DD/MM/YYYY') + (rownum -1) 
  FROM all_tables
  WHERE rownum < (TO_DATE('31/12/'||:P_year,'DD/MM/YYYY') - TO_DATE('01/01/'||:P_year,'DD/MM/YYYY')) + 2
 )
SELECT
 calendar.date_value
,COUNT(*)
FROM
 (
  SELECT
   customer_id
  ,order_date
  ,order_id
  ,next_order_date
  ,order_date + 30   lapse_date
  FROM
   (SELECT
     customer_id
    ,order_date
    ,order_id
    ,LEAD(order_date) OVER (PARTITION BY customer_id ORDER BY order_date) next_order_date
    FROM
     cust_orders
   )
  WHERE NVL(next_order_date,sysdate) - order_date > 30
 )  lapses
,calendar
WHERE 1=1
AND calendar.date_value = TRUNC(lapses.lapse_date)
GROUP BY
 calendar.date_value
;
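
The `rownum < ... + 2` bound in the calendar subquery is easy to misread, so here is the same arithmetic as a Python sketch. The function name `calendar_for_year` is made up for illustration. Note that the SQL version can only return as many rows as `all_tables` exposes, so it presumably relies on that view having at least 366 visible rows.

```python
from datetime import date, timedelta

def calendar_for_year(year):
    """Build one date per day of `year`, the way the calendar subquery does."""
    first = date(year, 1, 1)
    last = date(year, 12, 31)
    # The subquery keeps rows with rownum < (last - first) + 2,
    # i.e. rownum runs from 1 to (last - first) + 1.
    row_limit = (last - first).days + 2
    return [first + timedelta(days=rownum - 1) for rownum in range(1, row_limit)]

days_2016 = calendar_for_year(2016)
print(len(days_2016), days_2016[0], days_2016[-1])
```

For 2016 the difference between 31 December and 1 January is 365 days, so rownum runs from 1 to 366 and the calendar covers the whole leap year.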

Or, if you really want every date printed out, use this:

WITH
 cust_orders (customer_id , order_date , order_id   )
 AS
  (SELECT 1, TO_DATE('01/01/2016','DD/MM/YYYY'), 1001 FROM dual UNION ALL
   SELECT 1, TO_DATE('29/01/2016','DD/MM/YYYY'), 1002 FROM dual UNION ALL
   SELECT 1, TO_DATE('01/03/2016','DD/MM/YYYY'), 1003 FROM dual UNION ALL
   SELECT 2, TO_DATE('01/01/2016','DD/MM/YYYY'), 1004 FROM dual UNION ALL
   SELECT 2, TO_DATE('29/01/2016','DD/MM/YYYY'), 1005 FROM dual UNION ALL
   SELECT 2, TO_DATE('01/04/2016','DD/MM/YYYY'), 1006 FROM dual UNION ALL
   SELECT 2, TO_DATE('01/06/2016','DD/MM/YYYY'), 1007 FROM dual UNION ALL
   SELECT 2, TO_DATE('01/08/2016','DD/MM/YYYY'), 1008 FROM dual UNION ALL
   SELECT 3, TO_DATE('01/09/2016','DD/MM/YYYY'), 1009 FROM dual UNION ALL
   SELECT 3, TO_DATE('01/12/2016','DD/MM/YYYY'), 1010 FROM dual UNION ALL
   SELECT 3, TO_DATE('02/12/2016','DD/MM/YYYY'), 1011 FROM dual UNION ALL
   SELECT 3, TO_DATE('03/12/2016','DD/MM/YYYY'), 1012 FROM dual UNION ALL
   SELECT 3, TO_DATE('04/12/2016','DD/MM/YYYY'), 1013 FROM dual UNION ALL
   SELECT 3, TO_DATE('05/12/2016','DD/MM/YYYY'), 1014 FROM dual UNION ALL
   SELECT 3, TO_DATE('06/12/2016','DD/MM/YYYY'), 1015 FROM dual UNION ALL
   SELECT 3, TO_DATE('07/12/2016','DD/MM/YYYY'), 1016 FROM dual 
  )
,lapses
 AS
  (SELECT
    customer_id
   ,order_date
   ,order_id
   ,next_order_date
   ,order_date + 30   lapse_date
   FROM
    (SELECT
      customer_id
     ,order_date
     ,order_id
     ,LEAD(order_date) OVER (PARTITION BY customer_id ORDER BY order_date) next_order_date
     FROM
      cust_orders
    )
   WHERE NVL(next_order_date,sysdate) - order_date > 30
  )  
,calendar (date_value)
 AS
 (SELECT TO_DATE('01/01/'||:P_year,'DD/MM/YYYY') + (rownum -1) 
  FROM all_tables
  WHERE rownum < (TO_DATE('31/12/'||:P_year,'DD/MM/YYYY') - TO_DATE('01/01/'||:P_year,'DD/MM/YYYY')) + 2
 )
SELECT
 calendar.date_value
,(SELECT COUNT(*)
  FROM lapses
  WHERE calendar.date_value = lapses.lapse_date
 )
FROM
 calendar
WHERE 1=1
ORDER BY
 calendar.date_value
;
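
The difference between the last two queries is only in which calendar rows survive. A rough Python sketch of that contrast follows; the lapse dates in it are hypothetical values chosen for illustration, not output of the queries above.

```python
from collections import Counter
from datetime import date, timedelta

# Hypothetical lapse dates, one entry per lapse event.
lapse_dates = [date(2016, 2, 28), date(2016, 3, 31), date(2016, 3, 31)]
calendar = [date(2016, 1, 1) + timedelta(days=i) for i in range(366)]

per_day = Counter(lapse_dates)

# Join + GROUP BY: only days that have at least one lapse appear.
sparse = [(day, per_day[day]) for day in calendar if day in per_day]

# Scalar subquery per calendar row: every day appears, with 0 where
# nothing lapsed.
dense = [(day, per_day.get(day, 0)) for day in calendar]

print(len(sparse), len(dense))
```

The joined version returns 2 rows here, while the scalar-subquery version returns all 366 days, most of them with a count of 0.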

#2


2  

If you have a table of dates, here is one expensive method:

select oc.date,
       sum(case when oc.prev_date < oc.date - 30 then 1 else 0 end) as lapsed
from (-- one row per calendar date and customer, with that customer's
      -- latest order on or before the date (null if none yet)
      select c.date, cu.customer_id, max(o.date) as prev_date
      from calendar c cross join
           (select distinct customer_id from orders) cu left join
           orders o
           on o.date <= c.date and o.customer_id = cu.customer_id
      group by c.date, cu.customer_id
     ) oc
group by oc.date;

For each date/customer pair, it determines the latest purchase the customer made on or before that date. It then uses this information to count the lapsed customers.
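
A short Python sketch of the same per-date logic, using the rows of the question's sample table, may help. The function name `lapsed_on` is illustrative. As far as I can tell, this approach counts every customer who is in a lapsed state on a given day, rather than only those who lapsed on that day.

```python
from datetime import date, timedelta

def lapsed_on(day, orders):
    """Count customers whose latest order on or before `day` is more
    than 30 days old. Customers with no order yet are not counted,
    matching the SQL, where a NULL prev_date fails the comparison."""
    latest = {}
    for customer_id, order_date in orders:
        if order_date <= day:
            latest[customer_id] = max(order_date, latest.get(customer_id, order_date))
    return sum(1 for prev in latest.values() if prev < day - timedelta(days=30))

# Rows of the sample orders table: (customer_id, order date).
orders = [
    (123, date(2016, 1, 25)),
    (123, date(2016, 4, 27)),
    (444, date(2016, 2, 2)),
    (521, date(2016, 1, 23)),
    (521, date(2016, 1, 24)),
]
for day in (date(2016, 2, 1), date(2016, 3, 1), date(2016, 5, 1)):
    print(day, lapsed_on(day, orders))
```

On 2016-02-01 nobody is lapsed; on 2016-03-01 customers 123 and 521 are; on 2016-05-01 customers 444 and 521 are, since 123 bought again on 04/27.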

To be honest, this will probably work well on a handful of dates, but not for a full year's worth.

#3


1  

Here's how I'd do it:

WITH your_table AS (SELECT 123 customer_id, to_date('24/01/2016', 'dd/mm/yyyy') order_date, 12345 order_id FROM dual UNION ALL
                    SELECT 123 customer_id, to_date('24/01/2016', 'dd/mm/yyyy') order_date, 12346 order_id FROM dual UNION ALL
                    SELECT 123 customer_id, to_date('25/01/2016', 'dd/mm/yyyy') order_date, 12347 order_id FROM dual UNION ALL
                    SELECT 123 customer_id, to_date('24/02/2016', 'dd/mm/yyyy') order_date, 12347 order_id FROM dual UNION ALL
                    SELECT 123 customer_id, to_date('16/03/2016', 'dd/mm/yyyy') order_date, 12348 order_id FROM dual UNION ALL
                    SELECT 123 customer_id, to_date('18/04/2016', 'dd/mm/yyyy') order_date, 12349 order_id FROM dual UNION ALL
                    SELECT 456 customer_id, to_date('20/02/2016', 'dd/mm/yyyy') order_date, 12350 order_id FROM dual UNION ALL
                    SELECT 456 customer_id, to_date('01/03/2016', 'dd/mm/yyyy') order_date, 12351 order_id FROM dual UNION ALL
                    SELECT 456 customer_id, to_date('03/03/2016', 'dd/mm/yyyy') order_date, 12352 order_id FROM dual UNION ALL
                    SELECT 456 customer_id, to_date('18/04/2016', 'dd/mm/yyyy') order_date, 12353 order_id FROM dual UNION ALL
                    SELECT 456 customer_id, to_date('20/05/2016', 'dd/mm/yyyy') order_date, 12354 order_id FROM dual UNION ALL
                    SELECT 456 customer_id, to_date('23/06/2016', 'dd/mm/yyyy') order_date, 12355 order_id FROM dual UNION ALL
                    SELECT 456 customer_id, to_date('19/01/2017', 'dd/mm/yyyy') order_date, 12356 order_id FROM dual),
-- end of mimicking your_table with data in it
    lapsed_info AS (SELECT customer_id,
                           order_date,
                           CASE WHEN TRUNC(SYSDATE) - order_date <= 30 THEN NULL
                                WHEN COUNT(*) OVER (PARTITION BY customer_id ORDER BY order_date RANGE BETWEEN 1 FOLLOWING AND 30 FOLLOWING) = 0 THEN order_date+30
                                ELSE NULL
                           END lapsed_date
                    FROM   your_table),
          dates AS (SELECT to_date('01/01/2016', 'dd/mm/yyyy') + LEVEL -1 dt
                    FROM   dual
                    CONNECT BY to_date('01/01/2016', 'dd/mm/yyyy') + LEVEL -1 <= TRUNC(SYSDATE))
SELECT dates.dt,
       COUNT(li.lapsed_date) lapsed_count
FROM   dates
       LEFT OUTER JOIN lapsed_info li ON dates.dt = li.lapsed_date
GROUP BY dates.dt
ORDER BY dates.dt;

Results:

DT         LAPSED_COUNT
---------- ------------
01/01/2016            0
<snip>
23/01/2016            0
24/01/2016            0
25/01/2016            0
26/01/2016            0
<snip>
19/02/2016            0
20/02/2016            0
21/02/2016            0
22/02/2016            0
23/02/2016            0
24/02/2016            1
25/02/2016            0
<snip>
29/02/2016            0
01/03/2016            0
02/03/2016            0
03/03/2016            0
04/03/2016            0
<snip>
15/03/2016            0
16/03/2016            0
17/03/2016            0
<snip>
20/03/2016            0
21/03/2016            0
22/03/2016            0
<snip>
30/03/2016            0
31/03/2016            0
01/04/2016            0
02/04/2016            1
03/04/2016            0
<snip>
14/04/2016            0
15/04/2016            1
16/04/2016            0
17/04/2016            0
18/04/2016            0
19/04/2016            0
<snip>
17/05/2016            0
18/05/2016            2
19/05/2016            0
20/05/2016            0
21/05/2016            0
<snip>
18/06/2016            0
19/06/2016            1
20/06/2016            0
21/06/2016            0
22/06/2016            0
23/06/2016            0
24/06/2016            0
<snip>
22/07/2016            0
23/07/2016            1
24/07/2016            0
<snip>
18/01/2017            0
19/01/2017            0
20/01/2017            0
<snip>
08/02/2017            0

This takes your data and uses the analytic count function to work out the number of rows that have a date within the 30 days following (but excluding) the current row's date.

Then we apply a case expression: if the row's date is within 30 days of today's date, we treat it as not lapsed (yet). If a count of 0 was returned, the row is considered lapsed and we output the lapsed date as the order_date plus 30 days. Any other count means the row has not lapsed.
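
The window count and the case expression can be mimicked row by row in Python, which makes the three branches explicit. This is only a sketch: the function name `lapsed_date` is made up, and the run date of 08/02/2017 is an assumption inferred from the last date in the results.

```python
from datetime import date, timedelta

def lapsed_date(order_date, customer_dates, today):
    """Mirror the CASE expression in lapsed_info for a single order row."""
    if (today - order_date).days <= 30:
        return None  # too recent to judge yet
    # COUNT(*) OVER (... RANGE BETWEEN 1 FOLLOWING AND 30 FOLLOWING)
    following = sum(
        1 for other in customer_dates
        if 1 <= (other - order_date).days <= 30
    )
    return order_date + timedelta(days=30) if following == 0 else None

# Customer 456 from the sample data above.
dates_456 = [date(2016, 2, 20), date(2016, 3, 1), date(2016, 3, 3),
             date(2016, 4, 18), date(2016, 5, 20), date(2016, 6, 23),
             date(2017, 1, 19)]
today = date(2017, 2, 8)  # assumed run date
for d in dates_456:
    print(d, lapsed_date(d, dates_456, today))
```

For customer 456 this yields lapse dates of 2016-04-02, 2016-05-18, 2016-06-19 and 2016-07-23; the order on 2017-01-19 is within 30 days of the assumed run date and returns nothing.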

The above is all worked out in the lapsed_info subquery.

Then all we need to do is list the dates (see the dates subquery) and outer join the lapsed_info subquery to it based on the lapsed_date and then do a count of the lapsed dates for each day.
