Finding the intervals between dates in SQL Server

Time: 2020-12-04 01:29:07

I have a table with more than 5 million rows of sales transactions. I would like to find the sum of the date intervals between each customer's three most recent purchases.

Suppose my table looks like this:

CustomerID        ProductID             ServiceStartDate     ServiceExpiryDate     
   A                X1                     2010-01-01             2010-06-01
   A                X2                     2010-08-12             2010-12-30
   B                X4                     2011-10-01             2012-01-15
   B                X3                     2012-04-01             2012-06-01
   B                X7                     2012-08-01             2013-10-01
   A                X5                     2013-01-01             2015-06-01

The result I'm looking for might look like this:

CustomerID        IntervalDays 
    A                  802
    B                  135               

I know the query needs to first retrieve the 3 most recent transactions of each customer (based on ServiceStartDate) and then calculate the interval between the ExpiryDate of one transaction and the startDate of the next.
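
To try the queries in the answers against the sample rows, a minimal setup sketch follows. The table name tab and the column types are assumptions (the question does not state them); tab is simply the placeholder that answer #1 uses.

```sql
-- Hypothetical setup: table name and column types are assumed, not given in the question.
CREATE TABLE tab (
    CustomerID        char(1)     NOT NULL,
    ProductID         varchar(10) NOT NULL,
    ServiceStartDate  date        NOT NULL,
    ServiceExpiryDate date        NOT NULL
);

-- The six sample rows from the question.
INSERT INTO tab (CustomerID, ProductID, ServiceStartDate, ServiceExpiryDate)
VALUES ('A', 'X1', '2010-01-01', '2010-06-01'),
       ('A', 'X2', '2010-08-12', '2010-12-30'),
       ('B', 'X4', '2011-10-01', '2012-01-15'),
       ('B', 'X3', '2012-04-01', '2012-06-01'),
       ('B', 'X7', '2012-08-01', '2013-10-01'),
       ('A', 'X5', '2013-01-01', '2015-06-01');
```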

3 Solutions

#1


You want to calculate the difference between the previous row's ServiceExpiryDate and the current row's ServiceStartDate based on descending dates and then sum up the last two differences:

with cte as
 (
   select tab.*, 
      row_number()
      over (partition by customerId
            order by ServiceStartDate desc
                   , ServiceExpiryDate desc -- don't know if this 2nd column is necessary 
           ) as rn
   from tab
 ) 
select t2.customerId,
   sum(datediff(day, t1.ServiceExpiryDate, t2.ServiceStartDate)) as Intervaldays -- t1 is the previous purchase
   ,count(*) as purchases
from cte as t2 left join cte as t1
on t1.customerId = t2.customerId
and t1.rn = t2.rn+1     -- previous and current row
where t2.rn  <= 3       -- last three rows
group by t2.customerId;

Same result using LEAD:

with cte as
 (
   select tab.*, 
      row_number()
      over (partition by customerId
            order by ServiceStartDate desc) as rn
     ,lead(ServiceExpiryDate)
      over (partition by customerId
            order by ServiceStartDate desc
            ) as prevEnd
   from tab
 ) 
select customerId,
   sum(datediff(day, prevEnd, ServiceStartDate)) as Intervaldays
   ,count(*) as purchases
from cte 
where rn <= 3
group by customerId;

Neither query returns the expected result unless you subtract purchases (or max(rn)) from Intervaldays. But since you only sum two differences, that doesn't seem correct to me either...

Additional logic must be applied based on your rules regarding:

  • customers with fewer than 3 purchases
  • overlapping intervals
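
One possible way to cover both points, as a sketch only: keep the three most recent rows first so that gaps are measured only among them, and count an overlap as a gap of zero days. The table name tab follows the placeholder used above; treating overlaps as zero, and comparing each purchase only with the one immediately before it, are assumptions rather than rules stated in the question. LAG needs SQL Server 2012 or later, like LEAD.

```sql
WITH recent AS (
    -- number each customer's purchases, newest first
    SELECT CustomerID, ServiceStartDate, ServiceExpiryDate,
           ROW_NUMBER() OVER (PARTITION BY CustomerID
                              ORDER BY ServiceStartDate DESC) AS rn
    FROM tab
),
gaps AS (
    -- WHERE is applied before LAG is evaluated, so LAG only sees the
    -- three most recent rows; the oldest of them gets a NULL gap
    SELECT CustomerID,
           DATEDIFF(day,
                    LAG(ServiceExpiryDate) OVER (PARTITION BY CustomerID
                                                 ORDER BY ServiceStartDate),
                    ServiceStartDate) AS GapDays
    FROM recent
    WHERE rn <= 3
)
SELECT CustomerID,
       -- NULL (no earlier purchase) and negative (overlap) gaps count as 0
       SUM(CASE WHEN GapDays > 0 THEN GapDays ELSE 0 END) AS IntervalDays,
       COUNT(*) AS Purchases
FROM gaps
GROUP BY CustomerID;
```

On the six sample rows this sums the gaps 72 + 733 for A and 77 + 61 for B, that is 805 and 138; a customer with a single purchase gets 0.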

#2


Assuming there are no overlaps, I think you want this:

select customerId,
       sum(datediff(day, ServiceStartDate, ServiceExpiryDate)) as Intervaldays
from (select t.*, row_number() over (partition by customerId
                                     order by ServiceStartDate desc) as seqnum
      from tab t   -- replace tab with your table name
     ) t
where seqnum <= 3
group by customerId;

#3


Try this:

SELECT dt.CustomerID,
        SUM(DATEDIFF(DAY, dt.PrevExpiry, dt.ServiceStartDate)) As IntervalDays
FROM (
    SELECT *
            , ROW_NUMBER() OVER (PARTITION BY CustomerID ORDER BY ServiceStartDate DESC) AS rn
            , (SELECT Max(ti.ServiceExpiryDate) 
               FROM yourTable ti 
               WHERE t.CustomerID = ti.CustomerID 
                 AND ti.ServiceStartDate < t.ServiceStartDate) As PrevExpiry
    FROM yourTable t )dt
GROUP BY dt.CustomerID

Result will be:

CustomerId | IntervalDays
-----------+--------------
A          | 805
B          | 138
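
The totals can be checked by hand against the sample rows in the question. The sketch below is self-contained (no table needed) and lists each gap between one purchase's expiry date and the next purchase's start date; the derived-table alias and column names are illustrative.

```sql
SELECT g.CustomerID,
       g.PrevExpiry,
       g.NextStart,
       DATEDIFF(day, CAST(g.PrevExpiry AS date), CAST(g.NextStart AS date)) AS GapDays
FROM (VALUES
        ('A', '2010-06-01', '2010-08-12'),  -- X1 expiry -> X2 start
        ('A', '2010-12-30', '2013-01-01'),  -- X2 expiry -> X5 start
        ('B', '2012-01-15', '2012-04-01'),  -- X4 expiry -> X3 start
        ('B', '2012-06-01', '2012-08-01')   -- X3 expiry -> X7 start
     ) AS g (CustomerID, PrevExpiry, NextStart);
-- GapDays: 72 and 733 for A (total 805); 77 and 61 for B (total 138)
```

The 802 and 135 in the question are each 3 lower than these totals, which matches the remark in answer #1 about subtracting the number of purchases.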
