How to group by date range?

Time: 2022-02-02 16:49:15

I have three fields: id, date, and treatment. There are three types of treatment: cold, fever, and cholera. Assume there are 1000 patients, and the first patient's data looks like this:

pt    treatment_date   treatment
A     05-05-2017          Cold
A     05-07-2017          Cold
A     05-09-2017          Fever
A     05-13-2017          Fever
A     05-15-2017          Cholera
A     05-17-2017          Cholera
A     05-19-2017          Cold
A     05-21-2017          Cold
A     05-23-2017          Fever

I need my output to look like this:

pt    start_date   end_date     treatment   Number_of_days   Conversion_date   Days_before_cholera
A     05-05-2017   05-07-2017   Cold        2                0                 0
A     05-09-2017   05-13-2017   Fever       4                0                 0
A     05-15-2017   05-17-2017   Cholera     2                05-13-2017        2
A     05-19-2017   05-21-2017   Cold        2                0                 0
A     05-23-2017   05-23-2017   Fever       1                0                 0

(Days_before_cholera = start date of cholera minus end date of the treatment immediately before it.)

And so on for all patient IDs.

2 solutions

#1


This is a "gaps-and-islands" problem. I'll show how to handle the calculation of the rows; you can fill in the additional columns.

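One way to solve it is to use the difference of row numbers. Within a run of consecutive rows that share a treatment, the per-patient row number and the per-patient, per-treatment row number both increase by one at each row, so their difference stays constant for the whole run: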

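select pt, treatment, min(treatment_date), max(treatment_date), . . .
from (select t.*,
             row_number() over (partition by pt order by treatment_date) as seqnum_p,
             row_number() over (partition by pt, treatment order by treatment_date) as seqnum_ptt
      from t
     ) t
-- treatment must be part of the grouping key: two runs of different
-- treatments can share the same row-number difference
group by pt, treatment, (seqnum_p - seqnum_ptt);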
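
To make the row-number trick concrete, here is a minimal runnable sketch that loads the sample rows from the question into an in-memory SQLite database through Python's sqlite3 module. Several things here are assumptions rather than part of the answer: the dates are rewritten as ISO strings so they sort chronologically, window functions need SQLite 3.25 or newer, the CTE names (numbered, islands, with_prev) are made up for illustration, and the conversion columns are left NULL instead of 0 for non-cholera rows. Number_of_days is omitted because the question does not define it consistently.

```python
import sqlite3

# Sample rows from the question; ISO date strings are an assumption so that
# text ordering in SQLite matches chronological ordering.
rows = [
    ("A", "2017-05-05", "Cold"),
    ("A", "2017-05-07", "Cold"),
    ("A", "2017-05-09", "Fever"),
    ("A", "2017-05-13", "Fever"),
    ("A", "2017-05-15", "Cholera"),
    ("A", "2017-05-17", "Cholera"),
    ("A", "2017-05-19", "Cold"),
    ("A", "2017-05-21", "Cold"),
    ("A", "2017-05-23", "Fever"),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (pt TEXT, treatment_date TEXT, treatment TEXT)")
conn.executemany("INSERT INTO t VALUES (?, ?, ?)", rows)

query = """
WITH numbered AS (
    SELECT t.*,
           ROW_NUMBER() OVER (PARTITION BY pt ORDER BY treatment_date) AS seqnum_p,
           ROW_NUMBER() OVER (PARTITION BY pt, treatment ORDER BY treatment_date) AS seqnum_ptt
    FROM t
),
islands AS (
    -- one row per run of consecutive rows with the same treatment
    SELECT pt, treatment,
           MIN(treatment_date) AS start_date,
           MAX(treatment_date) AS end_date
    FROM numbered
    GROUP BY pt, treatment, seqnum_p - seqnum_ptt
),
with_prev AS (
    -- end date of the run immediately before each run, per patient
    SELECT islands.*,
           LAG(end_date) OVER (PARTITION BY pt ORDER BY start_date) AS prev_end
    FROM islands
)
SELECT pt, start_date, end_date, treatment,
       CASE WHEN treatment = 'Cholera' THEN prev_end END AS conversion_date,
       CASE WHEN treatment = 'Cholera'
            THEN CAST(julianday(start_date) - julianday(prev_end) AS INTEGER)
       END AS days_before_cholera
FROM with_prev
ORDER BY pt, start_date
"""

result = conn.execute(query).fetchall()
for row in result:
    print(row)
```

On the sample data the Cholera run and the second Cold run both have a row-number difference of 4, which is why treatment has to be in the GROUP BY. On other engines the date arithmetic would use that engine's own date functions in place of julianday.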

#2


You'll need to join the table to itself for this one. I'd try something along these lines:

SELECT
    a.pt
    ,a.treatment
    ,a.treatment_date AS start_date
    ,CASE    /*b is NULL when the next row is a different treatment, or there is no next row*/
        WHEN b.treatment_date IS NULL
        THEN a.treatment_date
        ELSE b.treatment_date
        END AS end_date
    /*other fields here*/

FROM
    MyTable a
    LEFT JOIN MyTable b
    ON a.pt = b.pt
    AND a.treatment = b.treatment
    /*these conditions belong in ON, not WHERE: a WHERE filter on b
      would discard the unmatched rows the LEFT JOIN is meant to keep*/
    AND a.treatment_date < b.treatment_date
    /*make sure there isn't any date in between, for any treatment;
      this should stop you from joining rows you didn't intend on joining on*/
    AND NOT EXISTS (
                        SELECT
                            x.treatment_date
                        FROM
                            MyTable x
                        WHERE
                            a.pt = x.pt
                            AND x.treatment_date < b.treatment_date
                            AND x.treatment_date > a.treatment_date
                    )
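
As a rough check of what the self-join idea returns, the sketch below runs a variant of it on the question's sample rows in an in-memory SQLite database via Python's sqlite3 module. It assumes ISO date strings, puts the date comparison and the NOT EXISTS test in the ON clause so that the LEFT JOIN keeps unmatched rows, lets the NOT EXISTS look at rows of any treatment, and uses COALESCE in place of the CASE expression. These choices are illustrative assumptions, not necessarily what the answer intended.

```python
import sqlite3

# Sample rows from the question, with ISO dates (an assumption) for sorting.
rows = [
    ("A", "2017-05-05", "Cold"), ("A", "2017-05-07", "Cold"),
    ("A", "2017-05-09", "Fever"), ("A", "2017-05-13", "Fever"),
    ("A", "2017-05-15", "Cholera"), ("A", "2017-05-17", "Cholera"),
    ("A", "2017-05-19", "Cold"), ("A", "2017-05-21", "Cold"),
    ("A", "2017-05-23", "Fever"),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE MyTable (pt TEXT, treatment_date TEXT, treatment TEXT)")
conn.executemany("INSERT INTO MyTable VALUES (?, ?, ?)", rows)

query = """
SELECT a.pt, a.treatment, a.treatment_date AS start_date,
       COALESCE(b.treatment_date, a.treatment_date) AS end_date
FROM MyTable a
LEFT JOIN MyTable b
       ON a.pt = b.pt
      AND a.treatment = b.treatment
      AND a.treatment_date < b.treatment_date
      -- b must be the very next row for this patient, whatever its treatment
      AND NOT EXISTS (
            SELECT 1
            FROM MyTable x
            WHERE x.pt = a.pt
              AND x.treatment_date > a.treatment_date
              AND x.treatment_date < b.treatment_date)
ORDER BY a.pt, a.treatment_date
"""

pairs = conn.execute(query).fetchall()
for row in pairs:
    print(row)
```

This yields one row per source row (nine here), each paired with the next row when it has the same treatment. To reach the five-row output in the question, the pairs still have to be collapsed into runs, and runs longer than two rows need more than a single join.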
