I have 3 fields: id, date, treatment. There are 3 types of treatment: Cold, fever, cholera. Assume there are 1000 patients and the first patient's data looks like this
我有3个字段:id,date,treatment。治疗有3种类型:感冒,发烧,霍乱。假设有1000名患者,第一名患者的数据看起来像这样
pt treatment_date treatment
A 05-05-2017 Cold
A 05-07-2017 Cold
A 05-09-2017 Fever
A 05-13-2017 Fever
A 05-15-2017 Cholera
A 05-17-2017 Cholera
A 05-19-2017 Cold
A 05-21-2017 Cold
A 05-23-2017 Fever
I need my output to look like this-
我需要我的输出看起来像这样 -
pt start_date end_date treatment Number_of_days Conversion_date Days_before_cholera(start date of cholera- end date of treatment immediately before it)
A 05-05-2017 05-07-2017 Cold 2 0 0
A 05-09-2017 05-13-2017 Fever 4 0 0
A 05-15-2017 05-17-2017 Cholera 2 05-13-2017 2
A 05-19-2017 05-21-2017 Cold 2 0 0
A 05-23-2017 05-23-2017 Fever 1 0 0
So goes on for all patient_ids.
所以对所有patient_ids继续。
2 个解决方案
#1
1
This is a "gaps-and-islands" problem. I show you have to handle the calculation of the rows. You can fill in the additional columns.
这是一个“差距和岛屿”问题。我告诉你必须处理行的计算。您可以填写其他列。
One way to solve it is using the difference of row numbers:
解决它的一种方法是使用行号的差异:
select pt, min(treatment_date), max(treatment_date), . . .
from (select t.*,
row_number() over (partition by pt order by treatment_date) as seqnum_p,
row_number() over (partition by pt, treatment order by treatment_date) as seqnum_ptt
from t
) t
group by pt, (seqnum_p - seqnum_ptt);
#2
0
You're going to need to join the table to itself for this one. I'd try something along these lines.
你将需要加入这个桌子。我会尝试这些方法。
SELECT
a.pt
,a.treatment
,a.treatment_date AS start_date
,CASE /*this is for your last fever row with the same date*/
WHEN b.treatment_date IS NULL
THEN a.treatment_date
ELSE b.treatment_date
END AS end_date
/*other fields here*/
FROM
MyTable a
LEFT JOIN MyTable b
ON a.pt = b.pt
AND a.treatment = b.treatment
WHERE
a.treatment_date < b.treatment_date
/*make sure there isn't any date in between,
this should stop you from joining rows you didn't intend on joining on*/
AND NOT EXISTS (
SELECT
x.treatment_date
FROM
MyTable x
WHERE
a.pt = x.pt
AND a.treatment = x.treatment
AND x.treatment_date < b.treatment_date
AND x.treatment_date > a.treatment_date
)
#1
1
This is a "gaps-and-islands" problem. I show you have to handle the calculation of the rows. You can fill in the additional columns.
这是一个“差距和岛屿”问题。我告诉你必须处理行的计算。您可以填写其他列。
One way to solve it is using the difference of row numbers:
解决它的一种方法是使用行号的差异:
select pt, min(treatment_date), max(treatment_date), . . .
from (select t.*,
row_number() over (partition by pt order by treatment_date) as seqnum_p,
row_number() over (partition by pt, treatment order by treatment_date) as seqnum_ptt
from t
) t
group by pt, (seqnum_p - seqnum_ptt);
#2
0
You're going to need to join the table to itself for this one. I'd try something along these lines.
你将需要加入这个桌子。我会尝试这些方法。
SELECT
a.pt
,a.treatment
,a.treatment_date AS start_date
,CASE /*this is for your last fever row with the same date*/
WHEN b.treatment_date IS NULL
THEN a.treatment_date
ELSE b.treatment_date
END AS end_date
/*other fields here*/
FROM
MyTable a
LEFT JOIN MyTable b
ON a.pt = b.pt
AND a.treatment = b.treatment
WHERE
a.treatment_date < b.treatment_date
/*make sure there isn't any date in between,
this should stop you from joining rows you didn't intend on joining on*/
AND NOT EXISTS (
SELECT
x.treatment_date
FROM
MyTable x
WHERE
a.pt = x.pt
AND a.treatment = x.treatment
AND x.treatment_date < b.treatment_date
AND x.treatment_date > a.treatment_date
)