Comparing rows by time in T-SQL

I am using T-SQL.

I have a simple table called mytable.

It looks like this:

ID    Num    Date
1      0     2015-01-01 00:00:00
1      0     2015-01-02 00:00:00
1      1     2015-01-03 00:00:00
1      2     2015-01-04 00:00:00
2      0     2015-01-01 00:00:00
2      1     2015-02-01 00:00:00
2      0     2015-03-01 00:00:00
3      1     2014-01-01 00:00:00
3      2     2014-01-02 00:00:00
4      2     2015-02-01 00:00:00
4      0     2015-02-02 00:00:00
4      2     2015-02-05 00:00:00
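
In case anyone wants to reproduce this, here is a rough setup script for the sample data. The column types (INT and DATETIME) and the NOT NULL constraints are my assumptions; the real table may be defined differently.

```sql
-- Assumed schema: the column types are a guess based on the sample data.
CREATE TABLE mytable (
    ID     INT      NOT NULL,
    Num    INT      NOT NULL,
    [Date] DATETIME NOT NULL
);

-- Dates use the unseparated YYYYMMDD form so that they are parsed
-- the same way regardless of the session's language/date-format settings.
INSERT INTO mytable (ID, Num, [Date]) VALUES
    (1, 0, '20150101'),
    (1, 0, '20150102'),
    (1, 1, '20150103'),
    (1, 2, '20150104'),
    (2, 0, '20150101'),
    (2, 1, '20150201'),
    (2, 0, '20150301'),
    (3, 1, '20140101'),
    (3, 2, '20140102'),
    (4, 2, '20150201'),
    (4, 0, '20150202'),
    (4, 2, '20150205');
```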

The situation with this table is simply that any time a value of 1 or 2 has been entered into the table, the values that come later (chronologically speaking) cannot be a 0. This is a data entry error and must be fixed by changing the 0 to a 2.

So, in the simplified example above, there are errors for persons (IDs) 2 and 4.

For person 2, somebody keyed in a 0 on 2015-03-01 00:00:00, whereas for person 4, somebody keyed in a 0 on 2015-02-02 00:00:00.

I am new to SQL and honestly would rather just export the whole thing as a csv, open it in R, find the problems, and then update values with an update statement back in the database. But I feel like this is an opportunity to get better at SQL -- unfortunately, I'm stuck.

I need some way to compare rows within the table to each other, grouped by ID, while also taking the chronological order into account. I've tried a Cartesian join with a CASE expression, which didn't work. Any help would be greatly appreciated.

2 Solutions

#1

This query will select all problematic records:

SELECT *
FROM mytable AS t
WHERE Num = 0 AND EXISTS (SELECT 1
                          FROM mytable
                          WHERE Num IN (1,2) AND ID = t.ID AND Date < t.Date)

It selects every Num=0 record that is preceded, for the same ID, by at least one record with Num=1 or Num=2.

Output:

ID  Num Date
------------------
2   0   2015-03-01
4   0   2015-02-02

To update the table simply do:

UPDATE t
SET Num = 2
FROM mytable AS t
WHERE t.Num = 0 AND EXISTS (SELECT 1
                            FROM mytable AS p
                            WHERE p.Num IN (1,2) AND p.ID = t.ID AND p.Date < t.Date)
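
If you would rather check the effect before committing to it, one option is to run the same update inside a transaction and roll it back. This is only a sketch: the aliases t and p and the rows_changed column name are my own choices, and the square brackets around Date are just a precaution.

```sql
BEGIN TRANSACTION;

-- Same logic as the update above: turn a 0 into a 2 when an earlier
-- row for the same ID already holds a 1 or a 2.
UPDATE t
SET    Num = 2
FROM   mytable AS t
WHERE  t.Num = 0
  AND  EXISTS (SELECT 1
               FROM   mytable AS p
               WHERE  p.ID = t.ID
                 AND  p.Num IN (1, 2)
                 AND  p.[Date] < t.[Date]);

-- @@ROWCOUNT must be read immediately after the UPDATE.
SELECT @@ROWCOUNT AS rows_changed;

-- Inspect the table as it would look after the fix.
SELECT ID, Num, [Date]
FROM   mytable
ORDER  BY ID, [Date];

-- Undo everything; replace with COMMIT TRANSACTION once the result looks right.
ROLLBACK TRANSACTION;
```

On the sample data from the question, rows_changed should come back as 2, matching the two problematic records selected earlier.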

#2

You can join a table back to itself and put the logic in, like this:

select *
from mytable t
join mytable p on t.id = p.id 
                   and t.date > p.date 
                   and t.num < p.num

This will give you "extra" rows if a record has more than one prior problem row. To fix this, you can use GROUP BY:

select id, Date, max(priornum) as max_prior 
from (
  select t.id, t.Date, p.num as priornum
  from mytable t
  join mytable p on t.id = p.id 
                and t.date > p.date 
                and t.num < p.num
) sub
group by id, Date

Or use OVER with DISTINCT (on more modern server versions):

select distinct t.id, t.num, t.Date,
                max(p.num) OVER (partition by t.id, t.Date) as max_prior
from mytable t
join mytable p on t.id = p.id 
                and t.date > p.date 
                and t.num < p.num
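
A variation on the OVER idea that avoids the self-join altogether, as a sketch: compute, for each row, the largest Num seen on earlier rows for the same ID, then keep the zeros where that running maximum is at least 1. This needs SQL Server 2012 or later for the ROWS frame. The alias s and the column name max_prior are illustrative. It also assumes dates are unique within an ID; with tied dates, which row counts as "earlier" is not deterministic.

```sql
SELECT ID, Num, [Date], max_prior
FROM (
    SELECT ID, Num, [Date],
           -- Largest Num among strictly earlier rows of the same ID.
           -- NULL for the first row of each ID (the frame is empty).
           MAX(Num) OVER (PARTITION BY ID
                          ORDER BY [Date]
                          ROWS BETWEEN UNBOUNDED PRECEDING AND 1 PRECEDING) AS max_prior
    FROM mytable
) AS s
-- A NULL max_prior fails the comparison, so first rows are never flagged.
WHERE Num = 0 AND max_prior >= 1;
```

Unlike the join on t.num < p.num, this only flags rows where Num is 0, which is the error described in the question. On the sample data it flags the rows for ID 2 on 2015-03-01 and ID 4 on 2015-02-02.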
