I have a table
T (variable_name, start_no, end_no)
that holds values like:
(x, 10, 20)
(x, 30, 50)
(x, 60, 70)
(y, 1, 3)
(y, 7, 8)
All intervals are guaranteed to be disjoint.
I want to write a query in T-SQL that computes the intervals where a variable is not searched:
(x, 21, 29)
(x, 51, 59)
(y, 4, 6)
Can I do this without a cursor?
I was thinking of partitioning by variable_name and then ordering by start_no. But how to proceed next? Given the current row in the rowset, how to access the "next" one?
5 solutions
#1
8
Since you didn't specify which version of SQL Server you are using, I have multiple solutions. If you are still rocking SQL Server 2005, then Giorgi's answer uses CROSS APPLY quite nicely.
Note: For both solutions, I use the WHERE clause to filter out improper values, so even if the data is bad and the rows overlap, it will ignore those values.
My Version of Your Table
DECLARE @T TABLE (variable_name CHAR, start_no INT, end_no INT);
INSERT INTO @T
VALUES ('x', 10, 20),
       ('x', 30, 50),
       ('x', 60, 70),
       ('y', 1, 3),
       ('y', 7, 8);
Solution for SQL Server 2012 and Above
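SELECT *
FROM
(
    SELECT variable_name,
           LAG(end_no, 1) OVER (PARTITION BY variable_name ORDER BY start_no) + 1 AS start_range,
           start_no - 1 AS end_range
    FROM @T
) A
-- >= (rather than >) keeps gaps that are exactly one number wide.
-- The first row of each partition has a NULL start_range and is filtered out.
WHERE end_range >= start_range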
Solution for SQL 2008 and Above
WITH CTE
AS
(
    SELECT ROW_NUMBER() OVER (PARTITION BY variable_name ORDER BY start_no) row_num,
           *
    FROM @T
)
SELECT A.variable_name,
       B.end_no + 1 AS start_range,
       A.start_no - 1 AS end_range
FROM CTE AS A
INNER JOIN CTE AS B
    ON A.variable_name = B.variable_name
    AND A.row_num = B.row_num + 1
-- >= (rather than >) keeps gaps that are exactly one number wide
WHERE A.start_no - 1 /*end_range*/ >= B.end_no + 1 /*start_range*/
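The question asks how to reach the "next" row from the current one. On SQL Server 2012 and later, LEAD is the mirror image of LAG and does exactly that. The following is only a sketch against the sample data from the question; the aliases gap_start and gap_end are names chosen here for illustration, not part of the original answer.

```sql
-- Sample data copied from the question; @T stands in for the real table T.
DECLARE @T TABLE (variable_name CHAR(1), start_no INT, end_no INT);
INSERT INTO @T
VALUES ('x', 10, 20), ('x', 30, 50), ('x', 60, 70), ('y', 1, 3), ('y', 7, 8);

SELECT variable_name, gap_start, gap_end
FROM
(
    SELECT variable_name,
           -- the gap starts right after the current interval ends
           end_no + 1 AS gap_start,
           -- LEAD reads start_no from the next row within the same variable_name
           LEAD(start_no) OVER (PARTITION BY variable_name ORDER BY start_no) - 1 AS gap_end
    FROM @T
) AS gaps
-- The last interval of each variable has a NULL gap_end and drops out here,
-- as do empty gaps between adjacent intervals.
WHERE gap_start <= gap_end
ORDER BY variable_name, gap_start;
```

With the sample rows this should return (x, 21, 29), (x, 51, 59) and (y, 4, 6), matching the result requested in the question.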
#2
3
Here is another version with CROSS APPLY:
DECLARE @t TABLE ( v CHAR(1), sn INT, en INT )
INSERT INTO @t
VALUES ( 'x', 10, 20 ),
( 'x', 30, 50 ),
( 'x', 60, 70 ),
( 'y', 1, 3 ),
( 'y', 7, 8 );
SELECT t.v, t.en + 1 AS gap_start, c.sn - 1 AS gap_end
FROM @t t
CROSS APPLY (SELECT TOP 1 * FROM @t WHERE v = t.v AND sn > t.sn ORDER BY sn) c
WHERE t.en + 1 < c.sn
Fiddle http://sqlfiddle.com/#!3/d6458/3
#3
0
For each end_no, you should find the nearest start_no > end_no, then exclude the rows that have no such start_no (the last row for each variable_name).
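WITH A AS
(
    SELECT variable_name,
           end_no + 1 AS x1,
           (SELECT MIN(start_no) - 1
            FROM t
            WHERE t.variable_name = t1.variable_name
              AND t.start_no > t1.end_no) AS x2
    FROM t AS t1
)
-- x2 IS NOT NULL drops the last row per variable_name;
-- x1 <= x2 drops the empty gap between two adjacent ranges
SELECT * FROM A WHERE x2 IS NOT NULL AND x1 <= x2
ORDER BY variable_name, x1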
SQLFiddle demo
Also, here is my old answer to a similar question:
Allen's Interval Algebra operations in SQL
#4
0
Here's a non-CTE version that seems to work: http://sqlfiddle.com/#!9/4fdb4/1
Given the guaranteed disjoint ranges, I just joined T to itself, computed the candidate range from the end of one range plus one to the start of the other minus one, and then ensured the new range didn't overlap any existing range.
select t1.variable_name, t1.end_no + 1 as start_no, t2.start_no - 1 as end_no
from t t1
join t t2
  on t1.variable_name = t2.variable_name
-- the candidate gap must be non-empty, which also forces t1 to lie before t2
where t1.end_no + 1 <= t2.start_no - 1
  -- no existing range may overlap the candidate gap [t1.end_no + 1, t2.start_no - 1]
  and not exists (select *
                  from t
                  where t.variable_name = t1.variable_name
                    and t.start_no <= t2.start_no - 1
                    and t.end_no >= t1.end_no + 1)
#5
0
This is very portable as it doesn't require CTEs or analytic functions. It could also easily be rewritten without the derived table if that were ever necessary.
select * from (
    select
        variable_name,
        end_no + 1 as start_no,
        (
            select min(t2.start_no) - 1
            from T as t2
            where t2.variable_name = t1.variable_name and t2.start_no > t1.end_no
        ) as end_no
    from T as t1
) as intervals
where start_no <= end_no
For each variable, the number of complemented intervals is at most one fewer than the number of intervals you start with. (Some are eliminated if two ranges are actually consecutive.) So it's easy to take each interval separately and calculate the gap just to its right (or to its left, if you wanted to reverse some of the logic).
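The answer mentions that the query could be rewritten without the derived table. One way that might look, offered only as a sketch, is a self-join with GROUP BY and HAVING. The table variable @T and the aliases gap_start and gap_end are assumptions made here so the example is self-contained; the DECLARE syntax is T-SQL specific, while the SELECT itself uses only a join and an aggregate.

```sql
-- Sample data copied from the question; @T stands in for the real table T.
DECLARE @T TABLE (variable_name CHAR(1), start_no INT, end_no INT);
INSERT INTO @T
VALUES ('x', 10, 20), ('x', 30, 50), ('x', 60, 70), ('y', 1, 3), ('y', 7, 8);

SELECT t1.variable_name,
       t1.end_no + 1        AS gap_start,
       -- the nearest interval starting after t1 ends bounds the gap on the right
       MIN(t2.start_no) - 1 AS gap_end
FROM @T AS t1
JOIN @T AS t2
  ON  t2.variable_name = t1.variable_name
  AND t2.start_no > t1.end_no
-- end_no is unique per variable because the intervals are disjoint
GROUP BY t1.variable_name, t1.end_no
-- drop empty gaps between consecutive intervals
HAVING t1.end_no + 1 <= MIN(t2.start_no) - 1
ORDER BY t1.variable_name, gap_start;
```

The last interval of each variable finds no partner in the join, so it disappears without needing a NULL check.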