I am trying to select the last non-null value per column based on a date value.
我正在尝试根据日期值为每列选择最后一个非空值。
I have a table that looks like this -
我有一张像这样的桌子
Email Name1 Name2 Job Date
Test1@test.com Ron NULL NULL 2015-01-01 00:00:00.000
Test1@test.com Dave Smith NULL 2014-01-01 00:00:00.000
Test1@test.com NULL NULL NULL 2013-01-01 00:00:00.000
Test2@test.com NULL Smith NULL 2014-01-01 00:00:00.000
Test2@test.com NULL Ford Plumber 2015-01-01 00:00:00.000`
I want to display the newest non-null value for each column per email address.
我想要显示每个邮件地址每个列的最新非空值。
The output should be -
输出应该是-
Email Name1 Name2 Job
Test1@test.com Ron Smith NULL
Test2@test.com NULL Ford Plumber
I have already written some pretty ugly SQL to solve this problem however I want to apply this logic to another table with a lot more columns.
我已经编写了一些非常难看的SQL来解决这个问题,但是我想将这个逻辑应用到另一个包含更多列的表。
My Question is - Is there an easier way to do this without having to join for each column?
我的问题是——是否有一种更简单的方法可以做到这一点,而不必为每个列都加入?
Current solution is as follows -
目前的解决方案如下:
select distinct a.[Email],b.[Name1],c.[Name2],d.[job] from
(
select [Email] from #test
)
A
left join
(
SELECT [Email],
FIRST_VALUE([Name1]) over(partition by [Email] order by [Date] desc) as [Name1]
from #test
where [Name1] is not null
) b
on a.[Email] = b.[Email]
left join
(
SELECT [Email],
FIRST_VALUE([Name2]) over(partition by [Email] order by [Date] desc) as [Name2]
from #test
where [Name2] is not null
) c
on a.[Email] = c.[Email]
left join
(
select [Email],
FIRST_VALUE([Job]) over(partition by [Email] order by [Date] desc) as [Job]
from #test
where [Job] is not null
) d
on a.[Email] = d.[Email]
Here's the DDL/DML for the example table if this helps -
下面是示例表的DDL/DML,如果这有帮助的话
create table #test
([Email] nvarchar(50),
[Name1] nvarchar(50),
[Name2] nvarchar(50),
[Job] nvarchar(50),
[Date] datetime)
insert into #test
values
('Test1@test.com', 'Ron', null,null,'20150101'),
('Test1@test.com', 'Dave' ,'Smith',null, '20140101'),
('Test1@test.com', null, null, null ,'20130101'),
('Test2@test.com', null, 'Smith', null, '20140101'),
('Test2@test.com', null, 'Ford', 'Plumber','20150101')
2 个解决方案
#1
2
There are ways that don't require so many joins. None are simple because SQL Server doesn't support the ignore nulls
option on lag()
.
有一些方法不需要这么多连接。它们都不简单,因为SQL Server不支持lag()上的ignore nulls选项。
Basically, you'll need to do logic on each column. One method without a subquery is:
基本上,您需要在每个列上执行逻辑。没有子查询的一种方法是:
select distinct email,
first_value(name1) over (partition by email
order by (case when name1 is not null then date else '2000-01-01' end) desc
) as name1,
. . .
from #test;
An alternative uses outer apply:
另一种选择是使用外部应用:
select t.email, name1, . . .
from (select distinct email from #test t) t outer apply
(select top 1 name1
from #test t2
where t2.email = t.email and name1 is not null
order by date desc
) name1 . . .
#2
2
You can use FIRST_VALUE
with DISTINCT
:
您可以使用FIRST_VALUE与不同的:
SELECT DISTINCT Email,
FIRST_VALUE(Name1) OVER (PARTITION BY Email
ORDER BY CASE
WHEN Name1 IS NULL THEN '19000101'
ELSE [Date]
END DESC) AS Name1,
FIRST_VALUE(Name2) OVER (PARTITION BY Email
ORDER BY CASE
WHEN Name2 IS NULL THEN '19000101'
ELSE [Date]
END DESC) AS Name2,
FIRST_VALUE(Job) OVER (PARTITION BY Email
ORDER BY CASE
WHEN Job IS NULL THEN '19000101'
ELSE [Date]
END DESC) AS Job
FROM test
#1
2
There are ways that don't require so many joins. None are simple because SQL Server doesn't support the ignore nulls
option on lag()
.
有一些方法不需要这么多连接。它们都不简单,因为SQL Server不支持lag()上的ignore nulls选项。
Basically, you'll need to do logic on each column. One method without a subquery is:
基本上,您需要在每个列上执行逻辑。没有子查询的一种方法是:
select distinct email,
first_value(name1) over (partition by email
order by (case when name1 is not null then date else '2000-01-01' end) desc
) as name1,
. . .
from #test;
An alternative uses outer apply:
另一种选择是使用外部应用:
select t.email, name1, . . .
from (select distinct email from #test t) t outer apply
(select top 1 name1
from #test t2
where t2.email = t.email and name1 is not null
order by date desc
) name1 . . .
#2
2
You can use FIRST_VALUE
with DISTINCT
:
您可以使用FIRST_VALUE与不同的:
SELECT DISTINCT Email,
FIRST_VALUE(Name1) OVER (PARTITION BY Email
ORDER BY CASE
WHEN Name1 IS NULL THEN '19000101'
ELSE [Date]
END DESC) AS Name1,
FIRST_VALUE(Name2) OVER (PARTITION BY Email
ORDER BY CASE
WHEN Name2 IS NULL THEN '19000101'
ELSE [Date]
END DESC) AS Name2,
FIRST_VALUE(Job) OVER (PARTITION BY Email
ORDER BY CASE
WHEN Job IS NULL THEN '19000101'
ELSE [Date]
END DESC) AS Job
FROM test