Selecting a preferred value in SQL Server

Time: 2021-11-25 15:41:53

I have a table in SQL Server 2000 with data similar to the following:

ReferenceNumber    ReferenceValue
00001              Not assigned
00002              Not assigned
00002              ABCDE

in which each ReferenceNumber can appear multiple times in the table, either with a ReferenceValue of 'Not assigned' or a true ReferenceValue.

I want to dump the data into a cleaned-up table with only one row per ReferenceNumber and a true ReferenceValue if it exists, or 'Not assigned' if there are no true ReferenceValues.

I can see how to do it with two queries:

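-- Step 1: copy the true values into the new table.
-- (If a ReferenceNumber has several different true values, each one produces a row.)
SELECT DISTINCT ReferenceNumber, ReferenceValue
INTO clean
FROM duplicates
WHERE ReferenceValue <> 'Not assigned'

-- Step 2: add 'Not assigned' for reference numbers that have no true value.
INSERT INTO clean (ReferenceNumber, ReferenceValue)
SELECT DISTINCT ReferenceNumber, ReferenceValue
FROM duplicates
WHERE ReferenceValue = 'Not assigned'
  AND ReferenceNumber NOT IN (SELECT ReferenceNumber FROM clean)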

but I'm thinking there must be a better way. Any ideas?

2 solutions

#1


2  

Something like this:

SELECT 
  ReferenceNumber
, ReferenceValue = ISNULL(MAX(NULLIF(ReferenceValue,'Not assigned')),'Not assigned')
INTO Table1_Clean
FROM Table1
GROUP BY
  ReferenceNumber

MAX() ignores NULLs, so convert whatever you don't want to NULL first, then MAX(), then convert NULLs back to a dummy value.

It's a single pass over the table with everything done in-line, so it's hard to get much more efficient.
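
To see the NULLIF / MAX / ISNULL chain at work, here is a minimal sketch run against the three sample rows from the question. The temp table #duplicates and its column types are assumptions made for illustration only; the real table may be declared differently.

```sql
-- Hypothetical scratch table holding the sample rows from the question.
CREATE TABLE #duplicates (
    ReferenceNumber CHAR(5)     NOT NULL,
    ReferenceValue  VARCHAR(20) NOT NULL
)

-- One INSERT per row: multi-row VALUES lists are not available in SQL Server 2000.
INSERT INTO #duplicates VALUES ('00001', 'Not assigned')
INSERT INTO #duplicates VALUES ('00002', 'Not assigned')
INSERT INTO #duplicates VALUES ('00002', 'ABCDE')

-- NULLIF turns the placeholder into NULL, MAX skips NULLs,
-- and ISNULL restores the placeholder when no true value exists.
SELECT
    ReferenceNumber,
    ISNULL(MAX(NULLIF(ReferenceValue, 'Not assigned')), 'Not assigned') AS ReferenceValue
FROM #duplicates
GROUP BY ReferenceNumber
ORDER BY ReferenceNumber

-- Rows returned for the sample data:
--   00001  Not assigned
--   00002  ABCDE

DROP TABLE #duplicates
```

Note that if a ReferenceNumber has several different true values, MAX() keeps the one that sorts last under the column's collation.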

#2


2  

For SQL Server 2000, this is probably easiest. The first SELECT returns the "real" values; the second returns the rows for reference numbers that have no real value. It's an extension of your idea.

SELECT d2.ReferenceNumber, d2.ReferenceValue
FROM duplicates d2
WHERE d2.ReferenceValue <> 'Not assigned'
UNION ALL
SELECT d1.ReferenceNumber, d1.ReferenceValue
FROM duplicates d1
WHERE NOT EXISTS (SELECT *
         FROM duplicates d2
         WHERE d2.ReferenceNumber = d1.ReferenceNumber AND
                 d2.ReferenceValue <> 'Not assigned')

However, what criterion do you want to use to break ties between multiple "true" reference values? Or do you just want to pick one?
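
If any one true value will do, one possible sketch is to collapse each branch of the query above to a single row per ReferenceNumber. The tie-break used here (MAX, i.e. the value that sorts last) is an assumption, not something the question specifies, and the sketch also assumes ReferenceValue is never NULL.

```sql
-- Branch 1: one true value per ReferenceNumber.
-- MAX is an arbitrary tie-break; swap in MIN or another rule as needed.
SELECT d.ReferenceNumber, MAX(d.ReferenceValue) AS ReferenceValue
FROM duplicates d
WHERE d.ReferenceValue <> 'Not assigned'
GROUP BY d.ReferenceNumber
UNION ALL
-- Branch 2: reference numbers with no true value at all.
-- DISTINCT collapses repeated 'Not assigned' rows into one.
SELECT DISTINCT d1.ReferenceNumber, d1.ReferenceValue
FROM duplicates d1
WHERE NOT EXISTS (SELECT *
                  FROM duplicates d2
                  WHERE d2.ReferenceNumber = d1.ReferenceNumber
                    AND d2.ReferenceValue <> 'Not assigned')
```

The two branches cannot overlap, since a ReferenceNumber either has a true value or it does not, so UNION ALL is safe here and avoids an extra de-duplication step.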
