I am trying to write a query that returns the details of all duplicate invoices per vendor.
I cannot use GROUP BY because I need all the details of each invoice. This is what I have tried so far:
select
vendid, InvcNbr as InvcNbr, InvcDate, OrigDocAmt, PayDate,
dense_RANK() over (partition by vendid order by invcnbr) RN
from APDoc
where InvcDate >= '10/01/2013'
I am not sure how to proceed from here.
vendid  InvcNbr     InvcDate   OrigDocAmt  PayDate    RN
AAA     1067458361  10/2/2013  0.00        11/1/2013  8
AAA     1067461099  10/2/2013  16.08       11/1/2013  9
AAA     1067461099  10/2/2013  16.08       11/1/2013  9
AAA     1067461101  10/2/2013  16.08       11/1/2013  10
AAA     1067461101  10/2/2013  16.08       11/1/2013  10
AAA     1067461102  10/2/2013  16.08       11/1/2013  11
AAA     1067461102  10/2/2013  16.08       11/1/2013  11
AAA     1067461103  10/2/2013  92.45       11/1/2013  12
AAA     1067461103  10/2/2013  92.45       11/1/2013  12
4 Answers
#1
4
Use GROUP BY with the HAVING clause to identify the duplicates, then join those results back to the table in an outer query to see the details of the duplicates.
Here is an example of how you can do this:
SELECT a.vendid,a.InvcNbr as InvcNbr,a.InvcDate,a.OrigDocAmt,a.PayDate
FROM APDoc a
JOIN (
SELECT vendid, InvcNbr
FROM APDoc
WHERE InvcDate >= '10/01/2013'
GROUP BY vendid,InvcNbr HAVING COUNT(*) > 1
) b ON a.vendid = b.vendid AND a.InvcNbr = b.InvcNbr
#2
0
Something like this, using common table expressions (CTEs), builds the query up step by step:
WITH TempCTE AS (
    -- Number the rows within each vendid/InvcNbr combination
    SELECT InvcNbr, vendid,
           ROW_NUMBER() OVER (PARTITION BY vendid, InvcNbr ORDER BY InvcNbr) AS RowNum
    FROM APDoc
),
-- Keep each vendid/InvcNbr combination that occurs more than once
TempCTE2 AS (
    SELECT DISTINCT InvcNbr, vendid FROM TempCTE WHERE RowNum > 1
)
-- Join back to APDoc to get every row of each duplicated combination
SELECT APDoc.*
FROM TempCTE2
INNER JOIN APDoc ON TempCTE2.InvcNbr = APDoc.InvcNbr
    AND APDoc.vendid = TempCTE2.vendid
#3
0
This also works and might be easier to understand.
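-- Count how many times each vendid/InvcNbr combination occurs
select vendid, InvcNbr, COUNT(*) as [count]
into #temp1
from APDoc
group by vendid, InvcNbr;
-- Return the full details of every invoice that occurs more than once
select a.vendid, a.InvcNbr, a.InvcDate, a.OrigDocAmt, a.PayDate
from APDoc a
join #temp1 b on a.vendid = b.vendid and a.InvcNbr = b.InvcNbr
where b.[count] > 1;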
#4
0
Assuming the table has a primary key (called pk below), you can find the duplicates, potentially very quickly, with an EXISTS clause:
select * from APDoc a1 where exists
(
select 1 from APDoc a2
where a1.pk <> a2.pk
and a1.vendid = a2.vendid and a1.invcnbr = a2.invcnbr
)
and InvcDate >= '10/01/2013'
order by vendid , invcnbr
This allows the query optimizer to generate a plan that does not require aggregation, which can be expensive on a very large table with many duplicates.
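Since the question starts from a window function, one more variant may be worth sketching: COUNT(*) OVER (PARTITION BY ...) attaches the size of each vendid/InvcNbr group to every row, so the detail columns stay available without GROUP BY or a join. This is a minimal sketch, assuming the database supports window aggregates (the question already uses DENSE_RANK() OVER) and using the column names from the question. The names Counted and DupCount are made up for illustration. The CTE is there because a window function cannot be referenced directly in a WHERE clause.

```sql
-- Sketch only: Counted and DupCount are illustrative names, not from the thread.
WITH Counted AS (
    SELECT vendid, InvcNbr, InvcDate, OrigDocAmt, PayDate,
           -- Number of rows sharing this vendid/InvcNbr combination
           COUNT(*) OVER (PARTITION BY vendid, InvcNbr) AS DupCount
    FROM APDoc
    WHERE InvcDate >= '10/01/2013'
)
SELECT vendid, InvcNbr, InvcDate, OrigDocAmt, PayDate
FROM Counted
WHERE DupCount > 1
ORDER BY vendid, InvcNbr;
```

Because the date filter sits inside the CTE, an invoice only counts as a duplicate here if at least two of its rows fall on or after that date.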