Say I have duplicate rows in my table, and my database design is third-rate:
Insert Into tblProduct (ProductId,ProductName,Description,Category) Values (1,'Cinthol','cosmetic soap','soap');
Insert Into tblProduct (ProductId,ProductName,Description,Category) Values (1,'Cinthol','cosmetic soap','soap');
Insert Into tblProduct (ProductId,ProductName,Description,Category) Values (1,'Cinthol','cosmetic soap','soap');
Insert Into tblProduct (ProductId,ProductName,Description,Category) Values (1,'Lux','cosmetic soap','soap');
Insert Into tblProduct (ProductId,ProductName,Description,Category) Values (1,'Crowning Glory','cosmetic soap','soap');
Insert Into tblProduct (ProductId,ProductName,Description,Category) Values (2,'Cinthol','nice soap','soap');
Insert Into tblProduct (ProductId,ProductName,Description,Category) Values (3,'Lux','nice soap','soap');
Insert Into tblProduct (ProductId,ProductName,Description,Category) Values (3,'Lux','nice soap','soap');
I want only one instance of each row to remain in my table. So the 2nd, 3rd, and last rows, which are completely identical to earlier rows, should be deleted. What query can I write for this? Can it be done without creating temp tables, in one single query?
Thanks in advance :)
4 Solutions
#1
18
Try this - it will delete all duplicates from your table:
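;WITH duplicates AS
(
    SELECT
        ProductID, ProductName, Description, Category,
        -- Number the rows within each group of fully identical rows
        ROW_NUMBER() OVER (PARTITION BY ProductID, ProductName, Description, Category
                           ORDER BY ProductID) AS RowNum
    FROM dbo.tblProduct
)
-- Keep the first row of each group and delete the rest
DELETE FROM duplicates
WHERE RowNum > 1
GO

SELECT * FROM dbo.tblProduct
GO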
Your duplicates should be gone now. The output is:
ProductID   ProductName      Description     Category
1           Cinthol          cosmetic soap   soap
1           Lux              cosmetic soap   soap
1           Crowning Glory   cosmetic soap   soap
2           Cinthol          nice soap       soap
3           Lux              nice soap       soap
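To confirm the result, a grouping query along these lines should list any row that still occurs more than once. This is a minimal sketch rather than part of the original answer, and the `Copies` alias is only an illustrative name.

```sql
-- List every fully identical row that appears more than once,
-- together with the number of copies found.
SELECT ProductId, ProductName, Description, Category,
       COUNT(*) AS Copies   -- "Copies" is an illustrative alias
FROM dbo.tblProduct
GROUP BY ProductId, ProductName, Description, Category
HAVING COUNT(*) > 1;
```

Run against the sample data before the DELETE, it should report the Cinthol row with ProductId 1 (3 copies) and the Lux row with ProductId 3 (2 copies). After the DELETE it should return no rows.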
#2
4
DELETE tblProduct
FROM tblProduct
LEFT OUTER JOIN (
    SELECT MIN(ProductId) as ProductId, ProductName, Description, Category
    FROM tblProduct
    GROUP BY ProductName, Description, Category
) as KeepRows ON
    tblProduct.ProductId = KeepRows.ProductId
WHERE
    KeepRows.ProductId IS NULL
Borrowed from the question "How can I remove duplicate rows?"
UPDATE:
This will only work if ProductId is a primary key (which it is not here). You are better off using @marc_s' method, but I'll leave this up in case someone with a PK comes across this post.
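For a table without a primary key, one possible workaround is to add a temporary unique column and join on that instead. The sketch below assumes a hypothetical identity column named `RowId` that does not exist in the original table, and it takes several statements rather than the single query the question asks for.

```sql
-- Hypothetical helper column: gives every row a unique number.
ALTER TABLE dbo.tblProduct ADD RowId INT IDENTITY(1,1) NOT NULL;
GO

-- Keep the lowest RowId in each group of identical rows, delete the rest.
DELETE p
FROM dbo.tblProduct AS p
LEFT OUTER JOIN (
    SELECT MIN(RowId) AS RowId
    FROM dbo.tblProduct
    GROUP BY ProductId, ProductName, Description, Category
) AS KeepRows ON p.RowId = KeepRows.RowId
WHERE KeepRows.RowId IS NULL;
GO

-- Remove the helper column again.
ALTER TABLE dbo.tblProduct DROP COLUMN RowId;
GO
```

The `GO` separators matter here: the DELETE refers to `RowId`, so it has to be compiled in a batch that runs after the column has been added.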
#3
1
I had to do this a few weeks back. What version of SQL Server are you using? In SQL Server 2005 and up, you can use Row_Number as part of your select and only keep the rows where Row_Number is 1. I forget the exact syntax, but it's well documented. Something along the lines of:
Select t0.ProductID,
       t0.ProductName,
       t0.Description,
       t0.Category
Into tblCleanData
From (
    -- Number the rows within each group of identical rows
    Select ProductID,
           ProductName,
           Description,
           Category,
           Row_Number() Over (
               Partition By ProductID,
                            ProductName,
                            Description,
                            Category
               Order By ProductID,
                        ProductName,
                        Description,
                        Category
           ) As RowNumber
    From tblProduct
) As t0
-- Copy only the first row of each group into the new table
Where t0.RowNumber = 1
Check out http://msdn.microsoft.com/en-us/library/ms186734.aspx, which should get you going in the right direction.
#4
0
First, use SELECT ... INTO:
SELECT DISTINCT ProductID, ProductName, Description, Category
INTO tblProductClean
FROM tblProduct
Then drop the first table.
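Put together, the two steps might look like the sketch below. The rename via `sp_rename` is an extra step assumed here so that the cleaned table ends up under the original name; the answer itself only mentions the copy and the drop.

```sql
BEGIN TRANSACTION;

-- Step 1: copy one instance of each distinct row into a new table.
SELECT DISTINCT ProductID, ProductName, Description, Category
INTO dbo.tblProductClean
FROM dbo.tblProduct;

-- Step 2: drop the original table.
DROP TABLE dbo.tblProduct;

-- Assumed extra step: give the clean table the original name.
EXEC sp_rename 'dbo.tblProductClean', 'tblProduct';

COMMIT TRANSACTION;
```

Note that SELECT ... INTO creates a plain new table, so any indexes, constraints, or triggers on the original table are not carried over and would have to be recreated.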