How to delete completely duplicate rows

Time: 2022-03-08 09:12:53

Say I have duplicate rows in my table, and my database design is third-rate:

Insert Into tblProduct (ProductId,ProductName,Description,Category) Values (1,'Cinthol','cosmetic soap','soap');
Insert Into tblProduct (ProductId,ProductName,Description,Category) Values (1,'Cinthol','cosmetic soap','soap');
Insert Into tblProduct (ProductId,ProductName,Description,Category) Values (1,'Cinthol','cosmetic soap','soap');
Insert Into tblProduct (ProductId,ProductName,Description,Category) Values (1,'Lux','cosmetic soap','soap');
Insert Into tblProduct (ProductId,ProductName,Description,Category) Values (1,'Crowning Glory','cosmetic soap','soap');
Insert Into tblProduct (ProductId,ProductName,Description,Category) Values (2,'Cinthol','nice soap','soap');
Insert Into tblProduct (ProductId,ProductName,Description,Category) Values (3,'Lux','nice soap','soap');
Insert Into tblProduct (ProductId,ProductName,Description,Category) Values (3,'Lux','nice soap','soap');

I want only one instance of each row to be present in my table. Thus the 2nd, 3rd and last rows, which are completely identical to earlier rows, should be deleted. What query can I write for this? Can it be done without creating temp tables, in one single query?

Thanks in advance :)

4 solutions

#1


18  

Try this - it will delete all duplicates from your table:

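;WITH duplicates AS
(
    SELECT 
       ProductID, ProductName, Description, Category,
       -- Number the rows within each group of fully identical rows;
       -- partitioning on all four columns keeps rows that differ in any column
       ROW_NUMBER() OVER (PARTITION BY ProductID, ProductName, Description, Category
                          ORDER BY ProductID) AS RowNum
    FROM dbo.tblProduct
)
-- Deleting through the CTE removes the underlying rows from tblProduct
DELETE FROM duplicates
WHERE RowNum > 1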
GO

SELECT * FROM dbo.tblProduct
GO

Your duplicates should now be gone. The output is:

ProductID   ProductName      Description      Category
1           Cinthol          cosmetic soap    soap
1           Lux              cosmetic soap    soap
1           Crowning Glory   cosmetic soap    soap
2           Cinthol          nice soap        soap
3           Lux              nice soap        soap
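
Before running a destructive DELETE like the one above, it may be worth previewing which rows it would remove. The following is only a sketch, assuming the table and column names from the question and partitioning on all four columns so that only fully identical rows count as duplicates. It is the same CTE with the DELETE swapped for a SELECT:

```sql
-- Preview only: nothing is deleted here.
-- Assumes dbo.tblProduct with the four columns from the question.
WITH duplicates AS
(
    SELECT
       ProductID, ProductName, Description, Category,
       ROW_NUMBER() OVER (PARTITION BY ProductID, ProductName, Description, Category
                          ORDER BY ProductID) AS RowNum
    FROM dbo.tblProduct
)
SELECT ProductID, ProductName, Description, Category, RowNum
FROM duplicates
WHERE RowNum > 1;   -- every row beyond the first in its group
```

With the sample data from the question this should list three rows: two extra copies of (1, 'Cinthol', 'cosmetic soap', 'soap') and one extra copy of (3, 'Lux', 'nice soap', 'soap').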

#2


4  

DELETE tblProduct 
FROM tblProduct 
LEFT OUTER JOIN (
   SELECT MIN(ProductId) as ProductId, ProductName, Description, Category
   FROM tblProduct 
   GROUP BY ProductName, Description, Category
) as KeepRows ON
   tblProduct.ProductId= KeepRows.ProductId
WHERE
   KeepRows.ProductId IS NULL

Stolen from "How can I remove duplicate rows?"

UPDATE:

This will only work if ProductId is a primary key (which it is not here). You are better off using @marc_s's method, but I'll leave this up in case someone using a PK comes across this post.
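
To see why the primary-key caveat matters: with the question's sample data, the subquery returns ProductId 1, 2 and 3 as the values to keep, and because the join compares ProductId alone, every row in the table matches one of them, so nothing gets deleted. A quick way to check whether ProductId is unique before relying on this approach might look like the sketch below (table and column names are taken from the question; the `RowsWithThisId` alias is just illustrative):

```sql
-- Any row returned here means ProductId is not unique,
-- so joining on ProductId alone cannot single out duplicates.
SELECT ProductId, COUNT(*) AS RowsWithThisId
FROM dbo.tblProduct
GROUP BY ProductId
HAVING COUNT(*) > 1;
```

On the sample data this should report ProductId 1 (five rows) and ProductId 3 (two rows).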

#3


1  

I had to do this a few weeks back... what version of SQL Server are you using? In SQL Server 2005 and up, you can use Row_Number as part of your select, and only select where Row_Number is 1. I forget the exact syntax, but it's well documented... something along the lines of:

Select t0.ProductID, 
       t0.ProductName, 
       t0.Description, 
       t0.Category
Into   tblCleanData
From   (
    Select ProductID, 
           ProductName, 
           Description, 
           Category, 
           Row_Number() Over (
               Partition By ProductID, 
                            ProductName, 
                            Description, 
                            Category
               Order By     ProductID,
                            ProductName,
                            Description,
                            Category
           ) As RowNumber
    From   tblProduct
) As t0
Where t0.RowNumber = 1

Check out http://msdn.microsoft.com/en-us/library/ms186734.aspx; that should get you going in the right direction.

#4


0  

First use a SELECT... INTO:

SELECT DISTINCT ProductID, ProductName, Description, Category
    INTO tblProductClean
    FROM tblProduct

Then drop the first table.
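
The drop step, plus renaming the clean copy back to the original name, could be sketched roughly as follows. This assumes both tables live in the dbo schema and that nothing else (foreign keys, views) depends on tblProduct. Note that SELECT ... INTO does not copy indexes, constraints or triggers, so those would need to be recreated on the new table:

```sql
-- Destructive: take a backup first.
-- Assumes dbo.tblProductClean was created by the SELECT DISTINCT ... INTO above.
DROP TABLE dbo.tblProduct;

-- Give the de-duplicated copy the original table name.
EXEC sp_rename 'dbo.tblProductClean', 'tblProduct';
```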
