Finding all similar rows in a SQL table

Time: 2021-06-07 12:58:57

I need to write a query that finds all items in the Description column that have duplicates, whether the duplicates are identical or merely similar.

My current query finds all values that are exactly the same, but it must also include similar values; for example, SQL, Sql, and sql.

SELECT 
    Description, COUNT(*) AS Count_Of    
FROM
    Source 
GROUP BY
    [Description]  
HAVING 
    COUNT(*) > 1   

I know how to use LIKE to search the table for all items similar to something I define. Can I apply it to this problem?

Any and all help is greatly appreciated, thank you.

--Edited 3/26/13

When I say similar, I mean more than differences in case. I am working with company names and must account for people entering the same company under different names, such as Monsters Inc and Monsters Incorporated.

I would also like the output to display the Description itself, so that I know which companies have redundant entries in the database.

I have since taken care of case sensitivity with:

SELECT
    LOWER(Description), COUNT(*) AS Count_Of

RESOLVED

I have a query that finds all exact duplicates, and I also have a query that finds all items similar to an item I specify.

To solve it, I ran the first query and stored all the repeated items in a table, then modified the second query so that it finds all similar items for each item in the table I just created.
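
The two steps described above could be sketched roughly as follows. The original second query is not shown, so the prefix LIKE pattern, the temporary table name Repeated, and the sample rows are all assumptions made for illustration. The sketch runs against an in-memory SQLite database through Python's sqlite3 module purely to stay self-contained; on SQL Server the string concatenation would use + instead of ||.

```python
import sqlite3

# In-memory database with made-up sample rows (assumption, not the real data).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Source (Description TEXT);
    INSERT INTO Source (Description) VALUES
        ('Monsters Inc'), ('monsters inc'), ('Monsters Incorporated'),
        ('Acme'), ('ACME'), ('Globex');

    -- Step 1: store every description that repeats, ignoring case.
    -- "Repeated" is a hypothetical table name.
    CREATE TEMP TABLE Repeated AS
    SELECT LOWER(Description) AS Description, COUNT(*) AS Count_Of
    FROM Source
    GROUP BY LOWER(Description)
    HAVING COUNT(*) > 1;
""")

# Step 2: for each stored item, list every row in Source that starts with it.
# The prefix pattern is only one possible definition of "similar".
rows = conn.execute("""
    SELECT r.Description AS Repeated_Item, s.Description AS Similar_Item
    FROM Repeated AS r
    JOIN Source AS s
      ON LOWER(s.Description) LIKE r.Description || '%'
    ORDER BY r.Description, s.Description
""").fetchall()

for repeated_item, similar_item in rows:
    print(repeated_item, "->", similar_item)
```

With this pattern, 'Monsters Incorporated' is reported alongside 'Monsters Inc' because it begins with the stored item; names that differ in other ways would need a different pattern.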

Thank you very much to everyone who helped.

3 Solutions

#1


If you only mean that you wish to carry out a case-insensitive comparison, then simply specify an appropriate case-insensitive collation as part of your GROUP BY clause.

You could, for example, use the following:

SELECT 
    Description COLLATE SQL_Latin1_General_CP1_CI_AS,
    COUNT(*) AS Count_Of    
FROM
    Source 
GROUP BY
    [Description] COLLATE SQL_Latin1_General_CP1_CI_AS
HAVING 
    COUNT(*) > 1 
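
Since the question also asks to see which spellings were merged, a variation on the query above might collect the variants of each group. The following is only a sketch under assumptions: it uses SQLite's NOCASE collation and GROUP_CONCAT aggregate (via Python's sqlite3 module) in place of the SQL Server collation shown above, and the sample rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Source (Description TEXT);
    INSERT INTO Source (Description) VALUES
        ('SQL'), ('Sql'), ('sql'), ('Oracle'), ('oracle'), ('Access');
""")

# Group case-insensitively and list every spelling that fell into the group.
# NOCASE and GROUP_CONCAT are SQLite features; other databases offer
# their own collations and string aggregates.
rows = conn.execute("""
    SELECT MIN(Description) AS Sample,
           COUNT(*) AS Count_Of,
           GROUP_CONCAT(Description, ', ') AS Variants
    FROM Source
    GROUP BY Description COLLATE NOCASE
    HAVING COUNT(*) > 1
    ORDER BY Sample
""").fetchall()

for sample, count_of, variants in rows:
    print(sample, count_of, variants)
```

The order of the spellings inside Variants is not guaranteed, so it should not be relied on.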

#2


Depending on what "similar" means, you may find SOUNDEX useful:

http://www.techonthenet.com/oracle/functions/soundex.php

If not, what do you mean by similar?
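
As a rough illustration of the SOUNDEX idea, the sketch below groups rows by a phonetic code. It makes several assumptions: the soundex helper is a simplified, hand-written approximation of American Soundex that only looks at the first word, it is registered as a user-defined SQL function in an in-memory SQLite database through Python's sqlite3 module, and the sample rows are invented. A database's built-in SOUNDEX may return different codes in edge cases.

```python
import sqlite3

# Letter-to-digit table for a simplified American Soundex.
CODES = {}
for letters, digit in (("BFPV", "1"), ("CGJKQSXZ", "2"), ("DT", "3"),
                       ("L", "4"), ("MN", "5"), ("R", "6")):
    for letter in letters:
        CODES[letter] = digit


def soundex(text):
    """Hypothetical helper: Soundex code of the first word of text."""
    words = (text or "").upper().split()
    chars = [c for c in words[0] if c.isalpha()] if words else []
    if not chars:
        return ""
    result = chars[0]
    prev = CODES.get(chars[0], "")
    for ch in chars[1:]:
        digit = CODES.get(ch, "")
        if digit and digit != prev:
            result += digit
        if ch not in "HW":  # H and W do not separate letters with equal codes
            prev = digit
    return (result + "000")[:4]  # pad or truncate to four characters


conn = sqlite3.connect(":memory:")
conn.create_function("SOUNDEX", 1, soundex)
conn.executescript("""
    CREATE TABLE Source (Description TEXT);
    INSERT INTO Source (Description) VALUES
        ('Monsters Inc'), ('Monsters Incorporated'), ('Monstars Inc'),
        ('Acme'), ('Globex');
""")

# Group rows whose descriptions share a phonetic code.
rows = conn.execute("""
    SELECT SOUNDEX(Description) AS Code, COUNT(*) AS Count_Of
    FROM Source
    GROUP BY SOUNDEX(Description)
    HAVING COUNT(*) > 1
""").fetchall()
print(rows)
```

Phonetic codes are coarse, so unrelated names can share a code; the groups are best treated as candidates for manual review rather than confirmed duplicates.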

#3


You can use GROUP BY with CASE WHEN to group similar values, but it takes some cumbersome work, for example:

SELECT 
    CASE WHEN DESCRIPTION LIKE '%ONE%' THEN 'LIKEONE'
         WHEN DESCRIPTION LIKE '%TWO%' THEN 'LIKETWO'
         ELSE 'LIKEOTHER' END, COUNT(*) AS Count_Of
FROM
    Source 
GROUP BY
    CASE WHEN DESCRIPTION LIKE '%ONE%' THEN 'LIKEONE'
         WHEN DESCRIPTION LIKE '%TWO%' THEN 'LIKETWO'
         ELSE 'LIKEOTHER' END
HAVING 
    COUNT(*) > 1 
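
Applied to the company names from the question, the same CASE WHEN idea could be used to strip a trailing "Inc" or "Incorporated" before grouping. This is a sketch under assumptions: the suffix list, the Name_Key alias, and the sample rows are invented, and it runs on an in-memory SQLite database through Python's sqlite3 module so that it is self-contained (SUBSTR and LENGTH are the SQLite spellings of these functions).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Source (Description TEXT);
    INSERT INTO Source (Description) VALUES
        ('Monsters Inc'), ('Monsters Incorporated'), ('MONSTERS INC'),
        ('Acme'), ('Globex Inc');
""")

# Grouping key: lower-cased name with a trailing suffix removed.
# The CASE expression is repeated in SELECT and GROUP BY, as in the
# answer above, so it is kept in one Python string.
key_expr = """
    CASE WHEN LOWER(Description) LIKE '% incorporated'
              THEN SUBSTR(LOWER(Description), 1,
                          LENGTH(Description) - LENGTH(' incorporated'))
         WHEN LOWER(Description) LIKE '% inc'
              THEN SUBSTR(LOWER(Description), 1,
                          LENGTH(Description) - LENGTH(' inc'))
         ELSE LOWER(Description) END
"""

rows = conn.execute(f"""
    SELECT {key_expr} AS Name_Key, COUNT(*) AS Count_Of
    FROM Source
    GROUP BY {key_expr}
    HAVING COUNT(*) > 1
""").fetchall()
print(rows)
```

Every new suffix or abbreviation needs its own WHEN branch, which is the cumbersome part of this approach.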
