How to find more than one uppercase character

Time: 2021-09-04 20:12:08

I'm running a series of SQL queries to find data that needs cleaning up. One thing I want to look for is names with any of the following:

  • 2 or more uppercase letters in a row
  • starting with a lowercase letter
  • space then a lowercase letter

For example my name should be "John Doe". I would want it to find "JOhn Doe" or "JOHN DOE" or "John doe", but I would not want it to find "John Doe" since that is formatted correctly.

I am using SQL Server 2008.

5 solutions

#1


3  

The key is to use a binary, and therefore case-sensitive, collation such as Latin1_General_BIN*. You can then use a query with a LIKE expression such as the following:

select *
from foo
where name like '%[A-Z][A-Z]%' collate Latin1_General_BIN --two uppercase in a row
or name like '% [a-z]%' collate Latin1_General_BIN --space then lowercase

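*As per "How do I perform a case-sensitive search using LIKE?", ranges like [A-Z] apparently fail to be case sensitive under the Latin1_General_CS_AS collation. This is less a bug than a consequence of dictionary sort order: upper- and lowercase letters are interleaved, so the range A through Z also covers most lowercase letters. The solution is to use Latin1_General_BIN, which compares by code point.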
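
The query above covers two of the three conditions in the question. A minimal sketch that also checks for a leading lowercase letter might look like this; the table variable @names and its rows are made up for illustration and are not from the original answer:

```sql
-- Hypothetical sample data; @names and its rows are illustrative only.
DECLARE @names TABLE (name varchar(50));
INSERT INTO @names (name)
VALUES ('John Doe'), ('JOhn Doe'), ('JOHN DOE'), ('John doe'), ('john Doe');

SELECT name
FROM @names
WHERE name LIKE '%[A-Z][A-Z]%' COLLATE Latin1_General_BIN  -- two uppercase letters in a row
   OR name LIKE '[a-z]%'       COLLATE Latin1_General_BIN  -- starts with a lowercase letter
   OR name LIKE '% [a-z]%'     COLLATE Latin1_General_BIN; -- space, then a lowercase letter
```

With this sample data, every row except 'John Doe' should come back. A name such as 'John DoE' would still slip through, because its stray uppercase letter is not adjacent to another one.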

#2


1  

First, I think you should make a function that returns a proper name (sounds like you need one anyway). See here under the heading "Proper Casing a Persons Name". Then find the ones that don't match.

SELECT Id, Name, dbo.ProperCase(Name)
FROM MyTable
WHERE Name <> dbo.ProperCase(Name) collate Latin1_General_BIN

This will help you clean up the data and tweak the function to what you need.
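
The function itself is not shown here. As a rough sketch only, a dbo.ProperCase could be written as below, assuming that "proper" means the first letter of each space-separated word is uppercase and everything else is lowercase. The body is hypothetical and is not the function from the linked article; it will mangle names like "McDonald" or "O'Brien".

```sql
-- Hypothetical helper: upper-cases the first letter of each space-separated
-- word and lower-cases every other character.
CREATE FUNCTION dbo.ProperCase (@input varchar(8000))
RETURNS varchar(8000)
AS
BEGIN
    DECLARE @result varchar(8000), @i int, @len int,
            @ch varchar(1), @startOfWord bit;

    IF @input IS NULL RETURN NULL;

    SET @result = '';
    SET @i = 1;
    SET @len = LEN(@input);   -- LEN ignores trailing spaces, so they are dropped
    SET @startOfWord = 1;

    WHILE @i <= @len
    BEGIN
        SET @ch = SUBSTRING(@input, @i, 1);
        SET @result = @result
            + CASE WHEN @startOfWord = 1 THEN UPPER(@ch) ELSE LOWER(@ch) END;
        -- The next character starts a word only if this one is a space.
        SET @startOfWord = CASE WHEN @ch = ' ' THEN 1 ELSE 0 END;
        SET @i = @i + 1;
    END;

    RETURN @result;
END;
GO

-- Usage: list names that differ from their proper-cased form.
SELECT n.name, dbo.ProperCase(n.name) AS expected
FROM (VALUES ('John Doe'), ('JOhn Doe'), ('John doe')) AS n(name)
WHERE n.name <> dbo.ProperCase(n.name) COLLATE Latin1_General_BIN;
```

GO is the batch separator used by SSMS and sqlcmd; it is needed because CREATE FUNCTION must be the only statement in its batch. The usage query should list 'JOhn Doe' and 'John doe' but not 'John Doe'.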

#3


1  

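You can use a regular expression. I'm not a SQL Server whiz, but you want to use RegexMatch. Note that SQL Server 2008 has no built-in regex function, so dbo.RegexMatch has to exist as a user-defined (CLR) function in your database. Something like this: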

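select columnName
from tableName
-- dbo.RegexMatch is assumed to be a user-defined CLR function with case-sensitive matching.
-- Pattern: two uppercase in a row, or a lowercase first letter, or space then lowercase.
where dbo.RegexMatch( columnName, 
        N'[A-Z]{2}|^[a-z]| [a-z]' ) = 1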

#4


0  

If your goal is to update your column so that the first character of each word is capitalized (in your case firstName and lastName), you can use the following query.

Create a sample table with data

Declare @t table (Id int IDENTITY(1,1),Name varchar(50))
insert into @t (name)values ('john doe'),('lohn foe'),('tohnytty noe'),('gohnsdf fgedsfsdf')

Update query

UPDATE @t
SET name =  UPPER(LEFT(SUBSTRING(Name, 1, CHARINDEX(' ', Name) - 1), 1)) + RIGHT(SUBSTRING(Name, 1, CHARINDEX(' ', Name) - 1), LEN(SUBSTRING(Name, 1, CHARINDEX(' ', Name) - 1)) - 1) +
            ' ' +
            UPPER(LEFT(SUBSTRING(Name, CHARINDEX(' ', Name) + 1, 8000), 1)) + RIGHT(SUBSTRING(Name, CHARINDEX(' ', Name) + 1, 8000), LEN(SUBSTRING(Name, CHARINDEX(' ', Name) + 1, 8000)) - 1)
FROM @t

Output

SELECT * FROM @t

Id  Name
1   John Doe
2   Lohn Foe
3   Tohnytty Noe
4   Gohnsdf Fgedsfsdf
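
The update above leaves the remaining letters as they are and raises an invalid-length error on a name with no space, because CHARINDEX returns 0 and the SUBSTRING length becomes negative. A possible variant, offered as a sketch rather than a drop-in replacement, lower-cases the rest of each word and appends a space before searching so that the length can never go negative. The sample rows are made up for illustration.

```sql
-- Hypothetical sample data, including an all-caps name and a one-word name.
DECLARE @t TABLE (Id int IDENTITY(1,1), Name varchar(50));
INSERT INTO @t (Name) VALUES ('john doe'), ('JOHN DOE'), ('madonna');

-- CHARINDEX(' ', Name + ' ') is always >= 1, so no SUBSTRING length is negative.
UPDATE @t
SET Name = UPPER(LEFT(Name, 1))                                            -- first letter of first word
         + LOWER(SUBSTRING(Name, 2, CHARINDEX(' ', Name + ' ') - 1))       -- rest of first word, plus the space
         + UPPER(SUBSTRING(Name, CHARINDEX(' ', Name + ' ') + 1, 1))       -- first letter of second word
         + LOWER(SUBSTRING(Name, CHARINDEX(' ', Name + ' ') + 2, 8000));   -- rest of the name

SELECT * FROM @t;
```

This should turn the three sample rows into 'John Doe', 'John Doe' and 'Madonna'. Like the original, it only capitalizes the first two words; a third word would be left in lowercase.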

#5


0  

I do it this way:

;WITH yourTable AS(
    SELECT 'John Doe' As name
    UNION ALL SELECT 'JOhn Doe'
    UNION ALL SELECT 'JOHN DOE'
    UNION ALL SELECT 'John doe'
    UNION ALL SELECT 'John DoE'
    UNION ALL SELECT 'john Doe'
    UNION ALL SELECT 'jOhn dOe'
    UNION ALL SELECT 'jOHN dOE'
    UNION ALL SELECT 'john doe'
)
SELECT name
FROM (
    SELECT  name,
            LOWER(PARSENAME(REPLACE(name, ' ', '.'), 1)) part2,
            LOWER(PARSENAME(REPLACE(name, ' ', '.'), 2)) part1
    FROM yourTable) t
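-- "=" returns the correctly formatted names; change it to "<>" to list the ones that need cleaning
WHERE name COLLATE Latin1_General_BIN = UPPER(LEFT(part1,1)) + RIGHT(part1, LEN(part1) -1) + 
                                  ' ' + UPPER(LEFT(part2,1)) + RIGHT(part2, LEN(part2) -1)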

Note:
This works only for two-part names; it would need to be improved to handle names with more parts.
