Is there a smarter way to remove all special characters rather than having a series of about 15 nested replace statements?
The following works, but only handles three characters (ampersand, blank and period).
select CustomerID, CustomerName,
Replace(Replace(Replace(CustomerName,'&',''),' ',''),'.','') as CustomerNameStripped
from Customer
5 Solutions
#1
14
One flexible-ish way:
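-- Removes every character matched by @PATTERN (a PATINDEX pattern such as '%[$&.!?]%').
-- Use ALTER FUNCTION instead if the function already exists.
CREATE FUNCTION [dbo].[fnRemovePatternFromString](@BUFFER VARCHAR(MAX), @PATTERN VARCHAR(128)) RETURNS VARCHAR(MAX) AS
BEGIN
    DECLARE @POS INT = PATINDEX(@PATTERN, @BUFFER)
    WHILE @POS > 0 BEGIN
        -- Delete the single matching character, then look for the next match
        SET @BUFFER = STUFF(@BUFFER, @POS, 1, '')
        SET @POS = PATINDEX(@PATTERN, @BUFFER)
    END
    RETURN @BUFFER
END
GO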
select dbo.fnRemovePatternFromString('cake & beer $3.99!?c', '%[$&.!?]%')
(No column name)
cake  beer 399c
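To apply this to the table from the question, a negated character class should strip everything that is not a letter or a digit in a single call. This is only a sketch: it assumes dbo.fnRemovePatternFromString above has been created, and the Customer table and its columns are taken from the question. Depending on the collation, the A-Z range may also match accented letters.

```sql
-- Assumes dbo.fnRemovePatternFromString (above) exists in the current database
SELECT CustomerID, CustomerName,
       dbo.fnRemovePatternFromString(CustomerName, '%[^A-Za-z0-9]%') AS CustomerNameStripped
FROM Customer;
```

Listing the characters to keep, rather than the ones to remove, avoids having to enumerate every possible special character.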
#2
7
Create a function:
CREATE FUNCTION dbo.StripNonAlphaNumerics
(
    @s VARCHAR(255)
)
RETURNS VARCHAR(255)
AS
BEGIN
    DECLARE @p INT = 1, @n VARCHAR(255) = '';
    WHILE @p <= LEN(@s)
    BEGIN
        IF SUBSTRING(@s, @p, 1) LIKE '[A-Za-z0-9]'
        BEGIN
            SET @n += SUBSTRING(@s, @p, 1);
        END
        SET @p += 1;
    END
    RETURN(@n);
END
GO
Then:
SELECT Result = dbo.StripNonAlphaNumerics
('My Customer''s dog & #1 friend are dope, yo!');
Results:
Result
------
MyCustomersdog1friendaredopeyo
To make it more flexible, you could pass in the pattern you want to allow:
-- Same name as the one-argument version: drop that one first, or use ALTER FUNCTION
CREATE FUNCTION dbo.StripNonAlphaNumerics
(
    @s VARCHAR(255),
    @pattern VARCHAR(255)
)
RETURNS VARCHAR(255)
AS
BEGIN
    DECLARE @p INT = 1, @n VARCHAR(255) = '';
    WHILE @p <= LEN(@s)
    BEGIN
        -- Keep the character only if it matches the allowed pattern
        IF SUBSTRING(@s, @p, 1) LIKE @pattern
        BEGIN
            SET @n += SUBSTRING(@s, @p, 1);
        END
        SET @p += 1;
    END
    RETURN(@n);
END
GO
Then:
SELECT r = dbo.StripNonAlphaNumerics
('Bob''s dog & #1 friend are dope, yo!', '[A-Za-z0-9]');
Results:
r
------
Bobsdog1friendaredopeyo
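Because the allowed pattern is a parameter, the same function can keep other character sets as well. A small sketch, assuming the two-argument version above is the one installed; the sample strings are made up for illustration:

```sql
-- Keep digits only (the phone number is an illustrative value)
SELECT DigitsOnly = dbo.StripNonAlphaNumerics('(555) 123-4567', '[0-9]');
-- 5551234567

-- Keep letters, digits and spaces
SELECT WithSpaces = dbo.StripNonAlphaNumerics('Bob''s dog & #1 friend', '[A-Za-z0-9 ]');
```

Note that the pattern is matched against one character at a time, so it should be a single bracketed class rather than a full LIKE expression with % wildcards.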
#3
6
I faced this problem several years ago, so I wrote a SQL function to do the trick. Here is the original article (it was used to scrape text out of HTML). I have since updated the function as follows:
IF (object_id('dbo.fn_CleanString') IS NOT NULL)
BEGIN
    PRINT 'Dropping: dbo.fn_CleanString'
    DROP function dbo.fn_CleanString
END
GO
PRINT 'Creating: dbo.fn_CleanString'
GO
CREATE FUNCTION dbo.fn_CleanString
(
    @string varchar(8000)
)
returns varchar(8000)
AS
BEGIN
---------------------------------------------------------------------------------------------------
-- Title: CleanString
-- Date Created: March 26, 2011
-- Author: William McEvoy
--
-- Description: This function replaces non-alphanumeric ASCII characters in a string with a space.
---------------------------------------------------------------------------------------------------
    declare @char char(1),
            @len int,
            @count int,
            @newstring varchar(8000),
            @replacement char(1)
    select @count = 1,
           @len = 0,
           @newstring = '',
           @replacement = ' '
---------------------------------------------------------------------------------------------------
-- M A I N P R O C E S S I N G
---------------------------------------------------------------------------------------------------
    -- Replace backspace characters
    select @string = replace(@string,char(8),@replacement)
    -- Replace tabs
    select @string = replace(@string,char(9),@replacement)
    -- Replace line feeds
    select @string = replace(@string,char(10),@replacement)
    -- Replace carriage returns
    select @string = replace(@string,char(13),@replacement)
    -- Condense multiple spaces into a single space
    -- This works by changing all double spaces to be OX where O = a space, and X = a special character
    -- then all occurrences of XO are changed to nothing,
    -- then all occurrences of X are changed to nothing, leaving just the O which is actually a single space
    select @string = replace(replace(replace(ltrim(rtrim(@string)),'  ', ' ' + char(7)),char(7)+' ',''),char(7),'')
    -- Parse each character, replace non-alphanumerics with @replacement
    select @len = len(@string)
    WHILE (@count <= @len)
    BEGIN
        -- Examine the character
        select @char = substring(@string,@count,1)
        IF (@char like '[a-z]') or (@char like '[A-Z]') or (@char like '[0-9]')
            select @newstring = @newstring + @char
        ELSE
            select @newstring = @newstring + @replacement
        select @count = @count + 1
    END
    return @newstring
END
GO
IF (object_id('dbo.fn_CleanString') IS NOT NULL)
    PRINT 'Function created.'
ELSE
    PRINT 'Function NOT created.'
GO
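The space-condensing step in the function above is easier to follow in isolation. A minimal sketch, assuming the input never contains CHAR(7) (used here as the marker character); the variable and sample value are illustrative only:

```sql
-- Assumes the input never contains CHAR(7) (the BEL control character)
DECLARE @s varchar(100) = 'a' + SPACE(3) + 'b' + SPACE(4) + 'c' + SPACE(2) + 'd';

SELECT Condensed =
    REPLACE(
        REPLACE(
            REPLACE(@s, SPACE(2), ' ' + CHAR(7)),  -- each pair of spaces becomes space + marker
            CHAR(7) + ' ', ''),                     -- drop every marker + space pair
        CHAR(7), '');                               -- drop the remaining markers
-- a b c d
```

Three nested REPLACE calls handle runs of any length, so no loop is needed for this part.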
#4
1
If you can use SQL CLR, you can use .NET regular expressions for this.
There is a free third-party package that includes this and more: SQL Sharp.
#5
1
I know this is an old thread, but this might still be handy for others. Here's a quick and dirty approach using a recursive CTE (I've done the inverse here: stripping out non-numerics). What makes this one nice for me is that it's an inline table-valued function, so it gets around the nasty RBAR (row-by-agonizing-row) effect of the usual scalar and multi-statement table-valued functions. Adjust the filter as needed to include or exclude whatever character types you want.
Create Function fncV1_iStripAlphasFromData (
    @iString Varchar(max)
)
Returns
Table With Schemabinding
As
Return(
    with RawData as
    (
        Select @iString as iString
    )
    ,
    Anchor as
    (
        Select Case(IsNumeric (substring(iString, 1, 1))) when 1 then substring(iString, 1, 1) else '' End as oString, 2 as CharPos from RawData
        UNION ALL
        Select a.oString + Case(IsNumeric (substring(@iString, a.CharPos, 1))) when 1 then substring(@iString, a.CharPos, 1) else '' End, a.CharPos + 1
        from RawData r
        Inner Join Anchor a on a.CharPos <= len(rtrim(ltrim(@iString)))
    )
    Select top 1 oString from Anchor order by CharPos Desc
)
Go
select * from dbo.fncV1_iStripAlphasFromData ('00000')
select * from dbo.fncV1_iStripAlphasFromData ('00A00')
select * from dbo.fncV1_iStripAlphasFromData ('12345ABC6789!&*0')
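Since this is an inline table-valued function, it would normally be applied to a table with CROSS APPLY. The sketch below assumes the Customer table from the question and the function above. Two caveats worth checking before relying on it: the CTE recurses once per character, so strings longer than roughly 100 characters hit the default recursion limit unless the calling query raises it, and ISNUMERIC also returns 1 for some non-digit characters such as +, - and $, so those would be kept.

```sql
-- Assumes dbo.fncV1_iStripAlphasFromData (above) and the Customer table from the question
SELECT c.CustomerID, c.CustomerName, s.oString AS DigitsOnly
FROM Customer AS c
CROSS APPLY dbo.fncV1_iStripAlphasFromData(c.CustomerName) AS s
OPTION (MAXRECURSION 0);  -- lift the default limit of 100 recursion levels
```

The MAXRECURSION hint has to go on the calling query because an inline function body cannot carry an OPTION clause. Swapping the IsNumeric test for a LIKE '[0-9]' check would be one way to keep digits only.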