Better techniques for trimming leading zeros in SQL Server?

Time: 2022-01-09 15:53:47

I've been using this for some time:

SUBSTRING(str_col, PATINDEX('%[^0]%', str_col), LEN(str_col))

However recently, I've found a problem with columns with all "0" characters like '00000000' because it never finds a non-"0" character to match.
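
A quick way to see the failure, as a sketch against a literal rather than a real column (the variable @str_col is only a stand-in for str_col, and the DECLARE-with-initializer syntax assumes SQL Server 2008 or later): when there is no non-zero character, PATINDEX returns 0, and SUBSTRING with a start position of 0 returns one character fewer than the requested length, so only a single zero is dropped.

```sql
-- Illustration only: @str_col stands in for the str_col column.
DECLARE @str_col VARCHAR(20) = '00000000';

SELECT PATINDEX('%[^0]%', @str_col) AS FirstNonZero,  -- 0: no non-zero character found
       SUBSTRING(@str_col, PATINDEX('%[^0]%', @str_col), LEN(@str_col)) AS Stripped;  -- '0000000' (7 zeros)
```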

An alternative technique I've seen is to use LTRIM:

REPLACE(LTRIM(REPLACE(str_col, '0', ' ')), ' ', '0')

This has a problem if there are embedded spaces, because they will be turned into "0"s when the spaces are turned back into "0"s.
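
For illustration, here is that side effect on a literal with one embedded space (again, @str_col is just a stand-in for the column):

```sql
-- Illustration only: the embedded space comes back as a zero.
DECLARE @str_col VARCHAR(20) = '00123 45';

SELECT REPLACE(LTRIM(REPLACE(@str_col, '0', ' ')), ' ', '0') AS Stripped;  -- '123045', not '123 45'
```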

I'm trying to avoid a scalar UDF. I've found a lot of performance problems with UDFs in SQL Server 2005.

11 Solutions

#1


200  

SUBSTRING(str_col, PATINDEX('%[^0]%', str_col+'.'), LEN(str_col))
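
The appended '.' acts as a sentinel: PATINDEX always finds at least that character, so for an all-zero value the start position lands just past the end of the string and SUBSTRING returns an empty string. A small sketch with made-up sample values (the derived table v is hypothetical, and the VALUES constructor assumes SQL Server 2008 or later):

```sql
-- Illustration only: sample values are invented to exercise the expression.
SELECT v.str_col,
       PATINDEX('%[^0]%', v.str_col + '.') AS FirstNonZero,
       SUBSTRING(v.str_col, PATINDEX('%[^0]%', v.str_col + '.'), LEN(v.str_col)) AS Stripped
FROM (VALUES ('000123'),    -- FirstNonZero 4, Stripped '123'
             ('00000000'),  -- FirstNonZero 9 (the sentinel), Stripped ''
             ('0A0'),       -- FirstNonZero 2, Stripped 'A0'
             ('')           -- FirstNonZero 1, Stripped ''
     ) AS v(str_col);
```

Note that an all-zero value comes back as an empty string rather than '0'; several answers below add a CASE for that.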

#2


25  

Why don't you just cast the value to INTEGER and then back to VARCHAR?

SELECT  CAST(CAST('000000000' AS INTEGER) AS VARCHAR)

--------
       0

#3


11  

Other answers here do not take into account the case where the value is all zeros (or even a single zero).
Some always default an empty string to zero, which is wrong when it is supposed to remain blank.
Re-read the original question. This answers what the questioner wants.

Solution #1:

--This example uses both Leading and Trailing zero's.
--Avoid losing those Trailing zero's and converting embedded spaces into more zeros.
--I added a non-whitespace character ("_") to retain trailing zero's after calling Replace().
--Simply remove the RTrim() function call if you want to preserve trailing spaces.
--If you treat zero's and empty-strings as the same thing for your application,
--  then you may skip the Case-Statement entirely and just use CN.CleanNumber .
DECLARE @WackadooNumber VarChar(50) = ' 0 0123ABC D0 '--'000'--
SELECT WN.WackadooNumber, CN.CleanNumber,
       (CASE WHEN WN.WackadooNumber LIKE '%0%' AND CN.CleanNumber = '' THEN '0' ELSE CN.CleanNumber END)[AllowZero]
 FROM (SELECT @WackadooNumber[WackadooNumber]) AS WN
 OUTER APPLY (SELECT RTRIM(RIGHT(WN.WackadooNumber, LEN(LTRIM(REPLACE(WN.WackadooNumber + '_', '0', ' '))) - 1))[CleanNumber]) AS CN
--Result: "123ABC D0"

Solution #2 (with sample data):

SELECT O.Type, O.Value, Parsed.Value[WrongValue],
       (CASE WHEN CHARINDEX('0', T.Value)  > 0--If there's at least one zero.
              AND LEN(Parsed.Value) = 0--And the trimmed length is zero.
             THEN '0' ELSE Parsed.Value END)[FinalValue],
       (CASE WHEN CHARINDEX('0', T.Value)  > 0--If there's at least one zero.
              AND LEN(Parsed.TrimmedValue) = 0--And the trimmed length is zero.
             THEN '0' ELSE LTRIM(RTRIM(Parsed.TrimmedValue)) END)[FinalTrimmedValue]
  FROM 
  (
    VALUES ('Null', NULL), ('EmptyString', ''),
           ('Zero', '0'), ('Zero', '0000'), ('Zero', '000.000'),
           ('Spaces', '    0   A B C '), ('Number', '000123'),
           ('AlphaNum', '000ABC123'), ('NoZero', 'NoZerosHere')
  ) AS O(Type, Value)--O is for Original.
  CROSS APPLY
  ( --This Step is Optional.  Use if you also want to remove leading spaces.
    SELECT LTRIM(RTRIM(O.Value))[Value]
  ) AS T--T is for Trimmed.
  CROSS APPLY
  ( --From @CadeRoux's Post.
    SELECT SUBSTRING(O.Value, PATINDEX('%[^0]%', O.Value + '.'), LEN(O.Value))[Value],
           SUBSTRING(T.Value, PATINDEX('%[^0]%', T.Value + '.'), LEN(T.Value))[TrimmedValue]
  ) AS Parsed

Results: (screenshot of the query output; the image is not preserved in this copy)

Summary:

You could use what I have above for a one-off removal of leading zeros.
If you plan on reusing it a lot, then place it in an inline table-valued function (ITVF).
Your concerns about performance problems with UDFs are understandable.
However, this problem only applies to scalar functions and multi-statement table-valued functions.
Using ITVFs is perfectly fine.
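
A minimal sketch of what such an ITVF could look like, wrapping the PATINDEX expression from @CadeRoux's post. The function name, the parameter, the output column, and the sample values are all hypothetical, and GO is the batch separator used by SSMS and sqlcmd rather than a T-SQL statement:

```sql
-- Hypothetical inline table-valued function; names are illustrative only.
CREATE FUNCTION dbo.itvf_StripLeadingZeros (@str VARCHAR(128))
RETURNS TABLE
AS
RETURN
(
    -- The appended '.' guarantees PATINDEX finds a non-zero character.
    SELECT SUBSTRING(@str, PATINDEX('%[^0]%', @str + '.'), LEN(@str)) AS Stripped
);
GO

-- Usage: apply the function per row with CROSS APPLY.
SELECT v.str_col, f.Stripped
FROM (VALUES ('000123'),    -- Stripped '123'
             ('00000000'),  -- Stripped ''
             ('0A0')        -- Stripped 'A0'
     ) AS v(str_col)
CROSS APPLY dbo.itvf_StripLeadingZeros(v.str_col) AS f;
```

Because the function body is a single SELECT, the optimizer can expand it into the calling query instead of invoking it row by row.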

I have the same problem with our 3rd-party database.
Many of the alphanumeric fields are entered without the leading zeros, dang humans!
This makes joins impossible without cleaning up the missing leading zeros.

Conclusion:

Instead of removing the leading-zeros, you may want to consider just padding your trimmed-values with leading-zeros when you do your joins.
Better yet, clean up your data in the table by adding leading zeros, then rebuilding your indexes.
I think this would be WAY faster and less complex.

SELECT RIGHT('0000000000' + LTRIM(RTRIM(NULLIF(' 0A10  ', ''))), 10)--0000000A10
SELECT RIGHT('0000000000' + LTRIM(RTRIM(NULLIF('', ''))), 10)--NULL --When Blank.
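
As a rough sketch of the padding idea used in a join: the two derived tables below (o and p) and their column names are invented sample data, and the width of 10 is assumed from the example above.

```sql
-- Illustration only: pad the messy key to a fixed width before comparing.
SELECT o.PartCode AS RawCode, p.PartName
FROM (VALUES ('A10'), (' 0A10 '), ('0000000A10')) AS o(PartCode)
JOIN (VALUES ('0000000A10', 'Widget')) AS p(PartCode, PartName)
  ON p.PartCode = RIGHT('0000000000' + LTRIM(RTRIM(o.PartCode)), 10);
-- All three raw codes pad to '0000000A10' and match the 'Widget' row.
```

Wrapping a column in functions inside the join predicate usually prevents an index seek on that column, which is one reason cleaning the stored data once may be the better option.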

#4


5  

Instead of a space replace the 0's with a 'rare' whitespace character that shouldn't normally be in the column's text. A line feed is probably good enough for a column like this. Then you can LTrim normally and replace the special character with 0's again.
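
One caveat when trying this: by default LTRIM strips only space characters, so a line feed standing in for the zeros would not be trimmed. A variation that keeps the spirit of the idea, sketched here under the assumption that CHAR(10) never occurs in the data, is to park the real spaces on the rare character first, then run the usual zero/space swap, and finally restore the spaces:

```sql
-- Illustration only: assumes CHAR(10) (line feed) never appears in the column.
DECLARE @str_col VARCHAR(50) = '00120 340';

SELECT REPLACE(                                   -- 4) line feeds back to spaces
           REPLACE(                               -- 3) remaining spaces back to zeros
               LTRIM(                             -- 2) trim what used to be leading zeros
                   REPLACE(
                       REPLACE(@str_col, ' ', CHAR(10)),  -- 1a) protect real spaces
                       '0', ' ')),                        -- 1b) zeros become spaces
               ' ', '0'),
           CHAR(10), ' ') AS Stripped;            -- '120 340'
```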

#5


3  

The following will return '0' if the string consists entirely of zeros:

CASE WHEN SUBSTRING(str_col, PATINDEX('%[^0]%', str_col+'.'), LEN(str_col)) = '' THEN '0' ELSE SUBSTRING(str_col, PATINDEX('%[^0]%', str_col+'.'), LEN(str_col)) END AS str_col

#6


2  

This makes a nice Function....

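-- Drop the scalar function only if it already exists, so the script also runs the first time.
IF OBJECT_ID(N'[dbo].[FN_StripLeading]', N'FN') IS NOT NULL
    DROP FUNCTION [dbo].[FN_StripLeading]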
GO
CREATE FUNCTION [dbo].[FN_StripLeading] (@string VarChar(128), @stripChar VarChar(1))
RETURNS VarChar(128)
AS
BEGIN
-- http://*.com/questions/662383/better-techniques-for-trimming-leading-zeros-in-sql-server
    DECLARE @retVal VarChar(128),
            @pattern varChar(10)
    SELECT @pattern = '%[^'+@stripChar+']%'
    SELECT @retVal = CASE WHEN SUBSTRING(@string, PATINDEX(@pattern, @string+'.'), LEN(@string)) = '' THEN @stripChar ELSE SUBSTRING(@string, PATINDEX(@pattern, @string+'.'), LEN(@string)) END
    RETURN (@retVal)
END
GO
GRANT EXECUTE ON [dbo].[FN_StripLeading] TO PUBLIC

#7


1  

cast(value as int) will always work if string is a number

#8


1  

replace(ltrim(replace(TableName.Fieldname, '0', ' ')), ' ', '0')

The suggestion from Thomas G worked for our needs.

The field in our case was already a string and only the leading zeros needed to be trimmed. Mostly it's all numeric, but sometimes there are letters, so the previous INT conversion would crash.

#9


0  

My version of this is an adaptation of Arvo's work, with a little more added on to handle two other cases.

1) If we have all 0s, we should return the digit 0.

2) If we have a blank, we should still return a blank character.

CASE 
    WHEN PATINDEX('%[^0]%', str_col + '.') > LEN(str_col) THEN RIGHT(str_col, 1) 
    ELSE SUBSTRING(str_col, PATINDEX('%[^0]%', str_col + '.'), LEN(str_col))
 END
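
To check the two cases, the expression can be run over a few invented sample values (the derived table v is hypothetical):

```sql
-- Illustration only: all zeros should give '0', an empty string should stay empty.
SELECT v.str_col,
       CASE
           WHEN PATINDEX('%[^0]%', v.str_col + '.') > LEN(v.str_col) THEN RIGHT(v.str_col, 1)
           ELSE SUBSTRING(v.str_col, PATINDEX('%[^0]%', v.str_col + '.'), LEN(v.str_col))
       END AS Stripped
FROM (VALUES ('0000'),    -- '0'
             ('000123'),  -- '123'
             ('')         -- ''
     ) AS v(str_col);
```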

#10


0  

SELECT CAST(CAST('000000000' AS INTEGER) AS VARCHAR)

This has a limit on the length of the string that can be converted to an INT
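
A small sketch of that limit, using TRY_CAST so the failures show up as NULL instead of raising an error. This assumes SQL Server 2012 or later, where TRY_CAST is available; the sample values are invented:

```sql
-- Illustration only: TRY_CAST returns NULL where a plain CAST would fail.
SELECT v.str_col,
       TRY_CAST(v.str_col AS INT)    AS AsInt,
       TRY_CAST(v.str_col AS BIGINT) AS AsBigInt
FROM (VALUES ('000123'),          -- 123, 123
             ('00012345678901'),  -- NULL (too large for INT), 12345678901
             ('000ABC')           -- NULL, NULL (not numeric)
     ) AS v(str_col);
```

BIGINT raises the ceiling but does not remove it, and neither type helps once the column contains letters.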

#11


-1  

Try this:

replace(ltrim(replace(@str, '0', ' ')), ' ', '0')
