How do you count the number of occurrences of a certain substring in a SQL varchar?

Time: 2022-06-17 22:16:30

I have a column that has values formatted like a,b,c,d. Is there a way to count the number of commas in that value in T-SQL?

13 solutions

#1


174  

The first way that comes to mind is to do it indirectly: replace the comma with an empty string and compare the lengths.

Declare @string varchar(1000)
Set @string = 'a,b,c,d'
select len(@string) - len(replace(@string, ',', ''))
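
The question asks about a column rather than a variable, so here is a rough sketch of the same expression applied per row. The table variable @t and its column val are made-up names for illustration only.

```sql
-- Hypothetical sample data; @t and val are illustrative names, not from the question.
DECLARE @t TABLE (val varchar(1000));
INSERT INTO @t (val) VALUES ('a,b,c,d'), ('x'), ('');

-- One comma count per row: original length minus length with commas removed.
SELECT val,
       LEN(val) - LEN(REPLACE(val, ',', '')) AS CommaCount
FROM @t;
```

For 'a,b,c,d' the lengths are 7 and 4, so the count is 3.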

#2


55  

Quick extension of cmsjr's answer that works for search strings of more than one character.

CREATE FUNCTION dbo.CountOccurancesOfString
(
    @searchString nvarchar(max),
    @searchTerm nvarchar(max)
)
RETURNS INT
AS
BEGIN
    return (LEN(@searchString)-LEN(REPLACE(@searchString,@searchTerm,'')))/LEN(@searchTerm)
END

Usage:

SELECT * FROM MyTable
where dbo.CountOccurancesOfString(MyColumn, 'MyString') = 1

#3


12  

You can compare the length of the string with one where the commas are removed:

len(value) - len(replace(value,',',''))

#4


4  

The answer by @cmsjr has a problem in some instances.

His answer was to do this:

Declare @string varchar(1000)
Set @string = 'a,b,c,d'
select len(@string) - len(replace(@string, ',', ''))

This works in most scenarios; however, try running this:

DECLARE @string VARCHAR(1000)
SET @string = 'a,b,c,d ,'
SELECT LEN(@string) - LEN(REPLACE(@string, ',', ''))

REPLACE removes the commas as expected, but the result then ends in a space ('abcd '), and LEN does not count trailing spaces. This results in a returned value of 5 when you'd expect 4. Here is another way to do this which works even in this special scenario:

DECLARE @string VARCHAR(1000)
SET @string = 'a,b,c,d ,'
SELECT LEN(REPLACE(@string, ',', '**')) - LEN(@string)

Note that you don't need to use asterisks. Any two-character replacement will do, as long as it does not end in a space. The idea is that you lengthen the string by one character for each instance of the character you're counting, then subtract the length of the original. It's basically the opposite of the original answer's method, and it doesn't come with the trailing-space side-effect.
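
To see where the extra 1 comes from, it can help to compare LEN with DATALENGTH on the stripped string. This is only a sketch, and it assumes a varchar value in a single-byte collation so that one character is one byte; @stripped is a helper variable introduced here for illustration.

```sql
DECLARE @string   varchar(1000) = 'a,b,c,d ,';
-- @stripped is an illustrative helper; it holds 'abcd ' (the space is still there).
DECLARE @stripped varchar(1000) = REPLACE(@string, ',', '');

SELECT LEN(@stripped)        AS LenWithoutTrailingSpace,            -- 4
       DATALENGTH(@stripped) AS BytesIncludingSpace,                -- 5
       DATALENGTH(@string) - DATALENGTH(@stripped) AS CommaCount;   -- 9 - 5 = 4
```

REPLACE leaves the space in place; it is LEN that skips it, which is why a DATALENGTH-based difference gives 4 here.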

#5


2  

Building on @Andrew's solution, you'll get much better performance using a non-procedural (inline) table-valued function and CROSS APPLY:

SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
/*  Usage:
    SELECT t.[YourColumn], c.StringCount
    FROM YourDatabase.dbo.YourTable t
        CROSS APPLY dbo.CountOccurrencesOfString('your search string',     t.[YourColumn]) c
*/
CREATE FUNCTION [dbo].[CountOccurrencesOfString]
(
    @searchTerm nvarchar(max),
    @searchString nvarchar(max)

)
RETURNS TABLE
AS
    RETURN 
    SELECT (DATALENGTH(@searchString)-DATALENGTH(REPLACE(@searchString,@searchTerm,'')))/NULLIF(DATALENGTH(@searchTerm), 0) AS StringCount
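
A small sketch of the arithmetic inside that function, run as a standalone batch. DATALENGTH counts bytes, and nvarchar uses two bytes per character here, but because the numerator and the divisor are both in bytes the factor cancels out. The variable @s and the derived table v are illustrative names only.

```sql
DECLARE @s nvarchar(max) = N'a,b,c,d ,';   -- 9 characters, 18 bytes

-- Evaluate the same expression for a one-character term and for an empty term.
SELECT v.term,
       DATALENGTH(v.term) AS TermBytes,
       (DATALENGTH(@s) - DATALENGTH(REPLACE(@s, v.term, N'')))
           / NULLIF(DATALENGTH(v.term), 0) AS StringCount
FROM (VALUES (N','), (N'')) AS v(term);
```

For the comma this works out to (18 - 10) / 2 = 4. For the empty term, NULLIF turns the zero divisor into NULL, so the result is NULL rather than a divide-by-zero error.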

#6


1  

Declare @string varchar(1000)

DECLARE @SearchString varchar(100)

Set @string = 'as as df df as as as'

SET @SearchString = 'as'

select ((len(@string) - len(replace(@string, @SearchString, ''))) -(len(@string) - 
        len(replace(@string, @SearchString, ''))) % 2)  / len(@SearchString)

#7


1  

The accepted answer is correct; this extends it to handle substrings of two or more characters:

Declare @string varchar(1000)
Declare @substring varchar(100)
Set @string = 'aa,bb,cc,dd'
Set @substring = 'aa'
select (len(@string) - len(replace(@string, @substring, '')))/len(@substring)

#8


0  

DECLARE @records varchar(400)
SELECT @records = 'a,b,c,d'
select  LEN(@records) as 'Length Before Removing Commas' , LEN(@records) - LEN(REPLACE(@records, ',', '')) as 'Number of Commas'

#9


0  

I think Darrel Lee has a pretty good answer. Replace CHARINDEX() with PATINDEX(), and you can do some weak regex-style searching (LIKE wildcards, not true regular expressions) along a string, too...

For example, say you use this for @pattern:

set @pattern='%[-.|!,'+char(9)+']%'

Why would you maybe want to do something crazy like this?

Say you're loading delimited text strings into a staging table, where the field holding the data is something like a varchar(8000) or nvarchar(max)...

Sometimes it's easier/faster to do ELT (Extract-Load-Transform) with data rather than ETL (Extract-Transform-Load), and one way to do this is to load the delimited records as-is into a staging table, especially if you want a simpler way to see the exceptional records rather than deal with them as part of an SSIS package...but that's a holy war for a different thread.
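
One thing to keep in mind: unlike CHARINDEX, PATINDEX takes no start-position argument, so a counting loop has to shorten the string after each match. A minimal sketch of that idea, using the @pattern from above and a made-up sample value in @text (the variable names @text, @count and @pos are illustrative):

```sql
DECLARE @pattern varchar(50)   = '%[-.|!,' + CHAR(9) + ']%';
DECLARE @text    varchar(8000) = 'a,b|c.d!e';   -- hypothetical staged record
DECLARE @count   int = 0;
DECLARE @pos     int = PATINDEX(@pattern, @text);

WHILE @pos > 0
BEGIN
    SET @count = @count + 1;
    -- Drop everything up to and including the match, then search the remainder.
    SET @text = SUBSTRING(@text, @pos + 1, 8000);
    SET @pos  = PATINDEX(@pattern, @text);
END;

SELECT @count AS DelimiterCount;   -- 4 for the sample value
```

Each pass finds one of the four delimiters in 'a,b|c.d!e' and cuts the string just after it, so the loop ends once only 'e' is left.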

#10


0  

The following should do the trick for both single character and multiple character searches:

CREATE FUNCTION dbo.CountOccurrences
(
   @SearchString VARCHAR(1000),
   @SearchFor    VARCHAR(1000)
)
RETURNS TABLE
AS
   RETURN (
             SELECT COUNT(*) AS Occurrences
             FROM   (
                       SELECT ROW_NUMBER() OVER (ORDER BY O.object_id) AS n
                       FROM   sys.objects AS O
                    ) AS N
                    JOIN (
                            VALUES (@SearchString)
                         ) AS S (SearchString)
                         ON
                         SUBSTRING(S.SearchString, N.n, LEN(@SearchFor)) = @SearchFor
          );
GO

---------------------------------------------------------------------------------------
-- Test the function for single and multiple character searches
---------------------------------------------------------------------------------------
DECLARE @SearchForComma      VARCHAR(10) = ',',
        @SearchForCharacters VARCHAR(10) = 'de';

DECLARE @TestTable TABLE
(
   TestData VARCHAR(30) NOT NULL
);

INSERT INTO @TestTable
     (
        TestData
     )
VALUES
     ('a,b,c,de,de ,d e'),
     ('abc,de,hijk,,'),
     (',,a,b,cde,,');

SELECT TT.TestData,
       CO.Occurrences AS CommaOccurrences,
       CO2.Occurrences AS CharacterOccurrences
FROM   @TestTable AS TT
       OUTER APPLY dbo.CountOccurrences(TT.TestData, @SearchForComma) AS CO
       OUTER APPLY dbo.CountOccurrences(TT.TestData, @SearchForCharacters) AS CO2;

The function can be simplified a bit using a table of numbers (dbo.Nums):

   RETURN (
             SELECT COUNT(*) AS Occurrences
             FROM   dbo.Nums AS N
                    JOIN (
                            VALUES (@SearchString)
                         ) AS S (SearchString)
                         ON
                         SUBSTRING(S.SearchString, N.n, LEN(@SearchFor)) = @SearchFor
          );
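
The answer does not show how dbo.Nums is built, so the following is only one possible sketch. It assumes a single integer column named n (matching the N.n reference above) and fills it with 1 to 1000, enough for every position of a VARCHAR(1000) argument. Note that the sys.objects version above can only test as many positions as sys.objects has rows, which is another reason to prefer a dedicated numbers table.

```sql
-- Assumed shape of the numbers table: one integer column n holding 1, 2, 3, ...
CREATE TABLE dbo.Nums (n int NOT NULL PRIMARY KEY);

-- Fill it with 1..1000. Cross-joining a system catalog view is just a
-- convenient way to get plenty of rows to number.
INSERT INTO dbo.Nums (n)
SELECT TOP (1000) ROW_NUMBER() OVER (ORDER BY (SELECT NULL))
FROM sys.all_columns AS a
     CROSS JOIN sys.all_columns AS b;
```

Because this approach tests every start position, it counts overlapping matches as well, which the REPLACE-based answers do not.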

#11


-1  

You can use the following stored procedure to fetch the comma-separated values.

IF  EXISTS (SELECT * FROM sys.objects 
WHERE object_id = OBJECT_ID(N'[dbo].[sp_parsedata]') AND type in (N'P', N'PC'))
    DROP PROCEDURE [dbo].[sp_parsedata]
GO
create procedure sp_parsedata
(@cid integer,@st varchar(1000))
as
  declare @coid integer
  declare @c integer
  declare @c1 integer
  select @c1=len(@st) - len(replace(@st, ',', ''))
  set @c=0
  delete from table1 where complainid=@cid;
  while (@c<=@c1)
    begin
      if (@c<@c1) 
        begin
          select @coid=cast(replace(left(@st,CHARINDEX(',',@st,1)),',','') as integer)
          select @st=SUBSTRING(@st,CHARINDEX(',',@st,1)+1,LEN(@st))
        end
      else
        begin
          select @coid=cast(@st as integer)
        end
      insert into table1(complainid,courtid) values(@cid,@coid)
      set @c=@c+1
    end

#12


-1  

The Replace/Len test is cute, but probably very inefficient (especially in terms of memory). A simple function with a loop will do the job.

CREATE FUNCTION [dbo].[fn_Occurences] 
(
    @pattern varchar(255),
    @expression varchar(max)
)
RETURNS int
AS
BEGIN

    DECLARE @Result int = 0;

    DECLARE @index BigInt = 0
    DECLARE @patLen int = len(@pattern)

    SET @index = CHARINDEX(@pattern, @expression, @index)
    While @index > 0
    BEGIN
        SET @Result = @Result + 1;
        SET @index = CHARINDEX(@pattern, @expression, @index + @patLen)
    END

    RETURN @Result

END

#13


-3  

Perhaps you should not store data that way. It is bad practice to store a comma-delimited list in a field. It is very inefficient for querying. This should be a related table.
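
As a rough illustration of what a related table could look like, here is a sketch using a table variable. The names @ItemValues, ItemId and ItemValue are hypothetical and not taken from the question.

```sql
-- Hypothetical child table: one row per list element instead of one delimited string.
DECLARE @ItemValues TABLE (
    ItemId    int          NOT NULL,   -- key of the parent row (illustrative)
    ItemValue varchar(100) NOT NULL    -- a single element such as 'a'
);

INSERT INTO @ItemValues (ItemId, ItemValue)
VALUES (1, 'a'), (1, 'b'), (1, 'c'), (1, 'd'), (2, 'x');

-- Counting elements becomes a plain aggregate; no string parsing is needed.
SELECT ItemId, COUNT(*) AS ValueCount
FROM @ItemValues
GROUP BY ItemId;
```

With this layout the value 'a,b,c,d' becomes four rows for ItemId 1, and the comma count from the question is simply ValueCount minus one.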
