Using T-SQL, return the nth delimited element from a string

Time: 2021-07-26 08:47:10

I need to create a function that will return the nth element of a delimited string.

For a data migration project, I am converting JSON audit records stored in a SQL Server database into a structured report using a SQL script. The goal is to deliver a SQL script, and a SQL function used by the script, without any code.

(This is a short-term fix that will be used while a new auditing feature is added to the ASP.NET/MVC application.)

There is no shortage of delimited-string-to-table examples available. I've chosen a Common Table Expression example: http://www.sqlperformance.com/2012/07/t-sql-queries/split-strings

Example: I want to return 67 from '1,222,2,67,888,1111'

8 solutions

#1


14  

This is the easiest answer to retrieve the 67 (type-safe!!):

SELECT CAST('<x>' + REPLACE('1,222,2,67,888,1111',',','</x><x>') + '</x>' AS XML).value('/x[4]','int')

This question is not about a string split approach, but about how to get the nth element. The easiest, fully inlineable way would be this IMO:

This is a real one-liner to get part 2 delimited by a space:

DECLARE @input NVARCHAR(100)=N'part1 part2 part3';
SELECT CAST(N'<x>' + REPLACE(@input,N' ',N'</x><x>') + N'</x>' AS XML).value('/x[2]','nvarchar(max)')

Of course you can use variables for delimiter and position (use sql:column to retrieve the position directly from a query's value):

DECLARE @dlmt NVARCHAR(10)=N' ';
DECLARE @pos INT = 2;
SELECT CAST(N'<x>' + REPLACE(@input,@dlmt,N'</x><x>') + N'</x>' AS XML).value('/x[sql:variable("@pos")][1]','nvarchar(max)')
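
As a rough sketch of the sql:column variant mentioned above, the position can come from a column of the same row instead of a variable. The table variable @t and its columns (id, list, pos) are made up for illustration and are not part of the original answer:

```sql
-- Hypothetical sample data: each row carries its own list and the wanted position.
DECLARE @t TABLE (id INT, list NVARCHAR(100), pos INT);
INSERT INTO @t (id, list, pos)
VALUES (1, N'1,222,2,67,888,1111', 4),
       (2, N'a,b,c', 2);

-- sql:column("pos") reads the position from the current row of @t.
SELECT id,
       CAST(N'<x>' + REPLACE(list, N',', N'</x><x>') + N'</x>' AS XML).value('/x[sql:column("pos")][1]', 'nvarchar(max)') AS nth_element
FROM @t;
```

The trailing [1] is there for the same reason as in the sql:variable version: the engine cannot prove statically that a computed position yields a single node.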

If your string might include forbidden characters, you can still do it this way. Just apply FOR XML PATH to your string first, which implicitly replaces all forbidden characters with the fitting escape sequences.

It's a very special case if, additionally, your delimiter is the semicolon, because the escape sequences themselves (such as &lt; and &amp;) end in a semicolon. In this case I first replace the delimiter with '#DLMT#', and finally replace that with the XML tags:

SET @input=N'Some <, > and &;Other äöü@€;One more';
SET @dlmt=N';';
SELECT CAST(N'<x>' + REPLACE((SELECT REPLACE(@input,@dlmt,'#DLMT#') AS [*] FOR XML PATH('')),N'#DLMT#',N'</x><x>') + N'</x>' AS XML).value('/x[sql:variable("@pos")][1]','nvarchar(max)');

#2


6  

Here is my initial solution... It is based on work by Aaron Bertrand http://www.sqlperformance.com/2012/07/t-sql-queries/split-strings

I simply changed the return type to make it a scalar function.

Example: SELECT dbo.GetSplitString_CTE('1,222,2,67,888,1111',',',4)

CREATE FUNCTION dbo.GetSplitString_CTE
(
   @List       VARCHAR(MAX),
   @Delimiter  VARCHAR(255),
   @ElementNumber int
)
RETURNS VARCHAR(4000)
AS
BEGIN

   DECLARE @result varchar(4000)    
   DECLARE @Items TABLE ( position int IDENTITY PRIMARY KEY,
                          Item VARCHAR(4000)
                         )  

   DECLARE @ll INT = LEN(@List) + 1, @ld INT = LEN(@Delimiter);  

   WITH a AS
   (
       SELECT
           [start] = 1,
           [end]   = COALESCE(NULLIF(CHARINDEX(@Delimiter, 
                       @List, @ld), 0), @ll),
           [value] = SUBSTRING(@List, 1, 
                     COALESCE(NULLIF(CHARINDEX(@Delimiter, 
                       @List, @ld), 0), @ll) - 1)
       UNION ALL
       SELECT
           [start] = CONVERT(INT, [end]) + @ld,
           [end]   = COALESCE(NULLIF(CHARINDEX(@Delimiter, 
                       @List, [end] + @ld), 0), @ll),
           [value] = SUBSTRING(@List, [end] + @ld, 
                     COALESCE(NULLIF(CHARINDEX(@Delimiter, 
                       @List, [end] + @ld), 0), @ll)-[end]-@ld)
       FROM a
       WHERE [end] < @ll
   )
   INSERT @Items SELECT [value]
   FROM a
   WHERE LEN([value]) > 0
   OPTION (MAXRECURSION 0);

   SELECT @result=Item
   FROM @Items
   WHERE position=@ElementNumber

   RETURN @result;
END
GO

#3


4  

How about:

CREATE FUNCTION dbo.NTH_ELEMENT (@Input NVARCHAR(MAX), @Delim CHAR = '-', @N INT = 0)
RETURNS NVARCHAR(MAX)
AS
BEGIN
RETURN (SELECT VALUE FROM STRING_SPLIT(@Input, @Delim) ORDER BY (SELECT NULL) OFFSET @N ROWS FETCH NEXT 1 ROW ONLY)
END
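
One caveat: STRING_SPLIT (available from SQL Server 2016) does not guarantee the order of its output rows, so ORDER BY (SELECT NULL) may not return the elements in list order. A minimal sketch of an order-safe variant, assuming SQL Server 2022 or Azure SQL Database, where STRING_SPLIT accepts a third enable_ordinal argument. The variable names are only for illustration, and @N is 1-based here rather than the 0-based offset used above:

```sql
DECLARE @Input NVARCHAR(MAX) = N'1,222,2,67,888,1111';
DECLARE @N INT = 4;               -- 1-based position of the wanted element
DECLARE @Result NVARCHAR(MAX);

-- The third argument (1) turns on the ordinal column, which numbers the
-- substrings in the order they appear in the input string.
SELECT @Result = value
FROM STRING_SPLIT(@Input, N',', 1)
WHERE ordinal = @N;

SELECT @Result AS NthElement;     -- 67 for the sample input
```

On older versions the ordinal column is not available, so one of the other approaches in this thread is the safer choice there.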

#4


1  

In a rare moment of lunacy I just thought that split is far easier if we use XML to parse it out for us:

(Using the variables from @Gary Kindel's answer)

declare @xml xml
set @xml = '<split><el>' + replace(@List,@Delimiter,'</el><el>') + '</el></split>'

select
    el = split.el.value('.','varchar(max)')
from  @xml.nodes('/split/el') split(el)

This lists all elements of the string, split by the specified character.

We can use an XPath test to filter out empty values, and a further XPath test to restrict this to the element we're interested in. In full, Gary's function becomes:

alter FUNCTION dbo.GetSplitString_CTE
(
   @List       VARCHAR(MAX),
   @Delimiter  VARCHAR(255),
   @ElementNumber int
)
RETURNS VARCHAR(max)
AS
BEGIN

       declare @xml xml
       set @xml = '<split><el>' + replace(@List,@Delimiter,'</el><el>') + '</el></split>'

       declare @ret varchar(max)
       set @ret = (select
              el = split.el.value('.','varchar(max)')
       from  @xml.nodes('/split/el[string-length(.)>0][position() = sql:variable("@ElementNumber")]') split(el))

       return @ret

END

#5


1  

You can put this SELECT into a UFN (user-defined function). If needed, you can customize it to take the delimiter as well; in that case your UFN will have two inputs: the position N and the delimiter to use.

    DECLARE @tlist varchar(max)='10,20,30,40,50,60,70,80,90,100'
    DECLARE @i INT=1, @nth INT=3
    While len(@tlist) <> 0
    BEGIN
            IF @i=@nth
            BEGIN
              select Case when charindex(',',@tlist) <> 0 Then LEFT(@tlist,charindex(',',@tlist)-1)
                          Else @tlist
                    END
            END

              Select @tlist = Case when charindex(',',@tlist) <> 0 Then substring(@tlist,charindex(',',@tlist)+1,len(@tlist))
                          Else ''
                          END

            SELECT @i=@i+1
    END

#6


0  

I cannot comment on Gary's solution because of my low reputation.

I know Gary was referencing another link.

I have struggled to understand why we need this variable:

@ld INT = LEN(@Delimiter)

I also don't understand why CHARINDEX has to start at the position given by the delimiter length, @ld.

I tested many examples with a single-character delimiter, and they work. Most of the time, the delimiter is a single character. However, since the developer included @ld as the length of the delimiter, the code should also work for delimiters that have more than one character.

In that case, the following input will fail:

11,,,22,,,33,,,44,,,55,,,

I adapted the code from this link: http://codebetter.com/raymondlewallen/2005/10/26/quick-t-sql-to-parse-a-delimited-string/

I have tested various scenarios, including delimiters that have more than one character.

alter FUNCTION [dbo].[split1]
(
    @string1 VARCHAR(8000) -- List of delimited items
    , @Delimiter VARCHAR(40) = ',' -- delimiter that separates items
    , @ElementNumber int
)
RETURNS varchar(8000)
AS
BEGIN
    declare @position int
    declare @piece varchar(8000)=''
    declare @returnVal varchar(8000)=''
    declare @Pattern varchar(50) = '%' + @Delimiter + '%'
    declare @counter int =0
    declare @ld int = len(@Delimiter)
    declare @ls1 int = len (@string1)
    declare @foundit int = 0

    if patindex(@Pattern , @string1) = 0
        return  ''

    if right(rtrim(@string1), @ld) <> @Delimiter -- compare the full delimiter length, not just one character
        set @string1 = @string1  + @Delimiter

    set @position =  patindex(@Pattern , @string1) + @ld  -1  
    while @position > 0
    begin
        set @counter = @counter +1 
        set @ls1  = len (@string1)
        if (@ls1 >= @ld)
            set @piece = left(@string1, @position - @ld)
        else
            break
        if (@counter = @ElementNumber)
        begin
            set @foundit = 1
                break
        end
        if len(@string1) > 0
        begin
            set @string1 = stuff(@string1, 1, @position, '')
            set @position =  patindex(@Pattern , @string1) + @ld  -1  
        end
        else
        set @position = -1
    end 


    if @foundit =1
        set @returnVal = @piece
    else 
        set @returnVal =  ''
    return @returnVal
END

#7


0  

@a - the value (e.g. 'a/bb/ccc/dddd/ee/ff/....')

@p - the desired position (1,2,3...)

@d - the delimiter ('/')

trim(substring(replace(@a,@d,replicate(' ',len(@a))),(@p-1)*len(@a)+1,len(@a)))

The only problem is that if the desired part has leading or trailing blanks, they get trimmed.
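
As a worked example of the expression above (a sketch only; it assumes SQL Server 2017 or later for TRIM, and the @Result variable is added just to hold the output): every delimiter is replaced by LEN(@a) spaces, so element @p always falls inside the window that starts at (@p-1)*LEN(@a)+1 and is LEN(@a) characters wide.

```sql
DECLARE @a VARCHAR(100) = 'a/bb/ccc/dddd/ee/ff';  -- the value
DECLARE @p INT = 3;                               -- the desired position
DECLARE @d VARCHAR(1) = '/';                      -- the delimiter
DECLARE @Result VARCHAR(100);

-- LEN(@a) is 19, so the padded string is 'a' + 19 spaces + 'bb' + 19 spaces + 'ccc' + ...
-- The window for @p = 3 covers characters 39 to 57: three spaces, 'ccc', then spaces.
SET @Result = TRIM(SUBSTRING(REPLACE(@a, @d, REPLICATE(' ', LEN(@a))),
                             (@p - 1) * LEN(@a) + 1,
                             LEN(@a)));

SELECT @Result AS NthElement;  -- ccc
```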

Based entirely on the article at https://exceljet.net/formula/split-text-with-delimiter

#8


-1  

I don't have enough reputation to comment, so I am adding an answer. Please adjust as appropriate.

I have a problem with Gary Kindel's answer in cases where there is nothing between two delimiters.

If you run SELECT dbo.GetSplitString_CTE('abc^def^^ghi','^',3), you get ghi instead of an empty string.

If you comment out the WHERE LEN([value]) > 0 line, you get the desired result.
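
For comparison, here is a small sketch that runs the same input through the XML approach from answer #1. It does not filter out empty elements, so position 3 should come back as an empty string and position 4 as ghi. The variable names are only for illustration:

```sql
DECLARE @input VARCHAR(100) = 'abc^def^^ghi';

-- Wrap every element in <x>...</x>; the empty slot becomes an empty <x> element.
DECLARE @x XML = CAST('<x>' + REPLACE(@input, '^', '</x><x>') + '</x>' AS XML);

DECLARE @third  VARCHAR(100) = @x.value('/x[3]', 'varchar(100)');
DECLARE @fourth VARCHAR(100) = @x.value('/x[4]', 'varchar(100)');

SELECT @third AS third_element, @fourth AS fourth_element;
```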
