Using SQL PATINDEX to extract substrings of varying sizes from a string

Date: 2022-09-13 14:05:01

I'm trying to extract ###x###, ###x##, and sometimes #x#. Sometimes there may be a space between the numbers and the x. Essentially, I may run into strings like

  • 720x60
  • 720x600
  • 720 x 60
  • 720 x 60

  • 720_x_60
  • 1x1

I use PATINDEX() to find the first occurrence of the pattern '%[0-9]%x%[0-9]%'. So far so good. Then I use PATINDEX() to find the first occurrence of a non-digit character after that. This is where I have trouble. I get results as in the screenshot. Code is also below.

SELECT *
    ,CASE WHEN StartInt > 0
        THEN SUBSTRING(Placement, StartInt, SizeLength) ELSE NULL END AS PlacementSize
FROM
(SELECT Placement
    --find the first occurrence of #*x*#
    ,PATINDEX('%[0-9]%x%[0-9]%',Placement) AS StartInt

    --find the first non-digit after that
    ,PATINDEX(
        '%[^0-9]%'
        ,RIGHT(
            Placement + '_' --this underscore adds at least one non-digit to find
            ,LEN(Placement)
                -
            PATINDEX('%[0-9]%x%[0-9]%',Placement) - 5
            )
        ) + 6 AS SizeLength
FROM [Staging].[Client].[A01_FY14_Reporting_staging]
WHERE [Date] > '2014-07-01') AS a

Results:

1 Solution

#1


4  

If you're dealing with a pair of numeric values, but are also dealing with dirty data, and lack the power of Regex, here's what you can do in TSQL.

Essentially, it looks like you want to break the string in half at 'x', then whittle down the outputs until you have numeric-only values. Using a set of derived tables, this becomes relatively easy (and not as hard to read).

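-- Sample data
DECLARE @placements TABLE (Placement varchar(10));
INSERT INTO @placements VALUES
('720x60'),
('720x600'),
('720 x 60'),
('720_x_60'),
('1x1');

-- Outermost query: keep the digits at the start of LeftOfX and the digits at the end of RightOfX
SELECT LEFT(LeftOfX, PATINDEX('%[^0-9]%', LeftOfX) - 1) + 'x' + RIGHT(RightOfX, LEN(RightOfX) - PATINDEX('%[0-9]%', RightOfX) + 1) AS PlacementSize
FROM (
    -- Trim LeftOfX so it starts at its first digit and RightOfX so it ends at its last digit
    SELECT RIGHT(LeftOfX, LEN(LeftOfX) - PATINDEX('%[0-9]%', LeftOfX) + 1) AS LeftOfX
         , LEFT(RightOfX, LEN(RightOfX) - PATINDEX('%[0-9]%', REVERSE(RightOfX)) + 1) AS RightOfX
    FROM (
        -- Split at the first 'x'; both halves keep the 'x' itself
        SELECT LEFT(p.Placement, x) AS LeftOfX
             , RIGHT(p.Placement, LEN(p.Placement) - x + 1) AS RightOfX
        FROM (
            -- Locate the first 'x' in each placement
            SELECT
                  p.Placement
                , CHARINDEX('x', p.Placement) AS x
            FROM @placements p
            ) p
        ) p
    ) p;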

Here's the SQLFiddle example.

First, select your placement, the location of your 'x' in Placement, and other columns you want from the table. Pass the other columns up through the derived tables.
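
As a rough sketch of what "passing the other columns up" can look like, the snippet below carries one extra column through two levels of derived tables. The PlacementId column and the two sample rows are assumptions made for illustration; substitute whatever columns your own table has.

```sql
-- PlacementId is a hypothetical extra column, not part of the original table.
DECLARE @placements TABLE (PlacementId int, Placement varchar(50));
INSERT INTO @placements VALUES (1, '720x60'), (2, '720 x 60');

SELECT p.PlacementId, p.Placement, p.LeftOfX, p.RightOfX
FROM (
    -- Each level must list PlacementId again, otherwise it is not visible further out.
    SELECT p.PlacementId
         , p.Placement
         , LEFT(p.Placement, p.x) AS LeftOfX
         , RIGHT(p.Placement, LEN(p.Placement) - p.x + 1) AS RightOfX
    FROM (
        SELECT p.PlacementId
             , p.Placement
             , CHARINDEX('x', p.Placement) AS x
        FROM @placements p
        ) p
    ) p;
```

Any column that is not repeated in a derived table's SELECT list is lost from that level outward, so the list has to be maintained at every level.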

Next, split the string into left and right halves at the 'x'.
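
To see the split on its own, here is a minimal sketch that runs it against a single value held in a variable. The variable names are illustrative only; the expressions are the same LEFT/RIGHT calls used in the query above.

```sql
DECLARE @Placement varchar(50) = '720 x 60';
DECLARE @x int = CHARINDEX('x', @Placement);   -- position of the first 'x' (5 here)

SELECT
      LEFT(@Placement, @x) AS LeftOfX                         -- '720 x'
    , RIGHT(@Placement, LEN(@Placement) - @x + 1) AS RightOfX; -- 'x 60'
```

Note that both halves keep the 'x'. On the left half this guarantees there is at least one non-digit character for the later PATINDEX('%[^0-9]%', ...) call to find.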

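Process the left and right halves in two more queries. The first trims the left half so that it starts at its first digit and the right half so that it ends at its last digit. The second then cuts the left half off at its first non-numeric character and keeps the right half from its first digit onward.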
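
The two trimming passes can be traced on one pair of halves. This is only a sketch: the input values (with a made-up 'Banner_' prefix and '_Home' suffix) and the variable names are assumptions chosen to show why both passes are needed.

```sql
-- Hypothetical halves, as if the placement had been 'Banner_720 x 60_Home'.
DECLARE @LeftOfX  varchar(50) = 'Banner_720 x';
DECLARE @RightOfX varchar(50) = 'x 60_Home';

-- Pass 1: left half starts at its first digit, right half ends at its last digit.
DECLARE @LeftTrimmed varchar(50) =
    RIGHT(@LeftOfX, LEN(@LeftOfX) - PATINDEX('%[0-9]%', @LeftOfX) + 1);               -- '720 x'
DECLARE @RightTrimmed varchar(50) =
    LEFT(@RightOfX, LEN(@RightOfX) - PATINDEX('%[0-9]%', REVERSE(@RightOfX)) + 1);    -- 'x 60'

-- Pass 2: keep the digits before the first non-digit (left) and from the first digit on (right).
SELECT LEFT(@LeftTrimmed, PATINDEX('%[^0-9]%', @LeftTrimmed) - 1)
     + 'x'
     + RIGHT(@RightTrimmed, LEN(@RightTrimmed) - PATINDEX('%[0-9]%', @RightTrimmed) + 1)
     AS PlacementSize;                                                                -- '720x60'
```

REVERSE is what lets PATINDEX, which only searches from the left, locate the last digit of the right half.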

EDIT: Fixed the outputs, both numbers now selected.
