I have a string here: "This is a string: AAA123456789".
The idea is to extract the substring AAA123456789 using a regex.
I am using this together with XPath.
Note: If there is already a post about this, kindly point me to it.
I think I should be using something like substring(myNode, [^AAA\d+{9}]), but I am not really sure about the regex part.
The idea is to extract the string that starts with "AAA" and is followed by digits only, and exactly 9 consecutive digits.
4 answers
#1
9
Pure XPath solution:
substring-after('This is a string: AAA123456789', ': ')
produces:
AAA123456789
XPath 2.0 solutions:
tokenize('This is a string: AAA123456789 but not an double',
' '
)[starts-with(., 'AAA')]
or:
tokenize('This is a string: AAA123456789 but not an double',
' '
)[matches(., 'AAA\d+')]
or:
replace('This is a string: AAA123456789 but not an double',
'^.*?(A+\d+).*$',
'$1'
)
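For anyone who wants to try the tokenize and replace ideas outside an XPath 2.0 processor, here is a rough Python analogue. It assumes Python's re module behaves like the XPath regex dialect for these simple patterns, and the function names are made up for illustration. It also shows that it matters whether the leading .* in the replace pattern is greedy or reluctant: a greedy .* swallows all but the last "A" before the digits.

```python
import re

TEXT = "This is a string: AAA123456789 but not an double"


def tokenize_and_filter(text: str) -> list[str]:
    """Split on single spaces and keep tokens containing AAA followed by digits.

    XPath matches() succeeds when any part of the token matches,
    which corresponds to re.search rather than re.fullmatch.
    """
    return [tok for tok in text.split(" ") if re.search(r"AAA\d+", tok)]


def replace_greedy(text: str) -> str:
    """Greedy leading .* leaves only one 'A' for the capture group."""
    return re.sub(r"^.*(A+\d+).*$", r"\1", text)


def replace_reluctant(text: str) -> str:
    """Reluctant leading .*? lets the group start at the first 'A'."""
    return re.sub(r"^.*?(A+\d+).*$", r"\1", text)


if __name__ == "__main__":
    print(tokenize_and_filter(TEXT))  # ['AAA123456789']
    print(replace_greedy(TEXT))       # A123456789
    print(replace_reluctant(TEXT))    # AAA123456789
```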
#2
4
Alright, after going through the answers and comments from the wonderful people here, this is the solution I opted for:
concat("AAA", substring(substring-after(., "AAA"), 1, 9))
First, substring-after(., "AAA") returns everything that follows the first "AAA" in the string. Then substring(..., 1, 9) keeps characters 1 to 9 of that result, and anything beyond is ignored. Because "AAA" is used as the reference point, it is not part of the result, so I concatenate "AAA" back onto the front of the value. In other words, I get the first 9 characters after AAA (the digits) and then put AAA in front, since it is static data.
This keeps the result correct no matter what other content surrounds it in the string.
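To make the steps concrete, here is a small Python sketch that mimics the XPath 1.0 expression above. The helper names substring_after and extract_code are mine, not part of any library. Two caveats show up when it is written out: the 9 characters are not checked to be digits, and if "AAA" is missing the result is just "AAA", which is also what the XPath expression returns in that case.

```python
def substring_after(text: str, marker: str) -> str:
    """Mimic XPath substring-after(): text following the first marker, or ''."""
    idx = text.find(marker)
    return text[idx + len(marker):] if idx >= 0 else ""


def extract_code(text: str) -> str:
    """Mimic concat("AAA", substring(substring-after(., "AAA"), 1, 9))."""
    # XPath substring(s, 1, 9) is 1-based; in Python that is the slice [:9].
    return "AAA" + substring_after(text, "AAA")[:9]


if __name__ == "__main__":
    print(extract_code("This is a string: AAA123456789"))
    print(extract_code("prefix AAA1234567890123 suffix"))
    print(extract_code("no marker here"))
```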
But I like the regex by @Dimitre, the replace part. The tokenize one less so, because what if there is no space to split on? The replace with a regex is wonderful too. Thanks.
And thanks to the rest of you out there too...
#3
1
First, I'm pretty sure you don't mean to have the [^ ... ]. That defines a "negated character class", i.e. your current regex says, "Give me a single character that is not one of the following: A, a digit, +, { or }". You probably meant, plainly, "AAA(\d{9})". Now, according to this handy website, XPath does support capture groups, as well as backreferences, so take your pick:
"AAA(\d{9})"
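And extracting $1, the first capture group, or:
"(?<=AAA)\d{9}"
And taking the whole match ($0). Note that the lookbehind form only works if the underlying regex engine supports lookbehind; the XPath 2.0 regex syntax itself does not define it.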
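Here is a quick Python sketch of the two variants, for comparison. Python's re module supports lookbehind, which is an assumption that need not hold for a given XPath processor, and the function names are only for illustration. Both variants return just the nine digits without the AAA prefix. The third function is my own addition: it adds a negative lookahead so that a tenth digit makes the match fail, in case "exactly 9" is required.

```python
import re
from typing import Optional

TEXT = "This is a string: AAA123456789"


def digits_via_group(text: str) -> Optional[str]:
    """Match AAA plus nine digits and return capture group 1 (the digits)."""
    m = re.search(r"AAA(\d{9})", text)
    return m.group(1) if m else None


def digits_via_lookbehind(text: str) -> Optional[str]:
    """Match nine digits preceded by AAA and return the whole match."""
    m = re.search(r"(?<=AAA)\d{9}", text)
    return m.group(0) if m else None


def exact_code(text: str) -> Optional[str]:
    """Return AAA plus nine digits, but only if no tenth digit follows."""
    m = re.search(r"AAA\d{9}(?!\d)", text)
    return m.group(0) if m else None


if __name__ == "__main__":
    print(digits_via_group(TEXT))
    print(digits_via_lookbehind(TEXT))
    print(exact_code(TEXT))
    print(exact_code("AAA1234567890"))
```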
#4
0
Can you try this:
A{3}(\d{9})