I am new to regex.
I have a string like this:
String str = "sbs 01.00 sip ${dreamworks.values} print ${fwVer} to used ${lang} en given ${model} in ${region}";
and I have to extract every substring of the form ${....}
For example, for the given str the result should be:
${dreamworks.values}
${fwVer}
${lang}
${model}
${region}
Furthermore, if a placeholder occurs more than once, it should be returned only once. For example:
String feed = "sip ${dreamworks.values} print ${fwVer} to ${fwVer} used ${lang} en ${lang}given ${model} in ${region}";
The result should be:
${dreamworks.values}
${fwVer}
${lang}
${model}
${region}
only
This is my attempt:
PLACEHOLDER_PATTERN = "\\$\\{\\w+\\}";
But this does not give the correct result. It gives only:
${fwVer}
${lang}
${model}
${region}
So please suggest the correct regex.
2 solutions
#1
6
You are not accounting for the . inside the name: \\w does not include the dot (.).
You need to change your pattern to:
PLACEHOLDER_PATTERN = "\\$\\{.+?\\}";
The dot (.) matches any character (except line terminators, by default), and that is what you want, right?
Also, I have used a reluctant quantifier here (.+?) so that the match stops at the first } after the {. With a greedy quantifier (.+), the dot would also match every } along the way, and the match would run on to the last } in the string.
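To make the difference concrete, here is a small sketch that runs both variants on a shortened version of the string from the question. The class name QuantifierDemo and the helper findAll are made up for this illustration; they are not part of the original code.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class QuantifierDemo {
    // Hypothetical helper: collects every match of the regex in the input, in order.
    static List<String> findAll(String regex, String input) {
        List<String> matches = new ArrayList<>();
        Matcher m = Pattern.compile(regex).matcher(input);
        while (m.find()) {
            matches.add(m.group());
        }
        return matches;
    }

    public static void main(String[] args) {
        String str = "sip ${dreamworks.values} print ${fwVer} to";

        // Reluctant: each match ends at the first } after its ${
        // prints [${dreamworks.values}, ${fwVer}]
        System.out.println(findAll("\\$\\{.+?\\}", str));

        // Greedy: a single match that runs to the last } in the string
        // prints [${dreamworks.values} print ${fwVer}]
        System.out.println(findAll("\\$\\{.+\\}", str));
    }
}
```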
UPDATE:
To get just the unique values, you can use this pattern:
"(\\$\\{[^}]+\\})(?!.*?\\1)"
It matches only those placeholders that are not followed, later in the string, by the same placeholder again. In other words, of several identical placeholders only the last occurrence is matched.
NOTE: Here I have used [^}] in place of .+?. It matches any character except }, so in this case you don't need a reluctant quantifier.
\1 is a backreference to the first capturing group. Inside a Java string literal the backslash itself has to be escaped, hence \\1. (?!...) is a negative lookahead.
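As a rough check, the sketch below applies that pattern to the feed string from the question. The class name UniqueLookaheadDemo and the method extract are hypothetical names chosen for this example. It assumes the input is a single line: by default the dot does not match line terminators, so for multi-line input the lookahead would need Pattern.DOTALL to see duplicates on later lines.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class UniqueLookaheadDemo {
    // Group 1 is the placeholder; the negative lookahead rejects it
    // if the same text appears again later in the input.
    static final Pattern UNIQUE_PLACEHOLDER =
            Pattern.compile("(\\$\\{[^}]+\\})(?!.*?\\1)");

    // Hypothetical helper: returns the last occurrence of each distinct placeholder.
    static List<String> extract(String input) {
        List<String> result = new ArrayList<>();
        Matcher m = UNIQUE_PLACEHOLDER.matcher(input);
        while (m.find()) {
            result.add(m.group(1));
        }
        return result;
    }

    public static void main(String[] args) {
        String feed = "sip ${dreamworks.values} print ${fwVer} to ${fwVer} used "
                + "${lang} en ${lang}given ${model} in ${region}";
        // prints [${dreamworks.values}, ${fwVer}, ${lang}, ${model}, ${region}]
        System.out.println(extract(feed));
    }
}
```

Because the lookahead rescans the rest of the input for every candidate, this can get slow on long strings with many placeholders.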
#2
1
That is because the . is not included in \w. You need to create your own character class and add it there.
PLACEHOLDER_PATTERN = "\\$\\{[\\w.]+\\}";
See the pattern here on Regexr.
However, this does not solve the requirement that you want no duplicates, but that is not a job for regular expressions.
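One way to handle the duplicates outside the regex is to collect the matches into a set. This is only a sketch: the class name PlaceholderSetDemo and the method extractUnique are invented for the example, and LinkedHashSet is chosen on the assumption that the order of first appearance should be kept.

```java
import java.util.LinkedHashSet;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class PlaceholderSetDemo {
    static final Pattern PLACEHOLDER_PATTERN = Pattern.compile("\\$\\{[\\w.]+\\}");

    // Hypothetical helper: the set silently drops repeats and keeps
    // each placeholder in the order it was first seen.
    static Set<String> extractUnique(String input) {
        Set<String> found = new LinkedHashSet<>();
        Matcher m = PLACEHOLDER_PATTERN.matcher(input);
        while (m.find()) {
            found.add(m.group());
        }
        return found;
    }

    public static void main(String[] args) {
        String feed = "sip ${dreamworks.values} print ${fwVer} to ${fwVer} used "
                + "${lang} en ${lang}given ${model} in ${region}";
        // prints [${dreamworks.values}, ${fwVer}, ${lang}, ${model}, ${region}]
        System.out.println(extractUnique(feed));
    }
}
```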
If other characters can appear between the curly braces, then Rohit's answer is better, since it matches any characters up to the closing brace.