What is the difference between \\s|* and \\s|[*] in regular expressions in R?
> gsub('\\s|*','','Aug 2013*')
[1] "Aug2013*"
> gsub('\\s|[*]','','Aug 2013*')
[1] "Aug2013"
What is the function of [ ] here?
2 Answers
#1
3
The first expression is invalid as written, because * is a special character: it is a quantifier, and after | it has nothing to repeat. If you want sub or gsub to treat special characters literally, set the parameter fixed = TRUE.
This takes the pattern string exactly as it is and ignores the special meaning of any characters in it.
See Pattern Matching and Replacement in the R documentation.
x <- 'Aug 2013****'
gsub('*', '', x, fixed=TRUE)
#[1] "Aug 2013"
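One consequence worth spelling out: with fixed = TRUE the whole pattern is literal, so \\s and | lose their meaning as well. The sketch below (variable names are illustrative, not from the original post) shows that the question's pattern then matches nothing, and that removing both spaces and asterisks takes one literal per call.

```r
x <- 'Aug 2013*'

# With fixed = TRUE the pattern is the literal text \s|* which does not occur in x
unchanged <- gsub('\\s|*', '', x, fixed = TRUE)

# Remove one literal per call instead
no_star <- gsub('*', '', x, fixed = TRUE)
cleaned <- gsub(' ', '', no_star, fixed = TRUE)

print(c(unchanged, no_star, cleaned))
#[1] "Aug 2013*" "Aug 2013"  "Aug2013"
```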
Your second expression just puts * inside a character class [] to avoid having to escape it. It is the same as:
x <- 'Aug 2013*'
gsub('\\s|\\*', '', x)
#[1] "Aug2013"
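As a quick check, the following sketch compares the character class, the backslash escape, and a single bracket expression that covers both whitespace and the asterisk. The variable names are illustrative; the third form is an alternative spelling, not something from the original question.

```r
x <- 'Aug 2013*'

by_class  <- gsub('\\s|[*]', '', x)       # asterisk inside a character class
by_escape <- gsub('\\s|\\*', '', x)       # asterisk escaped with a backslash
by_posix  <- gsub('[[:space:]*]', '', x)  # one bracket expression for both

print(c(by_class, by_escape, by_posix))
#[1] "Aug2013" "Aug2013" "Aug2013"
```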
As for the explanation of your first expression, \\s|*:
\s whitespace (\n, \r, \t, \f, and " ")
| OR
And the second expression, \\s|[*]:
\s whitespace (\n, \r, \t, \f, and " ")
| OR
[*] any character of: '*'
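To see which substrings the two alternatives of the second expression actually pick out, here is a small sketch using base R's gregexpr and regmatches (the variable names are illustrative):

```r
x <- 'Aug 2013*'

# gregexpr finds every match; regmatches extracts the matched substrings
matches <- regmatches(x, gregexpr('\\s|[*]', x))[[1]]

print(matches)
#[1] " " "*"
```

The space comes from the \s alternative and the asterisk from the [*] alternative.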
#2
3
The [] here does nothing more than escape the *, so that it matches a literal asterisk.
The first regex is invalid (* is a special character meaning "zero or more").
The second regex is equivalent to
'\\s|\\*'
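If you want to check whether a pattern compiles at all, one option is to wrap a match attempt in tryCatch. This is a minimal sketch: is_valid_regex is a hypothetical helper, not part of base R. How a dangling * is treated can depend on the regex engine (R's default engine versus perl = TRUE), so no result is claimed for the question's first pattern.

```r
# Hypothetical helper: TRUE if R can compile the pattern, FALSE if it raises an error
is_valid_regex <- function(pattern, ...) {
  tryCatch({
    suppressWarnings(grepl(pattern, '', ...))
    TRUE
  }, error = function(e) FALSE)
}

is_valid_regex('\\s|[*]')   # TRUE
is_valid_regex('\\s|\\*')   # TRUE
is_valid_regex('(')         # FALSE: unmatched parenthesis

# Engine-dependent, so inspect the result rather than assume it
is_valid_regex('\\s|*', perl = TRUE)
```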