What is the difference between `\\s|*` and `\\s|[*]` in regular expressions in R?

Time: 2022-01-12 23:26:02

What is the difference between \\s|* and \\s|[*] in regular expressions in R?

> gsub('\\s|*','','Aug 2013*')
[1] "Aug2013*"
> gsub('\\s|[*]','','Aug 2013*')
[1] "Aug2013"

What is the function of [ ] here?

2 Solutions

#1


3  

The first expression is not valid the way you are using it, because * is a special character (a quantifier) and here it has nothing before it to repeat. If you want to use sub or gsub with a pattern that contains special characters, you can set the fixed = TRUE parameter.

This takes the string representing the pattern being searched for as it is and ignores the special meaning of any characters in it.

See Pattern Matching and Replacement in the R documentation.

x <- 'Aug 2013****'
gsub('*', '', x, fixed=TRUE)
#[1] "Aug 2013"
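To make the effect of fixed = TRUE concrete, here is a small sketch (the sample string and variable names are made up for illustration). It shows that with fixed = TRUE the whole pattern is literal, so a regex escape such as \\s stops meaning "whitespace" too:

```r
x <- "a b*c"

# Default (regex) mode: \\s is the whitespace class, so the space is removed
regex_ws <- gsub("\\s", "", x)

# fixed = TRUE: the pattern is the two literal characters backslash + s,
# which do not occur in x, so nothing changes
literal_ws <- gsub("\\s", "", x, fixed = TRUE)

# fixed = TRUE: * is an ordinary character, so the asterisk is removed
literal_star <- gsub("*", "", x, fixed = TRUE)

print(regex_ws)      # [1] "ab*c"
print(literal_ws)    # [1] "a b*c"
print(literal_star)  # [1] "a bc"
```

This is why fixed = TRUE suits removing the asterisks alone, but cannot be combined with \\s in a single pattern.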

Your second expression just puts * inside a character class [] to avoid escaping it, which is the same as:

x <- 'Aug 2013*'
gsub('\\s|\\*', '', x)
#[1] "Aug2013"
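As a quick check of that equivalence, the sketch below compares the character-class form with the escaped form, plus a third spelling that folds both alternatives into one bracket expression. The third form is an addition for illustration, not something from the question:

```r
x <- "Aug 2013*"

# Inside [], * is an ordinary character
with_class <- gsub("\\s|[*]", "", x)

# Outside [], * must be escaped to be literal
with_escape <- gsub("\\s|\\*", "", x)

# One bracket expression holding the POSIX whitespace class and a literal *
one_class <- gsub("[[:space:]*]", "", x)

stopifnot(identical(with_class, with_escape),
          identical(with_escape, one_class))
print(with_class)  # [1] "Aug2013"
```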

As for the explanation of your first expression, \\s|*:

\s      whitespace (\n, \r, \t, \f, and " ")
|       OR

And the second expression, \\s|[*]:

\s      whitespace (\n, \r, \t, \f, and " ")
|       OR
[*]     any character of: '*'
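The breakdown above can be seen directly by listing what the second pattern matches. This is a minimal sketch using base R's gregexpr and regmatches on the string from the question:

```r
x <- "Aug 2013*"

# Collect every substring matched by either alternative
hits <- regmatches(x, gregexpr("\\s|[*]", x))[[1]]

print(hits)  # [1] " " "*"
```

The first hit comes from the \s alternative and the second from [*], which is why gsub removes both the space and the asterisk.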

#2


3  

The [] here does nothing more than turn the * into a literal asterisk.

The first regex is invalid (* is a special character meaning "zero or more").

The second regex is equivalent to

'\\s|\\*'
