Remove certain characters from a string in R

I have a string in R which contains a large number of words. When I view the string I get a lot of text, including passages like the following:

>docs

....

\u009cYes yes for ever for ever the boys cried in their ringing voices with softened faces

....

So I'm wondering how to remove these \u009 characters (all of them; some have slightly different numbers) from the string. I've tried using gsub(), but it didn't remove them.

2 Solutions

#1

This should work:

gsub('\u009c','','\u009cYes yes for ever for ever the boys ')
"Yes yes for ever for ever the boys "

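Here 009c is the character's Unicode code point in hexadecimal. R's \u escape reads up to four hexadecimal digits, so write all four (zero-padded) to keep the characters that follow from being absorbed into the escape. If you have many such characters, one solution is to separate them with a pipe: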

gsub('\u009c|\u00F0','','\u009cYes yes \u00F0for ever for ever the boys and the girls')

"Yes yes for ever for ever the boys and the girls"
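
If there are too many different codes to list with pipes, a character range may be easier. The following is only a sketch: it assumes the unwanted characters all fall in the C1 control range U+0080 to U+009F (which \u009c does), and the sample string `docs` is made up for illustration.

```r
# Hypothetical sample text containing two different C1 control characters
docs <- "\u009cYes yes \u0093for ever for ever the boys cried"

# Remove every character in the range U+0080..U+009F in one pass.
# perl = TRUE enables PCRE's \x{...} code point syntax inside the class.
cleaned <- gsub("[\\x{0080}-\\x{009F}]", "", docs, perl = TRUE)
print(cleaned)
```

If your stray characters sit outside that range, adjust the two end points accordingly.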

#2

try: gsub('\\$', '', '$5.00$')
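
The double backslash is needed because $ is a regex metacharacter (end of string). As an alternative sketch, fixed = TRUE makes gsub() treat the pattern as a literal string, so no escaping is required; the variable name `price` is just for illustration.

```r
# fixed = TRUE matches the pattern literally instead of as a regex
price <- gsub("$", "", "$5.00$", fixed = TRUE)
print(price)
```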
