R - How to remove characters from a string based on its value?

Time: 2022-06-19 22:21:29

I have a CSV file where numeric values are stored in a way like this:

+000000000000000000000001101.7100

The number above is 1101.71. The string always has the same length, so the number of zeroes before the actual number depends on the number's length. How can I drop the + and all the 0s before the actual number so that I can easily convert it to numeric?

2 solutions

#1


1  

I may be missing an important point, but my best try would be like this:

1) read the values as a character

2) use substr to get rid of the first character, namely the plus sign

3) convert the column with as.numeric; this way we safely lose any leading zeroes (as.integer would also truncate the decimal part, so 1101.71 would become 1101)
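
The three steps above could be sketched roughly as follows. This is only an illustration under assumptions, not the answerer's own code: the column name `amount` and the inline CSV text are made up for the example, and the conversion uses `as.numeric` so that the decimal part is kept.

```r
# Hypothetical CSV content; the column name `amount` is an assumption.
csv_text <- "amount
+000000000000000000000001101.7100
+000000000000000000000000042.5000"

# 1) Read the values as character so nothing is converted on input.
df <- read.csv(text = csv_text, colClasses = "character")

# 2) Drop the first character (the plus sign).
no_sign <- substr(df$amount, 2, nchar(df$amount))

# 3) Convert to numeric; leading zeroes are ignored during parsing.
df$amount <- as.numeric(no_sign)
print(df$amount)
```

With a real file, `text = csv_text` would be replaced by the file path.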

#2


3  

If the number always occupies the same number of trailing characters (the last 9 here), then substring will be a faster option

as.numeric(substring(str1, nchar(str1)-8))
#[1] 1101.71

but if we don't know how many 0s there will be at the beginning, then another option is sub, where we match a + at the start (^) of the string followed by zero or more 0 characters (0*) and replace the match with an empty string ("")

as.numeric(sub("^\\+0*", "", str1))
#[1] 1101.71

Note that we escape the + because it is a regex metacharacter meaning "one or more" of the preceding element
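
To see how the two approaches behave on more than one value, here is a small sketch. The input vector is invented for illustration (only the first element comes from the question), and it assumes every string starts with a plus sign.

```r
# Sample inputs in the question's format; the last two values are invented.
str1 <- c("+000000000000000000000001101.7100",
          "+000000000000000000000000042.5000",
          "+000000000000000000000021101.7100")

# "^" anchors at the start, "\\+" is a literal plus sign,
# "0*" matches zero or more zeroes after it.
stripped <- sub("^\\+0*", "", str1)
print(stripped)

values <- as.numeric(stripped)
print(values)

# The fixed offset keeps only the last 9 characters, so a value with
# five integer digits silently loses its leading digit.
fixed <- as.numeric(substring(str1, nchar(str1) - 8))
print(fixed)
```

The regex version adapts to however many zeroes precede the number, while the fixed offset is only safe when every value fits in the last 9 characters.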
