Most elegant way to load a csv with a point as the thousands separator in R

Time: 2021-11-25 20:21:57

NB: To the best of my knowledge this question is not a duplicate! All the questions/answers I found cover either how to eliminate points from data that are already in R, or how to change the decimal point to a comma when loading.

I have a csv with numbers like 4.123,98. The problem is that, because of the . used as a thousands separator, the values come out as character strings when loading with read.table, read.csv or read.csv2. Changing dec to , doesn't help.
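The behaviour described above can be reproduced with a small sketch. The temporary file and the column names a and b are made up for illustration, and the file is assumed to be semicolon-separated, which is what read.csv2 expects:

```r
# Temporary sample file (illustrative path and column names).
path <- tempfile(fileext = ".csv")
writeLines(c('"a";"b"', '"4.123,98";"1.234,56"'), path)

# read.csv2 already defaults to sep = ";" and dec = ",".
df <- read.csv2(path, stringsAsFactors = FALSE)

# The thousands point keeps the values from being parsed as numbers,
# so the column should still be of class character here.
class(df$a)
```

Setting dec only tells R which character is the decimal mark. It has no notion of a grouping mark, so the extra point has to be stripped separately.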

My question
What is the most elegant way to load this csv so that the numbers are read as numeric, e.g. 4123.98?

2 solutions

#1


9  

# Some sample data
write.csv(data.frame(a = c("1.234,56", "1.234,56"),
                     b = c("1.234,56", "1.234,56")),
          "test.csv", row.names = FALSE, quote = TRUE)

# Define your own numeric class
setClass("myNum")
# Define the conversion: drop the thousands points, then turn the decimal comma into a point
setAs("character", "myNum",
      function(from) as.numeric(gsub(",", ".", gsub(".", "", from, fixed = TRUE), fixed = TRUE)))

# Read the data with custom colClasses
read_data <- read.csv("test.csv", stringsAsFactors = FALSE, colClasses = c("myNum", "myNum"))
# Check that the result really is numeric
read_data[1, 1] * 2

# [1] 2469.12

#2


2  

Rather than trying to fix it all at loading time, I would load the data into R as strings, then convert them to numeric.

So after loading, you have a column of strings like "4.123,98".

Then do something like:

 number.string <- gsub("\\.", "", number.string)
 number.string <- gsub(",", "\\.", number.string)
 number <- as.numeric(number.string)
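
As a rough end-to-end sketch of this load-then-convert approach: the helper name parse_european_number, the temporary file, and the sample values are all assumptions made for illustration and are not part of the original answer.

```r
# Hypothetical helper: convert strings like "4.123,98" to numeric 4123.98.
parse_european_number <- function(x) {
  x <- gsub(".", "", x, fixed = TRUE)   # drop the thousands separator
  x <- gsub(",", ".", x, fixed = TRUE)  # turn the decimal comma into a point
  as.numeric(x)
}

# Sample file (temporary path, for illustration only).
path <- tempfile(fileext = ".csv")
write.csv(data.frame(a = c("1.234,56", "4.123,98"),
                     b = c("12,5", "1.000.000,00")),
          path, row.names = FALSE, quote = TRUE)

# Load every column as character, then convert each column.
raw <- read.csv(path, colClasses = "character")
converted <- as.data.frame(lapply(raw, parse_european_number))

# Inspect the resulting column types.
str(converted)
```

Using fixed = TRUE treats the point and the comma as literal characters, so no regex escaping is needed. If only some columns hold such numbers, the lapply call can be restricted to those columns.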
