Reclassing selected columns in a data.table

Time: 2022-02-21 07:24:21

I wish to change the class of selected variables in a data table, using a vectorized operation. I am new to the data.table syntax and am trying to learn as much as possible. I know the question is basic, but it will help me better understand the data.table way of thinking!

A similar question was asked here! However, the solution seems to pertain to reclassing either just one column or all columns. My question is specific to a select few columns.

### Load package
require(data.table)

### Create pseudo data
data <- data.table(id     = 1:10,
                   height = rnorm(10, mean = 182, sd = 20),
                   weight = rnorm(10, mean = 160, sd = 10),
                   color  = rep(c('blue', 'gold'), times = 5))

### Reclass all columns
data <- data[, lapply(.SD, as.character)]

### Search for columns to be reclassed
index <- grep('(id)|(height)|(weight)', names(data))

### data frame method
df <- data.frame(data)
df[, index] <- lapply(df[, index], as.numeric)

### Failed attempt to reclass columns using the data.table method
data <- data[, lapply(index, as.character), with = F]

Any help would be appreciated. My data are large, so I need to use regular expressions to build the vector of column numbers to reclass.

Thank you for your time.

3 solutions

#1


8  

I think that @SimonO101 did most of the job

data[, names(data)[index] := lapply(.SD, as.character) , .SDcols = index ]

You can just use the := magic
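
As a minimal sketch of the same idea going the other way (character back to numeric), assuming a small made-up table `dt` and selecting columns by name rather than by position. The names `dt` and `cols` are illustrative and do not come from the original post:

```r
library(data.table)

# Hypothetical example table; every column starts out as character
dt <- data.table(id     = as.character(1:4),
                 height = c("180.5", "175.2", "190.1", "168.9"),
                 weight = c("160", "155", "172", "149"),
                 color  = c("blue", "gold", "blue", "gold"))

# Pick the columns to convert with a regular expression (names, not positions)
cols <- grep("^(id|height|weight)$", names(dt), value = TRUE)

# := updates the selected columns by reference; no reassignment is needed
dt[, (cols) := lapply(.SD, as.numeric), .SDcols = cols]

# Inspect the resulting column classes
sapply(dt, class)
```

Wrapping `cols` in parentheses on the left-hand side tells data.table to treat it as a vector of column names rather than as a single column literally called `cols`.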

#2


9  

You could avoid the overhead of the construction of .SD within j by using set

for(j in index) set(data, j =j ,value = as.character(data[[j]]))
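
A small self-contained sketch of the `set` loop, here converting character columns back to numeric. The table `dt` and its values are made up for illustration:

```r
library(data.table)

# Hypothetical example table with numeric-looking character columns
dt <- data.table(id     = c("1", "2", "3"),
                 height = c("180.5", "175.2", "190.1"),
                 color  = c("blue", "gold", "blue"))

# Column positions whose names match the pattern (here: id and height)
index <- grep("(id)|(height)", names(dt))

# set() overwrites one column per iteration, by reference
for (j in index) set(dt, j = j, value = as.numeric(dt[[j]]))

# Inspect the result; color is left untouched
str(dt)
```

Because `set` modifies `dt` in place, the loop does not need to assign its result back to `dt`.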

#3


4  

You just need to use .SDcols with your index vector (I learnt that today!), but that will just return a data table containing only the reclassed columns. @dickoa's answer is what you are looking for.

data <- data[, lapply(.SD, as.character) , .SDcols = index ]
sapply(data , class)
        id      height      weight 
"character" "character" "character" 
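
To make the difference concrete, here is a rough sketch contrasting the two forms on a made-up table. The names `dt` and `subset_only` are illustrative only:

```r
library(data.table)

# Hypothetical example table
dt <- data.table(id     = 1:3,
                 height = c(180.5, 175.2, 190.1),
                 color  = c("blue", "gold", "blue"))
index <- grep("(id)|(height)", names(dt))

# Without :=, j returns a new table holding only the .SDcols columns
subset_only <- dt[, lapply(.SD, as.character), .SDcols = index]
names(subset_only)   # color is absent here

# The original table is unchanged by the call above
sapply(dt, class)

# With :=, the conversion is written back into dt and color is kept
dt[, (names(dt)[index]) := lapply(.SD, as.character), .SDcols = index]
names(dt)
```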
