Getting different results from the class() method

Time: 2021-10-08 15:22:15

Here's the smallest piece of code that shows how I get different results from class() when it is called directly on columns versus when it is called through apply.

The data.frame looks like this:

> df
    A             B             C
1 rlm  4.047317e-03  0.0040111713
2 rlm -6.474359e-02 -0.0657461598
3 rlm  1.464302e-01  0.1451224214
4 rlm  3.508878e-01  0.3477540761
5  lm  2.701757e-01  0.2769367280
6  lm  2.580785e-03  0.0025815525
7 rlm  1.638077e-05  0.0000160895

> str(df)
'data.frame':   7 obs. of  3 variables:
 $ A: chr  "rlm" "rlm" "rlm" "rlm" ...
 $ B: num  0.00405 -0.06474 0.14643 0.35089 0.27018 ...
 $ C: num  0.00401 -0.06575 0.14512 0.34775 0.27694 ...

> class(df$A)
[1] "character"
> class(df$B)
[1] "numeric"
> apply(df, 2, class)
          A           B           C 
"character" "character" "character" 

So, when called directly, the class of B is 'numeric', but when called through apply, it is reported as 'character'.

Am I missing anything here?

2 Solutions

#1


1  

apply coerces a data.frame to a matrix (via as.matrix) before applying the function. Since every element of a matrix must have the same type, you end up with a character matrix (a number can be converted to a character string, but an arbitrary string cannot be converted to a number). The reason for this is probably that apply can also work by row, which would be messy with data.frames since your function would need to operate on a list.
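
The coercion can be seen directly by calling as.matrix yourself. This is a minimal sketch, assuming a two-row stand-in for the data.frame in the question (the values and the name num_only are made up for illustration):

```r
# Hypothetical two-row stand-in for the data.frame in the question
df <- data.frame(A = c("rlm", "lm"),
                 B = c(0.0040, 0.2702),
                 stringsAsFactors = FALSE)

m <- as.matrix(df)     # the coercion apply() performs on a data.frame
typeof(m)              # "character": the numbers were turned into strings
class(m[, "B"])        # "character"

# With only numeric columns the matrix stays numeric
num_only <- apply(df["B"], 2, class)
num_only               # B: "numeric"
```

So apply is safe for class checks only when every column already has the same type.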

For what you want, look at the lapply and sapply functions: a data.frame is basically a list in which each element is one of the columns.

> x <- data.frame(a = "Entry", b = 5)
> sapply(x, class)
        a         b 
 "factor" "numeric"   
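
Applied to data shaped like the question's, the same idea might look like the sketch below. The rows are an abbreviated, hypothetical stand-in, and col_classes is just an illustrative name. Note that from R 4.0.0 onward data.frame() no longer converts strings to factors by default, so on a recent R the a column in the example above reports "character" rather than "factor".

```r
# Abbreviated, hypothetical stand-in for the question's data
df <- data.frame(A = c("rlm", "rlm", "lm"),
                 B = c(0.0040, -0.0647, 0.2702),
                 C = c(0.0040, -0.0657, 0.2769),
                 stringsAsFactors = FALSE)

# sapply walks over the columns as list elements, so no matrix coercion happens
col_classes <- sapply(df, class)
col_classes
# A: "character", B: "numeric", C: "numeric"

# vapply does the same but insists that each result is a single string
vapply(df, class, character(1))
```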

#2


0  

I get the same result. I think it might be the same behavior you see in this example:

number_m <- matrix(1:6)
mode(number_m) # "numeric"

number_m[2,1] <- "b"
mode(number_m) # "character"
number_m

Assigning a character value to one element of a matrix or vector changes the data type of all the elements.
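
The same rule applies whenever values of different types are combined in an atomic vector: R picks the most general type involved. A short sketch (the variable v is only for illustration):

```r
# Mixing types promotes everything to the most general type present
typeof(c(TRUE, 1L))    # "integer"
typeof(c(1L, 2.5))     # "double"
typeof(c(2.5, "b"))    # "character"

v <- 1:3
v[2] <- "b"            # assignment triggers the same coercion
v                      # "1" "b" "3"
```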

I get the correct result using a loop:

df <- read.table(header=TRUE, text="
    A             B             C
1 rlm  4.047317e-03  0.0040111713
2 rlm -6.474359e-02 -0.0657461598
3 rlm  1.464302e-01  0.1451224214
4 rlm  3.508878e-01  0.3477540761
5  lm  2.701757e-01  0.2769367280
6  lm  2.580785e-03  0.0025815525
7 rlm  1.638077e-05  0.0000160895")

sapply(1:3, function(i) class(df[,i]))
