Creating a new class in R

Time: 2021-10-13 12:42:41

I am toying around with functions, classes and methods in R. To have a hands-on exercise that could also be useful, I have decided to create my own "package" for managing my household budget. Simply put, I want a set of functions, classes and methods to run calculations, plot different kinds of charts, and so on. The first thing I want to do is create a "Budget" class: it should take in a csv with certain columns and return a "Budget" object that inherits the methods of a data frame but to which I can also apply a set of "Budget" methods. Here is my attempt:

prepareData = function (csv, type = 1) {

  # type 1: read.csv with "." decimals; type 2: read.csv2 with "," decimals
  if (type == 1) {
    Data = read.csv(csv, dec = ".")
  } else if (type == 2) {
    Data = read.csv2(csv, dec = ",")
  } else {
    stop("Acceptable values for type are 1 and 2")
  }

  NamesToHave = c("Date", "Title", "Amount", "Category")

  # Stop unless all four mandatory columns are present
  if (sum(as.numeric(colnames(Data) %in% NamesToHave)) < 4) {
    stop("The csv file does not have the mandatory columns (Date, Title, Amount, Category)")
  }

  # tolower() can fail on badly encoded strings, so use it as an encoding check
  if (class(try(tolower(Data$Title), silent = T)) == "try-error" | class(try(tolower(Data$Category), silent = T)) == "try-error") {
    stop("Are you sure there are no special characters in your csv file?")
  }

  # Split dd/mm/yyyy dates into separate Day, Month and Year columns
  Data$Day = sapply(strsplit(as.character(Data$Date), "/"), "[[", 1)
  Data$Month = month.abb[as.numeric(sapply(strsplit(as.character(Data$Date), "/"), "[[", 2))]
  Data$Year = sapply(strsplit(as.character(Data$Date), "/"), "[[", 3)

  Data = Data[with(Data, order(Year, Month, Day)), ]
  Data$Amount = as.numeric(as.character(Data$Amount))

  class(Data) <- append(class(Data), "Budget")
  return(Data)
}

Now, this returns a data frame with all the necessary modifications, and overall it works fine as a function. But if I take a csv with the following contents

structure(list(Date = structure(c(22L, 1L, 1L, 1L, 1L, 1L), .Label = c("01/10/2016", 
"01/11/2016", "02/10/2016", "04/10/2016", "04/11/2016", "05/10/2016", 
"05/11/2016", "06/10/2016", "06/11/2016", "07/10/2016", "08/10/2016", 
"08/11/2016", "09/10/2016", "09/11/2016", "10/10/2016", "10/11/2016", 
"11/10/2016", "12/11/2016", "14/10/2016", "16/10/2016", "18/10/2016", 
"20/09/2016", "20/10/2016", "21/10/2016", "22/09/2016", "22/10/2016", 
"23/09/2016", "23/10/2016", "25/09/2016", "25/10/2016", "26/09/2016", 
"26/10/2016", "27/10/2016", "28/10/2016", "29/10/2016", "30/10/2016"
), class = "factor"), Title = structure(c(20L, 6L, 36L, 29L, 
30L, 11L), .Label = c("Bagpiper", "beer debaser", "Br", "brewdog", 
"Burger King", "Clas", "coop", "Coop", "Eriksdalbadet", "etc", 
"ETC", "Flippin", "Fotografiska", "Gateau Agneta", "Grekisk fastfood", 
"Grill", "Gunnarson", "Gunnarsson", "hemkop", "HK", "Hotorhallen", 
"ICA", "ICA Skinnskat", "Igor Sport", "Intersport", "Kak", "klattercentret", 
"LullesFagel", "Mae Thai", "MamaWolf", "Material", "Matrerial", 
"Oriental Supermarket", "Paradiset", "Pendeltag Uppsala", "PGW", 
"Pressbyran", "Primeburger", "Primo Ciao ciao", "R Asia", "Systembolaget", 
"taxi Skinnskat", "The Cure drinks", "Udden pensionat", "Ugglan", 
"Wentzels hobby"), class = "factor"), Amount = c(167.27, 331, 
971, 99, 192, 3289), Category = structure(c(10L, 3L, 3L, 6L, 
6L, 3L), .Label = c("Drink", "extra", "Extra", "Extra_Fede", 
"extra_food", "Extra_food", "extra_laure", "Extra_Laure", "food", 
"Food"), class = "factor")), .Names = c("Date", "Title", "Amount", 
"Category"), row.names = c(NA, 6L), class = "data.frame")

and I run

Data = prepareData("name.csv")
class(Data)

The output is just "data.frame". But if I then run the second-to-last line of the function again from the terminal

class(Data) <- append(class(Data),"Budget")
class(Data)

I get "data.frame" and "Budget" as output.

What am I doing wrong?

1 answer

#1


Your problem was here:

if (as.numeric(colnames(Data) %in% NamesToHave) != 4) {}

The %in% comparison is vectorized and returns TRUE TRUE TRUE TRUE, which becomes 1 1 1 1 after going through as.numeric(). This vector is then compared with != 4, which is also vectorized and returns TRUE TRUE TRUE TRUE (every 1 is different from 4). The if() statement does not evaluate the whole vector, only its first element (and it gives you a warning message).
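The chain of vectorized steps can be reproduced on its own. This is a minimal sketch that assumes the csv has exactly the four mandatory columns; `cols` is a hypothetical stand-in for colnames(Data). Note that from R 4.2.0 onward, passing a condition of length greater than one to if() is an error rather than a warning.

```r
# Hypothetical stand-in for colnames(Data)
cols <- c("Date", "Title", "Amount", "Category")
NamesToHave <- c("Date", "Title", "Amount", "Category")

present <- cols %in% NamesToHave   # one logical value per column name
as_num  <- as.numeric(present)     # logicals coerced to 1 / 0
cmp     <- as_num != 4             # element-wise comparison, still length 4

print(present)
print(as_num)
print(cmp)
length(cmp)  # the condition handed to if() is a vector, not a single value
```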

To solve this issue, you just have to switch the as.numeric() function to sum().

if (sum(colnames(Data) %in% NamesToHave) != 4) {}

When you sum a logical vector, R coerces it to numeric: every TRUE becomes 1 and every FALSE becomes 0. The sum is now 4, so the condition evaluates to FALSE in the if statement and the function runs smoothly. Once I fixed this, the result had both classes the first time I ran it.
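A short sketch of the sum() check, followed by one possible alternative that is not part of the original answer: testing the required names against the columns with all() and setdiff(), which also tells you which column is missing. `cols` and the extra "Notes" column are hypothetical.

```r
NamesToHave <- c("Date", "Title", "Amount", "Category")
# Hypothetical column names; "Notes" is an extra, non-mandatory column
cols <- c("Date", "Title", "Amount", "Category", "Notes")

# sum() collapses the logical vector into a single count
n_found <- sum(cols %in% NamesToHave)
check_fails <- n_found != 4          # a single TRUE/FALSE, safe inside if()

# Alternative (an assumption, not from the answer): check the required names directly
all_present  <- all(NamesToHave %in% cols)
missing_cols <- setdiff(NamesToHave, cols)   # names that are absent, if any

if (!all_present) {
  stop("Missing columns: ", paste(missing_cols, collapse = ", "))
}
```

Either form gives if() a length-one condition; the second also makes the error message more specific.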

As this article says, it is good to restart R before posting your question and make sure you still have the problem you are reporting.
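One way a stale session could produce the symptom in the question (an assumption, since the original session is not shown): when a function stops with an error, the assignment on the calling line never happens, so an older object with the same name keeps its old value and class. The sketch below uses a hypothetical `failing_prepare()` in place of prepareData().

```r
# Hypothetical stand-in for a function that stops before returning
failing_prepare <- function() {
  stop("validation failed")
}

# An older object left over from earlier work in the same session
Data <- data.frame(x = 1)

# The call errors, so the assignment is skipped and Data is left untouched
result <- try(Data <- failing_prepare(), silent = TRUE)

inherits(result, "try-error")  # the call failed
class(Data)                    # class of the leftover object, not of a new result
```

Restarting R (or removing the old object with rm(Data)) makes this kind of leftover visible, because Data would then not exist at all after the failed call.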
