I want to define my own distribution functions for use with the fitdist or fitdistr function in R. Taking fitdist from the fitdistrplus package as an example, I define a custom distribution called sgamma as follows:
dsgamma<-function(x,shape){return(dgamma(x,shape,scale=1));}
qsgamma<-function(p,shape){return(qgamma(p,shape,scale=1));}
psgamma<-function(q,shape){return(pgamma(q,shape,scale=1));}
rsgamma<-function(n,shape){return(rgamma(n,shape,scale=1));}
My question is where I should define these functions.
If the functions above are defined at top level (in the global environment), I can call fitdist with this distribution. In other words, my script test1.R with the following content runs just fine:
rm(list=ls())
require(fitdistrplus);
dsgamma<-function(x,shape){return(dgamma(x,shape,scale=1));}
qsgamma<-function(p,shape){return(qgamma(p,shape,scale=1));}
psgamma<-function(q,shape){return(pgamma(q,shape,scale=1));}
rsgamma<-function(n,shape){return(rgamma(n,shape,scale=1));}
x<-rgamma(100, shape=0.4, scale=1);
zfit<-fitdist(x, distr=dsgamma, start=list(shape=0.3));
However, if I wrap the same code in a function, it no longer works. See test2.R below:
rm(list=ls())
testfit<-function(x)
{
require(fitdistrplus);
dsgamma<-function(x,shape){return(dgamma(x,shape,scale=1));}
qsgamma<-function(p,shape){return(qgamma(p,shape,scale=1));}
psgamma<-function(q,shape){return(pgamma(q,shape,scale=1));}
rsgamma<-function(n,shape){return(rgamma(n,shape,scale=1));}
zfit<-fitdist(x, distr=dsgamma, start=list(shape=0.3));
return(zfit);
}
x<-rgamma(100, shape=0.4, scale=1);
zfit<-testfit(x);
I got the following error:
Error in fitdist(x, distr = dsgamma, start = list(shape = 0.3)) :
The dsgamma function must be defined
Note that I still get an error if I replace
zfit<-fitdist(x, distr=dsgamma, start=list(shape=0.3));
with
zfit<-fitdist(x, distr="sgamma", start=list(shape=0.3));
I guess the key question is where fitdist looks for the functions specified by the distr parameter. I would really appreciate your help.
1 solution
Great question. The error arises because the authors of the fitdistrplus package use exists() to check that the functions derived from the distr argument (dsgamma, psgamma, and so on) are defined.
The following is an excerpt from the code of the fitdist and mledist functions. The authors take the value given for distr, build the names of the corresponding density and distribution functions, and look them up with exists(). Because no envir argument is passed, the search starts in the frame of fitdist or mledist itself and continues through the environment where those functions are defined and on to the global environment; it never visits the frame of the function that called fitdist.
if (!exists(ddistname,mode="function"))
    stop(paste("The ", ddistname, " function must be defined"))
pdistname <- paste("p", distname, sep = "")
if (!exists(pdistname,mode="function"))
    stop(paste("The ", pdistname, " function must be defined"))
This is an excerpt from the documentation of exists():
This function looks to see if the name ‘x’ has a value bound to it in the specified environment. If ‘inherits’ is ‘TRUE’ and a value is not found for ‘x’ in the specified environment, the enclosing frames of the environment are searched until the name ‘x’ is encountered. See ‘environment’ and the ‘R Language Definition’ manual for details about the structure of environments and their enclosures.
To learn more about why exists() makes your function fail, see this chapter on environments: http://adv-r.had.co.nz/Environments.html
Essentially, fitdist and mledist do not search the environment of the function you are writing, so they report that dsgamma (and the other functions you define there) does not exist.
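The lookup behaviour can be reproduced without fitdistrplus. In this minimal sketch, check_density() is a hypothetical stand-in for the check inside fitdist (it is not the package's code), and dfoo and dbar are made-up names used only for illustration:

```r
# Hypothetical stand-in for the check in fitdist(): build the density
# name from the distribution name and look it up with exists().
check_density <- function(distname) {
  ddistname <- paste("d", distname, sep = "")
  # With default arguments, exists() starts in this function's own frame
  # and then follows enclosing environments, not the caller's frame.
  exists(ddistname, mode = "function")
}

# Defined locally inside the caller: invisible to check_density().
local_case <- function() {
  dfoo <- function(x) dexp(x)
  check_density("foo")
}

# Defined at top level: visible to check_density().
dbar <- function(x) dexp(x)

found_local <- local_case()
found_global <- check_density("bar")
print(c(local = found_local, global = found_global))
# local is FALSE, global is TRUE
```

This mirrors the difference between test1.R and test2.R: the same definition is found when it lives at top level and missed when it lives only inside the calling function.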
The easiest way around this is to use <<- instead of <- to define the functions inside testfit(). Because testfit() is itself defined at top level, <<- assigns them in the global environment, where fitdist can find them.
> testfit<-function(x)
+ {
+ require(fitdistrplus);
+ dsgamma<<-function(x,shape){return(dgamma(x,shape,scale=1))}
+ qsgamma<<-function(p,shape){return(qgamma(p,shape,scale=1))}
+ psgamma<<-function(q,shape){return(pgamma(q,shape,scale=1))}
+ rsgamma<<-function(n,shape){return(rgamma(n,shape,scale=1))}
+ zfit<-function(x){return(fitdist(x,distr="sgamma" , start=list(shape=0.3)))};
+ return(zfit(x))
+ }
> testfit(x)
Fitting of the distribution ' sgamma ' by maximum likelihood
Parameters:
      estimate Std. Error
shape 0.408349 0.03775797
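A variant of this workaround, sketched under my own assumptions rather than taken from the package, removes the global definitions again once the fit is done. Here with_global_functions() is a hypothetical helper, and fake_fit() only mimics the exists() check so that the sketch runs without fitdistrplus:

```r
# Hypothetical helper: temporarily define functions in the global
# environment, run `code`, then remove them again on exit.
with_global_functions <- function(funs, code) {
  for (nm in names(funs)) assign(nm, funs[[nm]], envir = globalenv())
  on.exit(rm(list = names(funs), envir = globalenv()), add = TRUE)
  code()
}

# Stand-in for fitdist(): it only performs the same kind of exists() check.
fake_fit <- function(distname) {
  exists(paste("d", distname, sep = ""), mode = "function")
}

testfit_demo <- function() {
  funs <- list(
    dsgamma = function(x, shape) dgamma(x, shape, scale = 1),
    psgamma = function(q, shape) pgamma(q, shape, scale = 1)
  )
  with_global_functions(funs, function() fake_fit("sgamma"))
}

during <- testfit_demo()                        # check succeeds while defined
after <- exists("dsgamma", mode = "function")   # globals are gone afterwards
```

In testfit() the callback would instead wrap the real call, fitdist(x, distr="sgamma", start=list(shape=0.3)), with qsgamma and rsgamma added to the list. Two caveats: the helper overwrites and then deletes any existing global objects with the same names, and later operations on the fitted object that look the functions up by name may fail once they have been removed.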
You could instead alter the code of fitdist so that it searches your function's environment, by adding envir=parent.frame() to the exists() checks as follows, but I do not recommend this.
if (!exists(ddistname,mode="function",envir=parent.frame()))
However, this alone does not solve your problem, because fitdist calls mledist, and mledist has the same check.
Error in mledist(data, distname, start, fix.arg, ...) (from #43) :
The dsgamma function must be defined
To pursue this approach you would have to alter mledist as well and tell it to search in the frame that called fitdist. You would also have to reapply these changes each time you load the library.
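Why patching only fitdist is not enough can be seen in a base-R sketch. Here outer_check() and inner_check() are hypothetical stand-ins for a patched fitdist and for mledist; they are not the package's code:

```r
# Stand-in for mledist(): it is called by outer_check(), so its default
# parent.frame() is outer_check()'s frame, not the user's function.
inner_check <- function(name, env = parent.frame()) {
  exists(name, mode = "function", envir = env)
}

# Stand-in for a patched fitdist(): its own check looks in the caller's frame.
outer_check <- function(name, forward_env = FALSE) {
  caller <- parent.frame()
  own <- exists(name, mode = "function", envir = caller)
  inner <- if (forward_env) {
    inner_check(name, env = caller)   # pass the user's frame along
  } else {
    inner_check(name)                 # inner function searches from outer_check()
  }
  c(outer = own, inner = inner)
}

user_function <- function(forward_env) {
  dsgamma <- function(x, shape) dgamma(x, shape, scale = 1)
  outer_check("dsgamma", forward_env = forward_env)
}

not_forwarded <- user_function(FALSE)   # outer TRUE, inner FALSE
forwarded <- user_function(TRUE)        # outer TRUE, inner TRUE
```

The inner check only succeeds when the caller's environment is passed down explicitly, which is why both functions would need to be modified.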