I'm trying to plot the ROC curve of a random forest classifier. The plotting itself works, but I think I'm plotting the wrong data, since the resulting plot has only one point (the accuracy).
This is the code I use:
library(party)  # cforest, cforest_unbiased
library(ROCR)   # prediction, performance
library(caret)  # confusionMatrix

set.seed(55)
data.controls <- cforest_unbiased(ntree = 100, mtry = 3)
data.rf <- cforest(type ~ ., data = dataset, controls = data.controls)
pred <- predict(data.rf, type = "response")  # hard class labels
preds <- prediction(as.numeric(pred), dataset$type)
perf <- performance(preds, "tpr", "fpr")
performance(preds, "auc")@y.values
confusionMatrix(pred, dataset$type)
plot(perf, col = 'red', lwd = 3)
abline(a = 0, b = 1, lwd = 2, lty = 2, col = "gray")
1 Answer
To plot a receiver operating characteristic (ROC) curve, you need to hand over the continuous output of the classifier, e.g. posterior probabilities. That is, you need to call predict(data.rf, newdata, type = "prob").
Predicting with type = "response" already gives you the "hardened" factor as output. Thus, your working point is implicitly fixed already: a single set of hard labels yields just one (FPR, TPR) pair, which is why the plot shows only one point. In that respect, your plot is correct.
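As a minimal sketch of the difference, the snippet below fits the same kind of forest and passes class probabilities, not hard labels, to ROCR. Several things here are assumptions, not part of the question: a two-class subset of the built-in iris data stands in for the asker's `dataset`, the names `prob.list` and `pos.prob` are illustrative, and the second factor level is treated as the positive class (ROCR's default ordering).

```r
library(party)  # cforest, cforest_unbiased
library(ROCR)   # prediction, performance

set.seed(55)

# Assumption: stand-in data with a two-level factor response named `type`
dataset <- droplevels(subset(iris, Species != "setosa"))
names(dataset)[names(dataset) == "Species"] <- "type"

data.controls <- cforest_unbiased(ntree = 100, mtry = 3)
data.rf <- cforest(type ~ ., data = dataset, controls = data.controls)

# type = "prob" returns a list with one entry per observation,
# each holding the class probabilities in factor-level order
prob.list <- predict(data.rf, type = "prob")

# Keep the probability of the second factor level (the "positive" class)
pos.prob <- sapply(prob.list, function(p) p[2])

# Continuous scores give one (FPR, TPR) pair per threshold, i.e. a curve
preds <- prediction(pos.prob, dataset$type)
perf <- performance(preds, "tpr", "fpr")
auc <- performance(preds, "auc")@y.values[[1]]

plot(perf, col = "red", lwd = 3)
abline(a = 0, b = 1, lwd = 2, lty = 2, col = "gray")
```

Note that these are still predictions on the training data, so the curve will look better than it would on unseen data.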
Side note: in-bag predictions of a random forest (predictions for the same observations the trees were grown on) will be highly overoptimistic!
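One way to see this is to compare the AUC of in-bag predictions with that of out-of-bag predictions, which party's predict method exposes through its OOB argument. This is only a sketch under assumptions: the two-class iris subset again stands in for the real data, `auc_of` is a hypothetical helper written for this example, and a separate held-out test set passed as newdata would serve the same purpose.

```r
library(party)  # cforest, cforest_unbiased
library(ROCR)   # prediction, performance

set.seed(55)

# Assumption: stand-in data with a two-level factor response named `type`
dataset <- droplevels(subset(iris, Species != "setosa"))
names(dataset)[names(dataset) == "Species"] <- "type"

data.rf <- cforest(type ~ ., data = dataset,
                   controls = cforest_unbiased(ntree = 100, mtry = 3))

# Hypothetical helper: AUC from a list of per-observation class probabilities
auc_of <- function(prob.list, labels) {
  pos.prob <- sapply(prob.list, function(p) p[2])  # second factor level
  performance(prediction(pos.prob, labels), "auc")@y.values[[1]]
}

# In-bag: every tree votes, including trees that saw the observation
auc.inbag <- auc_of(predict(data.rf, type = "prob"), dataset$type)

# Out-of-bag: only trees that did not see the observation vote
auc.oob <- auc_of(predict(data.rf, type = "prob", OOB = TRUE), dataset$type)

cat("in-bag AUC:", auc.inbag, "\n")
cat("out-of-bag AUC:", auc.oob, "\n")
```

The out-of-bag figure is the more honest estimate of the two. On an easy, well-separated data set like this stand-in the gap may be small.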