Area under the precision-recall curve in R, or other summary quantities

I plan to use the precision-recall plot (PR plot) to compare models. See the attached figure (partial screenshot, sorry!) below. I have the true positives, true negatives, false positives and false negatives at hand, and I need a single summary quantity for each model. Here are my questions:

1. The area under the PR curve (AUC) is the first quantity, but I don't know how to calculate it in R. I do NOT want to use a package like ROCR, because I wrote all the code myself and I'd like to compute this from the quantities I already have. There seem to be many ways to do it; which one is the easiest to implement?

2. Another quantity is the F-measure: the harmonic mean of precision and recall, also called the traditional F-measure or balanced F-score. Is it better than the AUC in #1, or do the two describe different things? Moreover, since I have a whole set of recall and precision values, how can I calculate a single F-measure in this case (see the figure below)?

Thank you!

1 solution

#1

To calculate the AUC of a curve, you can use a numerical integration function such as trapz() in the caTools package, which applies the trapezoidal rule.

auc <- trapz(recall, precision)
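Since the question asks for a package-free approach, the same trapezoidal rule can be written in a few lines of base R. This is a minimal sketch, not the caTools implementation: the function name pr_auc and the example values are hypothetical, and it assumes recall and precision are equal-length numeric vectors with one entry per cutoff. Unlike a plain trapezoid sum, it sorts the points by recall first.

```r
# Trapezoidal-rule area under a precision-recall curve, base R only.
# Assumption: `recall` and `precision` are equal-length numeric vectors,
# one entry per cutoff. `pr_auc` is a hypothetical helper name.
pr_auc <- function(recall, precision) {
  stopifnot(length(recall) == length(precision), length(recall) >= 2)
  ord <- order(recall)          # integrate from low to high recall
  x <- recall[ord]
  y <- precision[ord]
  n <- length(x)
  # Sum of trapezoid areas: width times average of the two heights
  sum(diff(x) * (y[-n] + y[-1]) / 2)
}

# Made-up example values, for illustration only
recall    <- c(0.0, 0.2, 0.5, 0.8, 1.0)
precision <- c(1.0, 0.9, 0.7, 0.5, 0.3)
auc <- pr_auc(recall, precision)
```

The area only covers the range of recall values actually observed, so curves that stop at different recall levels are not directly comparable without some extrapolation choice.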

The F-score is the harmonic mean of precision and recall at a given cutoff value. In your case, you would get many F-scores per curve, one for each cutoff, so it would not summarize the curve the way you want.
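To make the "one F-score per cutoff" point concrete, here is a small base-R sketch that computes F1 at every cutoff from vectors of precision and recall. The function name f1_score and the example values are hypothetical. Taking the maximum over cutoffs is shown only as one possible way to collapse the values into a single number; it is an assumption added here, not something the answer recommends.

```r
# F1 at each cutoff: harmonic mean of precision and recall.
# `f1_score` is a hypothetical helper name.
f1_score <- function(precision, recall) {
  f1 <- 2 * precision * recall / (precision + recall)
  f1[precision + recall == 0] <- 0   # treat 0/0 as F1 = 0
  f1
}

# Made-up example values, one pair per cutoff
recall    <- c(0.2, 0.5, 0.8, 1.0)
precision <- c(0.9, 0.7, 0.5, 0.3)

f1     <- f1_score(precision, recall)  # one F1 value per cutoff
best   <- which.max(f1)                # index of the cutoff with the highest F1
max_f1 <- f1[best]
```

The result is a vector as long as the number of cutoffs, which is why a single F-score describes one operating point and not the whole curve.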

The AUC describes the performance of the model across the possible cutoffs on the model's continuous output. The F-score describes the model at one particular cutpoint; it is more of a way to combine recall and precision into a single statistic.

Be careful when explaining it, though. AUC is usually discussed in the context of sensitivity and specificity (the ROC curve), so say explicitly that yours is the area under the precision-recall curve.
