Precision-Recall values of retrieved documents

Time: 2022-12-07 11:11:08

I'm learning about precision and recall for document retrieval, and I'm having trouble understanding this particular question.

The table below shows the relevance of the top 6 results returned by two ranked retrieval search engines, denoted A and B. '+' indicates a relevant document and '-' indicates a non-relevant document.

[Table image not preserved: relevance (+/-) of the top 6 results for engines A and B]

Assuming that the total number of relevant documents in the collection was 4, compute precision-recall values for the two engines for the top 1, 2, 3, 4, 5 and 6 results.


The solution given for search engine A was:


Top k     | 1    | 2    | 3     | 4    | 5    | 6
Precision | 100% | 50%  | 33.3% | 25%  | 40%  | 50%
Recall    | 25%  | 25%  | 25%   | 25%  | 50%  | 75%

The solution for B:


Top k     | 1    | 2    | 3     | 4    | 5    | 6
Precision | 100% | 100% | 66.6% | 50%  | 60%  | 50%
Recall    | 25%  | 50%  | 50%   | 50%  | 75%  | 75%

I know how to calculate these for single documents, and that Precision = TP/(TP+FP) and Recall = TP/(TP+FN). I'm just not sure how some of the values above are calculated.

1 solution

#1


This is too long for a comment.


Instead of trying to memorize formulas, try to understand the concepts.


"Precision" is: What proportion of the results are correct? Hence, for both A and B, if you take the top result, it is correct. The precision is 100%.


"Recall" is: What proportion of the correct results are present? Hence, for both A and B, if you take the top result, you have one out of four correct values, so the recall is 25%.
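
The same reasoning extends to every cutoff k: count the relevant documents among the top k results (the true positives), divide by k for precision (TP + FP is simply k), and divide by the 4 relevant documents in the collection for recall (TP + FN). Here is a minimal Python sketch of that. Note that the relevance table from the question is not reproduced here, so the `+`/`-` sequences below are an assumption, inferred from the posted solution values; the function and variable names are just illustrative.

```python
from typing import List, Tuple

# Relevance of the top 6 results ('+' relevant, '-' non-relevant).
# ASSUMPTION: the original table is not available here; these rankings
# are inferred from the solution values quoted in the question.
RANKINGS = {"A": "+---++", "B": "++--+-"}
TOTAL_RELEVANT = 4  # stated in the question


def precision_recall_at_k(ranking: str, total_relevant: int) -> List[Tuple[int, float, float]]:
    """Return (k, precision@k, recall@k) for k = 1..len(ranking)."""
    rows = []
    hits = 0  # relevant documents seen so far (true positives)
    for k, mark in enumerate(ranking, start=1):
        if mark == "+":
            hits += 1
        # precision: hits out of the k results returned so far
        # recall: hits out of all relevant documents in the collection
        rows.append((k, hits / k, hits / total_relevant))
    return rows


if __name__ == "__main__":
    for engine, ranking in RANKINGS.items():
        print(f"Engine {engine}")
        for k, p, r in precision_recall_at_k(ranking, TOTAL_RELEVANT):
            print(f"  top {k}: precision = {p:.1%}, recall = {r:.1%}")
```

For engine A at k = 5, for example, two of the five results are relevant, so precision is 2/5 = 40% and recall is 2/4 = 50%, which matches the given solution.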
