Will a Bloom filter return false positives under certain conditions?

Assume a Bloom filter API with two parameters: (1) the number of bits in the filter (n) and (2) the expected number of insertions (m).

Question:

Will m > n always lead to complete false positives? By "complete" I mean: once m > n, will every call to the 'contains(element)' method return true?

1 Answer

#1

A Bloom filter answers yes to every query not when m > n, but when all n of its bits are 1: every query then checks h positions (where h is the number of hash functions) and finds h 1s. Having m > n does not by itself guarantee this, because several insertions can set the same bits. Still, the typical setup that optimizes the tradeoff between space and false positive rate is one in which the probability of any given bit being set is 1/2. The analysis is shown in the Wikipedia article on Bloom filters: http://en.wikipedia.org/wiki/Bloom_filter
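To make the distinction concrete, here is a minimal Python sketch, not a reference implementation. The class `BloomFilter`, its method names, and the use of salted SHA-256 to derive the h positions are all assumptions made for illustration; the question does not specify any of them. It inserts m = n + 1 elements into an n-bit filter with h = 1 and checks that the filter is not yet saturated, then keeps inserting until every bit is 1, at which point every query returns true.

```python
import hashlib


class BloomFilter:
    """Toy Bloom filter (hypothetical API, for illustration only)."""

    def __init__(self, num_bits, num_hashes):
        self.num_bits = num_bits        # n in the question
        self.num_hashes = num_hashes    # h in the answer
        self.bits = [False] * num_bits

    def _positions(self, item):
        # Assumption: derive h positions by salting SHA-256 with the hash index.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{item}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos] = True

    def contains(self, item):
        # True only if every one of the h probed bits is set.
        return all(self.bits[pos] for pos in self._positions(item))

    def fill_ratio(self):
        return sum(self.bits) / self.num_bits


n = 64
bf = BloomFilter(num_bits=n, num_hashes=1)

# Phase 1: insert m = n + 1 elements, so m > n.
inserted = [f"item-{i}" for i in range(n + 1)]
for item in inserted:
    bf.add(item)

fill_after_m = bf.fill_ratio()

# Probe elements that were never inserted; any "yes" is a false positive.
probes = [f"probe-{i}" for i in range(1000)]
false_positives = sum(bf.contains(p) for p in probes)
print(f"m > n: fill ratio {fill_after_m:.2f}, "
      f"false positives {false_positives}/{len(probes)}")

# Phase 2: keep inserting until all n bits are 1 (saturation).
for i in range(100_000):
    if bf.fill_ratio() == 1.0:
        break
    bf.add(f"extra-{i}")

saturated_hits = sum(bf.contains(p) for p in probes)
print(f"saturated: fill ratio {bf.fill_ratio():.2f}, "
      f"positives {saturated_hits}/{len(probes)}")
```

With a single hash function, the expected fraction of set bits after m insertions is 1 - (1 - 1/n)^m, which is well below 1 when m is only slightly larger than n. Using more hash functions fills the filter faster, but the same condition applies: only a fully set bit array forces every query to return true.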
