T
- the type of instances that the
BloomFilter
accepts
@Beta public final class BloomFilter<T> extends Objectimplements Predicate <T>, Serializable
T
. A Bloom filter offers an approximate containment test with one-sided error: if it claims that an element is contained in it, this might be in error, but if it claims that an element is
not contained in it, then this is definitely true.
If you are unfamiliar with Bloom filters, this nice tutorial may help you understand how they work.
The false positive probability (FPP
) of a bloom filter is defined as the probability that mightContain(Object) will erroneously return true
for an object that has not actually been put in the BloomFilter
.
Bloom filters are serializable. They also support a more compact serial representation via the writeTo(java.io.OutputStream)
and readFrom(java.io.InputStream, com.google.common.hash.Funnel<T>)
methods. Both serialized forms will continue to be supported by future versions of this library. However, serial forms generated by newer versions of the code may not be readable by older versions of the code (e.g., a serialized bloom filter generated today may not be readable by a binary that was compiled 6 months ago).
Modifier and Type | Method and Description |
---|---|
boolean |
apply(T input)
Deprecated.
Provided only to satisfy the
Predicate interface; use mightContain(T) instead.
|
BloomFilter |
copy()
Creates a new
BloomFilter that's a copy of this instance.
|
static <T> BloomFilter |
create(Funnel
Creates a
BloomFilter
with the expected number of insertions and a default expected false positive probability of 3%.
|
static <T> BloomFilter |
create(Funnel
Creates a
BloomFilter
with the expected number of insertions and expected false positive probability.
|
static <T> BloomFilter |
create(Funnel
Creates a
BloomFilter
with the expected number of insertions and a default expected false positive probability of 3%.
|
static <T> BloomFilter |
create(Funnel
Creates a
BloomFilter
with the expected number of insertions and expected false positive probability.
|
boolean |
equals(Object
Indicates whether another object is equal to this predicate.
|
double |
expectedFpp()
Returns the probability that
mightContain(Object) will erroneously return
true for an object that has not actually been put in the
BloomFilter .
|
int |
hashCode()
|
boolean |
isCompatible(BloomFilter
Determines whether a given bloom filter is compatible with this bloom filter.
|
boolean |
mightContain(T object)
Returns
true if the element
might have been put in this Bloom filter,
false if this is
definitely not the case.
|
boolean |
put(T object)
Puts an element into this
BloomFilter .
|
void |
putAll(BloomFilter
Combines this bloom filter with another bloom filter by performing a bitwise OR of the underlying data.
|
static <T> BloomFilter |
readFrom(InputStream
Reads a byte stream, which was written by
writeTo(OutputStream), into a
BloomFilter<T> .
|
void |
writeTo(OutputStream
Writes this
BloomFilter to an output stream, with a custom format (not Java serialization).
|
public BloomFilter<T> copy()
BloomFilter
that's a copy of this instance. The new instance is equal to this instance but shares no mutable state.
public boolean mightContain(T object)
true
if the element
might have been put in this Bloom filter,
false
if this is
definitely not the case.
@Deprecated public boolean apply(T input)
Predicate
interface; use mightContain(T)
instead.
Predicate
input
. This method is
generally expected, but not absolutely required, to have the following properties:
Objects.equal
(a, b)
implies that predicate.apply(a) == predicate.apply(b))
. public boolean put(T object)
BloomFilter
. Ensures that subsequent invocations of
mightContain(Object)
with the same element will always return
true
.
object
has been added to the filter. If the bits haven't changed, this
might be the first time
object
has been added to the filter. Note that
put(t)
always returns the
opposite result to what
mightContain(t)
would have returned at the time it is called."
void
return type})
public double expectedFpp()
true
for an object that has not actually been put in the
BloomFilter
.
Ideally, this number should be close to the fpp
parameter passed in create(Funnel, int, double), or smaller. If it is significantly higher, it is usually the case that too many elements (more than expected) have been put in the BloomFilter
, degenerating it.
public boolean isCompatible(BloomFilter<T> that)
that
- The bloom filter to check for compatibility.
public void putAll(BloomFilter<T> that)
that
- The bloom filter to combine this bloom filter with. It is not mutated.
IllegalArgumentException
- if
isCompatible(that) == false
public boolean equals(Objectobject)
Predicate
Most implementations will have no reason to override the behavior of Object
. However, an implementation may also choose to return true
whenever object
is a Predicate
that it considers interchangeable with this one. "Interchangeable" typically means that this.apply(t) == that.apply(t)
for all t
of type T
). Note that a false
result from this method does not imply that the predicates are known not to be interchangeable.
public int hashCode()
public static <T> BloomFilter<T> create(Funnel <? super T> funnel, int expectedInsertions, double fpp)
BloomFilter
with the expected number of insertions and expected false positive probability.
Note that overflowing a BloomFilter
with significantly more elements than specified, will result in its saturation, and a sharp deterioration of its false positive probability.
The constructed BloomFilter<T>
will be serializable if the provided Funnel<T>
is.
It is recommended that the funnel be implemented as a Java enum. This has the benefit of ensuring proper serialization and deserialization, which is important since equals(java.lang.Object)
also relies on object identity of funnels.
funnel
- the funnel of T's that the constructed
BloomFilter<T>
will use
expectedInsertions
- the number of expected insertions to the constructed
BloomFilter<T>
; must be positive
fpp
- the desired false positive probability (must be positive and less than 1.0)
BloomFilter
public static <T> BloomFilter<T> create(Funnel <? super T> funnel, long expectedInsertions, double fpp)
BloomFilter
with the expected number of insertions and expected false positive probability.
Note that overflowing a BloomFilter
with significantly more elements than specified, will result in its saturation, and a sharp deterioration of its false positive probability.
The constructed BloomFilter<T>
will be serializable if the provided Funnel<T>
is.
It is recommended that the funnel be implemented as a Java enum. This has the benefit of ensuring proper serialization and deserialization, which is important since equals(java.lang.Object)
also relies on object identity of funnels.
funnel
- the funnel of T's that the constructed
BloomFilter<T>
will use
expectedInsertions
- the number of expected insertions to the constructed
BloomFilter<T>
; must be positive
fpp
- the desired false positive probability (must be positive and less than 1.0)
BloomFilter
public static <T> BloomFilter<T> create(Funnel <? super T> funnel, int expectedInsertions)
BloomFilter
with the expected number of insertions and a default expected false positive probability of 3%.
Note that overflowing a BloomFilter
with significantly more elements than specified, will result in its saturation, and a sharp deterioration of its false positive probability.
The constructed BloomFilter<T>
will be serializable if the provided Funnel<T>
is.
It is recommended that the funnel be implemented as a Java enum. This has the benefit of ensuring proper serialization and deserialization, which is important since equals(java.lang.Object)
also relies on object identity of funnels.
funnel
- the funnel of T's that the constructed
BloomFilter<T>
will use
expectedInsertions
- the number of expected insertions to the constructed
BloomFilter<T>
; must be positive
BloomFilter
public static <T> BloomFilter<T> create(Funnel <? super T> funnel, long expectedInsertions)
BloomFilter
with the expected number of insertions and a default expected false positive probability of 3%.
Note that overflowing a BloomFilter
with significantly more elements than specified, will result in its saturation, and a sharp deterioration of its false positive probability.
The constructed BloomFilter<T>
will be serializable if the provided Funnel<T>
is.
It is recommended that the funnel be implemented as a Java enum. This has the benefit of ensuring proper serialization and deserialization, which is important since equals(java.lang.Object)
also relies on object identity of funnels.
funnel
- the funnel of T's that the constructed
BloomFilter<T>
will use
expectedInsertions
- the number of expected insertions to the constructed
BloomFilter<T>
; must be positive
BloomFilter
public void writeTo(OutputStreamout) throws IOException
BloomFilter
to an output stream, with a custom format (not Java serialization). This has been measured to save at least 400 bytes compared to regular serialization.
Use readFrom(InputStream, Funnel) to reconstruct the written BloomFilter.
IOException
public static <T> BloomFilter<T> readFrom(InputStream in, Funnel <T> funnel) throws IOException
BloomFilter<T>
. The
Funnel
to be used is not encoded in the stream, so it must be provided here.
Warning: the funnel provided
must behave identically to the one used to populate the original Bloom filter!
IOException
- if the InputStream throws an
IOException
, or if its data does not appear to be a BloomFilter serialized using the
writeTo(OutputStream) method.