3 parts: 1) Bloom Filter, 2) Flajolet-Martin algorithm, 3) AMS algorithm
-
Have a branch named bloom_filter.
-
Installed nltk and bitarray.
-
Implemented and tested a universal hash function.
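The notes do not record which hash family was used. As a rough sketch only, one common universal family is the Carter-Wegman form h(x) = ((a*x + b) mod p) mod m. The names make_universal_hash, word_to_int and the prime P below are assumptions for illustration, not taken from the branch.

```python
import random

# Assumption: a Mersenne prime used as the modulus p.
P = (1 << 61) - 1


def make_universal_hash(m, rng=random):
    """Return h(x) = ((a*x + b) mod P) mod m with randomly drawn a and b."""
    a = rng.randrange(1, P)  # a must be non-zero
    b = rng.randrange(0, P)

    def h(x):
        return ((a * x + b) % P) % m

    return h


def word_to_int(word):
    """Map a word to an integer key.

    Assumed encoding: UTF-8 bytes read as a big-endian integer, reduced mod P.
    Reducing long words mod P is a simplification made for this sketch.
    """
    return int.from_bytes(word.encode("utf-8"), "big") % P


# Usage: hash a word into a table of 1000 slots.
h = make_universal_hash(1000, random.Random(0))
slot = h(word_to_int("London"))
```

Drawing a fresh (a, b) pair per call gives independent hash functions, which is what a Bloom filter with several hashes needs.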
-
Bloom filter built and timed.
- Takes < 200 seconds
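For reference, a minimal Bloom filter could look like the sketch below. It is not the code on the branch: a plain bytearray stands in for bitarray so that it runs with the standard library alone, and the class name, the seed parameter and the hash family are all assumptions.

```python
import random

P = (1 << 61) - 1  # assumed prime modulus for the hash family


class BloomFilter:
    """Bit array of num_bits bits probed by num_hashes hash functions."""

    def __init__(self, num_bits, num_hashes, seed=0):
        rng = random.Random(seed)
        self.num_bits = num_bits
        # One byte holds 8 bits; round up so every bit position fits.
        self.bits = bytearray((num_bits + 7) // 8)
        # One (a, b) pair per hash function h(x) = ((a*x + b) mod P) mod num_bits.
        self.params = [
            (rng.randrange(1, P), rng.randrange(0, P)) for _ in range(num_hashes)
        ]

    def _positions(self, word):
        key = int.from_bytes(word.encode("utf-8"), "big") % P
        for a, b in self.params:
            yield ((a * key + b) % P) % self.num_bits

    def add(self, word):
        for pos in self._positions(word):
            self.bits[pos >> 3] |= 1 << (pos & 7)

    def __contains__(self, word):
        # True means "possibly present"; False means "definitely absent".
        return all(
            self.bits[pos >> 3] & (1 << (pos & 7)) for pos in self._positions(word)
        )


# Usage: add a few words, then query.
bf = BloomFilter(num_bits=1024, num_hashes=3)
for w in ["Alice", "Berlin", "Tuesday"]:
    bf.add(w)
```

A word that was added is always reported as present; a word that was never added may still be reported as present, which is the false positive case measured later in these notes.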
-
Data stream checked against filter and timed.
- Takes ~ 30 seconds
Total number of word collisions: 66350
Number of words in Proper.txt: 32657
Number of false positives: 33693
False positive rate: 33693/66350 = 0.508
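The figures above are consistent with the false positive count being the total collisions minus the number of words in Proper.txt (66350 - 32657 = 33693). The sketch below re-derives them and shows one way such counts might be gathered from a stream. The function false_positive_stats and its arguments are hypothetical, not the branch's code.

```python
def false_positive_stats(stream, maybe_contains, true_members):
    """Count filter hits and how many of those hits are not real members.

    maybe_contains: callable returning True when the filter reports a hit.
    true_members:   exact set of words that were inserted into the filter.
    """
    hits = 0
    false_positives = 0
    for word in stream:
        if maybe_contains(word):
            hits += 1
            if word not in true_members:
                false_positives += 1
    rate = false_positives / hits if hits else 0.0
    return hits, false_positives, rate


# Arithmetic check against the figures recorded in the notes.
total_collisions = 66350
words_in_proper = 32657
false_positives = total_collisions - words_in_proper
rate = false_positives / total_collisions
print(false_positives, round(rate, 3))
```

Note that the rate here is computed as in the notes, with total collisions as the denominator.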
- completed
- completed