Leveraging bloom filters on Redis

Leveraging bloom › bloom-presentation.pdfTuning bloom filters •False positive probability: •Depends on your use case •Initial capacity: •Can't be too generous •Can't be


Cristian [email protected] | [email protected]

https://cristian.io

Stream processing at Scopely

Idempotence

An operation is said to be idempotent when applying it multiple times has the same effect as applying it once.

Simplest approach to idempotence

Idempotence with Redis sets
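
A rough sketch of the set-based approach, assuming every event carries a unique ID. It relies on Redis `SADD` returning 1 when the member is new and 0 when it is already in the set. `InMemorySet`, `process_once`, and the `seen_events` key are hypothetical names used only for illustration; the stand-in class lets the example run without a Redis server.

```python
class InMemorySet:
    """Hypothetical stand-in for a Redis client; mimics SADD's return value."""

    def __init__(self):
        self._sets = {}

    def sadd(self, key, member):
        members = self._sets.setdefault(key, set())
        if member in members:
            return 0  # already present: nothing was added
        members.add(member)
        return 1  # newly added


def process_once(store, event_id, handler, key="seen_events"):
    """Run handler only the first time event_id is seen; return True if it ran."""
    if store.sadd(key, event_id) == 1:
        handler(event_id)
        return True
    return False  # duplicate delivery: skip it


store = InMemorySet()
processed = []
# The third event repeats the first ID, so it should be skipped.
results = [process_once(store, eid, processed.append) for eid in ["a", "b", "a"]]
```

The set keeps every ID it has ever seen, which is what makes the memory cost grow with the number of records.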

Memory usage per idempotence store

320 million records/day ≈ 70GB of memory

Is there a better way?

• Space-efficient

• Cost-effective

• More performant

• Awesome

Enter bloom filters

Probabilistic data structure to check for item membership

Bloom filters query

• Definitely not in the set

• Probably in the set

• Configurable error rate
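
The three properties above can be seen in a minimal Bloom filter sketch. This is an illustration under assumptions, not the implementation used in the talk: the class name, the SHA-256 double-hashing scheme, and the sample keys are all choices made here. The sizing uses the standard textbook formulas for the number of bits and hash functions.

```python
import hashlib
import math


class BloomFilter:
    """Minimal Bloom filter sketch; not tuned for production use."""

    def __init__(self, capacity, error_rate):
        # Standard sizing: m bits and k hash functions for n items at rate p.
        self.num_bits = math.ceil(-capacity * math.log(error_rate) / math.log(2) ** 2)
        self.num_hashes = max(1, round(self.num_bits / capacity * math.log(2)))
        self.bits = bytearray((self.num_bits + 7) // 8)

    def _positions(self, item):
        # Derive k bit positions from two 64-bit slices of one SHA-256 digest
        # (double hashing).
        digest = hashlib.sha256(item.encode("utf-8")).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1
        return [(h1 + i * h2) % self.num_bits for i in range(self.num_hashes)]

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, item):
        # False => definitely not in the set; True => probably in the set.
        return all(
            self.bits[pos // 8] & (1 << (pos % 8)) for pos in self._positions(item)
        )


bf = BloomFilter(capacity=1000, error_rate=0.01)
for i in range(1000):
    bf.add(f"user-{i}")

# Items that were added are always reported as present (no false negatives).
no_false_negatives = all(bf.might_contain(f"user-{i}") for i in range(1000))
# Items that were never added are occasionally reported as present.
false_positives = sum(bf.might_contain(f"other-{i}") for i in range(10000))
```

With a 1% target error rate, `false_positives` should land somewhere around 1% of the 10,000 lookups, while `no_false_negatives` is always true.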

Bloom filters space efficiency

Given 10,000,000 UUIDs...

• Redis set: 1GB

• Plain text: ~300 MB

• gzip: ~150 MB

• Bloom filter with 1e-05 error rate: ~30MB (i.e., 1 in 100,000)

• Bloom filter with 1e-11 error rate: ~60MB (i.e., 1 in 100 billion)
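
The Bloom filter sizes above are close to what the standard sizing formula predicts, m = -n · ln(p) / (ln 2)² bits for n items at false positive probability p. The sketch below applies that formula to 10,000,000 items. It assumes a single, optimally sized filter; real implementations may add some overhead, and the function name is made up for this example.

```python
import math


def bloom_size_bytes(capacity, error_rate):
    """Optimal Bloom filter size in bytes: m = -n * ln(p) / (ln 2)^2 bits."""
    bits = -capacity * math.log(error_rate) / (math.log(2) ** 2)
    return bits / 8


n = 10_000_000
size_1e5_mb = bloom_size_bytes(n, 1e-5) / 1e6    # roughly 30 MB
size_1e11_mb = bloom_size_bytes(n, 1e-11) / 1e6  # roughly 66 MB
```

Size grows with the logarithm of 1/p, which is why a million-fold tighter error rate only about doubles the memory.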

Memory usage comparison

Sets 70GB vs Bloom Filters 7GB

Latency comparison

[Chart: latency of Redis sets vs. Bloom filters]

Bloom filters example

False positive == dropped data
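
In a deduplication pipeline, a false positive makes a brand-new record look like a duplicate, so it gets skipped. A back-of-the-envelope estimate of the cost, assuming the 320 million records/day quoted earlier, an error rate of 1e-05 (an assumption here, not a stated production setting), all records being unique, and the filter staying within its configured capacity:

```python
records_per_day = 320_000_000  # figure quoted earlier in the deck
error_rate = 1e-5              # assumed false positive probability

# Each false positive is a new record wrongly treated as already seen.
expected_dropped_per_day = records_per_day * error_rate
```

Under these assumptions that is on the order of 3,200 dropped records per day, which is why the error rate has to be chosen per use case.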

Bloom filters characteristics

• Capacity

• Error rate probability

Scaling bloom filters
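
One common way to grow a Bloom filter past its initial capacity is the scalable Bloom filter: when the current layer fills up, a larger layer with a tighter error rate is stacked on top, and lookups check every layer. The sketch below assumes that technique. The class names, the growth factor of 2, and the tightening ratio of 0.5 are illustrative choices, not values taken from the talk.

```python
import hashlib
import math


class _Layer:
    """One fixed-size Bloom filter layer (illustrative, not production code)."""

    def __init__(self, capacity, error_rate):
        self.capacity = capacity
        self.error_rate = error_rate
        self.count = 0
        self.num_bits = math.ceil(-capacity * math.log(error_rate) / math.log(2) ** 2)
        self.num_hashes = max(1, round(self.num_bits / capacity * math.log(2)))
        self.bits = bytearray((self.num_bits + 7) // 8)

    def _positions(self, item):
        digest = hashlib.sha256(item.encode("utf-8")).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1
        return [(h1 + i * h2) % self.num_bits for i in range(self.num_hashes)]

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)
        self.count += 1

    def might_contain(self, item):
        return all(self.bits[p // 8] & (1 << (p % 8)) for p in self._positions(item))


class ScalableBloomFilter:
    def __init__(self, initial_capacity, error_rate, growth=2, tightening=0.5):
        self.growth = growth
        self.tightening = tightening
        # Layer i gets error_rate * (1 - r) * r**i, so the per-layer rates
        # sum to at most error_rate across any number of layers.
        self.layers = [_Layer(initial_capacity, error_rate * (1 - tightening))]

    def add(self, item):
        """Add item; return False if it was (probably) already present."""
        if self.might_contain(item):
            return False
        last = self.layers[-1]
        if last.count >= last.capacity:
            # Current layer is full: stack a bigger layer with a tighter error rate.
            last = _Layer(last.capacity * self.growth, last.error_rate * self.tightening)
            self.layers.append(last)
        last.add(item)
        return True

    def might_contain(self, item):
        return any(layer.might_contain(item) for layer in self.layers)


sbf = ScalableBloomFilter(initial_capacity=100, error_rate=0.001)
for i in range(1000):
    sbf.add(f"event-{i}")

num_layers = len(sbf.layers)  # layers of capacity 100, 200, 400, 800
all_present = all(sbf.might_contain(f"event-{i}") for i in range(1000))
```

Each extra layer adds memory and one more lookup, so the initial capacity still matters even when the filter can grow.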

Tuning bloom filters

Size depends on capacity and error probability

Tuning bloom filters

• False positive probability:

• Depends on your use case

• Initial capacity:

• Can't be too generous

• Can't be too conservative
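
Both sides of the capacity trade-off can be put into numbers with the standard approximation for the false positive probability after n insertions, (1 - e^(-k·n/m))^k, where m is the number of bits and k the number of hash functions. The sketch below assumes a single fixed-size filter (a scalable implementation would add sub-filters instead of degrading); the function names and the 1,000,000 / 1e-05 figures are illustrative.

```python
import math


def optimal_params(capacity, error_rate):
    """Bits (m) and hash count (k) for a filter sized for `capacity` items."""
    m = math.ceil(-capacity * math.log(error_rate) / math.log(2) ** 2)
    k = max(1, round(m / capacity * math.log(2)))
    return m, k


def false_positive_rate(m, k, inserted):
    """Approximate false positive probability after `inserted` items."""
    return (1 - math.exp(-k * inserted / m)) ** k


capacity, target = 1_000_000, 1e-5
m, k = optimal_params(capacity, target)

# Too conservative: a filter sized for 1M items but loaded with 4M
# ends up far above the target error rate.
at_capacity = false_positive_rate(m, k, capacity)
overfilled = false_positive_rate(m, k, 4 * capacity)

# Too generous: reserving 10x the capacity costs 10x the memory up front.
right_sized_mb = m / 8 / 1e6
oversized_mb = optimal_params(10 * capacity, target)[0] / 8 / 1e6
```

At capacity the rate stays near the 1e-05 target; at four times the capacity it climbs to tens of percent, while the oversized filter takes roughly 30 MB instead of roughly 3 MB.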

First attempt: Lua scripts

Second attempt: bloomd

github.com/armon/bloomd

bloomd drawbacks

• Lack of High Availability

• No clustering support

• Maintenance

• Rigid API

• Feels like abandonware

ReBloom

Bloom filters as a Redis module

ReBloom example

> BF.RESERVE your_filter 0.00001 50000000
OK

> BF.ADD your_filter foo
1

> BF.EXISTS your_filter foo
1

> BF.EXISTS your_filter bar
0

ReBloom

• Clustering

• Redundancy/replication

• Lower cognitive overhead

• Powerful API

• No maintenance

Summary

• Bloom filters significantly reduce memory usage and latency

• Redis modules allow your custom data structures to scale

github.com/casidiablo

cristian.io