COMP 103 Bitsets


Sets, and more Sets!

Unsorted Array, Sorted Array, Linked List:
  O(n) for at least one of contains, add, remove

Binary Search Tree:
  O(log n) for everything, if balanced

Can we do even better?

BitSets, Hash tables:
  Same cost, regardless of size! O(1) !!!


More operations on Sets

Operations on single elements:
  contains
  add, remove

Operations on whole sets:
  size (cardinality) of a set
  iterate through the set
  intersection (values common to two sets)
  union (values in either of two sets)
  set difference (values in one set but not the other)
  test for equality/subset

Depending on the application, we may need any or all these operations to be fast!


Eg. Set Intersection

Example sets:
  B X D T Q W E V Z C R F
  F Y U J H I M X O K P T

Unsorted arrays / linked lists:
  Algorithm: ?
  Cost: # comparisons =

Sorted arrays / linked lists:
  Algorithm: ?
  Cost: # comparisons =
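One possible answer to the two "Algorithm: ?" prompts, as a hedged sketch rather than the course's official solution. The class and method names are made up for illustration, and the sorted arrays are simply the two example sets put into alphabetical order. The unsorted version may compare every item with every other item (up to n × m comparisons); the sorted version walks both arrays once (at most n + m steps).

```java
import java.util.ArrayList;
import java.util.List;

class IntersectionSketch {
    // Unsorted: compare each item of a against the items of b.
    // Worst case: a.length * b.length comparisons.
    static List<Character> intersectUnsorted(char[] a, char[] b) {
        List<Character> ans = new ArrayList<>();
        for (char x : a) {
            for (char y : b) {
                if (x == y) {
                    ans.add(x);
                    break; // a set has no duplicates, so stop at the first match
                }
            }
        }
        return ans;
    }

    // Sorted: walk both arrays in step, always moving past the smaller item.
    // Each step advances at least one index, so at most a.length + b.length steps.
    static List<Character> intersectSorted(char[] a, char[] b) {
        List<Character> ans = new ArrayList<>();
        int i = 0, j = 0;
        while (i < a.length && j < b.length) {
            if (a[i] < b[j]) {
                i++;
            } else if (a[i] > b[j]) {
                j++;
            } else {
                ans.add(a[i]);
                i++;
                j++;
            }
        }
        return ans;
    }

    public static void main(String[] args) {
        char[] a = "BXDTQWEVZCRF".toCharArray();
        char[] b = "FYUJHIMXOKPT".toCharArray();
        System.out.println(intersectUnsorted(a, b)); // [X, T, F]

        char[] sortedA = "BCDEFQRTVWXZ".toCharArray();
        char[] sortedB = "FHIJKMOPTUXY".toCharArray();
        System.out.println(intersectSorted(sortedA, sortedB)); // [F, T, X]
    }
}
```

With 12 items in each set, the unsorted version can need up to 144 comparisons, while the sorted version needs at most 24 steps.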


Set Intersection

Binary Search Trees:

Algorithm:

Cost:

Ex: Work out algorithms and costs for the other “whole set” operations, using different Set implementations.
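One way to approach the Binary Search Tree case, sketched under assumptions: java.util.TreeSet (a balanced tree from the standard library) stands in for whatever BST class the course uses, and the class and helper names are hypothetical. The idea is to search the larger tree once for each item of the smaller set, which costs roughly m searches of O(log n) each when the tree is balanced.

```java
import java.util.TreeSet;

class TreeIntersectionSketch {
    // For each item of the smaller set, look it up in the larger tree.
    // m items in the smaller set, n in the larger: about m * log(n) comparisons.
    static TreeSet<Character> intersect(TreeSet<Character> a, TreeSet<Character> b) {
        TreeSet<Character> small = a.size() <= b.size() ? a : b;
        TreeSet<Character> large = (small == a) ? b : a;
        TreeSet<Character> ans = new TreeSet<>();
        for (Character c : small) {
            if (large.contains(c)) {
                ans.add(c);
            }
        }
        return ans;
    }

    // Hypothetical helper: build a tree set from a string of letters.
    static TreeSet<Character> setOf(String letters) {
        TreeSet<Character> s = new TreeSet<>();
        for (char c : letters.toCharArray()) {
            s.add(c);
        }
        return s;
    }

    public static void main(String[] args) {
        TreeSet<Character> a = setOf("BXDTQWEVZCRF");
        TreeSet<Character> b = setOf("FYUJHIMXOKPT");
        System.out.println(intersect(a, b)); // [F, T, X]
    }
}
```

An alternative is to traverse both trees in order and merge the two sorted sequences, which costs about n + m steps instead.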


Bit Sets

If the range of possible elements for a set is:
  discrete
  finite
  not too big
... then can use an array of booleans:
  one cell for each possible element
  true if that element is in the set
  false if that element is not in the set

                 a b c d e f g h i j k l m n o p q r s t u v w x y z
a,e,i,o,u        ✔ ✗ ✗ ✗ ✔ ✗ ✗ ✗ ✔ ✗ ✗ ✗ ✗ ✗ ✔ ✗ ✗ ✗ ✗ ✗ ✔ ✗ ✗ ✗ ✗ ✗
b,d,f,h,k,l,t    ✗ ✔ ✗ ✔ ✗ ✔ ✗ ✔ ✗ ✗ ✔ ✔ ✗ ✗ ✗ ✗ ✗ ✗ ✗ ✔ ✗ ✗ ✗ ✗ ✗ ✗
g,j,p,q,y        ✗ ✗ ✗ ✗ ✗ ✗ ✔ ✗ ✗ ✔ ✗ ✗ ✗ ✗ ✗ ✔ ✔ ✗ ✗ ✗ ✗ ✗ ✗ ✗ ✔ ✗
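To make the letter examples concrete, here is a minimal sketch of a set of lowercase letters stored as 26 booleans. The class and method names are hypothetical, and it assumes every element is in the range 'a' to 'z', so that c - 'a' gives the cell index.

```java
class LetterSetSketch {
    // One cell per possible element: index 0 is 'a', index 25 is 'z'.
    static boolean[] letterSet(String letters) {
        boolean[] cells = new boolean[26];
        for (char c : letters.toCharArray()) {
            cells[c - 'a'] = true; // assumes c is in 'a'..'z'
        }
        return cells;
    }

    // Letters outside 'a'..'z' are never in the set.
    static boolean contains(boolean[] cells, char c) {
        return c >= 'a' && c <= 'z' && cells[c - 'a'];
    }

    public static void main(String[] args) {
        boolean[] vowels = letterSet("aeiou");
        System.out.println(contains(vowels, 'e')); // true
        System.out.println(contains(vowels, 'f')); // false
    }
}
```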


BitSet Implementation

    public class BitSet {
        // data[v] is true exactly when v is in the set
        private boolean[] data;

        public BitSet(int maxItems) {
            data = new boolean[maxItems];
        }

        // Values outside 0..data.length-1 can never be in the set
        public boolean contains(int value) {
            if (value < 0 || value >= data.length)
                return false;
            return data[value];
        }

        // Out-of-range values are silently ignored
        public void add(int value) {
            if (value >= 0 && value < data.length)
                data[value] = true;
        }

        public void remove(int value) {
            if (value >= 0 && value < data.length)
                data[value] = false;
        }
    }

Exercise: Extend add and remove to return booleans, or to signal an error. Or use an ArrayList and expand as needed.
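One possible take on the first part of the exercise, offered as a sketch rather than the expected answer. The class name ReportingBitSet is made up, and the choice that add and remove return true only when the set actually changed is an assumption (it mirrors how the Java Collections add and remove methods report changes).

```java
class ReportingBitSet {
    private final boolean[] data;

    ReportingBitSet(int maxItems) {
        data = new boolean[maxItems];
    }

    boolean contains(int value) {
        return value >= 0 && value < data.length && data[value];
    }

    // Returns true only if the set changed:
    // the value is in range and was not already present.
    boolean add(int value) {
        if (value < 0 || value >= data.length || data[value]) {
            return false;
        }
        data[value] = true;
        return true;
    }

    // Returns true only if the value was present and has now been removed.
    boolean remove(int value) {
        if (!contains(value)) {
            return false;
        }
        data[value] = false;
        return true;
    }
}
```

Signalling an error instead would mean throwing an exception (for example IllegalArgumentException) for out-of-range values rather than returning false.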


BitSets: Costs

Single-element operations:
  set.contains('f')   Cost: O(1)
  set.add('y')        Cost: O(1)
  set.remove('u')     Cost: O(1)

[Figure: boolean array indexed a to z, with the cells for u and y picked out]

Intersection:
  for (i = 0 … N)
      ans.data[i] = set1.data[i] && set2.data[i]
  Cost: O(N) (number of possible values!!) but NOT item comparisons!!

Other operations: union, difference, equal, subset, …?
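The other whole-set operations can follow the same pattern as the intersection loop. The sketch below works directly on boolean arrays; the class and method names are hypothetical, and it assumes both arrays have the same length N (the same range of possible values). Each operation makes one pass over the N cells.

```java
import java.util.Arrays;

class WholeSetOpsSketch {
    // Union: a value is in the answer if it is in either set.
    static boolean[] union(boolean[] a, boolean[] b) {
        boolean[] ans = new boolean[a.length];
        for (int i = 0; i < a.length; i++) {
            ans[i] = a[i] || b[i];
        }
        return ans;
    }

    // Difference: values in a but not in b.
    static boolean[] difference(boolean[] a, boolean[] b) {
        boolean[] ans = new boolean[a.length];
        for (int i = 0; i < a.length; i++) {
            ans[i] = a[i] && !b[i];
        }
        return ans;
    }

    // Subset: a is a subset of b if no value is in a without being in b.
    static boolean isSubset(boolean[] a, boolean[] b) {
        for (int i = 0; i < a.length; i++) {
            if (a[i] && !b[i]) {
                return false;
            }
        }
        return true;
    }

    // Equality: every cell matches.
    static boolean equal(boolean[] a, boolean[] b) {
        return Arrays.equals(a, b);
    }
}
```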


BitSets are the best

Very Fast!

Can be improved:
  boolean[] data uses just one bit in each memory location
  could use every bit, by thinking of the whole array as an int/long!
  can then operate on sets using bitwise operations: & and |.

  0 0 0 0 0 1 0 0 0 0 0 1 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 ….
  (Java might use less space than this, e.g. one byte per boolean.)

But:
  Values must be integers or characters (to index into an array)
  Number of possible values must be not too large (especially for intersection, union, iteration)

Eg: Days, months, timetable hours
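A minimal sketch of the "use every bit" idea, using a single long as a set of days. The names are hypothetical, and numbering the days 0 to 6 is an assumption for illustration; a long only has room for values 0 to 63. Intersection and union become a single & or | instead of a loop over N cells.

```java
class PackedDaySetSketch {
    // Bit d of the long is 1 exactly when day d is in the set (d must be 0..63).
    static long add(long set, int day) {
        return set | (1L << day);
    }

    static long remove(long set, int day) {
        return set & ~(1L << day);
    }

    static boolean contains(long set, int day) {
        return (set & (1L << day)) != 0;
    }

    static long intersection(long a, long b) {
        return a & b;
    }

    static long union(long a, long b) {
        return a | b;
    }

    // Number of elements = number of 1 bits.
    static int size(long set) {
        return Long.bitCount(set);
    }

    public static void main(String[] args) {
        long weekdays = 0b0011111L;      // days 0..4
        long busy = add(add(0L, 4), 5);  // days 4 and 5
        System.out.println(contains(intersection(weekdays, busy), 4)); // true
        System.out.println(size(union(weekdays, busy)));               // 6
    }
}
```

For ranges larger than 64 values, the standard library's java.util.BitSet packs bits into an array of longs in a similar way.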


O(1) Sets with big values?

What about:
  Sets of objects (including strings, URLs, email addresses)?
  Sets of floating point numbers (double)?

Need a way to compute an array index for an object, eg:
  add("A sentence that belongs in a set of sentences")

“Hashing”: the number is the “hash code” of the object

[Figure: a hash function maps the sentence to 581, which is used as an index into an array with cells 0 … N]
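A rough sketch of how an object could be turned into an array index, assuming Java's built-in hashCode() as the hash function. The class and method names are hypothetical, and this deliberately ignores collisions (two different objects landing on the same index), which a real hash table has to handle.

```java
class HashIndexSketch {
    private final boolean[] cells;

    HashIndexSketch(int size) {
        cells = new boolean[size];
    }

    // Turn any object into an index in 0..cells.length-1.
    // floorMod is used because hashCode() may be negative.
    int indexOf(Object item) {
        return Math.floorMod(item.hashCode(), cells.length);
    }

    void add(Object item) {
        cells[indexOf(item)] = true;
    }

    // Simplification: answers true if ANY item with this index was added,
    // so a different item that hashes to the same index also gives true.
    boolean mightContain(Object item) {
        return cells[indexOf(item)];
    }

    public static void main(String[] args) {
        HashIndexSketch set = new HashIndexSketch(1000);
        set.add("A sentence that belongs in a set of sentences");
        System.out.println(set.mightContain("A sentence that belongs in a set of sentences")); // true
    }
}
```

Computing the index takes the same time however many items are in the set, which is where the O(1) cost comes from.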