stewart-hodge
1
Concrete collections II
2
HashSet
• hash codes are used to organize elements in the collection; they are calculated from the state of an object
– hash codes are not necessarily unique
• so-called buckets are used to handle hash collisions
– initial bucket count is recommended to be between 75% and 150% of the expected element count
– automatic rehashing when the load factor threshold (default 75%) is reached
– on rehashing, the bucket count is doubled
• for example,
new HashSet <String> (101, 0.75f) // capacity, load factor (a float)
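The sizing advice above can be sketched as a small helper. The class name HashSetSizingDemo and the helper newSetFor are hypothetical names chosen for illustration, and the factor of 1.5 is just one choice within the 75% to 150% range mentioned on the slide.

```java
import java.util.HashSet;
import java.util.Set;

class HashSetSizingDemo {
    // Hypothetical helper: requests roughly 150% of the expected element count as buckets.
    static Set<String> newSetFor(int expectedCount) {
        int initialCapacity = Math.max(16, expectedCount * 3 / 2);
        return new HashSet<>(initialCapacity, 0.75f); // capacity, load factor
    }

    public static void main(String[] args) {
        Set<String> words = newSetFor(100);
        words.add("alpha");
        words.add("beta");
        boolean addedAgain = words.add("alpha"); // duplicate, so the set is unchanged
        System.out.println(words.size() + " " + addedAgain); // prints "2 false"
    }
}
```

Note that the load factor parameter is a float, so the literal needs the f suffix.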
4
HashSet (cont.)
• iteration in (seemingly) random order
• however, LinkedHashSet can preserve insertion order
• a general problem involving search structures:
– hashCode should not change when called multiple times on the same object
– changing the state of an element of a set (or of a key of a map) may corrupt the data structure
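The corruption problem can be reproduced in a few lines. This is a minimal sketch: the mutable Tag class and the class name MutableElementDemo are hypothetical and exist only to show the effect.

```java
import java.util.HashSet;
import java.util.Objects;
import java.util.Set;

class MutableElementDemo {
    // Hypothetical element type whose hash code depends on mutable state.
    static class Tag {
        String name;
        Tag(String name) { this.name = name; }
        @Override public boolean equals(Object o) {
            return o instanceof Tag && Objects.equals(name, ((Tag) o).name);
        }
        @Override public int hashCode() { return Objects.hashCode(name); }
    }

    // Returns whether the set still finds the element after its state was changed.
    static boolean foundAfterMutation() {
        Set<Tag> tags = new HashSet<>();
        Tag t = new Tag("draft");
        tags.add(t);
        t.name = "final"; // hash code changes while t is still filed under the old one
        return tags.contains(t);
    }

    public static void main(String[] args) {
        System.out.println(foundAfterMutation()); // prints "false"
    }
}
```

The element is still physically in the set (its size stays 1), but lookups by hash no longer reach it.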
5
On defining hash functions
• hashCode can be any integer, positive or negative
• definitions of equals and hashCode must be compatible so that:
if x.equals (y), then also x.hashCode () == y.hashCode ()
• equals compares two objects for "equality": do they have the same state
– the default implementation in Object compares the objects for "identity": are they the same object
• hash code is calculated recursively on every significant field and referenced object, and then combined into one integer result
• the programmer must determine what is significant
6
On defining hash functions (cont.)
• e.g., see Item.hashCode () in the Ex. 2-3 TreeSetTest
return 13 * description.hashCode() + 17 * partNumber;
• a compatible Item.equals:
if (this == otherObj) return true;
if (otherObj == null) return false;
if (getClass() != otherObj.getClass()) return false;
Item other = (Item) otherObj; // cast is safe!
return description.equals(other.description)
    && partNumber == other.partNumber;
• writing really good hash functions may involve mathematical/theoretical problems; see, e.g., [Bloch, 2001] for advice on writing Java hash functions
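Put together, the two fragments above might sit in a class like the following. This is a sketch, not the actual Ex. 2-3 source: the wrapper class ItemEqualityDemo, the constructor, and the field declarations are assumptions; only the bodies of equals and hashCode follow the slide.

```java
import java.util.HashSet;
import java.util.Set;

class ItemEqualityDemo {
    static final class Item {
        private final String description; // assumed field, as used on the slide
        private final int partNumber;     // assumed field, as used on the slide

        Item(String description, int partNumber) {
            this.description = description;
            this.partNumber = partNumber;
        }

        @Override public boolean equals(Object otherObj) {
            if (this == otherObj) return true;
            if (otherObj == null) return false;
            if (getClass() != otherObj.getClass()) return false;
            Item other = (Item) otherObj; // safe: same runtime class
            return description.equals(other.description)
                && partNumber == other.partNumber;
        }

        @Override public int hashCode() {
            // Uses exactly the fields that equals compares.
            return 13 * description.hashCode() + 17 * partNumber;
        }
    }

    public static void main(String[] args) {
        Set<Item> parts = new HashSet<>();
        parts.add(new Item("Toaster", 1234));
        // An equal but distinct object is found because the hash codes agree.
        System.out.println(parts.contains(new Item("Toaster", 1234))); // prints "true"
    }
}
```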
7
TreeSet
• TreeSet defines a sorted collection, i.e., implements SortedSet
• elements implement the interface Comparable <T>:
public int compareTo (T other) {
    return ... // negative, 0, or positive if <, =, >
}
• otherwise, TreeSet must use a Comparator <Item>:
SortedSet <Item> sortByComparator =
    new TreeSet <Item> (
        new Comparator <Item> () {
            public int compare (Item a, Item b) {
                ...
                return descrA.compareTo (descrB);
            }});
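Both ways of ordering a TreeSet can be compared side by side. The sketch below assumes an Item with a description and a part number whose natural ordering is by part number; the class name TreeSetOrderingDemo, the helper methods, and the sample data are hypothetical.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.SortedSet;
import java.util.TreeSet;

class TreeSetOrderingDemo {
    static final class Item implements Comparable<Item> {
        final String description;
        final int partNumber;
        Item(String description, int partNumber) {
            this.description = description;
            this.partNumber = partNumber;
        }
        // Natural ordering (an assumption here): by part number.
        @Override public int compareTo(Item other) {
            return Integer.compare(partNumber, other.partNumber);
        }
    }

    static List<String> descriptionsOf(SortedSet<Item> items) {
        items.add(new Item("Toaster", 1234));
        items.add(new Item("Widget", 4562));
        items.add(new Item("Modem", 9912));
        List<String> result = new ArrayList<>();
        for (Item item : items) result.add(item.description); // sorted iteration
        return result;
    }

    static List<String> byPartNumber() {
        return descriptionsOf(new TreeSet<>()); // uses Comparable
    }

    static List<String> byDescription() {
        Comparator<Item> byDescr = Comparator.comparing(item -> item.description);
        return descriptionsOf(new TreeSet<>(byDescr)); // uses the Comparator
    }

    public static void main(String[] args) {
        System.out.println(byPartNumber());  // prints "[Toaster, Widget, Modem]"
        System.out.println(byDescription()); // prints "[Modem, Toaster, Widget]"
    }
}
```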
8
TreeSet (cont.)
• iteration is done in sorted order
• comparing strings while ignoring case is a common requirement
– the String class defines a Comparator <String> object String.CASE_INSENSITIVE_ORDER
– may result in an unsatisfactory ordering (since locales are not considered)
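A short sketch of the difference between the natural String ordering and String.CASE_INSENSITIVE_ORDER in a TreeSet; the class name CaseInsensitiveOrderDemo and the sample words are made up for illustration.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;
import java.util.SortedSet;
import java.util.TreeSet;

class CaseInsensitiveOrderDemo {
    // Builds a sorted set with the given ordering and returns its iteration order.
    static List<String> sorted(Comparator<String> order) {
        SortedSet<String> names = new TreeSet<>(order);
        Collections.addAll(names, "banana", "Apple", "cherry", "APPLE");
        return new ArrayList<>(names);
    }

    public static void main(String[] args) {
        // Natural ordering compares char values, so upper case sorts first.
        System.out.println(sorted(Comparator.naturalOrder()));
        // prints "[APPLE, Apple, banana, cherry]"

        // Ignoring case, "Apple" and "APPLE" compare as equal, so only the first one is kept.
        System.out.println(sorted(String.CASE_INSENSITIVE_ORDER));
        // prints "[Apple, banana, cherry]"
    }
}
```

When locale-specific ordering matters, a java.text.Collator can be passed to the TreeSet constructor instead, since it is also a Comparator.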
9
Concrete Maps
• associative table consisting of key-value pairs
• two concrete implementations: HashMap, TreeMap
value = map.get (keyObject) // or null if not found
map.put (keyObject, valueObject) // insert key-value
• put returns the old value
– returning null means that there was no previous definition for that key (or that the key was mapped to null)
• in a HashMap, keyObject may be null, and valueObject may be null as well
• a TreeMap using natural ordering rejects null keys
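The return values of put and get can be checked with a small experiment. The class name MapBasicsDemo and the stock-count data are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

class MapBasicsDemo {
    static Map<String, Integer> newStock() {
        return new HashMap<>();
    }

    public static void main(String[] args) {
        Map<String, Integer> stock = newStock();

        Integer first = stock.put("toaster", 3);  // no previous value
        Integer second = stock.put("toaster", 5); // replaces 3
        System.out.println(first + " " + second); // prints "null 3"

        System.out.println(stock.get("modem"));   // prints "null" (key absent)

        // A HashMap also accepts null values, so get alone cannot tell
        // "absent" from "mapped to null"; containsKey can.
        stock.put("widget", null);
        System.out.println(stock.get("widget") + " " + stock.containsKey("widget"));
        // prints "null true"
    }
}
```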
10
Concrete Maps (cont.)
• maps are not collections, but you can get views on keys, values, and entries:
Set <K> keys = map.keySet ();
Set <Map.Entry <K,V>> entries = map.entrySet ();
Collection <V> values = map.values (); // a multiset
• note that keys and entries may be (sorted) sets, but values are Collections (i.e., multisets or bags)
• however, generally collection views can be sets or multisets, depending on the context
• views are collections and thus potentially more powerful than iterators
• a view can be handled as one unit, provide extra operations, and potentially allow multiple traversals
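The following sketch shows two properties of these views: they are backed by the map, so later changes to the map show through, and the values view may hold duplicates. The class name MapViewsDemo and the sample data are assumptions for illustration.

```java
import java.util.Collection;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

class MapViewsDemo {
    // Returns {key count, value count, distinct value count} as seen through the views.
    static int[] viewSizes() {
        Map<String, Integer> ages = new TreeMap<>();
        ages.put("ann", 30);
        ages.put("bob", 30);

        Set<String> keys = ages.keySet();          // a (sorted) set
        Collection<Integer> values = ages.values(); // a multiset: 30 appears twice

        ages.put("cy", 41); // the views obtained earlier reflect this change

        return new int[] { keys.size(), values.size(), new HashSet<>(values).size() };
    }

    public static void main(String[] args) {
        int[] sizes = viewSizes();
        System.out.println(sizes[0] + " " + sizes[1] + " " + sizes[2]); // prints "3 3 2"
    }
}
```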
11
Concrete Maps (cont.)
• for example, can enumerate all keys of a map:
Set <String> keys = map.keySet ();
for (String key : keys) {
    // do something with key
}
• if want both keys and values:
for (Map.Entry<String, Empl> entry : staff.entrySet()) {
    String key = entry.getKey ();
    Empl value = entry.getValue ();
    // do something with key and value
}
• can remove (but cannot add) entries through iterators of such view collections for maps
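Removal through a view iterator, and the rejected add, might look like this. The class name MapViewRemovalDemo and both helper methods are hypothetical.

```java
import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;

class MapViewRemovalDemo {
    // Removes every entry whose key starts with the given prefix, via the key view's iterator.
    static void removeKeysStartingWith(Map<String, Integer> map, String prefix) {
        Iterator<String> it = map.keySet().iterator();
        while (it.hasNext()) {
            if (it.next().startsWith(prefix)) {
                it.remove(); // removes the whole entry from the underlying map
            }
        }
    }

    // Adding through the key view is not supported: there would be no value to associate.
    static boolean keyViewRejectsAdd(Map<String, Integer> map) {
        try {
            map.keySet().add("new-key");
            return false;
        } catch (UnsupportedOperationException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        Map<String, Integer> map = new HashMap<>();
        map.put("tmp1", 1);
        map.put("tmp2", 2);
        map.put("keep", 3);
        removeKeysStartingWith(map, "tmp");
        System.out.println(map);                    // prints "{keep=3}"
        System.out.println(keyViewRejectsAdd(map)); // prints "true"
    }
}
```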
12
Special Map Classes
• IdentityHashMap uses the value returned by System.identityHashCode (anObject) to organize the keys
– correspondingly, == is used to test the equality of the objects (i.e., object states are ignored)
• the identity hash value does not depend on the state of the object
– the original hashCode defined by Object returns distinct integers for distinct objects as far as is reasonably practical (uniqueness is not guaranteed)
– in some implementations it is derived from the internal address of the object
• useful for object traversal, where only the identity of objects is meaningful
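The contrast with an ordinary HashMap shows up as soon as two distinct but equal keys are used. A minimal sketch; the class name IdentityMapDemo and the helper sizeAfterTwoEqualKeys are hypothetical.

```java
import java.util.HashMap;
import java.util.IdentityHashMap;
import java.util.Map;

class IdentityMapDemo {
    // Inserts two distinct String objects with the same contents and returns the map size.
    static int sizeAfterTwoEqualKeys(Map<String, Integer> map) {
        String a = new String("key"); // new String forces two distinct objects
        String b = new String("key");
        map.put(a, 1);
        map.put(b, 2);
        return map.size();
    }

    public static void main(String[] args) {
        // equals/hashCode: a and b are the same key, so the second put replaces the first.
        System.out.println(sizeAfterTwoEqualKeys(new HashMap<>()));         // prints "1"
        // ==/identityHashCode: a and b are different keys.
        System.out.println(sizeAfterTwoEqualKeys(new IdentityHashMap<>())); // prints "2"
    }
}
```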
16
Collections and algorithms
• lists can also be handled by algorithms provided in the Collections class
• note the 's' at the end of Collections!
• for example, picking a winning combination:
List <Integer> numbers = ...;
Collections.shuffle (numbers);
List <Integer> winning = numbers.subList (0, 7);
Collections.sort (winning);
System.out.println (winning);
• when implementing your own algorithms or queries, consider returning a copy of values, or a view into the original collection to prevent unwanted modifications of a shared data structure
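The lottery example can be combined with the advice on copies and views. In this sketch the class name LotteryDemo, the method draw, and its parameters are hypothetical; the drawn numbers are copied out of the shuffled list and handed back as an unmodifiable view, so callers cannot alter the result.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

class LotteryDemo {
    // Draws `count` distinct numbers from 1..poolSize and returns them sorted.
    static List<Integer> draw(int poolSize, int count, Random rnd) {
        List<Integer> numbers = new ArrayList<>();
        for (int i = 1; i <= poolSize; i++) numbers.add(i);

        Collections.shuffle(numbers, rnd);
        // subList is only a view into numbers, so copy it before sorting.
        List<Integer> winning = new ArrayList<>(numbers.subList(0, count));
        Collections.sort(winning);

        // Read-only view: attempts to modify it throw UnsupportedOperationException.
        return Collections.unmodifiableList(winning);
    }

    public static void main(String[] args) {
        System.out.println(draw(49, 7, new Random()));
    }
}
```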
17
Collections and algorithms (cont.)
• e.g., sorting: Collections.sort (aList, aComparator)
– copies the specified list into an array, sorts the array, and iterates over the list, updating it from the array
– stable sort that preserves the order of equal values
• Collections.reverseOrder () returns a comparator that imposes the reverse of the natural ordering
• of course, algorithms may presuppose that given collections allow and support appropriate operations
– a List is modifiable if it supports the set method
– a List is resizable if it supports the add and remove methods
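The points above can be tried out as follows. The class name SortingDemo, the helper methods, and the sample data are made up for illustration; Arrays.asList serves as an example of a list that is modifiable but not resizable.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;

class SortingDemo {
    // Sorts by length only; words of equal length keep their original relative order.
    static List<String> sortByLength(List<String> input) {
        List<String> words = new ArrayList<>(input);
        Collections.sort(words, Comparator.comparingInt(String::length));
        return words;
    }

    static List<Integer> sortDescending(List<Integer> input) {
        List<Integer> numbers = new ArrayList<>(input);
        Collections.sort(numbers, Collections.reverseOrder());
        return numbers;
    }

    // Arrays.asList supports set (modifiable) but not add (not resizable).
    static boolean fixedSizeListRejectsAdd() {
        List<String> fixed = Arrays.asList("a", "b");
        fixed.set(0, "z"); // allowed
        try {
            fixed.add("c");
            return false;
        } catch (UnsupportedOperationException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(sortByLength(Arrays.asList("pear", "fig", "kiwi", "plum", "yam")));
        // prints "[fig, yam, pear, kiwi, plum]"
        System.out.println(sortDescending(Arrays.asList(3, 1, 2))); // prints "[3, 2, 1]"
        System.out.println(fixedSizeListRejectsAdd());              // prints "true"
    }
}
```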
18
Collections and algorithms (cont.)
• binary search finds value or locates insertion point (the list must already be sorted according to aComp):
i = Collections.binarySearch (aList, key, aComp)
– if not found, returns -(insertion point) - 1; e.g., for insertion point 0 the result is -1:
if (i < 0) aList.add (-i-1, key); // -(i+1)
• provides many simple algorithms, e.g.:
– obj = Collections.max (aCollection, aComparator)
– min, copy, fill, replaceAll, indexOfSubList, lastIndexOfSubList, reverse, rotate, etc.
– the code becomes more readable
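The insertion idiom can be wrapped in a helper that keeps a list sorted. The class name SortedInsertDemo and the method insertSorted are hypothetical; the list passed in is assumed to be sorted by the same comparator already.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;

class SortedInsertDemo {
    // Inserts key at its sorted position unless it is already present; returns its index.
    static int insertSorted(List<String> sorted, String key, Comparator<String> order) {
        int i = Collections.binarySearch(sorted, key, order);
        if (i < 0) {
            i = -i - 1; // decode the insertion point
            sorted.add(i, key);
        }
        return i;
    }

    public static void main(String[] args) {
        Comparator<String> order = Comparator.naturalOrder();
        List<String> fruits = new ArrayList<>(Arrays.asList("apple", "cherry"));

        System.out.println(insertSorted(fruits, "banana", order)); // prints "1"
        System.out.println(insertSorted(fruits, "apple", order));  // prints "0" (already present)
        System.out.println(fruits);                         // prints "[apple, banana, cherry]"
        System.out.println(Collections.max(fruits, order)); // prints "cherry"
    }
}
```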