Sets and Dictionaries

Sets

A set is an unordered collection of (unique) items. Examples:

examplesets.gif

Operations

s.add(e)
add element e to the set s
s.remove(e)
remove element e from the set s
s.contains(e)
returns whether e is a member of s
s.size()
returns the number of elements in s
s.isEmpty()
returns whether there are no elements in s
s.addAll(t)
adds all the elements of set t to set s
s.removeAll(t)
removes all the elements of t from s
s.retainAll(t)
removes all the elements from s that are not in t
s.union(t)
returns a new set which contains all the elements of set s and all the elements of set t, and no others
s.intersect(t)
returns a new set which contains all and only those elements in both s and t
s.minus(t)
returns a new set which contains all and only those elements in s but not in t
s.symmetricMinus(t)
returns a new set which contains all and only those elements in either set s or set t but not both

Dictionaries

A dictionary (also called a map, associative array, hash, or lookup table) is a collection of key-value pairs where every key is unique. We don't care as much about the pairs as we do about looking up the value associated with a given key. Example:

www.nato.int152.152.96.1
cr.yp.to131.193.178.175
losangeles.citysearch.com209.104.35.15
www.cl.cam.ac.uk128.232.0.20
mit.edu18.7.22.69
www.altan.ie212.108.64.74
securitysearch.net64.94.136.5
linuxassembly.org64.202.189.170
regular-expressions.info66.39.67.31
jpl.nasa.gov137.78.160.180
groups.google.com216.239.57.147
script.aculo.us86.59.5.67

Say: www.nato.int maps to 152.152.96.1, or www.nato.int is bound to 152.152.96.1.

Note that a set is just a kind of dictionary in which the value part is ignored.

Operations

m.put(k,v)
adds new key value pair to map m
m.get(k)
returns the value associated with key k in map m
m.containsKey(k)
returns whether there's a key-value pair in map m with key k
m.containsValue(v)
returns whether there's a key-value pair in map m with value v
m.putAll(otherMap)
adds all the key-value pairs of otherMap to m
m.keySet()
returns the set of keys of m
m.values()
returns the values of m in some collection (not a set, since they don't have to be unique)
m.entrySet()
returns the set of key-value pairs in m
m.remove(k)
removes the key-value pair with key k from map m
m.clear()
removes all the key-value pairs from m
m.size()
returns the number of key-value pairs in m
m.isEmpty()
returns whether there are zero pairs in map m

Representations of Sets and Dictionaries

There are dozens of ways to implement these.

No one representation is best for all situations. You need to take into account:

But remember

If your keys are integers in the range 0..N, use a PLAIN OLD ARRAY

As you go over each data structure, make a little fact sheet such as the following:

Unsorted Associative Array or List
Basic idea: Key-value pairs are stored in a simple unsorted array or linked list
Insertion: Θ(n) if you check for duplicate keys, Θ(1) otherwise
Deletion: Θ(n) since you have to find the item first
Lookup: Θ(n) since the whole list must be scanned
Key type restrictions: None at all!
Good for: Small collections only
Sorted Associative Array
Basic idea: Key-value pairs are stored in an array sorted by key
Insertion: Θ(n) in general due to sifting; Θ(1) if adding at the end
Deletion: Θ(n) in general due to sifting; Θ(1) if from the end
Lookup: Θ(log n) using binary search
Key type restrictions: Must be ordered
Good for: Memory-constrained environments; read-only collections or those filled only in ascending order