Dictionaries

Copyright © Software Carpentry 2010. This work is licensed under the Creative Commons Attribution License. See http://software-carpentry.org/license.html for more information.

Sets and Dictionaries

Back to the data from our summer counting birds in a mosquito-infested swamp in northern Ontario
How many birds of each kind did we see?
Input is a list of several thousand bird names
Output is a list of names and counts

Could use a list of [name, count] pairs

    def another_bird(counts, bird_name):
        # counts is the list of pairs; bird_name is the name to add
        for i in range(len(counts)):            # look at each pair already in the list
            if counts[i][0] == bird_name:       # if this is the bird we're looking for...
                counts[i][1] += 1               # ...add 1 to its count and finish
                return
        counts.append([bird_name, 1])           # otherwise, add a new pair to the list

Pattern: handle an existing case and return in the loop, or take the default action if we exit the loop normally

    start   []
    loon    [['loon', 1]]
    goose   [['loon', 1], ['goose', 1]]
    loon    [['loon', 2], ['goose', 1]]
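To check the trace by hand, here is a minimal runnable sketch of the list-of-pairs approach. It is written for Python 3 (where print is a function), it loops over the pairs directly rather than over indices, and the three-item sample list is made up for illustration.

```python
def another_bird(counts, bird_name):
    """Add one sighting of bird_name to counts, a list of [name, count] pairs."""
    for pair in counts:
        if pair[0] == bird_name:
            pair[1] += 1               # existing bird: bump its count and stop
            return
    counts.append([bird_name, 1])      # loop ended without a match: new bird


counts = []
for bird in ['loon', 'goose', 'loon']:   # hypothetical sample data
    another_bird(counts, bird)
    print(bird, counts)
```

After the last call, counts should hold two pairs, with the loon counted twice.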

There's a better way
Use a dictionary
An unordered collection of key/value pairs
Like set elements, keys are:
- Immutable
- Unique
- Not stored in any particular order
No restrictions on values
- Don't have to be immutable or unique
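These rules can be tried out directly. The sketch below assumes Python 3, and the `sightings` data is invented for the example. Strictly speaking Python requires keys to be hashable, which for the built-in types amounts to being immutable.

```python
# Values can be anything, including mutable lists and repeated values.
sightings = {'loon': [3, 5], 'goose': [3, 5]}
sightings['loon'].append(2)            # changing a value in place is fine

# A tuple is immutable, so it can serve as a key.
sightings[('loon', 'juvenile')] = [1]

# A list is mutable, so Python refuses to use it as a key.
try:
    sightings[['loon', 'adult']] = [4]
    list_key_allowed = True
except TypeError:
    list_key_allowed = False

# Keys are unique: repeating a key keeps only the last value given.
repeated = {'loon': 1, 'loon': 2}

print(list_key_allowed)
print(repeated)
```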

Create a dictionary by putting key:value pairs in {}

    >>> birthdays = {'Newton' : 1642, 'Darwin' : 1809}

Retrieve values by putting key in []
Just like indexing strings and lists

    >>> print birthdays['Newton']
    1642

Just like using a phonebook or dictionary
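The comparison with list indexing can be made concrete. In this short Python 3 sketch the `years` list is introduced purely for contrast; it is not part of the original example.

```python
years = [1642, 1809]                            # a list is indexed by position
birthdays = {'Newton': 1642, 'Darwin': 1809}    # a dictionary is indexed by key

print(years[0])              # look up by position
print(birthdays['Newton'])   # look up by name
```

Both lookups use square brackets; only the kind of thing inside the brackets differs.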

Add another value by assigning to it

    >>> birthdays['Turing'] = 1612 # that's not right

Overwrite value by assigning to it as well

    >>> birthdays['Turing'] = 1912
    >>> print birthdays
    {'Turing' : 1912, 'Newton' : 1642, 'Darwin' : 1809}
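The same assignment syntax does two different jobs, and checking the dictionary's size makes the difference visible. A small Python 3 sketch; the `size_after_...` names are only there for illustration.

```python
birthdays = {'Newton': 1642, 'Darwin': 1809}

birthdays['Turing'] = 1612             # new key: the dictionary grows
size_after_add = len(birthdays)

birthdays['Turing'] = 1912             # existing key: the value is replaced
size_after_overwrite = len(birthdays)

print(size_after_add, size_after_overwrite, birthdays['Turing'])
```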

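Note: entries are not in any particular order
[Figure: the keys 'Turing', 'Newton', and 'Darwin', stored in no particular order, pointing to the values 1912, 1642, and 1809 respectively]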

Key must be in dictionary before use

    >>> birthdays['Nightingale']
    KeyError: 'Nightingale'

Test whether key is present using in

    >>> 'Nightingale' in birthdays
    False
    >>> 'Darwin' in birthdays
    True
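In a program, a missing key is usually handled in one of three ways: test with in first, catch the KeyError, or ask for a default. The Python 3 sketch below shows all three. The helper `year_of_birth` is a hypothetical name, and `dict.get` is a standard dictionary method that the slides do not cover.

```python
birthdays = {'Newton': 1642, 'Darwin': 1809, 'Turing': 1912}


def year_of_birth(name):
    """Return the birth year for name, or None if name is not a key."""
    if name in birthdays:              # test first, then index safely
        return birthdays[name]
    return None


# Indexing without testing raises KeyError, which can be caught.
try:
    birthdays['Nightingale']
    raised = False
except KeyError:
    raised = True

print(year_of_birth('Darwin'))
print(year_of_birth('Nightingale'))
print(birthdays.get('Nightingale', 'unknown'))   # built-in shortcut with a default
```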

Use for to loop over keys
Unlike lists, where for loops over values

    >>> for name in birthdays:
    ...     print name, birthdays[name]
    ...
    Turing 1912
    Newton 1642
    Darwin 1809
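A side-by-side sketch of the two kinds of loop, in Python 3. One caveat for newer interpreters: since Python 3.7 a dictionary remembers the order in which keys were inserted, so the output order is predictable there, but it is still not sorted. When a fixed order matters, sorting the keys (as in the last lines) is a common choice, not something the slides prescribe.

```python
birthdays = {'Turing': 1912, 'Newton': 1642, 'Darwin': 1809}

# Looping over a list yields its values...
for year in [1912, 1642, 1809]:
    print(year)

# ...but looping over a dictionary yields its keys.
for name in birthdays:
    print(name, birthdays[name])

# Sorting the keys gives the same order however the entries are stored.
ordered = [(name, birthdays[name]) for name in sorted(birthdays)]
print(ordered)
```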

Let's count those birds

    import sys

    if __name__ == '__main__':
        # Read all the data
        reader = open(sys.argv[1], 'r')
        lines = reader.readlines()
        reader.close()

        # Count distinct values
        count = count_names(lines)

        # Show results
        for name in count:
            print name, count[name]

    def count_names(lines):
        # The docstring explains what we're doing to the next reader
        '''Count unique lines of text, returning dictionary.'''

        result = {}                                # create an empty dictionary to fill
        for name in lines:                         # handle input values one at a time
            name = name.strip()                    # clean up before processing
            if name in result:                     # if we have seen this value before...
                result[name] = result[name] + 1    # ...add one to its count
            else:
                # but if it's the first time we have seen this name,
                # store it with a count of 1
                result[name] = 1

        return result                              # return the result
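For readers on a current interpreter, the two pieces can be combined into one file. The following is a sketch of a Python 3 port, not the original authors' code: the counting logic is unchanged, but the `main` function, the `with` statement, and the argument check are additions made here.

```python
import sys


def count_names(lines):
    """Count unique lines of text, returning a dictionary of name -> count."""
    result = {}
    for name in lines:
        name = name.strip()
        if name in result:
            result[name] = result[name] + 1
        else:
            result[name] = 1
    return result


def main(filename):
    # 'with' closes the file even if reading fails partway through.
    with open(filename, 'r') as reader:
        lines = reader.readlines()
    count = count_names(lines)
    for name in count:
        print(name, count[name])


# Only run when a filename was actually given on the command line.
if __name__ == '__main__' and len(sys.argv) > 1:
    main(sys.argv[1])
```

Saved under a name such as count_birds.py (hypothetical), it would be run as `python count_birds.py birds.txt`, where birds.txt holds one bird name per line. Note that a blank line in the input is counted as a bird whose name is the empty string.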

Counter in action

    start   {}
    loon    {'loon' : 1}
    goose   {'loon' : 1, 'goose' : 1}
    loon    {'loon' : 2, 'goose' : 1}

But like sets, dictionaries are much more efficient than lookup lists
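One way to see the difference in cost without a stopwatch is to count the work each approach does. The Python 3 sketch below counts name comparisons for the list of pairs and key lookups for the dictionary. Both function names and the input data are made up for this illustration, and counting one unit per dictionary lookup is a simplification: a dictionary finds a key by hashing it, so each lookup costs roughly the same however many keys are stored.

```python
def comparisons_with_list(names):
    """Count the name comparisons the list-of-pairs approach performs."""
    counts = []
    comparisons = 0
    for bird_name in names:
        for pair in counts:
            comparisons += 1
            if pair[0] == bird_name:
                pair[1] += 1
                break
        else:
            counts.append([bird_name, 1])   # runs only if the loop did not break
    return comparisons


def lookups_with_dict(names):
    """Count the key lookups the dictionary approach performs (one per name)."""
    result = {}
    lookups = 0
    for bird_name in names:
        lookups += 1
        result[bird_name] = result.get(bird_name, 0) + 1
    return lookups


# Hypothetical worst case for the list: 1000 sightings, every species different.
names = ['species%d' % i for i in range(1000)]
print(comparisons_with_list(names))   # 0 + 1 + ... + 999 = 499500
print(lookups_with_dict(names))       # 1000
```

The list's work grows with the square of the number of distinct names, while the dictionary's grows in step with the length of the input.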

Created by Greg Wilson, July 2010
