I l@ve RuBoard |
Besides lists, dictionaries are perhaps the most flexible built-in data type in Python. If you think of lists as ordered collections of objects, dictionaries are unordered collections; their chief distinction is that items are stored and fetched in dictionaries by key, instead of offset. As we'll see, built-in dictionaries can replace many of the searching algorithms and data-structures you might have to implement manually in lower-level languages. Dictionaries also sometimes do the work of records and symbol tables used in other languages. In terms of their main properties, dictionaries are:
Dictionaries are sometimes called associative arrays or hashes. They associate a set of values with keys, so that you can fetch an item out of a dictionary using the key that stores it. You use the same indexing operation to get components in a dictionary, but the index takes the form of a key, not a relative offset.
Unlike lists, items stored in a dictionary aren't kept in any particular order; in fact, Python randomizes their order in order to provide quick lookup. Keys provide the symbolic (not physical) location of items in a dictionary.
Like lists, dictionaries can grow and shrink in place (without making a copy), they can contain objects of any type, and support nesting to any depth (they can contain lists, other dictionaries, and so on).
They can be changed in place by assigning to indexes, but don't support the sequence operations we've seen work on strings and lists. In fact, they can't: because dictionaries are unordered collections, operations that depend on a fixed order (e.g., concatenation, slicing) don't make sense. Instead, dictionaries are the only built-in representative of the mapping type category—objects that map keys to values.
If lists are arrays of object references, dictionaries are unordered tables of object references. Internally, dictionaries are implemented as hash tables (data structures that support very fast retrieval), which start small and grow on demand. Moreover, Python employs optimized hashing algorithms to find keys, so retrieval is very fast. But at the bottom, dictionaries store object references (not copies), just like lists.
Table 2.8 summarizes some of the most common dictionary operations (see the library manual for a complete list). Dictionaries are written as a series of key:value pairs, separated by commas, and enclosed in curly braces.[10] An empty dictionary is an empty set of braces, and dictionaries can be nested by writing one as a value in another dictionary, or an item in a list (or tuple).
[10] The same note about the relative rarity of constants applies here: we often build up dictionaries by assigning to new keys at runtime, rather than writing constants. But see the following section on changing dictionaries; lists and dictionaries are grown in different ways. Assignment to new keys works for dictionaries, but fails for lists (lists are grown with append).
Operation |
Interpretation |
---|---|
d1 = {} |
Empty dictionary |
d2 = {'spam': 2, 'eggs': 3} |
Two-item dictionary |
d3 = {'food': {'ham': 1, 'egg': 2}} |
|
d2['eggs'], d3['food']['ham'] |
Indexing by key |
d2.has_key('eggs'), d2.keys(), d2.values() |
Methods: membership test, keys list, values list, etc. |
len(d1) |
Length (number stored entries) |
d2[key] = new, del d2[key] |
Adding/changing, deleting |
As Table 2.8 illustrates, dictionaries are indexed by key; in this case, the key is a string object ('eggs'), and nested dictionary entries are referenced by a series of indexes (keys in square brackets). When Python creates a dictionary, it stores its items in any order it chooses; to fetch a value back, supply the key that stores it.
Let's go back to the interpreter to get a feel for some of the dictionary operations in Table 2.8.
Generally, you create dictionaries and access items by key. The built-in len function works on dictionaries too; it returns the number of items stored away in the dictionary, or equivalently, the length of its keys list. Speaking of keys lists, the dictionary keys method returns all the keys in the dictionary, collected in a list. This can be useful for processing dictionaries sequentially, but you shouldn't depend on the order of the keys list (remember, dictionaries are randomized).
% python >>> d2 = {'spam': 2, 'ham': 1, 'eggs': 3} >>> d2['spam'] # fetch value for key 2 >>> len(d2) # number of entries in dictionary 3 >>> d2.has_key('ham') # key membership test (1 means true) 1 >>> d2.keys() # list of my keys ['eggs', 'spam', 'ham']
Dictionaries are mutable, so you can change, expand, and shrink them in place without making new dictionaries, just as for lists. Simply assign a value to a key to change or create the entry. The del statement works here too; it deletes the entry associated with the key specified as an index. Notice that we're nesting a list inside a dictionary in this example (the value of key "ham"):
>>> d2['ham'] = ['grill', 'bake', 'fry'] # change entry >>> d2 {'eggs': 3, 'spam': 2, 'ham': ['grill', 'bake', 'fry']} >>> del d2['eggs'] # delete entry >>> d2 {'spam': 2, 'ham': ['grill', 'bake', 'fry']} >>> d2['brunch'] = 'Bacon' # add new entry >>> d2 {'brunch': 'Bacon', 'spam': 2, 'ham': ['grill', 'bake', 'fry']}
As with lists, assigning to an existing index in a dictionary changes its associated value. Unlike lists, whenever you assign a new dictionary key (i.e., one that hasn't been assigned before), you create a new entry in the dictionary, as done previously for 'brunch'. This doesn't work for lists, because Python considers an offset out of bounds if it's beyond the end of a list. To expand a list, you need to use such tools as the append method or slice assignment instead.
Here is a more realistic dictionary example. The following example creates a table that maps programming language names (the keys) to their creators (the values). You fetch a creator name by indexing on language name:
>>> table = {'Python': 'Guido van Rossum', ... 'Perl': 'Larry Wall', ... 'Tcl': 'John Ousterhout' } ... >>> language = 'Python' >>> creator = table[language] >>> creator 'Guido van Rossum' >>> for lang in table.keys(): print lang, '\t', table[lang] ... Tcl John Ousterhout Python Guido van Rossum Perl Larry Wall
Notice the last command. Because dictionaries aren't sequences, you can't iterate over them directly with a for statement, as for strings and lists. But if you need to step through the items in a dictionary it's easy: calling the dictionary keys method returns a list of all stored keys you can iterate through with a for. If needed, you can index from key to value inside the for loop as done previously. We'll talk about the print and for statements in more detail in Chapter 3.
Before we move on to more types, here are a few additional details you should be aware of when using dictionaries:
We're being redundant on purpose here, because this is another common question from new users. Dictionaries are mappings, not sequences; because there's no notion of ordering among their items, things like concatenation (an ordered joining) and slicing (extracting contiguous section) simply don't apply. In fact, Python raises an error when your code runs, if you try.
Keys can be created either when you write a dictionary constant (in which case they are embedded in the constant itself), or when you assign values to new keys of an existing dictionary object. The end result is the same.
We've been using strings as keys here, but other immutable objects (not lists) work just as well. In fact, you could use integers as keys, which makes a dictionary look much like a list (albeit, without the ordering). Tuples (up next) are sometimes used as dictionary keys too, allowing for compound key values. And class instance objects (discussed in Chapter 6) can be used as keys, as long as they have the proper protocol methods; they need to tell Python that their values won't change, or else they would be useless as fixed keys.
Why You Will Care: Dictionary InterfacesBesides being a convenient way to store information by key in your programs, some Python extensions also present interfaces that look and work the same as dictionaries. For instance, Python's interface to dbm access-by-key files looks much like a dictionary that must be opened; strings are stored and fetched using key indexes: import anydbm file = anydbm.open("filename") # link to external file file['key'] = 'data' # store data by key data = file['key'] # fetch data by key Later, we'll see that we can store entire Python objects this way too, if we replace "anydbm" in the above with "shelve" (shelves are access-by-key databases of persistent Python objects). For Internet work, Python's CGI script support also presents a dictionary-like interface; a call to cgi.FieldStorage yields a dictionary-like object, with one entry per input field on the client's web page: import cgi form = cgi.FieldStorage() # parse form data (stdin, environ) if form.has_key('name'): showReply('Hello, ' + form['name'].value) All of these (and dictionaries) are instances of mappings. More on CGI scripts in Chapter 9. |
I l@ve RuBoard |