Our next stop on the built-in object tour is the Python list. Lists are Python's most flexible ordered collection object type. Unlike strings, lists can contain any sort of object: numbers, strings, even other lists. Python lists do the work of most of the collection data structures you might have to implement manually in lower-level languages such as C. In terms of some of their main properties, Python lists are:
Ordered collections of arbitrary objects: From a functional view, lists are just a place to collect other objects, so you can treat them as a group. Lists also define a left-to-right positional ordering of the items in the list.
Accessed by offset: Just as with strings, you can fetch a component object out of a list by indexing the list on the object's offset. Since lists are ordered, you can also do such tasks as slicing and concatenation.
Variable length, heterogeneous, arbitrarily nestable: Unlike strings, lists can grow and shrink in place (they're variable length), and may contain any sort of object, not just one-character strings (they're heterogeneous). Because lists can contain other complex objects, lists also support arbitrary nesting; you can create lists of lists of lists, and so on.
Of the category mutable sequence: In terms of our type category qualifiers, lists can be both changed in place (they're mutable) and respond to all the sequence operations we saw in action on strings in the last section. In fact, sequence operations work the same on lists, so we won't have much to say about them here. On the other hand, because lists are mutable, they also support other operations strings don't, such as deletion, index assignment, and methods.
Arrays of object references: Technically, Python lists contain zero or more references to other objects. If you've used a language such as C, lists might remind you of arrays of pointers. Fetching an item from a Python list is about as fast as indexing a C array; in fact, lists really are C arrays inside the Python interpreter. Moreover, references are something like pointers (addresses) in a language such as C, except that you never process a reference by itself; Python always follows a reference to an object whenever the reference is used, so your program only deals with objects. Whenever you stuff an object into a data structure or variable name, Python always stores a reference to the object, not a copy of it (unless you request a copy explicitly).
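The point about references can be seen in a few lines. This is only a sketch, written for Python 3 (an assumption; the sessions in this section use an older interpreter), and the names inner, outer, and snapshot are made up for illustration:

```python
import copy

inner = [1, 2]
outer = ['spam', inner, 3.14]   # heterogeneous and nested: outer holds a reference to inner

inner.append(3)                 # change the shared object in place
shared_view = outer[1]          # the change is visible through outer

snapshot = copy.copy(outer)     # explicit shallow copy: new list, same item references
outer[0] = 'eggs'               # replacing a slot in outer does not affect snapshot

print(shared_view)              # [1, 2, 3]
print(shared_view is inner)     # True
print(snapshot[0], outer[0])    # spam eggs
```

Because outer stores a reference rather than a copy, the append through inner shows up when you look inside outer; only the explicit copy call produces a separate list.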
Table 2.7 summarizes common list object operations.
Lists are written as a series of objects (really, expressions that return objects) in square brackets, separated by commas. Nested lists are coded as a nested square-bracketed series, and the empty list is just a square-bracket set with nothing inside.[8]
[8] But we should note that in practice, you won't see many lists written out like this in list-processing programs. It's more common to see code that processes lists constructed dynamically (at runtime). In fact, although constant syntax is important to master, most data structures in Python are built by running program code at runtime.
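To make the contrast between written-out lists and lists built at runtime concrete, here is a small sketch in Python 3 (an assumption, as are all the variable names):

```python
empty = []                          # the empty list
literal = [0, 1, 4, 9]              # written out as a literal
nested = ['abc', [1, 2], []]        # nested brackets give nested lists

# The same squares, built at runtime instead of typed in.
built = []
for n in range(4):
    built.append(n * n)

print(built == literal)             # True
print(len(empty), len(nested))      # 0 3
```

Both routes produce ordinary list objects; the loop version is simply the form you are more likely to meet in real programs.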
Most of the operations in Table 2.7 should look familiar, since they are the same sequence operations we put to work on strings earlier—indexing, concatenation, iteration, and so on. The last few table entries are new; lists also respond to method calls (which provide utilities such as sorting, reversing, adding items on the end, etc.), as well as in-place change operations (deleting items, assignment to indexes and slices, and so forth). Remember, lists get these last two operation sets because they are a mutable object type.
Perhaps the best way to understand lists is to see them at work. Let's once again turn to some simple interpreter interactions to illustrate the operations in Table 2.7.
Lists respond to the + and * operators as with strings; they mean concatenation and repetition here too, except that the result is a new list, not a string. And as Forrest Gump was quick to say, "that's all we have to say about that"; grouping types into categories is intellectually frugal (and makes life easy for authors like us).
% python
>>> len([1, 2, 3])                  # length
3
>>> [1, 2, 3] + [4, 5, 6]           # concatenation
[1, 2, 3, 4, 5, 6]
>>> ['Ni!'] * 4                     # repetition
['Ni!', 'Ni!', 'Ni!', 'Ni!']
>>> for x in [1, 2, 3]: print x,    # iteration
...
1 2 3
We talk about iteration (as well as range built-ins) in Chapter 3. One exception worth noting here: + expects the same sort of sequence on both sides; otherwise, you get a type error when the code runs. For instance, you can't concatenate a list and a string, unless you first convert the list to a string using backquotes or % formatting (we met these in the last section). You could also convert the string to a list; the list built-in function does the trick:
>>> `[1, 2]` + "34"           # same as "[1, 2]" + "34"
'[1, 2]34'
>>> [1, 2] + list("34")       # same as [1, 2] + ["3", "4"]
[1, 2, '3', '4']
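The same mismatch and both conversions can be sketched for Python 3 (an assumption on our part). Backquotes are not available there, so the sketch uses the repr built-in, which gives the same string form; the variable names are illustrative:

```python
numbers = [1, 2]

try:
    numbers + "34"                  # list + str is not allowed
except TypeError as exc:
    error_name = type(exc).__name__

as_text = repr(numbers) + "34"      # convert the list to a string first
as_list = numbers + list("34")      # or convert the string to a list

print(error_name)                   # TypeError
print(as_text)                      # [1, 2]34
print(as_list)                      # [1, 2, '3', '4']
```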
Because lists are sequences, indexing and slicing work the same here too, but the result of indexing a list is whatever type of object lives at the offset you specify, and slicing a list always returns a new list:
>>> L = ['spam', 'Spam', 'SPAM!']
>>> L[2]                      # offsets start at zero
'SPAM!'
>>> L[-2]                     # negative: count from the right
'Spam'
>>> L[1:]                     # slicing fetches sections
['Spam', 'SPAM!']
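A short sketch of the difference between what indexing and slicing hand back, assuming Python 3 and using illustrative names (item, tail, whole):

```python
L = ['spam', 'Spam', 'SPAM!']

item = L[-1]            # indexing returns the object at that offset
tail = L[1:]            # slicing returns a new list
whole = L[:]            # a full slice is a top-level copy

tail[0] = 'changed'     # modifying the slice result leaves L alone

print(type(item).__name__, type(tail).__name__)   # str list
print(L)                                          # ['spam', 'Spam', 'SPAM!']
print(whole == L, whole is L)                     # True False
```

Since a slice is always a new list, changing tail has no effect on L, and whole compares equal to L without being the same object.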
Finally, something new: because lists are mutable, they support operations that change a list object in place; that is, the operations in this section all modify the list object directly, without forcing you to make a new copy as you had to for strings. But since Python only deals in object references, the distinction between in-place changes and new objects can matter; if you change an object in place, you might impact more than one reference to it at once. More on that later in this chapter.
When using a list, you can change its contents by assigning to a particular item (offset), or an entire section (slice):
>>> L = ['spam', 'Spam', 'SPAM!']
>>> L[1] = 'eggs'                 # index assignment
>>> L
['spam', 'eggs', 'SPAM!']
>>> L[0:2] = ['eat', 'more']      # slice assignment: delete+insert
>>> L                             # replaces items 0,1
['eat', 'more', 'SPAM!']
Index assignment works much as it does in C: Python replaces the object reference at the designated slot with a new one. Slice assignment is best thought of as two steps: Python first deletes the slice you specify on the left of the =, and then inserts (splices) the new items into the list at the place where the old slice was deleted. In fact, the number of items inserted doesn't have to match the number of items deleted; for instance, given a list L that has the value [1, 2, 3], the assignment L[1:2] = [4, 5] sets L to the list [1, 4, 5, 3]. Python first deletes the 2 (a one-item slice), then inserts items 4 and 5 where 2 used to be.

Python list objects also support method calls:
>>> L.append('please')            # append method call
>>> L
['eat', 'more', 'SPAM!', 'please']
>>> L.sort()                      # sort list items ('S' < 'e')
>>> L
['SPAM!', 'eat', 'more', 'please']
Methods are like functions, except that they are associated with a particular object. The syntax used to call methods is similar too (they're followed by arguments in parentheses), but you qualify the method name with the list object to get to it. Qualification is coded as a period followed by the name of the method you want; it tells Python to look up the name in the object's namespace—set of qualifiable names. Technically, names such as append and sort are called attributes—names associated with objects. We'll see lots of objects that export attributes later in the book.
The list append method simply tacks a single item (object reference) onto the end of the list. Unlike concatenation, append expects us to pass in a single object, not a list. The effect of L.append(X) is similar to L+[X], but the former changes L in place, and the latter makes a new list.[9] The sort method orders a list in place; by default, it uses Python's standard comparison tests (here, string comparisons; you can also pass in a comparison function of your own, but we'll ignore this option here).
[9] Also unlike + concatenation, append doesn't have to generate new objects, and so is usually much faster. On the other hand, you can mimic append with clever slice assignments: L[len(L):]=[X] is like L.append(X), and L[:0]=[X] is like appending at the front of a list. Both delete an empty slice and insert X, changing L in place; the first is quick like append, while inserting at the front must also shift the existing items over. C programmers might be interested to know that Python lists are implemented as single heap blocks (rather than a linked list), and append is really a call to realloc behind the scenes. Provided your heap manager is smart enough to avoid copying and re-mallocing, append can be very fast. Concatenation, on the other hand, must always create new list objects and copy the items in both operands.
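The difference between append, concatenation, and the slice-assignment tricks in the footnote can be sketched as follows. This assumes Python 3, and alias and joined are names invented for the example:

```python
L = ['eat', 'more']
alias = L                       # second reference to the same list

L.append('SPAM!')               # in place: alias sees the new item
joined = L + ['please']         # concatenation: a new list, L is unchanged

L[len(L):] = ['now']            # slice assignment that acts like append
L[:0] = ['do']                  # slice assignment that inserts at the front

L.sort()                        # in place; uppercase sorts before lowercase
print(alias)                    # ['SPAM!', 'do', 'eat', 'more', 'now']
print(joined)                   # ['eat', 'more', 'SPAM!', 'please']
print(alias is L, joined is L)  # True False
```

Every in-place operation is visible through alias because it names the same list object, while joined is a separate list that none of the later changes touch.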
Finally, because lists are mutable, you can also use the del statement to delete an item or section. Since slice assignment is a deletion plus an insert, you can also delete sections of lists by assigning an empty list to a slice (L[i:j] = []); Python deletes the slice named on the left and then inserts nothing. Assigning an empty list to an index, on the other hand, just stores a reference to the empty list in the specified slot: L[0] = [] sets the first item of L to the object [], rather than deleting it (L winds up looking like [[],...]):
>>> L
['SPAM!', 'eat', 'more', 'please']
>>> del L[0]                      # delete one item
>>> L
['eat', 'more', 'please']
>>> del L[1:]                     # delete an entire section
>>> L                             # same as L[1:] = []
['eat']
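The following sketch pulls together the ways a list's length can change by assignment and deletion, including the [1, 2, 3] case described earlier. It assumes Python 3; grown and shrunk are illustrative names that hold copies of L at each stage:

```python
L = [1, 2, 3]
L[1:2] = [4, 5]         # replace one item with two: the list grows
grown = L[:]            # [1, 4, 5, 3]

del L[0]                # delete one item
L[1:] = []              # delete a section by assigning an empty list to a slice
shrunk = L[:]           # [4]

L[0] = []               # index assignment stores [] in the slot instead
print(grown)            # [1, 4, 5, 3]
print(shrunk)           # [4]
print(L)                # [[]]
```

Note the last step: assigning [] to an index keeps the slot and fills it with an empty list, whereas assigning [] to a slice removes the items.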
Here are a few pointers before moving on. Although all the operations above are typical, there are additional list methods and operations we won't illustrate here (including methods for reversing and searching). You should always consult Python's manuals or the Python Pocket Reference for a comprehensive and up-to-date list of type tools. Even if this book were complete, it probably couldn't be up to date (new tools may be added any time). We'd also like to remind you one more time that all the in-place change operations above work only for mutable objects: they won't work on strings (or tuples, discussed ahead), no matter how hard you try.
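As a quick check of that last reminder, this sketch tries the same index assignment on a list, a string, and a tuple. It assumes Python 3, and try_in_place is a hypothetical helper written only for this example:

```python
def try_in_place(obj):
    """Attempt index assignment and report the outcome by name."""
    try:
        obj[0] = 'x'
        return 'changed'
    except TypeError:
        return 'TypeError'

results = {
    'list': try_in_place(['a', 'b']),
    'str': try_in_place('ab'),
    'tuple': try_in_place(('a', 'b')),
}
print(results)   # {'list': 'changed', 'str': 'TypeError', 'tuple': 'TypeError'}
```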