In this and most of the next few chapters, we'll include a discussion of common problems that seem to bite new users (and the occasional expert), along with their solutions. We call these gotchas—a degenerate form of "got you"—because some may catch you by surprise, especially when you're just getting started with Python. Others represent esoteric Python behavior, which comes up rarely (if ever!) in real programming, but tends to get an inordinate amount of attention from language aficionados on the Internet (like us).[12] Either way, all have something to teach us about Python; if you can understand the exceptions, the rest is easy.
[12] We should also note that Guido could make some of the gotchas we describe go away in future Python releases, but most reflect fundamental properties of the language that are unlikely to change (but don't quote us on that).
We've talked about shared references earlier, but we want to mention them again here, to underscore that they can be a gotcha if you don't understand what's going on in your program. For instance, in the following, the list object assigned to the name L is referenced both from L and from inside the list assigned to the name M. Changing L in place changes what M references too:
>>> L = [1, 2, 3]
>>> M = ['X', L, 'Y']       # embed a reference to L
>>> M
['X', [1, 2, 3], 'Y']
>>> L[1] = 0                # changes M too
>>> M
['X', [1, 0, 3], 'Y']
This effect usually becomes important only in larger programs, and sometimes shared references are exactly what you want. If they're not, you can avoid sharing objects by copying them explicitly; for lists, you can always make a top-level copy by using an empty-limits slice:
>>> L = [1, 2, 3]
>>> M = ['X', L[:], 'Y']    # embed a copy of L
>>> L[1] = 0                # only changes L, not M
>>> L
[1, 0, 3]
>>> M
['X', [1, 2, 3], 'Y']
Remember, slice limits default to 0 and the length of the sequence being sliced; if both are omitted, the slice extracts every item in the sequence, and so makes a top-level copy (a new, unshared object).[13]
[13] Empty-limit slices still only make a top-level copy; if you need a complete copy of a deeply nested data structure, you can also use the standard copy module that traverses objects recursively. See the library manual for details.
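The difference between a top-level copy and a complete copy can be checked directly. The following is a minimal sketch, assuming a modern Python 3 interpreter; the names nested, shallow, and deep are illustrative only, and copy.deepcopy is the standard library's recursive copy function.

```python
import copy

nested = [1, [2, 3]]            # a list holding another (mutable) list

shallow = nested[:]             # empty-limits slice: new top-level list only
deep = copy.deepcopy(nested)    # recursive copy: new objects at every level

# The top-level lists are distinct objects in both cases...
assert shallow is not nested
assert deep is not nested

# ...but the slice still shares the nested list, while deepcopy does not.
assert shallow[1] is nested[1]
assert deep[1] is not nested[1]

nested[1][0] = 'changed'        # in-place change to the inner list
print(shallow)                  # the shallow copy sees the change
print(deep)                     # the deep copy does not
```

The is operator compares object identity, so it is a quick way to find out whether two names or slots refer to the same object.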
When we introduced sequence repetition, we said it's like adding a sequence to itself a number of times. That's true, but when mutable sequences are nested, the effect might not always be what you expect. For instance, in the following, X is assigned to L repeated four times, whereas Y is assigned to a list containing L repeated four times:
>>> L = [4, 5, 6]
>>> X = L * 4               # like [4, 5, 6] + [4, 5, 6] + ...
>>> Y = [L] * 4             # [L] + [L] + ... = [L, L,...]
>>> X
[4, 5, 6, 4, 5, 6, 4, 5, 6, 4, 5, 6]
>>> Y
[[4, 5, 6], [4, 5, 6], [4, 5, 6], [4, 5, 6]]
Because L was nested in the second repetition, Y winds up embedding references back to the original list assigned to L, and is open to the same sorts of side effects we noted in the last section:
>>> L[1] = 0                # impacts Y but not X
>>> X
[4, 5, 6, 4, 5, 6, 4, 5, 6, 4, 5, 6]
>>> Y
[[4, 0, 6], [4, 0, 6], [4, 0, 6], [4, 0, 6]]
This is really another way to trigger the shared mutable object reference issue, so the same solutions above apply here. And if you remember that repetition, concatenation, and slicing copy only the top level of their operand objects, these sorts of cases make much more sense.
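One way to apply the copying solution to nested repetition is to build each nested item as its own copy. This is a sketch only, assuming Python 3; the names shared and independent are made up for illustration.

```python
L = [4, 5, 6]

shared = [L] * 4                        # four references to the one list L
independent = [L[:] for _ in range(4)]  # four separate top-level copies of L

L[1] = 0                                # in-place change to the original

print(shared)                           # every nested list reflects the change
print(independent)                      # the copies are unaffected

independent[0][2] = 99                  # the copies are independent of each other too
print(independent)
```

Note that writing [L[:]] * 4 would not be enough: it copies L once and then repeats a reference to that single copy, so the four nested lists would be detached from L but still shared with one another.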
We actually encountered cyclic structures in a prior exercise: if a compound object contains a reference to itself, it's called a cyclic object. In Python versions before Release 1.5.1, printing such objects failed, because the Python printer wasn't smart enough to notice the cycle (you would keep seeing the same text printed over and over, until you broke execution). This case is now detected, but it's worth knowing; cyclic structures may also cause code of your own to fall into unexpected loops if you're not careful. See the solutions to Chapter 1 exercises for more details.
>>> L = ['hi.']; L.append(L)    # append reference to same object
>>> L                           # before 1.5.1: a loop! (Ctrl-C breaks)
Don't do that. There are good reasons to create cycles, but unless you have code that knows how to handle them, you probably won't want to make your objects reference themselves very often in practice (except as a parlor trick).
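If your own code must walk structures that might contain cycles, one common approach is to remember which objects have already been visited. The function below is a hypothetical helper (count_items is not a built-in or library function), written as a sketch for Python 3 that handles nested lists only.

```python
def count_items(obj, seen=None):
    """Count non-list items reachable from obj, visiting each list once."""
    if seen is None:
        seen = set()
    if isinstance(obj, list):
        if id(obj) in seen:      # already visited: a cycle or a shared reference
            return 0
        seen.add(id(obj))        # record this list by identity before descending
        return sum(count_items(item, seen) for item in obj)
    return 1

L = ['hi.']
L.append(L)                      # L now contains a reference to itself
print(count_items(L))            # terminates instead of recursing forever
```

Tracking identities with id() rather than comparing values matters here: comparing a cyclic list to itself by value would send the code back into the same structure it is trying to escape.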
Finally, as we've mentioned plenty of times by now: you can't change an immutable object in place:
T = (1, 2, 3)
T[2] = 4              # error!
T = T[:2] + (4,)      # okay: (1, 2, 4)
Construct a new object with slicing, concatenation, and so on, and assign it back to the original reference if needed. That might seem like extra coding work, but the upside is that the previous gotchas can't happen when using immutable objects such as tuples and strings; because they can't be changed in place, they are not open to the sorts of side effects that lists are.
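The following sketch, assuming Python 3, shows why a second reference to a tuple is safe from such side effects; the name alias is illustrative only.

```python
T = (1, 2, 3)
alias = T                 # a second reference to the same tuple

try:
    T[2] = 4              # in-place change is not allowed
except TypeError as exc:
    print('error:', exc)

T = T[:2] + (4,)          # build a new tuple and rebind the name T
print(T)                  # the new object
print(alias)              # the original tuple, untouched
```

Rebinding T creates a new object and leaves the old one alone, so alias still sees (1, 2, 3). Keep in mind that this protection is itself only one level deep: a tuple that contains a list cannot be reassigned item by item, but the nested list can still be changed in place.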