Most class issues can usually be boiled down to namespace issues (which makes sense, given that classes are just namespaces with a few extra tricks up their sleeves).
Theoretically speaking, classes (and class instances) are all mutable objects. Just as with built-in lists and dictionaries, they can be changed in place, by assigning to their attributes. As with lists and dictionaries, this also means that changing a class or instance object may impact multiple references to it.
That's usually what we want (and is how objects change their state in general), but this becomes especially critical to know when changing class attributes. Because all instances generated from a class share the class's namespace, any changes at the class level are reflected in all instances, unless they have their own versions of changed class attributes.
Since classes, modules, and instances are all just objects with attribute namespaces, you can normally change their attributes at runtime by assignments. Consider the following class; inside the class body, the assignment to name a generates an attribute X.a, which lives in the class object at runtime and will be inherited by all of X's instances:
>>> class X:
...     a = 1        # class attribute
...
>>> I = X()
>>> I.a              # inherited by instance
1
>>> X.a
1
So far so good. But notice what happens when we change the class attribute dynamically: it also changes it in every object that inherits from the class. Moreover, new instances created from the class get the dynamically set value, regardless of what the class's source code says:
>>> X.a = 2          # may change more than X
>>> I.a              # I changes too
2
>>> J = X()          # J inherits from X's runtime values
>>> J.a              # (but assigning to J.a changes a in J, not X or I)
2
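The parenthetical comment in the last session can be checked directly. The following is a minimal sketch, reusing the class X and the instance names I and J from the text; the result variables and the peek into `__dict__` are added here purely for illustration, and the script stores results in names instead of printing them so it does not depend on the interpreter version.

```python
class X:
    a = 1                     # class attribute, shared by all instances

I = X()
J = X()

X.a = 2                       # change at the class level: I and J both see 2
seen_after_class_change = (I.a, J.a)

J.a = 3                       # assigning through an instance creates J's own 'a'
seen_after_instance_change = (X.a, I.a, J.a)

# J now has its own version of 'a'; I still inherits from the class
j_has_own_a = 'a' in J.__dict__
i_has_own_a = 'a' in I.__dict__

X.a = 4                       # later class-level changes no longer reach J.a
seen_finally = (X.a, I.a, J.a)
```

Once J has its own version of a, it is cut off from later changes to X.a, which is the "unless they have their own versions" case described earlier.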
Useful feature or dangerous trap? You be the judge, but you can actually get work done by changing class attributes, without ever making a single instance. In fact, this technique can simulate "records" or "structs" in other languages. For example, consider the following unusual but legal Python program:
class X: pass                        # make a few attribute namespaces
class Y: pass

X.a = 1                              # use class attributes as variables
X.b = 2                              # no instances anywhere to be found
X.c = 3
Y.a = X.a + X.b + X.c

for X.i in range(Y.a): print X.i     # prints 0..5
Here, classes X and Y work like file-less modules—namespaces for storing variables we don't want to clash. This is a perfectly legal Python programming trick, but is less appropriate when applied to classes written by others; you can't always be sure that class attributes you change aren't critical to the class's internal behavior. If you're out to simulate a C struct, you may be better off changing instances than classes, since only one object is affected:
>>> class Record: pass
...
>>> X = Record()
>>> X.name = 'bob'
>>> X.job = 'Pizza maker'
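To make the "only one object is affected" point concrete, here is a small sketch built on the Record class above. The second record (sue), its field values, and the result names are hypothetical, added only for contrast.

```python
class Record:
    pass                      # empty class: just an attribute namespace

bob = Record()
bob.name = 'bob'
bob.job = 'Pizza maker'

sue = Record()                # hypothetical second record, for contrast
sue.name = 'sue'
sue.job = 'Editor'

bob.job = 'Chef'              # changes bob only

# Neither the class nor the other instance is affected
class_untouched = not hasattr(Record, 'job')
jobs = (bob.job, sue.job)
```

Each instance carries its own fields, so a change to one record cannot leak into another or into the class itself.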
This may be obvious, but is worth underscoring: if you use multiple inheritance, the order in which superclasses are listed in a class statement header can be critical. For instance, in the example we saw earlier, suppose that Super implemented a __repr__ method too; would we then want to inherit Lister's or Super's? We would get it from whichever class is listed first in Sub's class header, since inheritance searches left to right. But now suppose Super and Lister have their own versions of other names too; if we want one name from Super and one from Lister, we have to override inheritance by manually assigning to the attribute name in the Sub class:
class Lister:
    def __repr__(self): ...
    def other(self): ...

class Super:
    def __repr__(self): ...
    def other(self): ...

class Sub(Super, Lister):     # pick up Super's __repr__, by listing it first
    other = Lister.other      # but explicitly pick up Lister's version of other
    def __init__(self): ...
Multiple inheritance is an advanced tool; even if you understood the last paragraph, it's still a good idea to use it sparingly and carefully. Otherwise, the meaning of a name may depend on the order in which classes are mixed in an arbitrarily far removed subclass.
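The effect of superclass order is easiest to see by running both orderings side by side. This is only a sketch: the method bodies stand in for the elided "..." code in the text, and the Swapped class and result names are hypothetical additions for comparison.

```python
class Lister:
    def __repr__(self):
        return '<Lister repr>'    # stand-in body, not the book's Lister
    def other(self):
        return 'Lister.other'

class Super:
    def __repr__(self):
        return '<Super repr>'     # stand-in body
    def other(self):
        return 'Super.other'

class Sub(Super, Lister):         # Super is listed first, so its names win
    other = Lister.other          # explicitly pick Lister's version of other

class Swapped(Lister, Super):     # hypothetical: same superclasses, other order
    pass

s = Sub()
sub_results = (repr(s), s.other())

w = Swapped()
swapped_results = (repr(w), w.other())
```

With these stand-ins, Sub takes __repr__ from Super and other from Lister, while Swapped takes both from Lister simply because Lister is listed first.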
This one is simple if you understand Python's underlying object model, but it tends to trip up new users with backgrounds in other OOP languages (especially Smalltalk). In Python, class method functions can never be called without an instance. Earlier in the chapter, we talked about unbound methods: when we fetch a method function by qualifying a class (instead of an instance), we get an unbound method. Even though they are defined with a def statement, unbound method objects are not simple functions; they cannot be called without an instance.
For example, suppose we want to use class attributes to count how many instances are generated from a class. Remember, class attributes are shared by all instances, so we can store the counter in the class object itself:
class Spam:
    numInstances = 0
    def __init__(self):
        Spam.numInstances = Spam.numInstances + 1
    def printNumInstances():
        print "Number of instances created: ", Spam.numInstances
This won't work: the printNumInstances method still expects an instance to be passed in when called, because the function is associated with a class (even though there are no arguments in the def header):
>>> from spam import *
>>> a = Spam()
>>> b = Spam()
>>> c = Spam()
>>> Spam.printNumInstances()
Traceback (innermost last):
  File "<stdin>", line 1, in ?
TypeError: unbound method must be called with class instance 1st argument
Don't expect this: unbound methods aren't exactly the same as simple functions. This is really a knowledge issue, but if you want to call functions that access class members without an instance, just make them simple functions, not class methods. This way, an instance isn't expected in the call:
def printNumInstances():
    print "Number of instances created: ", Spam.numInstances

class Spam:
    numInstances = 0
    def __init__(self):
        Spam.numInstances = Spam.numInstances + 1

>>> import spam
>>> a = spam.Spam()
>>> b = spam.Spam()
>>> c = spam.Spam()
>>> spam.printNumInstances()
Number of instances created:  3
We can also make this work by calling through an instance, as usual:
class Spam:
    numInstances = 0
    def __init__(self):
        Spam.numInstances = Spam.numInstances + 1
    def printNumInstances(self):
        print "Number of instances created: ", Spam.numInstances

>>> from spam import Spam
>>> a, b, c = Spam(), Spam(), Spam()
>>> a.printNumInstances()
Number of instances created:  3
>>> b.printNumInstances()
Number of instances created:  3
>>> Spam().printNumInstances()
Number of instances created:  4
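Every version above updates the counter by qualifying the class (Spam.numInstances), not self. The sketch below suggests why that matters: assigning through self creates a per-instance attribute, as described earlier for class attributes. The BadSpam class and the result names are hypothetical, shown for contrast only.

```python
class Spam:
    numInstances = 0
    def __init__(self):
        # assign on the class object: one counter shared by all instances
        Spam.numInstances = Spam.numInstances + 1

class BadSpam:                    # hypothetical variant, for contrast only
    numInstances = 0
    def __init__(self):
        # reads the class's 0, then stores 1 on this instance only
        self.numInstances = self.numInstances + 1

a, b, c = Spam(), Spam(), Spam()
good_counts = (Spam.numInstances, a.numInstances)

x, y, z = BadSpam(), BadSpam(), BadSpam()
bad_counts = (BadSpam.numInstances, x.numInstances, z.numInstances)
```

In the BadSpam variant the class-level counter never moves off zero, because each instance ends up with its own numInstances set to 1.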
Some language theorists claim that this means Python doesn't have class methods, only instance methods. We suspect they really mean Python classes don't work the same as in some other language. Python really has bound and unbound method objects, with well-defined semantics; qualifying a class gets you an unbound method, which is a special kind of function. Python really does have class attributes, but functions in classes expect an instance argument.
Moreover, since Python already provides modules as a namespace partitioning tool, there's usually no need to package functions in classes unless they implement object behavior. Simple functions in modules usually do most of what instance-less class methods could. For example, in the simple-function version shown earlier in this section, printNumInstances is already associated with the class, because it lives in the same module.
Classes introduce a local scope just as functions do, so the same sorts of scope gotchas can happen in a class statement body. Moreover, methods are further nested functions, so the same issues apply. Confusion seems to be especially common when classes are nested. For instance, in the following example, the generate function is supposed to return an instance of the nested Spam class. Within its code, the class name Spam is assigned in the generate function's local scope. But within the class's method function, the class name Spam is not visible; method has access only to its own local scope, the module surrounding generate, and built-in names:
def generate():
    class Spam:
        count = 1
        def method(self):        # name Spam not visible:
            print Spam.count     # not local (def), global (module), built-in
    return Spam()

generate().method()

C:\python\examples> python nester.py
Traceback (innermost last):
  File "nester.py", line 8, in ?
    generate().method()
  File "nester.py", line 5, in method
    print Spam.count     # not local (def), global (module), built-in
NameError: Spam
The most general piece of advice we can pass along here is to remember the LGB rule; it works in classes and method functions just as it does in simple functions. For instance, inside a method function, code has unqualified access only to local names (in the method def), global names (in the enclosing module), and built-ins. Notably missing is the enclosing class statement; to get to class attributes, methods need to qualify self, the instance. To call one method from another, the caller must route the call through self (e.g., self.method()).
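The point about the missing class scope can be sketched with a top-level class. The module-level count, the method names, and the result variables here are all hypothetical, chosen only to show which name each reference finds; results are stored instead of printed so the script does not depend on the interpreter version.

```python
count = 99                        # global (module) name, unrelated to the class

class Spam:
    count = 1                     # class attribute
    def unqualified(self):
        return count              # finds the module's count, not Spam.count
    def qualified(self):
        return self.count         # qualify self to reach the class attribute
    def helper(self):
        return 'helped'
    def caller(self):
        return self.helper()      # route method-to-method calls through self
    def broken_caller(self):
        return helper()           # helper is not local, global, or built-in

s = Spam()
results = (s.unqualified(), s.qualified(), s.caller())

try:
    s.broken_caller()
    failed = False
except NameError:
    failed = True                 # the bare name helper is never found
```

The unqualified reference silently picks up the module's count, which is arguably worse than the NameError raised by the bare helper() call, since nothing flags the mistake.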
There are a variety of ways to get the example above to work. One of the simplest is to move the name Spam out to the enclosing module's scope with global declarations; since method sees names in the enclosing module by the LGB rule, Spam references work:
def generate():
    global Spam                 # force Spam to module scope
    class Spam:
        count = 1
        def method(self):
            print Spam.count    # works: in global (enclosing module)
    return Spam()

generate().method()             # prints 1
Perhaps better, we can also restructure the example such that class Spam is defined at the top level of the module by virtue of its nesting level, rather than global declarations. Both the nested method function and the top-level generate find Spam in their global scopes:
def generate():
    return Spam()

class Spam:                     # define at module top-level
    count = 1
    def method(self):
        print Spam.count        # works: in global (enclosing module)

generate().method()
We can also get rid of the Spam reference in method altogether, by using the special __class__ attribute, which, as we've seen, returns an instance's class object:
def generate():
    class Spam:
        count = 1
        def method(self):
            print self.__class__.count    # works: qualify to get class
    return Spam()

generate().method()
Finally, we could use the mutable default argument trick we saw in Chapter 4 to make this work, but it's so complicated we're almost embarrassed to show you; the prior solutions usually make more sense:
def generate():
    class Spam:
        count = 1
        fillin = [None]
        def method(self, klass=fillin):    # save from enclosing scope
            print klass[0].count           # works: default plugged-in
    Spam.fillin[0] = Spam
    return Spam()

generate().method()
Notice that we can't say klass=Spam in method's def header, because the name Spam isn't visible in Spam's body either; it's not local (in the class body), global (the enclosing module), or built-in. Spam only exists in the generate function's local scope, which neither the nested class nor its method can see. The LGB rule works the same for both.