This appendix summarizes prominent changes introduced in Python releases since the first edition of this book. It is divided into three sections, mostly because the sections on 1.6 and 2.0 changes were adapted from release note documents:
Changes introduced in Python 2.0 (and 2.1)
Changes introduced in Python 1.6
Changes between the first edition and Python 1.5.2
Python 1.3 was the most recent release when the first edition was published (October 1996), and Python 1.6 and 2.0 were released just before this second edition was finished. Release 1.6 was the last one posted by CNRI, and 2.0 was released by BeOpen (Guido's two employers prior to his move to Digital Creations); 2.0 adds a handful of features to 1.6.
With a few notable exceptions, the changes over the last five years have introduced new features to Python, but have not changed it in incompatible ways. Many of the new features are widely useful (e.g., module packages), but some seem to address the whims of Python gurus (e.g., list comprehensions) and can be safely ignored by anyone else. In any event, although it is important to keep in touch with Python evolution, you should not take this appendix too seriously. Frankly, application library and tool usage is much more important in practice than obscure language additions.
For information on the Python changes that will surely occur after this edition's publication, consult either the resources I maintain at this book's web site (http://rmi.net/~lutz/about-pp.html), the resources available at Python's web site (http://www.python.org ), or the release notes that accompany Python releases.
This section lists changes introduced in Python release 2.0. Note that third-party extensions built for Python 1.5.x or 1.6 cannot be used with Python 2.0; these extensions must be rebuilt for 2.0. Python bytecode files (*.pyc and *.pyo) are not compatible between releases either.
The following sections describe changes made to the Python language itself.
After nearly a decade of complaints from C programmers, Guido broke down and added 11 new C-like assignment operators to the language:
+= -= *= /= %= **= <<= >>= &= ^= |=
The statement A += B is similar to A = A + B except that A is evaluated only once (useful if it is a complex expression). If A is a mutable object, it may be modified in place; for instance, if it is a list, A += B has the same effect as A.extend(B).
Classes and built-in object types can override the new operators in order to implement the in-place behavior; the non-in-place behavior is automatically used as a fallback when an object does not implement the in-place behavior. For classes, the method name is the method name for the corresponding non-in-place operator prepended with an "i" (e.g., __iadd__ implements in-place __add__ ).
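As a rough sketch of the in-place protocol just described, the hypothetical Bag class below (invented here purely for illustration, not part of any library) defines both __add__ and __iadd__. The list and tuple lines show the built-in behavior: a mutable list is changed in place, while a tuple has no in-place form and falls back to building a new object.

```python
class Bag:
    """Hypothetical container used only to illustrate __iadd__."""
    def __init__(self, items):
        self.items = list(items)

    def __add__(self, other):
        # Non-in-place form: builds and returns a new Bag
        return Bag(self.items + other.items)

    def __iadd__(self, other):
        # In-place form: mutates self and must return the result
        self.items.extend(other.items)
        return self

a = Bag([1, 2])
alias = a                 # second reference to the same object
a += Bag([3])             # runs __iadd__, so alias sees the change too

nums = [1, 2]
same = nums
nums += [3, 4]            # extended in place, like nums.extend([3, 4])

t = (1, 2)
old = t
t += (3,)                 # no in-place form: t is rebound to a new tuple
```

Because __iadd__ returns self, the name a still refers to the original object after the statement; forgetting that return would rebind a to None.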
A new expression notation was added for lists whose elements are computed from another list (or lists):
[<expression> for <variable> in <sequence>]
For example, [i**2 for i in range(4)] yields the list [0,1,4,9]. This is more efficient than using map with a lambda, and at least in the context of scanning lists, avoids some scoping issues raised by lambdas (e.g., using defaults to pass in information from the enclosing scope). You can also add a condition:
[<expression> for <variable> in <sequence> if <condition>]
For example, [w for w in words if w == w.lower( )] yields the list of words that contain no uppercase characters. This is more efficient than filter with a lambda. Nested for loops and more than one if are supported as well, though using them seems to yield code that is as complex as nested maps and lambdas (see Python manuals for more details).
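The forms above can be tried with a few lines like the following. This is only a sketch; the words list and the variable names are made up for the example.

```python
squares = [i ** 2 for i in range(4)]

words = ["spam", "Eggs", "ham", "TOAST"]
lowercase_only = [w for w in words if w == w.lower()]

# Nested for clauses run left to right, like nested for loops,
# and the if clause filters the combined result
pairs = [(x, y) for x in range(3) for y in range(3) if x < y]
```

Here squares is [0, 1, 4, 9], lowercase_only keeps just "spam" and "ham", and pairs holds the three combinations in which x is less than y.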
Import statements now allow an "as" clause (e.g., import mod as name), which saves an assignment of an imported module's name to another variable. This works with from statements and package paths too (e.g., from mod import var as name). The word "as" was not made a reserved word in the process. (To import odd filenames that don't map to Python variable names, see the __import__ built-in function.)
The print statement now has an option that makes the output go to a different file than the default sys.stdout. For instance, to write an error message to sys.stderr, you can now write:
print >> sys.stderr, "spam"
As a special case, if the expression used to indicate the file evaluates to None, the current value of sys.stdout is used (like not using >> at all). Note that you can always write to file objects such as sys.stderr by calling their write method; this optional extension simply adds the extra formatting performed by the print statement (e.g., string conversion, spaces between items).
Python is now equipped with a garbage collector that can hunt down cyclical references between Python objects. It does not replace reference counting (and in fact depends on the reference counts being correct), but can decide that a set of objects belongs to a cycle if all their reference counts are accounted for in their references to each other. A new module named gc lets you control parameters of the garbage collection; an option to the Python "configure" script lets you enable or disable the garbage collection. (See the 2.0 release notes or the library manual to check if this feature is enabled by default or not; because running this extra garbage collection step periodically adds performance overheads, the decision on whether to turn it on by default is pending.)
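To see the collector at work, here is a minimal sketch that builds a two-object cycle and then asks gc to run. The Node class is hypothetical, and the weakref module used to observe the result is an assumption of this example: it arrived in a release after 2.0, so treat this as something to run on a more recent interpreter.

```python
import gc
import weakref

class Node:
    """Hypothetical class; its instances can be weakly referenced."""
    pass

a = Node()
b = Node()
a.other = b
b.other = a               # a and b now form a reference cycle
probe = weakref.ref(a)    # lets us observe a without keeping it alive

del a, b                  # the counts never reach zero on their own
gc.collect()              # ask the collector to run a full pass
reclaimed = probe() is None
```

Reference counting alone would leave both objects in memory forever, since each still holds a reference to the other; after the collection pass, the weak reference reports that the object is gone.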
This is a partial list of standard library changes introduced by Python release 2.0; see 2.0 release notes for a full description of the changes.
A new function zip was added: zip(seq1,seq2,...) is equivalent to map(None,seq1,seq2,...) when the sequences have the same length. For instance, zip([1, 2, 3], [10, 20, 30]) returns [(1,10), (2,20), (3,30)]. When the lists are not all the same length, the shortest list defines the result's length.
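A quick sketch of both cases follows. The list( ) wrapper is an addition of mine: it is harmless where zip already returns a list, as in 2.0, and it is needed on later Python versions where zip returns an iterator instead.

```python
pairs = list(zip([1, 2, 3], [10, 20, 30]))

# Sequences of different lengths: the shortest one sets the length
truncated = list(zip("abcd", [1, 2]))
```

Any sequence type works as an argument; the second call pairs characters of a string with list items and stops after two tuples.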
A new standard module named pyexpat provides an interface to the Expat XML parser. A new standard module package named xml provides assorted XML support code in (so far) three subpackages: xml.dom , xml.sax , and xml.parsers.
The new webbrowser module attempts to provide a platform-independent API to launch a web browser. (See also the LaunchBrowser script at the end of Chapter 4.)
Portability was ensured to 64-bit platforms under both Linux and Win64, especially for the new Intel Itanium processor. Large file support was also added for Linux64 and Win64.
The garbage collection changes resulted in the creation of two new slots on an object, tp_traverse and tp_clear. The augmented assignment changes result in the creation of a new slot for each in-place operator. The GC API creates new requirements for container types implemented in C extension modules. See Include/objimpl.h in the Python source distribution.
New popen2, popen3, and popen4 calls were added in the os module.
The os.popen call is now much more usable on Windows 95 and 98. To fix this call for Windows 9x, Python internally uses the w9xpopen.exe program in the root of your Python installation (it is not a standalone program). See Microsoft Knowledge Base article Q150956 for more details.
Administrator privileges are no longer required to install Python on Windows NT or Windows 2000. The Windows installer also now installs by default in \Python20\ on the default volume (e.g., C:\Python20 ), instead of the older-style \Program Files\Python-2.0\.
The Windows installer no longer runs a separate Tcl/Tk installer; instead, it installs the needed Tcl/Tk files directly in the Python directory. If you already have a Tcl/Tk installation, this wastes some disk space (about 4 MB) but avoids problems with conflicting Tcl/Tk installations and makes it much easier for Python to ensure that Tcl/Tk can find all its files.
Python 2.1 Alpha Features
Like the weather in Colorado, if you wait long enough, Python's feature set changes. Just before this edition went to the printer, the first alpha release of Python 2.1 was announced. Among its new weapons are these:
As usual, of course, you should consult this book's web page (http://www.rmi.net/~lutz/about-pp.html) and Python 2.1 and later release notes for Python developments that will surely occur immediately after I ship this insert off to my publisher.
This section lists changes introduced by Python release 1.6; by proxy, most are part of release 2.0 as well.
The append method for lists can no longer be invoked with more than one argument. This used to append a single tuple made out of all arguments, but was undocumented. To append a tuple, write l.append((a, b, c)).
The connect, connect_ex, and bind methods for sockets require exactly one argument. Previously, you could call s.connect(host, port), but this was not by design; you must now write s.connect((host, port)).
The str and repr functions are now different more often. For long integers, str no longer appends an "L"; str(1L) is "1", which used to be "1L", and repr(1L) still returns "1L". For floats, repr now gives 17 digits of precision to ensure that no precision is lost (on all current hardware).
Some library functions and tools have been moved to the deprecated category, including some widely used tools such as find. The string module is now simply a frontend to the new string methods, but given that this module is used by almost every Python module written to date, it is very unlikely to go away.
The following sections describe changes made to the Python language itself.
Python now supports Unicode (i.e., 16-bit wide character) strings. Release 1.6 added a new fundamental datatype (the Unicode string), a new built-in function unicode, and numerous C APIs to deal with Unicode and encodings. Unicode string constants are prefixed with the letter "u", much like raw strings (e.g., u"..."). See the file Misc/unicode.txt in your Python distribution for details, or visit web site http://starship.python.net/crew/lemburg/unicode-proposal.txt.
Many of the functions in the string module are now available as methods of string objects. For instance, you can now say str.lower( ) instead of importing the string module and saying string.lower(str). The equivalent of string.join(sequence,delimiter) is delimiter.join(sequence). (That is, you use " ".join(sequence) to mimic string.join(sequence).)
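In method form, the calls read roughly as follows. The sample text and variable names are arbitrary, chosen only for this sketch.

```python
text = "Spam And Eggs"

lowered = text.lower()          # was: string.lower(text)
parts = text.split()            # splits on whitespace by default
rejoined = "-".join(parts)      # the delimiter is the subject of join
spaced = " ".join(parts)        # mimics string.join(parts)
```

Note the inversion in join: the method belongs to the delimiter string, and the sequence to be joined is passed in as the argument.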
The new regular expression engine, SRE, is fully backward-compatible with the old engine, and is invoked using the same interface (the re module). That is, the re module's interface remains the way to write matches, and is unchanged; it is simply implemented to use SRE. You can explicitly invoke the old engine by importing pre, or the SRE engine by importing sre. SRE is faster than pre, and supports Unicode (which was the main reason to develop yet another underlying regular expression engine).
Special function call syntax can be used instead of the apply function: f(*args, **kwds) is equivalent to apply(f, args, kwds). You can also use variations like f(a1, a2, *args, **kwds), and can leave one or the other out (e.g., f(*args), f(**kwds)).
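Here is a small sketch of the call forms. The report function is hypothetical and simply hands back whatever it receives, so that the effect of each call is visible.

```python
def report(name, *rest, **options):
    # Hypothetical function: echoes its arguments as a tuple
    return (name, rest, options)

args = ("spam", 1, 2)
kwds = {"sep": ","}

combined = report(*args, **kwds)            # like apply(report, args, kwds)
mixed = report("ham", *args[1:], **kwds)    # explicit arguments come first
only_positional = report(*args)             # keyword part left out
```

In the mixed call, "ham" fills the name parameter and the unpacked items follow it, exactly as if they had been typed out by hand.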
The built-ins int and long take an optional second argument to indicate the conversion base, but only if the first argument is a string. This makes string.atoi and string.atol obsolete. (string.atof already was.)
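For instance, the following sketch converts from a few bases and confirms that a base is rejected when the first argument is not a string. The variable names are mine, and the exact error type is what a current interpreter raises; I have not checked it against 1.6.

```python
hex_value = int("ff", 16)
binary_value = int("1010", 2)
default_base = int("42")        # base 10 when the second argument is omitted

try:
    int(255, 16)                # a base makes no sense for a non-string
    rejected = False
except TypeError:
    rejected = True
```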
When a local variable is known to the compiler but undefined when used, a new exception UnboundLocalError is raised. This is a class derived from NameError, so code that catches NameError should still work. The purpose is to provide better diagnostics in the following example:
x = 1
def f( ):
    print x
    x = x+1
This used to raise a confusing NameError on the print statement.
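The same situation can be reproduced without a print statement, which keeps the sketch neutral across Python versions. In this variant (my own rewording of the example, not the original code), a plain read of x stands in for the print.

```python
x = 1

def f():
    y = x        # x is local to f because of the assignment below,
    x = y + 1    # so the read above fails before this line ever runs
    return x

try:
    f()
    outcome = "no error"
except UnboundLocalError:
    outcome = "UnboundLocalError"

# Older code that catches NameError keeps working
is_name_error = issubclass(UnboundLocalError, NameError)
```

The global x is never consulted here: the assignment anywhere in the function body is what makes the compiler treat x as a local throughout f.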
You can now override the in operator by defining a __contains__ method. Note that it has its arguments backward: x in a runs a.__contains__(x) (that's why the name isn't __in__).
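As a minimal sketch, the hypothetical EvenNumbers class below treats membership as a computed test instead of a search through stored items.

```python
class EvenNumbers:
    """Hypothetical class: membership means 'is an even integer'."""
    def __contains__(self, value):
        # Runs as container.__contains__(item) for 'item in container'
        return value % 2 == 0

evens = EvenNumbers()
four_in = 4 in evens        # True
seven_in = 7 in evens       # False
```

The object on the right of in receives the call, and the item on the left arrives as the argument.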
This section lists some of the changes made to the Python standard library.
New; tools for distributing Python modules.
New; read and write zip archives (module gzip does gzip files).
New; access to the Unicode 3.0 database.
New; Windows registry access (one without the _ is in progress).
Expanded to include optional OpenSSL secure socket support (on Unix only).
Support for Tk versions 8.0 through 8.3.
This module no longer uses the built-in C strop module, but takes advantage of the new string methods to provide transparent support for both Unicode and ordinary strings.
This section lists some of the changes made to Python tools.
Completely overhauled. See the IDLE home page at http://www.python.org for more information.
Python equivalent of xgettext message text extraction tool used for internationalizing applications written in Python.
This section describes significant language, library, tool, and C API changes in Python between the first edition of this book (Python 1.3) and Python release 1.5.2.
The following sections describe changes made to the Python language itself.
Python now provides a name-mangling protocol that hides attribute names used by classes. Inside a class statement, a name of the form __X is automatically changed by Python to _Class__X, where Class is the name of the class being defined by the statement. Because the enclosing class name is prepended, this feature limits the possibilities of name clashes when you extend or mix existing classes. Note that this is not a "private" mechanism at all, just a class name localization feature to minimize name clashes in hierarchies and the shared instance object's namespace at the bottom of the attribute inheritance links chain.
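To make the renaming visible, the sketch below has two hypothetical classes that each assign an attribute called __count on the same instance. The sorted and vars calls are conveniences of a current interpreter and are used here only to list the stored names.

```python
class Base:
    def __init__(self):
        self.__count = 0          # stored as _Base__count

class Derived(Base):
    def __init__(self):
        Base.__init__(self)
        self.__count = 99         # stored as _Derived__count, no clash

obj = Derived()
names = sorted(vars(obj).keys())

# Mangling localizes names; it does not make them inaccessible
still_reachable = obj._Base__count
```

The instance ends up with two separate attributes, one per class, and either can still be fetched from outside by spelling out the mangled name.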
Exceptions may now take the form of class (and class instance) objects. The intent is to support exception categories. Because an except clause will now match a raised exception if it names the raised class or any of its superclasses, specifying superclasses allows try statements to catch broad categories without listing all members explicitly (e.g., catching a numeric-error superclass exception will also catch specific kinds of numeric errors). Python's standard built-in exceptions are now classes (instead of strings) and have been organized into a shallow class hierarchy; see the library manual for details.
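A category hierarchy along these lines might look like the following sketch. The class names are hypothetical, and the except ... as spelling is the one accepted by current interpreters, not the comma form used by the releases described here.

```python
class NumericError(Exception):
    """Hypothetical category superclass."""

class Overflow(NumericError):
    pass

class DivideByZero(NumericError):
    pass

def classify(exc_class):
    try:
        raise exc_class("bad value")
    except NumericError as exc:     # matches the class and all subclasses
        return type(exc).__name__

caught = [classify(Overflow), classify(DivideByZero)]
```

A single except clause naming the superclass handles both specific errors, and new members can be added to the category later without touching the try statement.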
Import statements may now reference directory paths on your computer by dotted-path syntax. For instance:
import directory1.directory2.module             # and use path
from directory1.directory2.module import name   # and use "name"
Both load a module nested two levels deep in packages (directories). The leftmost package name in an import path (directory1) must be a directory within a directory that is listed in the Python module search path (sys.path initialized from PYTHONPATH). Thereafter, the import statement's path denotes subdirectories to follow. Paths prevent module name conflicts when installing multiple Python systems on the same machine that expect to find their own version of the same module name (otherwise, only the first on PYTHONPATH wins).
Unlike the older ni module that this feature replaces, the new package support is always available (without running special imports) and requires each package directory along an import path to contain a (possibly empty) __init__.py module file to identify the directory as a package, and serve as its namespace if imported directly. Packages tend to work better with from than with import, since the full path must be repeated to use imported objects after an import.
Python 1.5 added a new statement:
assert test [, value]
which is the same as:
if __debug__:
    if not test:
        raise AssertionError, value
Assertions are mostly meant for debugging, but can also be used to specify program constraints (e.g., type tests on entry to functions).
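As one possible use, the hypothetical average function below states a constraint on entry. This is only a sketch: the check disappears when Python runs with the -O flag, so it should not be relied on for validating input in production code.

```python
def average(values):
    # Constraint check on entry; skipped entirely under python -O
    assert len(values) > 0, "average() needs at least one value"
    return sum(values) / float(len(values))

ok = average([1, 2, 3])

try:
    average([])
    message = None
except AssertionError as exc:
    message = str(exc)
```

The optional second item of the assert statement becomes the exception's data, which is what the handler picks up as the message.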
The word "assert" was added to the list of Python reserved words; "access" was removed (it has now been deprecated in earnest).
A few convenience methods were added to the built-in dictionary object to avoid the need for manual loops: D.clear( ), D.copy( ), D.update( ), and D.get( ). The first two methods empty and copy dictionaries, respectively. D1.update(D2) is equivalent to the loop:
for k in D2.keys( ): D1[k] = D2[k]
D.get(k) returns D[k] if it exists, or None (or its optional second argument) if the key does not exist.
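The four methods can be exercised as follows; the dictionaries and keys are invented for this sketch.

```python
d1 = {"spam": 1, "eggs": 2}
d2 = {"eggs": 20, "ham": 30}

snapshot = d1.copy()            # shallow copy, unaffected by later changes
d1.update(d2)                   # keys in d2 overwrite or extend d1

found = d1.get("ham")
missing = d1.get("toast")       # None when the key is absent
fallback = d1.get("toast", 0)   # or the optional second argument

d2.clear()                      # d2 is now empty
```

Unlike indexing with d1["toast"], the get calls never raise a KeyError, which removes the need for a has_key test or a try statement around simple lookups.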
List objects have a new method, pop, to fetch and delete the last item of the list:
x = s.pop( )
...is the same as the two statements...
x = s[-1]; del s[-1]
and extend, to concatenate a list of items on the end, in place:
s.extend(x)
...is the same as...
s[len(s):len(s)] = x
The pop method can also be passed an index to delete (it defaults to -1). Unlike append, extend is passed an entire list and adds each of its items at the end.
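The difference between extend and append is easiest to see side by side, as in this short sketch with made-up values.

```python
s = [1, 2, 3]
last = s.pop()            # removes and returns the final item
first = s.pop(0)          # an index selects another position

s.extend([7, 8])          # adds each item: the list grows by two
s.append([9, 10])         # adds one item, which happens to be a list
```

After these statements s is [2, 7, 8, [9, 10]]: extend flattened its argument into the list, while append nested its argument as a single element.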
In support of regular expressions and Windows, Python allows string constants to be written in the form r"...\...", which works like a normal string except that Python leaves any backslashes in the string alone. They remain as literal \ characters rather than being interpreted as special escape codes by Python.
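A Windows-style path makes the difference concrete. The filename in this sketch is invented and was chosen because it contains two sequences, \n and \t, that normal strings treat as escapes.

```python
normal = "C:\new\text.dat"          # \n and \t become newline and tab
raw = r"C:\new\text.dat"            # backslashes are kept literally
doubled = "C:\\new\\text.dat"       # the older workaround, same result as raw
```

The normal string silently ends up two characters shorter and names no real file, while the raw and doubled forms are equal to each other.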
Python now supports complex number constants (e.g., 1+3j) and complex arithmetic operations (normal math operators, plus a cmath module with many of the math module's functions for complex numbers).
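A brief sketch of both halves, the operators and the cmath module, with arbitrary values:

```python
import cmath

z = (1 + 3j) * (2 - 1j)       # ordinary operators work on complex values
parts = (z.real, z.imag)      # components are available as attributes

root = cmath.sqrt(-1)         # math.sqrt would reject a negative argument
```

Multiplying out by hand gives 2 - 1j + 6j + 3, so z is 5+5j, and the cmath square root of -1 is the imaginary unit 1j.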
Objects created with code like L.append(L) are now detected and printed specially by the interpreter. In the past, trying to print cyclic objects caused the interpreter to loop recursively (which eventually led to a core dump).
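A sketch of the special display, as produced by a current interpreter (the list contents are arbitrary):

```python
L = [1, 2]
L.append(L)          # the list now contains a reference to itself
shown = repr(L)      # the cycle is displayed as [...]
```

The nested reference is shown as [...] instead of being expanded, so shown is the string '[1, 2, [...]]' even though L[2] really is L itself.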
A raise statement without any exception or extra-data arguments now makes Python re-raise the most recently raised uncaught exception.
Because exceptions can now either be string objects or classes and class instances, you can use any of the following raise statement forms:
raise string              # matches except with same string object
raise string, data        # same, with optional data
raise class, instance     # matches except with class or its superclass
raise instance            # same as: raise instance.__class__, instance
raise                     # reraise last exception
You can also use the following three forms, which are for backwards-compatibility with earlier releases where all built-in exceptions were strings:
raise class               # same as: raise class( ) (and: raise class, instance)
raise class, arg          # same as: raise class(arg)
raise class, (arg,...)    # same as: raise class(args...)
The new ** binary operator computes the left operand raised to the power of the right operand. It works much like the built-in pow function.
In an assignment (= statements and other assignment contexts), you can now assign any sort of sequence on the right to a list or tuple on the left (e.g., (A,B) = seq, [A,B] = seq ). In the past, the sequence types had to match.
Python 1.5 has been clocked at almost twice the speed of its predecessors on the Lib/test/pystone.py benchmark. (I've seen almost a threefold speedup in other tests.)
The following sections describe changes made to the Python standard library.
The built-in dir function now reports attributes for modules, classes, and class instances, as well as for built-in objects such as lists, dictionaries, and files. You don't need to use members like __methods__ (but you still can).
The int and float built-in functions now accept string arguments, and convert from strings to numbers exactly like string.atoi/atof. The new list(S) built-in function converts any sequence to a list, much like the older and obscure map(None, S) trick.
A new regular expression module, re, offers full-blown Perl-style regular expression matching. See Chapter 18 for details. The older regex module described in the first edition is still available, but considered obsolete.
The split and join functions in the string module were generalized to do the same work as the original splitfields and joinfields.
Beginning in Python 1.5, the pickle module's unpickler (loader) no longer calls class __init__ methods to recreate pickled class instance objects. This means that classes no longer need defaults for all constructor arguments to be used for persistent objects. To force Python to call the __init__ method (as it did before), classes must provide a special __getinitargs__ method; see the library manual for details.
An implementation of the pickle module in C is now a standard part of Python. It's called cPickle and is reportedly many times faster than the original pickle. If present, the shelve module loads it instead of pickle automatically.
To open a DBM file in "create new or open existing for read+write" mode, pass a "c" in argument 2 to anydbm.open. This changed as of Python 1.5.2; passing a "c" now does what passing no second argument used to do (the second argument now defaults to "r" -- read-only). This does not impact shelve.open.
The rand module is now deprecated; use random instead.
Tkinter became portable to and sprouted native look-and-feel for all major platforms (Windows, X, Macs). There has been a variety of changes in the Tkinter GUI interface:
The __call__ method for StringVar class objects was dropped in Python 1.4; that means you need to explicitly call their get( )/set( ) methods, instead of calling them with or without arguments.
The ScrolledText widget went through a minor interface change in Python 1.4, which was apparently backed out in release 1.5 due to code breakage (so never mind).
Tkinter now supports Tk's new grid geometry manager. To use it, call the grid method of widget objects (much like pack , but passes row and column numbers, not constraints).
Fredrik Lundh now maintains a nice set of Tkinter documentation at http://www.pythonware.com, which provides references and tutorials.
The CGI interface changed. An older FormContent interface was deprecated in favor of the FieldStorage object's interface. See the library manual for details.
These scripts are automatically run by Python on startup, used to tailor initial paths configuration. See the library manuals for details.
Assigning to a key in the os.environ dictionary now updates the corresponding environment variable in the C environment. It triggers a call to the C library's putenv routine such that the changes are reflected in integrated C code layers as well as in the environment of any child processes spawned by the Python program. putenv is now exposed in the os module too (os.putenv).
The new exc_info( ) function in the sys module returns a tuple of three values: the exception type, the exception value, and a traceback object. The first two correspond to the older sys.exc_type and sys.exc_value. These older names access a single global exception; exc_info is specific to the calling thread.
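In a handler, the call is typically unpacked in one step, as in this sketch. The tuple's third item is the traceback object, and the variable names are my own.

```python
import sys

try:
    1 / 0
except ZeroDivisionError:
    exc_type, exc_value, exc_tb = sys.exc_info()
    kind = exc_type.__name__
    has_traceback = exc_tb is not None
    del exc_tb       # avoid keeping the frame alive longer than needed
```

Deleting the traceback reference once it is no longer needed is a common precaution, because a traceback refers back to the frame that holds it.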
There is a new standard module called operator, which provides functions that implement most of the built-in Python expression operators. For instance, operator.add(X,Y) does the same thing as X+Y, but because operator module exports are functions, they are sometimes handy to use in things like map, so you don't have to create a function or use a lambda form.
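With map, that looks like the following sketch. The list( ) wrapper is my addition: it changes nothing where map already returns a list and is required on later Python versions where map returns an iterator.

```python
import operator

sums = list(map(operator.add, [1, 2, 3], [10, 20, 30]))
products = list(map(operator.mul, [1, 2, 3], [4, 5, 6]))
```

Each call replaces a lambda such as lambda x, y: x + y with a ready-made function that does the same thing.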
The following sections describe major Python tool-related changes.
The new JPython system is an alternative Python implementation that compiles Python programs to Java Virtual Machine (JVM) bytecode and provides hooks for integrating Python and Java programs. See Chapter 15.
The COM interfaces in the Python Windows ports have evolved substantially since the first edition's descriptions (it was "OLE" back then); see Chapter 15. Python also now ships as a self-installer for Windows, with built-in support for the Tkinter interface, DBM-style files, and more; it's a simple double-click to install today.
The SWIG system has become a primary extension writers' tool, with new "shadow classes" for wrapping C++ classes. See Chapter 19.
This system for publishing Python objects on the Web has grown to become a popular tool for CGI programmers and web scripters in general. See the Zope section in Chapter 15.
This tool for generating correct HTML files (web page layouts) from Python class object trees has grown to maturity. See Chapter 15.
The PMW system provides powerful, higher-level widgets for Tkinter-based GUIs in Python. See Chapter 6.
Python now ships with a point-and-click development interface named IDLE. Written in Python using the Tkinter GUI library, IDLE either comes in the source library's Tools directory or is automatically installed with Python itself (on Windows, see IDLE's entry in the Python menu within your Start button menus). IDLE offers a syntax-coloring text editor, a graphical debugger, an object browser, and more. If you have Python with Tk support enabled and are accustomed to more advanced development interfaces, IDLE provides a feature-rich alternative to the traditional Python command line. IDLE does not provide a GUI builder today.
The PIL image processing and NumPy numeric programming systems have matured considerably, and a portable database API for Python has been released. See Chapter 6 and Chapter 16.
The following sections describe changes made to the Python C API.
All useful Python symbols are now exported in the single Python.h header file; no other header files need be imported in most cases.
All Python interpreter code is now packaged in a single library file when you build Python. For instance, under Python 1.5, you need only link in libpython1.5.a when embedding Python (instead of the older scheme's four libraries plus .o's).
All exposed Python symbols now start with a "Py" prefix.
A handful of new API tools provide better support for threads when embedding Python. For instance, there are tools for finalizing Python (Py_Finalize) and for creating "multiple interpreters" (Py_NewInterpreter).
Note that spawning Python language threads may be a viable alternative to C-level threads, and multiple namespaces are often sufficient to isolate names used in independent system components; both schemes are easier to manage than multiple interpreters and threads. But in some threaded programs, it's also useful to have one copy of system modules and structures per thread, and this is where multiple interpreters come in handy (e.g., without one copy per thread, imports might find an already-loaded module in the sys.modules table if it was imported by a different thread). See the new C API documentation manuals for details.
There is a new reference manual that ships with Python and documents major C API tools and behavior. It's not fully fleshed out yet, but it's a useful start.