
2.8 Files

Most readers are probably familiar with the notion of files: named storage compartments on your computer that are managed by your operating system. Our last built-in object type provides a way to access those files inside Python programs. The built-in open function creates a Python file object, which serves as a link to a file residing on your machine. After calling open, you can read and write the associated external file by calling the file object's methods.

Compared to the types we've seen so far, file objects are somewhat unusual. They're not numbers, sequences, or mappings; instead, they export methods only for common file-processing tasks. Technically, files are a prebuilt C extension type that provides a thin wrapper over the underlying C stdio file system; in fact, file object methods have an almost one-to-one correspondence to file functions in the standard C library.

Table 2.10 summarizes common file operations. To open a file, a program calls the open function, with the external name first, followed by a processing mode ('r' means open for input, 'w' means create and open for output, 'a' means open for appending to the end, and others we'll ignore here). Both arguments must be Python strings.

Table 2.10. Common File Operations

Operation                          Interpretation
output = open('/tmp/spam', 'w')    Create output file ('w' means write)
input = open('data', 'r')          Create input file ('r' means read)
S = input.read()                   Read entire file into a single string
S = input.read(N)                  Read N bytes (1 or more)
S = input.readline()               Read next line (through end-line marker)
L = input.readlines()              Read entire file into list of line strings
output.write(S)                    Write string S onto file
output.writelines(L)               Write all line strings in list L onto file
output.close()                     Manual close (or it's done for you when collected)

Once you have a file object, call its methods to read from or write to the external file. In all cases, file text takes the form of strings in Python programs; reading a file returns its text in strings, and text is passed to the write methods as strings. Reading and writing both come in multiple flavors; Table 2.10 gives the most common.
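The operations in Table 2.10 can be tried together in one short script. What follows is only a sketch: the scratch directory, the file name spam.txt, and the variable names are made up for illustration, and the code is written so that it should also run under current Python 3, which is an assumption beyond what this chapter covers.

```python
import os
import tempfile

# Hypothetical scratch location; the file name is illustrative only.
path = os.path.join(tempfile.mkdtemp(), 'spam.txt')

output = open(path, 'w')                  # 'w' creates (or empties) the file
output.write('first line\n')              # write a single string
output.writelines(['second line\n',       # write a list of strings;
                   'third line\n'])       # no end-of-line markers are added
output.close()

output = open(path, 'a')                  # 'a' appends to the end
output.write('fourth line\n')
output.close()

infile = open(path, 'r')
whole = infile.read()                     # entire file as one string
infile.close()

infile = open(path, 'r')
head = infile.read(5)                     # at most 5 characters
rest_of_line = infile.readline()          # rest of the current line, through '\n'
remaining = infile.readlines()            # list of the lines that are left
infile.close()
```

Note that each read picks up where the previous one stopped, so read(5), readline, and readlines together consume the file exactly once.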

Calling the file close method terminates your connection to the external file. We talked about garbage collection in a footnote earlier; in Python, an object's memory space is automatically reclaimed as soon as the object is no longer referenced anywhere in the program. When file objects are reclaimed, Python automatically closes the file if needed. Because of that, you don't always need to close your files manually, especially in simple scripts that don't run long. On the other hand, manual close calls can't hurt and are usually a good idea in larger systems.
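In a larger program, one way to make sure a manual close always happens is to put it in the finally part of a try statement. This is a minimal sketch, not a required idiom; the file name log.txt and the variable names are hypothetical.

```python
import os
import tempfile

# Hypothetical scratch file, used only for this illustration.
path = os.path.join(tempfile.mkdtemp(), 'log.txt')

logfile = open(path, 'w')
try:
    logfile.write('step 1 done\n')
finally:
    logfile.close()            # runs even if the write raises an exception

was_closed = logfile.closed    # true once close has been called
```

The closed attribute is a simple way to confirm that the connection to the external file has been terminated.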

2.8.1 Files in Action

Here is a simple example that demonstrates file-processing basics. We first open a new file for output, write a string (terminated with an end-of-line marker, '\n'), and close the file. Later, we open the same file again in input mode, and read the line back. Notice that the second readline call returns an empty string; this is how Python file methods tell us we've reached the end of the file (empty lines are strings with just an end-of-line character, not empty strings).

>>> myfile = open('myfile', 'w')             # open for output (creates)
>>> myfile.write('hello text file\n')        # write a line of text
>>> myfile.close()

>>> myfile = open('myfile', 'r')             # open for input
>>> myfile.readline()                        # read the line back
'hello text file\012'
>>> myfile.readline()                        # empty string: end of file
''
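The empty-string convention makes it easy to read a file line by line in a loop. The sketch below assumes a small scratch file created just for the demonstration (the path and the names lines and line are illustrative); it also shows that an empty line comes back as '\n', not as an empty string.

```python
import os
import tempfile

# Hypothetical scratch file with one empty line in the middle.
path = os.path.join(tempfile.mkdtemp(), 'myfile')
myfile = open(path, 'w')
myfile.write('hello text file\n\nlast line\n')
myfile.close()

lines = []
myfile = open(path, 'r')
while True:
    line = myfile.readline()
    if line == '':             # empty string: end of file
        break
    lines.append(line)         # an empty line arrives as '\n', so the loop continues
myfile.close()
```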

There are additional, more advanced file methods not shown in Table 2.10; for instance, seek resets your current position in a file, flush forces buffered output to be written, and so on. See the Python library manual or other Python books for a complete list of file methods. Since we're going to see file examples in Chapter 9, we won't present more examples here.
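As a brief illustration of the two methods just mentioned, the following sketch uses flush to push buffered output out before the file is closed, and seek to go back to the start of a file after reaching its end. The scratch path and variable names are assumptions made for the example.

```python
import os
import tempfile

# Hypothetical scratch file, used only for this illustration.
path = os.path.join(tempfile.mkdtemp(), 'myfile')

myfile = open(path, 'w')
myfile.write('hello text file\n')
myfile.flush()                             # force buffered output to be written
size_after_flush = os.path.getsize(path)   # data is on disk before close
myfile.close()

myfile = open(path, 'r')
first = myfile.readline()                  # read the line
at_end = myfile.readline()                 # nothing left to read
myfile.seek(0)                             # reset position to the start of the file
again = myfile.readline()                  # the same line can be read again
myfile.close()
```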

2.8.2 Related Python Tools

File objects returned by the open function handle basic file-interface chores. In Chapter 8, you'll see a handful of related but more advanced Python tools. Here's a quick preview of all the file-like tools available:

File descriptor-based files

The os module provides interfaces for using low-level descriptor-based files.

DBM keyed files

The anydbm module provides an interface to access-by-key files.

Persistent objects

The shelve and pickle modules support saving entire objects (beyond simple strings).

Pipes

The os module also provides POSIX interfaces for processing pipes.

Other

There are also optional interfaces to database systems, B-Tree based files, and more.
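To give a feel for three of the tools previewed above, here is a hedged sketch that saves an object with pickle, uses the os module's descriptor-based calls, and sends data through a pipe. It is written for current Python 3, which is an assumption: there, pickled data and descriptor-level data are byte strings, and files used for pickling are opened in binary mode ('wb' and 'rb', among the modes ignored earlier). DBM keyed files are left out because the module name differs between Python versions. All paths and variable names are illustrative.

```python
import os
import pickle
import tempfile

workdir = tempfile.mkdtemp()               # hypothetical scratch directory

# Persistent objects: save and restore an entire dictionary with pickle.
record = {'name': 'spam', 'sizes': [1, 2, 3]}
pickle_path = os.path.join(workdir, 'record.pkl')
f = open(pickle_path, 'wb')                # binary mode is assumed here
pickle.dump(record, f)
f.close()
f = open(pickle_path, 'rb')
restored = pickle.load(f)
f.close()

# File descriptor-based files: low-level calls in the os module.
raw_path = os.path.join(workdir, 'raw.bin')
fd = os.open(raw_path, os.O_WRONLY | os.O_CREAT)
os.write(fd, b'raw bytes')
os.close(fd)
fd = os.open(raw_path, os.O_RDONLY)
raw = os.read(fd, 100)                     # read up to 100 bytes
os.close(fd)

# Pipes: os.pipe returns a pair of descriptors (read end, write end).
read_end, write_end = os.pipe()
os.write(write_end, b'through the pipe')
os.close(write_end)
piped = os.read(read_end, 100)
os.close(read_end)
```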
