only for RuBoard - do not distribute or recompile Previous Section Next Section

10.1 DB-API

Like Java and Perl, Python has developed a unified API for database access: DB-API. This database API was developed by a Python Special Interest Group (SIG) called the Database SIG. The Database SIG is a group of influential Python developers interested in implementing Python access to various databases. On the positive side, DB-API is a very small, simple API. On the negative side, it has design flaws. Part of its problem is that, being small, it does not support many of the more complex features database programmers expect in a database API. It also fails to achieve true database independence. As a comprehensive database access API, it still leaves a lot to be desired and does not compete with more mature APIs such as Perl DBI and Java JDBC. You should therefore expect significant changes in this API over time.

10.1.1 The Database Connection

The entry point into DB-API is really the only part of the API tied to a particular database engine. By convention, all modules supporting DB-API are named after the database they support with a db extension. The MySQL implementation is thus called MySQLdb. Similarly, the Oracle implementation is called oracledb and the Sybase implementation sybasedb. Each module contains a connect( ) method that returns a DB-API connection object. This method returns an object that has the same name as the module:

import MySQLdb;
conn = MySQLdb.connect(host='carthage', user='test',  
                       passwd='test', db='test');

This example connects using the username/password pair test/test to the MySQL database test hosted on the machine carthage. In addition to these four arguments, you can specify a custom port, the location of a Unix socket to use for the connection, and an integer representing client connection flags. All arguments must be passed to connect( ) as keyword/value pairs, as in the example above.

The API for a connection object is very simple. You basically use it to gain access to cursor objects and manage transactions. When you are done, you should close the connection:

conn.close(  );

10.1.2 Cursors

Cursors represent SQL statements and their results. The connection object provides your application with a cursor via the cursor( ) method:

cursor = conn.cursor(  );

This cursor is the center of your Python database access. Through the execute( ) method, you send SQL to the database and process any results. The simplest form of database access is, of course, a simple insert:

conn = MySQLdb.connect(host='carthage', user='test', passwd='test', db='test');
cursor = conn.cursor(  );
cursor.execute("INSERT INTO test (test_id, test_char) VALUES (1, 'test')");
print "Affected rows: ", cursor.rowcount;

In this example, the application inserts a new row into the database using the cursor generated by the MySQL connection. It then verifies the insert by printing out the number of rows affected. For inserts, this value should always be 1.

Query processing is a little more complex. Again, you use the execute( ) method to send SQL to the database. Instead of checking the affected rows, however, you grab the results from the cursor using one of many fetch methods. Example 10-1 shows a Python program processing a simple query.

Example 10-1. A simple query
import MySQLdb;

connection = None;
try:
    connection = MySQLdb.connect(host="carthage", user="user", 
                                 passwd="pass", db="test");
    cursor = connection.cursor(  );
    cursor.execute("SELECT test_id, test_val FROM test ORDER BY test_id");
    for row in cursor.fetchall(  ):
        print "Key: ", row[0];
        print "Value: ", row[1];
     connection.close(  );
 except:
    if connection:
        connection.close(  );

The cursor object actually provides several fetch methods: fetchone( ), fetchmany( ), and fetchall( ). For each of these methods, a row is represented by a Python tuple. In Example 10-1, the fetchall( ) method fetches all the results from the query into a list of Python tuples. This method, like all the fetch methods, will throw an exception if the SQL was not a query.

Of course, fetching all the rows at once can be inefficient for large result sets. You can instead fetch each row one by one by using the fetchone( ) method. The fetchone( ) method returns a single row as a tuple where each element represents a column in the returned row. If you have already fetched all the rows of the result set, fetchone( ) will return None.

The final fetch method, fetchmany( ), is middle ground between fetchone( ) and fetchall( ). It enables an application to fetch a predefined number of rows at once. You can either pass in the number of rows you wish to see or rely on the value of cursor.arraysize to provide a default value.

10.1.3 Parameterized SQL

DB-API includes a mechanism for executing parameterized SQL statements using the execute( ) method as well as a more complex method called executemany( ). Parameterized SQL is an SQL statement with placeholders to which you can pass arguments. As with a simple SQL execution, the first argument to execute( ) is an SQL string. Unlike the simple form, this SQL has placeholders for parameters specified by the second argument. A simple example is:

cursor.execute('INSERT INTO COLORS (COLOR, ABBR) VALUES (%s, %s)', 
               ('BLUE', 'BL'));

In this example, %s is placed in the SQL as a placeholder for values passed as the second argument. The first %s matches the first value in the paramter tuple, and the second %s matches the second value in the tuple.

MySQLdb treats all values as string values, even when their underlying database type is BIGINT, DOUBLE, DATE, etc. Thus, all conversion parameters should be %s even though you might think they should be %d or %f.

DB-API actually has several ways of marking SQL parameters. You can specify the format you wish to use by setting MySQLdb.paramstyle. The above example is MySQLdb.paramstyle = "format". The "format" value is the default for MySQLdb when a tuple of parameters is passed to execute( ) and is basically the set of placeholders from the ANSI C printf( ) function. Another possible value for MySQLdb.paramstyle is "pyformat". This value is the default when you pass a Python mapping as the second argument.

DB-API actually allows several other formats, but MySQLdb does not support them. This lack of support is particularly unfortunate since it is common practice in database applications in other languages to mark placeholders with a ?.

The utility of parameterized SQL becomes apparent when you use the executemany( ) method. This method enables you to execute the same SQL statement with multiple sets of parameters. For example, consider this code snippet that adds three rows to the database using execute( ):

cursor.execute("INSERT INTO COLOR (COLOR, ABBREV) VALUES ('BLUE', 'BL')");
cursor.execute("INSERT INTO COLOR (COLOR, ABBREV) VALUES ('PURPLE', 'PPL')");
cursor.execute("INSERT INTO COLOR (COLOR, ABBREV) VALUES ('ORANGE', 'ORN')");

That same functionality using executemany( ) looks like this:

cursor.executemany("INSERT INTO COLOR ( COLOR, ABBREV ) VALUES (%s, %s )",
                   (("BLUE", "BL"), ("PURPLE", "PPL"), ("ORANGE", "ORN")));

This one line executes the same SQL three times using different values in place of the placeholders. This can be extremely useful if you are using Python in batch processing.

10.1.4 Other Objects

DB-API provides a host of other objects to help encapsulate common SQL data types so they may be passed as parameters to execute( ) and executemany( ) and relieve developers of the burden of formatting them for different databases. These objects include Date, Time, Timestamp, and Binary. MySQLdb supports these objects up to a point. Specifically, when MySQLdb binds parameters, it converts each paramter to a string (via __str__) and places it in the SQL. The Timestamp object, in particular, includes fractional seconds, which MySQL considers illegal input.

It is important to note that MySQLdb does not properly implement the Date( ), Time( ), and Timestamp( ) constructors for their respective objects. You instead have to use the DateFromTicks( ), TimeFromTicks( ), and TimestampFromTicks( ) methods to get a reference to the desired object. The argument for each of these methods is the number of seconds since the epoch.

The following code creates a Date for the current time and updates the database:

import time;
d = MySQLdb.DateFromTicks(time.time(  ));
cursor.execute("UPDATE test SET test_date = %s WHERE test_id = 1", (d,));
only for RuBoard - do not distribute or recompile Previous Section Next Section