I l@ve RuBoard |
8.8 Storing a BLOB in a PostgreSQL DatabaseCredit: Luther Blissett 8.8.1 ProblemYou need to store a binary large object (BLOB) in a PostgreSQL database. 8.8.2 SolutionPostgreSQL 7.2 supports large objects, and the psycopg module supplies a Binary escaping function: import psycopg, cPickle # Connect to a DB, e.g., the test DB on your localhost, and get a cursor connection = psycopg.connect("dbname=test") cursor = connection.cursor( ) # Make a new table for experimentation cursor.execute("CREATE TABLE justatest (name TEXT, ablob BYTEA)") try: # Prepare some BLOBs to insert in the table names = 'aramis', 'athos', 'porthos' data = {} for name in names: datum = list(name) datum.sort( ) data[name] = cPickle.dumps(datum, 1) # Perform the insertions sql = "INSERT INTO justatest VALUES(%s, %s)" for name in names: cursor.execute(sql, (name, psycopg.Binary(data[name])) ) # Recover the data so you can check back sql = "SELECT name, ablob FROM justatest ORDER BY name" cursor.execute(sql) for name, blob in cursor.fetchall( ): print name, cPickle.loads(blob), cPickle.loads(data[name]) finally: # Done. Remove the table and close the connection. cursor.execute("DROP TABLE justatest") connection.close( ) 8.8.3 DiscussionPostgreSQL supports binary data (BYTEA and variations thereof), but you need to be careful when communicating such data via SQL. Specifically, when you use a normal INSERT SQL statement and need to have binary strings among the VALUES you're inserting, you need to escape some characters in the binary string according to PostgreSQL's own rules. Fortunately, you don't have to figure out those rules for yourself: PostgreSQL supplies functions that do all the needed escaping, and psycopg exposes such a function to your Python programs as the Binary function. This recipe shows a typical case: the BYTEAs you're inserting come from cPickle.dumps, so they may represent almost arbitrary Python objects (although, in this case, we're just using them for a few lists of characters). The recipe is purely demonstrative and works by creating a table and dropping it at the end (using a try/finally statement to ensure finalization is performed even if the program terminates because of an uncaught exception). Earlier PostgreSQL releases put limits of a few KB on the amount of data you could store in a normal field of the database. To store really large objects, you needed to use roundabout techniques to load the data into the database (such as PostgreSQL's nonstandard SQL function LO_IMPORT to load a datafile as an object, which requires superuser privileges and datafiles that reside on the machine running the PostgreSQL server) and store a field of type OID in the table to be used later for indirect recovery of the data. Fortunately, none of these techniques are necessary anymore: since Release 7.1 (the current release at the time of writing is 7.2.1), PostgreSQL embodies the results of project TOAST, which removes the limitations on field-storage size and therefore the need for peculiar indirection. psycopg supplies the handy Binary function to let you escape any binary string of bytes into a form acceptable for placeholder substitution in INSERT and UPDATE SQL statements. 8.8.4 See AlsoRecipe 8.7 for a MySQL-oriented solution to the same problem; PostgresSQL's home page (http://www.postgresql.org/); the Python/PostgreSQL module (http://initd.org/software/psycopg). |
I l@ve RuBoard |