3.23 Module: Yet Another Python Templating Utility (YAPTU)

Credit: Alex Martelli

Templating is the process of defining a block of text that contains embedded variables, code, and other markup. This text block is then automatically processed to yield another text block, in which the variables and code have been evaluated and the results have been substituted into the text. Most dynamic web sites are generated with the help of templating mechanisms.

Example 3-1 contains Yet Another Python Templating Utility (YAPTU), a small but complete Python module for this purpose. YAPTU uses the sub method of regular expressions to evaluate embedded Python expressions but handles nested statements via recursion and line-oriented statement markers. YAPTU is suitable for processing almost any kind of structured-text input, since it lets client code specify which regular expressions denote embedded Python expressions and/or statements. Such regular expressions can then be selected to avoid conflict with whatever syntax is needed by the specific kind of structured text that is being processed (HTML, a programming language, RTF, TeX, etc.). See Recipe 3.22 for another approach, in a very different Python style, with very similar design goals.

YAPTU uses a compiled re object, if specified, to identify expressions, calling sub on each line of the input. For each match that results, YAPTU evaluates match.group(1) as a Python expression and substitutes in place the result, transformed into a string. You can also pass a dictionary to YAPTU to use as the global namespace for the evaluation. Many such nonoverlapping matches per line are possible, but YAPTU does not rescan the resulting text for further embedded expressions or statements.

YAPTU also supports embedded Python statements. This line-based feature is primarily intended to be used with if/elif/else, for, and while statements.
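The expression-substitution step described above can be sketched in a few lines. This is a minimal illustration written for Python 3, not YAPTU's actual code (Example 3-1 has that); the @...@ delimiter pattern and the names EXPR, substitute, and namespace are assumptions chosen for the example.

```python
import re

# Hypothetical delimiter: anything between a pair of @ signs is an expression.
EXPR = re.compile(r'@([^@]+)@')

def substitute(line, namespace):
    """Replace each @expr@ in line with str(eval(expr)), using namespace."""
    def repl(match):
        # group(1) is the text between the delimiters
        return str(eval(match.group(1), namespace))
    # sub calls repl once per nonoverlapping match; results are not rescanned
    return EXPR.sub(repl, line)

namespace = {'x': 23, 'nested': '@x@'}
print(substitute('x is @x@ and twice x is @x * 2@', namespace))
print(substitute('not rescanned: @nested@', namespace))
```

The second call shows the no-rescan behavior: the value of nested contains delimiters of its own, but they are copied through untouched. Since eval runs arbitrary code, this style of templating is only appropriate for templates you trust.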
YAPTU recognizes statement-related lines through three more re objects that you pass it: one each for statement, continuation, and finish lines. Each of these arguments can be omitted if no such statements are to be embedded; the default is a placeholder object that never matches. Note that YAPTU relies on explicit block-end marks rather than indentation (leading whitespace) to determine statement nesting. This is because some structured-text languages that you might want to process with YAPTU have their own interpretations of the meaning of leading whitespace. The statement and continuation markers are followed by the corresponding statement lines (i.e., beginning statement and continuation clause, respectively, where the latter normally makes sense only if it's an else or elif). Statements can nest without limits, and normal Pythonic indentation requirements do not apply.

If you embed a statement that does not end with a colon (e.g., an assignment statement), a Python comment must terminate its line, so that the comment absorbs the copyblock call that YAPTU appends to every statement. Conversely, such comments are not allowed on the kind of statements that you may want to embed most often (e.g., if, else, for, and while). The lines of such statements must terminate with their :, optionally followed by whitespace. This line-termination peculiarity is due to a slightly tricky technique used in YAPTU's implementation, whereby embedded statements (with their continuations) are processed by exec, with recursive calls to YAPTU's copyblock function substituted in place of the blocks of template text they contain. This approach takes advantage of the fact that a single, controlled, simple statement can be placed on the same line as the controlling statement, right after the colon, avoiding any whitespace issues. As already explained, YAPTU does not rely on whitespace to discern embedded-statement structure; rather, it relies on explicit markers for statement start, statement continuation, and statement end.

Example 3-1.
Yet Another Python Templating Utility

"Yet Another Python Templating Utility, Version 1.3"

import sys

# utility stuff to avoid tests in the mainline code
class _nevermatch:
    "Polymorphic with a regex that never matches"
    def match(self, line): return None
    def sub(self, repl, line): return line
_never = _nevermatch()     # one reusable instance of it suffices

def identity(string, why):
    "A do-nothing-special-to-the-input, just-return-it function"
    return string

def nohandle(string, kind):
    "A do-nothing handler that just reraises the exception"
    sys.stderr.write("*** Exception raised in %s {%s}\n"%(kind, string))
    raise

# and now, the real thing:
class copier:
    "Smart-copier (YAPTU) class"

    def copyblock(self, i=0, last=None):
        "Main copy method: process lines [i,last) of block"

        def repl(match, self=self):
            "return the eval of a found expression, for replacement"
            # uncomment for debug: print '!!! replacing', match.group(1)
            expr = self.preproc(match.group(1), 'eval')
            try: return str(eval(expr, self.globals, self.locals))
            except: return str(self.handle(expr, 'eval'))

        block = self.locals['_bl']
        if last is None: last = len(block)
        while i<last:
            line = block[i]
            match = self.restat.match(line)
            if match:   # a statement starts "here" (at line block[i])
                # i is the last line NOT to process
                stat = match.string[match.end(0):].strip()
                j = i+1   # Look for 'finish' from here onwards
                nest = 1  # Count nesting levels of statements
                while j<last:
                    line = block[j]
                    # First look for nested statements or 'finish' lines
                    if self.restend.match(line):    # found a statement-end
                        nest = nest - 1     # Update (decrease) nesting
                        if nest==0: break   # j is first line NOT to process
                    elif self.restat.match(line):   # found a nested statement
                        nest = nest + 1     # Update (increase) nesting
                    elif nest==1:   # Look for continuation at this nesting
                        match = self.recont.match(line)
                        if match:           # found a continued statement
                            nestat = match.string[match.end(0):].strip()
                            # key "trick": cumulative recursive copyblock call
                            stat = '%s _cb(%s,%s)\n%s' % (stat,i+1,j,nestat)
                            i = j   # i is the last line NOT to process
                    j += 1
                stat = self.preproc(stat, 'exec')
                # second half of key "trick": do the recursive copyblock call
                stat = '%s _cb(%s,%s)' % (stat, i+1, j)
                # uncomment for debug: print "-> Executing: {"+stat+"}"
                try: exec stat in self.globals, self.locals
                except: return str(self.handle(stat, 'exec'))
                i=j+1
            else:       # normal line, just copy with substitutions
                self.oufun(self.regex.sub(repl, line))
                i=i+1

    def __init__(self, regex=_never, globals={},
            restat=_never, restend=_never, recont=_never,
            preproc=identity, handle=nohandle, oufun=sys.stdout.write):
        "Initialize self's attributes"
        def self_set(**kwds): self.__dict__.update(kwds)
        self_set(locals={'_cb': self.copyblock}, **vars())

    def copy(self, block=None, inf=sys.stdin):
        "Entry point: copy-with-processing a file, or a block of lines"
        if block is None: block = inf.readlines()
        self.locals['_bl'] = block
        self.copyblock()

if __name__=='__main__':
    "Test: copy a block of lines to stdout, with full processing"
    import re
    rex=re.compile('@([^@]+)@')
    rbe=re.compile('\+')
    ren=re.compile('-')
    rco=re.compile('= ')
    x=23 # just a variable to try substitution
    cop = copier(rex, globals(), rbe, ren, rco) # Instantiate smart copier
    lines_block = """
A first, plain line -- it just gets copied.
A second line, with @x@ substitutions.
+ x+=1   # Nonblock statements (nonblock ones ONLY!) must end with comments
-
Now the substitutions are @x@.
+ if x>23:
After all, @x@ is rather large!
= else:
After all, @x@ is rather small!
-
+ for i in range(3):
Also, @i@ times @x@ is @i*x@.
-
One last, plain line at the end.""".splitlines(1)
    print "*** input:"
    print ''.join(lines_block)
    print "*** output:"
    cop.copy(lines_block)

Not counting comments, whitespace, and docstrings, YAPTU is just 50 lines of source code, but rather a lot happens within that code.
An instance of the auxiliary class _nevermatch is used for all default placeholder values for optional regular-expression arguments. This instance is polymorphic with compiled re objects for the two methods of the latter that YAPTU uses (sub and match), which simplifies the main body of code and saves quite a few tests. This is a good general idiom to keep in mind for generality and concise code (and often speed as well). See Recipe 5.24 for a more systematic and complete development of this idiom into the full-fledged Null Object design pattern.

An instance of the copier class has a certain amount of state, in addition to the relevant compiled re objects (or _nevermatch instance) and the output function to use (normally a write bound method for some file or file-like object). This state is held in two dictionary attributes: self.globals, the dictionary that was originally passed in for expression substitution; and self.locals, another dictionary that is used as the local namespace for all of YAPTU's exec and eval calls. Note that while self.globals is available to YAPTU, YAPTU does not change anything in it, as that dictionary is owned by YAPTU's caller.

There are two internal-use-only items in self.locals. The value at key '_bl' indicates the block of template text being copied (a sequence of lines, each ending with \n), while the value at key '_cb', self.copyblock, is the bound method that performs the copying. Holding these two pieces of state as items in self.locals is key to YAPTU's workings, since self.locals is what is guaranteed to be available to the code that YAPTU processes with exec. copyblock must be recursive, as this is the simplest way to ensure there are no nesting limitations. Thus, it is important to ensure that nested recursive calls are always able to further recurse, if needed, through their exec statements.
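A small Python 3 sketch shows why the copying hook has to live in the locals dictionary, and why the caller's globals stay untouched. It is an illustration, not YAPTU's code: caller_globals, local_names, and the stand-in _cb function are hypothetical names, and exec is used in its Python 3 function form.

```python
def _cb(i, last):
    "Stand-in for copyblock: just report which line indexes it was given"
    return list(range(i, last))

caller_globals = {"x": 23}   # owned by the caller
local_names = {}             # owned by the (hypothetical) copier

# Without '_cb' in either namespace, executed code cannot reach it.
try:
    exec("lines = _cb(1, 3)", caller_globals, local_names)
    found = True
except NameError:
    found = False
print("hook found without registration:", found)

# Once registered in the locals dictionary, any executed statement can call it.
local_names["_cb"] = _cb
exec("lines = _cb(1, 3)", caller_globals, local_names)
print(local_names["lines"])

# Assignments made by executed code land in the locals dictionary,
# so the caller's value of x is left alone.
exec("x += 1", caller_globals, local_names)
print(caller_globals["x"], local_names["x"])
```

The last two lines mirror what happens to x in the test template: the incremented value is stored in the copier's local namespace, where later substitutions find it first.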
Access to _bl is similarly necessary, since copyblock takes as arguments only the line indexes inside _bl that a given recursive call is processing (in the usual Python form, with the lower bound included and the upper bound excluded).

copyblock is the heart of YAPTU. The repl nested function is the one that is passed to the sub method of compiled re objects to get the text to be used for each expression substitution. repl uses eval on the expression string and str on the result, to ensure that the returned value is also a string.

Most of copyblock is a while loop that examines each line of text. When a line doesn't match a statement-start marker, the loop performs substitutions and then calls the output function. When a line does match a statement-start marker, the loop enters a smaller nested loop, looking for statement-continuation and statement-end markers (with proper accounting for nesting levels, of course). The nested loop builds up, in the local variable stat, a string containing the original statement and its continuations at the same nesting level (if any), followed by a recursive call to _cb(i,j) after each clause-delimiting colon, with newlines as separators between any continuations. When the nested loop terminates, stat is passed to the exec statement, and the main loop then resumes from a position immediately following the embedded statement just processed. Thanks to perfectly normal recursive-invocation mechanisms, although the exec statement inevitably invokes copyblock recursively, this does not disturb the loop's state (which is based on local variables unoriginally named i and j because they are loop counters and indexes on the _bl list).

YAPTU supports optional preprocessing of all expressions and statements: pass a callable preproc when creating the copier. The default, however, is no preprocessing. Exceptions may be handled by passing an optional callable handle.
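The calling conventions for these two hooks can be read off Example 3-1: preproc receives the source string plus 'eval' or 'exec', and handle receives the string plus the same kind tag while an exception is being handled. A hedged Python 3 sketch of a custom pair follows; the names strip_preproc, placeholder_handle, and evaluate are hypothetical, and evaluate only mimics what the listing's repl function does.

```python
def strip_preproc(string, why):
    "Hypothetical preprocessor: trim whitespace around the source text"
    return string.strip()

def placeholder_handle(string, kind):
    "Hypothetical handler: substitute a marker instead of reraising"
    return "<error in %s: %s>" % (kind, string)

def evaluate(expr, namespace, preproc, handle):
    "Mimic repl: preprocess, evaluate, and fall back to the handler"
    expr = preproc(expr, 'eval')
    try:
        return str(eval(expr, namespace))
    except Exception:
        return str(handle(expr, 'eval'))

print(evaluate(' 2 + 3 ', {}, strip_preproc, placeholder_handle))
print(evaluate(' 1/0 ', {}, strip_preproc, placeholder_handle))
```

A handler of this kind keeps one bad expression from aborting the whole copy, at the cost of hiding the traceback.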
The default behavior is for YAPTU to write a diagnostic line to standard error and reraise the exception, which terminates YAPTU's processing and propagates the exception outward to YAPTU's caller.

You should also note that the __init__ method avoids the usual block of boilerplate self.spam = spam statements that you typically see in __init__. Instead, it uses a "self-set" idiom to achieve the same result without repetitious, verbose, and error-prone boilerplate code.

3.23.1 See Also

Recipe 3.22, Recipe 5.24, and Recipe 17.8.