<!DOCTYPE article PUBLIC "-//OASIS//DTD DocBook XML V4.2//EN"
 "http://www.oasis-open.org/docbook/xml/4.2/docbookx.dtd"
  [
    <!ENTITY litsource   "<userinput>litsource</userinput>">
  ]
>
<article>
  <articleinfo>
    <title>A source extractor for lightweight literate programming</title>
    <titleabbrev>
      &litsource;: A literate source extractor
    </titleabbrev>
    <authorgroup>
      <author>
        <firstname>John W.</firstname>
        <surname>Shipman</surname>
      </author>
    </authorgroup>
    <address><email>tcc-doc@nmt.edu</email>
    </address>
    <revhistory>
      <revision>
        <revnumber>$Revision: 1.6 $</revnumber>
        <date>$Date: 2006/06/11 20:01:45 $</date>
      </revision>
    </revhistory>
  </articleinfo>
  <section id="intro">
    <title>Introduction</title>
    <blockquote>
      <attribution>
        <citetitle>Structure and interpretation of computer
        programs</citetitle>, Harold Abelson and Gerald Jay
        Sussman, p. xvii
      </attribution>
      <para>
        Programs must be written for people to read, and only
        incidentally for machines to execute.
      </para>
    </blockquote>
    <para>
      By literate programming, we mean programs that are intended
      to be readable.  The idea comes from Dr. Donald
      E. Knuth and has a long history.  For background, see the
      <ulink url='http://www.literateprogramming.com/'>Literate
      Programming web site</ulink>.
    </para>
    <para>
      Knuth's <userinput>cweb</userinput> system interweaves
      narrative about the program with the actual source code of
      the program.  One then runs a tool named
      <userinput>ctangle</userinput> to generate the source code,
      and a different tool named <userinput>cweave</userinput> to
      generate the formatted documentation.
    </para>
    <para>
      The present effort was inspired by similar efforts of
      <ulink url='http://www.nmt.edu/~al/'>Dr. Allan
      M. Stavely</ulink>, who suggested using DocBook as a
      general framework for literate programming.  Refer to
      <ulink url='http://www.nmt.edu/tcc/help/pubs/docbook42/'
      ><citetitle>Writing documentation with DocBook-XML
      4.2</citetitle></ulink> for more information on DocBook.
    </para>
    <para>
      Stavely's idea was to use DocBook's existing
      <userinput>programlisting</userinput> element to hold the
      program fragments, adding a
      <userinput>role='executable'</userinput> attribute to that
      element to distinguish executable source code from other
      uses of the <userinput>programlisting</userinput> element.
      This means that the regular processing of DocBook into HTML
      and PDF forms becomes the new equivalent of Knuth's
      <userinput>cweave</userinput> step.
    </para>
    <para>
      The remaining half of the problem, the extraction of the
      executable code from the DocBook source file, is the
      subject of this document.
    </para>
    <section id='this-avail'>
      <title>How to get this publication</title>
      <para>
        This document is available in <ulink
        url='http://www.nmt.edu/tcc/help/lang/python/examples/litsource/'
        >Web form</ulink> and also as a <ulink
url='http://www.nmt.edu/tcc/help/lang/python/examples/litsource/litsource.pdf'
        >PDF document</ulink>.  See also the <ulink
url='http://www.nmt.edu/tcc/help/lang/python/examples/litsource/litsource'
        >executable Python source</ulink> and the <ulink
url='http://www.nmt.edu/tcc/help/lang/python/examples/litsource/litsource.xml'
        >XML source of this document</ulink>.
      </para>
    </section> <!--End this-avail-->
  </section>
  <section id="encoding">
    <title>Encoding the literate program</title>
    <para>
      One limitation of Stavely's approach was that it assembled
      all the executable code fragments into a single file for
      execution.  But the literate exposition of a C program, for
      example, might require the discussion of two source files, a
      header file named <filename>foo.h</filename> and a code
      file named <filename>foo.c</filename>.  We get around this
      problem by using the <userinput>role</userinput> attribute
      of the <userinput>programlisting</userinput> element in a
      more flexible way.
    </para>
    <para>
      The general form of a literate program source is a valid
      DocBook-XML file, except that each fragment of executable
      code is wrapped in a <userinput>programlisting</userinput>
      element with this general format:
      <programlisting
>&lt;programlisting role='outFile:<replaceable>F</replaceable>'&gt;
  (source text)
&lt;/programlisting&gt;
</programlisting>
      where <userinput><replaceable>F</replaceable></userinput>
      is the name of the output file to which that source text
      should be written.
    </para>
    <para>
      We can then handle the above example by using a
      <userinput>role='outFile:foo.h'</userinput> attribute on
      fragments of the header file and a
      <userinput>role='outFile:foo.c'</userinput> attribute on
      fragments of the code file.  For example:
    </para>
    <programlisting
>&lt;programlisting role='outFile:foo.h'&gt;
  (stuff to be written to foo.h)
&lt;/programlisting&gt;
   ...
&lt;programlisting role='outFile:foo.c'&gt;
  (stuff to be written to foo.c)
&lt;/programlisting&gt;
</programlisting>
    <para>
      Of course, either of those files can be broken into many
      fragments spread throughout the document.  They can even be
      intermingled.
    </para>
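    <para>
      The effect of intermingling can be sketched in a few lines
      of Python.  This is only an illustration of the intended
      result, not the &litsource; implementation: the
      <userinput>assemble()</userinput> helper and the sample
      fragments are hypothetical, and the fragments are assumed
      to have been pulled out of the document already as pairs of
      <userinput>role</userinput> value and text.
    </para>
    <programlisting
>ROLE_PREFIX = "outFile:"

def assemble(fragments):
    """Group (role, text) pairs by output file name, in document order.

    Hypothetical helper for illustration; listings whose role lacks
    the outFile: prefix are ordinary listings and are skipped.
    """
    outputs = {}
    for role, text in fragments:
        if role is None or not role.startswith(ROLE_PREFIX):
            continue
        name = role[len(ROLE_PREFIX):]
        outputs[name] = outputs.get(name, "") + text
    return outputs

# Fragments for two files, intermingled as they might be in a document.
fragments = [
    ("outFile:foo.h", "int foo(void);\n"),
    ("outFile:foo.c", '#include "foo.h"\n'),
    (None, "(an ordinary listing, not extracted)\n"),
    ("outFile:foo.c", "int foo(void) { return 42; }\n"),
]
result = assemble(fragments)
</programlisting>
    <para>
      Each output file receives its own fragments, concatenated
      in the order in which they appear in the document.
    </para>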
    <para>
      There are two important refinements to mention:
    </para>
    <itemizedlist>
      <listitem>
        <para>
          You can use a CDATA section to enclose the source
          fragment.  This XML convention uses special delimiters
          to tell processing programs not to mess with anything
          between
          &#x201c;<userinput>&lt;![CDATA[</userinput>&#x201d; and
          &#x201c;<userinput>]]&gt;</userinput>&#x201d;.  This is
          especially convenient for enclosing XML fragments,
          because you can use
          &#x201c;<userinput>&lt;</userinput>&#x201d; and
          &#x201c;<userinput>&gt;</userinput>&#x201d; characters
          without having to escape them.
        </para>
      </listitem>
      <listitem>
        <para>
          If your text is not enclosed in a CDATA section, you
          can use DocBook tags inside the
          <userinput>programlisting</userinput> element.
        </para>
        <para>
          For example, you can enclose a function call inside a
          <userinput>link</userinput> element that links to the
          definition of that function.  In both the HTML and PDF
          generated from the DocBook file, that function name
          will then be clickable.
        </para>
        <para>
          Another element you might want to use inside a code
          fragment is the <userinput>co</userinput> element, to
          label lines of the code with callouts that are defined
          later inside DocBook <userinput>callout</userinput> elements.
        </para>
      </listitem>
    </itemizedlist>
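    <para>
      As a minimal sketch of why both refinements are safe for
      code extraction, the following Python fragment feeds three
      small XML strings through the standard
      <userinput>xml.sax</userinput> module and collects the text
      that the parser reports.  The names
      <userinput>TextCollector</userinput> and
      <userinput>text_of()</userinput> and the sample strings are
      assumptions made for this illustration; they are not part
      of &litsource;.
    </para>
    <programlisting
>import xml.sax
from xml.sax.handler import ContentHandler

class TextCollector(ContentHandler):
    """Hypothetical helper: accumulate all text content."""
    def __init__(self):
        ContentHandler.__init__(self)
        self.chunks = []

    def characters(self, text):
        # The parser may deliver text in several pieces.
        self.chunks.append(text)

def text_of(xml_text):
    """Return the concatenated text content of an XML string."""
    handler = TextCollector()
    xml.sax.parseString(xml_text.encode("utf-8"), handler)
    return "".join(handler.chunks)

escaped = "&lt;programlisting&gt;if a &amp;lt; b:&lt;/programlisting&gt;"
cdata = "&lt;programlisting&gt;&lt;![CDATA[if a &lt; b:]]&gt;&lt;/programlisting&gt;"
linked = "&lt;programlisting&gt;&lt;link linkend='f'&gt;f&lt;/link&gt;(x)&lt;/programlisting&gt;"

# Escaped text and CDATA text reach the handler as the same characters.
assert text_of(escaped) == text_of(cdata) == "if a &lt; b:"
# Inline markup contributes only its text content.
assert text_of(linked) == "f(x)"
</programlisting>
    <para>
      In this sketch the escaped and CDATA forms yield identical
      text, and the <userinput>link</userinput> tags vanish from
      the collected text while their content remains.
    </para>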
    <para>
      Here's an example of the use of callouts, as it would be
      encoded in the DocBook source.  This is from the exposition
      of a schema using <ulink
      url="http://www.nmt.edu/tcc/help/pubs/rnc/">Relax NG
      Compact Format (RNC)</ulink>.
    </para>
    <programlisting
><![CDATA[      <programlisting role='outFile:trails.rnc'>
park = element park
{ attribute name { text }?,   <co id='park.name'/>
  trail*                      <co id='park.trail'/>
}
</programlisting>
      <calloutlist>
        <callout arearefs='park.name'>
          <para>
            This optional attribute contains the name of the park.
          </para>
        </callout>
        <callout arearefs='park.trail'>
          <para>
            The content of a <userinput>park</userinput> element
            consists of zero or more <userinput>trail</userinput>
            elements.
          </para>
        </callout>
      </calloutlist>]]>
    </programlisting>
  </section>
  <section id="operation">
    <title>Operation of the &litsource; script</title>
    <para>
      A script in the Python language extracts the various output files from
      DocBook source files.  Command line arguments are:
    </para>
    <programlisting
>litsource <replaceable>file</replaceable> ...
</programlisting>
    <para>
      Each DocBook-XML source file named on the command line is
      read, and all the <userinput>programlisting</userinput>
      elements with the correct <userinput>role</userinput>
      attribute are assembled and written to the corresponding
      files.
    </para>
    <section id="makefile">
      <title>Suggested <userinput>Makefile</userinput> rules</title>
      <para>
        If you are using the Unix <application >make</application
        > utility to build your document and source files, you
        can add lines to your
        <userinput>Makefile</userinput> to take care of building
        the program source files.
      </para>
      <para>
        First, in the part of your <filename >Makefile</filename
        > that defines variables, define a variable named
        <userinput>CODE_TARGET</userinput> that contains the name
        of the source file you want to build.  If you are
        building multiple source files, any of them will work.
        For instance, if your source file is called <filename
        >run.c</filename >, the definition would look like this:
      </para>
      <programlisting
>CODE_TARGET     =  run.c
</programlisting>
      <para>
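        Then, in the rules part of your <filename
        >Makefile</filename >, add this rule.  It assumes that a
        variable named <userinput>TARGET</userinput> holds the
        base name of your DocBook source file: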
      </para>
      <programlisting
>code: $(CODE_TARGET)

$(CODE_TARGET): $(TARGET).xml
        litsource $&lt;
</programlisting>
      <para>
        Make sure that the last line starts with an actual
        <keysym>tab</keysym> character.
      </para>
      <para>
        Here's one more refinement.  Suppose you are generating
        an executable script in some scripting language like Perl
        or Python, and you need to make that script executable
        under Unix.  Assume further that the script's name is in
        the <userinput>$(CODE_TARGET)</userinput> variable.  You
        can automate making the script executable with a rule
        like this:
      </para>
      <programlisting
>$(CODE_TARGET): $(TARGET).xml
        litsource $&lt;; \
        chmod +x $(CODE_TARGET)
</programlisting>
    </section> <!--End makefile-->
  </section>
  <section id="source">
    <title>Literate exposition of the &litsource; program
    itself</title>
    <para>
      The &litsource; program is worth study as an example not
      only of literate programming but also of how easy it is to
      process XML files in Python.
    </para>
    <section id="design-notes">
      <title>Design notes</title>
      <para>
        An earlier version of this script used the Document
        Object Model (DOM) to build a tree representation of the
        entire DocBook document.  It then used XPath to pull from
        this tree the set of
        <userinput>programlisting</userinput> elements that had a
        <userinput>role</userinput> attribute whose value started
        with <userinput>"outFile:"</userinput>.  The code was
        straightforward and quite short.  See the <ulink
        url='http://www.nmt.edu/tcc/help/lang/python/examples/litdom'
        >literate exposition of the DOM version</ulink >.
      </para>
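      <para>
        For comparison, here is a rough sketch of the tree-based
        approach.  It is not the earlier DOM version of the
        script: it uses the standard library's
        <userinput>xml.dom.minidom</userinput> module and filters
        elements by hand instead of using XPath, and the function
        names and sample document are hypothetical.
      </para>
      <programlisting
><![CDATA[from xml.dom import minidom

ROLE_PREFIX = "outFile:"

def node_text(node):
    """Concatenate the text (including CDATA) beneath a DOM node."""
    if node.nodeType in (node.TEXT_NODE, node.CDATA_SECTION_NODE):
        return node.data
    return "".join(node_text(child) for child in node.childNodes)

def fragments_by_file(xml_text):
    """Build the whole tree, then pick out the code fragments."""
    doc = minidom.parseString(xml_text.encode("utf-8"))
    outputs = {}
    for elt in doc.getElementsByTagName("programlisting"):
        role = elt.getAttribute("role")   # "" if the attribute is absent
        if role.startswith(ROLE_PREFIX):
            name = role[len(ROLE_PREFIX):]
            outputs[name] = outputs.get(name, "") + node_text(elt)
    doc.unlink()
    return outputs

sample = """<article>
<programlisting role='outFile:hello.py'>print("hello")
</programlisting>
<programlisting>not extracted</programlisting>
</article>"""

outputs = fragments_by_file(sample)
]]></programlisting>
      <para>
        Note that the entire document is held in memory as a tree
        before the first fragment is examined.
      </para>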
      <para>
        However, for large DocBook files, the DOM technique became
        somewhat time-consuming.  For example, a 5400-line DocBook
        file took about 60 seconds to process.  The current
        version, using Python's SAX interface (Simple API for XML),
        processed this same file in 0.11 seconds, a better than
        500-fold performance improvement.
      </para>
      <para>
        SAX is a completely different approach to XML processing.
        It is a serial, event-based technique.  The SAX interface
        reads through the XML and classifies each bit as a start
        tag, end tag, chunk of text, comment, and so on.  The
        programmer defines a set of &#x201c;handlers&#x201d; that
        are called whenever specific types of content are
        encountered.
      </para>
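      <para>
        To make the event model concrete, here is a small sketch
        that records the events reported for a one-line XML
        string.  The <userinput>EventLogger</userinput> class and
        the sample string are hypothetical; only the
        <userinput>ContentHandler</userinput> method names and
        <userinput>xml.sax.parseString()</userinput> come from
        the standard library.
      </para>
      <programlisting
><![CDATA[import xml.sax
from xml.sax.handler import ContentHandler

class EventLogger(ContentHandler):
    """Hypothetical handler that records each event it receives."""
    def __init__(self):
        ContentHandler.__init__(self)
        self.events = []

    def startElement(self, name, attrs):
        self.events.append(("start", name))

    def characters(self, text):
        self.events.append(("text", text))

    def endElement(self, name):
        self.events.append(("end", name))

logger = EventLogger()
xml.sax.parseString(b"<para>Hello, <emphasis>SAX</emphasis>!</para>", logger)
for event in logger.events:
    print(event)
]]></programlisting>
      <para>
        The events arrive strictly in document order.  A parser
        is free to deliver one run of text as several
        <userinput>characters()</userinput> calls, so a handler
        should not assume that each call carries a complete
        string.
      </para>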
      <para>
        Because the &litsource; script cares only about the text
        inside selected <userinput>programlisting</userinput>
        elements, there is no need to build a tree of the entire
        DocBook file.  All we need is a SAX interface with three
        handlers:
      </para>
      <orderedlist>
        <listitem>
          <para>
            One handler observes each start tag that goes by.
            When it sees a <userinput >&lt;programlisting
            role='outFile:<replaceable >filename</replaceable
            >'&gt;</userinput > tag, it remembers that we are now
            inside a code fragment, and it also remembers the
            <userinput ><replaceable >filename</replaceable
            ></userinput >, and opens an output file by that
            name.
          </para>
        </listitem>
        <listitem>
          <para>
            Another handler is called whenever the SAX interface
            sees text content.  If we are currently inside a
            code fragment, that text content is written to the
            current output file.
          </para>
        </listitem>
        <listitem>
          <para>
            A third handler observes each end tag.  If it is a
            <userinput >&lt;/programlisting&gt;</userinput > tag,
            we note that we're no longer inside a code fragment.
          </para>
        </listitem>
      </orderedlist>
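      <para>
        The three handlers can be sketched compactly as follows.
        This is a simplified, in-memory variant written for
        illustration: unlike &litsource;, it collects the
        fragments in a dictionary instead of writing files, and
        the <userinput>FragmentCollector</userinput> class and
        the sample document are hypothetical.
      </para>
      <programlisting
><![CDATA[import xml.sax
from xml.sax.handler import ContentHandler

ROLE_PREFIX = "outFile:"

class FragmentCollector(ContentHandler):
    """Hypothetical in-memory variant: collect fragments in a dict."""
    def __init__(self):
        ContentHandler.__init__(self)
        self.outputs = {}      # output file name -> list of text chunks
        self.current = None    # chunk list for the fragment we are in

    def startElement(self, name, attrs):
        role = attrs.get("role", "")
        if name == "programlisting" and role.startswith(ROLE_PREFIX):
            file_name = role[len(ROLE_PREFIX):]
            self.current = self.outputs.setdefault(file_name, [])

    def characters(self, text):
        if self.current is not None:
            self.current.append(text)

    def endElement(self, name):
        if name == "programlisting":
            self.current = None

sample = b"""<article>
<programlisting role='outFile:foo.h'>int foo(void);
</programlisting>
<para>Narrative text is ignored.</para>
<programlisting role='outFile:foo.c'>#include "foo.h"
</programlisting>
<programlisting role='outFile:foo.c'>int foo(void) { return 42; }
</programlisting>
</article>"""

collector = FragmentCollector()
xml.sax.parseString(sample, collector)
contents = dict((name, "".join(chunks))
                for name, chunks in collector.outputs.items())
]]></programlisting>
      <para>
        No tree is built: the only state kept between events is
        the dictionary of collected text and a reference to the
        fragment currently being read.
      </para>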
      <para>
        Here are some good resources for learning about Python's
        XML libraries:
      </para>
      <itemizedlist>
        <listitem id='python-web'>
          <para>
            See the <ulink url='http://www.python.org'>Python web
            site</ulink> for information and downloads for the
            Python language.
          </para>
        </listitem>
        <listitem id='oreilly-book'>
          <para>
            <citetitle>Python &amp; XML</citetitle>, by Christopher
            A. Jones and Fred L. Drake, Jr.  (O'Reilly Press, 2002,
            ISBN 0-596-00128-2) is an excellent overview of the
            major approaches to Python XML processing, with copious
            examples.
          </para>
        </listitem>
        <listitem id='sax-lib'>
          <para>
            In the online Python Library Reference, see the
            documentation for the <ulink
            url='http://www.python.org/doc/2.4/lib/module-xml.sax.html'
            ><userinput>xml.sax</userinput> module</ulink>.
          </para>
        </listitem>
      </itemizedlist>
      <para>
        This program was written using the Cleanroom or
        zero-defect methodology.  The best introduction to the
        method is given in Stavely, Allan M., <citetitle>Toward
        Zero-defect Programming</citetitle>, Addison-Wesley,
        1999, ISBN 0-201-38595-3.  Also see
        <ulink url="http://www.nmt.edu/~shipman/soft/clean">my
        Cleanroom pages</ulink> for a discussion of how I
        practice the methodology.
      </para>
    </section> <!--End design-notes-->
    <section id="prologue">
      <title>The prologue</title>
      <para>
        The script starts with the usual Python prologue.  The
        first line makes the script self-executing.  This is
        followed by minimal comments pointing to the online form
        of the literate programming document, and the Cleanroom
        intended function for the program as a whole.
      </para>
      <!--NB: It is critical to avoid a blank line at the
       !  beginning of the script, hence the unusual position
       !  of the closing '>' at the end of the next tag:
       !-->
      <programlisting role='outFile:litsource'
>#!/usr/bin/env python
#================================================================
# litsource:  Extract code from literate-programming source files.
#   For documentation, see:
#       http://www.nmt.edu/tcc/help/lang/python/examples/litsource/
#----------------------------------------------------------------
# Overall intended function:
#   [ output files named in input files given on the command line
#         :=  code fragments designated for those files
#     sys.stderr  +:=  error messages if any ]
#----------------------------------------------------------------
</programlisting>
    </section> <!--End prologue-->
    <section id="imports">
      <title>Modules required</title>
      <para>
        Aside from the standard Python <userinput>sys</userinput>
        module that gives programs access to their standard I/O
        streams and command line arguments, the program needs two
        items from Python's SAX library:
      </para>
      <itemizedlist>
        <listitem>
          <para>
            The <userinput >ContentHandler</userinput > class is
            a base class used to write SAX content handlers.  See
            the <ulink
            url='http://www.python.org/doc/2.4/lib/module-xml.sax.handler.html'
            >documentation for <userinput
            >xml.sax.handler</userinput ></ulink>.
          </para>
        </listitem>
        <listitem>
          <para>
            The <userinput >make_parser()</userinput > function
            is used to create a parser using our content handler.
            See the <ulink
            url='http://www.python.org/doc/2.4/lib/module-xml.sax.html'
            >documentation for <userinput >xml.sax</userinput
            ></ulink >.
          </para>
        </listitem>
      </itemizedlist>
      <programlisting role='outFile:litsource'
>import sys
from xml.sax.handler import ContentHandler
from xml.sax import make_parser
</programlisting>
    </section> <!--End imports-->
    <section id="globals">
      <title>Global declarations</title>
      <para>
        These manifest constants are defined globally.
      </para>
      <variablelist>
        <varlistentry>
          <term>
            <userinput >PROG_ELT</userinput >
          </term>
          <listitem>
            <para>
              The name of the <userinput
              >programlisting</userinput > element.
            </para>
            <programlisting role='outFile:litsource'
>#================================================================
# Manifest constants
#----------------------------------------------------------------

PROG_ELT     =  "programlisting"
</programlisting>
          </listitem>
        </varlistentry>
        <varlistentry>
          <term>
            <userinput >ROLE_ATTR</userinput >
          </term>
          <listitem>
            <para>
              The name of the <userinput >role</userinput >
              attribute.
              <programlisting role='outFile:litsource'
>ROLE_ATTR    =  "role"
</programlisting>
            </para>
          </listitem>
        </varlistentry>
        <varlistentry>
          <term>
            <userinput >ROLE_PREFIX</userinput >
          </term>
          <listitem>
            <para>
              The prefix of the <userinput >role</userinput >
              attribute that identifies this <userinput
              >programlisting</userinput > element as a code
              fragment.
              <programlisting role='outFile:litsource'
>ROLE_PREFIX  =  "outFile:"
</programlisting>
            </para>
          </listitem>
        </varlistentry>
      </variablelist>
    </section> <!--End globals-->
    <section id="main">
      <title>The main program</title>
      <para>
        All the main program does is iterate over the list of
        files given as command line arguments, processing each
        one in turn by calling <xref linkend='processFile' />.
      </para>
      <programlisting role='outFile:litsource'
># - - - - -   m a i n   - - - - -

def main():
    """Main program for litsource."""

    #-- 1 --
    for inFileName in sys.argv[1:]:
        #-- 1 body --
        # [ if inFileName names a readable, valid DocBook XML file ->
        #     output files named in that file  :=  code fragments
        #       designated for those files
        #     sys.stderr  +:=  error messages from processing that file,
        #                      if any
        #   else ->
        #     sys.stderr  +:=  error message ]
        processFile ( inFileName )
</programlisting>
    </section> <!--End main-->
    <section id="processFile">
      <title>
        <userinput>processFile</userinput>: Process one input file
      </title>
      <para>
        The <userinput>processFile()</userinput> function handles
        all the processing for one DocBook source file.
      </para>
      <programlisting role='outFile:litsource'
># - - -   p r o c e s s F i l e   - - -

def processFile ( inFileName ):
    """Process one input file.

      [ inFileName is a string ->
          if inFileName names a readable, valid DocBook XML file ->
            output files named in that file  :=  code fragments
              designated for those files
            sys.stderr  +:=  error messages from processing that file,
                             if any
          else ->
            sys.stderr  +:=  error message ]
    """
</programlisting>
      <para>
        The first step is to open the input file, and report
        errors if that fails.
      </para>
      <programlisting role='outFile:litsource'
>    #-- 1 --
    # [ if inFileName names a readable file ->
    #     inFile  :=  that file opened for reading
    #   else ->
    #     sys.stderr  +:=  error message
    #     return ]
    try:
        inFile  =  open ( inFileName )
    except IOError as detail:
        sys.stderr.write ( "*** Can't open file '%s' for reading: %s\n" %
            (inFileName, detail) )
        return
</programlisting>
      <para>
        The next step is to create a SAX parser.  We first create
        a content handler object, an <userinput
        >ArticleHandler</userinput > object.  This object
        contains the three handlers that observe start tags, text
        content, and end tags.  See <xref
        linkend='class-ArticleHandler' />.
      </para>
      <programlisting role='outFile:litsource'
>    #-- 2 --
    # [ ch  :=  an ArticleHandler instance ]
    ch  =  ArticleHandler()
</programlisting>
      <para>
        The remaining steps obey the usual SAX protocol.  We
        create a new SAX parser with the <userinput
        >make_parser()</userinput > function, associate our
        content handler with it using its <userinput
        >.setContentHandler()</userinput > method, and then use
        its <userinput >.parse()</userinput > method to read
        <userinput >inFile</userinput >.
      </para>
      <para>
        This process is fairly well described on page 53 of the
        <link linkend='oreilly-book'>O'Reilly <citetitle >Python
        &amp; XML</citetitle > book</link>.  (However, the example
        on that page does not run unless you first import the
        <userinput >ContentHandler</userinput > class.)
      </para>
      <programlisting role='outFile:litsource'
>    #-- 3 --
    # [ ch is an ArticleHandler object ->
    #     if inFile contains a readable, well-formed XML file ->
    #       output files named in inFile  :=  code fragments
    #           designated for those files
    #       sys.stderr  +:=  error message(s), if any ]
    saxparser  =  make_parser()
    saxparser.setContentHandler ( ch )
    saxparser.parse ( inFile )
</programlisting>
    </section> <!--End processFile-->
    <section id="class-ArticleHandler">
      <title><userinput >class ArticleHandler</userinput >: The
      customized content handler</title>
      <para>
        This class represents our content handler.  It inherits
        from the SAX <userinput >ContentHandler</userinput >
        class; all we have to do is define three handlers, with
        given names.
      </para>
      <para>
        These state items in the instance manage the process of
        extracting code fragments:
      </para>
      <variablelist>
        <varlistentry>
          <term>
            <userinput >.outFileName</userinput >
          </term>
          <listitem>
            <para>
              Initially set to <userinput >None</userinput >.
              Whenever we are inside a code fragment, this
              attribute holds the name of the output file.
            </para>
          </listitem>
        </varlistentry>
        <varlistentry>
          <term>
            <userinput >.outFile</userinput >
          </term>
          <listitem>
            <para>
              When we are inside a code fragment, this attribute
              holds a writeable file handle that writes to
              <userinput >self.outFileName</userinput >.
            </para>
          </listitem>
        </varlistentry>
        <varlistentry>
          <term>
            <userinput >.fileMap</userinput >
          </term>
          <listitem>
            <para>
              Because we want each output file to be the
              concatenation of all the code fragments assigned to
              that file, we want to open each output file only
              once.  Hence, the <userinput >.fileMap</userinput >
              attribute holds a dictionary whose keys are the
              names of output files we have seen so far, and each
              corresponding value is a writeable file handle for
              that file.
            </para>
          </listitem>
        </varlistentry>
      </variablelist>
      <programlisting role='outFile:litsource'
># - - - - -   c l a s s   A r t i c l e H a n d l e r   - - - - -

class ArticleHandler(ContentHandler):
    """Content handler object.

      Exports:
        ArticleHandler():  [ return a new ArticleHandler ]

      State/Invariants:
        .fileMap:
          [ a dictionary whose keys are the names of files in
            fragments seen so far; each value is a writeable
            file handle for that file ]
        .outFileName:
          [ if currently within a fragment ->
              the output file name for that fragment
            else -> None ]
        .outFile:
          [ if currently within a fragment ->
              the output file handle for that fragment
            else -> None ]
    """
</programlisting>
    </section> <!--End class-ArticleHandler-->
    <section id="ArticleHandler-init">
      <title><userinput >ArticleHandler.__init__()</userinput >:
      Constructor</title>
      <para>
        The constructor for the <userinput
        >ArticleHandler</userinput > has only two duties.  First,
        it calls the parent class constructor.
      </para>
      <programlisting role='outFile:litsource'
># - - -   A r t i c l e H a n d l e r . _ _ i n i t _ _   - - -

    def __init__ ( self ):
        """Constructor for ArticleHandler.
        """

        #-- 1 --
        # [ self  :=  a new ContentHandler instance ]
        ContentHandler.__init__ ( self )
</programlisting>
      <para>
        Then it initializes the instance variables.
      </para>
      <programlisting role='outFile:litsource'
>        #-- 2 --
        self.fileMap  =  {}
        self.outFileName  =  self.outFile  =  None
</programlisting>
    </section> <!--End ArticleHandler-init-->
    <section id="ArticleHandler-startElement">
      <title><userinput >ArticleHandler.startElement()</userinput
      >: Observe a start tag</title>
      <para>
        The SAX interface requires that the content handler class
        define a method named <userinput
        >.startElement()</userinput > to observe start tags.
      </para>
      <programlisting role='outFile:litsource'
># - - -   A r t i c l e H a n d l e r . s t a r t E l e m e n t   - - -

    def startElement ( self, name, attrs ):
        """Handle a start tag.

          [ (name is the element name) and
            (attrs is a dictionary containing the attribute names |->
            attribute values) ->
              if this tag starts a fragment ->
                self.outFileName  :=  the fragment's file name
                self.outFile      :=  the fragment's output file
                self.fileMap      :=  self.fileMap with an entry
                    mapping the fragment's file name |-> the
                    fragment's output file
              else -> I ]
        """
</programlisting>
      <para>
        If this start tag isn't a <userinput
        >programlisting</userinput > element, or it doesn't have
        a <userinput >role</userinput > attribute, we don't care
        about it.  Otherwise, we save the value of the <userinput
        >role</userinput > attribute in the variable <userinput
        >role</userinput >.
      </para>
      <programlisting role='outFile:litsource'
>        #-- 1 --
        if  name != PROG_ELT:
            return

        #-- 2 --
        # [ if attrs has a key ROLE_ATTR ->
        #     role  :=  that attribute's value
        #   else -> return ]
        try:
            role  =  attrs [ ROLE_ATTR ]
        except KeyError:
            return
</programlisting>
      <para>
        If the <userinput >role</userinput > attribute starts
        with <userinput >outFile:</userinput >, we store the rest
        in <userinput >self.outFileName</userinput >, signifying
        that we are now inside a code fragment.  If it's not one
        of our <userinput >role</userinput > attributes, we
        return to the caller.
      </para>
      <programlisting role='outFile:litsource'
>        #-- 3 --
        # [ if role starts with ROLE_PREFIX ->
        #     self.outFileName  :=  the rest of role
        #   else -> return ]
        if  role.startswith ( ROLE_PREFIX ):
            self.outFileName  =  role [ len ( ROLE_PREFIX ) : ]
        else:
            return
</programlisting>
      <para>
        Next we need an output file handle so that the <userinput
        >.characters()</userinput > method will know where to
        write the code content.  If the <userinput
        >self.fileMap</userinput > dictionary already has an
        output file handle in it for this file name, we use that,
        saving it in <userinput >self.outFile</userinput >.
        Otherwise, we open it now and save the file handle in
        <userinput >self.outFile</userinput > and also in the
        <userinput >self.fileMap</userinput >.  Failure to open
        the output file is a fatal error.
      </para>
      <programlisting role='outFile:litsource'
>        #-- 4 --
        # [ if self.fileMap has no key self.outFileName ->
        #     self.fileMap  :=  self.fileMap with an entry mapping
        #         self.outFileName |-> a writeable file handle for
        #             self.outFileName
        #     self.outFile  :=  that same file handle
        #   else ->
        #     self.outFile  :=  the corresponding value from
        #                       self.fileMap ]
        try:
            self.outFile  =  self.fileMap [ self.outFileName ]
        except KeyError:
            try:
                self.outFile  =  open ( self.outFileName, "w" )
                self.fileMap[self.outFileName]  =  self.outFile
            except IOError as detail:
                sys.stderr.write ( "*** Can't open file '%s' for "
                    "writing: %s\n" % (self.outFileName, detail) )
                sys.exit(1)
</programlisting>
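      <para>
        The reason for caching file handles is that opening a
        file in mode <userinput>"w"</userinput> truncates it, so
        reopening the file for each fragment would discard the
        earlier fragments.  The following sketch demonstrates the
        difference in a temporary directory.  The
        <userinput>write_fragments()</userinput> and
        <userinput>read_back()</userinput> helpers and the file
        name <filename>demo.txt</filename> are hypothetical and
        are not part of &litsource;.
      </para>
      <programlisting
>import os
import shutil
import tempfile

def write_fragments(directory, fragments, reuse_handles):
    """Write (name, text) fragments; optionally cache open handles.

    Hypothetical helper for illustration only.
    """
    file_map = {}
    for name, text in fragments:
        if reuse_handles and name in file_map:
            out_file = file_map[name]
        else:
            if name in file_map:
                file_map[name].close()
            # Mode "w" truncates any existing content.
            out_file = open(os.path.join(directory, name), "w")
            file_map[name] = out_file
        out_file.write(text)
    for out_file in file_map.values():
        out_file.close()

def read_back(directory, name):
    """Return the full contents of one file in the directory."""
    with open(os.path.join(directory, name)) as in_file:
        return in_file.read()

fragments = [("demo.txt", "first\n"), ("demo.txt", "second\n")]
work_dir = tempfile.mkdtemp()
write_fragments(work_dir, fragments, reuse_handles=True)
cached = read_back(work_dir, "demo.txt")
write_fragments(work_dir, fragments, reuse_handles=False)
reopened = read_back(work_dir, "demo.txt")
shutil.rmtree(work_dir)
</programlisting>
      <para>
        With cached handles, <userinput>cached</userinput> holds
        both fragments.  Without them,
        <userinput>reopened</userinput> holds only the last
        fragment, because the second
        <userinput>open()</userinput> call emptied the file.
      </para>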
    </section> <!--End ArticleHandler-startElement-->
    <section id="ArticleHandler-characters">
      <title><userinput >ArticleHandler.characters()</userinput
      >:  Observe text content</title>
      <para>
        The SAX interface stipulates that the content handler
        have a method called <userinput >.characters()</userinput
        > that is called to pass it all textual content.  We care
        about such content only if we are currently inside a code
        fragment.
      </para>
      <programlisting role='outFile:litsource'
># - - -   A r t i c l e H a n d l e r . c h a r a c t e r s   - - -

    def characters ( self, text ):
        """Handle text within an element.

          [ text is a string ->
              if  self.outFile is not None ->
                self.outFile  +:=  text
              else -> I ]
        """

        #-- 1 --
        if  self.outFile is not None:
            self.outFile.write ( text )
</programlisting>
    </section> <!--End ArticleHandler-characters-->
    <section id="ArticleHandler-endElement">
      <title><userinput >ArticleHandler.endElement()</userinput
      >: Observe an end tag</title>
      <para>
        Again, the name of this method is mandated by the SAX
        interface as the one it calls when it sees an end tag.
        If it's a <userinput >&lt;/programlisting&gt;</userinput
        > end tag, we clear the values of <userinput
        >self.outFileName</userinput > and <userinput
        >self.outFile</userinput > to signify that we're no
        longer in a code fragment.  Other end tags are ignored.
      </para>
      <programlisting role='outFile:litsource'
># - - -   A r t i c l e H a n d l e r . e n d E l e m e n t   - - -

    def endElement ( self, name ):
        """Handle the end of an element.

          [ name is an element name ->
              if name==PROG_ELT ->
                self.outFile      :=  None
                self.outFileName  :=  None
              else -> I ]
        """
        if  name == PROG_ELT:
            self.outFile  =  self.outFileName  =  None
</programlisting>
    </section> <!--End ArticleHandler-endElement-->
    <section id="epilogue">
      <title>Epilogue</title>
      <para>
        Rather than placing the main program at the end of the
        script, we defined it above (<xref linkend='main' />) as a
        function <userinput >main()</userinput > so that the code
        can be presented in top-down order.
      </para>
      <para>
        The lines below cause <userinput >main()</userinput >
        to be called, provided that &litsource; is being run as
        the main script.  Python sets the global variable
        <userinput >__name__</userinput > to the string <userinput
        >'__main__'</userinput > for the outermost script.
      </para>
      <programlisting role='outFile:litsource'
># - - - - -   e p i l o g u e   - - - - -

if  __name__ == '__main__':
    main()
</programlisting>
    </section> <!--End epilogue-->
  </section>
</article>
