
Abstract

Describes a script that extracts the source code from a program presented in lightweight literate programming form, using the DocBook documentation toolchain, the Python programming language, and the lxml module for XML processing.

This publication is available in Web form and also as a PDF document. Please forward any comments to tcc-doc@nmt.edu.

Table of Contents

1. Introduction
2. Encoding the literate program
3. Operation of the litlxml script
3.1. Suggested Makefile rules
4. Literate exposition of the litlxml program itself
4.1. Design notes
4.2. The prologue
4.3. Modules required
4.4. Global declarations
4.5. Verification functions
4.6. The main program
4.7. processFile(): Process one input file
4.8. processDoc(): Process one document tree
4.9. processElt(): Process one literate element
4.10. Epilogue

1. Introduction

 

"Programs must be written for people to read, and only incidentally for machines to execute."

 -- Structure and Interpretation of Computer Programs, Harold Abelson and Gerald Jay Sussman, p. xvii

By literate programming, we mean programs that are intended to be readable. The idea comes from Dr. Donald E. Knuth and has a long history. For background, see the Literate Programming web site.

Knuth's cweb system interweaves narrative about the program with the actual source code of the program. One then runs a tool named ctangle to generate the source code, and a different tool named cweave to generate the online documentation.

The present effort was inspired by similar efforts of Dr. Allan M. Stavely, who suggested using DocBook as a general framework for literate programming. Refer to Writing documentation with DocBook-XML 4.2 for more information on DocBook.

Stavely's idea was to use DocBook's existing programlisting element to hold the program fragments, adding a role='executable' attribute to that element to distinguish executable source code from other uses of the programlisting element. This means that the regular processing of DocBook into HTML and PDF forms becomes the new equivalent of Knuth's cweave step.

The remaining half of the problem, the extraction of the executable code from the DocBook source file, is the subject of this document.
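To make the extraction step concrete, here is a minimal sketch of the idea. It is not the litlxml script itself. The function name extract_executable and the sample DocBook fragment are hypothetical, made up for illustration. The sketch uses the standard library's xml.etree.ElementTree so that it runs without extra packages; the lxml module used by the actual script offers a largely compatible interface for the calls shown here.

```python
import xml.etree.ElementTree as ET

# Hypothetical DocBook fragment (not taken from any real document).
# Two listings carry role='executable'; one is only an illustration.
SAMPLE = """\
<section>
  <title>Greeting</title>
  <para>First we define the message.</para>
  <programlisting role='executable'>message = "hello"
</programlisting>
  <para>A listing without the role attribute is not part of the program.</para>
  <programlisting>this line is not extracted
</programlisting>
  <para>Then we print it.</para>
  <programlisting role='executable'>print(message)
</programlisting>
</section>
"""


def extract_executable(xml_text):
    """Return the text of every programlisting element whose role
    attribute is 'executable', concatenated in document order.
    (Hypothetical helper; an assumption about the general approach.)"""
    root = ET.fromstring(xml_text)
    fragments = []
    for elt in root.iter("programlisting"):
        if elt.get("role") == "executable":
            # itertext() also gathers text nested inside child
            # elements of the listing, if there are any.
            fragments.append("".join(elt.itertext()))
    return "".join(fragments)


if __name__ == "__main__":
    print(extract_executable(SAMPLE), end="")
```

Running the sketch prints only the two executable fragments, in the order in which they appear in the document; the listing without the role attribute is skipped.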

The litlxml script is embedded in this document. Relevant online files include: