Table of Contents
xml4create.py: A more Pythonic module for XML file creation
xml4create.py: The source code
Document.splitQName(): Process a qualified name
Element.__wrapRoot(): Wrap the root element
Element.__setAttr(): Set up one attribute
Element.__newElement(): Element child of an element
Element.update(): Copy XML attributes to the element
The Python programming language is a fine vehicle for writing programs that process or generate XML files. This document is a quick reference to the major features of Python's interface to the XML Document Object Model (DOM), which provides for a tree-like data structure that represents an XML document. We won't cover every feature; this is just a selection of commonly used techniques.
The reader is assumed to be generally familiar with the Python language and with the structure of XML documents. For more information, see these references:
For an introduction to Python's approach to the DOM,
see the Python Library Reference page, “
xml.dom—The Document Object
Model API .”
For help on XML, see TCC's XML page.
This document assumes you have the 4Suite package installed. For information and downloads, see the Fourthought page.
This document also presents a number of Python source files in literate programming style: a useful module and a few short test drivers demonstrating its use. These files are available online:
test driver for
Test driver demonstrating output in multiple namespaces.
driver demonstrating creation of a document fragment.
For most XML processing, the Document Object Model is a convenient way to read or write XML files. However, there are some situations where other techniques may be necessary.
A DOM tree is resident entirely in memory. Obviously, if you are reading a five-gigabyte XML file, it probably won't fit.
Building a DOM tree from a document can be a slow process for larger files.
If memory size or performance are issues, refer to the SAX processing model, which is outside the scope of this document. Here is a good resource:
Jones, Christopher A., and Fred L. Drake, Jr. Python & XML. O'Reilly, 2002, ISBN 0-596-00128-2.
Throughout the descriptions of these interfaces, we
describe a number of arguments and results as
“strings”. It is safest to use Python
unicode strings throughout these interfaces. Regular
Python strings of class
str will be
converted to Unicode automatically, but the automatic
conversion may not do the right thing.