Abstract
Describes the lxml package for reading and writing XML
files with the Python programming language.
This publication is available in Web form and also as a PDF document. Please
forward any comments to tcc-doc@nmt.edu.
This work is licensed under a
Creative Commons Attribution-NonCommercial 3.0 Unported License.
Table of Contents
ElementTree represents XML
etree module
Comment() constructor
Element() constructor
ElementTree() constructor
fromstring() function: Create an element from a string
parse() function: build an ElementTree from a file
ProcessingInstruction() constructor
QName() constructor
SubElement() constructor
tostring() function: Serialize as XML
XMLID() function: Convert text to XML with a dictionary of id values
class ElementTree: A complete XML document
ElementTree.find()
ElementTree.findall(): Find matching elements
ElementTree.findtext(): Retrieve the text content from an element
ElementTree.getiterator(): Make an iterator
ElementTree.getroot(): Find the root element
ElementTree.xpath(): Evaluate an XPath expression
ElementTree.write(): Translate back to XML
class Element: One element in the tree
Element instance
Element.append(): Add a new element child
Element.clear(): Make an element empty
Element.find(): Find a matching sub-element
Element.findall(): Find all matching sub-elements
Element.findtext(): Extract text content
Element.get(): Retrieve an attribute value with defaulting
Element.getchildren(): Get element children
Element.getiterator(): Make an iterator to walk a subtree
Element.getroottree(): Find the ElementTree containing this element
Element.insert(): Insert a new child element
Element.items(): Produce attribute names and values
Element.iterancestors(): Find an element's ancestors
Element.iterchildren(): Find all children
Element.iterdescendants(): Find all descendants
Element.itersiblings(): Find other children of the same parent
Element.keys(): Find all attribute names
Element.remove(): Remove a child element
Element.set(): Set an attribute value
Element.xpath(): Evaluate an XPath expression
etbuilder.py: A simplified XML builder module
etbuilder
CLASS(): Helper function for adding CSS class attributes
FOR(): Helper function for adding XHTML for attributes
subElement(): Add a child element
addText(): Add text content to an element
class ElementMaker: The factory class
ElementMaker.__init__(): Constructor
ElementMaker.__call__(): Handle calls to the factory instance
ElementMaker.__handleArg(): Process one positional argument
ElementMaker.__getattr__(): Handle arbitrary method calls
testetbuilder: A test driver for etbuilder
rnc_validate: A module to validate XML against a Relax NG schema
rnc_validate module
rnc_validate module
rnc_validate.py: Prologue
RelaxException
class RelaxValidator
RelaxValidator.validate()
RelaxValidator.__init__(): Constructor
RelaxValidator.__makeRNG(): Find or create an .rng file
RelaxValidator.__getModTime(): When was this file last changed?
RelaxValidator.__trang(): Translate .rnc to .rng format
With the continued growth of both Python and XML, there is
a plethora of packages out there that help you read,
generate, and modify XML files from Python scripts.
Compared to most of them, the lxml package has two big advantages:
Performance. Reading and writing even fairly large XML files takes an almost imperceptible amount of time.
Ease of programming. The lxml package is based on ElementTree,
which Fredrik Lundh invented to simplify and streamline
XML processing.
lxml is similar in many ways to two other, earlier packages:
Fredrik Lundh continues to maintain his original
version of ElementTree.
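xml.etree.ElementTree
is now an official part of the Python library. It has a
C-language implementation, historically exposed as the separate
cElementTree module (deprecated in Python 3.3, where the
accelerated version became the default, and removed in
Python 3.9), which may be even faster than lxml
for some applications.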
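To give a feel for the ElementTree style of programming that these packages share, here is a minimal sketch using the standard library's xml.etree.ElementTree. The sample document and the variable names are invented for illustration. The functions used here (fromstring(), SubElement(), findall(), findtext(), tostring()) are also provided by lxml's etree module, so the same statements should work after changing only the import line.

```python
import xml.etree.ElementTree as ET

# Parse a small document from a string; the result is the root Element.
root = ET.fromstring(
    "<library><book id='b1'><title>Dune</title></book></library>"
)

# Add a second <book> child with an attribute and a <title> sub-element.
book = ET.SubElement(root, "book", id="b2")
title = ET.SubElement(book, "title")
title.text = "Emma"

# Walk the tree: collect the text of each <title> directly under a <book>.
titles = [b.findtext("title") for b in root.findall("book")]

# Serialize the modified tree back to XML text.
xml_text = ET.tostring(root, encoding="unicode")

print(titles)
print(xml_text)
```

Each element behaves like a small container: attributes are set with keyword arguments or set(), text content lives in the .text attribute, and children are reached with find() and findall().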
However, the author prefers lxml because it provides a number of
additional features that make life easier. In particular,
its support for XPath makes it considerably easier to work with
more complex XML structures.
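As an illustration of the XPath point, the following sketch assumes that lxml is installed and uses an invented sample document. A single expression passed to the .xpath() method selects nodes by position in the tree and by attribute value, work that would otherwise need nested loops and tests.

```python
from lxml import etree

# Invented sample data: two shelves, three books.
doc = etree.fromstring(
    "<library>"
    "<shelf name='fiction'>"
    "<book year='1965'><title>Dune</title></book>"
    "<book year='1815'><title>Emma</title></book>"
    "</shelf>"
    "<shelf name='reference'>"
    "<book year='1999'><title>World Atlas</title></book>"
    "</shelf>"
    "</library>"
)

# Titles of books on the fiction shelf whose year attribute exceeds 1900.
recent_fiction = doc.xpath(
    "//shelf[@name='fiction']/book[@year > 1900]/title/text()"
)

# XPath functions are available too: count every <book> in the document.
# A numeric XPath result comes back as a Python float.
book_count = doc.xpath("count(//book)")

print(recent_fiction)
print(book_count)
```

Depending on the expression, .xpath() returns a list of elements, a list of strings, a number, or a boolean, so it is worth checking which kind of result a given expression produces.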