Sometimes you want to walk through all or part of a document, looking at all the elements in document order. Similarly, you may want to walk through all or part of a document and look for all the occurrences of a specific kind of element.
The .getiterator() method on an Element instance produces a Python iterator that visits elements in either of these ways. Here is the general form, for an Element instance E:
E.getiterator(tag=None)
If you omit the argument, you will get an iterator that visits E first, then all its element children and their children, in a preorder traversal of that subtree.
If you want to visit only elements with a certain tag name, pass the desired tag name as the argument.
Preorder traversal of a tree means that we visit the root first, then the subtrees from left to right (that is, in document order). This is also called a depth-first traversal: we visit the root, then its first child, then its first child's first child, and so on until we run out of descendants. Then we move back up to the nearest element that still has unvisited children, and repeat.
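As a rough illustration of this visiting order, here is a minimal sketch of a hand-written preorder walk. It assumes the standard library's xml.etree.ElementTree as a stand-in for lxml, so that it runs without extra packages, and uses that module's iter() method for comparison. The helper name preorder is hypothetical and is not part of either library.

```python
import xml.etree.ElementTree as ET

def preorder(elt):
    """Yield elt, then each child's subtree, left to right (hypothetical helper)."""
    yield elt
    for child in elt:
        # Finish the whole subtree of this child before moving to its next sibling.
        yield from preorder(child)

tree = ET.fromstring('<w><x><y/></x><z/></w>')
manual_order = [elt.tag for elt in preorder(tree)]
builtin_order = [elt.tag for elt in tree.iter()]   # the library's own traversal
print(manual_order)
print(manual_order == builtin_order)
```

The hand-written generator and the built-in iterator should produce the same sequence of tags, which is what the final comparison checks.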
Here is an example showing the traversal of an entire tree. First, a diagram showing the tree structure (indentation indicates nesting):
a
    b
        c
        d
    e
A preorder traversal of this tree goes in this order: a, b, c, d, e.
>>> xml = '''<a><b><c/><d/></b><e/></a>'''
>>> tree = etree.fromstring(xml)
>>> walkAll = tree.getiterator()
>>> for elt in walkAll:
...     print elt.tag,
...
a b c d e
>>>
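The same approach applies to part of a document: starting the iterator at an inner element visits only that element's subtree. The following is a minimal sketch under the assumption that the standard library's xml.etree.ElementTree stands in for lxml, with its iter() method playing the role of getiterator().

```python
import xml.etree.ElementTree as ET

tree = ET.fromstring('<a><b><c/><d/></b><e/></a>')
b = tree.find('b')                      # the first child of the root tagged 'b'
subtree_tags = [elt.tag for elt in b.iter()]
print(subtree_tags)                     # element e lies outside b's subtree
```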
In the following example, we visit only the bird elements.
>>> xml = '''<bio>
...   <bird type="Bushtit"/>
...   <butterfly type="Mourning Cloak"/>
...   <bird type="Mew Gull"/>
...   <group site="Water Canyon">
...     <snake type="Sidewinder"/>
...     <bird type="Verdin"/>
...   </group>
...   <bird type="Pygmy Nuthatch"/>
... </bio>'''
>>> root = etree.fromstring(xml)
>>> for elt in root.getiterator('bird'):
...     print elt.get('type', 'Unknown')
...
Bushtit
Mew Gull
Verdin
Pygmy Nuthatch
>>>
Note in the above example that the iterator visits the Verdin element even though it is not a direct child of the root element.
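To make that distinction concrete, the sketch below contrasts a tag-filtered traversal of the whole subtree with a search limited to direct children. It assumes the standard library's xml.etree.ElementTree as a stand-in for lxml, using iter('bird') for the traversal and findall('bird') for the direct-children search.

```python
import xml.etree.ElementTree as ET

xml = '''<bio>
  <bird type="Bushtit"/>
  <butterfly type="Mourning Cloak"/>
  <bird type="Mew Gull"/>
  <group site="Water Canyon">
    <snake type="Sidewinder"/>
    <bird type="Verdin"/>
  </group>
  <bird type="Pygmy Nuthatch"/>
</bio>'''
root = ET.fromstring(xml)

# Tag-filtered preorder traversal: reaches bird elements at any depth.
all_birds = [elt.get('type', 'Unknown') for elt in root.iter('bird')]

# A bare tag name passed to findall() matches direct children only.
child_birds = [elt.get('type', 'Unknown') for elt in root.findall('bird')]

print(all_birds)
print(child_birds)
```

Only the first list should include Verdin, because that element is nested inside the group element rather than sitting directly under the root.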