What happens to your application if you read a file that does not conform to the schema? There are two ways to deal with error handling.
If you are a careful and defensive programmer, you will always check for the presence and validity of every part of the XML document you are reading, and issue an appropriate error message. If you aren't careful or defensive enough, your application may crash.
It can make your application a lot simpler if you mechanically validate the input file against the schema that defines its document type.
With the lxml module, the latter approach is inexpensive both in programming effort and in runtime. You can validate a document using either of two major schema languages, Relax NG or XML Schema (XSD).
The lxml module can validate a document, in the form of an ElementTree, against a schema expressed in the Relax NG notation. For more information about Relax NG, see Relax NG Compact Syntax (RNC).
A Relax NG schema can take two forms: the compact syntax (RNC), or an XML document type (RNG). If your schema uses RNC, you must translate it to RNG format. The trang utility does this conversion for you. Use a command of this form, where the two file names are placeholders for your own:
    trang schema.rnc schema.rng
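If you would rather drive the conversion from a Python script than from the shell, the following is a minimal sketch. It assumes that a trang executable is installed and on your PATH; the function names trang_command and convert_rnc_to_rng are hypothetical helpers invented for this illustration, not part of lxml or trang.

```python
import shutil
import subprocess


def trang_command(rnc_path, rng_path):
    """Build the argument list for an RNC-to-RNG conversion.

    trang infers the input and output formats from the file
    extensions, so no format options are passed here.
    """
    return ["trang", rnc_path, rng_path]


def convert_rnc_to_rng(rnc_path, rng_path):
    """Run trang, raising an exception if it is missing or fails."""
    # Assumption: trang is reachable by that name on the PATH.
    if shutil.which("trang") is None:
        raise FileNotFoundError("trang executable not found on PATH")
    # check=True raises CalledProcessError on a nonzero exit status.
    subprocess.run(trang_command(rnc_path, rng_path), check=True)
```

Calling convert_rnc_to_rng("schema.rnc", "schema.rng") should then leave the translated schema in schema.rng, ready to be parsed in the steps that follow.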
Once you have the schema available as an .rng file, use these steps to validate an ElementTree instance:
1. Parse the .rng file into its own ElementTree, as described in Section 7.3.
2. Use the constructor etree.RelaxNG() to convert that tree into a “schema instance,” passing it the ElementTree instance, containing the schema, from the previous step. If the tree is not a valid Relax NG schema, the constructor will raise an etree.RelaxNGParseError exception.
3. Use the .validate() method of the schema instance to validate the ElementTree that holds your document. This method returns a true value if the document validates against the schema, or 0 if it does not. If the method returns 0, the schema instance has an attribute named .error_log containing all the errors detected by the schema instance. You can print .error_log.last_error to see the most recent error.
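The validation steps can be sketched roughly as follows. This is an illustration only: the tiny greeting schema, the sample documents, and the helper names load_schema and check are all made up for the example, and the schema is read from an in-memory byte string rather than from an .rng file on disk.

```python
from io import BytesIO

from lxml import etree

# Hypothetical schema: a single <greeting> element holding only text.
RNG_SCHEMA = b"""\
<element name="greeting" xmlns="http://relaxng.org/ns/structure/1.0">
  <text/>
</element>
"""


def load_schema(rng_source):
    """Parse the RNG document, then build a schema instance from it."""
    schema_tree = etree.parse(rng_source)
    # Raises etree.RelaxNGParseError if the tree is not a valid schema.
    return etree.RelaxNG(schema_tree)


def check(schema, xml_bytes):
    """Validate one document; return (ok, list of error messages)."""
    doc = etree.parse(BytesIO(xml_bytes))
    ok = bool(schema.validate(doc))
    # Each entry in .error_log carries a human-readable .message.
    messages = [entry.message for entry in schema.error_log]
    return ok, messages


schema = load_schema(BytesIO(RNG_SCHEMA))
good_ok, good_messages = check(schema, b"<greeting>hello</greeting>")
bad_ok, bad_messages = check(schema, b"<farewell>bye</farewell>")

if not bad_ok:
    # The most recent error recorded by the schema instance.
    print(schema.error_log.last_error)
```

In a real application you would pass the path of your .rng file to load_schema() instead of a BytesIO object, build the schema instance once, and reuse it for every input document.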
Presented later in this document are two examples of the use of this validation technique: