What happens to your application if you read a file that does not conform to the schema? There are two ways to deal with error handling.
If you are a careful and defensive programmer, you will always check for the presence and validity of every part of the XML document you are reading, and issue an appropriate error message. If you aren't careful or defensive enough, your application may crash.
It can make your application a lot simpler if you mechanically validate the input file against the schema that defines its document type.
With the lxml module, the latter approach is inexpensive
both in programming effort and in runtime. You can
validate a document using either of these major schema
languages:
The lxml module can validate a document, in the form of an
ElementTree, against a schema expressed in
the Relax NG notation. For more information about Relax
NG, see Relax NG
Compact Syntax (RNC).
A Relax NG schema can use two forms: the compact syntax (RNC), or an XML document type (RNG). If your schema uses RNC, you must translate it to RNG format. The trang utility does this conversion for you. Use a command of this form:
trang file.rnc file.rng
Once you have the schema available as an .rng file, use these steps to validate an
element tree ET.
Parse the .rng file into its own ElementTree, as described in Section 6.3, “The ElementTree() constructor”.
Use the constructor etree.RelaxNG(S) to convert that tree into
a “schema instance,” where S is the ElementTree instance, containing the schema,
from the previous step.
If the tree is not a valid Relax NG
schema, the constructor will raise an etree.RelaxNGParseError exception.
Use the .validate(ET) method of the schema
instance to validate ET.
This method returns a true value (1) if ET validates
against the schema, or a false value (0) if it does
not.
If the method returns 0, the schema
instance has an attribute named .error_log containing all the errors
detected by the schema instance. You can print .error_log.last_error to see the most recent
error detected.
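The steps above can be sketched as follows. This is only an illustration under stated assumptions: the tiny schema, the sample documents, and the helper names load_schema() and check() are hypothetical and do not come from this document. The lxml calls used are etree.parse(), etree.RelaxNG(), .validate(), and .error_log.last_error, as described in the steps.

```python
from io import BytesIO
from lxml import etree

# Hypothetical RNG schema for illustration: a <greeting> element holding text.
RNG_SOURCE = b"""\
<element name="greeting" xmlns="http://relaxng.org/ns/structure/1.0">
  <text/>
</element>
"""

def load_schema(rng_file):
    """Parse an .rng file (a path or file-like object) and build a schema instance."""
    schema_tree = etree.parse(rng_file)   # step 1: the schema as its own ElementTree
    return etree.RelaxNG(schema_tree)     # step 2: raises etree.RelaxNGParseError if invalid

def check(schema, doc_tree):
    """Step 3: validate doc_tree; return (ok, message), with message None on success."""
    if schema.validate(doc_tree):
        return True, None
    return False, str(schema.error_log.last_error)

schema = load_schema(BytesIO(RNG_SOURCE))

# One document that matches the hypothetical schema and one that does not.
good = etree.parse(BytesIO(b"<greeting>hello</greeting>"))
bad = etree.parse(BytesIO(b"<farewell>bye</farewell>"))

good_ok, _ = check(schema, good)
bad_ok, bad_message = check(schema, bad)
print("good document valid:", good_ok)
print("bad document valid:", bad_ok)
print("most recent error:", bad_message)

# A tree that is not a Relax NG schema is rejected by the constructor.
try:
    etree.RelaxNG(etree.parse(BytesIO(b"<not-a-schema/>")))
    schema_rejected = False
except etree.RelaxNGParseError:
    schema_rejected = True
print("invalid schema rejected:", schema_rejected)
```

In a real application you would pass the name of your .rng file to load_schema() instead of the in-memory bytes used here, and report the contents of .error_log to the user when validation fails.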
Presented later in this document are two examples of the use of this validation technique: