
4. Handling multiple namespaces

A namespace in XML is a collection of element and attribute names. For example, in the XHTML namespace we find element names like body, link and h1, and attribute names like href and align.

For simple documents, all the element and attribute names in a single document may be in the same namespace. In general, however, an XML document may include element and attribute names from many namespaces.

4.1. Glossary of namespace terms

4.1.1. URI: Uniform Resource Identifier

Formally, each namespace is named by a URI or Uniform Resource Identifier. Although a URI often looks like a URL, there is an important difference:

  • A URL (Uniform Resource Locator) corresponds more or less to an actual Web page. If you paste a URL into your browser, you expect to get a Web page of some kind.

  • A URI is just a unique name that identifies a specific conceptual entity. If you paste it into a browser, you may get a Web page or you may not; the URI that names a given namespace is not required to be a URL as well.

4.1.2. NSURI: Namespace URI

Not all URIs define namespaces.

The term NSURI, for NameSpace URI, refers to a URI that is used to uniquely identify a specific XML namespace.

Note

The W3C Recommendation Namespaces in XML 1.0 prefers the term namespace name for the more widely used NSURI.

For example, here is the NSURI that identifies the XHTML namespace, which is used by the “XHTML 1.0 Strict” dialect and the other dialects of XHTML 1.0:

http://www.w3.org/1999/xhtml

4.1.3. The blank namespace

Within a given document, one set of element and attribute names may not be associated with any specific namespace or NSURI. These elements and attributes are said to be in the blank namespace.

This is convenient for documents whose element and attribute names are all in the same namespace. It is also typical for informal and experimental applications where the developer does not want to bother defining an NSURI for the namespace, or hasn't gotten around to it yet.

For example, many XHTML pages use a blank namespace because all the names are in the same namespace and because browsers don't need the NSURI in order to display them correctly.

4.1.4. Clark notation

Each element and attribute name in a document is related to a specific namespace and its corresponding NSURI, or else it is in the blank namespace. In the general case, a document may specify the NSURI for each namespace; see Section 4.2, “The syntax of multi-namespace documents”.

Because the same name may occur in different namespaces within the same document, when processing the document we must be able to distinguish them.

Once your document is represented as an ElementTree, the .tag attribute of each Element holds both the NSURI and the element name, combined using Clark notation. This notation is named after its inventor, James Clark.

When the NSURI of an element is known, the .tag attribute contains a string of this form:

"{NSURI}name"

For example, when a properly constructed XHTML 1.0 Strict document is parsed into an ElementTree, the .tag attribute of the document's root element will be:

"{http://www.w3.org/1999/xhtml}html"

Note

Clark notation does not actually appear in the XML source file. It is employed only within the ElementTree representation of the document.
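To illustrate the note above, here is a minimal sketch in which the XML source contains only an ordinary xmlns declaration, while the braces show up only in the parsed element's .tag. It uses the standard library's xml.etree.ElementTree, which stores tags in the same Clark notation. The sample document and the helper split_clark are illustrative assumptions, not part of any library's API.

```python
import xml.etree.ElementTree as ET

XHTML_NS = "http://www.w3.org/1999/xhtml"

# A tiny sample document (hypothetical) that declares the XHTML namespace
# as its default namespace. Note that no braces appear in the source.
source = (
    '<html xmlns="http://www.w3.org/1999/xhtml">'
    "<head><title>Demo</title></head><body/></html>"
)
root = ET.fromstring(source)


def split_clark(tag):
    """Split a Clark-notation tag into (nsuri, name); nsuri is None if absent."""
    if tag.startswith("{"):
        nsuri, name = tag[1:].split("}", 1)
        return nsuri, name
    return None, tag


print(root.tag)               # {http://www.w3.org/1999/xhtml}html
print(split_clark(root.tag))  # ('http://www.w3.org/1999/xhtml', 'html')
```

Splitting on the first "}" is enough here because the element name itself cannot contain a brace.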

For element and attribute names in the blank namespace, the Clark notation is just the name without the “{NSURI}” prefix.
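The following sketch shows how the same name can be told apart when it occurs both in the blank namespace and in a named one. It again assumes the standard library's xml.etree.ElementTree; the document, the prefix bk and the NSURI http://example.com/ns/book are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: <title> appears once in the blank namespace and
# once in a made-up namespace bound to the prefix "bk".
source = (
    '<catalog xmlns:bk="http://example.com/ns/book">'
    "<title>Catalog title</title>"
    "<bk:title>Book title</bk:title>"
    "</catalog>"
)
root = ET.fromstring(source)

# Collect the Clark-notation tag of each child of the root.
tags = [child.tag for child in root]

print(root.tag)  # catalog
print(tags)      # ['title', '{http://example.com/ns/book}title']
```

The two title elements share a local name, but their .tag values differ, so code that processes the tree can distinguish them by simple string comparison.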

4.1.5. Ancestor

The ancestors of an element include its immediate parent, its parent's parent, and so forth up to the root of the tree. The root node has no ancestors.
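As a rough sketch of this definition, the function below collects the ancestors of an element by walking upward through a child-to-parent map. It assumes the standard library's xml.etree.ElementTree, whose elements keep no reference to their parent; the names parent_of and ancestors are hypothetical helpers. With lxml, an element's getparent() and iterancestors() methods serve the same purpose directly.

```python
import xml.etree.ElementTree as ET

root = ET.fromstring("<a><b><c/></b><d/></a>")

# The standard-library Element keeps no parent pointer, so build a
# child -> parent map by visiting every element once.
parent_of = {child: parent for parent in root.iter() for child in parent}


def ancestors(elem):
    """Return elem's ancestors, nearest first, ending at the root."""
    result = []
    while elem in parent_of:
        elem = parent_of[elem]
        result.append(elem)
    return result


c = root.find("b/c")
print([e.tag for e in ancestors(c)])     # ['b', 'a']
print([e.tag for e in ancestors(root)])  # []
```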

4.1.6. Descendant

The descendants of an element include its direct children, its children's children, and so on out to the leaves of the document tree.
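A minimal sketch of this definition, assuming the standard library's xml.etree.ElementTree: the iter() method walks an element's subtree in document order, starting with the element itself, so skipping the first item leaves exactly the descendants. The helper name descendants is hypothetical.

```python
import xml.etree.ElementTree as ET

root = ET.fromstring("<a><b><c/></b><d/></a>")


def descendants(elem):
    """Return all descendants of elem in document order, excluding elem."""
    nodes = elem.iter()
    next(nodes)  # iter() yields elem itself first; skip it
    return list(nodes)


print([e.tag for e in descendants(root)])            # ['b', 'c', 'd']
print([e.tag for e in descendants(root.find("d"))])  # []
```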