
4.3. Elements with mixed content

In general, an element can have any mixture of text and other elements as children. You can specify exactly which elements can be children. If you like, you can even specify that the children must occur in a given order. You can also specify that the child elements are optional.

So, in the general form of the declaration <!ELEMENT gi (content)>, the content is an expression: it consists of operators and operands that can be combined in arbitrarily complex ways.

Let's start with some simple cases to show you the features of a content declaration, but keep in mind that these features can be used in combination.

The simplest case is when an element a has a single child element b:

    <!ELEMENT a (b)>

The above declaration in a DTD means that an element <a>...</a> must contain exactly one <b> element, and no other child elements or text.

To specify that a child element must occur at least once but can repeat, append a plus sign (+) to the child element name. For example, to say that a <squid> element must contain one or more <tentacle> elements:

    <!ELEMENT squid (tentacle+)>
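
As an illustration, here is how that declaration would treat two sample fragments. This is only a sketch: it assumes that <tentacle> is declared elsewhere as an empty element, a declaration that this section does not show.

    <!-- valid: at least one <tentacle> -->
    <squid><tentacle/><tentacle/><tentacle/></squid>

    <!-- not valid: + requires at least one <tentacle> -->
    <squid></squid>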

You can also specify that a child element can occur any number of times, or not at all. Append an asterisk (*), meaning “zero or more of the previous,” after the child element name:

    <!ELEMENT lizard (leg*)> <!-- some <lizard>s have no <leg>s -->

The question-mark suffix (?) means the child element is optional: it can occur zero or one time in the content of the element you're declaring. For example, suppose an <oven> element can either be empty or contain a <pie> element:

    <!ELEMENT oven (pie?)>
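
For instance, the first two fragments here should satisfy that declaration, and the third should not. This is a sketch that assumes <pie> is declared as an empty element, a hypothetical declaration that is not part of the example.

    <!-- valid: no <pie> -->
    <oven></oven>

    <!-- valid: one <pie> -->
    <oven><pie/></oven>

    <!-- not valid: ? allows at most one <pie> -->
    <oven><pie/><pie/></oven>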

If you want a certain sequence of children, name the child elements in a comma-separated list. For example, suppose a <memo> element must contain exactly one <from> element, then one <to> element, one <subject>, and one <message> element:

    <!ELEMENT memo (from,to,subject,message)>

You can also use the +, *, and ? operators on the items in such a list. For example, suppose that a <memo> must have <from> and <to> elements, that the <subject> element is optional, and that any number of <message> elements (including none) may follow. You'd then declare it like this:

    <!ELEMENT memo (from,to,subject?,message*)>
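
Here is roughly what that looks like in a document. The names and message text are invented for illustration, and the sketch assumes each child element is declared to hold plain text, for example as (#PCDATA).

    <!-- valid: no <subject>, two <message> elements -->
    <memo>
      <from>Pat</from>
      <to>Chris</to>
      <message>The meeting has moved to Tuesday.</message>
      <message>Bring the budget figures.</message>
    </memo>

    <!-- not valid: <to> comes before <from> -->
    <memo>
      <to>Chris</to>
      <from>Pat</from>
    </memo>

The order in the comma-separated list is significant even for the optional items: a <subject>, if present, has to come after <to> and before any <message>.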

Sometimes you need to specify that there is a choice of children. The “or” operator (|) can be used to separate the choices. For example, suppose that a <trophy> element can have either a child named <bowling> or a child named <tennis>. Here's how you'd declare it:

    <!ELEMENT trophy (bowling|tennis)>
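
A few sample fragments may make the choice clearer. They are a sketch that assumes <bowling> and <tennis> are both declared as empty elements, which is not stated in the example.

    <!-- valid: exactly one of the two choices -->
    <trophy><bowling/></trophy>
    <trophy><tennis/></trophy>

    <!-- not valid: both choices at once -->
    <trophy><bowling/><tennis/></trophy>

    <!-- not valid: neither choice -->
    <trophy></trophy>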

You can also apply the usual suffix operators to groups of elements. For example, suppose you have an element <timerecord> that starts with a required <purpose> element, followed by zero or more pairs of <start-time> and <end-time> records:

    <!ELEMENT timerecord (purpose,(start-time,end-time)*)>
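
To see what the grouped suffix buys you, consider these fragments. The purpose text and the time format are made up, and the sketch assumes all three child elements are declared to hold plain text.

    <!-- valid: a purpose and two start/end pairs -->
    <timerecord>
      <purpose>Site visit</purpose>
      <start-time>09:00</start-time><end-time>10:30</end-time>
      <start-time>13:00</start-time><end-time>14:15</end-time>
    </timerecord>

    <!-- valid: a purpose and no pairs -->
    <timerecord><purpose>Site visit</purpose></timerecord>

    <!-- not valid: a <start-time> without its <end-time> -->
    <timerecord>
      <purpose>Site visit</purpose>
      <start-time>09:00</start-time>
    </timerecord>

Because the asterisk applies to the whole parenthesized group, the two elements can only repeat together as a pair.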

Here's another more general example:

    <!ELEMENT stock ((pig|chicken|cow)*)>

The above example says a <stock> element can contain any number of the three child elements, in any order.
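
For example, assuming <pig>, <chicken>, and <cow> are each declared as empty elements (an assumption made only for this sketch), the declaration would treat these fragments as follows. The <goat> element is hypothetical and appears only to show a failure.

    <!-- valid: any mix, any order -->
    <stock><cow/><pig/><pig/><chicken/><cow/></stock>

    <!-- valid: zero children -->
    <stock></stock>

    <!-- not valid: <goat> is not among the choices -->
    <stock><pig/><goat/></stock>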

Moreover, you can allow regular, untagged text to be mixed in with your specified child tags by placing #PCDATA at the start of a list of choices. In this form the asterisk must come right after the closing parenthesis of the choice list, and the list cannot be wrapped in a further pair of parentheses. For example, suppose a <speech> element can contain any mixture of regular text, and text tagged with the elements <loud> and <soft>:

    <!ELEMENT speech (#PCDATA|loud|soft)*>
    <!ELEMENT loud (#PCDATA)>
    <!ELEMENT soft (#PCDATA)>
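
A document using these declarations might contain fragments like the following. The wording of the speech is invented for illustration.

    <!-- valid: text freely mixed with <loud> and <soft> -->
    <speech>Friends, <loud>Romans</loud>, <soft>countrymen</soft>, lend me your ears.</speech>

    <!-- also valid: text only, or no content at all -->
    <speech>Lend me your ears.</speech>
    <speech></speech>

    <!-- not valid: <loud> may contain only text, so it cannot nest <soft> -->
    <speech><loud>very <soft>quiet</soft></loud></speech>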

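So, the content part of the element declaration can be arbitrarily complex. There are some restrictions on #PCDATA: for example, it cannot appear in a comma-separated sequence, so you cannot require text and child elements to occur in a fixed order. There are also other, less common features you may need; refer to the XML standard or a good book on the subject.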