
3. Design considerations

In any representation of the real world, there is always a tradeoff: which qualities are formalized, and which are left informal? When designing XML document types, we don't want every possible thing to result in a new kind of element. Some things are important enough to rate their own tags or attributes, but some things are best left as good old English text.

Similarly, there is a tradeoff in validation. The nxml-mode package for the Emacs text editor checks, as you edit, that the XML document you create is valid, that is, that it conforms to the schema. But the schema can go only so far: a valid document may still have problems that prevent it from being useful. Ultimately a human should go over the rendered form of the information with a critical eye and fix the problems manually. However, it may be possible to write additional software tools to check the content, and effort invested in such tools may pay off in the long run.
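
As a rough illustration of such an additional tool, here is a minimal Python sketch of a content check that a schema may not be able to express: it flags sighting dates that are not real calendar dates or that lie in the future. The element name sighting, the attributes species and date, the sample data, and the function check_dates are all hypothetical, invented for this example; they are not taken from any actual schema.

```python
import xml.etree.ElementTree as ET
from datetime import date

# Hypothetical document: the element and attribute names are made up
# for illustration and are not taken from any actual schema.
SAMPLE = """\
<sightings>
  <sighting species="Carolina Wren" date="2004-05-01"/>
  <sighting species="Carolina Wren" date="2999-01-01"/>
  <sighting species="Carolina Wren" date="2004-02-30"/>
</sightings>
"""


def check_dates(xml_text, today):
    """Return a list of human-readable problems found in sighting dates."""
    problems = []
    root = ET.fromstring(xml_text)
    for sighting in root.iter("sighting"):
        raw = sighting.get("date", "")
        try:
            seen = date.fromisoformat(raw)
        except ValueError:
            # The string is missing or does not name a real calendar day.
            problems.append(f"not a real calendar date: {raw!r}")
            continue
        if seen > today:
            problems.append(f"date is in the future: {raw}")
    return problems


if __name__ == "__main__":
    # "today" is passed in explicitly so the check is repeatable.
    for problem in check_dates(SAMPLE, today=date(2005, 1, 1)):
        print(problem)
```

A schema with a strict date datatype might catch the impossible date on its own, but a rule such as "not in the future" depends on when the check is run, so it probably belongs in a separate tool like this one.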

To be more specific, let's turn to the problem at hand. To represent sightings of birds, there are three fundamental dimensions.

3.1. The time dimension

Time: When was the bird seen? For our purposes, a granularity of one day is a good first approximation. Birders generally record the day of a sighting. However, there are a few cases where the time of day is important. For example, certain hummingbirds may come to feeders only at certain times or only at certain intervals. We don't need to formalize that information, provided bird records are allowed to contain generic, unstructured text such as: “The Carolina Wren generally stops singing here before 10am.”
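
The split between a formal day and informal time-of-day remarks might look like the following Python sketch. The record layout (a sighting element with a date attribute and species and notes children), the Sighting class, and parse_sighting are assumptions made up for this illustration, not the document type this text goes on to define.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass
from datetime import date

# Hypothetical record: the day is formalized as an attribute, while
# time-of-day remarks stay in free text.
RECORD = """\
<sighting date="2004-05-01">
  <species>Carolina Wren</species>
  <notes>Generally stops singing here before 10am.</notes>
</sighting>
"""


@dataclass
class Sighting:
    day: date      # formalized: one-day granularity
    species: str
    notes: str     # informal: unstructured English text


def parse_sighting(xml_text):
    """Parse one hypothetical sighting record into a Sighting."""
    node = ET.fromstring(xml_text)
    return Sighting(
        day=date.fromisoformat(node.get("date")),
        species=node.findtext("species", default="").strip(),
        notes=node.findtext("notes", default="").strip(),
    )


if __name__ == "__main__":
    print(parse_sighting(RECORD))
```

Software can sort and filter on the day field, while the notes field is only ever shown to a human reader, which is the tradeoff described above.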