
4. The schema

I've been pretty happy with XML as a general tool for managing complex information, especially since the advent of the Relax NG compact syntax (RNC) schema language and nXML mode, the XML-aware editing mode for Emacs. Together they make it much easier to create and maintain XML files.

Below, in literate programming form, is the schema that defines the slide catalog document type.

4.1. Schema prologue

Here is the opening comment block for the RNC schema.

birdimages.rnc
# slidecat.rnc: Relax NG schema for bird slides.
#   Do not edit this file directly.  It is extracted mechanically
#   from the documentation:
#     http://www.nmt.edu/~john/scans/slides/ims/
#----------------------------------------------------------------

4.2. image-catalog: The root element

The root element for the entire XML document type is image-catalog:

birdimages.rnc
start = image-catalog

image-catalog = element image-catalog
{ attribute date { text }?,       # 1
  attribute revision { text }?,   # 2
  original*                       # 3
}

1 The date attribute is intended for an RCS Date tag or other content indicating when the file was last touched.
2 The revision attribute is intended for an RCS Revision tag or the equivalent.
3 Each primary entry in the catalog is an original element.
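
To give a feel for what a conforming document looks like, here is a minimal sketch that reads a tiny catalog with Python's standard xml.etree.ElementTree module. The sample document, its attribute values, and the function name summarize_catalog are invented for illustration; only the element and attribute names come from the schema. The sketch does not validate against the RNC schema itself.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document: all attribute values are invented.
SAMPLE = """\
<image-catalog date="2009-01-01" revision="1.1">
  <original cat-no="1998-05-17-0012" ab6="amerob" state="nm"/>
  <original cat-no="1998-05-17-0013" ab6="larbun?" state="nm"/>
</image-catalog>
"""

def summarize_catalog(xml_text):
    """Return (date, revision, number of original elements)."""
    root = ET.fromstring(xml_text)
    if root.tag != "image-catalog":
        raise ValueError("root element must be image-catalog")
    # Both attributes are optional in the schema, so .get() may return None.
    return root.get("date"), root.get("revision"), len(root.findall("original"))

print(summarize_catalog(SAMPLE))
```

Because original* allows zero entries, an empty <image-catalog/> element is also acceptable input for this function.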

4.3. original: Data for one exposure

Each original element describes one exposure, whether on film or digital media.

birdimages.rnc
original = element original
{ attribute cat-no { cat-no-pattern },      # 1
  attribute ab6 { text },                   # 2
  attribute state                           # 3
  { xsd:string { pattern='[a-z]{2,3}' }},
  attribute qual                            # 4
  { xsd:string { pattern='[abcdf]' } }?,
  attribute scan { xsd:positiveInteger }?,  # 5
  attribute old { old-pattern }?,           # 6
  attribute arch-no { text }?,              # 7
  info                                      # 8
}

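The cat-no-pattern pattern defines the format of a catalog number: groups of four, two, and two digits separated by hyphens, then either a hyphen or a lowercase letter, then four digits.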

birdimages.rnc
cat-no-pattern = xsd:string
{ pattern='[0-9]{4}-[0-9]{2}-[0-9]{2}[\-a-z][0-9]{4}'
}
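
The same check can be sketched outside the schema with Python's re module. The regular expression is copied from cat-no-pattern; the function name is_valid_cat_no and the sample catalog numbers are hypothetical. This only tests the shape of the string, which is all the schema constrains.

```python
import re

# Same regular expression as cat-no-pattern in the schema.
CAT_NO_RE = re.compile(r"[0-9]{4}-[0-9]{2}-[0-9]{2}[\-a-z][0-9]{4}")

def is_valid_cat_no(value):
    """True if value matches cat-no-pattern in its entirety."""
    # XSD patterns must match the whole string, hence fullmatch().
    return CAT_NO_RE.fullmatch(value) is not None

# Invented sample values: the first two fit the pattern, the last two do not.
for candidate in ["1998-05-17-0012", "1998-05-17a0012",
                  "1998-5-17-0012", "1998-05-17-12"]:
    print(candidate, is_valid_cat_no(candidate))
```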

The old-pattern pattern defines the format of an old-style catalog number: one or more digits, a period, and exactly two digits.

birdimages.rnc
old-pattern = xsd:string
{ pattern='[0-9]+\.[0-9]{2}'
}
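
As a rough sketch of how an old-style number might be taken apart, the code below adds capture groups to the old-pattern expression and returns the two numeric parts. Reading them as roll and frame follows the description of the old attribute; the function name split_old_cat_no is hypothetical.

```python
import re

# old-pattern from the schema, with capture groups around the two numbers.
OLD_RE = re.compile(r"([0-9]+)\.([0-9]{2})")

def split_old_cat_no(value):
    """Return (roll, frame) as integers, or None if value does not match."""
    m = OLD_RE.fullmatch(value)
    if m is None:
        return None
    return int(m.group(1)), int(m.group(2))

print(split_old_cat_no("59.38"))   # prints (59, 38)
```

Note that the pattern has no room for a leading # character, so a value written as #59.38 would have to be stored as 59.38.
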
1 The catalog number is required, and it must be unique within the file.

2 A list of one or more six-letter bird codes representing the species in the image; each code may be followed by a ? character if the identity is not certain. For the bird code system, see A system for representing taxonomic nomenclature. Examples:

    ab6='amerob'
    ab6='larbun?'
    ab6='eargre amecoo'
    ab6='dowwoo|haiwoo'
    ab6='amewig^eurwig?'

3 State or province: the U.S. postal abbreviation or the foreign equivalent, written as two or three lowercase letters.

4 Optional quality indicator. Quality a is reserved for really stunning photography, suitable for contests or commercial sale; b is a pretty good picture; c is decent; d is poor (the default); and f is reserved for really horrible pictures that are in the catalog only because they are the only documentation there is.

5 If the original is film, this attribute specifies the resolution, in dots per inch, of the scanner used to produce the digital image. For example, for a 4000 dpi scanner, the attribute would be scan='4000'. Omitted for digital originals.

6 Catalog number under the old system. Originally I assigned a serial number to each roll, so for example “#59.38” would mean roll 59, frame 38. There are a few undated originals with these notations, so I can sometimes use the old catalog numbers to approximate their dates.

7 This attribute is obsolete. Originally I was going to record in the image catalog which archive directory contained the reference image. Later I decided to put this information somewhere else; see archx: A program to index a photo archive. Formerly, if the image was contained on one of the standard PNG archive CDs, this attribute was the three-digit archive number. For example, arch-no='003' meant the image was on the PNG-003 CD. This scheme has been abandoned; bird images are now in www/scans/bird/bird-NNN.

8 The children of this element can be any of the elements in the info pattern.
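
The attribute descriptions above can be exercised with a short sketch. It splits an ab6 value into bird codes with their uncertainty flags, and it applies the documented default quality d when qual is omitted. The function names parse_ab6 and effective_qual are hypothetical, and I am assuming the codes are always six lowercase letters, as in the examples. The space, | and ^ characters seen in the examples are treated only as delimiters, since this section does not define what they mean.

```python
import re

# A six-letter code optionally followed by "?" (identity uncertain).
# Assumption: codes are lowercase, as in all the examples shown.
CODE_RE = re.compile(r"([a-z]{6})(\??)")

def parse_ab6(value):
    """Return a list of (code, is_certain) pairs found in an ab6 value."""
    return [(code, flag == "") for code, flag in CODE_RE.findall(value)]

def effective_qual(attrs):
    """Return the quality code, falling back to the documented default 'd'."""
    return attrs.get("qual", "d")

print(parse_ab6("amewig^eurwig?"))
print(effective_qual({"cat-no": "1998-05-17-0012"}))
```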

4.4. The info pattern

Several elements describing the image can occur as children of the original element.

birdimages.rnc
info = ( loc? & note? & film? & light? & beh? & desc? & pose? )
film  = element film
{ attribute iso { iso-pattern }?,
  text
}
iso-pattern = xsd:positiveInteger
loc   = element loc { text }
note  = element note { text }
light = element light { text }
beh   = element beh { text }
desc  = element desc { text }
pose  = element pose { text }

The content of the film element may be either a filmstock description such as “KR” for Kodachrome, or a digital camera type.

Each of the other elements is just a container for text. The meanings of all seven elements are:

film

Filmstock or digital camera type. Common codes include KR for Kodachrome 64; KL for Kodachrome 200; EL for Ektachrome 400; Fj for Fuji Reala; and VC for Kodak VC-400.

The optional iso attribute is the ASA or ISO sensitivity rating.

loc

Description of the locality. Use colons to separate levels, e.g., “<loc>Bosque del Apache: Headquarters</loc>”.

note

General notes about the image that don't fit the other categories. There can be multiple note elements; each will be treated as a separate paragraph.

light

Comments on the lighting. Backlit, sidelit, direct (from behind the camera); strong or weak light; filtered; polluted; low-angle; and so on.

beh

Notes on the bird's behavior, e.g., flight, squawk (mouth is open).

desc

Plumage or other visual description of the bird.

pose

How the bird is posed: frontal, front quarter, profile, rear quarter, butt-shot.
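
Here is a minimal sketch of reading these children from one original element, again using Python's standard xml.etree.ElementTree. The sample element and its values are invented (the locality string is the one from the loc description), and the function name read_info and the extra keys loc_levels and iso are my own choices, not part of the schema.

```python
import xml.etree.ElementTree as ET

# Hypothetical original element. The children may appear in any order
# because the info pattern combines them with "&" (interleave).
SAMPLE = """\
<original cat-no="1998-05-17-0012" ab6="amerob" state="nm" scan="4000">
  <film iso="64">KR</film>
  <loc>Bosque del Apache: Headquarters</loc>
  <pose>profile</pose>
</original>
"""

def read_info(original):
    """Collect the info children of an original element into a dict."""
    # The info pattern as written allows each child at most once,
    # so a dict keyed by element name is enough here.
    info = {}
    for child in original:
        info[child.tag] = (child.text or "").strip()
    loc = info.get("loc")
    # Split locality levels on colons, as the loc description suggests.
    info["loc_levels"] = [part.strip() for part in loc.split(":")] if loc else []
    film = original.find("film")
    iso = film.get("iso") if film is not None else None
    # iso-pattern is xsd:positiveInteger, so int() is safe on valid input.
    info["iso"] = int(iso) if iso is not None else None
    return info

print(read_info(ET.fromstring(SAMPLE)))
```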