
7.  Flat output files

If you are planning on representing bird taxonomy using a relational database, flat files are a universal format accepted by all the major database systems. In a flat file, each record consists of a sequence of fixed-length fields.

When you run nomcompile3, it writes three files:

The sections below describe the formats of these product files.

7.1.  The tree (.tre) file

The tree file defines all the different scientific names used in the input. Here is the format of that file:

Length  Contents
varies The taxonomic key number. The exact format of this field depends on the content of the ranks file; see Section 7.1.1, “Taxonomic key numbers”.
6 If this taxon has a standard six-letter bird code, that code appears here; otherwise the field is blank.
1 For generally accepted forms, this field is blank. If the form is not in the main AOU Check-List, a question mark (?) appears here.
36

The next field is the scientific name of the group to which this form is referred, for example, Junco hyemalis. The field is aligned flush left and padded on the right with spaces. For forms not identified to species, the smallest containing taxon is used, e.g., Aves for “bird sp.”

For subspecific forms defined in the alternate names file, this field contains the scientific name with a space and an integer appended. For example, in the line for the standard species Snow Goose, this field will have the value “Chen caerulescens”, while Blue Goose will have “Chen caerulescens 1”, Blue-Snow intergrade “Chen caerulescens 2”, and so on.

56

The English name of the form appears next, aligned flush left and right-padded with spaces. For multi-word names, the generic part comes first, followed by a comma, one space, and the specific part. No underbar (“_”) characters will appear in this field, so genus and species names will be rendered in ordinary type.

Examples:

Dunlin
Loon, Red-throated
grebe sp.
bird sp.
bird, large sp.
teal, Blue-winged x Cinnamon
Junco, (Gray-headed x Slate-colored) Dark-Eyed
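As a rough illustration of the record layout above, the following Python sketch slices one line of a tree file into its five fields. It assumes the fields are written back to back with no separators, and that the key is 11 digits wide (the 2+2+1+2+2+2 layout used as an example in Section 7.1.1); both are assumptions, so adjust key_width to match your own ranks file. The names TreeRecord and parse_tree_line, and the sample line itself, are made up for this example and are not part of nomcompile3.

```python
from typing import NamedTuple

# Field widths from the table above. KEY_WIDTH is an assumption: it matches
# the example ranks layout (2+2+1+2+2+2 digits), not every ranks file.
KEY_WIDTH = 11
CODE_WIDTH = 6
STATUS_WIDTH = 1
SCI_WIDTH = 36
ENG_WIDTH = 56


class TreeRecord(NamedTuple):
    key: str       # taxonomic key number, kept as a digit string
    code: str      # six-letter bird code, or "" if none
    status: str    # "" for accepted forms, "?" otherwise
    sci_name: str  # scientific name, padding removed
    eng_name: str  # English name, padding removed


def parse_tree_line(line: str, key_width: int = KEY_WIDTH) -> TreeRecord:
    """Slice one fixed-width tree-file line into its five fields."""
    line = line.rstrip("\n")
    widths = (key_width, CODE_WIDTH, STATUS_WIDTH, SCI_WIDTH, ENG_WIDTH)
    fields = []
    pos = 0
    for width in widths:
        fields.append(line[pos:pos + width])
        pos += width
    key, code, status, sci, eng = fields
    return TreeRecord(key, code.strip(), status.strip(), sci.rstrip(), eng.rstrip())


# A hand-built sample line (hypothetical data, not copied from a real file).
sample = (
    "21243470100"
    + "daejun"
    + " "
    + "Junco hyemalis".ljust(SCI_WIDTH)
    + "Junco, Dark-eyed".ljust(ENG_WIDTH)
)
record = parse_tree_line(sample)
print(record)
```

Keeping the key as a string rather than converting it to an integer preserves its leading zeroes, which matter both for sorting and for the field-by-field interpretation described in the next section.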

7.1.1. Taxonomic key numbers

The taxonomic key number can be used to sort records into phylogenetic order, as defined by the AOU Check-List. It contains one or more digits for each rank (except for the root rank). The number of digits for each rank is determined by the third column in the ranks file.

Warning

It is an extremely bad idea to use this number to represent a taxon for any purpose other than sorting. Not only is it spectacularly meaningless out of context, but any change to the input files will change all of the taxonomic key numbers.

For example, if your ranks file looks like the example given above (2-digit order, 2-digit family, 1-digit subfamily, 2-digit genus, 2-digit species, and 2-digit form), each taxonomic key number would have these components:

  • The two-digit serial number of the taxonomic order in which this form is placed, or “00” if the form is not placed into an order (e.g., “bird sp.”).

  • The two-digit serial number of the taxonomic family within this order, or “00” for forms not placed within a specific family. Note that the sequence of families starts over at “01” again within each order.

  • The one-digit serial number of the subfamily within the family, or “0” if the subfamily is unknown.

  • The two-digit serial number of the genus within the family, or “00” if the genus is unknown.

  • The two-digit serial number of the species within the genus, or “00” if the species is unknown.

  • The two-digit serial number of the form within the species, or “00” if the form is unknown.

For example, code daejun (Dark-eyed Junco) might have a taxonomic key number of “21 24 3 47 01 00” (the spaces here are for clarity; they are not actually present in the record). This key would mean that this form is in the 21st order, the 24th family within that order, the 3rd subfamily within that family, the 47th genus within that subfamily, and the first species within that genus, and that it is not in any known subform of the species.
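The breakdown just described can be sketched in a few lines of Python. The rank names and widths below are assumptions that mirror the example ranks file (2, 2, 1, 2, 2, 2 digits); split_key is a hypothetical helper written for this illustration, not something nomcompile3 provides.

```python
# Digits per rank, assumed to match the example ranks file in this section.
RANK_WIDTHS = (
    ("order", 2),
    ("family", 2),
    ("subfamily", 1),
    ("genus", 2),
    ("species", 2),
    ("form", 2),
)


def split_key(key, rank_widths=RANK_WIDTHS):
    """Return a dict mapping each rank name to its serial number in the key."""
    expected = sum(width for _, width in rank_widths)
    if len(key) != expected or not key.isdigit():
        raise ValueError(f"expected {expected} digits, got {key!r}")
    parts = {}
    pos = 0
    for name, width in rank_widths:
        parts[name] = int(key[pos:pos + width])
        pos += width
    return parts


# The Dark-eyed Junco key from the text, written without the spaces.
junco = split_key("21243470100")
print(junco)
```

A serial number of zero in the result means the form is not placed at that rank, as with the form field of this species-level key.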

Other forms that are included within Dark-eyed Junco will have keys “21 24 3 47 01 01”, “21 24 3 47 01 02”, and so on. Examples of such forms include races such as Gray-headed Junco, hybrids among the different races (e.g., “Gray-headed × Slate-colored Junco”), and obsolete names (“Northern Junco”).

Note that the taxonomic key number can be used to deduce relationships between form codes. For example, to find out what genus a species is in, just construct a key number that is the same as the species' key number, but with its species number set to “00”. Continuing the example above, suppose Gray-headed Junco has this key number:

21 24 3 47 01 01

Then we can deduce all the higher ranks by substituting zeroes in the appropriate fields:

21 24 3 47 01 00 The containing species, Junco hyemalis
21 24 3 47 00 00 The containing genus, Junco
21 24 3 00 00 00 The containing subfamily, Emberizinae
21 24 0 00 00 00 The containing family, Emberizidae
21 00 0 00 00 00 The containing order, Passeriformes
00 00 0 00 00 00 The containing class, Aves
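The zero-substitution in the table above is mechanical enough to automate. The sketch below, again assuming the example widths (2, 2, 1, 2, 2, 2), zeroes the rank fields from right to left and collects each resulting key; ancestor_keys is a hypothetical helper name chosen for this example.

```python
# Digits per rank, assumed to match the example ranks file in this section.
WIDTHS = (2, 2, 1, 2, 2, 2)


def ancestor_keys(key, widths=WIDTHS):
    """Return the keys of all containing taxa, nearest first."""
    if len(key) != sum(widths):
        raise ValueError(f"expected {sum(widths)} digits, got {key!r}")
    # Cut the key into one digit string per rank.
    fields = []
    pos = 0
    for width in widths:
        fields.append(key[pos:pos + width])
        pos += width
    ancestors = []
    # Zero one more field on each pass, working from the lowest rank upward.
    for i in range(len(fields) - 1, -1, -1):
        if int(fields[i]) == 0:
            continue  # already zero, so this would not yield a new key
        fields[i] = "0" * widths[i]
        ancestors.append("".join(fields))
    return ancestors


# Gray-headed Junco's key from the example, without the spaces.
lineage = ancestor_keys("21243470101")
for ancestor in lineage:
    print(ancestor)
```

For the Gray-headed Junco key this yields the six keys in the table, from the containing species up to the class. Looking each one up in the tree file would then give the name of the corresponding taxon.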