
4. The six-letter bird code system

There are a number of different systems for encoding kinds of birds. This package supports a particular system that grew out of the author's work creating a database of 90 years of Christmas Bird Count data. The package could be modified to work with other systems.

4.1. Design goals for bird code systems

Among programmers, good programs or systems are often described as “robust.” Such systems should be easy to learn and use, and they should not tend to confuse users or mangle data. Design of a good encoding system involves more than just the problem of representing the data. We should consider human factors as well.

Here are some other qualities of a good code system:

  • Codes should be short, to save keystrokes during data entry.

  • Encoding should be easy to learn and quick to execute.

  • The codes should be meaningful and easy to decode. Although any code can be translated mechanically by a program, it often saves time if we can just look at a code and know what it means without having to look it up.

  • It should handle forms other than species: any category of birds, however precise (“Blue Goose”) or vague (“black bird sp.”) the identification.

  • It should cope well with the continual changes in taxonomy and nomenclature.

  • It should be usable even by non-experts, so beginners and even non-birders can use it for data entry.

  • Use of a code should not be a significant source of errors.

For efficient data entry, we want to be able to bang the records into the machine quickly (minimizing mistakes, of course). Speed depends on more than just the keystroke rate. Thinking takes time too: the time it takes to think of the right code, or to look it up if necessary.

A robust system should also be designed so that most errors are easy to detect and, whenever possible, easy to correct. In the author's opinion, this is an argument against using the shortest possible code. Longer codes have more redundancy, so it is more likely that a user can figure out what was meant even if the code has an error in it. As an example, the English language has a lot of redundancy in it, which makes it robust. We can oftxn undxrstand a sxntxncx xvxn if it contains quitx a fxw typos.
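The redundancy argument can be illustrated with a short sketch. It is not part of this package: the function name check_code, the cutoff value, and the four codes in the table are placeholders invented for illustration. The idea is that when valid codes are several letters long and differ from one another in more than one position, a one-keystroke typo usually yields a string that is not a valid code at all (so the error is detected) and that still resembles exactly one valid code (so a correction can be suggested).

```python
import difflib

# Hypothetical code table, invented for this illustration only;
# it is not taken from the package's actual data files.
VALID_CODES = {
    "AMEROB": "American Robin",
    "MALLAR": "Mallard",
    "NORCAR": "Northern Cardinal",
    "BLUJAY": "Blue Jay",
}


def check_code(entered, cutoff=0.8):
    """Classify a typed code as valid, probably mistyped, or unknown.

    Returns a (status, candidates) pair, where status is one of
    "ok", "suggest", or "unknown".
    """
    code = entered.strip().upper()
    if code in VALID_CODES:
        return "ok", [code]
    # Redundancy at work: a mistyped code is still closer to the
    # intended code than to any other entry in the table.
    candidates = difflib.get_close_matches(
        code, list(VALID_CODES), n=3, cutoff=cutoff
    )
    if candidates:
        return "suggest", candidates
    return "unknown", []


if __name__ == "__main__":
    for typed in ("mallar", "AMERPB", "ZZZZZZ"):
        print(typed, check_code(typed))
```

With a one- or two-letter code, the same typo would be far more likely to land on another valid code and pass unnoticed, which is the risk the paragraph above describes.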

The best way to represent the name of a bird is to spell it out, and to conform (where possible) to the names standardized in the current edition of the AOU Check-List.

However, using the AOU standardized names has some drawbacks:

  • For computer applications requiring bird names to be encoded for storage, typing a full name is prohibitively inefficient.

  • The AOU does not address the common problem of representing imprecise identifications such as “duck sp.” or “large falcon.”

  • A system for retrieval of sight records must be able to handle exotics (e.g., escaped waterfowl) that are not included in the AOU Check-List.