Describes a system for representing taxonomic arrangements of bird checklists, and an interface for retrieving such data using the Python programming language.

This publication is available in Web form and also as a PDF document. Please forward any comments to

Table of Contents

1. Introduction
2. Files for downloading
3. Requirements
4. The six-letter bird code system
4.1. Design goals for bird code systems
4.2. Origins of the six-letter code system
4.3. Rules for the six-letter code system
4.4. Handling collisions
5. Input files
5.1. The ranks file
5.2. Preparing the standard forms (.std) file
5.2.1. Higher-taxon records in the .std file
5.2.2. Species records in the .std file
5.3. Preparing the alternate forms (.alt) file
5.3.1. What constitutes a valid English name?
5.3.2. Higher taxon records in the .alt file
5.3.3. Direct equivalent records in the .alt file
5.3.4. Subspecific forms records in the .alt file
5.3.5. Collision records in the .alt file
6. Building the standard product files
7. Flat output files
7.1. The tree (.tre) file
7.1.1. Taxonomic key numbers
7.2. The abbreviations (.ab6) file
7.3. The collisions (.col) file
8. Schema for the XML product file
8.1. taxonomySystem: the XML root element
8.2. rankSet: Taxonomic ranks in use
8.3. taxonomy: The classification tree
8.4. abbrSet: Bird code definitions
8.5. collisionSet: List of collision codes
9. The Python taxonomy interface,
9.1. Class Txny: the complete system
9.2. Class Hier: The set of taxonomic ranks
9.3. Class Rank: One taxonomic rank
9.4. The Taxon class: One node in the classification tree
10. The module
10.1. Manifest constants
10.2. Utility functions
10.3. class BirdId

1. Introduction

For the representation and processing of data about wild birds, a firm nomenclatural base is essential. Fortunately, for North American birds, the American Ornithologists' Union (AOU) provides a definitive and inclusive classification of the forms that have occurred in the region, published in printed form as the AOU Check-List. The AOU's journal, The Auk, periodically publishes supplements that update the names and arrangements.

In the process of entering a few million records of bird data for the Audubon Christmas Bird Count database, the author also developed a system for representing bird names using a six-letter code. This system currently includes over 2500 codes, each connected either to a taxon or to an English name.

This document describes a software base for representing the AOU's taxonomic arrangements and the author's six-letter codes as computer files. Once the various input files are prepared and compiled, there are several different ways to access these data:

  • You can use a single XML file that describes both the taxonomy and the bird code system. Prebuilt files are available for all of the AOU Check-List arrangements starting with the Seventh Edition and its various Supplements.

  • A module written in the Python programming language makes it easy for you to write scripts in Python that use one of these XML data files.

  • You can use a set of three “flat files” that can be loaded into a database system or spreadsheet.
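As a rough illustration of the first option, the sketch below uses Python's standard xml.etree.ElementTree module to check which top-level sections an XML product file contains. The element names (taxonomySystem, rankSet, taxonomy, abbrSet, collisionSet) are taken from the schema outline in the table of contents. The sample document, the assumption that the four sections are direct children of the root, and the function name section_summary are hypothetical and used only for illustration. This is not the document's own Python interface.

```python
import xml.etree.ElementTree as ET

# Hypothetical miniature document. Only the element names come from the
# schema outline; the real files carry attributes and content not shown here.
SAMPLE = """<taxonomySystem>
  <rankSet/>
  <taxonomy/>
  <abbrSet/>
  <collisionSet/>
</taxonomySystem>"""

# Assumption: these four sections are direct children of the root element.
EXPECTED_SECTIONS = ("rankSet", "taxonomy", "abbrSet", "collisionSet")


def section_summary(xml_text):
    """Map each expected section name to True if it is a child of the root."""
    root = ET.fromstring(xml_text)
    if root.tag != "taxonomySystem":
        raise ValueError("unexpected root element: %s" % root.tag)
    present = {child.tag for child in root}
    return {name: name in present for name in EXPECTED_SECTIONS}


if __name__ == "__main__":
    for name, found in section_summary(SAMPLE).items():
        print(name, "present" if found else "missing")
```

To inspect a downloaded file instead of the inline sample, read its contents into a string and pass that string to section_summary.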

To download any of these data or program files, see Section 2, “Files for downloading”.

A companion document, Bird taxonomy system: internal maintenance specification, describes the internals of the various programs in this suite.

This is the third major version. The first version produced only flat files. The second version added the single XML file and its Python interface. This version simplifies the handling of italic markup and adds methods for performing HTML markup.