Describes a compiler for processing a set of files that describe taxonomic arrangements of North American birds.

This publication is available in Web form and also as a PDF document.

Table of Contents

1. Introduction
2. Software methodologies
3. Design overview
3.1. Design issues with taxonomies
3.2. Design issues with form codes
3.3. Principal classes
3.4. Overall procedure
3.5. Error recovery
4. Prologue
5. Imported modules
6. Manifest constants
6.13. SSP_CODE
6.18. COLL_SEP
6.19. L_SCI
6.20. L_ENG
6.23. HT_NAME_RE
6.24. SLASH_RE
7. Verification functions
7.1. abbr-file: Name of the codes output file
7.2. alt-file: Name of the alternate forms input file
7.3. can-add-child: Can a given parent have a given child?
7.4. coll-file: Name of the collisions output file
7.5. eff-ranks: Effective ranks input file name
7.6. eng-normalize: Normal form of an English name
7.7. input-files: All input files
7.8. output-files: All output files
7.9. rank-parent: What taxon is the parent of a new taxon of a given rank?
7.10. std-file: Name of the standard forms input file
7.11. tree-file: Name of the flat tree output file
7.12. xml-file: Name of the XML output file
8. main(): Main program
9. writeProductFiles(): Write all product files
10. writeTreeFile(): Write the flat tree file
11. writeSubtree(): Recursive tree walker
12. writeAbbrFiles(): Write the flat abbreviation and collision files
13. writeXMLFile(): Write the XML file
14. class Args: Process command line arguments
14.1. Args.__init__(): Constructor
14.2. Args.__buildParser(): Generate a command line parser
15. class Hier: Taxonomic levels of interest
15.1. Hier.__len__(): How many ranks?
15.2. Hier.__getitem__(): Return the (k)th rank
15.3. Hier.__iter__(): Iterator for ranks
15.4. Hier.__contains__(): Is a given code in this hierarchy?
15.5. Hier.lookupRankCode()
15.6. Hier.canParentHaveChild()
15.7. Hier.keyLen(): Find the key length for a given depth
15.8. Hier.txKeyFill(): Right-zero-pad a taxonomic key
15.9. Hier.writeXML(): Generate XML
15.10. Hier.__init__(): Constructor
15.11. Hier.__readRanksFile(): Read the ranks file
16. class Rank: One taxonomic rank
16.1. Rank.__cmp__()
16.2. Rank.__str__()
16.3. Rank.writeXML(): Generate XML
16.4. Rank.__init__(): Constructor
16.5. Rank.__scanCode(): Scan the code field
16.6. Rank.__scanRequired(): Scan the required flag field
16.7. Rank.__scanKeyLen(): Scan the key length field
16.8. Rank.__scanName(): Scan the rank name field
17. class Txny: The entire classification
17.1. Txny.lookupAbbr(): Find the definition of a code
17.2. Txny.lookupSci(): Find a scientific name
17.3. Txny.__init__(): Constructor
17.4. Txny.__readStd(): Read the standard forms file
17.5. Txny.__readStdLine(): Process one line from the standard forms file
17.6. Txny.__scanNonSpTail(): Process higher-taxon tail
17.7. Txny.__appendTaxon(): Try to append this taxon to the tree
17.8. Txny.__scanSpTail(): Process a species tail
17.9. Txny.__checkGenus(): Add a new genus?
17.10. Txny.__checkSubgenus(): Add a new subgenus?
17.11. Txny.__addSpecies(): Add a new species and its bindings
17.12. Txny.__addCodes(): Set up symbol table entries for a standard form
17.13. Txny.__addCodeStd(): Create a standard binding
17.14. Txny.__addCodeColl(): Add standard and collision symbol bindings
17.15. Txny.__readAlt(): Read the alternate forms file
17.16. Txny.__readAltLine(): Process one line of the alt file
17.17. Txny.__scanAbbr(): Scan a code
17.18. Txny.__scanHigherAlt(): Scan a higher-taxon line
17.19. Txny.__bindHigherAlt(): Bind a higher-taxon code
17.20. Txny.__scanEquivalentAlt(): Scan a direct-equivalent line
17.21. Txny.__bindEquivalentAlt(): Create an equivalence binding
17.22. Txny.__scanSubspecificAlt(): Scan a subspecific form line
17.23. Txny.__bindSubspecificAlt()
17.24. Txny.__findSubspParent(): Under what species does this new subspecies go?
17.25. Txny.__scanCollisionAlt(): Scan a collision line
17.26. Txny.__bindCollision()
17.27. Txny.dispatchTable: Routing table for alt records
17.28. Txny.__finalCheck(): Verify correctness of the symbol table
18. class TaxaTree: The taxonomic tree
18.1. TaxaTree.__init__(): Constructor
18.2. TaxaTree.setRoot(): Store the root taxon
18.3. TaxaTree.rankParent(): Under what parent does a new taxon go?
18.4. TaxaTree.canAddChild()
18.5. TaxaTree.addTaxon()
18.6. TaxaTree.lookupTxKey()
18.7. TaxaTree.__getitem__(): Retrieve a taxon by scientific name
18.8. TaxaTree.__contains__(): Membership test for a scientific name
18.9. TaxaTree.writeXML(): Generate XML output
19. class Taxon: One node in the taxonomy
19.1. Taxon.__init__()
19.2. Taxon.__len__(): How many children?
19.3. Taxon.__getitem__(): Return the (n)th child
19.4. Taxon.__iter__(): Iterator for the children
19.5. Taxon.__str__()
19.6. Taxon.abbr(): Is there a standard code?
19.7. Taxon.childKey(): Derive a child's taxonomic key
19.8. Taxon.writeFlat(): Write a tree-file record
19.9. Taxon.writeXML(): Recursive XML tree writer
19.10. Taxon.__writeXMLNode(): Write one node
20. class StdHead: The common front part of a standard forms line
20.1. StdHead.__init__()
20.2. StdHead.__scanRankCode()
20.3. StdHead.__scanStatus()
21. class NonSpTail: Scanner for the non-species tail
21.1. NonSpTail.__init__()
22. class SpTail: Scanner for the species tail
22.1. SpTail.__init__()
22.2. SpTail.__scanSci()
22.3. SpTail.__checkEng(): Check the English name
22.4. SpTail.__checkDisamb(): Check the disambiguation
23. class RawTaxon: Temporary container for taxon attributes
24. class AbTab: The symbol table for codes
24.1. AbTab.addAbbr(): Create a symbol table entry or find an existing one
24.2. AbTab.__getitem__(): Find the symbol table entry for a given code
24.3. AbTab.__contains__(): Is this code in the symbol table?
24.4. AbTab.__iter__(): Generate all symbol table entries
24.5. AbTab.writeXML()
24.6. AbTab.__init__()
25. class AbSym: One symbol table entry
25.1. AbSym.__init__()
25.2. AbSym.bind(): Try to add a binding
26. class AbBind: Base class for bindings
26.1. AbBind.__init__()
26.2. AbBind.writeFlat(): Write an abbreviations file record
27. class StdBind: Code bound to a taxon
27.1. StdBind.__init__()
27.2. StdBind.__str__()
27.3. StdBind.combine()
27.4. StdBind.lookup(): Return the related taxon
27.5. StdBind.eng()
27.6. StdBind.writeXML()
28. class EqBind: Code equivalent to another code
28.1. EqBind.__init__()
28.2. EqBind.__str__()
28.3. EqBind.combine()
28.4. EqBind.lookup(): What is the referenced taxon?
28.5. EqBind.__chainClosure(): Recursive lookup function
28.6. EqBind.eng(): Get the English name
28.7. EqBind.writeXML()
29. class CollBind: Cluster of colliding codes
29.1. CollBind.__init__()
29.2. CollBind.__str__()
29.3. CollBind.combine()
29.4. CollBind.lookup()
29.5. CollBind.eng()
29.6. CollBind.writeFlat(): Write a collisions file record
29.7. CollBind.writeXML()
30. validateEng(): Validate an English name
31. Epilogue