9. The Python taxonomy interface, xnomo3.py

The author considers the Python language the best general-purpose programming language currently available. This section describes a Python-language module suited for putting a firm taxonomic foundation under bird records work.

Python module xnomo3.py provides access to an XML taxonomy file as described above under Section 8, “Schema for the XML product file”. It insulates you from the XML files, providing attributes and methods for looking up bird codes and performing other common operations in bird records management.

To use this module, import it like this:

from xnomo3 import *

Here are the exported classes available in the xnomo3 module.

9.1. Class Txny: the complete system

Normally the first thing you'll do is instantiate a Txny object, which represents the entire system—taxonomy plus bird codes:

Txny ( dataFile=None )

Reads the XML data file and returns a Txny object representing that file.

If no argument is supplied, this constructor looks for a file named aou.xml in the current directory.

If there is no readable, valid XML file, the constructor raises an IOError exception.
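
Here is a minimal sketch of how a script might guard against a missing or unreadable data file. The helper name load_taxonomy is hypothetical and not part of xnomo3; the class is passed in as an argument only so the sketch does not depend on the module being installed.

```python
import sys

def load_taxonomy(txny_class, data_file=None):
    """Return a taxonomy object, or None if the data file can't be read.

    txny_class is expected to be xnomo3.Txny. Passing it in as an
    argument is a choice made for this sketch, not a module feature.
    """
    try:
        if data_file is None:
            # With no argument, the constructor looks for aou.xml
            # in the current directory.
            return txny_class()
        return txny_class(data_file)
    except IOError as problem:
        # The constructor raises IOError when there is no readable,
        # valid XML file.
        sys.stderr.write("Can't read the taxonomy file: %s\n" % problem)
        return None
```

After "from xnomo3 import *", a call such as load_taxonomy(Txny) would return either a Txny instance or None.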

Attributes of a Txny object include:

.root

The root taxon of the taxonomic arrangement, as a Taxon object. See Section 9.4, “The Taxon class: One node in the classification tree”.

.hier

Contains a Hier object representing the set of taxonomic ranks used. See Section 9.2, “Class Hier: The set of taxonomic ranks”.

Methods on a Txny object include:

.lookupTxKey(txKey)

Looks for the taxon corresponding to taxonomic key number txKey and returns it as a Taxon object. Raises a KeyError exception if the arrangement has no such key.

.lookupSci(sci)

Looks for the taxon whose scientific name matches sci, and returns it as a Taxon object. This is a case-sensitive comparison. If there is no taxon in this arrangement with the given scientific name, raises KeyError.
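
Because a failed lookup raises KeyError, a script that handles user input may want to trap that exception. The sketch below assumes txny is a Txny instance; the helper name taxon_or_none is hypothetical.

```python
def taxon_or_none(txny, sci):
    """Return the Taxon whose scientific name is sci, or None.

    The comparison is case-sensitive, so a name in the wrong case
    is treated the same as a name that is not in the arrangement.
    """
    try:
        return txny.lookupSci(sci)
    except KeyError:
        return None
```

The same pattern should work for .lookupTxKey, which also raises KeyError when the key is unknown.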

.lookupAbbr(abbr)

Looks for the taxon that is equivalent to bird code abbr, and returns it as a Taxon object.

The search is case-insensitive, and you don't have to pad the code to length 6. For example, to search for the code for the Hawaiian name of the Hawaiian Hawk (I`o), you could use "IO", "io", "io    " or "IO    ".
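
To see why all four spellings are equivalent, here is one way such a normalization could be written. This is an illustration only, not the module's own code; the names ABBR_LEN and normalize_abbr are hypothetical.

```python
ABBR_LEN = 6    # Bird codes are padded to this length

def normalize_abbr(abbr):
    """Return abbr uppercased and blank-padded on the right to length 6."""
    return abbr.strip().upper().ljust(ABBR_LEN)

# All four spellings from the text reduce to the same padded code.
forms = ["IO", "io", "io    ", "IO    "]
normalized = set(normalize_abbr(form) for form in forms)
```

You do not need to do this yourself before calling .lookupAbbr; the sketch only shows the idea.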

.lookupCollision(abbr)

If abbr is one of the bird codes disallowed because it is a collision, this method returns a list of the valid substitute codes. For example, this call

    txny.lookupCollision("BAROWL")

would return the list ["BRDOWL", "BRNOWL"].

Raises KeyError if abbr is not a collision code.
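
A data-entry script might use this method to suggest replacements when someone types a disallowed code. The following sketch assumes txny is a Txny instance; the helper name describe_code and the message wording are hypothetical.

```python
def describe_code(txny, abbr):
    """Return a one-line message about whether abbr is a collision code."""
    try:
        substitutes = txny.lookupCollision(abbr)
    except KeyError:
        # Not a collision code. It may or may not be a valid bird
        # code; this sketch does not check that.
        return "%s is not a collision code" % abbr
    return "%s is ambiguous; use one of: %s" % (abbr, ", ".join(substitutes))
```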

.genTxKeys()

Generates the taxonomic keys in the arrangement in ascending (phylogenetic) order, as strings.

.genAbbrs()

Generates the valid bird codes in self, in ascending order, uppercased.

.abbrToEng(abbr)

Returns the English name from which the given abbreviation abbr was derived. Raises KeyError if abbr is not valid. Example: for "GOCKIN", it returns "Golden-crowned Kinglet".

.abbrToEngComma(abbr)

Like .abbrToEng, but returns the name in inverted order, e.g., "Kinglet, Golden-crowned".
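
The generator and name methods combine naturally. As a sketch, assuming txny is a Txny instance, this hypothetical helper builds an index of every valid bird code paired with its inverted English name:

```python
def code_index(txny):
    """Return a list of (code, inverted English name) tuples.

    The codes come from .genAbbrs(), so they appear in ascending
    order and in uppercase.
    """
    return [(abbr, txny.abbrToEngComma(abbr)) for abbr in txny.genAbbrs()]
```

Substituting .abbrToEng for .abbrToEngComma would give the names in normal word order instead.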

.abbrToHtml(abbr, cssClass=None)

For names that do not include italicized genus or species names, this method returns the same result as .abbrToEng. If some words require an italic rendering, the returned string will include HTML markup.

If no cssClass argument is provided, italicized words will be marked up using the deprecated HTML <i>...</i> element. Here are some examples of marked-up return values when this method is called with the default cssClass=None:

<i>Hylocichla</i> sp.
small <i>Accipiter</i> sp.
Iceland (<i>glaucoides</i>) Gull

A better practice is to use a CSS stylesheet to style the Web page you are building. Provide a class name to this method using the cssClass keyword argument, and if there are any italicized words in the English name for abbr, they will be wrapped in an HTML span element with that class name.

Suppose you call this method with argument cssClass="latin". Here are some examples of return values:

<span class='latin'>Hylocichla</span> sp.
small <span class='latin'>Accipiter</span> sp.
Iceland (<span class='latin'>glaucoides</span>) Gull

Here is a CSS rule you might use to italicize these words.

span.latin { font-style: italic; }

.abbrToHtmlComma(abbr, cssClass=None)

Same as .abbrToHtml, but supplies multi-word names in inverted order. For example:

Gull, Iceland (<i>glaucoides</i>)
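
If you are assembling a page as plain strings, the returned markup can be dropped straight into your own HTML. Here is a sketch, assuming txny is a Txny instance; the helper name names_as_html_list is hypothetical.

```python
def names_as_html_list(txny, abbrs, cssClass=None):
    """Return an HTML ul element, as a string, naming each code in abbrs.

    cssClass is passed through to .abbrToHtml unchanged, so None
    gives <i> markup and a class name gives <span> markup.
    """
    items = ["  <li>%s</li>" % txny.abbrToHtml(abbr, cssClass=cssClass)
             for abbr in abbrs]
    return "\n".join(["<ul>"] + items + ["</ul>"])
```

Calling .abbrToHtmlComma in place of .abbrToHtml would give the same list with inverted names.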

.abbrHtmlSubelt(abbr, node, cssClass=None)

If you are building an XHTML page with the pylxml package, use this method to attach the marked-up English name inside an etree.Element instance. It works like the .abbrToHtml method, but instead of returning the result as a string, it is added to node. The cssClass argument works the same way as it does for .abbrToHtml.

For example, suppose in your Python script txny is an instance of the Txny class; the variable cell is bound to an Element instance representing the td element of a table that you are building; and ab6 is a bird code. This code would attach the marked-up English name to the cell node:

    txny.abbrHtmlSubelt(ab6, cell)

Here's an example of what part of your Web page might look like after serialization to XHTML:

    <td>
      Iceland (<i>glaucoides</i>) Gull
    </td>

Suppose you have a CSS style rule for span elements with a class='latin' attribute. In that case, you would call it this way:

    txny.abbrHtmlSubelt(ab6, cell, cssClass='latin')

and the resulting HTML would look like this:

    <td>
      Iceland (<span class='latin'>glaucoides</span>) Gull
    </td>
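
Putting these pieces together, here is a sketch of a hypothetical helper, add_bird_row, that adds one table row per bird code. It assumes txny is a Txny instance and table is an Element for a table element. The fallback to the standard library's ElementTree is there only so the sketch runs without lxml installed; whether xnomo3 itself accepts standard-library elements is not something this section states, so use lxml with the real module.

```python
try:
    from lxml import etree                   # preferred, if installed
except ImportError:
    import xml.etree.ElementTree as etree    # stand-in for this sketch only

def add_bird_row(txny, table, ab6, cssClass=None):
    """Append a tr to table holding the code ab6 and its English name."""
    row = etree.SubElement(table, "tr")

    # First cell: the bird code itself.
    codeCell = etree.SubElement(row, "td")
    codeCell.text = ab6

    # Second cell: let the Txny object attach the marked-up name.
    nameCell = etree.SubElement(row, "td")
    txny.abbrHtmlSubelt(ab6, nameCell, cssClass=cssClass)
    return row
```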

.abbrHtmlSubeltComma(abbr, node, cssClass=None)

Same as .abbrHtmlSubelt, but uses inverted word order.

.abbrToTeX(abbr)

Returns the English name from which the given abbreviation abbr was derived, marked up for TeX or LaTeX.

Here's an example return value. The “\/” control sequence is a TeX italic correction command. It inserts a bit of extra space after the last character if the following character is not also italicized.

Iceland ({\it glaucoides\/}) Gull

.abbrToTeXComma(abbr)

Same as .abbrToTeX, but the name is inverted. For example, it might return:

Gull, Iceland ({\it glaucoides\/})
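
As a sketch of how these methods might feed a typeset checklist, the hypothetical helper below wraps each name in a LaTeX itemize environment. It assumes txny is a Txny instance.

```python
def tex_itemize(txny, abbrs):
    """Return LaTeX source for a bulleted list of the names for abbrs."""
    lines = ["\\begin{itemize}"]
    for abbr in abbrs:
        # .abbrToTeX supplies any {\it ...\/} markup the name needs.
        lines.append("  \\item %s" % txny.abbrToTeX(abbr))
    lines.append("\\end{itemize}")
    return "\n".join(lines)
```

Use .abbrToTeXComma instead if the list should be sorted or displayed by inverted name.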

.abbrToRawEng(abbr)

Returns the raw English name corresponding to abbr as it appears in the .std file. This means the word order will be inverted, and the name may contain “"” and “_” characters.