The archindex.py module reads one of the index files
written by archx.py. For the interface, see
Section 5.2, “class ArchiveIndex: Index of one archive”.
The code starts with the module's documentation string and a few vital imports.
"""Reader module for files conforming to archx.rnc.
For documentation, see:
http://www.nmt.edu/~john/scans/slides/archx/
"""
The first import, enabling the use of
Python generators, must precede all other import statements. Next we import Parse from the 4Suite XML package (see Python and the XML Document Object Model (DOM) with
4Suite) to read the XML file. Also
included are two exceptions raised by the Parse() function, so we can catch them
gracefully.
This module has been ported to the more modern
lxml library. See the copy in
tcc/p/cranefest/, which
should replace this module using the older
4Suite XML package.
#================================================================
# Imports
#----------------------------------------------------------------
from __future__ import generators
from Ft.Xml import Parse, ReaderException
from Ft.Lib import UriException
from rnc_archx import *
Here is the interface to retrieve an archive index file.
An ArchiveIndex is a container for ArchImage instances, each of which describes
one archived image.
Typically you won't call the constructor directly;
instead, use the static method ArchiveIndex.readFile() to read the file for
you.
# - - - - - c l a s s A r c h i v e I n d e x - - - - -
class ArchiveIndex:
    """Represents one archive index file, conforming to archx.rnc.
      Exports:
        ArchiveIndex():
          [ return a new, empty ArchiveIndex object ]
        .getArchImage ( catNo ):
          [ catNo is an image catalog number as a string ->
              if self has an entry for that catalog number ->
                return an ArchImage instance representing that entry
              else -> raise KeyError ]
        .genArchImages():
          [ generate the ArchImage instances in self, in ascending
            order by catalog number ]
        .addArchImage ( archImage ):
          [ archImage is an ArchImage instance ->
              self := self with archImage added ]
        ArchiveIndex.readFile ( imageCatalog, fileName ):
          [ (imageCatalog is a birdimages.ImageCatalog instance) and
            (fileName is a string) ->
              if (fileName names an XML file valid against
              archx.rnc) and
              (catalog numbers in that file are all found in
              imageCatalog) ->
                return a new ArchiveIndex object containing ArchImage
                instances representing entries from fileName that
                do match entries in imageCatalog
              else -> raise IOError ]
The internal state of an ArchiveIndex
instance consists of one dictionary:
      State/Invariants:
        .__catNoMap:
          [ a dictionary whose values are the ArchImage
            instances contained in self, and each key is the
            catalog number of that instance ]
    """
There is little for this nominal constructor to do: just
initialize the .__catNoMap dictionary.
# - - - A r c h i v e I n d e x . _ _ i n i t _ _ - - -
    def __init__ ( self ):
        """Constructor for ArchiveIndex.
        """
        self.__catNoMap = {}
Retrieves the ArchImage for the given
catalog number, if any. If the catalog number is not in
self.__catNoMap, this method will raise
KeyError.
# - - - A r c h i v e I n d e x . g e t A r c h I m a g e - - -
    def getArchImage ( self, catNo ):
        """Look up a catalog number.
        """
        return self.__catNoMap [ catNo ]
Sorts the catalog numbers, then generates the
corresponding ArchImage instances in that
order.
# - - - A r c h i v e I n d e x . g e n A r c h I m a g e s - - -
    def genArchImages ( self ):
        """Generate self's images.
        """
        catNoList = self.__catNoMap.keys()
        catNoList.sort()
        for catNo in catNoList:
            yield self.__catNoMap[catNo]
        raise StopIteration
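For readers working in Python 3, the same sort-then-generate idea might look like the sketch below. The class name CatalogMap and the sample entries are hypothetical stand-ins, not part of this module. Note that since Python 3.7 an explicit raise StopIteration inside a generator is converted to RuntimeError (PEP 479), so the sketch simply lets the method fall off the end.

```python
class CatalogMap:
    """Hypothetical stand-in for ArchiveIndex: maps catalog numbers to entries."""

    def __init__(self):
        self._catNoMap = {}

    def add(self, catNo, entry):
        self._catNoMap[catNo] = entry

    def genEntries(self):
        """Generate the entries in ascending order by catalog number."""
        for catNo in sorted(self._catNoMap):
            yield self._catNoMap[catNo]
        # No explicit `raise StopIteration` here: returning from the
        # function is what ends the generator.


index = CatalogMap()
index.add("b0020", "second")
index.add("b0003", "first")
ordered = list(index.genEntries())
```

Catalog numbers are compared as strings here, so the order is lexicographic.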
Add one ArchImage instance to self.
# - - - A r c h i v e I n d e x . a d d A r c h I m a g e - - -
    def addArchImage ( self, archImage ):
        """Add one cataloged entry.
        """
        self.__catNoMap[archImage.original.catNo] = archImage
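The round trip between addArchImage() and getArchImage() can be illustrated with a small self-contained sketch. The namedtuple types below are hypothetical stand-ins for birdimages.Original and ArchImage, and a module-level dictionary plays the role of .__catNoMap.

```python
from collections import namedtuple

# Hypothetical stand-ins for the real classes.
Original = namedtuple("Original", "catNo")
ArchImage = namedtuple("ArchImage", "original high wide")

catNoMap = {}


def addArchImage(archImage):
    """Store archImage under its original's catalog number."""
    catNoMap[archImage.original.catNo] = archImage


def getArchImage(catNo):
    """Return the entry for catNo; an unknown number raises KeyError."""
    return catNoMap[catNo]


addArchImage(ArchImage(Original("b0003"), 480, 640))
found = getArchImage("b0003")

try:
    getArchImage("no-such-number")
    missing = False
except KeyError:
    missing = True
```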
This static method reads an XML file conforming to
archx.rnc and returns its contents as an ArchiveIndex instance.
# - - - A r c h i v e I n d e x . r e a d F i l e - - -
#   @staticmethod
    def readFile ( imageCatalog, fileName ):
        """Read an XML file.
        """
We use the Parse() function to convert the
XML file into a DOM tree.
        #-- 1 --
        # [ if fileName names a readable, well-formed XML file ->
        #     doc := a DOM Document node representing that file
        #   else -> raise IOError ]
        try:
            doc = Parse ( fileName )
        except UriException, detail:
            raise IOError, ( "No such file '%s': %s" %
                             (fileName, detail) )
        except ReaderException, detail:
            raise IOError, ( "File '%s' not well-formed: %s" %
                             (fileName, detail) )
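The same pattern, translating the parser's own exceptions into a single IOError, can be sketched in Python 3 with the standard library's xml.etree.ElementTree in place of 4Suite. This is only an illustration under that substitution; the function name parseIndex is hypothetical.

```python
import xml.etree.ElementTree as ET


def parseIndex(fileName):
    """Return the root element of fileName, or raise IOError."""
    try:
        return ET.parse(fileName).getroot()
    except OSError as detail:
        # Missing or unreadable file.
        raise IOError("No such file '%s': %s" % (fileName, detail))
    except ET.ParseError as detail:
        # File exists but is not well-formed XML.
        raise IOError("File '%s' not well-formed: %s" % (fileName, detail))
```

Callers then need to catch only one exception type, whichever way the read fails.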
Next we build a node set of all the image
nodes in the document, and also create an ArchiveIndex instance.
        #-- 2 --
        # [ xList := a node-set of all RNC_IMAGE_N nodes in doc
        #   archx := a new, empty ArchiveIndex instance ]
        xList = doc.documentElement.xpath ( '//%s' % RNC_IMAGE_N )
        archx = ArchiveIndex()
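The XPath expression '//name' selects every element with that name, at any depth. As a rough Python 3 equivalent, again using xml.etree.ElementTree rather than 4Suite, Element.iter() walks the tree in document order. The element name "image" and attribute name "cat-no" below are assumptions for illustration; the real names come from rnc_archx.

```python
import xml.etree.ElementTree as ET

IMAGE_N = "image"      # hypothetical; the real value is rnc_archx.RNC_IMAGE_N
CAT_NO_A = "cat-no"    # hypothetical; the real value is rnc_archx.RNC_CAT_NO_A

SAMPLE = """<archive>
  <image cat-no="b0003" high="480" wide="640"/>
  <group><image cat-no="b0020" high="600" wide="800"/></group>
</archive>"""

root = ET.fromstring(SAMPLE)

# All matching elements, however deeply nested, in document order.
imageNodes = list(root.iter(IMAGE_N))
catNos = [node.get(CAT_NO_A) for node in imageNodes]
```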
Each valid node in xList will be converted
to an ArchImage object and added to archx.
        #-- 3 --
        # [ if (all the nodes in xList are valid against
        #   archx.rnc, and their catalog numbers are defined
        #   in imageCatalog) ->
        #     archx := archx with ArchImage instances added
        #              representing valid nodes from xList
        #   else -> raise IOError ]
        for xNode in xList:
            #-- 3 body --
            # [ xNode is a DOM RNC_IMAGE_N Element node ->
            #     if xNode is not valid against archx.rnc ->
            #       raise IOError
            #     else if xNode's catalog number is found in
            #     imageCatalog ->
            #       archx := archx with an ArchImage instance
            #                added representing xNode ]
            #-- 3.1 --
            # [ (imageCatalog is a birdimages.ImageCatalog) and
            #   (xNode is a DOM RNC_IMAGE_N Element node) ->
            #     if (xNode is not valid against archx.rnc) or
            #     (xNode's catalog number is not in imageCatalog) ->
            #       raise IOError
            #     else ->
            #       archImage := an ArchImage instance
            #                    representing that catalog number ]
            archImage = ArchImage.readNode ( imageCatalog, xNode )
            #-- 3.2 --
            # [ archx := archx with archImage added ]
            archx.addArchImage ( archImage )
Finally, the accumulated index is returned to the caller.
        #-- 4 --
        return archx
    readFile = staticmethod ( readFile )
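The explicit staticmethod() assignment is the older way to declare a static method; the decorator form hinted at in the "# @staticmethod" comment became available in Python 2.4. The sketch below shows the two spellings side by side on a hypothetical Counter class that is not part of this module.

```python
class Counter:
    """Hypothetical class used only to compare the two spellings."""

    def __init__(self, start):
        self.value = start

    # Older spelling, as used in this module: define, then rebind.
    def fromText(text):
        return Counter(int(text))
    fromText = staticmethod(fromText)

    # Decorator spelling: same effect in one step.
    @staticmethod
    def zero():
        return Counter(0)


seven = Counter.fromText("7")
nothing = Counter.zero()
```

Both forms are called on the class itself, with no instance required.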
Each instance of this class represents one image that is
not only in the image catalog, but has also been measured
for image size, and a thumbnail placed in the thumbnail
directory. Most of the cataloging information is
represented as an Original instance (as
described in An XML-based bird cataloging
system), available as the .original attribute of an ArchImage instance.
Here is the class's interface, and its trivial constructor.
# - - - - - c l a s s A r c h I m a g e - - - - -
class ArchImage:
    """Represents the cataloging information for one archived image.
      Exports:
        ArchImage ( original, high, wide ):
          [ (original is a birdimages.Original instance) and
            (high is the image's height in pixels as an int) and
            (wide is the image's width in pixels as an int) ->
              return a new ArchImage object with those values ]
        .original: [ as passed to constructor, read-only ]
        .high: [ as passed to constructor, read-only ]
        .wide: [ as passed to constructor, read-only ]
        ArchImage.readNode ( imageCatalog, xNode ):
          [ (imageCatalog is a birdimages.ImageCatalog instance) and
            (xNode is a DOM RNC_IMAGE_N Element) ->
              if (xNode is not valid against archx.rnc) or
              (xNode's catalog number is not found in
              imageCatalog) ->
                raise IOError
              else ->
                return an ArchImage instance representing that
                catalog number ]
    """
    def __init__ ( self, original, high, wide ):
        """Constructor for ArchImage
        """
        self.original = original
        self.high = high
        self.wide = wide
This static method converts an image node
into an ArchImage instance, provided that
its catalog number is found in the image catalog.
# - - - A r c h I m a g e . r e a d N o d e - - -
#   @staticmethod
    def readNode ( imageCatalog, xNode ):
        """Convert an XML node to an ArchImage.
        """
First we pull out the catalog number, height, and width.
        #-- 1 --
        # [ catNo := xNode's RNC_CAT_NO_A attribute ]
        catNo = xNode.getAttributeNS ( None, RNC_CAT_NO_A )
        #-- 2 --
        # [ if xNode has an RNC_HIGH_A attribute that is a valid
        #   int in string form ->
        #     high := that attribute as an int
        #   else -> raise IOError ]
        high = getIntAttr ( xNode, RNC_HIGH_A )
        #-- 3 --
        # [ if xNode has an RNC_WIDE_A attribute that is a valid
        #   int in string form ->
        #     wide := that attribute as an int
        #   else -> raise IOError ]
        wide = getIntAttr ( xNode, RNC_WIDE_A )
Translate the catalog number into an Original instance, or fail.
        #-- 4 --
        # [ if catNo matches a catalog number in imageCatalog ->
        #     original := the corresponding Original from
        #                 imageCatalog
        #   else -> raise IOError ]
        original = imageCatalog.getOriginal ( catNo )
        #-- 5 --
        return ArchImage ( original, high, wide )
    readNode = staticmethod ( readNode )
This utility function handles the retrieval and conversion of an XML attribute that should contain an integer in string form.
# - - - g e t I n t A t t r - - -
def getIntAttr ( node, attrName ):
    """Convert an integer attribute value
      [ (node is a DOM Element node) and
        (attrName is an attribute name as a string) ->
          if node has an attribute named attrName and it
          contains a valid int in string form ->
            return that attribute as an int
          else -> raise IOError ]
    """
    #-- 1 --
    # [ if node has an attribute named attrName ->
    #     rawInt := that attribute's value
    #   else -> raise IOError ]
    rawInt = node.getAttributeNS ( None, attrName )
    if not rawInt:
        raise IOError, ( "Missing %s attribute" % attrName )
    #-- 2 --
    # [ if rawInt is a valid int in string form ->
    #     return int(rawInt)
    #   else -> raise IOError ]
    try:
        result = int ( rawInt )
        return result
    except ValueError:
        raise IOError, ( "%s='%s': value not an int" %
                         (attrName, rawInt) )
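For comparison, here is a minimal Python 3 sketch of the same helper written against xml.etree.ElementTree elements instead of 4Suite DOM nodes. The sample element and its attribute names are hypothetical. Element.get() returns None for an absent attribute, so the same "if not rawInt" test covers both a missing and an empty value.

```python
import xml.etree.ElementTree as ET


def getIntAttr(node, attrName):
    """Return node's attrName attribute as an int, or raise IOError."""
    rawInt = node.get(attrName)      # None when the attribute is absent
    if not rawInt:
        raise IOError("Missing %s attribute" % attrName)
    try:
        return int(rawInt)
    except ValueError:
        raise IOError("%s='%s': value not an int" % (attrName, rawInt))


# Hypothetical sample node: a good height and a malformed width.
node = ET.fromstring('<image high="480" wide="6x0"/>')
high = getIntAttr(node, "high")
```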