19. class BaseCompiler: Base class for banding data compilers

This abstract class is used by the various data compilers.

19.1. class BaseCompiler: The interface

Here is the start of the actual class, and its documentation string that describes the interface.

baseclasses.py
# - - - - -   c l a s s   B a s e C o m p i l e r   - - - - -

class BaseCompiler:
    '''Abstract base class for raw banding data compilers.

      Exports:
        BaseCompiler(fileName, stationSet, speciesSet):
          [ (fileName is the name of the raw banding data file) and
            (stationSet is a BaseStationSet object) and
            (speciesSet is a BaseSpeciesSet object) ->
              if fileName conforms to the batch file naming
              scheme and names a readable file ->
                return a new BaseCompiler object with those values
              else ->
                raise ValueError ]
        .fileName:      [ as passed to constructor, read-only ]
        .stationSet:    [ as passed to constructor, read-only ]
        .speciesSet:    [ as passed to constructor, read-only ]
        .location:
          [ if fileName represents a multi-station batch ->
              a Location object representing that batch's location
            else -> None ]
        .station:
          [ if fileName represents a single-station batch ->
              a Station object representing that station
            else -> None ]
        .year:
          [ the batch year as a four-digit string ]
        .__iter__():
          [ Log()  +:=  error messages from processing the contents
                of self.fileName as a batch file, if any
            generate a sequence of BaseEncounter records representing
            valid lines from that file, if any, and send any error
            messages to Log() ]

The class constructor takes three arguments:

  • This class must know the name of the input file because the file name itself contains either a location code or a station number, as well as the field year when the data were taken. The fileName argument must be the path name of the raw input file. It must conform to the batch file naming rules described in the specification.

  • stationSet is an object derived from the BaseStationSet class. This object defines the valid location and station codes.

  • speciesSet is an object derived from the BaseSpeciesSet class that defines the valid species codes.

The constructor returns a new BaseCompiler object. This object is iterable: iterating over it generates the output records as a sequence of BaseEncounter objects. The calling program iterates over the BaseCompiler object in a for-loop, calling the .flatten() method on each generated BaseEncounter object to translate it into a flat-file record that can be written to the output file.

So here's the main-sequence pseudocode of a typical compiler:

stationSet = StationSetYYYY(station authority file)
speciesSet = SpeciesSetYYYY(species authority file)
compiler = Compiler(raw file name, stationSet, speciesSet)
for encounter in compiler:
    write encounter.flatten()

If the named file isn't readable, or contains errors, all messages are sent to the Log() singleton.
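The main sequence above can be sketched in runnable form with stand-in classes. Everything in this sketch is hypothetical: StubEncounter, StubCompiler, and compileBatch are illustration names, the "three fields per line" validity rule is arbitrary, the tab-joined output is not the real flat-file layout, and a plain list stands in for the Log() singleton. It shows only the shape of the interface: an object whose .__iter__() generates encounters for valid lines and reports the rest as error messages.

```python
class StubEncounter:
    """Hypothetical stand-in for BaseEncounter; only .flatten() is modeled."""
    def __init__(self, fields):
        self.fields = fields

    def flatten(self):
        # The real flat-file layout comes from the specification;
        # here the fields are simply joined with tabs.
        return "\t".join(self.fields)


class StubCompiler:
    """Hypothetical stand-in for a concrete compiler class."""
    def __init__(self, rawLines, errors):
        self.rawLines = rawLines
        self.errors = errors      # Stand-in for the Log() singleton

    def __iter__(self):
        # Generator method: each valid line yields one encounter;
        # each invalid line produces an error message instead.
        for lineNo, line in enumerate(self.rawLines, start=1):
            fields = line.split()
            if len(fields) == 3:  # Arbitrary validity rule for illustration
                yield StubEncounter(fields)
            else:
                self.errors.append("line %d: expected 3 fields" % lineNo)


def compileBatch(rawLines):
    """Return (flat records, error messages) for the given raw lines."""
    errors = []
    records = [enc.flatten() for enc in StubCompiler(rawLines, errors)]
    return records, errors
```

Because the caller sees only the valid records, a bad line does not interrupt the for-loop; its message simply accumulates alongside the output.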

Additional exported attributes include:

  • Two attributes named .location and .station track the type of batch file: multi-station or single-station.

    For batch files named for a location, the former is set to the corresponding Location object, and the latter to None.

    For single-station batch files, .location is None and the .station attribute is set to a Station object representing that batch's station.

  • The .year attribute, a four-character string of the form "YYYY", is set from the name of the file.
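Since exactly one of .location and .station is set for any batch, a caller can tell the two batch types apart by testing which one is None. The helper below is a minimal sketch of that test. The name batchKind is hypothetical and not part of the package, and SimpleNamespace objects stand in for real compiler instances.

```python
from types import SimpleNamespace


def batchKind(compiler):
    """Classify a batch by which of .location and .station is set.

    Hypothetical helper: relies only on the documented rule that one
    of the two attributes is None and the other is not.
    """
    if compiler.location is not None and compiler.station is None:
        return "multi-station"
    if compiler.station is not None and compiler.location is None:
        return "single-station"
    raise ValueError("exactly one of .location and .station must be set")


# Stand-ins for compiler objects; the attribute values are placeholders.
multi = SimpleNamespace(location="some Location", station=None, year="2005")
single = SimpleNamespace(location=None, station="some Station", year="2005")
```

Calling batchKind(multi) returns "multi-station" and batchKind(single) returns "single-station".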