6. Design notes

Before we examine the actual noteweb script, a few comments on data structures and algorithms are in order.

Because of the need for navigational links between pages, we can't just go out and find monthly XML files and immediately convert them to HTML. Each monthly page must have Next and Previous navigational links. So, when we build a monthly page, we need to know which month (if any) comes before it in sequence, and which (if any) comes after it. There is no guarantee that every month has a valid input file. There might even be years with no valid input files.
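To make the gap problem concrete, here is a minimal sketch of how the Previous and Next links might be worked out once the set of valid months is known. The function name and the use of (year, month) tuples as keys are assumptions made for illustration; they are not taken from the noteweb script.

```python
def neighbors(valid_months):
    """Map each (year, month) key to its (previous, next) keys.

    Either side is None when there is no earlier or later valid month.
    """
    ordered = sorted(valid_months)      # tuples sort chronologically
    links = {}
    for i, key in enumerate(ordered):
        prev_key = ordered[i - 1] if i > 0 else None
        next_key = ordered[i + 1] if i + 1 < len(ordered) else None
        links[key] = (prev_key, next_key)
    return links

# Hypothetical data: 2005 has no valid files at all.
months = [(2004, 11), (2006, 2), (2004, 12), (2006, 5)]
links = neighbors(months)
```

With this data, the Next link for December 2004 skips the empty year and points to February 2006, which is why the full set of valid months has to be known before any page is written.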

Therefore, the first thing we have to do is read all the XML files, parsing each one into a birdnotes.BirdNoteSet instance. Then we can work through these instances, converting each to an HTML page in the same subdirectory where we found the XML input file. (Note that keeping all these BirdNoteSet instances around may eat up a lot of memory. If that is ever a problem, we'll just have to make two passes: one to see which files are valid, and another to render them, so that we don't have to keep the entire data set in memory at once.)
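The read-everything-then-render approach might be sketched roughly as follows. Everything here is an assumption for illustration: the function names are invented, and parse and render are hypothetical callables standing in for whatever builds a BirdNoteSet instance and whatever produces the HTML text. The real birdnotes interface is not shown.

```python
import os

def load_all(root, parse):
    """Phase 1: parse every .xml file under root, keeping only valid ones."""
    loaded = []
    for dirpath, dirnames, filenames in os.walk(root):
        dirnames.sort()                 # visit subdirectories in name order
        for name in sorted(filenames):
            if not name.endswith(".xml"):
                continue
            path = os.path.join(dirpath, name)
            try:
                loaded.append((path, parse(path)))
            except Exception:
                # Assumption: an invalid input file is skipped, not fatal.
                continue
    return loaded

def render_all(loaded, render):
    """Phase 2: write each page into the directory of its XML source."""
    written = []
    for path, note_set in loaded:
        html_path = os.path.splitext(path)[0] + ".html"
        with open(html_path, "w") as out:
            out.write(render(note_set))
        written.append(html_path)
    return written
```

If memory ever became a problem, load_all could record only the paths of the valid files, and render_all could parse each file again just before writing its page.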

We must also generate the index page, with a table of links to all the months. Each row in this table corresponds to one year and contains links to the months of that year. There is, however, no guarantee that years are contiguous. We just look to see what year directories are present, and that determines the set of table rows.
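As a rough illustration of how the table rows could be derived, the sketch below builds one twelve-slot row per year directory. It assumes that year directories have four-digit names and that valid months are identified by (year, month) tuples; neither detail is stated above, and the function name is invented.

```python
def index_rows(dir_names, valid_months):
    """Return a sorted list of (year, slots), one entry per year directory.

    slots is a list of twelve booleans: True where that month has a page.
    """
    rows = {}
    for name in dir_names:
        # Assumption: a year directory has a four-digit name such as "2004".
        if len(name) == 4 and name.isdecimal():
            rows[int(name)] = [False] * 12
    for year, month in valid_months:
        if year in rows:
            rows[year][month - 1] = True
    return sorted(rows.items())

# Hypothetical directory listing: no "2005" directory, so no 2005 row.
table = index_rows(["2006", "misc", "2004"], [(2004, 11), (2006, 2)])
```

Only the directories that exist produce rows, so a missing year simply does not appear in the table.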

The above conditions suggest a data structure made from instances of three classes:

  1. One YearCollection instance contains everything we need to build the index page.

    Because this instance contains all the input data, it can figure out which months are the Previous and Next navigational links for a given month.

  2. The YearCollection instance is a container for YearRow instances, one for each year for which there is an input data directory.

    Each YearRow instance has all the information needed to build one row of the index table.

  3. Each YearRow instance is a container for up to twelve MonthCell instances.

    Each MonthCell instance has all the information about one month for which there is an input XML file, and has everything needed to build the monthly HTML page.
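The containment relationship among the three classes might look something like the skeleton below. Only the class names come from the description above; every attribute and method name is a hypothetical choice for illustration, and note_set stands in for the parsed BirdNoteSet instance.

```python
class MonthCell:
    """One month for which there is a valid input file."""
    def __init__(self, year, month, note_set):
        self.year = year
        self.month = month
        self.note_set = note_set    # stands in for a BirdNoteSet

class YearRow:
    """One row of the index table: up to twelve MonthCell instances."""
    def __init__(self, year):
        self.year = year
        self.cells = {}             # month number (1-12) -> MonthCell

class YearCollection:
    """Container for all the YearRow instances."""
    def __init__(self):
        self.rows = {}              # year -> YearRow

    def add_month(self, year, month, note_set):
        row = self.rows.setdefault(year, YearRow(year))
        row.cells[month] = MonthCell(year, month, note_set)

    def months(self):
        """All MonthCell instances in chronological order."""
        return [self.rows[y].cells[m]
                for y in sorted(self.rows)
                for m in sorted(self.rows[y].cells)]

    def neighbors(self, cell):
        """Return the (previous, next) cells for cell; None at either end."""
        seq = self.months()
        i = seq.index(cell)
        prev_cell = seq[i - 1] if i > 0 else None
        next_cell = seq[i + 1] if i + 1 < len(seq) else None
        return prev_cell, next_cell
```

Because the collection holds every row, and each row holds its cells, the collection is the natural place to answer the Previous and Next question for any month.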