6. Design notes

Before we examine the actual noteweb script, a few comments on data structures and algorithms are in order.

Because of the need for navigational links between pages, we can't just go out and find monthly XML files and immediately convert them to HTML. Each monthly page must have Next and Previous navigational links. So, when we build a monthly page, we need to know which month (if any) was the previous one in sequence, and which is the next in sequence. There is no guarantee that every month has a valid input file. There might even be years with no valid input files.

Therefore, the first thing we have to do is read all the XML files, converting each one into a birdnotes.BirdNoteSet instance. Then we can work through these instances, rendering each as an HTML page in the same subdirectory where we found the XML input file. (Note that keeping all these BirdNoteSet instances around may eat up a lot of memory. If that is ever a problem, we'll just have to make two passes: one to see which files are valid, and another to render them, so that we don't have to keep the entire data set in memory at once.)
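The two-pass fallback mentioned in the parenthetical might look roughly like the sketch below. It is not taken from the noteweb script. The function names, the (year, month) keys, and the load and render callables are all hypothetical, and it assumes that an invalid input file makes the loader raise ValueError.

```python
def valid_keys(candidates, load):
    """Pass 1: find out which inputs load, without keeping the data."""
    keys = []
    for key in sorted(candidates):
        try:
            load(key)
        except ValueError:
            # Assumption: an invalid input file raises ValueError.
            continue
        keys.append(key)
    return keys


def render_pages(keys, load, render):
    """Pass 2: reload one input at a time and render it.

    Because the full list of valid keys is already known, each page
    can be told which key precedes it and which follows it.
    """
    pages = []
    for i, key in enumerate(keys):
        prev_key = keys[i - 1] if i > 0 else None
        next_key = keys[i + 1] if i + 1 < len(keys) else None
        pages.append(render(load(key), prev_key, next_key))
    return pages
```

The price of this variant is that every valid file is parsed twice. In exchange, only one loaded data set needs to be in memory at a time.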

We must also generate the index page, with a table of links to all the months. Each row in this table contains all the months of that year. There is, however, no guarantee that years are contiguous. We just look to see what year directories are present, and that determines the set of table rows.

The above conditions suggest a data structure made from instances of three classes:

  1. One YearCollection instance contains everything we need to build the index page.

    Because this instance contains all the input data, it can determine which months the Previous and Next navigational links of a given month should point to.

  2. The YearCollection instance is a container for YearRow instances, one for each year for which there is an input data directory.

    Each YearRow instance has all the information needed to build one row of the index table.

  3. Each YearRow instance is a container for up to twelve MonthCell instances.

    Each MonthCell instance has all the information about one month for which there is an input XML file, and has everything needed to build the monthly HTML page.
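As a rough illustration of this containment, here is a minimal sketch of the three classes. Only the class names come from the notes above. The attribute names, the add() methods, and the use of dictionaries are assumptions made for the example, and the notes attribute merely stands in for a birdnotes.BirdNoteSet instance.

```python
class MonthCell:
    """One month for which there is a valid input file."""

    def __init__(self, month, notes):
        self.month = month    # month number as a string, e.g. '04'
        self.notes = notes    # stands in for a birdnotes.BirdNoteSet


class YearRow:
    """One row of the index table: up to twelve MonthCell instances."""

    def __init__(self, year):
        self.year = year
        self.cells = {}       # month number -> MonthCell

    def add(self, cell):
        self.cells[cell.month] = cell


class YearCollection:
    """Top-level container: one YearRow per year directory."""

    def __init__(self):
        self.rows = {}        # year -> YearRow

    def add(self, year, cell):
        # Create the row the first time a year is seen.
        if year not in self.rows:
            self.rows[year] = YearRow(year)
        self.rows[year].add(cell)
```

Years with no entries simply never get a row, which matches the observation that years need not be contiguous.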

6.1. Discarded approaches

The first cut at an overall data structure was a list named yearList containing YearRow instances. Each YearRow would be a container for BirdNoteSet instances, one per valid month.

However, there are certain things we need to know about the months, such as the month number (e.g., '04'). So the MonthCell class was invented, with an instance holding one BirdNoteSet and ancillary information such as the month number and the file name of the month page. Each YearRow would then be a container for MonthCell instances, one per valid month.

The next problem was connecting up the navigation links between month pages. Clearly, when rendering a month page, we must know the URL of the Previous month page (if any) and the Next month page (if any). However, how does the MonthCell instance know these URLs? Three approaches were considered:

  • Let the MonthCell class have attributes that hold the previous/next links; initialize them to None.

    Then, after all the MonthCell instances are created, make a serial pass through them and link them up into a bidirectional linked list.

    Finally, render each MonthCell into HTML in any old order, using the stored previous/next attributes to set up its navigation.

    The aesthetic objection to this approach was that the MonthCell objects depend on an external mechanism to set up their linking. Logically, the information required to find a month's neighbors should reside at a higher organizational level.

  • Define a global function called something like neighbors() that finds the previous/next neighbors, given a year and month. This function would walk through the yearList, starting at the given month, and working backwards and forwards to find the nearest neighbors.

    The objection to this approach is that the yearList must then be global, or be passed around through many levels.

  • At this point, the author felt that a third class was called for: YearCollection, which manages the overall sequence of years and months.

    • It can traverse the years in reverse chronological order to build the index table with the most recent years first.

    • It is the obvious place for logic that can find the neighbors of any month.
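The two responsibilities just listed could be sketched as follows. This is a self-contained illustration, not the script's actual code. The method names and the nested-dictionary storage are hypothetical, and it assumes that years are four-digit strings and months are two-digit strings such as '04', so that plain string sorting gives chronological order.

```python
class YearCollection:
    """Sketch: manages the overall sequence of years and months."""

    def __init__(self):
        # year string -> {month string -> month object}
        self._years = {}

    def add(self, year, month, cell):
        self._years.setdefault(year, {})[month] = cell

    def years_newest_first(self):
        """Years in reverse chronological order, for the index table."""
        return sorted(self._years, reverse=True)

    def _sequence(self):
        """All (year, month) keys in chronological order."""
        return [(y, m)
                for y in sorted(self._years)
                for m in sorted(self._years[y])]

    def neighbors(self, year, month):
        """Return (previous, next) keys; each is None at an end.

        Raises ValueError if the given month is not in the collection.
        """
        seq = self._sequence()
        i = seq.index((year, month))
        prev_key = seq[i - 1] if i > 0 else None
        next_key = seq[i + 1] if i + 1 < len(seq) else None
        return prev_key, next_key
```

Because the search runs over the flattened sequence, gaps are skipped automatically: the neighbor of the first valid month in one year is the last valid month of the nearest earlier year that has any.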