Before we examine the actual noteweb script, a few comments on data structures and algorithms are in order.
Because of the need for navigational links between pages, we can't just go out and find monthly XML files and immediately convert them to HTML. Each monthly page must have "previous" and "next" navigational links. So, when we build a monthly page, we need to know which month (if any) was the previous one in sequence, and which (if any) is the next in sequence. There is no guarantee that every month has a valid input file. There might even be years with no valid input files.
Therefore, the first thing we have to do is read all the
XML files, rendering each one into a birdnotes.BirdNoteSet instance. Then we can work
through these instances, converting each to an HTML page in
the same subdirectory where we found the XML input file.
(Note that keeping all these BirdNoteSet
instances around may eat up a lot of memory. If that is
ever a problem, we'll just have to make two passes: one to
see which files are valid, and another to render them,
so that we don't have to keep the entire data set in memory
at once.)
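The read-everything-first approach described above can be sketched roughly as follows. This is only an illustration, not the actual noteweb code: the names load_all and render_all are hypothetical, the parse and render arguments stand in for whatever builds a BirdNoteSet and writes a page, and we assume that an invalid file makes parse raise an exception and that each HTML page takes the base name of its XML file.

```python
import os

def load_all(paths, parse):
    """Read every candidate file, keeping only those that parse.

    `parse` is a stand-in (assumption) for whatever turns one XML
    file into a BirdNoteSet; it is assumed to raise on bad input.
    """
    loaded = {}
    for path in paths:
        try:
            loaded[path] = parse(path)
        except Exception:
            # Invalid input: this month gets no page and no links.
            continue
    return loaded

def render_all(loaded, render):
    """Write one HTML page next to each XML file that loaded."""
    for path, note_set in loaded.items():
        # Assumption: the page reuses the XML file's base name.
        html_path = os.path.splitext(path)[0] + ".html"
        render(note_set, html_path)
```

Only after load_all has finished do we know the full set of valid months, which is what the navigational links depend on.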
We must also generate the index page, with a table of links to all the months. Each row in this table contains all the months of one year. There is, however, no guarantee that years are contiguous. We just look to see what year directories are present, and that determines the set of table rows.
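Finding the table rows by looking at which year directories exist might look something like the sketch below. The function name find_year_dirs is hypothetical, and the assumption that a year directory is named with exactly four digits is ours for the sake of the example.

```python
import os
import re

# Assumption: a year directory is named with exactly four digits.
YEAR_PAT = re.compile(r"\d{4}")

def find_year_dirs(base_dir):
    """Return the names of the year directories under base_dir, sorted."""
    return sorted(
        name
        for name in os.listdir(base_dir)
        if YEAR_PAT.fullmatch(name)
        and os.path.isdir(os.path.join(base_dir, name))
    )
```

Because the result comes from what is actually on disk, gaps between years simply produce no row.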
The above conditions suggest a data structure made from instances of three classes:
One YearCollection instance contains
everything we need to build the index page.
Because this instance contains all the input data, it can figure out which months are the targets of the "previous" and "next" navigational links for a given month.
The YearCollection instance is a
container for YearRow instances, one for
each year for which there is an input data directory.
Each YearRow instance has all the
information needed to build one row of the index table.
Each YearRow instance is a container for
up to twelve MonthCell instances.
Each MonthCell instance has all the
information about one month for which there is an input
XML file, and has everything needed to build the
monthly HTML page.
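To make the containment relationships concrete, here is a minimal sketch of the three classes. The class names come from the description above, but every attribute and method name (add_month, all_months, neighbors, and so on) is an assumption made for illustration; the real classes may be organized quite differently.

```python
class MonthCell:
    """One month that has a valid input file."""
    def __init__(self, year, month, note_set):
        self.year = year
        self.month = month          # 1..12
        self.note_set = note_set    # e.g. a BirdNoteSet instance

class YearRow:
    """One row of the index table: up to twelve MonthCell instances."""
    def __init__(self, year):
        self.year = year
        self.months = {}            # month number -> MonthCell

class YearCollection:
    """Container for all YearRow instances."""
    def __init__(self):
        self.rows = {}              # year -> YearRow

    def add_month(self, year, month, note_set):
        if year not in self.rows:
            self.rows[year] = YearRow(year)
        cell = MonthCell(year, month, note_set)
        self.rows[year].months[month] = cell
        return cell

    def all_months(self):
        """All MonthCell instances in chronological order."""
        cells = []
        for year in sorted(self.rows):
            row = self.rows[year]
            for month in sorted(row.months):
                cells.append(row.months[month])
        return cells

    def neighbors(self, year, month):
        """Return (previous, next) cells; either may be None."""
        cells = self.all_months()
        keys = [(c.year, c.month) for c in cells]
        i = keys.index((year, month))
        prev_cell = cells[i - 1] if i > 0 else None
        next_cell = cells[i + 1] if i + 1 < len(cells) else None
        return prev_cell, next_cell
```

Note that neighbors skips over missing months and missing years alike, since it only ever looks at cells that were actually added.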