7. Main program

This program was developed using the Cleanroom, or zero-defect, software methodology. For more details, see The Cleanroom software development methodology.

The webstats.py program has this overall intended function:

webstats.py
# - - - - -   w e b s t a t s . p y   - -   m a i n

def main():
    '''Generate reports on Web page access at this Apache server.

      [ access-logs are readable ->
          index-page  :=  index-content(access-logs)
          hit-parade-page  :=  hits-content(access-logs)
          letter-pages(access-logs)  :=  letter-content(access-logs)
          personal-reports(access-logs) := personal-content(access-logs)
          official-reports(access-logs) := official-content(access-logs)
          sys.stderr  +:=  (version-greeting) + (error messages
              about invalid lines in access-logs, if any) ]
    '''

Here are some verification functions that clarify and subdivide the semantics of this intended function. Refer to the specification for an overview of the generated pages; the names below are only notational shorthand for the parts of the generated structure.

webstats.py
#================================================================
# Verification functions
#----------------------------------------------------------------
# access-logs  ==  Apache access_log file in /var/log/httpd
#     and any files named "access_log.[1-5]" in that directory
#----------------------------------------------------------------
# index-page  ==  a page at INDEX_WEB_PATH
#----------------------------------------------------------------
# index-content(logs) ==
#   (summary of total hits in logs) +
#   (link to hit-parade-page) +
#   (official-report for "/" in logs) +
#   (links to letter-pages(logs)) +
#   (links to official-reports(logs))
#----------------------------------------------------------------
# hit-parade-page == a page at HITS_WEB_PATH
#----------------------------------------------------------------
# hits-content(logs) ==
#   a report showing all pages with at least HITS_CUTOFF hits in
#   logs, in descending order by total hits, with URL as a
#   secondary key
#----------------------------------------------------------------
# letter-pages(logs) ==
#   pages at (LETTER_PREFIX + c + HTML_EXT) relative to
#   OUT_WEB_PATH, one for each unique first letter (c) in the
#   personal page hits from logs
#----------------------------------------------------------------
# letter-content(logs) ==
#   for each unique first letter (c) in the personal page hits
#   from logs, links to the personal-reports for pages whose
#   users start with (c)
#----------------------------------------------------------------
# personal-reports(logs) ==
#   pages at (username + HTML_EXT) relative to
#   PERSONAL_WEB_PATH for every personal username found in logs
#----------------------------------------------------------------
# personal-content(logs) ==
#   one report for each personal account appearing in logs,
#   showing all URLs for that account in ascending order by URL
#----------------------------------------------------------------
# official-reports(logs) ==
#   pages at (dirname + HTML_EXT) relative to
#   OFFICIAL_WEB_PATH for every official directory name found
#   in logs
#----------------------------------------------------------------
# official-content(logs) ==
#   one report for each official directory appearing in logs,
#   showing all URLs for that directory in ascending order by URL
#----------------------------------------------------------------
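
To make the ordering in hits-content concrete, here is a minimal sketch of the filter and two-level sort. The function name hit_parade, the sample data, and the value given to HITS_CUTOFF are hypothetical, and we assume that the secondary URL key sorts in ascending order; the program's own report code lives elsewhere.

```python
# Hypothetical stand-in for the program's HITS_CUTOFF constant.
HITS_CUTOFF = 3

def hit_parade(hit_counts, cutoff=HITS_CUTOFF):
    """Return (url, hits) pairs having at least `cutoff` hits, in
    descending order by hits, then ascending order by URL.
    """
    popular = [(url, hits) for url, hits in hit_counts.items()
               if hits >= cutoff]
    # Negating the count sorts hits in descending order; the URL,
    # as the second element of the key, breaks ties in ascending order.
    return sorted(popular, key=lambda pair: (-pair[1], pair[0]))

# Hypothetical sample data: URL -> total hits.
sample = {"/a.html": 5, "/b.html": 9, "/c.html": 5, "/d.html": 1}
print(hit_parade(sample))
```

Building a composite key keeps the whole ordering in a single call to sorted(), at the price of assuming the primary key is numeric so that it can be negated.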

First we write a banner message to the standard error stream, and compute the starting and ending times of the report interval. For the definition of the report interval, see Section 6.1, “EXPIRE_DAYS”.

webstats.py
    #-- 1 --
    # [ sys.stderr  +:=  greeting message and timestamp
    #   now  :=  the current time as a datetime.datetime
    #   cutoffTime  :=  a time EXPIRE_DAYS days before the
    #       current time as a datetime.datetime ]
    message ( "== %s %s == %s +0000\n" %
              ( PRODUCT_NAME, EXTERNAL_VERSION,
                datetime.datetime.utcnow().strftime ( DATE_FORMAT ) ) )
    utcZone  =  FixedTimeZone(0, "UTC")
    thirtyDays  =  datetime.timedelta( EXPIRE_DAYS )
    now  =  datetime.datetime.now(utcZone)
    cutoffTime  =  now - thirtyDays
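
The FixedTimeZone class is defined elsewhere in this program. For readers who want to try the interval arithmetic on its own, here is a minimal sketch of what such a class might look like. We assume that its first argument is an offset in hours, and that EXPIRE_DAYS is 30 (as the variable name thirtyDays suggests); neither assumption is confirmed by this section.

```python
import datetime

EXPIRE_DAYS = 30   # Assumed value; the real one is declared in Section 6.1.

class FixedTimeZone(datetime.tzinfo):
    """Hypothetical stand-in: a time zone with a constant UTC offset."""
    def __init__(self, hours, name):
        self._offset = datetime.timedelta(hours=hours)
        self._name = name

    def utcoffset(self, dt):
        return self._offset

    def tzname(self, dt):
        return self._name

    def dst(self, dt):
        # No daylight saving time in a fixed-offset zone.
        return datetime.timedelta(0)

utcZone = FixedTimeZone(0, "UTC")
now = datetime.datetime.now(utcZone)
# A single positional argument to timedelta is a number of days.
cutoffTime = now - datetime.timedelta(EXPIRE_DAYS)
print("Report interval: %s through %s" % (cutoffTime, now))
```

Because both values carry time zone information, they can be compared safely against timestamps that were recorded in other zones.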

Reading the log files and pouring their relevant records into the AccessSummary instance is handled in Section 8, “inputPhase(): Read the access logs”.

webstats.py
    #-- 2 --
    # [ access-logs are readable ->
    #     accessSummary  :=  a new AccessSummary instance
    #         containing all the relevant records from those logs ]
    accessSummary  =  inputPhase(cutoffTime, now)
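
The actual logic of inputPhase() is presented in Section 8. As a rough illustration of why it needs both cutoffTime and now, the sketch below tests whether one Apache timestamp falls within the report interval. The function name in_report_interval is hypothetical, and we assume a record qualifies when its timestamp lies between the two bounds, inclusive. The sketch is written for Python 3, whose strptime() understands the %z directive.

```python
import datetime

# Timestamp layout used in Apache's Common Log Format,
# for example "10/Oct/2000:13:55:36 -0700".
APACHE_TIME_FORMAT = "%d/%b/%Y:%H:%M:%S %z"

def in_report_interval(stamp, cutoffTime, now):
    """True if an Apache timestamp string lies in [cutoffTime, now]."""
    when = datetime.datetime.strptime(stamp, APACHE_TIME_FORMAT)
    # Both bounds and `when` are time-zone-aware, so the comparison
    # is correct even when their UTC offsets differ.
    return cutoffTime <= when <= now

# Hypothetical interval: the 30 days ending 2000-10-31 00:00 UTC.
now = datetime.datetime(2000, 10, 31, tzinfo=datetime.timezone.utc)
cutoffTime = now - datetime.timedelta(30)
print(in_report_interval("10/Oct/2000:13:55:36 -0700", cutoffTime, now))
print(in_report_interval("10/Aug/2000:13:55:36 -0700", cutoffTime, now))
```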

Next we create a tccpage2.TCCPage instance that will hold the index page. See Section 2.3, “Navigational considerations” for remarks on the navigational features. See also Section 6.3, “Web paths” and Section 6.5, “Report label text”.

webstats.py
    #-- 3 --
    # [ indexPage  :=  a new tccpage2.TCCPage instance with a "Tech
    #       Computer Center" navigational link ]
    navList  =  [
        tccpage2.NavLink ( "Next", [] ),
        tccpage2.NavLink ( "Previous", [] ),
        tccpage2.NavLink ( "Tech Computer Center",
          [ ("Tech Computer Center", TCC_WEB_PATH.url) ] ) ]
    indexPage  =  tccpage2.TCCPage ( INDEX_PAGE_TITLE, navList,
                                     url=INDEX_WEB_PATH.url )

The buildAllPages() function adds all content to this page and generates all the subpages.

webstats.py
    #-- 4 --
    # [ indexPage  +:=  index-content(accessSummary)
    #   hit-parade-page  :=  hits-content(accessSummary)
    #   letter-pages(accessSummary)  :=  letter-content(accessSummary)
    #   personal-reports(accessSummary)  := 
    #       personal-content(accessSummary)
    #   official-reports(accessSummary)  :=
    #       official-content(accessSummary) ]
    buildAllPages ( indexPage, accessSummary )

Finally, write the content of the index page. For the declaration of INDEX_WEB_PATH, see Section 6.3, “Web paths”.

webstats.py
    #-- 5 --
    # [ file INDEX_WEB_PATH  :=  indexPage, serialized as XHTML ]
    try:
        indexFile  =  open ( INDEX_WEB_PATH.absPath, "w" )
    except IOError as detail:
        fatal ( "Can't open the index page '%s': %s" %
                (INDEX_WEB_PATH.absPath, detail) )
    indexPage.write ( indexFile )
    indexFile.close()
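
The same open-or-die pattern can also be written with a with statement, which closes the file even if the write fails partway. The sketch below is not the program's code: writePage is a hypothetical helper, and the fatal() shown here is a stand-in that assumes the real function writes a message to sys.stderr and terminates the program. Unlike the code above, it also treats a failure during the write itself as fatal.

```python
import os
import sys
import tempfile

def fatal(text):
    """Hypothetical stand-in: report an error and stop the program."""
    sys.stderr.write("*** Error: %s\n" % text)
    raise SystemExit(1)

def writePage(path, content):
    """Write content to the file at path; stop with a message on failure."""
    try:
        # The with statement closes outFile on both success and error.
        with open(path, "w") as outFile:
            outFile.write(content)
    except IOError as detail:
        fatal("Can't write the page '%s': %s" % (path, detail))

# Usage: write a tiny page into a scratch directory.
demoPath = os.path.join(tempfile.mkdtemp(), "index.html")
writePage(demoPath, "<html></html>\n")
```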