
Abstract

This publication is available in Web form and also as a PDF document. Please forward any comments to tcc-doc@nmt.edu.

Table of Contents

1. Requirements
2. Input files
2.1. The Apache access log file format
2.2. The effective host list
3. Filtering the logs
4. Output files
4.1. The root page, index.html
4.2. The hit parade page, byhits.html
4.3. Personal letter pages: pers-C.html
4.4. User report pages: p/username.html and o/dirname.html
5. Operation of tccwebstats
6. Program internals

1. Requirements

Web page authors may be curious about whether anyone is seeing their pages. The purpose of the tccwebstats program is to provide a web page where anyone can see which pages have been visited within the last 30 days.

The output of this program addresses two kinds of questions:

  • Is anyone visiting this page? For this query, we need a list of all the pages sorted by URL, with hit counts for each one.

  • What are the most popular pages on this server? To answer this question, we need a “hit parade” that lists the pages in descending order of hit count.
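As an illustration of the two report orderings, here is a minimal sketch in Python. It is not the code from webstats.py; the function names and the sample URLs are hypothetical, and it assumes the hits are already available as a simple sequence of requested URLs.

```python
from collections import Counter

def count_hits(urls):
    """Tally hits per URL from an iterable of requested URL strings."""
    return Counter(urls)

def by_url(hits):
    """Rows for the 'is anyone visiting this page?' report, sorted by URL."""
    return sorted(hits.items())

def by_hits(hits):
    """Rows for the hit parade: descending hit count, ties broken by URL."""
    return sorted(hits.items(), key=lambda item: (-item[1], item[0]))

# Hypothetical sample data, for illustration only.
sample = ["/a.html", "/b.html", "/a.html", "/c.html", "/a.html", "/c.html"]
hits = count_hits(sample)
```

Both reports draw on the same tally; only the sort key differs. Breaking ties by URL is a choice made for this sketch so that the hit parade order is deterministic.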

Both reports show, for each URL, the percentage of accesses that came from off-site. An access is considered off-site if the client's address is outside the nmt.edu domain (129.138.*).
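The off-site test can be sketched as follows. This is an illustration under stated assumptions, not the program's actual logic: it assumes each access is recorded with a dotted-quad IPv4 client address, it treats 129.138.* as the network 129.138.0.0/16, and the function names are hypothetical.

```python
import ipaddress

# Assumption: "129.138.*" corresponds to this /16 network.
ON_SITE_NETWORK = ipaddress.ip_network("129.138.0.0/16")

def is_offsite(client_ip):
    """Return True when the client address lies outside 129.138.*."""
    return ipaddress.ip_address(client_ip) not in ON_SITE_NETWORK

def offsite_percent(client_ips):
    """Percentage of the given accesses that came from off-site clients."""
    ips = list(client_ips)
    if not ips:
        return 0.0  # no accesses, so nothing to report
    offsite = sum(1 for ip in ips if is_offsite(ip))
    return 100.0 * offsite / len(ips)
```

For example, `offsite_percent(["129.138.4.2", "8.8.8.8"])` classifies one of the two accesses as off-site. Logs that record host names instead of addresses would need a different test.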

At the request of the New Mexico Tech Public Information Office, the program should maintain data covering at least the last month's accesses.
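One way to restrict a report to a recent window is to compare each access time against a cutoff. The sketch below is only an illustration: the function names and the 30-day default are assumptions, and the timestamp layout is the one Apache writes by default in its access logs (for example `10/Oct/2000:13:55:36 -0700`), which may differ from the format on a given server.

```python
from datetime import datetime, timedelta, timezone

# Assumption: timestamps use Apache's default layout, without the brackets.
APACHE_TIME_FORMAT = "%d/%b/%Y:%H:%M:%S %z"

def parse_time(text):
    """Parse a timestamp such as '10/Oct/2000:13:55:36 -0700'."""
    return datetime.strptime(text, APACHE_TIME_FORMAT)

def within_window(when, now, days=30):
    """Return True when `when` falls within the `days` before `now`."""
    return now - timedelta(days=days) <= when <= now

# Hypothetical reference time, for illustration only.
now = datetime(2000, 11, 1, tzinfo=timezone.utc)
recent = within_window(parse_time("10/Oct/2000:13:55:36 -0700"), now)
stale = within_window(parse_time("01/Sep/2000:00:00:00 +0000"), now)
```

Because the parsed timestamps carry their UTC offsets, the comparison stays correct even when the log and the reference time use different time zones.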

For the implementation of this program, including the actual code narrated in lightweight literate programming style, see the webstats.py 4.0: Internal Maintenance Specification.