
7.2. The user database

The whole purpose of cookies in this application is to remember which chapter the user was reading, so we can take them back there next time.

We could put that information right in the cookie, and update the cookie on each visit. However, this isn't very realistic. Because sites that use cookies often pass sensitive data to CGI scripts, it's unsafe to place user data right in the cookie. The preferred technique is to generate the cookie value from random characters, and then use that value as a key to look up the real user information, which is stored elsewhere.
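As a rough illustration of the random-key idea, here is a minimal sketch in Python 3. The function name new_cookie_value and the in-memory users dictionary are hypothetical stand-ins chosen for this example, not part of the application itself.

```python
import secrets

def new_cookie_value(n_bytes=16):
    """Return a random, hard-to-guess cookie value as a hex string."""
    # token_hex(n) yields 2*n hexadecimal characters.
    return secrets.token_hex(n_bytes)

# A plain dictionary stands in for the real user store in this sketch.
users = {}

cookie = new_cookie_value()
users[cookie] = {"chapter": 3}   # the real data lives here, not in the cookie
```

The cookie sent to the browser carries only the random string; anything worth protecting stays in the lookup table.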

Here is what we need to remember for each user: their current chapter number and their “persistent flag.”

For storing and retrieving small amounts of data, Python's standard gdbm module (named dbm.gnu in Python 3) is a simple, appropriate tool. A gdbm file works like a Python dictionary: you give it a key and a value, and later you can give it the same key and get back the same value. Unlike a dictionary, both key and value must be strings.

With gdbm, we can just make each user's cookie the key, and keep the user's values in string form.
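The dictionary-like behavior can be sketched as follows, assuming Python 3. The sketch calls the generic dbm.open, which selects whichever backend is installed (dbm.gnu when gdbm is available), so it should run even without gdbm. The file name and the cookie value "a1b2c3" are made up for the example.

```python
import dbm
import os
import tempfile

# Put the database in a scratch directory so the sketch leaves no clutter.
path = os.path.join(tempfile.mkdtemp(), "users")

# Flag "c" opens the file for reading and writing, creating it if needed.
with dbm.open(path, "c") as db:
    db["a1b2c3"] = "4"            # hypothetical cookie -> value, both strings

# Reopen read-only and fetch the value by the same key.
with dbm.open(path, "r") as db:
    stored = db["a1b2c3"]         # Python 3 returns the value as bytes
    chapter = stored.decode("utf-8")
```

Note that in Python 3 the stored value comes back as a bytes object, so it has to be decoded before it is treated as text.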

There is a subtle problem with using a gdbm database in this way: database bloat. Over time, if many users visit the system, the file may get quite large. We need some way to discard entries after a reasonable period. In this example application, we'll limit the lifetime of active cookies to a week, but in a production application, you'll want to think about how long you let a user stay around before you forget their information.

Hence, in each user's database entry, in addition to their current chapter number and their “persistent flag,” we'll encode a third item: an expiration timestamp. Every time our script runs, we will go through all the entries in the database, and discard the ones that have expired. (In a production application where you are dealing with thousands of users, you might do this database cleaning in a separate program to be run periodically.)
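The cleanup pass might look something like the sketch below. It assumes, purely for illustration, that the expiration timestamp is stored as the last comma-separated field of each value, in seconds since the epoch. The function name purge_expired is hypothetical, and a plain dictionary stands in for the database in the usage lines.

```python
import time

def purge_expired(db, now=None):
    """Delete every entry whose expiration time has passed; return the count."""
    if now is None:
        now = time.time()
    expired = []
    # Collect the keys first so we never delete while iterating.
    for key in list(db.keys()):
        value = db[key]
        if isinstance(value, bytes):          # dbm files return bytes
            value = value.decode("utf-8")
        expires_at = float(value.rsplit(",", 1)[-1])
        if expires_at <= now:
            expired.append(key)
    for key in expired:
        del db[key]
    return len(expired)

# Usage with a dictionary standing in for the database file.
db = {"old": "3,0,100", "fresh": "7,1,5000"}
removed = purge_expired(db, now=1000)
```

Because every key is visited on every run, the cost grows with the number of stored users, which is why a large site would move this step into a separate periodic job.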

Since gdbm requires all keys and values to be strings, we need a string form for each entry. The cookie is a suitable key just as it is. As for the user's data, we'll encode the value like this:

C,P,T
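One way to build and parse such a value is sketched below. It assumes C is the chapter number, P is the persistent flag written as 1 or 0, and T is the expiration timestamp in whole seconds since the epoch; the digit encoding of the flag, the helper names, and the COOKIE_LIFETIME constant are assumptions made for this example.

```python
COOKIE_LIFETIME = 7 * 24 * 60 * 60    # one week, in seconds

def encode_entry(chapter, persistent, expires_at):
    """Pack the three fields into a 'C,P,T' string."""
    return "%d,%d,%d" % (chapter, 1 if persistent else 0, int(expires_at))

def decode_entry(value):
    """Unpack a 'C,P,T' string (or bytes) into (chapter, persistent, expires_at)."""
    if isinstance(value, bytes):       # dbm files return bytes in Python 3
        value = value.decode("utf-8")
    c, p, t = value.split(",")
    return int(c), p == "1", int(t)

# A user on chapter 4, persistent, whose cookie was issued at time 1000.
entry = encode_entry(4, True, 1000 + COOKIE_LIFETIME)
```

Since none of the three fields can contain a comma, a simple split is enough to recover them.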