6. Source code for PathInfo

Here is the source code for the pathinfo.py module containing the PathInfo class.

6.1. Prologue

The code for the pathinfo.py module starts with the conventional Python documentation string.

pathinfo.py
"""pathinfo.py:  Object to represent a snapshot of a file's status.

  For documentation in "literate programming" style, see:
    http://www.nmt.edu/help/lang/python/examples/pathinfo/
"""

Next comes the importation of the standard Python modules we use:

  • The os module provides access to POSIX file system operations and other operating system functions.

  • The stat module supplies additional declarations needed for interpreting the information that comes out of the os.stat() and os.lstat() functions.

  • We use the time module for translating and formatting file timestamps. The datetime module, available in Python 2.3 and later, is now the recommended choice, but we use the older time module out of consideration for those with older Python installations.

pathinfo.py
#================================================================
# Imports
#----------------------------------------------------------------

import os, stat, time

6.2. Module constants

We define one manifest constant named TIME_FORMAT that controls the formatting of time tuples into textual form. This is used in the PathInfo.__str__() method. It is exported by this module so that other classes that inherit from PathInfo can format other timestamps in the same way. The output is in the form "yyyy-mm-dd hh:mm:ss".

pathinfo.py
#================================================================
# Manifest constants
#----------------------------------------------------------------

TIME_FORMAT  =  "%Y-%m-%d %H:%M:%S"
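
As a quick illustration of what TIME_FORMAT produces, the sketch below formats epoch time zero. It uses time.gmtime() rather than local time only so that the result does not depend on the reader's time zone; the name stamp is ours and is not part of pathinfo.py.

```python
import time

TIME_FORMAT = "%Y-%m-%d %H:%M:%S"

# Epoch time 0, expressed as a UTC time tuple so the result
# is the same in every time zone.
stamp = time.strftime(TIME_FORMAT, time.gmtime(0))
print(stamp)    # 1970-01-01 00:00:00
```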

6.3. class PathInfo

The rest of the code in this module is inside the PathInfo class. We state here in the class's documentation string the class invariants: conditions that must always be true once the constructor has completed. For more on invariants, see the author's notes on cleanroom verification of objects.

pathinfo.py
#================================================================
# Functions and classes
#----------------------------------------------------------------


# - - - - -   c l a s s   P a t h I n f o   - - - - -

class PathInfo:
    """Represents a snapshot of one file's status.

      Class invariants:
        .path:
          [ the pathname passed to the class constructor ]
        .size:
          [ self.path's size in bytes as an integer ]
        .createEpoch:
          [ the epoch time when self.path was created ]
        .modEpoch:
          [ the epoch time when self.path was last modified ]
        .mode:
          [ the mode bits for self.path ]
"""

6.4. PathInfo.__init__(): The class constructor

This method is the one actually called when you use the class constructor PathInfo(). The single argument is the pathname of the desired file or directory.

pathinfo.py
# - - -   P a t h I n f o . _ _ i n i t _ _   - - -

    def __init__ ( self, path ):
        """Constructor for the PathInfo class.

          [ path is a string ->
              if path names an inode whose status is readable ->
                return a new PathInfo containing that status
              else -> raise OSError ]
        """

The job of a constructor is to assemble a new instance of the class. In Python, every method receives the instance it operates on as its first argument, conventionally named self; the caller does not pass it explicitly. The instance can be thought of as a namespace, that is, a set of names and their corresponding values. When the constructor starts, the names of the class's methods (and its class variables, if any) are reachable through self, but the instance holds no names of its own yet. The constructor then customizes the instance by adding or modifying names in the namespace self.

The first order of business is to save the argument inside the instance's namespace, symbolized by self.

pathinfo.py
        #-- 1 --
        self.path  =  path
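
The namespace idea can be observed directly with the built-in vars() function, which returns the names stored on an instance. The following is only a sketch: the Demo class and its emptyAtStart attribute are hypothetical and stand in for PathInfo.

```python
class Demo:
    """Hypothetical stand-in for PathInfo, for illustration only."""

    def __init__(self, path):
        # At this point the instance's own namespace is empty.
        self.emptyAtStart = (len(vars(self)) == 0)
        # Assigning through self adds a name to that namespace.
        self.path = path

demo = Demo("some/file")
print(demo.emptyAtStart)     # True
print(sorted(vars(demo)))    # ['emptyAtStart', 'path']
```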

Next we ask the operating system for the status of that pathname.

There are two different functions in the os module for getting the status information about a path: os.stat() and os.lstat(). They work the same except when the path points to a soft link. In that case, os.stat() retrieves data about the path pointed to by the soft link, while os.lstat() retrieves data about the soft link itself. For our purposes, we want the latter. Both of these functions raise an OSError exception if the given path is nonexistent or inaccessible.

pathinfo.py
        #-- 2 --
        # [ if path names an existing, accessible file path ->
        #     self.status  :=  the status tuple for path
        #   else -> raise OSError ]
        self.status  =  os.lstat ( path )
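
The difference between the two functions is easiest to see on an actual soft link. This sketch assumes a platform that permits creating soft links (any POSIX system) and a Python 3 interpreter for tempfile.TemporaryDirectory; the file names are made up for the example.

```python
import os, stat, tempfile

with tempfile.TemporaryDirectory() as workDir:
    target = os.path.join(workDir, "target.txt")
    link = os.path.join(workDir, "link")
    with open(target, "w") as f:
        f.write("hello")
    os.symlink(target, link)

    followed = os.stat(link)      # status of target.txt
    unfollowed = os.lstat(link)   # status of the link itself

    linkSeenByStat = stat.S_ISLNK(followed.st_mode)
    linkSeenByLstat = stat.S_ISLNK(unfollowed.st_mode)
    sizeViaStat = followed.st_size

print(linkSeenByStat, linkSeenByLstat, sizeViaStat)   # False True 5
```

Because PathInfo uses os.lstat(), its .isLink() predicate can report soft links; with os.stat() it never could.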

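Then we pull out the items from the status tuple that we export: the file size, the creation and modification timestamps, and the mode bits. The constants that start with “ST_...” are the indices of elements of the status tuple, and come from the stat module. Be aware that on Unix systems the ST_CTIME element is the time of the most recent change to the inode's metadata (owner, permissions, link count, and so on), not the time the file was created; only on Windows does it hold the creation time.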

pathinfo.py
        #-- 3 --
        # [ self.status is a status tuple ->
        #     self.size         :=  size from self.status
        #     self.createEpoch  :=  creation epoch time from
        #                           self.status
        #     self.modEpoch     :=  modification epoch time
        #                           from self.status
        #     self.mode         :=  mode bits from self.status ]
        self.size         =  self.status[stat.ST_SIZE]
        self.createEpoch  =  self.status[stat.ST_CTIME]
        self.modEpoch     =  self.status[stat.ST_MTIME]
        self.mode         =  self.status[stat.ST_MODE]
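
To make the indexing concrete, here is a small sketch that inspects the status of the current directory. It relies only on the standard os and stat modules; the variable names are ours.

```python
import os, stat

status = os.lstat(os.curdir)

# The ST_... constants are plain integer indices into the tuple.
print(stat.ST_MODE, stat.ST_SIZE, stat.ST_MTIME, stat.ST_CTIME)   # 0 6 8 9

# Indexing and the named attributes give the same size and mode.
sameSize = (status[stat.ST_SIZE] == status.st_size)
sameMode = (status[stat.ST_MODE] == status.st_mode)
print(sameSize, sameMode)   # True True
```

The named attributes such as .st_size are the more common spelling in newer code; the indexed form is what pathinfo.py uses.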

At this point we are done, because we have established the five stated invariants on the attributes .path, .size, .createEpoch, .modEpoch, and .mode.

Because this is a constructor (that is, because its name is .__init__()), we don't have to return a value explicitly. The constructor's first argument, self, is the instance we have constructed, and it is returned to the caller of the PathInfo() constructor.

6.5. PathInfo.isFile(): Is this an ordinary file?

The standard Python stat module contains a function called S_ISREG() that takes the mode bits as an argument and returns true if the mode bits describe a regular file, false otherwise.

pathinfo.py
# - - -   P a t h I n f o . i s F i l e   - - -

    def isFile ( self ):
        """Predicate to test whether this path is an ordinary file.

          [ if self represents an ordinary file ->
              return a true value
            else ->
              return a false value ]
        """
        return  stat.S_ISREG ( self.mode )

6.6. PathInfo.isDir(): Is this a directory?

The stat module has a predicate named S_ISDIR() that tests the mode bits to see if they describe a directory.

pathinfo.py
# - - -   P a t h I n f o . i s D i r   - - -

    def isDir ( self ):
        """Predicate to test whether this path is a directory.

          [ if self represents a directory ->
              return a true value
            else ->
              return a false value ]
        """
        return  stat.S_ISDIR ( self.mode )

6.7. PathInfo.isLink(): Is this a soft link?

The stat module has a predicate for testing for soft links: S_ISLNK().

pathinfo.py
# - - -   P a t h I n f o . i s L i n k   - - -

    def isLink ( self ):
        """Predicate to test whether this path is a soft link.

          [ if self represents a soft link ->
              return a true value
            else ->
              return a false value ]
        """
        return  stat.S_ISLNK ( self.mode )
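
The three predicates can be tried out on mode bits taken from a scratch directory and a scratch file. This is a sketch assuming Python 3 (for tempfile.TemporaryDirectory); the file name is invented for the example.

```python
import os, stat, tempfile

with tempfile.TemporaryDirectory() as workDir:
    filePath = os.path.join(workDir, "plain.txt")
    open(filePath, "w").close()

    dirMode = os.lstat(workDir).st_mode
    fileMode = os.lstat(filePath).st_mode

# Each tuple is (regular file?, directory?, soft link?).
dirFlags = (stat.S_ISREG(dirMode), stat.S_ISDIR(dirMode),
            stat.S_ISLNK(dirMode))
fileFlags = (stat.S_ISREG(fileMode), stat.S_ISDIR(fileMode),
             stat.S_ISLNK(fileMode))
print(dirFlags)    # (False, True, False)
print(fileFlags)   # (True, False, False)
```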

6.8. PathInfo.absPath(): Absolute path

This method returns the absolute path name equivalent to self.path. It uses the standard Python function os.path.abspath().

pathinfo.py
# - - -   P a t h I n f o . a b s P a t h   - - -

    def absPath ( self ):
        """Return self's absolute path name."""
        return  os.path.abspath ( self.path )

6.9. PathInfo.realPath(): Actual absolute path

This method returns the absolute path name equivalent to self.path. Unlike the .absPath() method, any soft links that may occur within the path are resolved and replaced with the paths to which the links refer. It uses the standard Python function os.path.realpath().

pathinfo.py
# - - -   P a t h I n f o . r e a l P a t h   - - -

    def realPath ( self ):
        """Return self's absolute path name, with links resolved."""
        return  os.path.realpath ( self.path )
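
The contrast between the two functions shows up when a directory in the path is a soft link. The sketch below assumes a platform that allows soft links and Python 3; the names real, alias, and data.txt are hypothetical.

```python
import os, tempfile

with tempfile.TemporaryDirectory() as workDir:
    realDir = os.path.join(workDir, "real")
    linkDir = os.path.join(workDir, "alias")
    os.mkdir(realDir)
    os.symlink(realDir, linkDir)     # alias -> real

    viaLink = os.path.join(linkDir, "data.txt")
    absName = os.path.abspath(viaLink)     # keeps "alias" in the path
    realName = os.path.realpath(viaLink)   # replaces "alias" with "real"
    expected = os.path.join(os.path.realpath(realDir), "data.txt")

print(os.path.basename(os.path.dirname(absName)))    # alias
print(os.path.basename(os.path.dirname(realName)))   # real
```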

6.10. PathInfo.ownerCanRead(), etc: Predicates for permission testing

All nine of the routines that test read, write, and execute permissions are structurally identical. In each case, one of the mode bits is 1 if the path has that permission, or 0 otherwise.

To test this bit while ignoring the other bits, we use a mask that has a 1 bit in that position and 0 bits elsewhere. Applying a bitwise “and” operation (&) to the mask and the mode bits gives us an integer that is zero (false) if the relevant bit is 0, or nonzero (true) if the relevant bit is 1.

The first method illustrates the pattern. The mask for the owner read permission bit comes from the stat module, and is called S_IRUSR.

pathinfo.py
# - - -  P a t h I n f o . { o w n e r } C a n { R e a d   } - - -
#                          { g r o u p }       { W r i t e }
#                          { w o r l d }       { E x e c   }

    def ownerCanRead ( self ):
        return self.mode & stat.S_IRUSR

The remaining routines are identical except for the names of the masks they use from the stat module.

pathinfo.py
    def ownerCanWrite ( self ):
        return self.mode & stat.S_IWUSR

    def ownerCanExec ( self ):
        return self.mode & stat.S_IXUSR

    def groupCanRead ( self ):
        return self.mode & stat.S_IRGRP

    def groupCanWrite ( self ):
        return self.mode & stat.S_IWGRP

    def groupCanExec ( self ):
        return self.mode & stat.S_IXGRP

    def worldCanRead ( self ):
        return self.mode & stat.S_IROTH

    def worldCanWrite ( self ):
        return self.mode & stat.S_IWOTH

    def worldCanExec ( self ):
        return self.mode & stat.S_IXOTH
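
A worked example of the masking may help. The mode value below is a hypothetical one chosen for illustration (owner read and write, group read, nothing for others), written with the 0o octal prefix of Python 2.6 and later.

```python
import stat

mode = 0o640    # hypothetical permissions: rw- r-- ---

ownerRead = mode & stat.S_IRUSR
ownerExec = mode & stat.S_IXUSR

print(oct(stat.S_IRUSR))           # 0o400
print(oct(ownerRead))              # 0o400: nonzero, so true
print(oct(ownerExec))              # 0o0: zero, so false
print(bool(mode & stat.S_IRGRP))   # True
print(bool(mode & stat.S_IROTH))   # False
```

Note that these predicates return the masked integer rather than True or False, so callers should test the result for truth and not compare it with True.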

6.11. PathInfo.modTime(): Modification timestamp in human units

This method translates self.modEpoch to our standard date and time format.

First we convert the epoch time to a local time tuple with time.localtime(), then we format that tuple with the time module's time.strftime() function.

pathinfo.py
# - - -   P a t h I n f o . m o d T i m e   - - -

    def modTime ( self ):
        """Format the modification time as yyyy-mm-dd hh:mm:ss."""
        return time.strftime ( TIME_FORMAT,
            time.localtime ( self.modEpoch ) )
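
For readers who prefer the datetime module mentioned in the prologue, the same formatting can be sketched as follows. The timestamp value is hypothetical, and this variant is not part of pathinfo.py; it is shown only to suggest that the two routes agree.

```python
import time
from datetime import datetime

TIME_FORMAT = "%Y-%m-%d %H:%M:%S"
modEpoch = 1000000000    # hypothetical timestamp, seconds since the epoch

# The route taken by modTime(): epoch -> local time tuple -> string.
viaTime = time.strftime(TIME_FORMAT, time.localtime(modEpoch))

# The datetime route: epoch -> local datetime object -> string.
viaDatetime = datetime.fromtimestamp(modEpoch).strftime(TIME_FORMAT)

print(viaTime == viaDatetime)   # True
```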

6.12. PathInfo.__str__(): Convert to a string

For the format of the string returned by this method, see Section 2, “The interface to the PathInfo object”.

We call internal methods to develop the file type, file permissions, and file modification timestamp. The assembly of the pieces, and the formatting of the size, is handled by the usual Python string format operator.

pathinfo.py
# - - -   P a t h I n f o . _ _ s t r _ _   - - -

    def __str__ ( self ):
        """Convert self to a string."""
        return ( "%s%s %s %8d %s" %
                 (self.__fileType(), self.__permFlags(),
                  self.modTime(), self.size, self.path) )
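
The effect of the format string can be seen by feeding it fixed values in place of the method results. All five values below are invented for the illustration.

```python
# Stand-ins for __fileType(), __permFlags(), modTime(), size, path.
pieces = ("-", "rw-r--r--", "2024-01-15 09:30:00", 1234, "notes.txt")

line = "%s%s %s %8d %s" % pieces
sizeField = "%8d" % 1234

print(repr(sizeField))   # '    1234'
print(line)
```

The %8d conversion right-justifies the size in a field of eight columns, which keeps the sizes aligned when several lines are printed one under another.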

6.13. PathInfo.__fileType(): Get the file type code

This method derives the one-letter file type code: - for a regular file, d for a directory, or l for a link. Because it is a private method of the class, its name starts with two underscores (__). Python mangles such names (here, to _PathInfo__fileType), so the method cannot be reached under its plain name by code outside the class.

pathinfo.py
# - - -   P a t h I n f o . _ _ f i l e T y p e   - - -

    def __fileType ( self ):
        """Return the file type code.

          [ if self is a regular file ->
              return "-"
            if self is a directory ->
              return "d"
            if self is a soft link ->
              return "l" ]
        """

Just as a defensive programming measure, we return "?" if for some reason the path is neither a file, nor a directory, nor a soft link. This can happen for Unix device files, but those are beyond the intended scope of the pathinfo.py module.

pathinfo.py
        if self.isLink():
            return "l"
        elif self.isDir():
            return "d"
        elif self.isFile():
            return "-"
        else:
            return "?"
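
The privacy that the double underscore provides comes from name mangling, which the following sketch demonstrates. The Demo class and its methods are hypothetical and exist only for this illustration.

```python
class Demo:
    """Hypothetical class, for illustration only."""

    def __hidden(self):
        return "private"

    def visible(self):
        # Inside the class, the plain double-underscore name works.
        return self.__hidden()

demo = Demo()
print(demo.visible())                    # private
print(hasattr(demo, "__hidden"))         # False
print(hasattr(demo, "_Demo__hidden"))    # True
```

So the method is hidden by convention rather than by enforcement: a determined caller can still reach it through the mangled name.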

6.14. PathInfo.__permFlags(): Format all the permissions

This method formats the permission bits in self.mode using the time-honored format of the Unix “ls” command.

pathinfo.py
# - - -   P a t h I n f o . _ _ p e r m F l a g s   - - -

    def __permFlags ( self ):
        """Format self.mode's permissions as 'rwxrwxrwx'.
        """

Each set of three permission bits is formatted in the same way, so we call method self.__rwx() to format them. This method takes three arguments. The first argument is 0 if there is no read permission, nonzero otherwise. The second and third arguments are the write and execute permission with the same convention.

To get the values for each permission, we use the bitwise “and” (&) operator on self.mode and the mask values from the stat module to discard all but the bit of interest. Mask stat.S_IRUSR has a one bit in the position of the owner read permission of the mode word, and zeroes in the other positions. Mask stat.S_IWUSR is a mask for the owner write permission, and so on.

pathinfo.py
        return ( "%s%s%s" %
                 (self.__rwx ( self.mode & stat.S_IRUSR,
                               self.mode & stat.S_IWUSR,
                               self.mode & stat.S_IXUSR ),
                  self.__rwx ( self.mode & stat.S_IRGRP,
                               self.mode & stat.S_IWGRP,
                               self.mode & stat.S_IXGRP ),
                  self.__rwx ( self.mode & stat.S_IROTH,
                               self.mode & stat.S_IWOTH,
                               self.mode & stat.S_IXOTH ) ) )

6.15. PathInfo.__rwx(): Format three permission bits

This little method takes three permission values and returns a three-character string formatted in the “ls -l” convention. The read permission is formatted as "r" if true, "-" if false. Similarly, the write permission is formatted as "w" or "-", and the execute permission as "x" or "-".

Each argument is nonzero if the permission is set, zero if it is not set.

pathinfo.py
# - - -   P a t h I n f o . _ _ r w x   - - -

    def __rwx ( self, r, w, x ):
        """Format three permission bits.

          [ r, w, and x are Boolean values indicating read,
            write and execute permissions ->
              return a three-character string displaying those
              permissions as "ls -l" displays them ]
        """

The .__dasher() method handles generation of either a letter or a dash depending on the permission value.

pathinfo.py
        return  ( "%s%s%s" %
                  (self.__dasher ( r, "r" ),
                   self.__dasher ( w, "w" ),
                   self.__dasher ( x, "x" ) ) )

6.16. PathInfo.__dasher(): Format a permission bit

Formats a permission bit using the “ls -l” convention. The bit argument is true if the permission is granted, false otherwise. The flag argument is returned if the permission is true, otherwise the method returns "-".

pathinfo.py
# - - -   P a t h I n f o . _ _ d a s h e r   - - -

    def __dasher ( self, bit, flag ):
        """Format a permission bit as in ls -l.

          [ (bit is a Boolean value) and (flag is a string) ->
              if bit is true ->
                return flag
              else -> return "-" ]
        """
        if  bit:  return flag
        else:     return "-"
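
Taken together, .__permFlags(), .__rwx(), and .__dasher() turn nine mode bits into nine characters. A compact standalone sketch of the same idea follows; the function permString() is hypothetical and is not how pathinfo.py is organized. The last line uses stat.filemode(), which the standard library offers in Python 3.3 and later and which also prepends the file type character.

```python
import stat

def permString(mode):
    """Hypothetical standalone equivalent of __permFlags()."""
    masks = [
        (stat.S_IRUSR, "r"), (stat.S_IWUSR, "w"), (stat.S_IXUSR, "x"),
        (stat.S_IRGRP, "r"), (stat.S_IWGRP, "w"), (stat.S_IXGRP, "x"),
        (stat.S_IROTH, "r"), (stat.S_IWOTH, "w"), (stat.S_IXOTH, "x"),
    ]
    # Same rule as __dasher(): the letter if the bit is set, else "-".
    return "".join(flag if mode & mask else "-" for mask, flag in masks)

print(permString(0o754))                       # rwxr-xr--
print(permString(0o600))                       # rw-------
print(stat.filemode(stat.S_IFREG | 0o754))     # -rwxr-xr--
```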

6.17. PathInfo.__cmp__(): Define the comparison operator on PathInfo objects

We want PathInfo objects to sort by their .path attributes, that is, in ascending order by pathname.

Because these pathnames are simple strings, and because Python's built-in cmp() function can compare strings, all we have to do here is call cmp() on the pathnames and return its result as our result.

pathinfo.py
# - - -   P a t h I n f o . _ _ c m p _ _   - - -

    def __cmp__ ( self, other ):
        """Comparison operator for PathInfo objects.

          [ other is a PathInfo object ->
              if self.path < other.path ->
                return a negative number
              else if self.path > other.path ->
                return a positive number
              else ->
                return 0 ]
        """
        return cmp ( self.path, other.path )
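
The cmp() function and the .__cmp__() special method exist only in Python 2. For readers on Python 3, one possible way to get the same ordering is sketched below using functools.total_ordering, which fills in the remaining comparison methods from .__eq__() and .__lt__(). The Entry class is a hypothetical stand-in that holds only a path; this is a suggestion, not part of pathinfo.py.

```python
from functools import total_ordering

@total_ordering
class Entry:
    """Hypothetical stand-in for PathInfo, holding only a path."""

    def __init__(self, path):
        self.path = path

    def __eq__(self, other):
        return self.path == other.path

    def __lt__(self, other):
        return self.path < other.path

entries = [Entry("b.txt"), Entry("a.txt"), Entry("c.txt")]
ordered = [e.path for e in sorted(entries)]
print(ordered)   # ['a.txt', 'b.txt', 'c.txt']
```

When only sorting is needed, passing key=lambda e: e.path to sorted() achieves the same result without defining any comparison methods.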