Here is the source code for the pathinfo.py module containing
the PathInfo class.
The code for the pathinfo.py module starts with the
conventional Python documentation string.
"""pathinfo.py: Object to represent a snapshot of a file's status.
For documentation in "literate programming" style, see:
http://www.nmt.edu/help/lang/python/examples/pathinfo/
"""
Next comes the importation of the standard Python modules we use:
The os module supports Posix
file systems and other operating system functions.
The stat module supplies
additional declarations needed for interpreting the
information that comes out of the os.stat() and os.lstat() functions.
We use the time module for
translating and formatting file timestamps. In
Python 2.3 and later, the datetime module is recommended for
this kind of work, but we use the older time module out of
consideration for those who have older Python installs.
#================================================================
# Imports
#----------------------------------------------------------------
import os, stat, time
We define one manifest constant named TIME_FORMAT that controls the formatting of
time tuples into textual form. This is used in the
PathInfo.__str__() method.
It is exported by this module so that other classes that
inherit from PathInfo can format other timestamps in the same
way. The output is in the form "yyyy-mm-dd
hh:mm:ss".
#================================================================
# Manifest constants
#----------------------------------------------------------------
TIME_FORMAT = "%Y-%m-%d %H:%M:%S"
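As a quick check of what this format produces, here is a minimal sketch that formats epoch time zero. It uses time.gmtime() rather than the time.localtime() call used later in this module, purely so that the result does not depend on the local time zone; the name stamp is hypothetical.

```python
import time

TIME_FORMAT = "%Y-%m-%d %H:%M:%S"

# Epoch time 0 is midnight UTC on January 1, 1970.
# gmtime() is an assumption made here for a reproducible result;
# the module itself formats local time.
stamp = time.strftime(TIME_FORMAT, time.gmtime(0))
print(stamp)    # 1970-01-01 00:00:00
```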
The rest of the code in this module is inside the PathInfo
class. We state here in the class's documentation string
the class invariants: conditions
that must always be true once the constructor has
completed. For more on invariants, see the author's
notes on cleanroom verification of objects.
#================================================================
# Functions and classes
#----------------------------------------------------------------
# - - - - - c l a s s P a t h I n f o - - - - -
class PathInfo:
    """Represents a snapshot of one file's status.
      Class invariants:
        .path:
          [ the pathname passed to the class constructor ]
        .size:
          [ self.path's size in bytes as an integer ]
        .createEpoch:
          [ the epoch time from self.path's ST_CTIME field: last
            status change on Unix, creation time on Windows ]
        .modEpoch:
          [ the epoch time when self.path was last modified ]
        .mode:
          [ the mode bits for self.path ]
    """
This method is the one actually called when you use the
class constructor PathInfo().
The single argument is the pathname of the desired file
or directory.
    # - - - P a t h I n f o . _ _ i n i t _ _ - - -
    def __init__ ( self, path ):
        """Constructor for the PathInfo class.
          [ path is a string ->
              if path names an inode whose status is readable ->
                return a new PathInfo containing that status
              else -> raise OSError ]
        """
The job of a constructor is to assemble a new instance of
the class. In Python classes, every method has a first
argument, conventionally named self, that Python supplies
automatically when the method is called: this is the
instance on which the method operates. The instance can be
thought of as a namespace, that is, a set of
names and their corresponding values. When the
constructor starts, self is a
namespace that gives access to all the names of methods in
the class (and the class variables, if any), but no other
names. The constructor then customizes the instance by
adding or modifying the names in the namespace self.
The first order of business is to save the argument
inside the instance's namespace, symbolized by self.
        #-- 1 --
        self.path = path
Next we ask the operating system for the status of that pathname.
There are two different functions in the os module for getting the status
information about a path: os.stat() and os.lstat(). They work the same except
when the path points to a soft link. In that case,
os.stat() retrieves data about
the path pointed to by the soft link, while os.lstat() retrieves data about the soft
link itself. For our purposes, we want the latter.
Both of these functions raise an OSError exception if the given path is
nonexistent or inaccessible.
        #-- 2 --
        # [ if path names an existing, accessible file path ->
        #     self.status := the status tuple for path
        #   else -> raise OSError ]
        self.status = os.lstat ( path )
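The difference between the two functions can be seen in a small experiment. The following is a minimal sketch, not part of pathinfo.py; it assumes Python 3 (for tempfile.TemporaryDirectory) and a platform that allows the current user to create soft links. The helper name link_modes and the file names are hypothetical.

```python
import os, stat, tempfile

def link_modes():
    """Return (lstat mode, stat mode) for a freshly made soft link."""
    with tempfile.TemporaryDirectory() as work_dir:
        target = os.path.join(work_dir, "target.txt")
        link = os.path.join(work_dir, "link")
        with open(target, "w") as f:
            f.write("hello")
        os.symlink(target, link)     # assumes soft links are supported
        return os.lstat(link).st_mode, os.stat(link).st_mode

lstat_mode, stat_mode = link_modes()
print(stat.S_ISLNK(lstat_mode))   # True: lstat describes the link itself
print(stat.S_ISREG(stat_mode))    # True: stat follows the link to the file
```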
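Then we pull out the items from the status tuple that we
export: the file size, the two timestamps, and the mode
bits. Note that stat.ST_CTIME is a creation time only on
Windows; on Unix systems it is the time of the last change
to the inode's status (ownership, permissions, and so on),
so .createEpoch is not a true creation time there. The
constants that start with “ST_...”
are the indices of elements of the status tuple, and
come from the stat module.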
        #-- 3 --
        # [ self.status is a status tuple ->
        #     self.size := size from self.status
        #     self.createEpoch := creation epoch time from
        #                         self.status
        #     self.modEpoch := modification epoch time
        #                      from self.status
        #     self.mode := mode bits from self.status ]
        self.size = self.status[stat.ST_SIZE]
        self.createEpoch = self.status[stat.ST_CTIME]
        self.modEpoch = self.status[stat.ST_MTIME]
        self.mode = self.status[stat.ST_MODE]
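To illustrate the tuple-style indexing used in step 3, here is a small sketch that writes a five-byte file and reads its size back through stat.ST_SIZE. It assumes Python 3; the file name sample.txt and the variable names are hypothetical. Modern Python also exposes the same values as attributes such as .st_size, which the sketch uses only as a cross-check.

```python
import os, stat, tempfile

with tempfile.TemporaryDirectory() as work_dir:
    sample = os.path.join(work_dir, "sample.txt")
    with open(sample, "wb") as f:
        f.write(b"12345")            # exactly five bytes
    status = os.lstat(sample)        # taken while the file still exists

size = status[stat.ST_SIZE]          # tuple-style access, as in step 3
print(size)                          # 5
print(size == status.st_size)        # True: attribute access agrees
print(stat.S_ISREG(status[stat.ST_MODE]))   # True: a regular file
```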
At this point we are done, because we have established
the five stated invariants on the attributes .path, .size,
.createEpoch, .modEpoch, and .mode.
Because this is a constructor (that is, because its name
is .__init__()), we don't have
to return a value explicitly. The constructor's first
argument, self, is the instance
we have constructed, and it is returned to the caller of
the PathInfo() constructor.
The standard Python stat module
contains a function called S_ISREG() that takes the mode bits as an
argument and returns true if the mode bits describe a
regular file, false otherwise.
    # - - - P a t h I n f o . i s F i l e - - -
    def isFile ( self ):
        """Predicate to test whether this path is an ordinary file.
          [ if self represents an ordinary file ->
              return a true value
            else ->
              return a false value ]
        """
        return stat.S_ISREG ( self.mode )
The stat module has a predicate
named S_ISDIR() that tests the
mode bits to see if they describe a directory.
    # - - - P a t h I n f o . i s D i r - - -
    def isDir ( self ):
        """Predicate to test whether this path is a directory.
          [ if self represents a directory ->
              return a true value
            else ->
              return a false value ]
        """
        return stat.S_ISDIR ( self.mode )
The stat module has a predicate
for testing for soft links: S_ISLNK().
    # - - - P a t h I n f o . i s L i n k - - -
    def isLink ( self ):
        """Predicate to test whether this path is a soft link.
          [ if self represents a soft link ->
              return a true value
            else ->
              return a false value ]
        """
        return stat.S_ISLNK ( self.mode )
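The three predicates from the stat module can be tried out without touching the file system. The sketch below builds hypothetical mode words from the file-type constants S_IFREG, S_IFDIR, and S_IFLNK (also from the stat module) and shows that each predicate accepts exactly one of them.

```python
import stat

# Hypothetical mode words: a file-type constant combined with
# some permission bits.
file_mode = stat.S_IFREG | 0o644
dir_mode = stat.S_IFDIR | 0o755
link_mode = stat.S_IFLNK | 0o777

for mode in (file_mode, dir_mode, link_mode):
    print(stat.S_ISREG(mode), stat.S_ISDIR(mode), stat.S_ISLNK(mode))
# True False False
# False True False
# False False True
```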
This method returns the absolute path name equivalent to
self.path. It uses the standard Python
function os.path.abspath().
    # - - - P a t h I n f o . a b s P a t h - - -
    def absPath ( self ):
        """Return self's absolute path name."""
        return os.path.abspath ( self.path )
This method returns the absolute path name equivalent to
self.path. Unlike the .absPath() method, any soft links that may occur
within the path are resolved and replaced with the paths
to which the links refer. It uses the standard Python
function os.path.realpath().
    # - - - P a t h I n f o . r e a l P a t h - - -
    def realPath ( self ):
        """Return self's absolute path name, with links resolved."""
        return os.path.realpath ( self.path )
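The contrast between the two functions shows up when the path is itself a soft link. This is a minimal sketch under the same assumptions as before (Python 3, a platform that permits soft links); the helper name resolve_demo and the file names are hypothetical.

```python
import os, tempfile

def resolve_demo():
    """Return (abspath of link, realpath of link, realpath of target)."""
    with tempfile.TemporaryDirectory() as work_dir:
        target = os.path.join(work_dir, "target.txt")
        link = os.path.join(work_dir, "link")
        with open(target, "w") as f:
            f.write("hello")
        os.symlink(target, link)     # assumes soft links are supported
        return (os.path.abspath(link),
                os.path.realpath(link),
                os.path.realpath(target))

abs_link, real_link, real_target = resolve_demo()
print(os.path.basename(abs_link))    # link: abspath leaves the link alone
print(os.path.basename(real_link))   # target.txt: realpath resolves it
print(real_link == real_target)      # True
```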
All nine of the routines that test read, write, and execute permissions are structurally identical. In each case, one of the mode bits is 1 if the path has that permission, or 0 otherwise.
To test this bit while ignoring the other bits, we use a mask that has a 1 bit in that position and 0 bits elsewhere. Applying a bitwise “and” operation (&) to the mask and the mode bits gives us a word that is all zeroes (false) if the relevant bit is 0, or a nonzero word (true) if the relevant bit is 1.
The first method illustrates the pattern. The mask for
the owner read permission bit comes from the stat module, and is called S_IRUSR.
    # - - - P a t h I n f o . { o w n e r } C a n { R e a d } - - -
    #                         { g r o u p }       { W r i t e }
    #                         { w o r l d }       { E x e c }
    def ownerCanRead ( self ):
        return self.mode & stat.S_IRUSR
The remaining routines are identical except for the names
of the masks they use from the stat module.
    def ownerCanWrite ( self ):
        return self.mode & stat.S_IWUSR
    def ownerCanExec ( self ):
        return self.mode & stat.S_IXUSR
    def groupCanRead ( self ):
        return self.mode & stat.S_IRGRP
    def groupCanWrite ( self ):
        return self.mode & stat.S_IWGRP
    def groupCanExec ( self ):
        return self.mode & stat.S_IXGRP
    def worldCanRead ( self ):
        return self.mode & stat.S_IROTH
    def worldCanWrite ( self ):
        return self.mode & stat.S_IWOTH
    def worldCanExec ( self ):
        return self.mode & stat.S_IXOTH
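Note that these methods return the masked integer itself, not True or False. The sketch below works through the masking on a hypothetical mode value of 0o640 (rw-r-----) and shows how a caller could wrap the result in bool() if a strict Boolean is wanted.

```python
import stat

mode = 0o640     # hypothetical permissions: rw-r-----

owner_read = mode & stat.S_IRUSR     # owner read bit is set
group_write = mode & stat.S_IWGRP    # group write bit is clear

print(owner_read)          # 256 (0o400), a nonzero "true" value
print(group_write)         # 0, a "false" value
print(bool(owner_read), bool(group_write))   # True False
```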
This method translates self.modEpoch to our standard date and time
format.
First we convert the epoch time to a local time tuple
with time.localtime(), then we format that tuple with the
strftime() function of the time module.
    # - - - m o d T i m e - - -
    def modTime ( self ):
        """Format the modification time as yyyy-mm-dd hh:mm:ss."""
        return time.strftime ( TIME_FORMAT,
                               time.localtime ( self.modEpoch ) )
For the format of the string returned by this method, see
Section 2, “The interface to the PathInfo object”.
We call internal methods to develop the file type, file permissions, and file modification timestamp. The assembly of the pieces, and the formatting of the size, is handled by the usual Python string format operator.
    # - - - P a t h I n f o . _ _ s t r _ _ - - -
    def __str__ ( self ):
        """Convert self to a string."""
        return ( "%s%s %s %8d %s" %
                 (self.__fileType(), self.__permFlags(),
                  self.modTime(), self.size, self.path) )
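To see how the format string lays out a line, here is a small sketch that applies the same format to hypothetical stand-in values rather than to a real PathInfo instance. The %8d conversion right-justifies the size in a field eight characters wide.

```python
# Hypothetical pieces standing in for the results of the internal methods.
file_type = "-"
perms = "rw-r--r--"
mod_time = "2004-01-01 12:00:00"
size = 1234
path = "notes.txt"

line = "%s%s %s %8d %s" % (file_type, perms, mod_time, size, path)
print(line)
print("%8d" % size == " " * 4 + "1234")   # True: padded to eight columns
```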
This method derives the one-letter file type code:
"-" for a regular file, "d" for a directory, or "l" for a
soft link. Because it is a private
method of the class, its name starts with two
underscores (__); Python mangles such names, so code outside
the class cannot call the method by this name.
    # - - - P a t h I n f o . _ _ f i l e T y p e - - -
    def __fileType ( self ):
        """Return the file type code.
          [ if self is a regular file ->
              return "-"
            if self is a directory ->
              return "d"
            if self is a soft link ->
              return "l" ]
        """
Just as a defensive programming measure, we return
"?" if for some reason the path
is not a file, a directory, or a soft link. This can
happen for Unix device files, but those are beyond the
intended scope of the pathinfo.py module.
        if self.isLink():
            return "l"
        elif self.isDir():
            return "d"
        elif self.isFile():
            return "-"
        else:
            return "?"
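The effect of the two leading underscores on a method name such as __fileType can be demonstrated with a tiny stand-alone class. This is only an illustrative sketch; the class Demo and its method names are hypothetical and are not part of pathinfo.py.

```python
class Demo:
    def __hidden(self):
        return "private"

    def reveal(self):
        # Inside the class body, the short name works as written.
        return self.__hidden()

d = Demo()
print(d.reveal())                # private
print(hasattr(d, "__hidden"))    # False: the stored name was mangled
print(d._Demo__hidden())         # private: reachable under the mangled name
```

As the last line shows, the protection is a naming convention rather than true access control.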
This method formats the permission bits in self.mode using the time-honored format of
the Unix “ls”
command.
    # - - - P a t h I n f o . _ _ p e r m F l a g s - - -
    def __permFlags ( self ):
        """Format self.mode's permissions as 'rwxrwxrwx'.
        """
Each set of three permission bits is formatted in the
same way, so we call method self.__rwx() to format them. This method
takes three arguments. The first argument is 0 if there
is no read permission, nonzero otherwise. The second and
third arguments are the write and execute permission with
the same convention.
To get the value of each permission, we apply the bitwise
“and” operator (&)
to self.mode and the mask values from the
stat module, which discards all but
the bit of interest. Mask stat.S_IRUSR has a one bit in the position
of the owner read permission of the mode word, and zeroes
in the other positions. Mask stat.S_IWUSR is the mask for the owner write
permission, and so on.
        return ( "%s%s%s" %
                 (self.__rwx ( self.mode & stat.S_IRUSR,
                               self.mode & stat.S_IWUSR,
                               self.mode & stat.S_IXUSR ),
                  self.__rwx ( self.mode & stat.S_IRGRP,
                               self.mode & stat.S_IWGRP,
                               self.mode & stat.S_IXGRP ),
                  self.__rwx ( self.mode & stat.S_IROTH,
                               self.mode & stat.S_IWOTH,
                               self.mode & stat.S_IXOTH ) ) )
This little method takes three permission values and
returns a three-character string formatted in the
“ls -l”
convention. The read permission is formatted as
"r" if true, "-" if false. Similarly, the write
permission is formatted as "w"
or "-", and the execute
permission as "x" or "-".
Each argument is nonzero if the permission is set, zero if it is not set.
    # - - - P a t h I n f o . _ _ r w x - - -
    def __rwx ( self, r, w, x ):
        """Format three permission bits.
          [ r, w, and x are Boolean values indicating read,
            write and execute permissions ->
              return a three-character string displaying those
              permissions as "ls -l" displays them ]
        """
The .__dasher() method handles
generation of either a letter or a dash depending on
the permission value.
        return ( "%s%s%s" %
                 (self.__dasher ( r, "r" ),
                  self.__dasher ( w, "w" ),
                  self.__dasher ( x, "x" ) ) )
This method formats a single permission bit using the “ls
-l” convention. The bit argument is true if the permission is
granted, false otherwise. The flag argument is returned if the permission
is granted; otherwise the method returns "-".
    # - - - P a t h I n f o . _ _ d a s h e r - - -
    def __dasher ( self, bit, flag ):
        """Format a permission bit as in ls -l.
          [ (bit is a Boolean value) and (flag is a string) ->
              if bit is true ->
                return flag
              else -> return "-" ]
        """
        if bit: return flag
        else: return "-"
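The three cooperating methods above can be condensed into one table-driven function. The following is an alternative sketch, not the module's own code: the function name perm_string is hypothetical, and the cross-check uses stat.filemode(), which exists only in Python 3.3 and later and also prepends the file type character.

```python
import stat

def perm_string(mode):
    """Format the nine permission bits of mode as 'rwxrwxrwx'."""
    masks = (
        (stat.S_IRUSR, "r"), (stat.S_IWUSR, "w"), (stat.S_IXUSR, "x"),
        (stat.S_IRGRP, "r"), (stat.S_IWGRP, "w"), (stat.S_IXGRP, "x"),
        (stat.S_IROTH, "r"), (stat.S_IWOTH, "w"), (stat.S_IXOTH, "x"),
    )
    # Emit the flag letter where the masked bit is set, a dash otherwise.
    return "".join(flag if mode & mask else "-" for mask, flag in masks)

print(perm_string(0o754))                     # rwxr-xr--
print(stat.filemode(stat.S_IFREG | 0o754))    # -rwxr-xr--
```

The method-based version in pathinfo.py trades this compactness for smaller pieces that can each be verified against its own intended function.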
We want PathInfo objects to sort by their .path attributes, that is, in ascending
order by pathname.
Because these pathnames are simple strings, and because
Python's built-in cmp() function
can compare strings, all we have to do here is call
cmp() on the pathnames and
return its result as our result.
    # - - - P a t h I n f o . _ _ c m p _ _ - - -
    def __cmp__ ( self, other ):
        """Comparison operator for PathInfo objects.
          [ other is a PathInfo object ->
              if self.path < other.path ->
                return a negative number
              else if self.path > other.path ->
                return a positive number
              else ->
                return 0 ]
        """
        return cmp ( self.path, other.path )
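The built-in cmp() function and the __cmp__() special method exist only in Python 2; Python 3 removed both. For readers on Python 3, one possible way to get the same ordering is sketched below using rich comparison methods and functools.total_ordering. The class PathKey is a hypothetical stand-in that carries only a .path attribute; it is not the PathInfo class itself.

```python
import functools

@functools.total_ordering
class PathKey:
    """Hypothetical stand-in that orders itself by its .path attribute."""
    def __init__(self, path):
        self.path = path

    def __eq__(self, other):
        return self.path == other.path

    def __lt__(self, other):
        return self.path < other.path

# total_ordering derives <=, >, and >= from __eq__ and __lt__.
names = [p.path for p in sorted([PathKey("b"), PathKey("a"), PathKey("c")])]
print(names)    # ['a', 'b', 'c']
```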