The script starts out by making up a list of directory
trees to be visited. If there are any command line
arguments, these arguments make up the list of directories.
If there are no arguments, we default to a list containing
The following steps are done for each directory in the list:
Look at every file in and under that directory. Use
PathInfo to take a snapshot of that file, and build a
list of these
PathInfo instances. This process of
“walking the directory tree” is easy because
os.path module has a function called
walk() that handles the process of visiting
every directory in the tree. For documentation on this
function, see the Python Library Reference.
Sort this list in descending order by the size attribute.
Python makes it easy to sort a list: all list objects
.sort() method that
sorts the list in place. But how do we get a list of
PathInfo objects to sort in descending order by size? The
.sort() method can take as an
optional argument a function that compares two objects.
However, a more Pythonic way to do it is to define a
new class that inherits from the
PathInfo class called
BigInfo. In that class we
define a special method named
.__cmp__() that tells Python how to order
two objects of that class.
Print a heading for this section of the report, then go
through the sorted list and print one line for each
PathInfo instance in that list.
One of the goals of object-oriented programming is to
minimize, if not eliminate, the use of global variables.
An earlier version of this program used a global variable
to hold the list of
PathInfo objects. A better way is to
define a class that holds this list. The methods in the
class have access to the list, but no code outside the
class needs such access. This is in accord with the
generally accepted software design principle of
“information hiding”: we will feed the class
constructor the name of a directory, and it will return an
object that has everything we need to generate the final
We'll call this class
because it represents a report on big files. With this
design, the overall program flow for each directory
object. Pass the name of the directory to its
BigReport object has a method named
.genFiles() that generates the lines of the report.