Here is main() along with its intended function.
# - - - - -   m a i n

def main():
    """Main.

      [ if (the command line is valid) and
        (the effective directory specified by the command line
        exists) ->
          sys.stdout  +:=  report of sets of two or more files
              in or under that directory that have the same hash
          sys.stderr  +:=  report of files in or under that
              directory that are unreadable
        else ->
          sys.stderr  +:=  error message ]
    """
The first step is to check the command line arguments and digest them into an instance. See Section 10, “checkArgs(): Process the command line”.
    #-- 1
    # [ if the command line arguments are valid ->
    #     baseDir  :=  the effective value of the DIR argument
    #     minSize  :=  the effective value of the SIZE argument
    #   else ->
    #     sys.stderr  +:=  error message
    #     stop execution ]
    args = checkArgs()
    baseDir = getattr(args, DIR_ATTR)
    minSize = getattr(args, SIZE_ATTR)
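The argument-digesting step can be illustrated with a minimal sketch. It assumes the standard argparse module does the parsing. The option spelling (--size), the defaults, the values chosen for DIR_ATTR and SIZE_ATTR, and the function name check_args_sketch are all hypothetical; the real checkArgs() is the one described in Section 10.

```python
import argparse

# Hypothetical attribute names, chosen only for this sketch.
DIR_ATTR = "dir"
SIZE_ATTR = "size"


def check_args_sketch(argv=None):
    """Parse the command line and return an instance carrying the
    effective DIR and SIZE values as attributes.

    On an invalid command line, argparse writes an error message
    to sys.stderr and raises SystemExit.
    """
    parser = argparse.ArgumentParser(
        description="Report sets of files that have the same hash.")
    # Assumed option: minimum file size in bytes, defaulting to 0.
    parser.add_argument(
        "--size", dest=SIZE_ATTR, type=int, default=0, metavar="SIZE",
        help="ignore files smaller than SIZE bytes")
    # Assumed positional: starting directory, defaulting to ".".
    parser.add_argument(
        DIR_ATTR, nargs="?", default=".", metavar="DIR",
        help="directory to search")
    return parser.parse_args(argv)


args = check_args_sketch(["--size", "1024", "/tmp"])
base_dir = getattr(args, DIR_ATTR)
min_size = getattr(args, SIZE_ATTR)
```

Looking the values up with getattr() and named constants, as main() does, keeps the attribute names defined in one place.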
Next we construct the database; see Section 17, “class FileData: The database”.
    #-- 2
    # [ fileData  :=  a new FileData instance representing an empty
    #       database with minimum size (minSize) ]
    fileData = FileData(minSize)
The os.path.walk() function takes care of recursively visiting all subdirectories. It needs three arguments:
The starting directory.
A “visitor function” that will be called to process each directory, including the starting directory: see Section 12, “visitor(): Process one directory's ...”.
An arbitrary state item that will be passed to the visitor function so that it can accumulate whatever data the visitor function is collecting. In this case, the FileData instance accumulates rows representing qualifying files.
    #-- 3
    # [ fileData  +:=  rows representing files no smaller than
    #       fileData.minSize that are located in or under baseDir
    #   sys.stderr  +:=  error messages for unreadable files
    #       in or under baseDir ]
    os.path.walk(baseDir, visitor, fileData)
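Note that os.path.walk() exists only in Python 2; it was removed in Python 3, where os.walk() serves the same purpose without a visitor callback. The sketch below shows one way the same traversal might look with os.walk(). The function name collect_files, the wording of the error message, and the use of a plain list in place of the FileData instance are assumptions made for illustration.

```python
import os
import sys


def collect_files(base_dir, min_size):
    """Return a sorted list of paths of regular files in or under
    base_dir whose size is at least min_size bytes.

    Files whose size cannot be read are reported on sys.stderr
    and skipped.
    """
    found = []
    # os.walk() yields one (directory, subdirectories, files) triple
    # per directory, starting with base_dir itself.
    for dir_path, dir_names, file_names in os.walk(base_dir):
        for name in file_names:
            path = os.path.join(dir_path, name)
            try:
                if (os.path.isfile(path) and
                        os.path.getsize(path) >= min_size):
                    found.append(path)
            except OSError as err:
                sys.stderr.write("*** Can't read %s: %s\n" % (path, err))
    return sorted(found)
```

Here the loop body plays the role of the visitor function, and the list named found plays the role of the state item.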
For the report generation logic, see Section 15, “report(): Generate the report”.
    #-- 4
    # [ sys.stdout  +:=  report of sets of two or more rows in fileData
    #       that have the same hash, ordered by the lowest path in
    #       each such set ]
    report(fileData)
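The grouping that the report's intended function describes can be sketched as follows. This is an illustration only: it assumes each row can be reduced to a (hash, path) pair, and the function name duplicate_sets is hypothetical. The actual row layout and reporting logic belong to FileData and report().

```python
from collections import defaultdict


def duplicate_sets(rows):
    """Given (hash, path) pairs, return a list of path lists, one per
    hash shared by two or more paths.  Each list is sorted, and the
    lists are ordered by their lowest path.
    """
    by_hash = defaultdict(list)
    for digest, path in rows:
        by_hash[digest].append(path)
    # Keep only hashes shared by at least two files.
    groups = [sorted(paths) for paths in by_hash.values()
              if len(paths) >= 2]
    # Order the sets by the lowest path in each set.
    groups.sort(key=lambda paths: paths[0])
    return groups


rows = [("aa", "/x/2"), ("bb", "/x/1"), ("aa", "/x/9"),
        ("cc", "/x/5"), ("bb", "/x/3")]
result = duplicate_sets(rows)
# result == [["/x/1", "/x/3"], ["/x/2", "/x/9"]]
```

The file with hash "cc" is unique, so it does not appear in the result.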