14. hashFile(): Compute a file's hash digest

deduper
# - - -   h a s h F i l e

def hashFile(path):
    '''Compute the sha256 hash hex digest of file (path).

      [ path is a str ->
          if path names a readable file ->
            return the sha256 hex digest of that file as a str
          else -> return None ]
    '''

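To compute a file's hex digest, we create a hashlib.sha256 object, read the file in blocks of BLOCK_SIZE bytes, and pass each block to that object's .update() method. Reading block by block keeps memory use bounded no matter how large the file is. Once the whole file has been fed in, the hasher's .hexdigest() method returns the digest as a string of 64 hexadecimal characters.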
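The following short sketch is an illustration only and is not part of the deduper script. It suggests why block-wise reading is safe: feeding a hasher the same bytes in pieces should yield the same digest as hashing them in a single call. The names payload, oneShot, and chunked, and the 64-byte piece size, are assumptions made up for this example.

```python
import hashlib

# Illustration only; not part of the deduper script.
payload = b"spam and eggs" * 1000

# Hash all the bytes in a single call.
oneShot = hashlib.sha256(payload).hexdigest()

# Hash the same bytes in arbitrary 64-byte pieces.
hasher = hashlib.sha256()
for start in range(0, len(payload), 64):
    hasher.update(payload[start:start + 64])
chunked = hasher.hexdigest()

# The digest depends only on the byte sequence, not on how it was split.
print(chunked == oneShot, len(chunked))    # prints: True 64
```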

deduper
    #-- 1
    # [ if path names a readable file ->
    #     inFile  :=  that file, so opened
    #     hasher  :=  a new, empty hashlib.sha256 instance
    #   else ->
    #     sys.stderr  +:=  error message
    #     return None ]
    try:
        inFile = open(path, 'rb')
        hasher = hashlib.sha256()
    except Exception as x:
        message("*** Can't read {0!r}: {1}".format(path, x))
        return None

For the definition of BLOCK_SIZE, see Section 8, “Manifest constants”.

deduper
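    #-- 2
    # [ hasher  :=  hasher + (contents of inFile)
    #   inFile  :=  inFile, closed ]
    try:
        data = inFile.read(BLOCK_SIZE)
        while len(data) > 0:
            hasher.update(data)
            data = inFile.read(BLOCK_SIZE)
    finally:
        # Release the file handle even if a read fails.
        inFile.close()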

    #-- 3
    # [ return the hex digest of hasher ]
    return hasher.hexdigest()
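To try the function in isolation, something like the following self-contained sketch may help. It is not part of the deduper script. The value of BLOCK_SIZE and the body of message() are stand-ins assumed for this example only (the real definitions live elsewhere in the program), and hashFile() is restated here in condensed form, with the file closed explicitly, so that the block runs on its own.

```python
import hashlib
import os
import sys
import tempfile

# Assumed stand-ins; the real definitions live elsewhere in deduper.
BLOCK_SIZE = 1 << 16    # Hypothetical value, chosen only for this demo.

def message(text):
    '''Hypothetical stand-in: write one line to sys.stderr.'''
    sys.stderr.write(text + "\n")

def hashFile(path):
    '''Condensed restatement of the logic developed above.'''
    try:
        inFile = open(path, 'rb')
    except Exception as x:
        message("*** Can't read {0!r}: {1}".format(path, x))
        return None
    hasher = hashlib.sha256()
    try:
        data = inFile.read(BLOCK_SIZE)
        while len(data) > 0:
            hasher.update(data)
            data = inFile.read(BLOCK_SIZE)
    finally:
        inFile.close()
    return hasher.hexdigest()

# Exercise both branches: a readable file, then a missing one.
content = b"x" * (3 * BLOCK_SIZE + 17)    # Spans several blocks.
handle, tempPath = tempfile.mkstemp()
try:
    with os.fdopen(handle, 'wb') as outFile:
        outFile.write(content)
    goodDigest = hashFile(tempPath)
finally:
    os.remove(tempPath)

# The file is gone now, so this call reports an error and returns None.
missingDigest = hashFile(tempPath)

print(goodDigest == hashlib.sha256(content).hexdigest())
print(missingDigest)
```

The first call should agree with a one-shot hash of the same bytes; the second should write a "Can't read" line to sys.stderr and yield None, matching the two branches of the intended function.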