
17. Type file: Input and output files

To open a file, use this general form:

f = open(name[,mode[,bufsize]])

name

The path name of the file to be opened, as a string.

mode

An optional string specifying what you plan to do with the file. If omitted, you will get read access to the file. In general the value is built from the following parts, in this order:

  • General mode, one of:

    r Read access. The file must already exist. You will not be allowed to write to it.

    w Write access. If there is no file by this name, a new one will be created.

    Important

    If there is an existing file, its previous contents will be deleted!

    a Append access. If there is a file by this name, your initial position will be at the end of the file, and you will be allowed to write but not read. If there is no file by this name, a new one will be created. On some systems, all writes to a file with append mode are added at the end of the file, regardless of the current file position.
  • If you need both read and write access to the file, append a “+” next.

    For example, mode “r+” puts you at the beginning of an existing file and allows you to write to the file anywhere.

    Mode “w+” is like “w”: it deletes the contents of an existing file if there is one, or creates a new file. Unlike “w”, it gives you read access as well as write access.

    Mode “a+” allows you to write new data at the end of an existing file and also to read from it; if no file by this name exists, it will create a new one.

  • If you are handling binary data, as opposed to lines of text, add “b” at the end of the mode string.

  • For modes beginning with 'r', you may append a capital 'U' to request universal newline treatment. This is handy when you are reading files made on a platform with different line termination conventions.

    When reading lines from a file opened in this way, any line terminator ('\n', '\r', or '\r\n') will appear in the return value as the standard '\n'. Also, files so opened will have an attribute named .newlines; this attribute will be None initially, but after any line terminators have been read, it will be a tuple containing all the different line terminator strings seen so far.
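The general modes described above can be illustrated with a short sketch. It is only an illustration under stated assumptions: the scratch file name modes-demo.txt and the variable names are hypothetical, and the file is created in a temporary directory so that nothing of yours is overwritten.

```python
import os
import tempfile

# Hypothetical scratch file, used only for this illustration.
path = os.path.join(tempfile.mkdtemp(), "modes-demo.txt")

# "w" creates the file (or erases an existing one) and allows writing.
f = open(path, "w")
f.write("first\n")
f.close()

# "a" starts at the end of the file, so the new line follows the old one.
f = open(path, "a")
f.write("second\n")
f.close()

# "r+" starts at the beginning of the existing file and permits writing.
f = open(path, "r+")
f.write("FIRST")      # overwrites the first five characters in place
f.close()

# "r" is the default mode: read-only access.
f = open(path)
contents = f.read()
f.close()
print(contents)

# Reopening with "w" discards the previous contents.
open(path, "w").close()
size_after_w = os.path.getsize(path)
```

After the “r+” step the file holds the overwritten first line followed by the appended second line; after the final “w” the file is empty.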

bufsize

Buffer size: this affects when physical device writes are done, compared to write operations that your program performs.

  • In most cases you will probably want to omit this argument. The default is to use line buffering for terminal-type devices, or some system default for other devices.

  • Use 0 to force unbuffered operation. This may be inefficient, but any file writes are performed immediately.

  • Use 1 for line buffering: output lines are written whenever you write a line terminator such as '\n'.

  • Use larger values to specify the actual size of the buffer.

  • Use a negative value to request the system defaults.
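The effect of the bufsize argument can be observed with a small experiment. The following is a sketch, not a definitive test: the file names are hypothetical, and the unbuffered file is opened in binary mode on the assumption that you may also want to run it on later Python versions, which accept a bufsize of 0 only for binary files.

```python
import os
import tempfile

# Hypothetical scratch files, used only for this illustration.
scratch = tempfile.mkdtemp()
raw_path = os.path.join(scratch, "unbuffered-demo.bin")
line_path = os.path.join(scratch, "line-buffered-demo.txt")

# bufsize 0: unbuffered, so the bytes reach the file when write() returns.
out = open(raw_path, "wb", 0)
out.write(b"abc")

# A second handle can see the data while `out` is still open.
check = open(raw_path, "rb")
seen_unbuffered = check.read()
check.close()
out.close()

# bufsize 1: line buffering, so a completed line is written out
# as soon as its terminator has been written.
out = open(line_path, "w", 1)
out.write("one full line\n")

check = open(line_path)
seen_line_buffered = check.read()
check.close()
out.close()
```

With the default buffering, the second handle might see nothing until the writer calls .flush() or .close().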

If you are reading text files, and you don't want to worry about the variety of line termination protocols, you may use a mode value of “U” for “universal line terminator mode.” In this mode, input lines may be terminated with a carriage return ('\r'), a newline ('\n'), or both ('\r\n'), but the lines you receive will always be terminated with a single newline. (Exception: If the last line is unterminated, the string you get will also be unterminated.)

There are a number of potential error conditions. For modes starting with “r”, the file must exist before you open it. Also, you must have access according to the underlying operating system. For example, in Linux environments, you must have read access to read a file, and you must have write access to modify a file or erase its contents. These sorts of failures will raise an IOError exception.
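One way to guard against these failures is to wrap the call to open() in a try block. This is a minimal sketch: the helper name try_open and the file name no-such-file.txt are hypothetical, chosen only for the illustration.

```python
import os
import tempfile

# Hypothetical path inside a fresh, empty directory, so it cannot exist.
missing = os.path.join(tempfile.mkdtemp(), "no-such-file.txt")

def try_open(path):
    """Return a file opened for reading, or None if it cannot be opened."""
    try:
        return open(path, "r")
    except IOError as detail:
        # The exception value describes what went wrong.
        print("Cannot open %s: %s" % (path, detail))
        return None

result = try_open(missing)
```

Here result is None, because read mode requires that the file already exist.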

A file is its own iterator (see Section 24.2, “Iterators: Values that can produce a sequence of values”). Hence, if you have a file inFile opened for reading, you can use a for loop that looks like this to iterate over the lines of the file:

for line in inFile:
    ...

The variable line will be set to each line of the file in turn. The line terminator character (if any) will be present in that string.
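The loop above can be tried out as follows. This is a sketch under stated assumptions: the scratch file name lines-demo.txt and the list names are hypothetical, and the last line is deliberately left unterminated to show that case.

```python
import os
import tempfile

# Hypothetical scratch file, used only for this illustration.
path = os.path.join(tempfile.mkdtemp(), "lines-demo.txt")

out = open(path, "w")
out.write("alpha\nbeta\ngamma")   # the last line has no terminator
out.close()

inFile = open(path)
lines = []
for line in inFile:
    lines.append(line)            # each string keeps its terminator, if any
inFile.close()

# Remove the terminators when only the text of each line is wanted.
stripped = [line.rstrip("\n") for line in lines]
```

The first two strings in lines end with '\n'; the third does not, because the file's last line was unterminated.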

Other aspects of files: