
3. Character string basics

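Python has extensive features for handling strings of characters. There are two types: str, which holds strings of 8-bit characters, and unicode, which holds strings of Unicode characters.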

We'll mainly talk about working with str values, but most unicode operations are similar or identical.

3.1. String literals

In Python, you can enclose string constants in either single-quote ('...') or double-quote ("...") characters.

>>> cloneName = 'Clem'
>>> cloneName
'Clem'
>>> print cloneName
Clem
>>> fairName = "Future Fair"
>>> print fairName
Future Fair
>>> fairName
'Future Fair'

When you display a string value in conversational mode, Python will usually use single-quote characters. Internally, the values are the same regardless of which kind of quotes you use. Note also that the print statement shows only the content of a string, without any quotes around it.
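As a quick check of the points above, the following minimal sketch compares the same text written with each kind of quote, and uses the built-in repr() function to obtain the quoted form that conversational mode displays. The variable names are illustrative only, and the parenthesized form of print is used so that the lines also run under later versions of Python.

```python
# The same text written with each kind of quote.
single = 'Future Fair'
double = "Future Fair"

# Both literals produce equal string values.
same = (single == double)

# repr() returns the quoted form that conversational mode displays.
shown = repr(single)

# Using the other kind of quote avoids escaping a quote inside the string.
phrase = "ain't"

print(same)
print(shown)
print(phrase)
```

Having two kinds of quotes is mostly a convenience: a string that contains one kind can be enclosed in the other.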

To convert an integer (int type) value i to its string equivalent, use the function “str(i)”:

>>> str(-497)
'-497'
>>> str(000)
'0'

The inverse operation, converting a string s back into an integer, is written as “int(s)”:

>>> int("-497")
-497
>>> int("-0")
0
>>> int ( "012this ain't no number" )
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
ValueError: invalid literal for int(): 012this ain't no number

The last example above shows what happens when you try to convert a string that isn't a valid number.
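Because int() raises ValueError for text that is not a valid number, a program that reads numbers typed by a person will often wrap the call in a try/except block. The sketch below assumes a helper named to_int (a hypothetical name, not part of Python) that returns a fallback value instead of stopping with a traceback.

```python
def to_int(text, default=None):
    """Return int(text), or default when text is not a valid integer."""
    try:
        return int(text)
    except ValueError:
        # int() rejected the text, so hand back the fallback value.
        return default

good = to_int("-497")
bad = to_int("012this ain't no number")

# str() and int() undo each other for ordinary integers.
round_trip = str(int("-497"))
```

Here good is the integer -497, bad is None, and round_trip is the original string '-497'.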

To convert a string s containing a number in base B, use the form “int(s, B)”:

>>> int ( '0F', 16 )
15
>>> int ( "10101", 2 )
21
>>> int ( "0177776", 8 )
65534
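To make the base conversion less mysterious, here is a rough sketch of the positional arithmetic behind it: each digit multiplies the running total by the base and then adds the digit's value. The names DIGITS and digits_to_int are hypothetical, and the helper handles only unsigned digit strings with no surrounding spaces, which is less than the built-in int() accepts.

```python
# Digit characters in order of value; letters stand for 10 through 35.
DIGITS = "0123456789abcdefghijklmnopqrstuvwxyz"

def digits_to_int(text, base):
    """Hypothetical helper: positional value of an unsigned digit string."""
    value = 0
    for ch in text.lower():
        digit = DIGITS.index(ch)   # raises ValueError if ch is not a digit
        if digit >= base:
            raise ValueError("digit %r is not valid in base %d" % (ch, base))
        # Shift the total one place to the left, then add the new digit.
        value = value * base + digit
    return value

hex_value = digits_to_int("0F", 16)
bin_value = digits_to_int("10101", 2)
oct_value = digits_to_int("0177776", 8)
```

For these three inputs the helper agrees with int(s, B): the results are 15, 21, and 65534.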

To obtain the 8-bit integer code contained in a one-character string s, use the function “ord(s)”. For the inverse operation, converting an integer i to the character that has code i, use “chr(i)”. The numeric value of each character is defined by the ASCII character set.

>>> chr( 97 )
'a'
>>> ord("a")
97
>>> chr(65)
'A'
>>> ord('A')
65
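The sketch below illustrates that ord() and chr() are inverses by converting a short word to its character codes and back again. It also relies on a property of ASCII: each lowercase letter's code is 32 higher than that of the matching uppercase letter. The variable names are illustrative only.

```python
word = "Clem"

# Convert each character to its ASCII code.
codes = [ord(c) for c in word]

# Convert the codes back to characters and rejoin them.
rebuilt = "".join([chr(n) for n in codes])

# In ASCII, 'a' is 97 and 'A' is 65, a difference of 32.
upper_a = chr(ord("a") - 32)
```

After these lines run, codes is [67, 108, 101, 109], rebuilt equals the original word, and upper_a is 'A'.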

In addition to the printable characters, whose codes range from 32 to 126 inclusive, a Python string can contain any of the other, unprintable special characters as well. For example, the null character, whose official name is NUL, is the character whose code is zero. One way to write such a character is to use this form:

'\xNN'

where NN is the character's code in hexadecimal (base 16) notation.

>>> chr(0)
'\x00'
>>> ord('\x00')
0
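A point that is easy to miss is that a '\xNN' escape takes four keystrokes to type but stands for a single character in the string. The following minimal sketch, with illustrative variable names, checks this and shows that the same notation works for printable characters too.

```python
# The NUL character, written with a hexadecimal escape.
nul = "\x00"

# Hexadecimal 41 is decimal 65, the code for 'A'.
letter = "\x41"

# Three characters: 'a', NUL, and 'b'.
text = "a\x00b"
length = len(text)
```

Here length is 3, and letter is equal to the ordinary literal "A".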

Another special character you may need to deal with is the newline character, whose official name is LF (for “line feed”). Use the special escape sequence “\n” to produce this character.

>>> s = "Two-line\nstring."
>>> s
'Two-line\nstring.'
>>> print s
Two-line
string.

As you can see, when a newline character is displayed in conversational mode, it appears as “\n”, but when you print it, the character that follows it will appear on the next line. The code for this character is 10:

>>> ord('\n')
10
>>> chr(10)
'\n'

Python has several other escape sequences. The term “escape sequence” refers to a convention where a special character, the “escape character”, changes the meaning of the characters after it. Python's escape character is backslash (\).

Input  Code  Name  Meaning
\b     8     BS    backspace
\t     9     HT    tab
\"     34    "     double quote
\'     39    '     single quote
\\     92    \     backslash
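The codes listed in the table above, along with the newline code from the preceding text, can be checked with ord(). This is only a sketch; the names escapes and mismatches are illustrative.

```python
# Each pair holds an escape sequence and the code it is expected to have.
escapes = [
    ("\b", 8),    # backspace
    ("\t", 9),    # tab
    ("\n", 10),   # newline
    ("\"", 34),   # double quote
    ("\'", 39),   # single quote
    ("\\", 92),   # backslash
]

# Collect any code that does not match what ord() reports.
mismatches = []
for char, code in escapes:
    if ord(char) != code:
        mismatches.append(code)

# Each escape sequence stands for exactly one character.
tabbed_length = len("a\tb")
```

If the table is right, mismatches stays empty, and tabbed_length is 3 because the two-keystroke sequence \t counts as one character.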

There is another handy way to get a string that contains newline characters: enclose the string in triple quotes, that is, three single-quote or three double-quote characters at each end.

>>> multi = """This string
...   contains three
...   lines."""
>>> multi
'This string\n  contains three\n  lines.'
>>> print multi
This string
  contains three
  lines.
>>> s2 = '''
... xyz
... '''
>>> s2
'\nxyz\n'
>>> print s2

xyz

>>>

Notice that in Python's conversational mode, when you press Enter at the end of a line, and Python knows that your line is not finished, it displays a “...” prompt instead of the usual “>>>” prompt.
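To confirm that a triple-quoted string is an ordinary string with newline characters in it, the sketch below compares one against the same text written with “\n” escapes, then splits it into its lines. The variable names are illustrative, and the count() and split() string methods are used only as a convenient way to inspect the result.

```python
# A triple-quoted string spanning three lines.
multi = """This string
  contains three
  lines."""

# The same text written on one line with explicit newline escapes.
explicit = "This string\n  contains three\n  lines."
same = (multi == explicit)

# Three lines are separated by two newline characters.
newline_count = multi.count("\n")
lines = multi.split("\n")

# A line break right after the opening quotes becomes part of the string.
s2 = '''
xyz
'''
```

Here same is True, newline_count is 2, lines is a list of three strings, and s2 is equal to "\nxyz\n".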