Python has two string types. Type
str holds strings of zero or more
8-bit characters, while unicode
strings provide full support of the expanded Unicode
character set (see the
Unicode homepage).
There are many forms for string constants:
'...': Enclose the string
in single quotes.
"...": Enclose it in
double quotes.
'''...''':
Enclose it between three single quotes in a row.
The difference is that you can continue such a
string over multiple lines, and the line breaks
will be included in the string as newline
characters.
"""...""": You can use
three sets of double quotes. As with three sets of
single quotes, line breaks are allowed and preserved
as "\n" characters.
The above forms give you regular strings. To get a
unicode string, prefix the string with
u. For example:
u"klarn"
is a five-character Unicode string.
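As a small illustration of the quoting forms above, the sketch below checks that single and double quotes build the same string and that a triple-quoted string keeps its line breaks. The variable names are made up for this example, and print is called with parentheses so the code should run under both Python 2 and Python 3.

```python
# Single and double quotes are interchangeable delimiters.
single = 'parrot'
double = "parrot"
same = (single == double)

# A triple-quoted string may span lines; each line break becomes "\n".
verse = '''spam
eggs'''
print(repr(verse))  # the repr makes the embedded newline visible

# The u prefix marks a Unicode string constant.
word = u"klarn"
print(len(word))
```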
In addition, you can use any of these escape sequences inside a string constant:
\<newline> | A backslash at the end of a line is ignored, together with the line break that follows it. |
\\ | Backslash (\) |
\' | Single-quote character (') |
\" | Double-quote character (") |
\n | Newline (ASCII LF or linefeed) |
\b | Backspace (in ASCII, the BS character) |
\f | Formfeed (ASCII FF) |
\r | Carriage return (ASCII CR) |
\t | Horizontal tab (ASCII HT) |
\v | Vertical tab (ASCII VT) |
\ooo | The character with octal code ooo, e.g., '\177'. |
\xhh | The character with hexadecimal value hh, e.g., "\xFF". |
\uhhhh | The Unicode character with hexadecimal value hhhh, e.g., u"\uFFFF". |
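To make the table concrete, here is a minimal sketch that measures a few escape sequences. It only uses the built-in len() and ord() functions; the variable names are illustrative.

```python
# Each escape sequence stands for a single character.
tab_len = len("\t")             # one character, not two
octal_code = ord("\177")        # octal 177
hex_code = ord("\xFF")          # hexadecimal FF
unicode_code = ord(u"\uFFFF")   # hexadecimal FFFF

# A backslash at the end of a line joins the two halves of the constant.
joined = "abc\
def"

print("%d %d %d %d %s" % (tab_len, octal_code, hex_code, unicode_code, joined))
```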
Raw strings:
If you need to use a lot of backslashes inside a
string constant, and doubling them is too confusing,
you can prefix any string with the letter
r to suppress the interpretation
of the escape sequences above. For example,
'\\\\' contains two backslashes,
but r'\\\\' contains four. Raw
strings are particularly useful with
the regular expression module.
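The difference can be checked directly, as in the sketch below. The regular expression and the sample text are invented for illustration; re.search() is the standard library call.

```python
import re

plain = '\\\\'   # escapes are interpreted: two backslashes
raw = r'\\\\'    # escapes are left alone: four backslashes
print("%d %d" % (len(plain), len(raw)))

# With the re module, a raw string keeps the pattern readable:
# without the r prefix, every backslash below would need doubling.
pattern = r'\d+\.\d+'
match = re.search(pattern, 'version 2.7 of the tool')
found = match.group(0)
print(found)
```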
In addition to the operations common to all sequences, strings support the operator
f % v
which formats the values from a tuple v
using a format string f;
the result is a single string with all the values
formatted. If there is only one value, v need not be a tuple.
See the table of
format codes below.
All format codes start with
%; the other characters of f
appear unchanged in the result. A conversational
example:
>>> print "We have %d pallets of %s today." % (49, "kiwis")
We have 49 pallets of kiwis today.
In general, format codes have the form
%[p][m[.n]]c
where:
p | is an optional prefix; see the table of format code prefixes below. |
m | specifies the total desired field width.
The result will never be shorter than this value,
but may be longer if the value doesn't fit; so,
"%5d" % 1234 yields
" 1234", but
"%2d" % 1234 yields
"1234". |
n | specifies the number of digits after the decimal point for float types. |
c | indicates the type of formatting. |
Here are the format codes c:
%s | String; e.g.,
"%-3s" % "xy" yields
"xy " (because the
"-" prefix forces
left alignment). |
%d | Decimal conversion, e.g.,
"%3d" % -4 yields the string
" -4". |
%e | Exponential format; allow four
characters for the exponent. Example:
"%08.1e" % 1.9783
yields "02.0e+00". |
%E | Same as %e, but an
uppercase E is used for
the exponent. |
%f | For float type. E.g.,
"%4.1f" % 1.9783 yields
" 2.0". |
%g | General numeric format. Use
%f if it fits,
otherwise use %e. |
%G | Same as %g, but an
uppercase E is used for
the exponent if there is one. |
%o | Octal, e.g.,
"%o" % 13 yields
"15". |
%x | Hexadecimal, e.g.,
"%x" % 247 yields
"f7". |
%X | Same as %x,
but capital letters are used for the digits
A-F, e.g.,
"%04X" % 247 yields
"00F7". |
%c | Convert an integer to the character with the
corresponding ASCII code; for example,
"%c" % 0x61 yields the
string "a". |
%% | Places a percent sign
(%) in the result.
Does not require a corresponding value. |
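The examples in the table can be checked mechanically. The sketch below pairs each expression with the result the table gives for it; the list name and loop are just scaffolding for this illustration.

```python
# Each pair is (actual result, result stated in the table above).
checks = [
    ("%-3s" % "xy", "xy "),
    ("%3d" % -4, " -4"),
    ("%4.1f" % 1.9783, " 2.0"),
    ("%o" % 13, "15"),
    ("%x" % 247, "f7"),
    ("%04X" % 247, "00F7"),
    ("%c" % 0x61, "a"),
    ("%d%%" % 50, "50%"),   # %% consumes no value
]
for actual, expected in checks:
    print("%r %s" % (actual, actual == expected))

all_match = all([actual == expected for actual, expected in checks])
```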
Here are the format code prefixes p:
+ | For numeric types, forces the sign to appear even for positive values. |
- | Left-justifies the value in the field. |
0 | For numeric types, use zero fill. For
example, "%04d" % 2
produces the value "0002". |
# | With the %o
(octal) format, prepend a leading
"0"; with the
%x (hexadecimal)
format, prepend a leading
"0x"; with the
%g (general numeric)
format, keep all trailing zeroes. Examples:
>>> "%4o" % 127
' 177'
>>> "%#4o" % 127
'0177'
>>> "%x" % 127
'7f'
>>> "%#x" % 127
'0x7f'
>>> "%10.5g" % 0.5
'       0.5'
>>> "%#10.5g" % 0.5
'   0.50000' |
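A brief sketch of the prefixes in use follows. The variable names are illustrative, and the %#o case is left out on purpose because its output differs between Python 2 and Python 3.

```python
signed = "%+d" % 7           # sign forced on a positive value
left = "%-6d|" % 42          # left-justified in a field of width 6
zero_filled = "%04d" % 2     # zero fill
hex_prefixed = "%#x" % 127   # leading "0x"
general = "%#10.5g" % 0.5    # trailing zeroes kept, width 10

for value in (signed, left, zero_filled, hex_prefixed, general):
    print(repr(value))
```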
You can also use the string format operator
% to format a set of values
from a dictionary D:
f % D
In this form, the general form for a format code is:
%(k)[p][m[.n]]c
where k
is a key in dictionary D, and the rest of the
format code is as in the usual
string format
operator. For each format code, the value of
D[k] is used.
For example, suppose D is
the dictionary {'baz':39,
'foo':'X'}; then
("=%(foo)s=%(baz)03d=" % D) yields
'=X=039='.
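As a further sketch of dictionary formatting, the code below fills a line of text from a hypothetical inventory record; the keys and values are invented for illustration.

```python
# The example from the text.
D = {'baz': 39, 'foo': 'X'}
framed = "=%(foo)s=%(baz)03d=" % D

# A hypothetical inventory record; keys and values are illustrative only.
record = {'item': 'kiwis', 'count': 49, 'price': 1.5}
line = "%(count)d pallets of %(item)s at %(price)6.2f each" % record
print(line)

# A key may be used more than once, and unused keys are simply ignored.
echo = "%(item)s, %(item)s!" % record
print(echo)
```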
Functions:
str(obj): Converts obj, an
object of any type, to a string. For example,
str(17) produces the string
'17'.
unicode(s[,enc[,errs]]): Converts an object s,
of any type, to a
Unicode string. The optional enc
argument specifies an encoding, and the optional errs
argument specifies what to do in case of errors
(see the
Python Library Reference
for details).
raw_input(p): Prompt for input with string
p, then return
a line entered by the user, without the newline.
p
may be omitted for unprompted input.
These methods are available on any string or Unicode
object S:
S.capitalize(): Return S with its
first character capitalized.
S.center(w): Return S centered
in a string of width w,
padded with spaces. If w<=len(S), the result is a copy of
S.
Example:
'x'.center(4) returns
' x  '.
S.count(t[,start[,end]]): Return the number of times string
t occurs in
S.
To search only a slice S[start:end] of S, supply
the start
and end
arguments.
S.endswith(t[,start[,end]]): Predicate
to test whether S
ends with string t. If you supply
the optional
start
and end
arguments, it tests whether the slice
S[start:end]
ends with
t.
S.expandtabs([tabsize]): Returns a copy of S with all tabs
expanded to spaces. The optional
tabsize argument specifies the number
of spaces between tab stops; the default is 8.
S.find(t[,start[,end]]): If string t is not found in
S,
return -1; otherwise return the index of the
first position in S that matches
t.
For example,
"banana".find("an")
returns 1. The optional
start and end
arguments restrict the search to slice
S[start:end].
S.index(t[,start[,end]]): Works like .find(),
but if t is not found, it
raises a ValueError exception.
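The contrast between .find() and .index() can be sketched as follows; the variable names are illustrative.

```python
word = "banana"
first = word.find("an")       # index of the first match
missing = word.find("xy")     # -1 when there is no match
later = word.find("an", 2)    # search restricted to the slice word[2:]

# .index() signals a miss with an exception instead of returning -1.
try:
    word.index("xy")
    raised = False
except ValueError:
    raised = True

print("%d %d %d %s" % (first, missing, later, raised))
```

Testing the result of .find() against -1 suits cases where a miss is expected; .index() suits cases where a miss indicates a bug and should not pass silently.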
S.isalnum(): Predicate
that tests whether S
is nonempty and all its characters are alphanumeric.
S.isalpha(): Predicate
that tests whether S
is nonempty and all its characters are letters.
S.isdigit(): Predicate
that tests whether S
is nonempty and all its characters are digits.
S.islower(): Predicate
that tests whether S
contains at least one letter and all its letters are
lowercase.
S.isspace(): Predicate
that tests whether S
is nonempty and all its characters are whitespace
characters.
In Python, the characters
considered whitespace include
' ' (space, called SP in ASCII),
'\n' (newline, NL),
'\r' (return, CR),
'\t' (tab, HT),
'\f' (form feed, FF), and
'\v' (vertical tab, VT).
S.isupper(): Predicate
that tests whether S
contains at least one letter and all its letters are
uppercase.
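To see the predicates side by side, the sketch below runs all six against a few sample strings. The helper classify() is hypothetical, written only for this illustration.

```python
def classify(s):
    """Hypothetical helper: return the names of the predicates that hold for s."""
    names = ["isalnum", "isalpha", "isdigit", "islower", "isspace", "isupper"]
    return [name for name in names if getattr(s, name)()]

samples = ["abc", "ABC", "123", " \t\n", ""]
for s in samples:
    print("%r: %s" % (s, ", ".join(classify(s))))
```

Note that every predicate is false for the empty string.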
S.join(L): L
must be a sequence of strings. Returns a string containing
the members of the sequence with copies of string
S
inserted between them. For example,
'/'.join(['foo', 'bar', 'baz'])
returns the string
'foo/bar/baz'.
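A short sketch of .join() with different separators follows; the list contents are illustrative.

```python
parts = ['foo', 'bar', 'baz']
path = '/'.join(parts)
glued = ''.join(parts)   # an empty separator simply concatenates

# Members that are not strings must be converted first.
numbers = ', '.join([str(n) for n in [1, 2, 3]])

print(path)
print(glued)
print(numbers)
```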
S.ljust(w): Return a copy of S left-justified
in a field of width
w,
padded with spaces. If w<=len(S), the result is a copy of
S.
Example: "Ni".ljust(4)
returns "Ni  ".
S.lower(): Returns a copy of S
with all uppercase letters replaced by their
lowercase equivalent.
S.lstrip([c]): Return S
with all leading characters from string
c
removed. The default value for
c
is a string containing all the
whitespace
characters.
S.replace(old,new[,max]): Return a copy of S with all occurrences
of string old replaced by string
new.
Normally, all occurrences are replaced; if you
want to limit the number of replacements, pass
that limit as the
max
argument.
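The effect of the max argument can be sketched like this; the sample text is invented.

```python
text = "spam, spam, spam and eggs"
all_replaced = text.replace("spam", "ham")
first_two = text.replace("spam", "ham", 2)   # replace at most two occurrences

print(all_replaced)
print(first_two)
print(text)   # the original string is unchanged
```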
S.rfind(t[,start[,end]]): Like .find(),
but if t
occurs in S, this method returns
the highest starting index.
For example,
"banana".rfind("an")
returns 3.
S.rjust(w): Return a copy of S right-justified
in a field of width w,
padded with spaces. If w<=len(S), the result is a copy of
S.
S.rstrip([c]): Return S
with all trailing characters from string
c
removed. The default value for
c
is a string containing all the
whitespace
characters.
S.split([d[,max]]): Returns a list of strings
[s0, s1, ...] made by splitting
S
into pieces wherever the delimiter string
d
is found. The default is to split up
S
into pieces wherever clumps of one or more
whitespace
characters are found. Some examples:
>>> "I'd annex \t \r the Sudetenland".split()
["I'd", 'annex', 'the', 'Sudetenland']
>>> '3/crunchy frog/ Bath & Wells'.split('/')
['3', 'crunchy frog', ' Bath & Wells']
>>> '//Norwegian Blue/'.split('/')
['', '', 'Norwegian Blue', '']
>>> 'never<*>pay<*>plan<*>'.split('<*>')
['never', 'pay', 'plan', '']
The optional max argument limits
the number of pieces removed from the front
of S; whatever remains is returned unsplit as the last element.
For example,
'a/b/c/d/e'.split('/',2)
yields the list
['a', 'b', 'c/d/e'].
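As a sketch of a typical use, the code below splits a colon-delimited record. The record contents are made up for illustration.

```python
# A made-up, colon-delimited record.
record = "alice:x:1000:1000:Alice Example:/home/alice"

fields = record.split(":")     # split at every delimiter
head = record.split(":", 2)    # at most two splits; the rest stays together
words = "  two   words \n".split()   # default: split on runs of whitespace

print(fields)
print(head)
print(words)

# Joining with the same delimiter reverses an explicit-delimiter split.
round_trip = ":".join(fields)
```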
S.splitlines([keepends]): Splits S into lines and returns a list of
the lines as strings. Discards the line
separators unless the optional
keepends
argument is true.
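The effect of keepends can be sketched as follows; the sample text mixes two line-ending styles on purpose.

```python
text = "first\nsecond\r\nthird"

lines = text.splitlines()      # separators discarded
kept = text.splitlines(True)   # separators kept on each line

print(lines)
print(kept)

# With the separators kept, joining the pieces restores the original text.
restored = "".join(kept)
```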
S.startswith(t[,start[,end]]): Predicate
to test whether S
starts with string t. Otherwise similar
to .endswith().
S.strip([c]): Return S
with all leading and trailing characters from string
c
removed. The default value for
c
is a string containing all the
whitespace
characters.
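The three stripping methods can be compared in a short sketch; the sample strings are illustrative.

```python
padded = "  \t hello \n"
both = padded.strip()         # leading and trailing whitespace removed
left_only = padded.lstrip()   # only leading whitespace removed
right_only = padded.rstrip()  # only trailing whitespace removed

# The argument is a set of characters to remove, not a prefix or suffix.
trimmed = "xxyhelloyx".strip("xy")

for value in (both, left_only, right_only, trimmed):
    print(repr(value))
```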
S.swapcase(): Return a copy of S
with each lowercase character replaced by
its uppercase equivalent, and vice versa.
S.translate(new[,drop]): This function is used to translate or remove
each character of S. The
new
argument is a string of exactly 256 characters,
and each character x
of the result is replaced by
new[ord(x)]. If you would like certain
characters removed from S before the
translation, provide a string of those characters
as the drop argument.
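The table lookup described above can be modeled in a few lines of plain Python. The function translate() below is a hypothetical stand-in written for this illustration, not the built-in method, so it behaves the same whichever interpreter version runs it; it assumes every character of the input has a code below 256.

```python
def translate(s, new, drop=""):
    """Hypothetical pure-Python model of S.translate(new, drop)."""
    kept = [x for x in s if x not in drop]        # remove dropped characters first
    return "".join([new[ord(x)] for x in kept])   # then look each one up in the table

# Build an identity table of 256 characters, then map "a" to "A".
table = [chr(i) for i in range(256)]
table[ord("a")] = "A"
table = "".join(table)

result = translate("banana bread", table, "n")
print(result)
```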
S.upper(): Return a copy of S with all lowercase
characters replaced by their uppercase
equivalents.
S.zfill(w): Return a copy of S
left-filled with '0'
characters to width w. For example,
'12'.zfill(5) returns
'00012'.
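The padding methods (.ljust(), .rjust(), .center(), and .zfill()) are easiest to compare side by side, as in this sketch. The brackets in the printed output are only there to make the padding visible.

```python
label = "Ni"
left = label.ljust(4)       # padding on the right
right = label.rjust(4)      # padding on the left
middle = label.center(4)    # padding on both sides
number = "12".zfill(5)      # zeroes on the left

for col in (left, right, middle, number):
    print("[%s]" % col)

# A width no larger than the string leaves it unchanged.
unchanged = label.ljust(1)
```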