
21.5.1. Characters in regular expressions

Note: The raw string notation r'...' is most useful for regular expressions; see raw strings, above.
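As a small illustration of why raw strings matter here, the sketch below compares a pattern written with and without the r prefix. The variable names and the sample text are made up for this example.

```python
import re

# Without the r prefix, Python turns '\b' into a backspace character
# before the re module ever sees the pattern.
plain = '\bcat\b'    # backspace, 'c', 'a', 't', backspace
raw = r'\bcat\b'     # backslash, 'b', 'c', 'a', 't', backslash, 'b'

plain_len = len(plain)   # 5 characters
raw_len = len(raw)       # 7 characters

text = 'a cat sat'
raw_match = re.search(raw, text)      # \b is seen as a word boundary: matches 'cat'
plain_match = re.search(plain, text)  # looks for literal backspaces: no match (None)
```

The raw form passes the backslashes through to the regular expression engine unchanged, which is usually what is wanted.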

These characters have special meanings in regular expressions:

. Matches any character except a newline.
^ Matches the start of the string.
$ Matches the end of the string, or just before a newline at the end of the string.
r* Matches zero or more repetitions of regular expression r.
r+ Matches one or more repetitions of r.
r? Matches zero or one r.
r*? Non-greedy form of r*; matches as few characters as possible. The normal * operator is greedy: it matches as much text as possible.
r+? Non-greedy form of r+.
r?? Non-greedy form of r?.
r{m,n} Matches from m to n repetitions of r. The form r{m} matches exactly m repetitions. For example, r'x{3,5}' matches between three and five copies of letter 'x'; r'(bl){4}' matches the string 'blblblbl'.
r{m,n}? Non-greedy version of the previous form.
[...] Matches one character from a set of characters. You can put all the allowable characters inside the brackets, or use a-b to mean all characters from a to b inclusive. For example, regular expression r'[abc]' will match either 'a', 'b', or 'c'. Pattern r'[0-9a-zA-Z]' will match any single letter or digit.
[^...] Matches any character not in the given set.
rs Matches expression r followed by expression s.
r|s Matches either r or s.
(r) Matches r and forms it into a group that can be retrieved separately after a match; see MatchObject, below. Groups are numbered starting from 1.
(?:r) Matches r but does not form a group for later retrieval.
(?P<n>r) Matches r and forms it into a named group, with name n, for later retrieval.
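The operators in the list above can be tried out with a short sketch like the following. The sample strings and variable names are invented for illustration; only functions from the standard re module are used.

```python
import re

text = '<b>bold</b> and <i>italic</i>'

# Greedy: .* takes as much as it can, so the match runs to the last '>'.
greedy = re.search(r'<.*>', text).group()    # the whole of text
# Non-greedy: .*? stops at the first '>' that allows a match.
lazy = re.search(r'<.*?>', text).group()     # '<b>'

# Bounded repetition, greedy and non-greedy.
bounded = re.search(r'x{3,5}', 'xxxxxxx').group()        # 'xxxxx'
bounded_lazy = re.search(r'x{3,5}?', 'xxxxxxx').group()  # 'xxx'

# A negated character set and an alternation.
non_digits = re.findall(r'[^0-9]+', 'ab12cd')      # ['ab', 'cd']
pets = re.findall(r'cat|dog', 'cat dog bird')      # ['cat', 'dog']

# A named group, a non-capturing group, and a numbered group.
# The (?:0x)? part is matched but does not receive a group number.
m = re.match(r'(?P<key>[a-z]+)=(?:0x)?([0-9a-f]+)', 'size=0x1f')
key_by_number = m.group(1)     # 'size'
key_by_name = m.group('key')   # 'size'
value = m.group(2)             # '1f'
```

Note that a named group still counts as a numbered group, so the group after it here is number 2.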

These special sequences are recognized:

\n Matches the same text as a group that matched earlier, where n is the number of that group. For example, r'([a-zA-Z]+):\1' matches the string "foo:foo".
\A Matches only at the start of the string.
\b Matches the empty string but only at the start or end of a word (where a word is set off by whitespace or a non-alphanumeric character). For example, r'foo\b' would match "foo" but not "foot".
\B Matches the empty string when not at the start or end of a word.
\d Matches any digit.
\D Matches any non-digit.
\s Matches any whitespace character.
\S Matches any non-whitespace character.
\w Matches any alphanumeric character or the underscore (_).
\W Matches any character that is neither alphanumeric nor an underscore.
\Z Matches only at the end of the string.
\\ Matches a backslash (\) character.
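A brief sketch of several of these special sequences in use follows. The sample strings and variable names are assumptions chosen for this example.

```python
import re

# \1 must repeat exactly what group 1 matched.
same = re.match(r'([a-zA-Z]+):\1', 'foo:foo')      # matches
different = re.match(r'([a-zA-Z]+):\1', 'foo:bar') # None

# \b requires a word boundary after 'foo'.
whole_word = re.search(r'foo\b', 'foo bar')   # matches
inside_word = re.search(r'foo\b', 'foot')     # None

# \d, \s and \w as character classes.
numbers = re.findall(r'\d+', 'room 12, floor 3')      # ['12', '3']
fields = re.split(r'\s+', 'a  b\tc')                  # ['a', 'b', 'c']
words = re.findall(r'\w+', 'snake_case, two-words')   # ['snake_case', 'two', 'words']

# \Z anchors at the very end of the string.
at_end = re.search(r'end\Z', 'the end')       # matches

# \\ matches one literal backslash.
slash_pos = re.search(r'\\', r'C:\temp').start()   # 2
```

The \w example shows that an underscore counts as a word character while a hyphen does not.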