When your input matches the parser you have built, the
.parseString() method returns an instance of class ParseResults.
For a complex structure, this instance may have many
different bits of information inside it that represent the
important pieces of the input. The exact internal
structure of a
ParseResults instance depends
on how you build up your top-level parser.
You can access the resulting ParseResults
instance in two different ways:
As a list. A parser that matches n internal components will return a result
r that acts like a list of the n
strings that matched those components. You can extract
the nth element as r[n], or
you can convert it to an actual list of strings with list(r).
    >>> import pyparsing as pp
    >>> number = pp.Word(pp.nums)
    >>> result = number.parseString('17')
    >>> print result
    ['17']
    >>> type(result)
    <class 'pyparsing.ParseResults'>
    >>> result[0]
    '17'
    >>> list(result)
    ['17']
    >>> numberList = pp.OneOrMore(number)
    >>> print numberList.parseString('17 33 88')
    ['17', '33', '88']
As a dictionary. You can attach a results name
to a parser by calling its
.setResultsName() method (see Section 5.1, “ParserElement: The basic parser
building block”).
Once you have done that, you can extract the matched
string from the ParseResults instance r as r[name], where name is the results name you attached.
    >>> number = pp.Word(pp.nums).setResultsName('nVache')
    >>> result = number.parseString('17')
    >>> print result
    ['17']
    >>> result['nVache']
    '17'
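The two access styles can be mixed on the same result. The following is a minimal sketch, not taken from the original text: the parser entry and the results names 'count' and 'animal' are made up for illustration, and it keeps the camelCase method names used throughout this document (recent pyparsing releases also offer snake_case spellings).

```python
import pyparsing as pp

# Two named fields in one parser: a count followed by a word.
# The names 'count' and 'animal' are illustrative only.
count = pp.Word(pp.nums).setResultsName('count')
animal = pp.Word(pp.alphas).setResultsName('animal')
entry = count + animal

result = entry.parseString('17 cows')

# List-style access: positions follow the order of the match.
print(result[0], result[1])

# Convert to a real Python list of strings.
print(list(result))

# Dictionary-style access: look fields up by results name.
print(result['count'], result['animal'])
```

Positional access is convenient for short, fixed sequences; results names tend to stay readable as the grammar grows, because the lookup no longer depends on the order of the components.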
Here are some general principles for structuring your
parser's ParseResults instance.
As with any nontrivial program, structuring a parser of any complexity will be more tractable if you use the “divide and conquer” principle, also known as stepwise refinement.
In practice, this means that the top-level
ParseResults should contain no more than, say,
five or seven components. If there are too many
components at this level, look at the total input and
divide it into two or more subparsers. Then structure
the top level so that it contains just those pieces. If
necessary, divide the smaller parsers into smaller
parsers, until each parser is clearly defined in terms of
built-in primitive functions or other parsers that you
have built.
pp.Group() (see Section 5.13, “Group: Group repeated items into a
list”) is the basic tool for
creating these levels of abstraction.
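As a rough sketch of stepwise refinement, assume a made-up input format of a name, a colon, and a list of numbers. The parser names below (name, scores, record) are hypothetical, and pp.Suppress is used only to keep the colon out of the result; it is not discussed in this section.

```python
import pyparsing as pp

# Small building blocks, each defined in terms of primitives.
word = pp.Word(pp.alphas)
number = pp.Word(pp.nums)

# Mid-level subparsers. Wrapping each in Group makes it
# contribute exactly one component to the level above.
name = pp.Group(pp.OneOrMore(word))
scores = pp.Group(pp.OneOrMore(number))

# Top level: two components, however long the input is.
record = name + pp.Suppress(':') + scores

result = record.parseString('Arthur Dent: 42 7 19')
print(len(result))
print(result.asList())
```

The top-level result here has only two components, one per subparser, and each of those can be inspected or refined separately.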
Normally, when your parser matches multiple things,
the result is a pp.ParseResults instance
that acts like a list of the strings that matched.
For example, if your parser matches a list of words,
it might return a pp.ParseResults instance that
prints as if it were a list.
Using the Python type() function, we can see
the actual type of the result, and that the
components are Python strings.
    >>> word = pp.Word(pp.alphas)
    >>> phrase = pp.OneOrMore(word)
    >>> result = phrase.parseString('farcical aquatic ceremony')
    >>> print result
    ['farcical', 'aquatic', 'ceremony']
    >>> type(result)
    <class 'pyparsing.ParseResults'>
    >>> type(result[0])
    <type 'str'>
However, when you apply pp.Group() to
some parser, all the matching pieces are returned in
a single pp.ParseResults that acts
like a list.
For example, suppose your program is disassembling a sequence of words, and you want to treat the first word one way and the rest of the words another way. Here's our first attempt.
    >>> ungrouped = word + phrase
    >>> result = ungrouped.parseString('imaginary farcical aquatic ceremony')
    >>> print result
    ['imaginary', 'farcical', 'aquatic', 'ceremony']
That result doesn't really match our concept that the parser is a sequence of two things: a single word, followed by a sequence of words.
By applying pp.Group() like this, we
get a parser that will return a sequence of two
items that match our concept.
    >>> grouped = word + pp.Group(phrase)
    >>> result = grouped.parseString('imaginary farcical aquatic ceremony')
    >>> print result
    ['imaginary', ['farcical', 'aquatic', 'ceremony']]
    >>> print result[1]
    ['farcical', 'aquatic', 'ceremony']
    >>> type(result[1])
    <class 'pyparsing.ParseResults'>
    >>> result[1][0]
    'farcical'
    >>> type(result[1][0])
    <type 'str'>
The grouped parser has two components:
a word and a pp.Group.
Hence, the result returned acts like a two-element
list.
The first element is an actual string, 'imaginary'.
The second part is another
pp.ParseResults instance that acts like a
list of strings.
So for larger grammars, the pp.ParseResults
instance, which the top-level parser returns when it matches,
will typically be a many-layered structure containing this
kind of mixture of ordinary strings and other instances of
pp.ParseResults.
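One way to cope with such a layered result is to walk it recursively, treating each nested pp.ParseResults as a sub-list. The helper flatten() below is a hypothetical illustration, not part of pyparsing; it is applied to a parser that puts one word in front of a grouped sequence of words.

```python
import pyparsing as pp

def flatten(results):
    """Yield every matched string in a nested ParseResults, depth first.

    Hypothetical helper for illustration; not part of pyparsing.
    """
    for item in results:
        if isinstance(item, pp.ParseResults):
            # A nested group: descend into it.
            yield from flatten(item)
        else:
            # An ordinary matched string.
            yield item

word = pp.Word(pp.alphas)
phrase = pp.OneOrMore(word)
grouped = word + pp.Group(phrase)
result = grouped.parseString('imaginary farcical aquatic ceremony')

# asList() converts the whole structure to nested Python lists.
print(result.asList())

# flatten() discards the nesting and keeps only the strings.
print(list(flatten(result)))
```

The isinstance() check is what distinguishes the two kinds of component: plain strings are yielded directly, while nested instances are traversed in turn.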
The next section will give you some suggestions on managing the structure of these beasts.