For parsers that appear only once at a specific level,
associate a results name with that parser. This allows
you to retrieve the matched text by treating the returned
ParseResults instance as a dictionary, and
the results name as the key.
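As a minimal sketch of dictionary-style access, the following assumes a hypothetical two-field input line (a swamp name followed by an acreage). The 'acres' name and the sample text are illustrative only.

```python
import pyparsing as pp

# Hypothetical grammar: a swamp name followed by its area in acres.
swampName = pp.Word(pp.alphas).setResultsName('swampName')
acres = pp.Word(pp.nums).setResultsName('acres')
line = swampName + acres

result = line.parseString('Okefenokee 438000')

# Treat the ParseResults like a dictionary keyed by results name.
print(result['swampName'])   # Okefenokee
print(result['acres'])       # 438000
```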
Design rules for this option:
Access by name is more robust than access by
position. The structures you are working on may
change over time. If you access the results as a
list, the position of a given element may suddenly
shift when the underlying structure changes.
If, however, you give the results some name like
'swampName', the access code
result['swampName'] will probably continue
to work even if other names are added later.
At any given level of a
ParseResults, use either access
by position or access by key (that is, by results
name), not both. If some subelements of a parser have
a results name and others do not, the matching text
for all those subelements will be mixed together in
the result.
Here's an example showing what happens when you mix positional and named access at the same level: in bull-riding, the total score is a combination of the rider's score and the bull's score.
>>> rider = pp.Word(pp.alphas).setResultsName('Rider')
>>> bull = pp.Word(pp.alphas).setResultsName('Bull')
>>> score = pp.Word(pp.nums+'.')
>>> line = rider + score + bull + score
>>> result = line.parseString('Mauney 46.5 Asteroid 46')
>>> print(result)
['Mauney', '46.5', 'Asteroid', '46']
In the four-element list shown above, you can access the first and third elements by name, but the second and fourth would be accessible only by position.
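The mixed access described here can be checked directly. This sketch rebuilds the same parser as a standalone script and shows which elements are reachable by name and which only by index.

```python
import pyparsing as pp

rider = pp.Word(pp.alphas).setResultsName('Rider')
bull = pp.Word(pp.alphas).setResultsName('Bull')
score = pp.Word(pp.nums + '.')          # no results name on the scores
line = rider + score + bull + score

result = line.parseString('Mauney 46.5 Asteroid 46')

# Only the two named elements appear as keys.
print(sorted(result.keys()))            # ['Bull', 'Rider']
print(result['Rider'], result['Bull'])  # Mauney Asteroid

# The scores have no name, so they are reachable only by position.
print(result[1], result[3])             # 46.5 46
```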
A more sensible way to structure this parser would be to write a parser for the combination of a name and a score, and then combine two of those for the overall parser.
>>> name = pp.Word(pp.alphas).setResultsName('name')
>>> score = pp.Word(pp.nums+'.').setResultsName('score')
>>> nameScore = pp.Group(name + score)
>>> line = nameScore.setResultsName('Rider') + nameScore.setResultsName('Bull')
>>> result = line.parseString('Mauney 46.5 Asteroid 46')
>>> result['Rider']['name']
'Mauney'
>>> result['Bull']['score']
'46'
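With every value reachable by name, downstream code no longer depends on positions. The sketch below assumes that "combination" means a simple sum of the two scores; that is an assumption for illustration, and the variable `total` is hypothetical.

```python
import pyparsing as pp

name = pp.Word(pp.alphas).setResultsName('name')
score = pp.Word(pp.nums + '.').setResultsName('score')
nameScore = pp.Group(name + score)
line = nameScore.setResultsName('Rider') + nameScore.setResultsName('Bull')

result = line.parseString('Mauney 46.5 Asteroid 46')

# Assumption: the total is the sum of the rider's and the bull's scores.
total = float(result['Rider']['score']) + float(result['Bull']['score'])
print(result['Rider']['name'], 'on', result['Bull']['name'], 'scored', total)
```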
Don't use a results name for a repeated element. If
you do, only the last one will be accessible by
results name in the ParseResults.
>>> catName = pp.Word(pp.alphas).setResultsName('catName')
>>> catList = pp.OneOrMore(catName)
>>> result = catList.parseString('Sandy Mocha Bits')
>>> result['catName']
'Bits'
>>> list(result)
['Sandy', 'Mocha', 'Bits']
A better approach is to wrap the entire repeated
sequence in a pp.Group() and then apply the results
name to that group.
>>> owner = pp.Word(pp.alphas).setResultsName('owner')
>>> catList = pp.Group(pp.OneOrMore(catName)).setResultsName('cats')
>>> line = owner + catList
>>> result = line.parseString('Carol Sandy Mocha Bits')
>>> result['owner']
'Carol'
>>> print(result['cats'])
['Sandy', 'Mocha', 'Bits']
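As a standalone sketch of the grouped approach, the script below leaves the repeated element unnamed and names only the enclosing group, then iterates over the group. The variable `pairs` is hypothetical and used only for illustration.

```python
import pyparsing as pp

owner = pp.Word(pp.alphas).setResultsName('owner')
catName = pp.Word(pp.alphas)            # repeated element: no results name
catList = pp.Group(pp.OneOrMore(catName)).setResultsName('cats')
line = owner + catList

result = line.parseString('Carol Sandy Mocha Bits')

# The named group behaves like a list of the repeated matches.
pairs = [(result['owner'], cat) for cat in result['cats']]
for ownerName, cat in pairs:
    print(ownerName, 'owns', cat)
```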