5.7. Dict: A scanner for tables


The Dict class is a highly specialized pattern used to extract data from text arranged in rows and columns, where the first column contains labels for the remaining columns. The pattern argument must be a parser that describes a two-level structure such as a Group within a Group. Other group-like patterns such as the delimitedList() function may be used.
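As a rough sketch of the kind of two-level structure meant here, the pattern below builds one Group per row and repeats it with OneOrMore(), then wraps the whole thing in Dict. The sample data and the names label, values, row and table are invented for illustration; they are not part of pyparsing.

```python
import pyparsing as pp

# Hypothetical table: each row is a label followed by some numbers.
text = """
north 12 7 30
south 4 19 22
"""

label = pp.Word(pp.alphas)              # first column: the row label
values = pp.OneOrMore(pp.Word(pp.nums)) # remaining columns
row = pp.Group(label + values)          # inner level: one group per row
table = pp.Dict(pp.OneOrMore(row))      # outer level: the sequence of rows

result = table.parseString(text)
for key in result.keys():
    print(key, result[key].asList())
```

Because the values are digits and the labels are letters, each row ends where the next label begins, so no explicit row delimiter is needed in this particular sketch. The values come back as strings, not integers.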

The constructor returns a parser whose .parseString() method returns a ParseResults instance, as most parsers do. In this case, however, the ParseResults instance can also act like a dictionary: its keys are the row labels, and each value is a list of the remaining items in that row.

Here is an example.

#!/usr/bin/env python
# catbird: Example of pyparsing.Dict pattern
import pyparsing as pp

data = "cat Sandy Mocha Java|bird finch verdin siskin"
rowPat = pp.OneOrMore(pp.Word(pp.alphas))
bigPat = pp.Dict(pp.delimitedList(pp.Group(rowPat), "|"))
result = bigPat.parseString(data)
for rowKey in result.keys():
    print("result['{0}']={1}".format(rowKey, result[rowKey]))

Here is the output from that script. The order in which the keys appear may differ depending on the Python version.

result['bird']=['finch', 'verdin', 'siskin']
result['cat']=['Sandy', 'Mocha', 'Java']
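The dictionary-like behavior can be explored a little further. The following is a minimal sketch that reuses the same data and pattern as the script above; the variable names has_cat, cats, dogs, as_dict and as_rows are made up for this illustration.

```python
import pyparsing as pp

data = "cat Sandy Mocha Java|bird finch verdin siskin"
rowPat = pp.OneOrMore(pp.Word(pp.alphas))
bigPat = pp.Dict(pp.delimitedList(pp.Group(rowPat), "|"))
result = bigPat.parseString(data)

# Membership test and lookup by row label.
has_cat = "cat" in result
cats = result["cat"].asList()

# .get() returns the supplied default when the label is absent.
dogs = result.get("dog", [])

# .asDict() converts the labeled rows to a plain Python dict;
# .asList() returns the rows as nested lists, labels included.
as_dict = result.asDict()
as_rows = result.asList()

print(has_cat, cats, dogs)
print(as_dict)
print(as_rows)
```

Note that the result still carries the full rows as a list, so the same ParseResults instance can be used either by label or by position.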