
6.4. dictOf(): Build a dictionary from key/value pairs

pp.dictOf(keyParser, valueParser)

This function builds a parser that matches a sequence of key text alternating with value text. When matched, this parser will deposit a dictionary-like value into the returned ParseResults with those keys and values.

The keyParser argument is a parser that matches the key text and the valueParser is a parser that matches the value text.

Here is a very simple example to give you the idea. The text to be matched is a sequence of five-character items, each of which is a one-letter color code followed by a four-character color value (a "#" and three hexadecimal digits).

>>> colorText = 'R#F00 G#0F0 B#00F'
>>> colorKey = pp.Word(pp.alphas, exact=1) # Matches 1 letter
>>> rgbValue = pp.Word(pp.printables, exact=4) # Matches 4 characters
>>> rgbPat = pp.dictOf(colorKey, rgbValue)
>>> rgbMap = rgbPat.parseString(colorText)
>>> rgbMap.keys()
['B', 'R', 'G']
>>> rgbMap['G']
'#0F0'
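
To make the result concrete, here is a plain-Python sketch that builds the same mapping by hand, with no pyparsing involved. It assumes, as the example above does, that every item is one key letter followed by a four-character value. The helper name color_map is made up for this illustration.

```python
def color_map(text):
    """Map each one-letter key to the four characters that follow it."""
    result = {}
    for item in text.split():
        # Roughly mirror the exact=1 and exact=4 constraints used above.
        if len(item) != 5 or not item[0].isalpha():
            raise ValueError('bad item: %r' % item)
        result[item[0]] = item[1:]
    return result

rgb = color_map('R#F00 G#0F0 B#00F')
print(rgb['G'])
```

The parser built by dictOf() does this bookkeeping for you: the text matched by keyParser becomes the key, and the text matched by valueParser becomes the value.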

Here's a slightly more subtle example. The text has the form "degree, name; ...", where the degree part is the degree of the musical scale as a number, and the name part is the name of that note. Here's a first attempt.

>>> text = '1, do; 2, re; 3, mi; 4, fa; 5, sol; 6, la; 7, ti'
>>> key = pp.Word(pp.nums) + pp.Suppress(',')
>>> value = pp.Word(pp.alphas) + pp.Suppress(';')
>>> notePat = pp.dictOf(key, value)
>>> noteNames = notePat.parseString(text)
>>> noteNames.keys()
['1', '3', '2', '5', '4', '6']
>>> noteNames['4']
'fa'
>>> noteNames['7']
KeyError: '7'

Note that the last key-value pair is missing. This is because the value pattern requires a trailing semicolon, and the text string does not end with one. The parser simply stops after the last pair it can match and reports no error, so unless you check your work carefully, you might not notice that the last item is missing. This is one reason that it is good practice always to use the parseAll=True option when calling .parseString(). Notice how that reveals the error:

>>> noteNames = notePat.parseString(text, parseAll=True)
pyparsing.ParseException: Expected end of text (at char 43), (line:1, col:44)
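
The same failure mode can be imitated in plain Python with the standard re module. This is only an analogy, not how pyparsing is implemented. The hypothetical helper parse_pairs consumes "number, word;" pairs from the front of the text and stops at the first position where no pair matches. When its parse_all flag is set, it complains if any text is left over.

```python
import re

# One "degree, name;" pair. As in the first attempt, the semicolon is required.
PAIR = re.compile(r'\s*(\d+)\s*,\s*([A-Za-z]+)\s*;')

def parse_pairs(text, parse_all=False):
    """Collect leading key/value pairs; optionally insist on consuming all text."""
    result = {}
    pos = 0
    while True:
        m = PAIR.match(text, pos)
        if m is None:
            break  # stop quietly at the first position with no match
        result[m.group(1)] = m.group(2)
        pos = m.end()
    # With parse_all, leftover non-blank text is an error instead of being ignored.
    if parse_all and text[pos:].strip():
        raise ValueError('unparsed text remains: %r' % text[pos:])
    return result

text = '1, do; 2, re; 3, mi; 4, fa; 5, sol; 6, la; 7, ti'
notes = parse_pairs(text)
print(sorted(notes))  # the key '7' is absent
```

Calling parse_pairs(text, parse_all=True) raises ValueError here, because the final "7, ti" has no semicolon and is left unconsumed.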

It's easy enough to fix the definition of the text, but instead let's fix the parser so that it defines value as ending either with a semicolon or with the end of the string:

>>> value = pp.Word(pp.alphas) + (pp.StringEnd() | pp.Suppress(';'))
>>> notePat = pp.dictOf(key, value)
>>> noteNames = notePat.parseString(text)
>>> noteNames.keys()
['1', '3', '2', '5', '4', '7', '6']
>>> noteNames['7']
'ti'