7.13. indexer: cpNameReport()

indexer
# - - -   c p N a m e R e p o r t

def cpNameReport(uniData, entSetNames):
    '''Produce the index by code point full name.

      [ (uniData is a unidata.UniData instance) and
        (entSetNames is a set of entity set names) ->
          file by-full.dbk  :=  a DocBook tbody element containing
              an index of those entity sets by code point full name ]
    '''

Because there are often multiple entities for the same code point, we will store the UniEntity instances of interest in a dictionary whose keys are 2-tuples containing the major key (the code point's full name) and the minor key (the entity name, uppercased). Python compares tuples element by element, so when we extract the values sorted by key, they will be in the desired report order.
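To illustrate the ordering, here is a minimal sketch of the tuple-key technique using plain strings in place of UniEntity instances. The names `samples`, `full_map`, and `ordered`, and the sample data itself, are hypothetical and are not part of the indexer. Note that uppercasing the minor key keeps entity names from being separated by case during the sort.

```python
# Hypothetical stand-in data: (code point full name, entity name) pairs.
# In the real program these come from UniData and UniEntity instances.
samples = [
    ("BETA NAME", "bb"),
    ("ALPHA NAME", "Zed"),
    ("ALPHA NAME", "able"),
]

# Key each entry by (major key, uppercased minor key).
full_map = {}
for full_name, ent_name in samples:
    full_map[(full_name, ent_name.upper())] = ent_name

# Sorting the keys orders by full name first, then by uppercased
# entity name; a plain string sort would put "Zed" before "able".
ordered = [full_map[key] for key in sorted(full_map)]
print(ordered)
```

The two entries that share the full name "ALPHA NAME" come out adjacent, ordered by entity name without regard to case.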

indexer
    #-- 1
    # [ tbody  :=  a new, empty tbody et.Element
    #   outFile  :=  a new, empty file named BY_FULL_NAME_FILE
    #   fullMap  :=  a dictionary whose keys are tuples (F, E) where
    #       F is a uniChar.fullName of an entity that is in entSetNames
    #       and E is the uppercased UniEntity.id for that entity,
    #       and each related value is the UniEntity instance ]
    tbody = E.tbody()
    outFile = open(BY_FULL_NAME_FILE, 'w')
    fullMap = {}
    for uniChar in uniData.genChars():
        for uniEnt in uniChar.genEnts():
            if uniEnt.setName in entSetNames:
                fullMap[(uniChar.fullName, uniEnt.id.upper())] = uniEnt

    #-- 2
    # [ tbody  +:=  rows representing the values of fullMap in
    #               ascending order by key ]
    for key in sorted(fullMap):
        outRow(tbody, fullMap[key])

    #-- 3
    # [ outFile  +:=  tbody, serialized as XML ]
    outFile.write(et.tostring(tbody, pretty_print=True))
    outFile.close()