
7.10. indexer: entNameReport

indexer
# - - -   e n t N a m e R e p o r t

def entNameReport(uniData, entSetNames):
    '''Produce the index by entity name.

      [ (uniData is a unidata.UniData instance) and
        (entSetNames is a set of entity set names) ->
          file by-ent.dbk  :=  a DocBook tbody element containing
              an index of the entities in uniData whose set names
              are in entSetNames
          file OUT_MODULE_PY  :=  a Python module defining constants
              for each of those entities ]
    '''

This report is sorted alphabetically by entity name, ignoring case. To hold the UniEntity instances representing the entities of interest, we'll use a dictionary keyed by entity name, and sort its keys by their uppercased form.
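
As a rough illustration of this ordering, the sketch below sorts a few made-up entity names (not taken from uniData) with key=str.upper, the same key the report code passes to sorted(). The names keep their original case in the dictionary, so two names that differ only in case remain separate entries.

```python
# Hypothetical entity names mapped to code point labels; illustration only.
name_map = {
    "beta": "U003B2",
    "Alpha": "U00391",
    "alpha": "U003B1",
    "Gamma": "U00393",
}

# Compare names by their uppercased form so that case does not affect order.
ordered = sorted(name_map, key=str.upper)

# A plain sort would place every capitalized name before every lowercase one.
plain = sorted(name_map)

# Uppercasing the keys themselves would merge "Alpha" and "alpha".
folded = {name.upper() for name in name_map}
```

Sorting by an uppercased key, rather than storing uppercased keys, is what keeps "Alpha" and "alpha" as distinct rows in this sketch.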

In some entity sets (although not, to my knowledge, in ISO 9573), the same entity name can be attached to more than one code point within a single entity set. In particular, in set 8879-isogrk3, entity ε appears under both U003B5 (ε) and U003F5 (ϵ). In such cases, we keep the first occurrence and explicitly ignore any later ones. If you really want the variant, you can still reach it through a numeric character reference to U003F5 (ϵ).
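
The first-occurrence rule can be sketched as follows. This is only an illustration: Ent is a hypothetical stand-in for UniEntity, the entity name "dupName" is invented, and only the set name and the two code points come from the example above.

```python
from collections import namedtuple

# Hypothetical stand-in for UniEntity; the real class is defined elsewhere.
Ent = namedtuple("Ent", "id setName codePoint")

def first_wins(ents, set_names):
    """Map each entity name to the first entity seen with that name."""
    name_map = {}
    for ent in ents:
        if ent.setName in set_names:
            # setdefault stores ent only if the name is not yet present,
            # so later duplicates are silently ignored.
            name_map.setdefault(ent.id, ent)
    return name_map

ents = [
    Ent("dupName", "8879-isogrk3", "U003B5"),
    Ent("dupName", "8879-isogrk3", "U003F5"),   # same name, later code point
    Ent("other", "some-other-set", "U00041"),   # set not selected
]
name_map = first_wins(ents, {"8879-isogrk3"})
```

Here name_map holds a single entry for "dupName", pointing at the U003B5 entity, because that one was encountered first.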

indexer
    #-- 1
    # [ tbody  :=  a new tbody et.Element
    #   outFile  :=  a new, empty file named BY_ENT_NAME_FILE
    #   outModule  :=  a new file named OUT_MODULE_PY containing
    #       the boilerplate for the module
    #   nameMap  :=  a dictionary whose keys are the entity names of
    #       characters in uniData whose entity sets are in entSetNames,
    #       and each related value is the first UniEntity instance
    #       found with that name ]
    tbody = E.tbody()
    outFile = open(BY_ENT_NAME_FILE, 'w')
    outModule = startModule()
    nameMap = {}
    for uniChar in uniData.genChars():
        for uniEnt in uniChar.genEnts():
            if uniEnt.setName in entSetNames:
                if uniEnt.id not in nameMap:
                    nameMap[uniEnt.id] = uniEnt

    #-- 2
    # [ tbody  +:=  rows displaying the values of nameMap in
    #               ascending order by key, case-insensitive
    #   outModule  +:=  Python declarations for those values ]
    for entName in sorted(nameMap, key=str.upper):
        ent = nameMap[entName]
        outRow(tbody, ent)
        moduleWrite(outModule, ent)

    #-- 3
    # [ outFile  +:=  tbody rendered as XML ]
    outFile.write(et.tostring(tbody, pretty_print=True))
    outFile.close()
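
For readers who want to try steps 2 and 3 in isolation, here is a minimal sketch that uses the standard library's xml.etree.ElementTree in place of the et module used above. The row layout (one DocBook row with two entry children) and the helper name build_tbody are assumptions made for illustration; the real layout is whatever outRow produces.

```python
import xml.etree.ElementTree as ET

def build_tbody(name_map):
    """Return a tbody element with one row per name, sorted ignoring case."""
    tbody = ET.Element("tbody")
    for name in sorted(name_map, key=str.upper):
        # Assumed layout: entity name in the first entry, value in the second.
        row = ET.SubElement(tbody, "row")
        ET.SubElement(row, "entry").text = name
        ET.SubElement(row, "entry").text = name_map[name]
    return tbody

# Made-up sample data; the real values are UniEntity instances.
name_map = {"beta": "U003B2", "Alpha": "U00391"}

# encoding="unicode" makes tostring return a str rather than bytes.
xml_text = ET.tostring(build_tbody(name_map), encoding="unicode")
```

The resulting string can be written to a text-mode file, preferably inside a with block so that the file is closed even if serialization fails.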