
7.7. indexer: cpReport()

indexer
# - - -   c p R e p o r t

def cpReport(uniData, entSetNames):
    '''Produce the index by code point.
    '''

First we create the tbody element that will hold all the rows of the table. This report is structured in subsections, one per Unicode block. Hence, we start out by building a list of UniBlock instances for those blocks, sorted in ascending order by starting code point. We also open the output file.

indexer
    #-- 1
    # [ tbody  :=  a new tbody et.Element
    #   cpBlocks  :=  a list of the blocks in uniData as UniBlock
    #       instances, in ascending order by code point
    #   outFile  :=  a new, empty file named BY_CP_FILE ]
    tbody = E.tbody()
    def keyFunction(block):
        return block.start
    cpBlocks = sorted(uniData.genBlocks(), key=keyFunction)
    outFile = open(BY_CP_FILE, 'w')
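
The sorting step can be tried in isolation. What follows is only a minimal sketch: FakeBlock is a hypothetical stand-in for UniBlock, and the only attribute taken from the code above is .start; the .name attribute and the sortBlocks() helper are assumptions made for illustration.

```python
from operator import attrgetter

class FakeBlock(object):
    '''Hypothetical stand-in for UniBlock.

      .start:  first code point of the block (as used by keyFunction)
      .name:   block name (assumed here, for display only)
    '''
    def __init__(self, start, name):
        self.start = start
        self.name = name

def sortBlocks(blocks):
    '''Return a list of the blocks in ascending order by .start.
    '''
    # attrgetter('start') plays the same role as keyFunction above
    return sorted(blocks, key=attrgetter('start'))

unsorted = [FakeBlock(0x0370, 'Greek and Coptic'),
            FakeBlock(0x0000, 'Basic Latin'),
            FakeBlock(0x0080, 'Latin-1 Supplement')]
cpBlocks = sortBlocks(unsorted)
```

Because sorted() accepts any iterable, a generator of blocks can be passed to it directly without first being copied into a list.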

For the code that adds table rows for all the entities that lie in a given block, see Section 7.8, “indexer: blockReport”.

indexer
    #-- 2
    # [ tbody  +:=  rows displaying all entities in cpBlocks
    #       whose set names are in entSetNames ]
    for uniBlock in cpBlocks:
        #-- 2 body
        # [ tbody  +:=  rows displaying entities for characters
        #       in uniBlock whose set names are in entSetNames ]
        blockReport(tbody, uniData, entSetNames, uniBlock)
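
The pattern used here, one shared tbody that each per-block call appends to, can be sketched as follows. This is not the real blockReport(): fakeBlockReport(), the tuple layout of the sample entities, and the (start, end) block pairs are all hypothetical, and the standard library's xml.etree.ElementTree is swapped in for the element builder used in the actual code.

```python
import xml.etree.ElementTree as ET

def fakeBlockReport(tbody, entities, entSetNames, blockStart, blockEnd):
    '''Hypothetical stand-in for blockReport().

      [ tbody  +:=  one row per entity whose code point lies in
            [blockStart, blockEnd] and whose set name is in
            entSetNames, in ascending order by code point ]
    '''
    for codePoint, entName, setName in sorted(entities):
        if blockStart <= codePoint <= blockEnd and setName in entSetNames:
            tr = ET.SubElement(tbody, 'tr')
            ET.SubElement(tr, 'td').text = 'U+%04X' % codePoint
            ET.SubElement(tr, 'td').text = entName

# Illustrative sample data: (code point, entity name, set name)
entities = [(0x00E9, 'eacute', 'isolat1'),
            (0x03B1, 'alpha', 'isogrk3'),
            (0x0026, 'amp', 'isonum')]
# Illustrative blocks as (start, end) pairs, already sorted
blocks = [(0x0000, 0x007F), (0x0080, 0x00FF), (0x0370, 0x03FF)]

tbody = ET.Element('tbody')
for start, end in blocks:
    fakeBlockReport(tbody, entities, set(['isolat1', 'isonum']), start, end)

# Collect the cell text of every row for inspection
rows = [[td.text for td in tr] for tr in tbody]
```

Since the blocks are visited in ascending order and each call appends rows only for its own block, the rows come out in code point order overall; the entity from the unselected set contributes no row.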

Finally, serialize the generated XML to the output file.

indexer
    #-- 3
    # [ outFile  +:=  tbody rendered as XML ]
    outFile.write(et.tostring(tbody, pretty_print=True))
    outFile.close()
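
As an aside, the write-and-close step can also be expressed with a with statement, so the file is closed even if serialization or writing raises an exception. The sketch below assumes Python 3 and substitutes the standard library's xml.etree.ElementTree for the library used above, so it does not pretty-print; writeTbody() and the file name by-cp.xml are hypothetical.

```python
import os
import tempfile
import xml.etree.ElementTree as ET

def writeTbody(tbody, fileName):
    '''Hypothetical helper.

      [ file fileName  :=  tbody rendered as XML ]
    '''
    # encoding='unicode' makes tostring() return text, not bytes
    xmlText = ET.tostring(tbody, encoding='unicode')
    with open(fileName, 'w') as outFile:
        outFile.write(xmlText)

# Build a one-row table body to serialize
tbody = ET.Element('tbody')
tr = ET.SubElement(tbody, 'tr')
ET.SubElement(tr, 'td').text = 'U+0026'

# Write to a scratch directory, then read the result back
outPath = os.path.join(tempfile.mkdtemp(), 'by-cp.xml')
writeTbody(tbody, outPath)
with open(outPath) as inFile:
    written = inFile.read()
```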