7. addTextMixed(): Generating mixed content with lxml

The lxml package, for all its virtues, does complicate life in one way for programs that generate mixed content, that is, XML content that is a mixture of child elements and ordinary text.

Specifically, every Element instance has a .text attribute that contains any text between its start tag and its first child element, and a .tail attribute that contains any text between its closing tag and the next sibling element (or the parent's closing tag). Either attribute is None when there is no such text.
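As a small illustration of where the pieces of a mixed-content paragraph end up, here is a sketch. The sample XML string and the variable names para and bold are made up for this example, and the fallback import is only there so that the sketch also runs where lxml is not installed; the standard library's ElementTree uses the same .text/.tail model.

```python
try:
    from lxml import etree as et
except ImportError:
    # Assumption: stdlib ElementTree as a stand-in when lxml is absent.
    import xml.etree.ElementTree as et

# Hypothetical sample: text, then a child element, then more text.
para = et.fromstring("<p>Press <b>Enter</b> to continue.</p>")
bold = para[0]

print(repr(para.text))   # 'Press '         (before the first child)
print(repr(bold.text))   # 'Enter'          (inside the child)
print(repr(bold.tail))   # ' to continue.'  (after the child's closing tag)
print(repr(para.tail))   # None             (nothing follows the root)
```

Note that the text " to continue." belongs to the b element's .tail, not to the paragraph itself.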

This function takes care of appending text content inside a given parent element. If parent has no child elements, the new text is added to parent.text, appended to the old value if there is one. If parent does have child elements, the new text is added instead to the .tail attribute of the last child element, again appended to any existing value.

tccpage2.py
# - - -   a d d T e x t M i x e d

def addTextMixed ( parent, s ):
    """Add text s inside a parent element.

      [ (parent is an et.Element instance) and
        (s is a string) ->
          if  parent has no element children ->
            parent.text  +:=  s
          else ->
            (last element child of parent).tail  +:=  s ]
    """

Regardless of where we are putting the text s, we have to be sure not to overwrite any existing value there. Because .text and .tail are None when there is no text, we use the Python “or” operator to produce either the old value or an empty string, and append s to that before storing it. Note also that the len() of an et.Element is the number of its element children.
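The two idioms just mentioned can be checked in isolation. This is only a sketch: the names empty, textOnly, twoKids, appended and counts are hypothetical, and the fallback import is an assumption made so the lines run without lxml.

```python
try:
    from lxml import etree as et
except ImportError:
    # Assumption: stdlib ElementTree as a stand-in when lxml is absent.
    import xml.etree.ElementTree as et

# A fresh element has no text, so .text is None rather than "".
empty = et.Element("p")

# "or" substitutes "" for None, so the concatenation is always safe.
appended = (empty.text or "") + "abc"

# len() counts child elements only; text does not contribute.
textOnly = et.fromstring("<p>just text</p>")
twoKids = et.fromstring("<p>a<b/>c<i/>e</p>")
counts = (len(empty), len(textOnly), len(twoKids))

print(appended)   # abc
print(counts)     # (0, 0, 2)
```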

tccpage2.py
    #-- 1 --
    if  len(parent) == 0:
        #-- 1.1 --
        # [ if bool(parent.text) ->
        #     parent.text  +:=  s
        #   else ->
        #     parent.text  :=  s ]
        parent.text  =  (parent.text or "") + s
    else:
        #-- 1.2 --
        # [ let
        #     youngest == (last element child of parent)
        #   in
        #     if  bool(youngest.tail) ->
        #       youngest.tail  +:=  s
        #     else ->
        #       youngest.tail  :=  s ]
        youngest  =  parent[-1]
        youngest.tail  =  (youngest.tail or "") + s
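To see the function at work, here is a short usage sketch. It repeats the body of addTextMixed() in condensed form so that it runs by itself; the element names p and b, the sample strings, and the fallback import are assumptions made for the illustration.

```python
try:
    from lxml import etree as et
except ImportError:
    # Assumption: stdlib ElementTree as a stand-in when lxml is absent.
    import xml.etree.ElementTree as et

def addTextMixed(parent, s):
    """Condensed copy of the function above, for a standalone demo."""
    if len(parent) == 0:
        parent.text = (parent.text or "") + s
    else:
        youngest = parent[-1]
        youngest.tail = (youngest.tail or "") + s

para = et.Element("p")
addTextMixed(para, "Press ")          # no children yet: goes to para.text
addTextMixed(para, "the ")            # appended to the existing para.text
key = et.SubElement(para, "b")
key.text = "Enter"
addTextMixed(para, " key")            # a child exists: goes to key.tail
addTextMixed(para, " to continue.")   # appended to the existing key.tail

result = et.tostring(para, encoding="unicode")
print(result)   # <p>Press the <b>Enter</b> key to continue.</p>
```

The caller never has to decide between .text and .tail; each call simply appends at the current end of the parent's content.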