13. Sox._nameCheck(): Is this a valid xml-name?

sox.py
# - - -   S o x . _ n a m e C h e c k

    def _nameCheck(self, name):
        '''Is (name) a valid XML name?

          [ name is unicode or a UTF-8 encoded str ->
              if name is a valid xml-name ->
                return name as unicode
              else -> raise SoxError ]
        '''

First we coerce the name to Unicode; see Section 14, “Sox._unicodify(): Force to Unicode”. For the definition of a valid XML name, see Section 8.5, “xml-name”: the name must start with one NameStartChar character, followed by zero or more NameChar characters.

sox.py
        #-- 1
        # [ if name is unicode-okay ->
        #     uName  :=  name as unicode
        #   else -> raise SoxError ]
        uName = self._unicodify(name)

        #-- 2
        # [ if uName is empty ->
        #     raise SoxError
        #   else ->
        #     start  :=  uName[0]
        #     rest  :=  uName[1:] ]
        if len(uName) == 0:
            raise SoxError("The empty string is not a valid "
                "XML name.")
        else:
            start = uName[0]
            rest = uName[1:]

For the tables of Unicode code point ranges that define the XML NameStartChar and NameChar categories, see Section 7.2, “NAME_START_RANGES” and Section 7.3, “NAME_CHAR_RANGES”. For the logic that checks whether a character is within one of those ranges, see Section 15, “Sox._checkCharRange(): Is this code point in a given set of intervals?”.
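
As a rough illustration of the interval test, here is a minimal Python 3 sketch. It is not the Sox._checkCharRange() code from Section 15; the function name in_ranges and the table DEMO_START_RANGES are hypothetical, and the table holds only the ASCII part of the XML NameStartChar category (":", "A" to "Z", "_", "a" to "z").

```python
# Hypothetical ASCII-only subset of a NameStartChar table.
# Each entry is a closed interval (low, high) of code points.
DEMO_START_RANGES = [
    (0x3A, 0x3A),   # ":"
    (0x41, 0x5A),   # "A" through "Z"
    (0x5F, 0x5F),   # "_"
    (0x61, 0x7A),   # "a" through "z"
]


def in_ranges(ch, ranges):
    """Return True if ord(ch) lies inside any closed interval in ranges."""
    code = ord(ch)
    return any(low <= code <= high for low, high in ranges)
```

With this table, in_ranges("x", DEMO_START_RANGES) is true and in_ranges("7", DEMO_START_RANGES) is false, because a digit cannot start an XML name.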

sox.py
        #-- 3
        # [ if ord(start) is within any of the closed intervals in
        #   NAME_START_RANGES ->
        #     I
        #   else -> raise SoxError ]
        if not self._checkCharRange(start, NAME_START_RANGES):
            raise SoxError("The first character of %s is not "
                "a valid XML NameStartChar." % uName)

        #-- 4
        # [ if the ordinal of any character in (rest) is in none of
        #   the closed intervals in NAME_START_RANGES or
        #   NAME_CHAR_RANGES ->
        #     raise SoxError
        #   else -> I ]
        for k, c in enumerate(rest):
            if ((not self._checkCharRange(c, NAME_START_RANGES)) and
                (not self._checkCharRange(c, NAME_CHAR_RANGES))):
                raise SoxError("The character in position %s of "
                    "%s is not a valid XML NameChar." % (k+1, uName))

        #-- 5
        return uName
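
The five steps above can be tried out in isolation with the following Python 3 sketch. It is an approximation under stated assumptions, not the sox.py code: demo_name_check, DemoNameError, and the two DEMO_ tables are hypothetical names, and the tables cover only the ASCII portion of the XML name categories rather than the full tables of Sections 7.2 and 7.3.

```python
# Hypothetical ASCII-only subsets of the XML name tables.
DEMO_NAME_START = [(0x3A, 0x3A), (0x41, 0x5A), (0x5F, 0x5F), (0x61, 0x7A)]
DEMO_NAME_EXTRA = [(0x2D, 0x2E), (0x30, 0x39)]   # "-", ".", and digits


class DemoNameError(ValueError):
    """Hypothetical stand-in for SoxError."""


def _in_ranges(ch, ranges):
    """True if ord(ch) is inside any closed interval in ranges."""
    code = ord(ch)
    return any(low <= code <= high for low, high in ranges)


def demo_name_check(name):
    """Return name as text if it is a valid (ASCII-subset) XML name."""
    # Step 1: coerce UTF-8 bytes to text.
    if isinstance(name, bytes):
        try:
            name = name.decode("utf-8")
        except UnicodeDecodeError as exc:
            raise DemoNameError("Name is not valid UTF-8.") from exc

    # Step 2: the empty string is never a valid name.
    if not name:
        raise DemoNameError("The empty string is not a valid XML name.")

    # Step 3: the first character must be a name-start character.
    if not _in_ranges(name[0], DEMO_NAME_START):
        raise DemoNameError("Bad first character in %r." % name)

    # Step 4: later characters may be name-start or extra name characters.
    for pos, ch in enumerate(name[1:], start=1):
        if not (_in_ranges(ch, DEMO_NAME_START) or
                _in_ranges(ch, DEMO_NAME_EXTRA)):
            raise DemoNameError(
                "Bad character at position %d in %r." % (pos, name))

    # Step 5: hand back the text form of the name.
    return name
```

Under these assumptions, demo_name_check("svg:rect") and demo_name_check(b"a-b.c") return the name as text, while "", "1abc", and "a b" each raise DemoNameError.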