30. validateEng(): Validate an English name

nomcompile3
# - - -   v a l i d a t e E n g

def validateEng ( scan, eng ):
    '''Check an English bird name for validity.

      [ (scan is a Scan instance) and
        (eng is a string) ->
          if eng is a valid English bird name ->
            return None
          else ->
            Log()  +:=  error message
            raise SyntaxError ]
    '''

The specific rules for valid English names are discussed in a section of the specification. None of them applies to one-word names, so once we have checked that the name is not empty, we return immediately for any name with no embedded whitespace. For multi-word names, we count the occurrences of each special character: comma, period, left and right parenthesis, double quote, and underbar.

nomcompile3
    #-- 1 --
    # [ if eng is empty ->
    #     Log()  +:=  error message(s)
    #     raise SyntaxError
    #   else if eng contains no embedded whitespace ->
    #     return
    #   else ->
    #     nCommas  :=  number of ',' in eng
    #     nDots  :=  number of '.' in eng
    #     nLefts  :=  number of '(' in eng
    #     nRights  :=  number of ')' in eng
    #     nQuotes  :=  number of '"' in eng
    #     nUnders  :=  number of '_' in eng ]
    if len(eng.strip()) == 0:
        scan.syntax("Expecting an English name field.")
    else:
        fieldList = eng.split()
        if len(fieldList) < 2:
            return
        (nCommas, nDots, nLefts, nRights, nQuotes, nUnders
        ) = [ eng.count(what)
              for what in (',', '.', '(', ')', '"', '_') ]
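The counting step can be tried in isolation. The following is a minimal sketch, not part of nomcompile3: the helper name `count_specials` and the sample name are made up for illustration, and the sample is not taken from the specification.

```python
def count_specials(eng):
    """Return a dict mapping each special character to its count in eng.

    Hypothetical helper: the real code unpacks the counts into six
    separate variables instead of building a dict.
    """
    specials = (',', '.', '(', ')', '"', '_')
    return {ch: eng.count(ch) for ch in specials}


# Made-up sample name, used only to exercise the counting.
counts = count_specials('Junco, Dark-eyed (Oregon)')
print(counts)
# {',': 1, '.': 0, '(': 1, ')': 1, '"': 0, '_': 0}
```

Because `str.count()` counts non-overlapping occurrences of a one-character string, each count is simply the number of times that character appears.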

Check that left and right parentheses occur in equal numbers (we don't check for proper nesting here), and make sure that double quotes and underbars each occur an even number of times, so that they can be paired.

nomcompile3
    #-- 2 --
    if nLefts != nRights:
        scan.syntax("Unbalanced parentheses: %d lefts, %d rights." %
                    (nLefts, nRights))

    #-- 3 --
    if (nQuotes % 2) != 0:
        scan.syntax("Unbalanced quotes.")

    #-- 4 --
    if (nUnders % 2) != 0:
        scan.syntax("Unbalanced '_' markup.")
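To show what count-based checks do and do not catch, here is a small sketch. It is an illustration only: the function `count_problems` is hypothetical, and unlike the real code, which reports through `scan.syntax()`, it collects all the messages in a list so that the result is easy to inspect. The sample names are made up.

```python
def count_problems(eng):
    """Return the messages produced by the count-based balance checks.

    Hypothetical helper for illustration; it gathers every problem
    instead of stopping at the first one.
    """
    problems = []
    n_lefts, n_rights = eng.count('('), eng.count(')')
    if n_lefts != n_rights:
        problems.append("Unbalanced parentheses: %d lefts, %d rights."
                        % (n_lefts, n_rights))
    if eng.count('"') % 2 != 0:
        problems.append("Unbalanced quotes.")
    if eng.count('_') % 2 != 0:
        problems.append("Unbalanced '_' markup.")
    return problems


# A missing right parenthesis is caught by the counts.
print(count_problems('Junco, Dark-eyed (Oregon'))
# Reversed parentheses have equal counts, so they pass: nesting
# is not checked at this stage.
print(count_problems('Junco, Dark-eyed )Oregon('))
```

The second call returns an empty list, which is the consequence of comparing counts only rather than tracking nesting.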

The final rule is that multi-word names must have at least one of the special characters. (We eliminated the single-word names back in step 1.)

nomcompile3
    #-- 5 --
    if (nLefts == 0 and nQuotes == 0 and nUnders == 0 and
            nDots == 0 and nCommas == 0):
        scan.syntax("Multi-word names must contain a comma, "
            "period, parentheses, quotes, or underbars.")
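The five steps can be exercised end to end with a stand-in for the `Scan` object. The sketch below rests on an assumption: that `scan.syntax()` logs the message and then raises `SyntaxError`, which is what the intended function in the docstring implies. `StubScan`, `validate_eng_sketch`, and the sample names are hypothetical and exist only for this illustration.

```python
class StubScan:
    """Hypothetical stand-in for Scan: record the message, then raise."""

    def __init__(self):
        self.messages = []

    def syntax(self, message):
        # Assumed behavior of the real Scan.syntax(): log, then raise.
        self.messages.append(message)
        raise SyntaxError(message)


def validate_eng_sketch(scan, eng):
    """Condensed restatement of steps 1 through 5, for experimentation."""
    # Step 1: reject empty names; accept one-word names immediately.
    if len(eng.strip()) == 0:
        scan.syntax("Expecting an English name field.")
    if len(eng.split()) < 2:
        return None
    n_commas, n_dots, n_lefts, n_rights, n_quotes, n_unders = [
        eng.count(what) for what in (',', '.', '(', ')', '"', '_')]
    # Steps 2-4: parentheses must match in number; quotes and
    # underbars must each occur an even number of times.
    if n_lefts != n_rights:
        scan.syntax("Unbalanced parentheses: %d lefts, %d rights."
                    % (n_lefts, n_rights))
    if n_quotes % 2 != 0:
        scan.syntax("Unbalanced quotes.")
    if n_unders % 2 != 0:
        scan.syntax("Unbalanced '_' markup.")
    # Step 5: a multi-word name needs at least one special character.
    if n_commas + n_dots + n_lefts + n_quotes + n_unders == 0:
        scan.syntax("Multi-word name has no special characters.")
    return None


def is_valid(eng):
    """Return True if the sketch accepts eng, False if it raises."""
    try:
        validate_eng_sketch(StubScan(), eng)
        return True
    except SyntaxError:
        return False


# Made-up sample names.
for name in ('Killdeer', 'Junco, Dark-eyed', 'Yellow Warbler', ''):
    print(repr(name), is_valid(name))
```

Under these assumptions the one-word name and the name containing a comma are accepted, while the plain two-word name and the empty string are rejected.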