
51.13. asciifyString(): Encode non-ASCII characters

This function replaces non-ASCII characters in a string with their URL-encoded equivalents. For an explanation of why we need to do this, see Section 51.12, “cleanURL(): Process the raw URL”.

pageget.py
# - - -   a s c i i f y S t r i n g   - - -

def asciifyString ( s ):
    """Like urllib.quote(), but only quotes non-ASCII characters.

      [ s is a string ->
          return s with all characters >= 0x80 escaped using
          URL encoding ]
    """

URL encoding replaces each escaped character with the string "%XX", where XX is the character's two-digit hexadecimal code. For the function that does the escaping, see Section 51.14, “asciifyChar(): Escape a non-ASCII character”.
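To make the "%XX" form concrete, here is a minimal sketch of the formatting step alone. The helper name percent_escape is hypothetical and is not part of pageget.py; it only illustrates how one byte value maps to its two-digit hex escape.

```python
def percent_escape(byte_value):
    """Return the "%XX" form of one byte value in the range 0..255.

    Hypothetical helper for illustration; not part of pageget.py.
    """
    # {:02X} formats the value as two uppercase hex digits,
    # padding with a leading zero when needed.
    return "%{:02X}".format(byte_value)


print(percent_escape(0xE9))   # %E9
print(percent_escape(0x0A))   # %0A
```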

pageget.py
    #-- 1 --
    # [ cleaned  :=  a list of the characters from s, with those
    #                >= 0x80 replaced by the URL-encoded equivalent ]
    cleaned  =  [ asciifyChar(c) for c in s ]

    #-- 2 --
    return "".join ( cleaned )
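The overall behavior can be tried out with the self-contained sketch below. It is an approximation under stated assumptions, not the pageget.py code: it is written for Python 3, where strings hold Unicode code points rather than bytes, so the stand-in asciify_char() (a hypothetical substitute for asciifyChar()) escapes each byte of the character's UTF-8 encoding. The choice of UTF-8 is an assumption made for this example.

```python
def asciify_char(c):
    """Hypothetical stand-in for asciifyChar(): escape one character.

    Assumption: non-ASCII characters are escaped as the "%XX" form
    of each byte of their UTF-8 encoding.
    """
    if ord(c) < 0x80:
        # ASCII characters pass through unchanged.
        return c
    return "".join("%{:02X}".format(b) for b in c.encode("utf-8"))


def asciify_string(s):
    """Return s with every character >= 0x80 URL-encoded."""
    return "".join(asciify_char(c) for c in s)


print(asciify_string("caf\u00e9/menu?x=1"))   # caf%C3%A9/menu?x=1
```

Note that ASCII characters such as "/", "?", and "=" are left alone, which is the difference from a general-purpose quoting function that the docstring describes.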