Statistics on sgmltag.py
------------------------
Unverified.
Program size: 141/522 code lines, 27.0%.
================================================================
Syntax errors
----------------------------------------------------------------
S1
----------------------------------------------------------------
================================================================
Bugs that would be caught in a more strongly-typed language
----------------------------------------------------------------
T1  In SGMLTag.__init__(), this line:
        if self.attrList:
    should be:
        if attrList:
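
The failure mode in T1 can be reproduced in isolation. The sketch below uses a hypothetical class, TagSketch, not the real SGMLTag; it assumes only that the constructor receives attrList as an argument and tests it before any self.attrList attribute has been assigned. The `buggy` flag and the try_build() helper are illustrative additions.

```python
class TagSketch:
    """Hypothetical stand-in for SGMLTag; only attrList handling is modeled."""

    def __init__(self, gi, attrList=None, buggy=False):
        self.gi = gi
        if buggy:
            # Bug T1: self.attrList has not been assigned yet, so this
            # attribute lookup raises AttributeError at run time.
            source = self.attrList
        else:
            # Fix: test the constructor argument, not the instance attribute.
            source = attrList
        if source:
            self.attrList = list(source)
        else:
            self.attrList = []


def try_build(buggy):
    """Return the name of the exception raised, or None on success."""
    try:
        TagSketch("a", [("href", "x.html")], buggy=buggy)
    except AttributeError as exc:
        return type(exc).__name__
    return None
```

Python reports the mistake only when the constructor actually runs, which is why the log files it under bugs that a more strongly-typed language would catch.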
----------------------------------------------------------------
================================================================
Logic bugs
----------------------------------------------------------------
B1 SGMLTag.__str__(): Failed to produce opening "<"
----------------------------------------------------------------
B2  Tag_Attr_Scan_Value(): Wrong rule for unquoted strings.  It
    looks for a string starting with a character in giStartCset,
    but that set includes only letters.  Loose HTML allows numeric
    values to be unquoted as well.  So these lines:
        pastFirst = scan.any ( giStartCset )
        if pastFirst: ...
    should be:
        pastFirst = scan.any ( giCset )
        if pastFirst is not None: ...
    Also, there is no `else' case for this test:
        pastOpen = scan.match ( '"' )
        if pastOpen:
    There should be an else case that issues an error if the
    value starts with neither an alphanumeric character nor a
    double-quote.
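
The effect of the character-set change in B2 can be sketched as follows. The set contents and the any_at() helper are assumptions made for illustration: the real giStartCset and giCset are defined in sgmltag.py and may differ, and any_at() only imitates what scan.any() appears to do from the lines quoted above.

```python
import string

# Assumed definitions; the real sets live in sgmltag.py.
GI_START_CSET = set(string.ascii_letters)
GI_CSET = GI_START_CSET | set(string.digits) | set("-.")


def any_at(text, pos, cset):
    """Hypothetical analogue of scan.any(): return the position just past
    text[pos] if that character is in cset, else None."""
    if pos < len(text) and text[pos] in cset:
        return pos + 1
    return None


def accepts_unquoted(value, cset):
    """True if the first character of value may start an unquoted value."""
    # Compare against None explicitly rather than relying on truthiness.
    return any_at(value, 0, cset) is not None
```

Under these assumed sets, a value such as 100 is rejected by the letters-only set and accepted by the wider one, while a value that begins with a double-quote is rejected by both and must be handled by the quoted-value branch.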
----------------------------------------------------------------
B3  In SGML_Tag_Scan_Comment(), this line crashes because `gi'
    is undefined:
        return SGMLTag ( gi, 0, None, string.join ( L, "" ) )
    It should be:
        return SGMLTag ( SGML_COMMENT_GI, 0, None, string.join ( L, "" ) )
----------------------------------------------------------------
B4  In SGMLTag.__str__(), the initial "--" is missing from the
    reconstituted string.  This is because the .gi member is
    only "!", not "!--".  Change:
        if self.gi == SGML_COMMENT_GI:
            return "<%s%s%s" % ( self.gi, self.text, SGML_COMMENT_TAIL )
    to:
        if self.gi == SGML_COMMENT_GI:
            return "<%s%s%s" % ( SGML_COMMENT_HEAD,
                                 self.text, SGML_COMMENT_TAIL )
----------------------------------------------------------------
B5  In SGML_Tag_Scan_Comment(), the scanning logic does not move
    past the closing "-->", which causes it to be treated as
    ordinary text.  With this bug and B4, a comment of the form:
        <!-- ... -->
    will be reconstituted as:
        <! ... -->-->
    Change prime 2.2 from:
        #-- 2.2 --
        # [ if endPos is None ->
        #     scan := scan advanced to the start of the next line
        #     L +:= (remainder of the current line) + "\n"
        #   else ->
        #     scan := scan advanced up to endPos
        #     L +:= text from scan up to endPos
        #     done := 1 ]
        if endPos is not None:
            L.append ( scan.tab ( endPos ) )
            done = 1
        else:
            ...
    to:
        #-- 2.2 --
        # [ if endPos is None ->
        #     scan := scan advanced to the start of the next line
        #     L +:= (remainder of the current line) + "\n"
        #   else ->
        #     scan := scan advanced up to endPos, plus the length
        #             of SGML_COMMENT_TAIL
        #     L +:= text from scan up to endPos
        #     done := 1 ]
        if endPos is not None:
            L.append ( scan.tab ( endPos ) )
            scan.move ( len ( SGML_COMMENT_TAIL ) )
            done = 1
        else:
            ...
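
The combined effect of B4 and B5 can be checked with a small round-trip sketch. It is a simplification and not the real scanner: it works on a single string rather than line by line, the functions scan_comment() and reconstitute() are hypothetical, and the values of SGML_COMMENT_HEAD and SGML_COMMENT_TAIL are inferred from the text of B4 and B5 rather than taken from sgmltag.py.

```python
SGML_COMMENT_HEAD = "!--"   # assumed value, inferred from B4
SGML_COMMENT_TAIL = "-->"   # assumed value, inferred from B5


def scan_comment(text, pos, skip_tail=True):
    """Scan a comment body starting at pos (just past "<!--").

    Returns (body, new_pos).  Raises ValueError if no terminator is found.
    With skip_tail=False the function reproduces bug B5: new_pos is left
    pointing at the terminator instead of past it.
    """
    end_pos = text.find(SGML_COMMENT_TAIL, pos)
    if end_pos < 0:
        raise ValueError("unterminated comment")
    body = text[pos:end_pos]
    new_pos = end_pos
    if skip_tail:
        # The B5 fix: also step over the terminator itself.
        new_pos += len(SGML_COMMENT_TAIL)
    return body, new_pos


def reconstitute(body):
    """Rebuild the comment the way the corrected __str__() in B4 does."""
    return "<%s%s%s" % (SGML_COMMENT_HEAD, body, SGML_COMMENT_TAIL)
```

With skip_tail=False, the text remaining after the scan still begins with the terminator, which is how the stray "-->" ends up being treated as ordinary text.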
----------------------------------------------------------------
B6  In Tag_Attr_Scan_Value(), if the value starts with a character
    that is not a double-quote and not in giCset, the error
    message is issued correctly ("Tag attribute values must be
    alphanumeric or enclosed in double-quotes"), but then it
    falls through to prime 2 and then prime 3, where this line
        return ( value, isQuoted )
    fails because `value' has never been set.  The fix is to
    add the return statement in this else clause:
        else:
            scan.error ( "Tag attribute values must be alphanumeric "
                         "or enclosed in double-quotes ('\"')." )
            return (None, 0)
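
The control flow that B6 asks for, with every branch either producing a value or returning early, might look like the sketch below. It is a hypothetical, simplified analogue of Tag_Attr_Scan_Value(): it takes a plain string and an error list instead of a scan object, the contents of GI_CSET are assumed, and the unterminated-quote check is an addition that the log does not mention.

```python
import string

# Assumed character set; the real giCset is defined in sgmltag.py.
GI_CSET = set(string.ascii_letters + string.digits + "-.")


def scan_attr_value(text, errors):
    """Return (value, isQuoted), or (None, 0) after recording an error."""
    if text.startswith('"'):
        end = text.find('"', 1)
        if end < 0:
            # Hypothetical extra check, not part of the original log.
            errors.append("Unterminated quoted value.")
            return (None, 0)
        return (text[1:end], 1)
    elif text and text[0] in GI_CSET:
        n = 0
        while n < len(text) and text[n] in GI_CSET:
            n += 1
        return (text[:n], 0)
    else:
        errors.append("Tag attribute values must be alphanumeric "
                      "or enclosed in double-quotes ('\"').")
        # Returning here keeps control from reaching code that would
        # use a value that was never assigned.
        return (None, 0)
```

Returning (None, 0) keeps the shape of the result the same on every path, so a caller can unpack it unconditionally and test the value for None.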
----------------------------------------------------------------