This project turned out to be a lot more work than the author anticipated, even given the challenges of dealing with the enormous feature set of XSL-FO. The author cannot say for sure whether the extra effort was due to problems with the local FO-to-PDF toolchain or to his imperfect understanding of XSL-FO, but two major features that were supposed to be automatic had to be handled by custom code.
Originally the wrapping of long lines was going to be handled automatically using a “hanging indentation”, implemented as a block element with a positive start-indent and a negative text-indent. The author expected that a gray color in an enclosing block-container would be visible only in the indentation preceding the overflow lines.
However, this ran afoul of the requirement that overflow lines be preceded by a gray indentation. The “outdented” portion of the first line also displayed over the gray color. This is not an acceptable rendering:
The --break option seemed to the author to
translate directly to XSL-FO's
keep-with-previous property groups. However, in
practice, the toolchain did not express this property in a
Consequently, this program handles the breaking of overly long lines, and the implementation of the --break option, entirely with custom logic.
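As a rough illustration of the line-cutting part of such custom logic, the following Python sketch splits one line into pieces that fit a column. The function name split_line, its parameters, and the choice to make overflow pieces narrower by a fixed indent are assumptions made for this example; they are not taken from the program itself.

```python
def split_line(text, width, indent=2):
    """Cut text into pieces that fit a column `width` characters wide.

    The first piece may use the full width. Each overflow piece is
    `indent` characters narrower, leaving room for the indentation
    that precedes it. (Both parameters are hypothetical.)
    """
    if indent < 0 or width <= indent:
        raise ValueError("need indent >= 0 and width > indent")
    pieces = [text[:width]]          # first piece uses the full width
    rest = text[width:]
    overflow_width = width - indent  # room left after the indentation
    while rest:
        pieces.append(rest[:overflow_width])
        rest = rest[overflow_width:]
    return pieces
```

An empty line yields a single empty piece, so every input line still produces at least one output line.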
This custom logic requires that we compute the exact height and width of each column so we can cut the input lines into pieces that fit. In order to solve this problem with full generality, we would need to know what font will be used, and read the font metrics files for that font. That is a nontrivial problem.
However, so long as this application is used only at the TCC, where the FO processor (xep) uses only a single monospaced font (Vera Sans Mono), we can rely on the metrics of that font to predict character width. Experimentation shows that this font has a consistent 5:3 ratio of character height to width.
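The arithmetic implied by the 5:3 ratio can be sketched in Python as follows. This is only an illustration: the function names are hypothetical, all dimensions are assumed to be in points, and the character height is assumed to equal the nominal font size, which the text does not state.

```python
from fractions import Fraction

# Height-to-width ratio reported for the monospaced font.
HEIGHT_TO_WIDTH = Fraction(5, 3)


def char_width(font_size):
    """Width of one character in points (assumes height == font size)."""
    return Fraction(font_size) / HEIGHT_TO_WIDTH


def chars_per_line(column_width, font_size):
    """Number of whole characters that fit across a column."""
    return int(Fraction(column_width) // char_width(font_size))


def lines_per_column(column_height, font_size, leading=0):
    """Number of whole lines that fit down a column.

    `leading` is extra space after each line, in points.
    """
    line_height = Fraction(font_size) + Fraction(leading)
    return int(Fraction(column_height) // line_height)
```

For example, under these assumptions a 9-point font has characters 5.4 points wide, so a column 270 points wide holds 50 characters. Exact fractions are used so that a column whose width is an exact multiple of the character width is not shortened by floating-point rounding.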
To force verbatim treatment of spaces, the block will need two attributes:
There is one other subtle problem: what to do about unprintable characters, those not in the range from ASCII SP (space) to “~”. Some of them will affect the display:
Tabs (ASCII HT) will be expanded according to the interval specified on the command line, or its default interval.
If a form feed (ASCII FF) occurs at the beginning of a line, and if it is the effective break string, it will be removed before the line is displayed. In any other case it will be treated as unprintable.
Carriage return (ASCII CR) may appear as part of the line terminator, especially if the file came from the Microsoft world. We will ignore any such character, and it will not be displayed.
Linefeed (ASCII LF) is the line terminator character and will not be displayed.
For other unprintables, the report will display the character's hexadecimal code using two tiny characters in the normal character space. In practice, the tiny font will be half the current font size (rounded down, so in a 9-point font the tiny font will be 4-point).
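The rules above might be sketched in Python as shown below. The names display_cells and tiny_font_size, the default tab interval of 8, and the representation of the result as (kind, text) pairs are all assumptions made for this illustration. The sketch also assumes the input line is a byte string, so every unprintable yields exactly two hexadecimal digits, and it takes one possible reading of the rules by discarding every CR and LF wherever it occurs.

```python
def tiny_font_size(font_size):
    """Half the current font size, rounded down."""
    return font_size // 2


def display_cells(raw, tab_interval=8, break_string=None):
    """Turn one raw input line (bytes) into a list of display cells.

    Each cell is ("char", c) for a printable character, or
    ("hex", "XX") for an unprintable shown as two tiny hex digits.
    """
    # CR and LF are line terminators and are never displayed.
    line = raw.replace(b"\r", b"").replace(b"\n", b"")
    # A leading form feed is removed only when it is the break string.
    if break_string == b"\f" and line.startswith(b"\f"):
        line = line[1:]
    # Expand tabs to the next multiple of the tab interval.
    line = line.expandtabs(tab_interval)
    cells = []
    for code in line:
        if 0x20 <= code <= 0x7E:     # ASCII SP through "~"
            cells.append(("char", chr(code)))
        else:
            cells.append(("hex", format(code, "02X")))
    return cells
```

Because each unprintable occupies one normal character space, counting it as a single column during tab expansion keeps the tab stops aligned with what is displayed.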
Two other options affect the layout of the report body:
The --leading value is implemented as a space-after property on each line block.
The --break option is implemented by
to each line except lines that
start with the break string.