Dynamic Pretty-printing in DomTerm

The DomTerm terminal emulator has a number of unique features. In this article we will explore how it enables dynamic re-flow for Lisp-style pretty-printing.

The goal of pretty-printing is to split a text into lines with appropriate indentation, in a way that conforms to the logical structure of the text.

For example if we need to print the following list:

((alpha-1 alpha-2 alpha-3) (beta-1 beta-2 beta-3 beta-4))

in a window 40 characters wide, we want:

((alpha-1 alpha-2 alpha-3)
 (beta-1 beta-2 beta-3 beta-4))

but not:

((alpha-1 alpha-2 alpha-3) (beta-1
 beta-2 beta-3 beta-4))

Pretty-printing is common in Lisp environments to display complicated nested data structures. Traditionally, it is done by the programming-language runtime, based on a given line width. However, the line width of a console can change if the user resizes the window or changes font size. In that case, previously-emitted pretty-printed lines will quickly become ugly: If the line-width decreases, the breaks will be all wrong and the text hard to read; if the line-width increases we may be using more lines than necessary.

Modern terminal emulators do dumb line-breaking: Splitting a long lines into screen lines, but regardless of structure or even word boundaries. Some emulators remember for each line whether an overflow happened, or whether a hard newline was printed. Some terminal emulators (for example Gnome Terminal) will use this to re-do the splitting when a window is re-sized. However, that does not help with pretty-printer output.

Until now. Below is a screenshot from Kawa running in DomTerm at 80 colutions.

We reduce the window size to 50 columns. The user input (yellow background) is raw text, so its line is split non-pretty, but the output (white background) gets pretty re-splitting. (Note the window size indicator in the lower-right.)

We reduce the window size to 35 columns:

It also works with saved pages

DomTerm allows you to save the current session as a static HTML page. If the needed DomTerm CSS and JavaScript files are provided in the hlib directory, then dynamic line-breaking happens even for saved log files. (The lazy way is to create hlib as a symbolic link to the hlib directory of the DomTerm distribution.)

Try it yourself on a saved session. The --debug-print-expr flag causes Kawa to print out each command before it is compiled and evaluated. The result (shown in red because it is sent to the standard error stream) is pretty-printed dynamically.

Structured text

This is how it works.

When an application pretty-prints a structure, it calls special output procedures to mark which parts of the output logically belong together (a logical block), and where line-breaks and indentation may be inserted. In Kawa the default print formatting for lists and vectors automatically calls these procedures when the output is a pretty-printing stream. The pretty-printing library calculates where to put line-breaks and indentation, based on these commands and the specified line length.

However, when the output stream is a DomTerm terminal, Kawa's pretty-printing library does not actually calculate the line-breaks. Instead it encodes the above-mentioned procedure calls as special escape sequences that get sent to DomTerm.

When DomTerm receives these escape sequences, it builds a nested DOM structure that corresponds to the orginal procedure calls. DomTerm calculates how to layout that structure using a variant of the Common Lisp pretty-printing algorithm, inserting soft left-breaks and indentation as needed.

When DomTerm detects that its window has been re-sized or zoomed, it first removes old soft line-breaks and identation. It does re-runs the layout algorithm.

When a page is saved, the nested DOM structure is written out too. If the saved page is loaded in a browser, and the necessary JavaScript libraries are available, then the pretty-printing algorithm is run on the saved page, both on initial load, and whenever the window is re-sized.

Hyphenation and word-breaking

DomTerm supports general soft (optional) line breaks: You can specify separate texts (optionally with styling) for each of the following: The text used when the line is not broken at that point; the text to use before the line-break, when broken (commonly a hyphen); the text to use after the line-break (following indentation). One use for this words that change their spelling when hyphenated, as may happen in German. For example the word backen becomes bak-ken. You can handle this using ck as the non-break text; k- as the pre-break text; and k as the post-break text.

Per Bothner's blog