exorcising the WIMP

Despite inherent weakness, the WIMP is ironically strong (at least in the brute force of technological/market inertia) and practically unavoidable. Possessed by pernicious paradigms, it can be exorcised to improve fitness.

This document describes remedial modifications which I have come to depend on to make the WIMP environment more useful and bearable (though some of the things described here remain useful in the absence of a GUI).

input welcome: help!

preliminary

While also working with other systems, Macintosh (with essential extensions thanks to many especially thoughtful third-party developers) was my primary environment for more than 20 years until I could no longer tolerate Apple and moved to Linux (where I remain appalled that so much is still missing after so long!).

Although increasingly involved in programming, most of my work with computers remains concentrated around the manipulation of my own data, and I have been slow in implementing new tools. I have long contemplated information systems and their human interfaces, but my curiosity generally withers in studying common theory/practice because it seems unnecessarily complicated to me (like so many forms of human expression…). In any case, my interest is in the greater meaning of information technology and symbolic manipulation (in contrast to the limited scope of computers and computation).

The WIMP is certainly not my ideal but I have yet to either produce or find a better practical alternative, even though I’ve been thinking about it for years. Short of starting fresh, one might build a typographic interface upon Plan 9. Meanwhile, I try to make the best of my situation.

display

resolution

For optimal resolution, I keep my display at a distance just beyond the threshold of pixel visibility; i.e. with enough distance that pixel mosaics become coherent shapes/lines (rather than collections of tiles), yet close enough that subtle differences remain detectable, ready for closer inspection.

zoom/pan

I prefer a global key binding to zoom the entire display. I came to appreciate this capability long ago while using graphics cards with instantaneous zoom/pan implemented in hardware, but newer/faster hardware does it in software (and it is still slower after all these years…).

software

X: XRandR
OS X: Accessibility

Under X, video modes can be configured to display a portion of a larger area with panning, but this is slow and blurry as the display is driven at a resolution less than native (and displays usually force anti-aliasing on such input). I have yet to find a simple software-only solution for X (Compiz, &c. is too much).

fonts

Assuming a capable system, I enable anti-aliasing and disable hinting. Some complain that this results in blurry text but it actually displays the type more accurately. I generally find it more readable and I prefer less contrast anyway.

Under X, these settings can be configured using Fontconfig (in case they are not otherwise accessible, or to override defaults).
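A minimal sketch of such a configuration, assuming the usual per-user file location (~/.config/fontconfig/fonts.conf). The function name is hypothetical; it only prints the file contents so that they can be reviewed before being installed.

```shell
# Print a minimal per-user Fontconfig file:
# anti-aliasing enabled, hinting disabled, for all fonts.
fontconfig_snippet() {
cat <<'EOF'
<?xml version="1.0"?>
<!DOCTYPE fontconfig SYSTEM "fonts.dtd">
<fontconfig>
  <match target="font">
    <edit name="antialias" mode="assign"><bool>true</bool></edit>
    <edit name="hinting" mode="assign"><bool>false</bool></edit>
    <edit name="hintstyle" mode="assign"><const>hintnone</const></edit>
  </match>
</fontconfig>
EOF
}
```

The output can be redirected to the per-user file (e.g. `fontconfig_snippet > ~/.config/fontconfig/fonts.conf`); programs generally pick up the change when they are next started.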

The most legible fonts I see for low-resolution display are those commissioned by Microsoft specifically for that purpose. Ironically, font display has been poor under Windows (due to forced hinting and the absence of sub-pixel positioning) but it is improving.

For general use, I prefer these fonts:

Many Linux distributions have a Microsoft core fonts package, which includes Georgia and Verdana (but for legal reasons it must be installed manually). A script is available to install Vista fonts on Linux (including Candara and Consolas).

The fonts above are freely available and they work very well, but some commercial fonts are worth mentioning:

For purposes which require bitmapped fonts, the (monospaced) Tamsyn family works well.

Further reading:

appearance

Popular WIMP implementations are rife with superfluous and distracting graphics. But graphical interface elements (widgets) need not be conspicuous. They can instead remain below the threshold of distraction yet visible enough to be easily located.

Contrast is usually extreme, especially in text input fields which are often set with a background of maximum intensity (100% white). Light text on a dark background has advantages in principle but problems in practice as software developers often make assumptions about the background colour/intensity. In any case, I naturally prefer dark-on-light for reflective displays (e.g. e-ink) and light-on-dark for transmissive (light-emitting) displays.

Instead of using fixed values, an interface palette (theme) can be defined as a set of related intensity intervals which can be easily adjusted (scaled/shifted) as a unit (preserving proportionality) throughout the full intensity range of the display as need be. (But as obvious as this seems, I’m not aware of software which works this way.)
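The idea can be sketched with a little arithmetic, assuming a palette stored as one intensity per line in the range 0 to 1 (the function name and the representation are hypothetical). Every value is multiplied by the same scale and moved by the same offset, so the intervals between values keep their proportions.

```shell
# Scale and shift a palette of intensities (0..1), one per line on stdin.
# Usage: scale_palette SCALE OFFSET
scale_palette() {
  awk -v scale="$1" -v offset="$2" '{
    v = $1 * scale + offset
    if (v < 0) v = 0    # clamp to the displayable range
    if (v > 1) v = 1
    printf "%.2f\n", v
  }'
}
```

For example, `printf '0.20\n0.30\n0.50\n' | scale_palette 0.5 0.1` compresses the palette to half its range and lifts it slightly. Proportionality holds only while no value is clamped.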

Adjusting the appearance of different systems: Under X, GTK+ provides a relatively high level of control (and Qt can use GTK+ themes); Windows provides basic but satisfactory control of appearance; in contrast, Apple makes it difficult (i.e. OS X themes are possible but generally cause more trouble than they’re worth). The search terms «flat grey» work well for finding minimal themes.

My preferred theme files (assuming the fonts listed above):

Keyboard mnemonic labels (i.e. the lines under hotkey characters in menus/buttons/&c.) can remain hidden until the modifier key is pressed. For GTK2:

echo 'gtk-auto-mnemonics = 1' >>~/.gtkrc-2.0

For GTK3, the corresponding setting “has been deprecated since version 3.10 [..] setting is ignored”.

Windows includes a similar preference. Mac OS never had them.

windows

window manager

In contrast to the usual stacking/overlapping, tiling can be useful but I prefer window managers which support both modes. In any case, GUI programs tend to fill their windows with contents that are not very adaptable, which greatly hinders the utility of tiling (wasting space as content does not adjust to fill it, and requiring unnecessary scrolling to see what would otherwise be visible).

Tiling window managers generally change the layout every time a window is shown/hidden (disrupting the size/position of windows in order to fill the display and to avoid overlapping). This automatic layout process can be more or less disruptive, especially in respect to the current/active window, and I prefer minimal layout disruption.

software

In contrast to Windows and OS X, many different window managers are available for X (see Arch’s list of window managers).

The most promising tiling window managers I’ve used are bspwm and herbstluftwm. Both require patience to read and configure but work very well.

After using bspwm for a year or so, I intend to stay with tiling, and it’s worth noting that implementations vary greatly, as other tiling window managers had the opposite effect on me (e.g. i3).

focus switching

Beyond the basic set of all windows, various subsets can be useful (grouped by context, workspace, tag, &c.). In contrast to cycling through a list of titles/icons, I prefer to display a spread of actual windows proportionally scaled down (so the contents of each is entirely visible).

Relative placement and proportional scaling (i.e. preserving relative position and size) are essential (at least for non-tiled windows) and increasingly important as the number of windows increases (making it more difficult to distinguish them based on contents alone).
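The scaling step can be sketched as follows, assuming window geometry is available as lines of "id x y width height" (a hypothetical format; the function name is also hypothetical). Multiplying every field by one factor preserves both relative position and relative size.

```shell
# Read "id x y width height" lines; print the same layout scaled by FACTOR,
# so each thumbnail keeps its relative position and proportions.
# Results are truncated to whole pixels.
scale_layout() {
  awk -v f="$1" '{
    printf "%s %d %d %d %d\n", $1, $2 * f, $3 * f, $4 * f, $5 * f
  }'
}
```

This covers only the arithmetic. Overlapping windows would still need to be spread apart before (or after) scaling, which is where relative placement tends to be compromised.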

software

X: Skippy-XD
OS X: Exposé

Under X, Skippy-XD (for NetWM-compliant window managers) has proportional scaling but not relative placement. Exposé (OS X) initially supported relative placement and proportional scaling but has since dropped both.

geometry

Windows can be manipulated by command (and hence keyboard). At the very least, I use key bindings for these basic actions:

Whether stacked or tiled, layout is essential to me and I prefer to define consistent geometry (location/size), or layout templates, into which windows can be placed (and adjusted as need be). In any case, I generally want window geometry to remain unchanged unless I change it.
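A layout template can be as simple as a table of named slots. The sketch below assumes hypothetical slot names and computes geometry from the screen size; it ignores panels and window decorations.

```shell
# Print "x,y,width,height" for a named layout slot.
# Usage: slot_geometry SLOT SCREEN_WIDTH SCREEN_HEIGHT
slot_geometry() {
  case $1 in
    left)   echo "0,0,$(($2 / 2)),$3" ;;
    right)  echo "$(($2 / 2)),0,$(($2 / 2)),$3" ;;
    top)    echo "0,0,$2,$(($3 / 2))" ;;
    bottom) echo "0,$(($3 / 2)),$2,$(($3 / 2))" ;;
    full)   echo "0,0,$2,$3" ;;
    *)      echo "unknown slot: $1" >&2; return 1 ;;
  esac
}
```

One way to apply a slot (an assumption, as wmctrl is not among the tools named here) is `wmctrl -r :ACTIVE: -e "0,$(slot_geometry left 1920 1080)"`, which can then be bound to a key.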

software

X: window-ctrl
OS X: AppleScript

Under X, the window-ctrl script works well with various stacking window managers I’ve used.

filtering

Per-window image filtering can be applied to uncooperative software which remains too difficult to configure otherwise (e.g. web browser displays too much contrast/brightness/&c.). It seems reasonable that this should be among the basic capabilities of a window manager (or compositor); at least: brightness, contrast, inversion.

software

X: ?

Window managers with transparency can reduce contrast by increasing the transparency of the offending window (assuming an appropriate background).
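As a sketch of the transparency approach, assuming a compositor which honours the _NET_WM_WINDOW_OPACITY window property (many do, but not all) and that xprop is installed. The function names are hypothetical.

```shell
# Convert an opacity percentage (0-100) to the 32-bit value
# stored in the _NET_WM_WINDOW_OPACITY property.
opacity_value() {
  echo $(( $1 * 4294967295 / 100 ))
}

# Set the opacity of the window with the given X window id.
# Usage: set_opacity WINDOW_ID PERCENT
set_opacity() {
  xprop -id "$1" -f _NET_WM_WINDOW_OPACITY 32c \
        -set _NET_WM_WINDOW_OPACITY "$(opacity_value "$2")"
}
```

Whether the property must be set on the client window or on the window manager's frame depends on the compositor, so this may need adjusting.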

keyboard control

Ideally, programs can be completely controlled by text interface; i.e. each graphical input (menu item, button, widget, &c.) corresponds to a text command, and each command can be bound to the keyboard (and other triggers).

key binding
(shortcuts/hotkeys)

The capability to globally bind keys to arbitrary commands can be very useful, but it can also be troublesome due to conflict with other software which uses the same keys. Individual programs often implement global key bindings (e.g. music player, clipboard manager, &c.) but I prefer to disable them and instead manage global key bindings with a separate/dedicated tool (which helps avoid conflicts and improves abstraction from target programs).

Another way to avoid conflict is to restrict context (name space). The menu interface described below is one way to do that.

software

X: sxhkd, xbindkeys
Windows: AutoHotKey
OS X: Automator, &c.

Under X, sxhkd and xbindkeys provide global key binding support independently of specific window managers (and desktop environments), which is useful for making one’s configuration more portable/modular.

menu

A global key binding can call a special program which displays a text input field and matches input to an index of files/commands/&c. displaying/updating a menu as the input changes. There are many programs like this which only launch programs or open files, but these neglect the general utility/versatility of this simple interface: a programmable menu driven by keyboard. Such an interface can be extended and applied to many different contexts:

thus bypassing many impediments/irritations of the WIMP.

Such a tool is so useful that I wonder how I endured without it.
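A programmable menu of this kind can be sketched in a few lines, assuming a plain text table of "label<TAB>command" and a chooser program which reads lines on stdin and prints the chosen one (dmenu behaves this way). The function name and the MENU_CHOOSER variable are hypothetical.

```shell
# Show the labels of a "label<TAB>command" table in a chooser and print
# the command belonging to the chosen label.
# MENU_CHOOSER names the chooser program (default: dmenu).
# The body runs in a subshell, so its variables stay local.
menu_lookup() (
  choice=$(cut -f1 "$1" | ${MENU_CHOOSER:-dmenu}) || return 1
  want=$choice awk -F '\t' '$1 == ENVIRON["want"] { print $2; exit }' "$1"
)
```

A key binding might then run something like `sh -c "$(menu_lookup ~/.menu)"` (the file name is only an example). Since the table is plain text, different contexts can have different tables.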

software

X: Albert, dmenu
Windows: FARR
OS X: LaunchBar

LaunchBar (OS X) is the best implementation I’ve seen (not to be confused with the variety of comparatively feeble launchers). FARR (Windows) is close. Albert (Linux) looks promising but it’s not there yet. dmenu (Linux) works well with Unix tools but it provides only very basic menu display/selection.

Dissatisfied with available software under Linux, I developed a system of shell scripts and a menu in Tcl/Tk which has been working well, and I will eventually release it here.

clipboard

Clipboard history (i.e. a running buffer/stack of the system clipboard) is available by extension on most popular systems.

I prefer the following essential behaviour:

Clipboard extensions for OS X (and earlier versions of Mac OS) generally work like that but the common model on Windows and X differs:

When an item is selected from the history stack, it is moved to the top, replacing the system clipboard. Of course this is useful if automatic pasting is not possible, but even software which supports automatic pasting persists in destructive modification by reordering the history and writing to the system clipboard (displacing the last cut/copied item).

If the system clipboard must be modified to perform pasting, it should be immediately restored to its previous state.

key bindings

I prefer that the behaviour of standard cut/copy/paste system key bindings remains unchanged, and augmented with these:

selections

X supports selections and many clipboard managers have options to merge selections into the clipboard history. As copied data and selected data are distinct input streams, I prefer a separate running buffer for each stream (which does not prevent them from being blended downstream).

software

X: Clipman, xclip, xsel
Windows: Ditto
OS X: LaunchBar

X has a lot of clipboard history managers (several for many years) but, surprisingly, none work as one expects. The basic utility of storing clipboard/selection history can be handled by a simple daemon which maintains a plain text file, to provide a common infrastructure for any number of different interfaces.
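The storage side of such a daemon might look like the sketch below. It assumes one entry per line with newlines stored as a literal "\n" (so text which already contains that sequence becomes ambiguous); the function name and the file format are hypothetical.

```shell
# Append ENTRY to the history FILE unless it is empty or repeats the
# most recent entry. Usage: history_push FILE ENTRY
# The body runs in a subshell, so its variables stay local.
history_push() (
  # Join the lines of the entry with a literal "\n".
  entry=$(printf '%s' "$2" | awk '{ printf "%s%s", sep, $0; sep = "\\n" }')
  [ -n "$entry" ] || return 0
  [ "$(tail -n 1 "$1" 2>/dev/null)" = "$entry" ] && return 0
  printf '%s\n' "$entry" >> "$1"
)

# A polling loop is one crude way to feed it (an assumption, not a
# recommendation):
#   while sleep 1; do history_push ~/.clip_history "$(xsel -b -o)"; done
```

Because the history is only appended to, any interface (menu, key binding, script) can read it without reordering it or touching the system clipboard.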

text

navigation/selection

Using the arrow keys for text navigation seems obvious enough but different systems behave differently. I prefer at least the following basic control (similar to NeXTStep & OS X).

up/down
move up/down one line (maintaining horizontal offset if possible), or to beginning/end of text (if current line is first/last line)
left/right
move backward/forward one character
m1-up/down
move to next beginning/end of logical (possibly auto-wrapped) line (i.e. effectively beginning/end of paragraph)
m1-left/right
move backward/forward to next leading/trailing word boundary
m2-up/down
move to beginning/end of text
m2-left/right
move to beginning/end of line

The shift key effectively changes move to select (for the above commands). The backspace/delete keys follow the same pattern as left/right (deleting, instead of moving, in smaller or larger units depending on modifier keys).

mouse

Mouse selection can be easily constrained to a given unit size: character, word, line, paragraph.

single-click
place insertion point under pointer; drag selects by character
double-click
select word under pointer; drag selects by word
triple-click
select line under pointer; drag selects by line
quad-click
select paragraph under pointer; drag selects by paragraph

abbreviation expansion
(text expansion)

To reduce input repetition (and error), frequently entered text can be stored together with associated abbreviations. Ideally, entering such an abbreviation in any text input field causes it to be replaced in situ by its associated text (e.g. btw → by the way). If this kind of automatic/implicit expansion is not feasible, it still remains useful to bind a key which prompts for input of an abbreviation and returns the replacement.

Abbreviation expansion is useful for static text like addresses and other frequently entered text but it can also be used to trigger command output (e.g. insert current date in various formats, &c.).
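The explicit (prompted) form can be sketched as a table lookup, assuming a plain text file of "abbreviation<TAB>replacement" lines. As a hypothetical convention, a replacement beginning with "!" is run as a shell command and its output is used instead.

```shell
# Print the expansion of ABBREV from TABLE; unknown abbreviations are
# printed unchanged. Usage: expand_abbrev TABLE ABBREV
# The body runs in a subshell, so its variables stay local.
expand_abbrev() (
  text=$(key=$2 awk -F '\t' '$1 == ENVIRON["key"] { print $2; exit }' "$1")
  case $text in
    '')   printf '%s\n' "$2" ;;       # not found: return the input
    '!'*) sh -c "${text#?}" ;;        # command entry: print its output
    *)    printf '%s\n' "$text" ;;    # static text
  esac
)
```

With table entries such as "btw<TAB>by the way" and "today<TAB>!date +%F", the result could be typed into the focused field with `xdotool type "$(expand_abbrev ~/.abbrev btw)"` (the file name is only an example).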

software

X: AutoKey
Windows: AutoHotKey
OS X: partial support built-in

contextual command

Binding the availability/presence of commands to context (rather than making all of them available everywhere all the time) makes menus more concise/relevant and enables the reuse of key bindings. Additional user input can be prompted at run time as need be (to request a new name, &c.).

Ideally, contextual commands are available by both menu and key binding, both easily defined/modified.

text selections

Ideally, any text can be selected, making it available to commands which operate on text. Text returned by commands can replace the selection (if mutable), or be redirected (to the clipboard, notification display, terminal, &c.).

General commands (available for any text selection):

query
WWW, encyclopedia, dictionary, thesaurus, map, image, video, …
convert
substitute, change case (lower/upper/title/…), tabs/spaces, URL encode/decode, …
format
shift right/left (indentation), sort, trim, condense multiple spaces/newlines, …
statistics
count characters/words/lines
translate
to/from frequently used languages
calculate
evaluate arithmetic expression

Direct shell interaction provides essential versatility:

shell
filter/pipe through shell command, evaluate as shell command
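Several of the general commands reduce to small stdin/stdout filters. The following sketch shows a few, with hypothetical function names; each reads the selection on stdin and writes the result to stdout.

```shell
# convert: change case to upper
to_upper()   { tr '[:lower:]' '[:upper:]'; }

# format: trim leading/trailing whitespace on each line
trim_lines() { sed 's/^[[:space:]]*//; s/[[:space:]]*$//'; }

# format: condense runs of spaces to a single space
squeeze()    { tr -s ' '; }

# statistics: count lines, words and characters
count_text() {
  wc -l -w -m | awk '{ printf "%d lines, %d words, %d characters\n", $1, $2, $3 }'
}
```

Under X, a command such as `xsel -o | trim_lines | to_upper | xsel -b -i` would read the current selection and place the result on the clipboard.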

Specific commands can be bound to patterns (to be available/displayed only when the selection matches the pattern):

URL
open, query archive.org
domain name, IP address
whois, traceroute, ping, …
hex colour
graphic colour picker
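Pattern binding can be sketched as a classifier whose result selects which commands to offer. The patterns below are deliberately loose approximations (for instance, the IP pattern does not check that each number is at most 255), and the function name is hypothetical.

```shell
# Print the kind of text in $1: url, ip, colour, domain, or text.
classify_selection() {
  if   printf '%s\n' "$1" | grep -Eq '^https?://[^[:space:]]+$'; then echo url
  elif printf '%s\n' "$1" | grep -Eq '^([0-9]{1,3}\.){3}[0-9]{1,3}$'; then echo ip
  elif printf '%s\n' "$1" | grep -Eq '^#([0-9A-Fa-f]{3}|[0-9A-Fa-f]{6})$'; then echo colour
  elif printf '%s\n' "$1" | grep -Eq '^([A-Za-z0-9-]+\.)+[A-Za-z]{2,}$'; then echo domain
  else echo text
  fi
}
```

A contextual menu could then call `classify_selection "$(xsel -o)"` and add the matching entries (whois, ping, colour picker, &c.) to the general ones.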

software

X: xsel, xclip, xdotool
OS X: OnMyCommand, TextExtras

X provides an interface to the current text selection, which can be accessed by a simple program (or shell script) to display a consistent contextual menu of one’s own design (independently, without the problem of integrating with existing menus).

files & directories

General commands:

copy
path(s), basename(s), dirname(s)
search
find/locate (glob/regex) file names/contents in current directory
create
file, directory, symlink, archive
display
checksum (MD5, SHA1), file type, aggregate size

Specific commands can be bound to file type:

extract archive
tgz, zip, rar, …
convert
flac to mp3, scale/rotate/crop image, text encoding, …
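Binding by file type can be sketched with a case statement on the file name, assuming that the extension is trustworthy and that tar, unzip and unrar are installed. The function names are hypothetical.

```shell
# Print the extraction command for FILE, chosen by file name extension.
extractor_for() {
  case $1 in
    *.tar.gz|*.tgz) echo "tar -xzf" ;;
    *.zip)          echo "unzip" ;;
    *.rar)          echo "unrar x" ;;
    *)              return 1 ;;
  esac
}

# Extract FILE into the current directory.
# The body runs in a subshell, so its variables stay local.
extract() (
  cmd=$(extractor_for "$1") || { echo "no extractor for: $1" >&2; return 1; }
  $cmd "$1"    # unquoted on purpose: the command includes its options
)
```

Matching on the output of `file` instead of the name would be more robust, at the cost of a slightly longer table.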

software

X: SpaceFM, Thunar
OS X: OnMyCommand

For X, some file managers (e.g. SpaceFM, Thunar) have integrated support for custom contextual commands (often called «actions»). Thunar has a useful batch renaming interface which can be independently invoked by shell command (enabling it to be called from other file managers, &c.).

Instead of implementing custom commands within the confines of a specific file manager, an independent menu can be maintained and called to appear as need be, providing a consistent interface for use with various file managers (thus leaving them to handle display/selection rather than manipulation). Such a menu can also use the text selection (assuming it consists of one or more valid file paths); i.e. it need not depend on file managers at all.

script control

GUI automation

Many GUI programs have no alternative interface for interaction, but some automation is possible by simulating keyboard/mouse input.

software

X: xdotool, xvkbd, Xnee
Windows: AutoHotKey
OS X: Automator

Unfortunately, X is hindered by a mess of uncooperative/competing GUI toolkits (in contrast to Windows and OS X, which are more easily automated).

CLI/GUI interaction

Similarly, command line software generally ignores the GUI, but a few basic tools provide essential utility to scripts in the absence of a terminal: to display information, and to get user input (confirmation, keyboard input, menu selection, &c.).

software

X: yad
OS X: Platypus

statistics

Many programs maintain their own lists of recently opened files but such basic utility should be available globally. A simple rolling history list can be augmented/informed by additional usage statistics (e.g. frequency, duration, &c.).
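A global history of this kind needs little more than an append-only log, sketched here with one path per line (the function names and the log format are hypothetical). Recency and frequency are both derived from the same file.

```shell
# Append one line per "file opened" event. Usage: record_open LOG PATH
record_open() { printf '%s\n' "$2" >> "$1"; }

# Most recently opened first, without duplicates.
recent_files() {
  awk '{ line[NR] = $0 }
       END { for (i = NR; i > 0; i--) if (!seen[line[i]]++) print line[i] }' "$1"
}

# Most frequently opened first, with counts.
frequent_files() { sort "$1" | uniq -c | sort -rn; }
```

Something still has to call `record_open`, e.g. a wrapper script around whatever command opens files. Duration would require logging close events as well.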

keyboard

Easier keyboards do more work for less (even in the WIMP). Simple frequency statistics (n-grams: per character, pair, triplet, …) of text input can inform the keyboard layout, key bindings, &c.
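Such statistics are simple to compute from a sample of text. The sketch below counts character n-grams of a given length within each line (it does not count across line breaks); the function name is hypothetical.

```shell
# Count character n-grams of length N in text on stdin and print
# "count<TAB>n-gram", most frequent first. Usage: ngrams N
ngrams() {
  awk -v n="$1" '{
    for (i = 1; i + n - 1 <= length($0); i++) count[substr($0, i, n)]++
  } END {
    for (g in count) printf "%d\t%s\n", count[g], g
  }' | sort -rn
}
```

For example, `ngrams 2 < sample.txt | head` lists the most common character pairs in a file (the file name is only an example), which is the kind of input that can inform a layout.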

Keyboard input can be streamed through a buffer for statistics and other reading processes (such as abbreviation expansion, key bindings, &c.).