exorcising the WIMP

Despite inherent weakness, the WIMP is ironically strong (at least in the brute force of technological/market inertia) and practically unavoidable. Possessed by pernicious paradigms, it can be exorcised to improve fitness.

This document describes remedial modifications which I have come to depend on to make the WIMP environment bearable and more useful (though some of the things described here remain useful in the absence of a GUI).

Input welcome: help!

preliminary

While also working with other systems, Macintosh (with essential extensions thanks to many especially thoughtful third-party developers) was my primary environment for more than 20 years until I could no longer tolerate Apple and moved to Linux (where I remain appalled that so much is still missing after so long!).

Although increasingly involved in programming, most of my work with computers remains concentrated around the manipulation of my own data, and I have been slow in implementing new tools. I have long contemplated information systems and their human interfaces, but my curiosity generally withers in studying common theory/practice because it seems unnecessarily complicated to me (like so many forms of human expression...). In any case, my interest is in the greater meaning of information technology and symbolic manipulation (in contrast to the limited scope of computers and computation).

The WIMP is certainly not my ideal but I have yet to either produce or find a better practical alternative, even though I've been thinking about it for years. Short of starting fresh, one might build a typographic interface upon Plan 9. Meanwhile, I try to make the best of my situation.

display

resolution

For optimal resolution (with low-resolution displays), I set my display at a distance just beyond the threshold of pixel visibility; i.e. with enough distance that pixel mosaics become coherent shapes/lines (rather than collections of tiles), yet close enough that subtle differences remain detectable, ready for closer inspection.
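As a rough guide to where that threshold lies, one can compute the distance at which a single pixel subtends about one arcminute. That figure is a common estimate of normal visual acuity and is an assumption here, not something I measured; the function name is likewise only illustrative. A minimal Python sketch:

```python
import math

def pixel_threshold_distance_mm(ppi, acuity_arcmin=1.0):
    """Distance (mm) at which one pixel subtends `acuity_arcmin` arcminutes."""
    pitch_mm = 25.4 / ppi                        # pixel pitch from pixels per inch
    angle = math.radians(acuity_arcmin / 60.0)   # arcminutes -> radians
    return pitch_mm / math.tan(angle)

distance = pixel_threshold_distance_mm(96)       # roughly 0.9 m for a 96 ppi display
```

The result is only a starting point; the comfortable distance still has to be found by eye.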

zoom/pan

I prefer a global key binding to zoom the entire display; something I came to appreciate long ago while using graphics cards with instantaneous zoom/pan implemented in hardware. Newer/faster CPUs can do it with software but it's still slower after all these years!

software

X: XRandR
OS X: Accessibility

Under X, video modes can be configured to display a portion of a larger area with panning, but this is slow and blurry as the display is driven at a resolution less than native (and displays usually force anti-aliasing on such input). I have yet to find a simple software-only solution for X (Compiz, &c. is too much).

fonts

Assuming a capable system, I enable anti-aliasing and disable hinting. Some complain that this results in blurry text but it actually displays the type more accurately. I generally find it more readable and I prefer less contrast anyway.

Under X, these settings can be configured using Fontconfig (in case they are not otherwise accessible, or to override defaults).

The most legible fonts I see for low-resolution display are those commissioned by Microsoft specifically for that purpose. Ironically, font display has been poor under Windows (due to forced hinting and the absence of sub-pixel positioning) but it is improving.

For general use, I prefer these fonts:

Georgia (serif)
Candara (sans serif)
Consolas (monospaced)
Verdana (for programming; i.e. whenever individual characters must remain immediately distinct)

Many Linux distributions have a Microsoft core fonts package, which includes Georgia and Verdana (but for legal reasons it must be installed manually). See Windows fonts for manual installation.

The fonts above are freely available and they work very well, but some commercial fonts are worth mentioning:

Thesis family (includes serif, sans-serif, and monospaced fonts; Consolas, by the same designer, is derived from TheSansMono)

For purposes which require bitmapped fonts, the (monospaced) Tamsyn family works well.

appearance

Popular WIMP implementations are rife with superfluous and distracting graphics, but graphical interface elements (widgets) need not be conspicuous. They can instead remain below the threshold of distraction yet visible enough to be easily located.

Contrast is usually extreme, especially in text input fields which are often set with a background of maximum intensity (100% white). Light text on a dark background has advantages in principle but problems in practice as software developers often make assumptions about the background colour/intensity. In any case, I naturally prefer dark-on-light for reflective displays (e.g. e-ink) and light-on-dark for transmissive (light-emitting) displays.

Instead of using fixed values, an interface palette (theme) can be defined as a set of related intensity intervals which can be easily adjusted (scaled/shifted) as a unit (preserving proportionality) throughout the full intensity range of the display as need be (but as obvious as this seems, I'm not aware of software which works this way). When multiple hues are used, perceptual uniformity is essential.
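To make the idea concrete, here is a minimal Python sketch of a theme defined only by relative levels and then mapped onto whatever intensity interval is wanted. The role names, the 0-255 range and the function name are assumptions for illustration; the mapping is linear in stored values and makes no attempt at perceptual uniformity.

```python
def scale_palette(levels, low, high):
    """Map relative levels (0.0-1.0) onto the intensity interval [low, high]."""
    if not 0 <= low <= high <= 255:
        raise ValueError("expected 0 <= low <= high <= 255")
    span = high - low
    return {name: round(low + level * span) for name, level in levels.items()}

# Hypothetical theme: each role is defined only by its relative intensity.
THEME = {"background": 0.0, "widget": 0.25, "border": 0.5, "text": 1.0}

dark = scale_palette(THEME, 32, 160)    # whole theme placed in a dim range
light = scale_palette(THEME, 96, 224)   # same proportions, shifted brighter
```

Changing the two end points moves or compresses the entire theme while the roles keep their relative spacing.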

Adjusting the appearance of different systems: Under X, GTK+ provides a relatively high level of control (and Qt can use GTK+ themes); Windows provides basic but satisfactory control of appearance; in contrast, Apple makes it difficult (i.e. OS X themes are possible but generally cause more trouble than they're worth). The search terms «flat grey» work well for finding minimal themes.

My preferred theme files (assuming the fonts listed above):

GTK+ [wot a mess...]
Windows: Windows-XP.theme (grey, reduced contrast, Candara font)

Keyboard mnemonic labels (i.e. the lines under hotkey characters in menus/buttons/&c.) can remain hidden until the modifier key is pressed. For GTK2:

echo 'gtk-auto-mnemonics = 1' >>~/.gtkrc-2.0

For GTK3: "mnemonics has been deprecated since version 3.10 [..] setting is ignored" (source)

Windows includes a similar preference. Mac OS never had them.

windows

window manager

In contrast to the usual stacking/overlapping, tiling can be useful but a window manager should support both modes. In any case, many GUI programs fill their windows with contents that are not very adaptable which greatly hinders the utility of tiling (wasting space as content does not adjust to fill it, and requiring unnecessary scrolling to see what would otherwise be visible).

Tiling window managers generally change the layout every time a window is shown/hidden (disrupting the size/position of windows in order to fill the display and to avoid overlapping). This automatic layout process can be more or less disruptive, especially in respect to the current/active window, and I prefer minimal layout disruption.

software

In contrast to Windows and OS X, many different window managers are available for X (see Arch's list of window managers).

The most promising tiling window managers I've used are: bspwm and herbstluftwm. Both require patience to read and configure but work very well.

After using bspwm for a year or so, I intend to stay with tiling, and it's worth noting that implementations vary greatly, as other tiling window managers had the opposite effect on me (e.g. i3).

focus switching

Changing window focus should be possible by both mouse and keyboard. The ability to select from a set of windows is also essential, especially when some windows are hidden.

Beyond the basic set of all windows, various subsets can be useful (grouped by context, workspace, tag, &c.). In contrast to cycling through a list of titles/icons, I prefer to display a spread of actual windows proportionally scaled down (so the contents of each is entirely visible).

Relative placement and proportional scaling (i.e. preserving relative position and size) are essential (at least for non-tiled windows) and increasingly important as the number of windows increases (making it more difficult to distinguish them based on contents alone).
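The simplest form of such a spread applies one uniform scale factor to every window rectangle. The Python sketch below assumes rectangles given as (x, y, width, height) and a hypothetical function name; it preserves relative position and size but does nothing to separate windows that overlap, which a real switcher would also need to handle.

```python
def spread(windows, view_w, view_h):
    """Scale window rectangles (x, y, w, h) by one uniform factor so that
    their bounding box fits inside a view of view_w x view_h."""
    left = min(x for x, y, w, h in windows.values())
    top = min(y for x, y, w, h in windows.values())
    right = max(x + w for x, y, w, h in windows.values())
    bottom = max(y + h for x, y, w, h in windows.values())
    factor = min(view_w / (right - left), view_h / (bottom - top))
    return {
        name: (round((x - left) * factor), round((y - top) * factor),
               round(w * factor), round(h * factor))
        for name, (x, y, w, h) in windows.items()
    }
```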

software

X: Rofi, Skippy-XD
OS X: Exposé

Under X, Rofi is a simple title switcher (no previews) while Skippy-XD (for NetWM-compliant window managers) has proportional scaling but not relative placement. Exposé (OS X) initially supported relative placement and proportional scaling but has since dropped both.

geometry

Windows can be manipulated by command (and hence keyboard). At the very least, I use key bindings for these basic actions:

move to top left/centre/right
move right/left
increase/decrease width/height

Whether stacked or tiled, layout is essential to me and I prefer to define consistent geometry (location/size), or layout templates, into which windows can be placed (and adjusted as need be). In any case, I generally want window geometry to remain unchanged unless I change it.
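A layout template can be as simple as a named rectangle expressed in fractions of the screen, resolved to pixels when a window is placed. The slot names and values in this Python sketch are hypothetical:

```python
# Hypothetical templates: (x, y, width, height) as fractions of the screen.
TEMPLATES = {
    "left-half":  (0.0, 0.0, 0.5, 1.0),
    "right-half": (0.5, 0.0, 0.5, 1.0),
    "centre":     (0.2, 0.1, 0.6, 0.8),
}

def slot_geometry(slot, screen_w, screen_h):
    """Resolve a template slot to pixel geometry for a given screen size."""
    fx, fy, fw, fh = TEMPLATES[slot]
    return (round(fx * screen_w), round(fy * screen_h),
            round(fw * screen_w), round(fh * screen_h))
```

The resulting tuple can then be handed to whatever tool actually moves and resizes the window.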

software

X: window-ctrl
OS X: AppleScript

Under X, this window-ctrl script works well with various stacking window managers I've used.

filtering

Per-window image filtering can be applied to uncooperative software which remains too difficult to configure otherwise (e.g. web browser displays too much contrast/brightness/&c.). It seems reasonable that this should be among the basic capabilities of a window manager (or compositor); at least: brightness, contrast, inversion.
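The three filters amount to a little arithmetic per colour channel. The Python sketch below assumes channel values normalised to 0.0-1.0 and contrast applied about mid-grey; it is one common formulation, not a description of how any particular compositor does it.

```python
def filter_pixel(value, brightness=0.0, contrast=1.0, invert=False):
    """Apply inversion, then contrast (about mid-grey), then brightness to
    one channel value in the range 0.0-1.0, clamping the result."""
    if invert:
        value = 1.0 - value
    value = (value - 0.5) * contrast + 0.5 + brightness
    return min(1.0, max(0.0, value))
```

With contrast=0.5, for example, full white (1.0) becomes 0.75 and full black (0.0) becomes 0.25.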

software

X: ?

Window managers with transparency can reduce contrast by increasing the transparency of the offending window (assuming an appropriate background).

keyboard control

Ideally, programs can be completely controlled by text interface; i.e. each graphical input (menu item, button, widget, &c.) corresponds to a text command, and each command can be bound to the keyboard (and other triggers).

key binding (shortcuts/hotkeys)

The capability to globally bind keys to arbitrary commands can be very useful, but it can also be troublesome due to conflict with other software which uses the same keys. Some programs implement their own global key bindings (e.g. music player, clipboard manager, &c.) but this is arguably wrong and I prefer to disable them and instead manage global key bindings with a separate/dedicated tool (which helps avoid conflicts and abstracts/separates user space from specific programs).

Another way to avoid conflict is to restrict context (name space). The menu interface described below is one way to do that.

software

X: sxhkd, xbindkeys
Windows: AutoHotkey
OS X: Automator, &c.

Under X, sxhkd and xbindkeys provide global key binding support independently of specific window managers (and desktop environments), which is useful for making one's configuration more portable/modular.

menu

A global key binding can call a special program which displays a text input field and matches input to an index of files/commands/&c., displaying/updating a menu as the input changes. There are many programs like this which only launch programs or open files, but these neglect the general utility/versatility of this simple interface: a programmable menu driven by keyboard. Such an interface can be extended and applied to many different contexts:

run commands
navigate file systems and find/open/copy/move/... files
calculator
manage clipboard history
context menu (for text selection, &c.)
custom interfaces
&c.

thus bypassing many impediments/irritations of the WIMP.
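The core of such a menu is just a filter that is re-run on every keystroke. A minimal Python sketch, assuming a flat list of entries and simple substring matching (real tools usually add ranking and fuzzier matching); the index contents are made up:

```python
def match(entries, query):
    """Return entries containing every whitespace-separated term of `query`
    (case-insensitive), keeping the original order of `entries`."""
    terms = query.lower().split()
    return [e for e in entries if all(t in e.lower() for t in terms)]

# Hypothetical index; it could equally hold files, clipboard items, &c.
INDEX = ["firefox", "file manager", "calculator", "clipboard history"]

narrowed = match(INDEX, "cl hi")   # only "clipboard history" contains both terms
```

Because the index is only a list of strings, the same filter serves every context listed above; only the source of the entries and the action on selection change.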

Such a tool remains the single most important software for improving my general productivity (by reducing interface impedance) and I feel crippled without it.

software

X: Albert, dmenu
Windows: FARR
OS X: LaunchBar

LaunchBar (OS X) is the best implementation I've seen (not to be confused with the variety of comparatively feeble launchers). FARR (Windows) is close. Albert (Linux) looks promising but it's not there yet. dmenu (Linux) works well with Unix tools but it provides only very basic menu display/selection.

In the absence of adequate software under Linux, I developed a system of shell scripts and a menu in Tcl/Tk which has been working well, and I will eventually release it here.

clipboard

Clipboard history (i.e. a running buffer/stack of the system clipboard) is available by extension on most popular systems.

I prefer the following essential behaviour:

system clipboard is preserved
history order is preserved
paste directly from the history (without changing the system clipboard)

Clipboard extensions for OS X (and earlier versions of Mac OS) generally work like that but the common model on Windows and X differs:

When an item is selected from the history stack, it is moved to the top, replacing the system clipboard. Of course this is useful if automatic pasting is not possible, but even software which supports automatic pasting persists in destructive modification by reordering the history and writing to the system clipboard (displacing the last cut/copied item).

If the system clipboard must be modified to perform pasting, it should be immediately restored to its previous state.

Keeping a history of the clipboard is a simple concept, yet implementations tend to be both complex and unaccommodating. Exposing the history as a plain text file instead makes it more useful (available to myriad tools as need be), while the history process remains efficient and simply defined.
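A minimal Python sketch of that model follows. It assumes one JSON-encoded string per line (so multi-line entries still occupy a single line of the plain text file) and takes the platform-specific clipboard and paste operations as placeholder callables; none of this reflects an existing tool.

```python
import json

def record(path, text):
    """Append one clipboard entry as a single JSON-encoded line."""
    with open(path, "a", encoding="utf-8") as f:
        f.write(json.dumps(text) + "\n")

def entries(path):
    """Return the history, most recent first, without modifying the file."""
    with open(path, encoding="utf-8") as f:
        return [json.loads(line) for line in f if line.strip()][::-1]

def paste_from_history(path, index, get_clipboard, set_clipboard, send_paste):
    """Paste entry `index`, then restore the system clipboard.
    The three callables are placeholders for platform-specific code."""
    saved = get_clipboard()
    set_clipboard(entries(path)[index])
    try:
        send_paste()
    finally:
        set_clipboard(saved)   # the last cut/copied item is never displaced
```

Pasting reads the file but never writes to it, so the history order is preserved. A daemon that records clipboard changes would need to ignore the temporary change made during a paste.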

key bindings

I prefer that the behaviour of standard cut/copy/paste system key bindings remains unchanged, and augmented with these:

display the whole history list for navigation and selection by keyboard or mouse
paste directly from each of the first (most recent) several indices in the list
repeat: paste from last selected/pasted index

selections

X supports selections and many clipboard managers have options to merge selections into the clipboard history; however, copied data and selected data are distinct input streams, so I prefer a separate running buffer for each stream (which does not preclude them from being blended downstream).

software

X: Clipman, clipmenu, xsel, xclip
Windows: Ditto
OS X: LaunchBar

X has a lot of clipboard history managers (several for many years) but, surprisingly, none work as one expects. The basic utility of storing clipboard/selection history can be handled by a simple daemon which maintains a plain text file, to provide a common infrastructure for any number of different interfaces.

text

bindings

Key/mouse bindings for text navigation/selection/deletion seem obvious enough but different systems behave differently. I prefer at least the following basic control (similar to NeXTStep & OS X).

up/down: move up/down one line (maintaining horizontal offset if possible), or to beginning/end of text (if current line is first/last line)
left/right: move backward/forward one character
m1-up/down: move to next beginning/end of logical (possibly auto-wrapped) line (i.e. effectively beginning/end of paragraph)
m1-left/right: move backward/forward to next leading/trailing word boundary
m2-up/down: move to beginning/end of text
m2-left/right: move to beginning/end of line

selection/deletion

The shift key effectively changes move to select (for the above commands). The backspace/delete keys follow the same pattern as left/right (deleting, instead of moving, in smaller or larger units depending on modifier keys).

mouse

Mouse selection can be easily constrained to a given unit size: character, word, line, paragraph.

single-click: place insertion point under pointer; drag selects by character
double-click: select word under pointer; drag selects by word
triple-click: select line under pointer; drag selects by line
quad-click: select paragraph under pointer; drag selects by paragraph

abbreviation expansion (text expansion)

To reduce input repetition (and error), frequently entered text can be stored together with associated abbreviations. Ideally, entering such an abbreviation in any text input field causes it to be replaced in situ by its associated text (e.g. btw → by the way). If this kind of automatic/implicit expansion is not feasible, it still remains useful to bind a key which prompts for input of an abbreviation and returns the replacement.

Abbreviation expansion is useful for static text like addresses and other frequently entered text but it can also be used to trigger command output (e.g. insert current date in various formats, &c.).
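Both kinds of expansion fit in one lookup table whose values are either static text or something to run. A minimal Python sketch, in which the abbreviation "ddate" and the whole-word matching rule are assumptions for illustration:

```python
import re
from datetime import date

# Hypothetical table: values are either static text or a function to call.
ABBREVIATIONS = {
    "btw": "by the way",
    "ddate": lambda: date.today().isoformat(),
}

def expand(text, table=ABBREVIATIONS):
    """Replace each whole word found in `table` with its expansion."""
    def replace(m):
        value = table.get(m.group(0))
        if value is None:
            return m.group(0)            # not an abbreviation: leave as typed
        return value() if callable(value) else value
    return re.sub(r"\w+", replace, text)
```

Here expand("btw, see you") gives "by the way, see you", while a word that merely contains an abbreviation (e.g. "btwx") is left alone.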

software

X: AutoKey
Windows: AutoHotkey
OS X: partial support built-in

contextual command

Binding the availability/presence of commands to context (rather than making all of them available everywhere all the time) makes menus more concise/relevant and enables the reuse of key bindings. Additional user input can be prompted at run time as need be (to request a new name, &c.).

Ideally, contextual commands are available by both menu and key binding, both easily defined/modified.
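One way to picture this is a registry keyed by context and key, from which the menu for the current context is derived. The contexts, keys and command names in this Python sketch are hypothetical:

```python
# Hypothetical registry: (context, key) -> command name.
BINDINGS = {
    ("text", "u"): "change case: upper",
    ("file", "u"): "extract archive",
    ("file", "c"): "copy path",
}

def commands_for(context):
    """Menu for one context: only the commands bound there, keyed by hotkey."""
    return {key: cmd for (ctx, key), cmd in BINDINGS.items() if ctx == context}
```

The same key ("u") is reused without conflict because each context has its own name space, and the one table drives both the menu and the key bindings.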

text selections

Ideally, any text can be selected, making it available to commands which operate on text. Text returned by commands can replace the selection (if mutable), or be redirected (to the clipboard, notification display, terminal, &c.).

General commands (available for any text selection):

query

WWW, encyclopedia, dictionary, thesaurus, map, image, video, ...

convert

substitute, change case (lower/upper/title/...), tabs/spaces, URL encode/decode, ...

format

shift right/left (indentation), sort, trim, condense multiple spaces/newlines, ...

statistics

count characters/words/lines

translate

to/from frequently used languages

calculate

evaluate arithmetic expression

Direct shell interaction provides essential versatility:

shell

filter/pipe through shell command, evaluate as shell command
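Filtering a selection through a shell command takes only a few lines. In this Python sketch the function name is mine, and fetching the selection and writing the result back are left to the surrounding tool:

```python
import subprocess

def filter_through(command, text):
    """Pipe `text` through a shell command and return its standard output.
    Raises CalledProcessError if the command exits with an error."""
    result = subprocess.run(command, shell=True, input=text,
                            capture_output=True, text=True, check=True)
    return result.stdout
```

For instance, filter_through("sort", selection) returns the selected lines sorted, ready to replace the selection or go to the clipboard.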

Specific commands can be bound to patterns (to be available/displayed only when the selection matches the pattern):

URL

open, query archive.org

domain name, IP address

whois, traceroute, ping, ...

hex colour

graphic colour picker
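Pattern binding can be sketched as a list of regular expressions, each paired with the commands to offer when the whole selection matches. The patterns in this Python sketch are deliberately loose illustrations and not validators; the command names are taken from the list above.

```python
import re

# Illustrative patterns only; each is tried against the whole trimmed selection.
PATTERN_COMMANDS = [
    (re.compile(r"https?://\S+"), ["open", "query archive.org"]),
    (re.compile(r"(\d{1,3}\.){3}\d{1,3}"), ["whois", "traceroute", "ping"]),
    (re.compile(r"([a-z0-9-]+\.)+[a-z]{2,}", re.I), ["whois", "traceroute", "ping"]),
    (re.compile(r"#?([0-9a-f]{3}|[0-9a-f]{6})", re.I), ["graphic colour picker"]),
]

def commands_for_selection(selection):
    """Return the commands whose pattern matches the whole (trimmed) selection."""
    text = selection.strip()
    found = []
    for pattern, commands in PATTERN_COMMANDS:
        if pattern.fullmatch(text):
            found.extend(c for c in commands if c not in found)
    return found
```

A selection that matches nothing simply gets the general commands and no extras.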

software

X: xsel, xclip, xdotool
OS X: OnMyCommand, TextExtras

X provides an interface to the current text selection, which can be accessed by a simple program (or shell script) to display a consistent contextual menu of one's own design (independently, without the problem of integrating with existing menus).

files & directories

General commands:

copy

path(s), basename(s), dirname(s)

search

find/locate (glob/regex) file names/contents in current directory

create

file, directory, symlink, archive

display

checksum (MD5, SHA1), file type, aggregate size

Specific commands can be bound to file type:

extract archive

tgz, zip, rar, ...

convert

flac to mp3, scale/rotate/crop image, text encoding, ...

software

X: SpaceFM, Thunar
OS X: OnMyCommand

For X, some file managers (e.g. SpaceFM, Thunar) have integrated support for custom contextual commands (often called «actions»). Thunar has a useful batch renaming interface which can be independently invoked by shell command (enabling it to be called from other file managers, &c.).

Instead of implementing custom commands within the confines of a specific file manager, an independent menu can be maintained and called to appear as need be, providing a consistent interface for use with various file managers (thus leaving them to handle display/selection rather than manipulation). Such a menu can also use the text selection (assuming it consists of one or more valid file paths); i.e. it need not depend on file managers at all.
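Deciding whether a text selection qualifies as a list of files is a small check. This Python sketch assumes one path per line and a hypothetical function name:

```python
import os

def paths_from_selection(selection):
    """Return the selection's lines as paths if every one exists, else []."""
    lines = [line.strip() for line in selection.splitlines() if line.strip()]
    paths = [os.path.expanduser(line) for line in lines]
    return paths if paths and all(os.path.exists(p) for p in paths) else []
```

An empty result means the selection should be treated as ordinary text; otherwise the file menu can be shown for the returned paths.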

script control

GUI automation

Many GUI programs have no alternative interface for interaction, but some automation is possible by simulating keyboard/mouse input.

software

X: xdotool, xvkbd, Xnee
Windows: AutoHotkey
OS X: Automator

Unfortunately, X is hindered by a mess of uncooperative/competing GUI toolkits (in contrast to Windows and OS X, which are more easily automated).

CLI/GUI interaction

Command line software generally ignores the GUI, but a few basic tools provide essential utility to scripts in the absence of a terminal: to display information, and to get user input (confirmation, keyboard input, menu selection, &c.). Scripts which use GUI integration should also gracefully revert to terminal interaction in the absence of a GUI.
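The fallback can be sketched as a single decision made before any prompt. This Python sketch assumes that an X session is indicated by the DISPLAY variable and that yad accepts zenity-style --entry and --text options; both are assumptions to verify against the installed version.

```python
import os
import shutil
import subprocess

def frontend(environ=os.environ, which=shutil.which):
    """Return "gui" when a display and a dialog tool are available."""
    return "gui" if environ.get("DISPLAY") and which("yad") else "terminal"

def ask(prompt):
    """Ask for one line of input by dialog if possible, else on the terminal."""
    if frontend() == "gui":
        # Assumed yad invocation; returns None if the dialog is cancelled.
        result = subprocess.run(["yad", "--entry", "--text", prompt],
                                capture_output=True, text=True)
        return result.stdout.rstrip("\n") if result.returncode == 0 else None
    return input(prompt + " ")
```

Keeping the decision in one function means the rest of the script does not need to know which interface is in use.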

software

X: yad
OS X: Platypus

statistics

Many programs maintain their own lists of recently opened files but such basic utility should be available globally. A simple rolling history list can be augmented with additional usage statistics (e.g. frequency, time stamp, &c.).
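A rolling list with usage statistics needs little more than an ordered mapping. In this Python sketch the class name, the default limit and the fields kept are all assumptions:

```python
import time

class UsageHistory:
    """Rolling list of recently used items with a use count and last-use time."""

    def __init__(self, limit=100):
        self.limit = limit
        self.items = {}   # item -> [count, last_used]; dicts keep insertion order

    def touch(self, item, now=None):
        """Record one use of `item`, moving it to the most recent position."""
        count = self.items.pop(item, [0, 0.0])[0]
        self.items[item] = [count + 1, time.time() if now is None else now]
        while len(self.items) > self.limit:
            self.items.pop(next(iter(self.items)))   # drop least recently used

    def recent(self):
        return list(self.items)[::-1]

    def frequent(self):
        return sorted(self.items, key=lambda item: -self.items[item][0])
```

Any program (or a wrapper around the command that opens files) could call touch(), and a global menu could then offer recent() or frequent() as its index.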

keyboard

Easier keyboards do more work for less (even in the WIMP). Simple frequency statistics (n-grams: per character, pair, triplet, ...) of text input can inform the keyboard layout, key bindings, &c.
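Counting n-grams is straightforward. A minimal Python sketch, assuming the input has already been captured as a string:

```python
from collections import Counter

def ngrams(text, n):
    """Count every run of `n` consecutive characters in `text`."""
    return Counter(text[i:i + n] for i in range(len(text) - n + 1))
```

For example, ngrams("banana", 2) counts "an" and "na" twice each and "ba" once; run over a large body of one's own input, the most common pairs indicate which keys deserve the easiest positions.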

Keyboard input can be streamed through a buffer for statistics and other reading processes (such as abbreviation expansion, key bindings, &c.).
