connotext

A simple/unobtrusive/versatile formal language to structure plain text for logical manipulation.

Connotext is a plain text language for producing structured text (i.e. to make text more useful by exposing its structure). It is intended to be of primary use as a source format for human writers/editors; its syntax is intended to be minimal/obvious/tolerant yet unambiguous (i.e. machine readable). In contrast to many markup languages, semantics are generally inferred, reducing dependence on explicit metadata.

minimal
to infer structure from existing patterns/punctuation and to minimise the load/obtrusion of any additional syntax
obvious
to make the semantics of the syntax evident/intelligible
tolerant
to avoid brittleness/strictness of form beyond what is necessary to eliminate ambiguity
plain text
Connotext syntax uses basic ASCII (7-bit) characters. Additionally, some alternative characters are recognised, in which case UTF-8 is expected.

For more detail, see:

why

Text is a universal interface. It remains available to all the useful tools designed to operate on text. It can be easily read/written both by humans and by machines, but machines generally don't read as much meaning from text as humans do, so we tend to pollute our text streams with metadata (to inform the machine about structure/meaning).

Languages which embed metadata (e.g. XML, HTML, even TeX, troff, &c.) ironically distract from (if not obscure) the data which they exist to serve. Many such plain text markup languages are called «human readable» but it seems more accurate to describe them as machine readable in primary purpose, as human readability remains conspicuously deficient/subordinate. Structured plain text languages already exist, but I found none which satisfied my need (due to insufficient scope/abstraction/versatility and/or unnecessary syntactic complexity/noise).

The problem of structuring text to associate metadata can be informed by asking: What can be inferred? And if explicit syntax is necessary, its obtrusion can be reduced by asking: Can it be made easier to write/read (for a human being)?

Markdown

Connotext is similar to Markdown (as implemented/extended in PHP Markdown Extra, Pandoc, &c.) but deliberately departs from it (i.e. some syntax is interpreted differently, or simply not recognised). In practice, connotext produces very similar (if not identical) output for basic Markdown, but it is not a strict superset.

Markdown uses embedded HTML to augment its utility, but its limited semantic space remains insufficient for some texts, and although some dialects are inevitably stretching it, «Markdown is a text-to-HTML conversion tool» which remains limited in scope/utility (in contrast to a tool for the general problem of structuring text).

abstraction

Connotext is abstracted from any particular output format. The core syntax is small/simple/expressive/extensible.

Primary utility is found in these concepts:

primitive containers
isolate/identify any text (to attach attributes, &c.)
attributes
attach arbitrary metadata to any element/container
substitution
include command output, or file contents (not just images)
triggers
interactively run shell commands
input
get user input

Minimally, only primitive container elements and attributes are necessary to represent all remaining elements (i.e. the notation for other elements can be considered as short-form syntax).

Though connotext can be used to produce static text output, special attention is given to dynamic/interactive text (including input) and interaction with other software.

In short, connotext is a simple/unobtrusive/versatile formal language to structure plain text for logical manipulation.

example

Connotext aims to infer structure and minimise the cognitive load of any additional syntax.

Paragraphs are simply separated by a blank line.

For versatility, connotext has primitive container elements:

inline (for text without line breaks):

[contain text]{a}, even on[e]{b} character

and block (for text which contains line breaks):

--------------------------
this text block has several lines

but it could also be empty
(ready to be filled...)
as could an inline container []{d}
-----------------------{c}---
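
To make the two container forms concrete, here is a rough Python sketch of how they might be recognised. It is not the connotext parser: the names (`find_inline`, `find_blocks`) are hypothetical, nesting is ignored, and the minimum of three dashes for a block rule is an assumption (the examples above do not state a minimum).

```python
import re

# Inline container: [text]{attributes}, all on one line (nesting not handled).
INLINE = re.compile(r"\[([^\[\]\n]*)\]\{([^{}\n]*)\}")
# Block container: a rule of dashes opens it (assumed: three or more dashes);
# a rule of dashes carrying {attributes} closes it.
BLOCK_OPEN = re.compile(r"^-{3,}$")
BLOCK_CLOSE = re.compile(r"^-{3,}\{([^{}]*)\}-*$")


def find_inline(text):
    """Return (content, attributes) for each inline container in text."""
    return [(m.group(1), m.group(2)) for m in INLINE.finditer(text)]


def find_blocks(text):
    """Return (content, attributes) for each block container in text."""
    blocks, body, inside = [], [], False
    for line in text.splitlines():
        if not inside and BLOCK_OPEN.match(line):
            inside, body = True, []
        elif inside:
            close = BLOCK_CLOSE.match(line)
            if close:
                blocks.append(("\n".join(body), close.group(1)))
                inside = False
            else:
                body.append(line)
    return blocks
```

Given the inline example above, `find_inline` would yield the pairs `("contain text", "a")` and `("e", "b")`; an empty container such as `[]{d}` yields an empty content string.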

Braces define attributes which enable arbitrary metadata to be attached to elements:

make it [versatile]{name=value}
make it [intelligible]{#id .class name=value ref}
keep it [simple]{r}

[r]: {.help text="attributes can be separated by reference"}
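
As an illustration (a sketch under assumptions, not the connotext implementation), the attribute forms shown above could be read like this in Python. The function names are hypothetical, `#` and `.` are taken to mark an id and a class as the placeholders suggest, and the merge order used for references (local values win) is an assumption.

```python
import shlex


def parse_attributes(block):
    """Split a brace-delimited attribute block into id, classes, pairs, refs."""
    inner = block.strip()
    if not (inner.startswith("{") and inner.endswith("}")):
        raise ValueError("attribute block must be wrapped in braces")
    result = {"id": None, "classes": [], "pairs": {}, "refs": []}
    # shlex.split keeps a quoted value such as text="a b" together as one token
    for token in shlex.split(inner[1:-1]):
        if token.startswith("#"):
            result["id"] = token[1:]
        elif token.startswith("."):
            result["classes"].append(token[1:])
        elif "=" in token:
            name, value = token.split("=", 1)
            result["pairs"][name] = value
        else:
            # a bare word is treated as a reference to attributes defined elsewhere
            result["refs"].append(token)
    return result


def resolve(attrs, definitions):
    """Merge referenced attribute blocks into attrs (assumed: local values win)."""
    merged = {
        "id": attrs["id"],
        "classes": list(attrs["classes"]),
        "pairs": dict(attrs["pairs"]),
        "refs": [],
    }
    for ref in attrs["refs"]:
        target = parse_attributes(definitions[ref])
        merged["id"] = merged["id"] or target["id"]
        merged["classes"].extend(target["classes"])
        for name, value in target["pairs"].items():
            merged["pairs"].setdefault(name, value)
    return merged
```

Here `definitions` is a plain dictionary mapping a reference name (e.g. `r`) to its attribute block, standing in for however reference lines like `[r]: {...}` are actually collected.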

Additional syntax is used for brevity/simplicity:

make it *italic*

but it's essentially shorthand (i.e. it can all be done with containers and attributes):

make it [italic]{font-style=italic}
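
The shorthand relationship can be sketched as a simple rewrite (illustrative only; `expand_emphasis` is a hypothetical name, and a real parser would also need to handle escaping and nesting, which this does not):

```python
import re

# *text* on a single line, with no asterisks inside
EMPHASIS = re.compile(r"\*([^*\n]+)\*")


def expand_emphasis(text):
    """Rewrite *text* shorthand as an inline container with an attribute."""
    return EMPHASIS.sub(r"[\1]{font-style=italic}", text)


print(expand_emphasis("make it *italic*"))
# prints: make it [italic]{font-style=italic}
```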

For more detail, see: