X-TEXT

Working Draft HBR

The Alph Project <http://alph.io>
This document indexed at: <http://alph.io/AlpHTML>
©2017 Adam C. Moore (LÆMEUR) <mailto:adam@laemeur.com>

The essential thing we need for doing xanalogical documents in XML/HTML is a wrapper for Text nodes, which allows us to indicate a source for the enclosed text. In Alph, we're using an element called X-TEXT (for [x]analogical text), which looks like this:

<x-text src="{URL}" origin="{int}" extent="{int}"></x-text>

It's very simple. The src attribute provides the URL from which a text source can be retrieved, and the origin and extent attributes describe a substring of the retrieved text in the form of two zero-based indices into the Unicode code points of the source text. That is, start and end pointers, not start and length. For JavaScript people, that's the substring() convention, not substr().
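One wrinkle worth spelling out: JavaScript's own substring() counts UTF-16 code units, so it only matches code-point indices when the source has no characters outside the Basic Multilingual Plane. Here is a minimal sketch of a code-point-aware slice. The function name sliceByCodePoints is hypothetical and not part of alph.js; the end-exclusive reading of extent is an assumption taken from the substring() comparison above.

```javascript
// Hypothetical helper (not from alph.js): extract the substring an X-TEXT
// element describes, counting in Unicode code points.
// origin and extent are zero-based start and end indices; the end index is
// assumed to be exclusive, as with String.prototype.substring.
function sliceByCodePoints(sourceText, origin, extent) {
  // Array.from iterates a string by code point, so a surrogate pair
  // (an emoji, say) counts as one item rather than two UTF-16 code units.
  const codePoints = Array.from(sourceText);
  return codePoints.slice(origin, extent).join("");
}

console.log(sliceByCodePoints("Hello, Xanadu!", 7, 13)); // "Xanadu"
```

For plain ASCII sources this gives the same answer as sourceText.substring(origin, extent); the two diverge only once the source contains astral characters.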

At load-time, the browser retrieves the text sources and automatically fills all X-TEXT elements. This means that entire documents may be distributed with empty X-TEXT elements – no readable text – and that's okay! That's the Xanadu way.

X-TEXT elements MUST contain only a single Text node.

IMPLEMENTATION NOTES

For compatibility purposes, the Alph client still saves X-TEXT elements in HTML documents with their text included, since not all browsers are going to have alph.js or another X-TEXT-aware plugin installed at this time.

The origin and extent attributes MAY be null.

If the src attribute points at a text source and no substring is defined by origin/extent, we put the whole resource text into the X-TEXT element; alph.js currently sets the origin/extent attributes automatically when it fills the element with its content.
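A rough sketch of that defaulting rule follows. The name resolveRange is hypothetical, and the handling of a single missing attribute (missing origin means 0, missing extent means end of text) is an assumption; the draft only describes the case where neither is given.

```javascript
// Hypothetical helper (not from alph.js): turn possibly-missing origin/extent
// attribute values into a concrete code-point range over the source text.
function resolveRange(sourceText, originAttr, extentAttr) {
  // Length measured in code points, to match the origin/extent indices.
  const length = Array.from(sourceText).length;
  // Treat null, undefined and the empty string as "not given".
  const isBlank = (v) => v === null || v === undefined || v === "";
  // Assumption: a missing origin means the start of the text,
  // a missing extent means the end of the text.
  const origin = isBlank(originAttr) ? 0 : parseInt(originAttr, 10);
  const extent = isBlank(extentAttr) ? length : parseInt(extentAttr, 10);
  return { origin, extent };
}

console.log(resolveRange("abcdef", null, null)); // { origin: 0, extent: 6 }
```

The returned values are what a client could write back into the origin and extent attributes after filling the element.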

Sources Other Than text/plain

There are a lot of ways to do this. Here's what we're planning to support, though most of this isn't in our working code yet.

For XML/HTML documents, a limited subset of XPath and XPointer can be used to specify the textual portions of a document to be transcluded.

With the element() XPointer scheme, the src attribute can point directly at a Text node. This is the best way to do it, in our current thinking.

If a barename, XPath, or element() XPointer is used to point at an element node, all Text nodes inside of the specified element are concatenated and used as the source text.
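The concatenation step might look something like the sketch below. It is an illustration only, not working Alph code: concatTextNodes is a hypothetical name, and the function relies on nothing beyond the nodeType, nodeValue and childNodes properties, so it can be exercised on plain objects as well as real DOM nodes.

```javascript
const TEXT_NODE = 3; // same value as the DOM's Node.TEXT_NODE

// Hypothetical helper: walk an element in document order and join the
// contents of every Text node found inside it.
function concatTextNodes(node) {
  if (node.nodeType === TEXT_NODE) {
    return node.nodeValue;
  }
  let text = "";
  // childNodes is a NodeList in a browser; Array.from makes it iterable
  // everywhere. Nodes without children contribute nothing.
  for (const child of Array.from(node.childNodes || [])) {
    text += concatTextNodes(child);
  }
  return text;
}

// A stand-in for <p>Hello, <em>world</em>!</p>
const paragraph = {
  nodeType: 1,
  childNodes: [
    { nodeType: 3, nodeValue: "Hello, " },
    { nodeType: 1, childNodes: [{ nodeType: 3, nodeValue: "world" }] },
    { nodeType: 3, nodeValue: "!" },
  ],
};
console.log(concatTextNodes(paragraph)); // "Hello, world!"
```

In a browser, the element's textContent property should give a similar result for ordinary HTML; the explicit walk is shown to make the rule itself visible.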

Fragment Selectors In src URLs

And ...okay, in theory some wise-ass might decide to put an RFC5147 text fragment selector into the src attribute of an x-text, like this:

<x-text src="http://host/path/file#char=123,456" origin="" extent=""></x-text>

Y'know what? That's fine. The Alph software will just dereference the src address and create an X-TEXT that looks like this:

<x-text src="http://host/path/file" origin="123" extent="456"></x-text>

However, if some diabolical wise-ass decides to put an RFC5147 fragment selector into the src attribute, AND they provide an origin/extent.... well, that's not such a big deal either. Alph will read the following element:

<x-text src="http://host/path/file#char=123,456" origin="78" extent="91"></x-text>

And turn it into:

<x-text src="http://host/path/file" origin="201" extent="214"></x-text>

That is, the origin and extent are each added to the origin given in the fragment selector (123 + 78 = 201, 123 + 91 = 214).
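Both rewrites can be sketched as a single normalization step. This is a guess at the shape of the logic, not the actual Alph implementation: normalizeXText is a hypothetical name, only the simple "#char=start,end" form of the selector is handled, and an element that supplies just one of origin/extent is treated by assumption (the supplied value is offset, the other comes from the selector).

```javascript
// Hypothetical normalizer (not from alph.js). Takes the three attribute
// values of an x-text element and returns the rewritten values.
function normalizeXText({ src, origin, extent }) {
  // Only the "#char=start,end" form is recognised; anything else is
  // returned untouched.
  const match = /^(.*)#char=(\d+),(\d+)$/.exec(src);
  if (!match) {
    return { src, origin, extent };
  }
  const [, baseUrl, fragStart, fragEnd] = match;
  const base = parseInt(fragStart, 10);
  const isBlank = (v) => v === null || v === undefined || v === "";
  return {
    src: baseUrl,
    // No origin/extent given: take the range straight from the selector.
    // Otherwise: offset both values by the selector's starting position.
    origin: String(isBlank(origin) ? base : base + parseInt(origin, 10)),
    extent: String(
      isBlank(extent) ? parseInt(fragEnd, 10) : base + parseInt(extent, 10)
    ),
  };
}

console.log(
  normalizeXText({
    src: "http://host/path/file#char=123,456",
    origin: "78",
    extent: "91",
  })
); // { src: 'http://host/path/file', origin: '201', extent: '214' }
```

Note that the sketch does not check whether the offset range still falls inside the selector's own 123 to 456 window; the draft does not say what should happen when it doesn't.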