Alph Developer Notes, 2016

I bashed out the skeleton of the Docuplextron and the Alph server in July of 2016. I thought I was onto something good, and a lot of what I was writing and thinking about wasn't technical in nature, but rather more conceptual/philosophical – I was trying to explain to myself what I was doing, I think, as preparation for the inevitable task of explaining it to others. Xanadu advocacy has always been a key part of this project's mission.

I was still getting some concepts wrong. I'd not yet laid hands on a copy of Computer Lib/Dream Machines at this point (Ted gave me one a few months later, 'cause he's a cool dude), although I'd read everything of Nelson's that I could find online, and I owned a copy of Possiplex. Linking, in particular, I did not yet grok with adequacy.

These entries are sourced from a permascroll that I was keeping on my local machine. It's been copied onto the alph.io server here: <http://alph.io/xpub/history/alph-notes>.

—L.



===2016-07-23T08:09:21.195Z===
ALPH Notes
----------

Fundamental concepts of Xanalogical hypermedia:

1. Separation of form and content
2. Atomic media addressing and retrieval
3. Two-way linking

-----

- Don't fight copy/paste culture — just make copy/paste MUCH MORE POWERFUL.

- Authors don't have to fight reuse, redistribution, or recontextualization when their material always remains linked back to them.

- Centralized, highly-integrated systems are stupid. The Web was decentralized, heterogeneous, flexible, egalitarian at its onset, and it must remain so.

- Commerce is inevitable, culture is incidental. We don't have to fight for commerce, we have to fight for culture — unless, that is, we're happy with a culture of commerce.

- A Web with proper linking will deprecate the search engine.
===2016-07-23T17:46:09.262Z===
To empower the literature of the future is to empower the *literacy* of the future.
---2016-08-06T03:49:03.859Z
Where Nelson uses the term "document", I sometimes use the term "composition". In ALPH, we're using HTML(w/CSS) as a kind of compositing language. It defines the presentation of media elements which come from outside itself, and it imposes semantics and logical structure upon semantically ambiguous unstructured media. An HTML file may contain multiple compositions, or fragments which may be used as compositions in their own right.

This is, obviously, the way that HTML is already used for audiovisual media (<img>, <audio>, and <video> tags all have "src" attributes, don't they?), but it is not the way that it is used for text.
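
A minimal sketch of the point in browser terms (the URLs are placeholders, not real resources): the audiovisual parts of a composition already reference content outside the HTML file, while the text normally lives inline.

// Illustrative only: an HTML "composition" assembled in the browser. The
// <img> and <audio> pull their content from external sources; the <p> text
// is the part that still lives inside the markup -- the part Alph wants to
// externalize as well.
document.body.insertAdjacentHTML("beforeend", `
  <article>
    <img src="http://example.org/media/harbour.png">
    <audio controls src="http://example.org/media/interview.ogg"></audio>
    <p>This caption is stored in the HTML file itself.</p>
  </article>
`);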
-----
Yes, text data is fundamentally unstructured. Don't argue with me on this. It is a flat sequence of codepoints. Structure may be implicit in the content of the text, but there is no explicit semantic or logical structure (beyond linear sequence) in plain-text. The same is true of sound, images, and video.
-----
Audio files containing music usually have good metadata. Video files, less so. With images, some formats have metadata, some don't, and amongst the ones that do, the metadata can be in a variety of different formats. With text files... it's a complete mess.

The World Wide Web has, since its inception, sometimes been described as a publishing system. It is not. It is a distribution network, certainly, but it is not a publishing system.
---2016-08-09T02:30:07.686Z
Obviate Corporate Social Media

In conjunction with an open publication/subscription protocol like Pump.io or GNU social, Xanalogical hypermedia can replace closed social media services by making every document, and every *portion* of a document, annotatable/commentable -- and those comments themselves become commentable.

This doesn't just obviate Facebook and Twitter -- this obviates SoundCloud and Flickr and other media-sharing sites that feature comments/annotation of audiovisual media. 

With Xanalogical hypermedia, any sound file, any image, any video, any text, anywhere on the Web, can be commented-on by anyone, from their own site, without having to subscribe to any services.
---2016-08-09T03:30:48.599Z
I'm not sure anyone even thinks about the Web as a hypermedia system anymore -- because the "hyper" in hypermedia was never really understood by most people, I think.

In the very early days, the Web was, conceptually, a network of DOCUMENT SERVERS. But the documents were stupid. There was never anything "hyper" about them -- they were just files.

Then there was a shift towards dynamically-generated content. Documents went out the window as the Web became a network of SITES, where each site was a DATABASE APPLICATION which generated a document-based USER INTERFACE.

About a decade ago, the capabilities of Web browsers -- originally just static document viewers -- matured *a lot*, and there was a third concept shift toward Web SERVICES and APPLICATION PROGRAMMING INTERFACES. That's where we are now: we have a Web of services and in-browser applications.

Where's the hypertext? 

That ball was dropped long, long ago. I don't know why. Because it's conceptually very simple:

Hypermedia is media that knows itself (has metadata), has a body (is deeply addressable), knows its place in the universe (has links to other media), and can respond to queries about any of these things.

An image file on a Web server is not hypermedia. It's network media, but it's not hypermedia. But an image file that can tell you its age, its dimensions and format, its title, its author, its relationships with other network media, and that can show you part of itself, or have part of itself referred to -- THAT is hypermedia.
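
A sketch of the kind of queries this implies, assuming an Alph-style server that answers ?describe and ?link on any resource URL (the response fields here are illustrative, not a settled format):

// Illustrative only: asking a network image about itself and about its links.
const resource = "http://example.org/media/harbour.png";

fetch(resource + "?describe")
  .then(r => r.json())
  .then(meta => console.log(meta.title, meta.author, meta.format));

fetch(resource + "?link")
  .then(r => r.json())
  .then(links => console.log(links.length, "links known for this image"));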
---2016-08-09T19:39:19.552Z
In the case of text, the "atom" is conceptually sound as an addressing unit. The text atom (codepoint) is an indivisible unit, and any interpolation performed on textual data will destroy the content -- the symbolic value -- of the text. The cases of PCM audio and raster images are different, however. A digital photograph is measured in a discrete, indivisible atom -- the pixel -- but those atoms are only samples of an analogue field which has no native "resolution". The same is true of PCM audio -- the "sample" is just that: a sample of an infinitely divisible waveform. This means that audio and images may be interpolated into different dimensions yet retain their "meaning". They do incur information loss relative to the amount of interpolation done to them, but this is a change in the quality of the content, not in the nature of the content. With video, where the atom is the frame, interpolation of frame-rates generally takes the form of frame skipping or duplication, yet this, too, is a change in haecceity, but not in hypokeimenon.

This raises some issues. 

Different stable-media versions of the same audiovisual subject will inevitably be made available. They might be different files and formats at different URLs (ex.: protocol://host.domain/path/file.format1, protocol://host.domain/path/file.format2, ...), or there might be a single stable URL that accepts a format query parameter. In the former case, we would want those different URLs to return the same ?describe and, critically, ?link data.

In XRD/JRD, in addition to a "subject" element, there are "alias" elements. We could utilize that in conjunction with a "use" element in a resource's metadata. For example:

A "master" meta/link record for p://h.d/path/file contains

{ "subject":"p://h.d/path/file",
  "alias":["p://h.d/path.file.format",
           "p://h.d/path/file.format2"]
}

While the meta/link record for the alias URLs (p://h.d/path/file.format, p://h.d/path/file.format2) contains only:

{ "use":"p://h.d/path/file" }

This USE directive tells the client that all metadata and linking queries should be directed to the specified URL.

The "alias" object in the master meta/link record should be used as a sort-of authorization: clients, when redirected by a USE record, should check the alias object of the redirection target and make sure that the redirecting resource is named there. 

Why? I'm not sure. Proof of authorized content mirroring is one application, but ...I can't think of why else it's important right now, tho' it probably is. I'm riffing off of something I read in the WebFinger spec, I think.
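
As a sketch of that client-side check (the function and the assumption that these records come back from ?describe are mine, illustrative only):

// Resolve a resource's meta/link record, honouring a "use" redirection only
// if the master record's "alias" list names the resource that redirected us.
async function resolveMetaRecord(url) {
  const record = await fetch(url + "?describe").then(r => r.json());
  if (!record.use) return record;                     // already a master record
  const master = await fetch(record.use + "?describe").then(r => r.json());
  const aliases = master.alias || [];
  if (!aliases.includes(url)) {
    throw new Error("USE target does not list this resource as an alias");
  }
  return master;
}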

This is ultimately moot though, because two formats of the same resource, when using "atoms" for addressing, even when linked to the same master resource, will result in different addressing schemes -- different "spans" being recorded in the link -- and that will complicate the process of querying fragment links considerably. Unnecessarily, I should think.

As much as I love the idea of atomic addressability, this situation simply demands that I abandon it for audiovisual media.

For audio and video, the addressing unit must be the second, and the server must accept arbitrary-precision floating-point values for origin and offset.
 
For images, the addressing unit must be a fractional representation of width and height, a floating-point value between 0 and 1 for X and Y. 
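
For illustration only -- none of this syntax is settled -- selections in those units might be computed like so:

// Hypothetical helpers: audio/video spans as floating-point seconds (origin
// and offset), image regions as fractions of width and height between 0 and 1.
function audioSpan(originSeconds, offsetSeconds) {
  return { origin: originSeconds, offset: offsetSeconds };
}
function imageRegion(x, y, w, h, imgWidth, imgHeight) {
  return {
    x: x / imgWidth,
    y: y / imgHeight,
    width: w / imgWidth,
    height: h / imgHeight
  };
}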
---2016-08-10T20:25:26.652Z
Project Goals:
--------------
1. Devise a simple protocol for Xanalogical hypertext on the Web.
2. Write a simple server implementation (alph.py) that can run on shared hosting without too much fuss.
3. Write a basic client program in Javascript (alph.js) that will enable transclusion of text into HTML and provide basic metadata and link querying; this needs to be something that anyone can include in a <script> tag on their Web sites/pages to allow them to start using Xanalogical hypertext *now*, without creating any kind of barrier ("you must install X to see this site") between authors and their audiences.
4. Create a full-featured Xanalogical authoring environment (The Docuplextron) with transpointing windows and easy HTML editing as a browser add-on.  
---2016-08-12T06:25:55.690Z
How to explain the difference between Web architecture and Xu/Alph architecture?

Web: client/server, request/response ← that remains true of Xu/Alph, but there's a conceptual middle-layer in the latter that is important.

Links are fundamentally different. In the Web, a "link" is merely a pointer to a resource. In Alph, we call that an anchor. In Alph, a link exists when two resources have reciprocating anchors.
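
A sketch of the distinction, with record shapes that are illustrative rather than prescriptive:

// Each resource carries an anchor naming the other; the link, in the Alph
// sense, is the reciprocated pair, not either pointer alone.
const anchorHeldByA = {
  subject: "http://a.example/essay",
  span: "1007-1020",                       // the portion of A being anchored
  target: "http://b.example/photo.png"
};
const anchorHeldByB = {
  subject: "http://b.example/photo.png",
  target: "http://a.example/essay"
};
// A Web "link" is just anchorHeldByA on its own; an Alph link exists only
// because B's anchor points back at A.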

---2016-08-12T07:44:56.838Z
There is this, from Possiplex (p. 356): "links (xanalinks or overlays) are free-standing entities, individually published or privately downloadable, separate from content."

Now... how does that play with what I've written earlier tonight about links? I've known Ted's position on this, that links are free-standing entities, separate from stabilized content, and I believe that we're on the same page here, except our terminology is slightly different. 

As I see it, an Alph-HTML file is a context mediator, a network media composition (and, if published to the world, a link-instantiator, as transclusions are a type of link that the server reciprocates) made from a hierarchy of semantically-tagged anchor-bearing structures. In this way, links (anchors, in my words) are individually published or privately downloadable. This is what I've always had in mind for the Docuplextron, that docuplexes could be published and shared, and these would be nothing more than big HTML files, big media compositions, big sequences of links (anchors), right? Right.

This all goes back to Bush's Memex, and his description of being able to share "trails" privately, or to publish them publicly. 
---2016-08-13T09:30:43.028Z
So, you'll have to refresh my memory on what happens when I commit some text to a new yaddayadadah 
---2016-08-14T19:45:23.229Z
How do we make HTML work to our ends? Well, with audiovisual media it's easy, because it's already handled in an essentially Xanalogical way (data is loaded from disparate network sources as the "page" renders); transclusion parameters have to be parsed out of the "src" attribute of these media, but it's not difficult/expensive to do that. Text is the sticky wicket; it's handled entirely differently (text is either stored inside of the HTML document on the server, or it is stored in a database on the server and packaged in an HTML container upon request), but it's easy to bend to our will by enforcing (with browser extensions/scripts) a policy of no Text nodes outside of a <span> element, with each <span> containing only one contiguous segment of text from a single source.
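
A rough sketch of what enforcing that policy might look like in a content script (the function is mine, illustrative only):

// Wrap any stray Text node in its own <span>, so that every contiguous run
// of text has an element that can carry source/span attributes.
function wrapLooseTextNodes(root) {
  const walker = document.createTreeWalker(root, NodeFilter.SHOW_TEXT);
  const loose = [];
  let node;
  while ((node = walker.nextNode())) {
    if (node.parentElement && node.parentElement.tagName !== "SPAN"
        && node.textContent.trim() !== "") {
      loose.push(node);
    }
  }
  for (const textNode of loose) {
    const span = document.createElement("span");
    textNode.parentNode.insertBefore(span, textNode);
    span.appendChild(textNode);              // one contiguous segment per <span>
  }
}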

[2016-08-18T19:45:10.691Z]:I was working on fixing up the copy/paste code in alph.js, and was testing my work in my ALPH Notes file, when I noticed that I'd created this little gem of gibberish:

<http://localhost/xpub/alph-notes?fragment=640-687`790-803`911-927`1007-1020`687-764>

Now, this is a silly example, but this sort of thing has always been a point of interest with me: doing cut-and-paste with text, making new texts from old texts, and retaining the connections to source. *That* is hypertext! That is the sort of thing that Nelson is talking about when he speaks of new forms of literature. He could see it clearly in the 1960s — and could never, ever be accused of not making an effort to share the idea with people — and here we are, a half-century later, and it's not being done, even though it's been possible for decades.
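
As a sketch of what a client (or server) has to do with a multi-span fragment like that one -- treating the numbers as plain string indices, which glosses over codepoint-versus-code-unit details:

// Rebuild a text from a multi-span ?fragment value: split on the backtick
// separator and concatenate each span of the source text in the order given.
function assembleFragment(sourceText, fragmentValue) {
  return fragmentValue
    .split("`")
    .map(span => {
      const [start, end] = span.split("-").map(Number);
      return sourceText.slice(start, end);
    })
    .join("");
}

// assembleFragment(notesText, "640-687`790-803`911-927`1007-1020`687-764");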

[2016-08-18T20:40:21.590Z]:Defining a query syntax for ?link is going to have to be part of the specification, but it's probably the thing I've thought about the least. It should support key/value searches — something like: prot://host.dom/file?link=rel:*foo* should return all links with a "rel" property that contains the string "foo" in its value. Then I'll need the usual booleans, so ?link=rel:*foo*`OR`rel:*bar`AND`*date:*2016* would give links with a relation containing "foo", or ending in "bar", and which have some kind of date property containing "2016" in the value. I should look into existing JSON query languages, if any exist.
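
A sketch of the value-matching half of such a query -- my reading of the examples above, not a settled grammar; key wildcards like *date and the OR/AND combinators are left to the caller here:

// "rel:*foo*" means the "rel" property contains "foo"; a leading or trailing
// asterisk acts as a wildcard.
function matchTerm(link, term) {
  const [key, pattern] = term.split(":");
  const value = String(link[key] ?? "");
  const core = pattern.replace(/^\*|\*$/g, "");
  if (pattern.startsWith("*") && pattern.endsWith("*")) return value.includes(core);
  if (pattern.startsWith("*")) return value.endsWith(core);
  if (pattern.endsWith("*")) return value.startsWith(core);
  return value === core;
}

// e.g. links.filter(l => matchTerm(l, "rel:*foo*") || matchTerm(l, "rel:*bar"));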

[2016-08-19T22:29:10.441Z]:Web linking versus Xanalogical linking...

...the only way to refer to a portion of an HTML document is by using a fragment identifier...

...the fragment identifier is not (typically) sent to the server; it's a directive for the client (browser), which retrieves the entire resource, to display the named fragment once the resource is loaded.  The other shortcoming of this is that fragment identifiers must be placed, either manually or by some automatic mechanism (a Purple Numbers implementation, for example <https://en.wikipedia.org/wiki/Purple_Numbers>), into the HTML document -- there is no inherent mechanism for addressing a portion of a document in an HTTP URL (tho' powerful query languages do exist: XPath, XQuery, XSLT...).

...this works well for systems documentation, legal codes, technical writing, and the Bible, where sections and subsections are named and numbered in an outline.  For journalism, correspondence, and narrative prose, however, this is useless. No-one numbers the sections, subsections, and paragraphs of these kinds of compositions. 

...a Xanalogical content-link links to every instance of a particular media fragment in every document, everywhere.

[2016-08-20T03:29:19.163Z]:Just migrated the whole system from using <span> elements with custom attributes over to using custom <x-text> elements. Went quickly, and — so far — smoothly.
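
For the record, a minimal sketch of what such an element could look like (the attribute names and fetch behaviour here are illustrative, not the actual Docuplextron code):

// A custom element that fills itself with a span of text fetched from an
// external source document, instead of carrying the text inline.
class XText extends HTMLElement {
  connectedCallback() {
    const src = this.getAttribute("src");     // source document URL
    const span = this.getAttribute("span");   // e.g. "640-687"
    if (src && span && this.textContent.trim() === "") {
      fetch(`${src}?fragment=${span}`)
        .then(r => r.text())
        .then(text => { this.textContent = text; });
    }
  }
}
customElements.define("x-text", XText);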