Quidquid latine dictum sit, altum videtur

More NML

By Dave Menendez
Tuesday, October 7, 2003, at 2:20 AM

Summary: An astute query from Mitch “Dread” H. leads to some background about NML and some consideration of optional element end-markers.

In response to my last post, Mitch writes:

Dave, is this NML designed to trim down automatic handling of *ML? Because I can see where a human user of a NML-type language would quickly get lost in the serried ranks of ambiguous closing-tags. At least in XML, you can keep track of what’s with what due to the forced redundancy.

XML’s closing tags can be handy for detecting errors: because each one names the element it closes, a parser can notice a mismatch instead of silently pairing up the wrong delimiters. With NML, as with Lisp, one must be careful to keep all the delimiters balanced, or else things get kooky. SGML had an advantage here, because it allowed one to abbreviate an end tag to </> when the element being closed was obvious from context, e.g. <p>This is a <emphasis>short</> paragraph!</p>. (In fact, the opening tag can likewise be shortened to <>.) In other cases, tags could be omitted altogether when the document type made them inferable, but then nothing can be parsed without knowledge of the document type. One of XML’s goals was to be parsable even when nothing was known about a particular document’s type.

But I digress. Yes, I do think it’s likely that NML’s minimal end tags would get confusing in longer documents. Intelligent support from document editors would be handy here (most of the ones intended for programmers can already balance parentheses and brackets), but some mental aids in the syntax would probably be useful as well. One possibility is to allow for an optional element closer which repeats the tag name. That is, you could write an element as <section lots of content…> or <section lots of content… /section>. If you wanted to actually end a section element with the literal text “/section”, you could just insert some whitespace: <section /section >.
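To make the optional closer concrete, here is a minimal Python sketch of a parser for it. Almost everything beyond the two example forms above is an assumption made for illustration: that an element name runs up to the first whitespace, that one whitespace character separates the name from the content, that there is no escaping, and that a trailing "/name" which does not match the enclosing element is an error rather than literal text. The names Element, NMLError, parse, _parse_element and _strip_closer are all hypothetical.

```python
from dataclasses import dataclass, field


class NMLError(ValueError):
    """Raised when the input is not well-formed under this sketch's rules."""


@dataclass
class Element:
    name: str
    children: list = field(default_factory=list)  # mix of str and Element


def parse(src):
    """Parse a string holding exactly one top-level element."""
    elem, pos = _parse_element(src, 0)
    if src[pos:].strip():
        raise NMLError(f"unexpected text after offset {pos}")
    return elem


def _parse_element(src, pos):
    if pos >= len(src) or src[pos] != "<":
        raise NMLError(f"expected '<' at offset {pos}")
    pos += 1
    start = pos
    # Assumed rule: the name runs up to the first whitespace, '<', or '>'.
    while pos < len(src) and not src[pos].isspace() and src[pos] not in "<>":
        pos += 1
    name = src[start:pos]
    if not name:
        raise NMLError(f"missing element name at offset {start}")
    if pos < len(src) and src[pos].isspace():
        pos += 1  # drop the single separator between name and content
    elem = Element(name)
    run = []  # characters of the text run currently being collected
    while pos < len(src):
        ch = src[pos]
        if ch == "<":
            if run:
                elem.children.append("".join(run))
                run = []
            child, pos = _parse_element(src, pos)
            elem.children.append(child)
        elif ch == ">":
            text = _strip_closer("".join(run), name)
            if text:
                elem.children.append(text)
            return elem, pos + 1
        else:
            run.append(ch)
            pos += 1
    raise NMLError(f"element '{name}' is never closed")


def _strip_closer(text, name):
    """Remove a trailing '/name' closer, checking that it names this element."""
    tokens = text.split()
    last = tokens[-1] if tokens else ""
    # A closer must touch the '>' directly; trailing whitespace makes it literal.
    if len(last) < 2 or not last.startswith("/") or text[-1].isspace():
        return text
    if last[1:] != name:
        raise NMLError(f"closer '{last}' does not match element '{name}'")
    return text[: -len(last)].rstrip()
```

Under these assumptions, parse("<section lots of content>") and parse("<section lots of content /section>") produce equal trees, while "<section /section >" keeps "/section" as text. The mismatch check is where the redundancy would earn its keep: in "<section <p hello /section>" the inner element was never closed, and the parser complains about the closer instead of quietly attaching it to the wrong element.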

As for why I came up with this, my thoughts on the matter were sparked by a bizarre rant I came across in a programming discussion. NML is a thought experiment to answer the question, How simple can an XML-like markup language get? NML is terser than XML, which is a double-edged sword, but my primary qualms relate to attributes. With XML, it’s pretty simple to create, say, a scoped language tag, e.g. <p xml:lang='en'>He asked what “<abbr xml:lang='de'>GmbH</abbr>” meant.</p>. Here it’s pretty obvious what text the xml:lang attributes apply to. In NML, we would have to work with sub-elements that are scoped to their parents, which may or may not be more confusing. (Viewed as a tree, it’s the same situation whether we use attributes or elements, but it doesn’t feel the same to me. Then again, my programmer’s intuition is far from infallible.)

One wonders what the XML team would have come up with if backward compatibility with SGML hadn’t been a requirement.