ZedneWeb / Web threading
The Thread Description Language (TDL) is an RDF vocabulary for describing threaded discussions.
TDL represents discussion threads as collections of linked posts, where a post is a self-contained, atomic contribution to the discussion. The nature of posts varies somewhat between types of discussion, but a post is generally the work of a single participant. Weblog and message board posts and Usenet, e-mail, and IM messages are a few examples of posts.
Different types of discussions will organize their posts in different ways. TDL is designed to represent as many styles as possible using common features. (Because “thread” means different things in different places, it is not used in the vocabulary itself.)
Usenet and inter-weblog discussions make use of implicit threading. Rather than formally declaring threads, each post refers to one or more posts that precede it in the discussion. Given a collection of posts and information about their references, it is possible to arrange the posts in a tree or graph structure representing the discussion.
In TDL, the references between posts are indicated with the property refersTo.
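As a minimal sketch of implicit threading, the following N3 describes four posts connected only by refersTo. The ex: prefix and the post names are hypothetical, chosen for illustration; the TDL namespace is the one given in this document's prefix table.

```n3
@prefix : <http://www.eyrie.org/~zednenem/2002/web-threads/> .
@prefix ex: <http://example.org/discussion/> .

# Hypothetical posts; no thread is declared anywhere.
ex:post1 a :Post .
ex:post2 a :Post ; :refersTo ex:post1 .
ex:post3 a :Post ; :refersTo ex:post1 .

# A post may refer to more than one earlier post,
# so the result is a graph rather than a strict tree.
ex:post4 a :Post ; :refersTo ex:post2, ex:post3 .
```

A consumer that collects these statements can recover the shape of the discussion by following refersTo from each post back to the posts it answers.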
Many message board systems use a similar style called forked threading. Unlike implicit threads, forked threads are explicitly declared, because the centralized control of a message board makes it possible to know exactly which posts are part of the thread. This style is called forked because each post may be referenced by multiple posts, resulting in a branched, tree-like form for the thread. (Forked message boards are also frequently called “threaded”.)
In TDL, a specific collection of posts is represented as a Topic, and the lead post(s) in a topic can be indicated with initialPost.
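A forked thread might be sketched in N3 as follows. The ex: names are hypothetical, and the example assumes that initialPost takes the topic as its subject and the lead post as its object, which this document does not state outright. The posts are tied to the topic with partOf, the predicate this document adopts for that purpose.

```n3
@prefix : <http://www.eyrie.org/~zednenem/2002/web-threads/> .
@prefix ex: <http://example.org/board/> .

# Hypothetical message-board thread, declared explicitly as a Topic.
# Assumption: initialPost points from the topic to its lead post.
ex:thread42 a :Topic ;
    :initialPost ex:msg1 .

ex:msg1 a :Post ; :partOf ex:thread42 .

# Two replies to the same post give the thread its forked shape.
ex:msg2 a :Post ; :partOf ex:thread42 ; :refersTo ex:msg1 .
ex:msg3 a :Post ; :partOf ex:thread42 ; :refersTo ex:msg1 .
```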
The remaining message boards and IM conversations use linear threading. While forked and implicit threads arrange posts in graphs or trees based on what posts they refer to, linear threads arrange posts in a sequence, usually based on when the posts were created. (Linear message boards are frequently called “unthreaded”.)
In TDL, the property hasPosts indicates an ordered list of posts belonging to something, such as a Topic. [Issue 2]
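A linear thread could be written with an N3 list as the value of hasPosts, as in this sketch. The ex: names are hypothetical; the order of the list is the order of the posts in the topic.

```n3
@prefix : <http://www.eyrie.org/~zednenem/2002/web-threads/> .
@prefix ex: <http://example.org/chat/> .

# Hypothetical IM conversation with three messages, in order.
ex:conversation a :Topic ;
    :hasPosts ( ex:line1 ex:line2 ex:line3 ) .

ex:line1 a :Post .
ex:line2 a :Post .
ex:line3 a :Post .
```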
These styles of threading are not mutually exclusive. The same thread may be viewable as a sequence of posts and a tree. Similarly, while individual discussions in Usenet or among weblogs are not explicitly declared, at another level newsgroups and weblogs can be viewed as explicitly-declared threads.
The connections between posts and topics are many-to-many. A given topic will usually contain multiple posts—perhaps thousands—and a given post may belong to any number of topics. There are a number of ways to associate posts with a topic, but they fall into two broad categories.
A topic which has a value for hasPosts is closed. Only those posts which are part of the list given by hasPosts are part of the topic. This does not mean that the membership of the topic is fixed for all time. An RDF graph containing hasPosts is simply asserting that this set of posts and no others belong to the topic. It is up to implementations to decide which assertions to believe at what time.
In many cases, it is not practical or even possible to list every post in a topic. A topic which does not have a value for hasPosts is open. With an open topic, the only possible answers to the question “Is this post part of this topic?” are “Yes” and “Unknown”.
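For illustration, an open topic might look like the sketch below, which uses the partOf predicate described later in this document. The ex: names are hypothetical.

```n3
@prefix : <http://www.eyrie.org/~zednenem/2002/web-threads/> .
@prefix ex: <http://example.org/news/> .

# No hasPosts value is given for ex:group, so the topic is open.
ex:group a :Topic .

# These posts are known to be part of the topic.
ex:article1 a :Post ; :partOf ex:group .
ex:article2 a :Post ; :partOf ex:group .

# Nothing links ex:article3 to ex:group, so its membership
# is unknown rather than false.
ex:article3 a :Post .
```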
In this document, several prefixes are used to abbreviate references to existing namespaces. A term such as “dcterm:isPartOf” is actually shorthand for the URI reference “http://purl.org/dc/terms/isPartOf”. Terms with no prefix (or the empty N3 prefix “:”) are from the TDL namespace.
Prefix | URI Reference | Comment |
---|---|---|
: | http://www.eyrie.org/~zednenem/2002/web-threads/ | TDL |
dc: | http://purl.org/dc/elements/1.1/ | Dublin Core element set |
dcterm: | http://purl.org/dc/terms/ | Dublin Core qualifiers |
dctype: | http://purl.org/dc/dcmitype/ | DCMI resource types |
rdf: | http://www.w3.org/1999/02/22-rdf-syntax-ns# | RDF Core |
rdfc: | http://www.eyrie.org/~zednenem/2002/rdfchannel# | RDF Channel |
rdfs: | http://www.w3.org/2000/01/rdf-schema# | RDF Schema |
As always, it is the URI reference which is meaningful, not the particular prefix.
The term “resource” is used in different ways by different standards on the web. To avoid confusion, this document uses the term item to refer to any node in an RDF graph about which statements can be made (that is, everything except literals).
An RDF graph is a set of statements, such as the statements made by a single file. A graph asserts the truth of the statements it contains. A graph may be inconsistent if it asserts statements which contradict each other. This document does not describe how these inconsistencies should be dealt with.
TDL is defined in terms of RDF statements, not any particular serialization of RDF. Examples in this document are given using the N3 syntax, but TDL data may be expressed in any appropriate syntax. (An easily-parsed subset of RDF/XML could be described in an appendix, given interest.)
There are three ways to associate a post with a topic:
“P :partOf T.” asserts that P is part of T.
“T :hasPosts L.” asserts that each post in L is part of T and that any post not in L is not part of T.
“S :subtopicOf T.” asserts that each post which is part of S is also part of T.
Given these rules, it is possible for a graph to assert that a post is and is not part of a topic. This indicates that one or more of the statements which gave rise to the inconsistency is incorrect. Determining which statements are to be believed will depend on the circumstances and possibly user preferences.
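The following sketch shows how such a contradiction can arise from the rules for associating posts with topics. All ex: names are hypothetical.

```n3
@prefix : <http://www.eyrie.org/~zednenem/2002/web-threads/> .
@prefix ex: <http://example.org/forum/> .

# hasPosts closes the topic: only ex:a and ex:b belong to it.
ex:topic a :Topic ;
    :hasPosts ( ex:a ex:b ) .

# Contradiction by partOf: ex:c is asserted to be part of ex:topic,
# but it is not in the list above.
ex:c a :Post ; :partOf ex:topic .

# Contradiction by subtopicOf: ex:d is part of ex:sub, and every post
# in ex:sub is also part of ex:topic, yet ex:d is not in the list.
ex:sub a :Topic ; :subtopicOf ex:topic .
ex:d a :Post ; :partOf ex:sub .
```

Removing either the hasPosts statement or the two conflicting memberships would make the graph consistent again; which to discard is left to the implementation.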
A point of clarification: It is possible for an item to be a post and a topic. For instance, in a weblog which allows user comments, a post in the weblog would be considered the topic of its comments. Asserting that a combined topic/post P is part of a topic T does not entail that its posts are part of T. Asserting that P is a subtopic of T does entail that its posts are part of T.
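The weblog case might be sketched as follows, with hypothetical ex: names. The commented-out statement marks the difference between the two assertions.

```n3
@prefix : <http://www.eyrie.org/~zednenem/2002/web-threads/> .
@prefix ex: <http://example.org/weblog/> .

ex:weblog a :Topic .

# The entry is both a post in the weblog and the topic of its comments.
ex:entry a :Post, :Topic ; :partOf ex:weblog .
ex:comment a :Post ; :partOf ex:entry .

# As written, nothing entails that ex:comment is part of ex:weblog.
# Adding the statement below would entail it:
# ex:entry :subtopicOf ex:weblog .
```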
The value of hasPosts is an ordered sequence of posts. Depending on the nature of a topic, this order may or may not be significant at a user level. [Issue 2] A topic has only one ordering of posts. This means that post A may come before or after post B in topic T, but not both. A graph which asserts multiple values of hasPosts for a single topic is inconsistent unless all the values give the same posts in the same order.
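As a short illustration of the single-ordering rule, the sketch below contrasts a topic with two agreeing hasPosts values and a topic with two conflicting ones. The ex: names are hypothetical.

```n3
@prefix : <http://www.eyrie.org/~zednenem/2002/web-threads/> .
@prefix ex: <http://example.org/forum/> .

# Consistent: both values give the same posts in the same order.
ex:t1 a :Topic ;
    :hasPosts ( ex:a ex:b ) ;
    :hasPosts ( ex:a ex:b ) .

# Inconsistent: ex:a cannot come both before and after ex:b.
ex:t2 a :Topic ;
    :hasPosts ( ex:a ex:b ) ;
    :hasPosts ( ex:b ex:a ) .
```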
The subtopic properties make different statements about how the order of posts in the subtopic is reflected in the larger topic. Given a topic S which contains only the posts A and B in that order and another topic T:
“S :subtopicOf T.” asserts that A and B are part of T, but says nothing about their order in T.
“S :categoryOf T.” asserts that A and B are part of T and that A comes before B in T, but there may be other posts between them.
“S :segmentOf T.” asserts that A and B are part of T and that A comes before B in T and that no other post is between A and B in T.
Thus, it is possible to indicate whether one post precedes another in a topic without needing to list every post in the topic. For example, one might use an anonymous category to state that a post A precedes a post B in topic T:
[ a :Topic; :categoryOf T; :hasPosts ( A B ) ].
Or, to indicate that A is the immediate predecessor of B:
[ a :Topic; :segmentOf T; :hasPosts ( A B ) ].
Currently, dcterm:isPartOf associates a Post with a Topic, but does not imply a subtopic relation. That is:
A a :Post; dcterm:isPartOf B.
B a :Post, :Topic; dcterm:isPartOf C.
Does not entail:
A dcterm:isPartOf C.
However, the natural-language interpretation of dcterm:isPartOf implies that if A is part of B and B is part of C, then A is part of C. It may be more logical to reintroduce inTopic or some similar term to associate posts with topics.
Resolution: A new predicate, partOf, will be used instead of dcterm:isPartOf.
Currently, it is inconsistent to have different post sequences for a single topic. However, some types of topics, such as Usenet newsgroups, do not have a well-defined ordering of posts. In these cases, it is not inconsistent for one set of statements to assert that A precedes B and another to assert that B precedes A, because the ordering is unimportant.
On the other hand, it is possible to say nothing about the order of posts in a topic by not asserting a value for hasPosts and only asserting subtopics with subtopicOf. For a newsgroup, the large volume of posts makes it unlikely that any value could be accurately asserted for hasPosts.
Perhaps it would be reasonable to create a Newsgroup subclass of Topic with the implication that its posts have no inherent order?
Favored resolution: A subclass of Topic will be created for Topics whose posts have a specific order. segmentOf and categoryOf will use this class as their domain and range. The name of this subclass is currently undecided; possibilities include “LinearTopic”, “OrderedTopic”, “Sequence”, and “Thread”.
In addition to a reorganization of the documentation, this document also includes major changes to the TDL vocabulary since the last version. These are: