Thread Description Language

Current version:
<http://www.eyrie.org/~zednenem/2002/web-threads/>
This version:
<http://www.eyrie.org/~zednenem/2002/web-threads/20021101.html>
Previous version:
<http://www.eyrie.org/~zednenem/2002/web-threads/20020905.html>

Table of contents

  1. Introduction
    1. Styles of threading
    2. Conventions used in this document
  2. Core vocabulary
    1. Classes
    2. Properties
  3. Usage notes
    1. Associating posts with topics
    2. Determining post order within a topic
  4. Additional vocabulary for weblogs
    1. Classes
    2. Properties
  5. Further reading
  6. Issues
  7. Changes from previous version

Introduction

The Thread Description Language (TDL) is an RDF vocabulary for describing threaded discussions, including those conducted through weblogs, message boards, Usenet, e-mail, and instant messaging.

Styles of threading

TDL represents discussion threads as collections of linked posts, where a post is a self-contained, atomic contribution to the discussion. The nature of posts varies somewhat across different types of discussions, but a post is generally the work of a single participant. Weblog and message board posts and Usenet, e-mail, and IM messages are a few examples of posts.

Different types of discussions will organize their posts in different ways. TDL is designed to represent as many styles as possible using common features. (Because “thread” means different things in different places, it is not used in the vocabulary itself.)

Usenet and inter-weblog discussions make use of implicit threading. Rather than formally declaring threads, each post makes references to one or more posts that precede it in the discussion. Given a collection of posts and information about their references, it is possible to arrange these posts in a tree or graph structure representing the discussion.

In TDL, the references between posts are indicated with the property refersTo.
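
As a minimal sketch of implicit threading, the following N3 describes three posts linked only by refersTo. The ex: prefix and the post URIs under http://example.org/ are placeholders invented for this illustration; they are not part of TDL.

```n3
@prefix :   <http://www.eyrie.org/~zednenem/2002/web-threads/> .
@prefix ex: <http://example.org/posts/> .    # hypothetical post URIs

# No thread is declared anywhere; only the references are stated.
ex:post1 a :Post .
ex:post2 a :Post; :refersTo ex:post1 .
ex:post3 a :Post; :refersTo ex:post1, ex:post2 .
```

From these statements alone, a consumer could arrange the posts into a graph in which post2 follows post1 and post3 follows both.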

Many message board systems use a similar style called forked threading. Unlike implicit threads, forked threads are explicitly declared, as the centralized control of a message board makes it possible to know exactly which posts are part of the thread. This style is called forked because each post may be referenced by multiple posts, resulting in a branched, tree-like form for the thread. (Forked message boards are also frequently called “threaded”.)

In TDL, a specific collection of posts is represented as a Topic, and the lead post(s) in a topic can be indicated with initialPost.
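
A forked thread might be described along the following lines. This is only a sketch: the ex: URIs are hypothetical, and the use of partOf and commentsOn (both defined in the core vocabulary) to attach replies is one plausible arrangement rather than a required one.

```n3
@prefix :   <http://www.eyrie.org/~zednenem/2002/web-threads/> .
@prefix ex: <http://example.org/board/> .    # hypothetical message board URIs

# The thread is declared explicitly, along with its lead post.
ex:thread42 a :Topic;
    :initialPost ex:post1 .

ex:post1 a :Post; :partOf ex:thread42 .

# Two replies to the same post produce a fork.
ex:post2 a :Post; :partOf ex:thread42; :commentsOn ex:post1 .
ex:post3 a :Post; :partOf ex:thread42; :commentsOn ex:post1 .

# A reply further down one branch.
ex:post4 a :Post; :partOf ex:thread42; :commentsOn ex:post2 .
```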

The remaining message boards and IM conversations use linear threading. While forked and implicit threads arrange posts in graphs or trees based on what posts they refer to, linear threads arrange posts in a sequence, usually based on when the posts were created. (Linear message boards are frequently called “unthreaded”.)

In TDL, the property hasPosts indicates an ordered list of posts belonging to something, such as a Topic. [Issue 2]
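
For illustration, a linear thread such as an IM conversation could be sketched as a Topic with a single ordered list. The ex: URIs are hypothetical.

```n3
@prefix :   <http://www.eyrie.org/~zednenem/2002/web-threads/> .
@prefix ex: <http://example.org/chat/> .    # hypothetical message URIs

# The N3 list syntax ( ... ) produces an rdf:List, so order is preserved.
ex:conversation7 a :Topic;
    :hasPosts ( ex:msg1 ex:msg2 ex:msg3 ) .
```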

These styles of threading are not mutually exclusive. The same thread may be viewable both as a sequence of posts and as a tree. Similarly, while individual discussions in Usenet or among weblogs are not explicitly declared, at another level newsgroups and weblogs can be viewed as explicitly declared threads.

The connections between posts and topics are many-to-many. A given topic will usually contain multiple posts—perhaps thousands—and a given post may belong to any number of topics. There are a number of ways to associate posts with a topic, but they fall into two broad categories.

A topic which has a value for hasPosts is closed. Only those posts which are part of the list given by hasPosts are part of the topic. This does not mean that the membership of the topic is fixed for all time. An RDF graph containing hasPosts is simply asserting that this set of posts and no others belong to the topic. It is up to implementations to decide which assertions to believe at what time.

In many cases, it is not practical or even possible to list every post in a topic. A topic which does not have a value for hasPosts is open. With an open topic, the only possible answers to the question “Is this post part of this topic?” are “Yes” and “Unknown”.
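
A minimal sketch of an open topic, using hypothetical ex: URIs: membership is asserted post by post with partOf, and no value is given for hasPosts.

```n3
@prefix :   <http://www.eyrie.org/~zednenem/2002/web-threads/> .
@prefix ex: <http://example.org/news/> .    # hypothetical URIs

ex:rdfDiscussion a :Topic .    # no hasPosts value, so the topic is open

ex:post9  a :Post; :partOf ex:rdfDiscussion .    # "Yes"
ex:post12 a :Post; :partOf ex:rdfDiscussion .    # "Yes"
ex:post15 a :Post .                              # "Unknown": nothing is asserted either way
```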

Conventions used in this document

In this document, several prefixes are used to abbreviate references to existing namespaces. A term such as “dcterm:isPartOf” is actually shorthand for the URI reference “http://purl.org/dc/terms/isPartOf”. Terms with no prefix (or the empty N3 prefix “:”) are from the TDL namespace.

Namespace prefixes used in this document
Prefix  URI Reference                                     Comment
(none)  http://www.eyrie.org/~zednenem/2002/web-threads/  TDL
dc:     http://purl.org/dc/elements/1.1/                  Dublin Core element set
dcterm: http://purl.org/dc/terms/                         Dublin Core qualifiers
dctype: http://purl.org/dc/dcmitype/                      DCMI resource types
rdf:    http://www.w3.org/1999/02/22-rdf-syntax-ns#       RDF Core
rdfc:   http://www.eyrie.org/~zednenem/2002/rdfchannel#   RDF Channel
rdfs:   http://www.w3.org/2000/01/rdf-schema#             RDF Schema

As always, it is the URI reference which is meaningful, not the particular prefix.
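
In N3, these conventions might be written as the prefix declarations below. The tdl: prefix and the example.org URIs are assumptions introduced only to show that the choice of prefix does not change the meaning.

```n3
@prefix :       <http://www.eyrie.org/~zednenem/2002/web-threads/> .
@prefix tdl:    <http://www.eyrie.org/~zednenem/2002/web-threads/> .    # alternative prefix, for illustration only
@prefix dcterm: <http://purl.org/dc/terms/> .

# These two lines make the same statement: both class names expand to
# http://www.eyrie.org/~zednenem/2002/web-threads/Post
<http://example.org/posts/1> a :Post .
<http://example.org/posts/1> a tdl:Post .

# dcterm:isPartOf expands to http://purl.org/dc/terms/isPartOf
<http://example.org/posts/1> dcterm:isPartOf <http://example.org/archive> .
```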

The term “resource” is used in different ways by different standards on the web. To avoid confusion, this document uses the term item to refer to any node in an RDF graph about which statements can be made (that is, everything except literals).

An RDF graph is a set of statements, such as the statements made by a single file. A graph asserts the truth of the statements it contains. A graph is inconsistent if it asserts statements which contradict each other. This document does not describe how such inconsistencies should be dealt with.

TDL is defined in terms of RDF statements, not any particular serialization of RDF. Examples in this document are given using the N3 syntax, but TDL data may be expressed in any appropriate syntax. (An easily-parsed subset of RDF/XML could be described in an appendix, given interest.)

Core vocabulary

Classes

Post
  A single part of a discussion, such as a weblog or message board post, or a Usenet, e-mail, or IM message.
  Subclass of: dctype:Text

Topic
  A set of posts, usually connected in some way, such as being a discussion thread, a category in a weblog, or a query response.
  Subclass of: dctype:Collection

Properties

agreesWith
  Indicates an item which this post agrees with or confirms.
  Sub-property of: commentsOn
  Domain: Post

categoryOf
  Indicates that this topic is a subtopic of the specified topic. The posts in this topic are all part of the larger topic, in the same order relative to each other.
  Sub-property of: subtopicOf
  Domain: Topic [Issue 2]
  Range: Topic [Issue 2]

commentsOn
  Indicates an item which this post discusses or responds to.
  Sub-property of: refersTo
  Domain: Post

concludingPost
  Indicates a post which concludes this topic.
  Sub-property of: dcterm:hasPart
  Domain: Topic
  Range: Post

content
  Indicates a text or XML fragment representing the content of this post.
  Domain: Post

currentPosts
  Indicates a collection of posts which are considered current, such as the posts mirrored on the front page of a weblog.
  Sub-property of: rdfc:current
  Domain: Topic, rdfc:Channel
  Range: rdf:List (containing Posts)

disagreesWith
  Indicates an item which this post rebuts or presents evidence contrary to.
  Sub-property of: commentsOn
  Domain: Post

excerpt
  Indicates a text or XML fragment representing some portion of the content of this post.
  Domain: Post

followsUp
  Indicates a post, usually from the same author or publisher, which this post updates or corrects.
  Sub-property of: refersTo
  Domain: Post
  Range: Post

hasPosts
  Indicates the set of posts which are logically part of this item (usually a Topic). Posts which are not in the list are not part of this item.
  Sub-property of: dcterm:hasPart
  Range: rdf:List (containing Posts)

hasTopics
  Indicates the set of topics which are formally part of this item (usually a Topic).
  Sub-property of: dcterm:hasPart
  Range: rdf:List (containing Topics)

initialPost
  Indicates a post which initiates this topic.
  Sub-property of: dcterm:hasPart
  Domain: Topic
  Range: Post

partOf
  Indicates a topic to which this post belongs.
  Sub-property of: dcterm:isPartOf
  Domain: Post
  Range: Topic

pointsTo
  Indicates an item which this post refers to but does not discuss.
  Sub-property of: refersTo
  Domain: Post

quotes
  Indicates an item which this post quotes.
  Sub-property of: refersTo
  Domain: Post

refersTo
  Indicates an item which this post references.
  Sub-property of: dcterm:references
  Domain: Post

segmentOf
  Indicates that this topic is a subtopic of the specified topic. The posts in this topic are all part of the larger topic, contiguously and in the same order.
  Sub-property of: subtopicOf
  Domain: Topic [Issue 2]
  Range: Topic [Issue 2]

subtopicOf
  Indicates that this topic is a subtopic of the specified topic. The posts in this topic are all part of the larger topic.
  Sub-property of: dcterm:isPartOf
  Domain: Topic
  Range: Topic
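
To show how several of these properties might combine, here is a hedged sketch of two posts responding to a third. All ex: URIs, the example.net address, and the excerpt text are invented for illustration; the three sub-property statements simply restate definitions from the list above.

```n3
@prefix :     <http://www.eyrie.org/~zednenem/2002/web-threads/> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix ex:   <http://example.org/posts/> .    # hypothetical post URIs

# Restated from the property definitions.
:disagreesWith rdfs:subPropertyOf :commentsOn .
:agreesWith    rdfs:subPropertyOf :commentsOn .
:commentsOn    rdfs:subPropertyOf :refersTo .

ex:post1 a :Post .

# A rebuttal which quotes the original and links to background material.
ex:post2 a :Post;
    :disagreesWith ex:post1;
    :quotes        ex:post1;
    :pointsTo      <http://example.net/background-article>;
    :excerpt       "The argument overlooks one case." .

# A later correction by the author of post2, which sides with post1 after all.
ex:post3 a :Post;
    :followsUp  ex:post2;
    :agreesWith ex:post1 .
```

Because disagreesWith is a sub-property of commentsOn, which is in turn a sub-property of refersTo, an RDFS-aware consumer could also conclude that ex:post2 refersTo ex:post1 without that statement being written out.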

Usage notes

Associating posts with topics

There are three ways to associate a post with a topic:

Given these rules, it is possible for a graph to assert that a post both is and is not part of a topic. This indicates that one or more of the statements which gave rise to the inconsistency is incorrect. Determining which statements are to be believed will depend on the circumstances and possibly user preferences.

A point of clarification: It is possible for an item to be a post and a topic. For instance, in a weblog which allows user comments, a post in the weblog would be considered the topic of its comments. Asserting that a combined topic/post P is part of a topic T does not entail that its posts are part of T. Asserting that P is a subtopic of T does entail that its posts are part of T.
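
The distinction might be sketched as follows for a weblog entry that accepts comments. The ex: URIs are hypothetical.

```n3
@prefix :   <http://www.eyrie.org/~zednenem/2002/web-threads/> .
@prefix ex: <http://example.org/weblog/> .    # hypothetical URIs

ex:log a :Topic .

# The entry is a post in the weblog and also the topic of its comments.
ex:entry5 a :Post, :Topic;
    :partOf ex:log .

ex:comment1 a :Post;
    :partOf     ex:entry5;
    :commentsOn ex:entry5 .

# The statements above do NOT entail:
#     ex:comment1 :partOf ex:log .
#
# Adding the following statement would make the comments part of the weblog:
#     ex:entry5 :subtopicOf ex:log .
```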

Determining post order within a topic

The value of hasPosts is an ordered sequence of posts. Depending on the nature of a topic, this order may or may not be significant at a user level. [Issue 2] A topic has only one ordering of posts. This means that post A may come before or after post B in topic T, but not both. A graph which asserts multiple values of hasPosts for a single topic is inconsistent unless all the values give the same posts in the same order.
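
As a small sketch of this rule, with hypothetical ex: URIs: repeating the same list is harmless, while a list with a different order would make the graph inconsistent. The conflicting statement is left as a comment so that the fragment itself remains consistent.

```n3
@prefix :   <http://www.eyrie.org/~zednenem/2002/web-threads/> .
@prefix ex: <http://example.org/threads/> .    # hypothetical URIs

ex:t a :Topic;
    :hasPosts ( ex:a ex:b ) .

# Consistent: a second value with the same posts in the same order.
ex:t :hasPosts ( ex:a ex:b ) .

# Inconsistent if asserted alongside the statements above,
# since it places ex:b before ex:a:
#     ex:t :hasPosts ( ex:b ex:a ) .
```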

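The subtopic properties make different statements about how the order of posts in the subtopic is reflected in the larger topic. Given a topic S which contains only the posts A and B in that order and another topic T: if S is a subtopicOf T, then A and B are part of T, but nothing is implied about their order in T; if S is a categoryOf T, then A also precedes B in T, though other posts may come between them; and if S is a segmentOf T, then A immediately precedes B in T.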

Thus, it is possible to indicate whether one post precedes another in a topic without needing to list every post in the topic. For example, one might use an anonymous category to state that a post A precedes a post B in topic T:

[ a :Topic; :categoryOf T; :hasPosts ( A B ) ]

Or, to indicate that A is the immediate predecessor of B:

[ a :Topic; :segmentOf T; :hasPosts ( A B ) ]
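
For readers who want a fragment that can be loaded into an N3 parser as is, the two patterns might be written out in full as below. The ex: URIs stand in for the placeholders T, A, and B, with a third hypothetical post C added.

```n3
@prefix :   <http://www.eyrie.org/~zednenem/2002/web-threads/> .
@prefix ex: <http://example.org/threads/> .    # hypothetical URIs

ex:T a :Topic .

# A precedes B somewhere in T; other posts may come between them.
[ a :Topic; :categoryOf ex:T; :hasPosts ( ex:A ex:B ) ] .

# B is the immediate predecessor of C in T.
[ a :Topic; :segmentOf ex:T; :hasPosts ( ex:B ex:C ) ] .
```

Neither anonymous topic lists every post in T, so T itself remains open.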

Additional vocabulary for weblogs

Classes

Weblog
  A Topic containing the posts in a weblog.
  Subclass of: Topic

Properties

hasLinksAt
  Indicates a resource, such as a web page or RDF store, which lists the recommendations of this weblog.
  Sub-property of: rdfs:seeAlso
  Domain: Weblog

hasWeblogs
  Indicates a collection of weblogs which are part of some larger entity, such as a weblog hosting service or corporate web site.
  Sub-property of: dcterm:hasPart
  Range: rdf:List (containing Weblogs)

recommends
  Indicates an item which this weblog recommends. The set of recommendations associated with a weblog is often called a “blogroll”.
  Sub-property of: dcterm:references
  Domain: Weblog
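
A weblog description using this vocabulary might look like the following sketch. Every URI here is hypothetical, and the order of entries in the lists is arbitrary.

```n3
@prefix :   <http://www.eyrie.org/~zednenem/2002/web-threads/> .
@prefix ex: <http://example.org/> .    # hypothetical URIs

ex:weblog a :Weblog;
    :hasPosts     ( ex:entry1 ex:entry2 ex:entry3 );
    :currentPosts ( ex:entry2 ex:entry3 );                # e.g. the entries shown on the front page
    :recommends   <http://example.net/another-weblog>;    # one blogroll entry
    :hasLinksAt   <http://example.org/blogroll.rdf> .     # where the full blogroll is listed

# A hosting service listing the weblogs it carries.
ex:hostingService :hasWeblogs ( ex:weblog ) .
```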

Further reading

Issues

1. (Closed) Is use of dcterm:isPartOf consistent with Dublin Core?

Currently, dcterm:isPartOf associates a Post with a Topic, but does not imply a subtopic relation. That is:

A a :Post; dcterm:isPartOf B.
B a :Post, :Topic; dcterm:isPartOf C.

Does not entail:

A dcterm:isPartOf C.

However, the natural-language interpretation of dcterm:isPartOf implies that if A is part of B and B is part of C, then A is part of C. It may be more logical to reintroduce inTopic or some similar term to associate posts with topics.

Resolution: A new predicate, partOf, will be used instead of dcterm:isPartOf.

2. Is a method needed to indicate topics where the order of posts is not significant?

Currently, it is inconsistent to have different post sequences for a single topic. However, some types of topics such as Usenet newsgroups do not have a well-defined ordering of posts. In these cases, it is not inconsistent for one set of statements to assert that A precedes B and another to assert that B precedes A, because the ordering is unimportant.

On the other hand, it is possible to say nothing about the order of posts in a topic by not asserting a value for hasPosts and only asserting subtopics with subtopicOf. For a newsgroup, the large volume of posts makes it unlikely that any value could be accurately asserted for hasPosts.

Perhaps it would be reasonable to create a Newsgroup subclass of Topic with the implication that its posts have no inherent order?

Favored resolution: A subclass of Topic will be created for Topics whose posts have a specific order. segmentOf and categoryOf will use this class as their domain and range. The name of this subclass is currently undecided; possibilities include “LinearTopic”, “OrderedTopic”, “Sequence”, and “Thread”.
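
If the favored resolution were adopted, the schema changes might look roughly like this. The sketch uses “OrderedTopic”, one of the candidate names listed above, purely as a stand-in; no name has been chosen and this class does not exist in the current vocabulary.

```n3
@prefix :     <http://www.eyrie.org/~zednenem/2002/web-threads/> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .

# Hypothetical class name; the actual name is undecided.
:OrderedTopic rdfs:subClassOf :Topic .

# The order-sensitive subtopic properties would then apply only to ordered topics.
:segmentOf  rdfs:domain :OrderedTopic; rdfs:range :OrderedTopic .
:categoryOf rdfs:domain :OrderedTopic; rdfs:range :OrderedTopic .
```

Under this arrangement, a newsgroup could remain a plain Topic and make no claims about post order.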

Changes from previous version

In addition to a reorganization of the documentation, this document also includes major changes to the TDL vocabulary since the last version. These are:

Subsequent changes

2002-11-19: Issue 1 resolved, adding predicate partOf.

David Menendez