ZedneWeb / RDF Collection Semantics

Collection Semantics for RDF

This document describes an RDF vocabulary intended to supplement RDF Schema and OWL. Often, when the intended value of a property is an RDF list, it is required to be well-formed (that is, each list node having exactly one value for rdf:first and rdf:rest). It would also be useful to be able to state that the collection has the semantics of a set or (ordered) list, or that its members belong to a certain class, or that it has a minimum or maximum size. This vocabulary provides a standard way of doing so, as well as a set of axioms allowing processors to determine whether a graph is consistent with its schema and to infer some information.

A typical use of this vocabulary might look like this:

<rdf:Property rdf:ID="hasThingamabobs">
    <rdfs:range>
        <CollectionClass>
            <minSize>3</minSize>
            <maxSize>5</maxSize>
            <allMembersFrom rdf:resource="#Thingamabob"/>
        </CollectionClass>
    </rdfs:range>
</rdf:Property>

This indicates that the value of hasThingamabobs must be a well-formed collection containing 3–5 Thingamabobs.

A more complex example, describing a chessboard:

<rdf:Property rdf:ID="boardState">
    <rdfs:range>
        <ListClass>
            <size>8</size>
            <allMembersFrom>
                <ListClass>
                    <size>8</size>
                    <allMembersFrom rdf:resource="#ChessSquareState"/>
                </ListClass>
            </allMembersFrom>
        </ListClass>
    </rdfs:range>
</rdf:Property>

Vocabulary

The namespace for this vocabulary is “http://www.eyrie.org/~zednenem/2003/rdfcollection#”.

Classes

These are actually metaclasses, in that their instances are themselves classes. For example, the class of sets containing 5 members is an instance of SetClass.

CollectionClass: CollectionClasses require their instances to be well-formed.
LooseSetClass: Loose sets are collections which allow members to be repeated without changing the interpretation of the set.
Subclass of CollectionClass
MultiSetClass: Multisets are collections which operate like sets, except that elements can be a member more than once. (Somewhat akin to rdf:Bag.)
Subclass of LooseSetClass
SetClass: A multiset where elements can be a member only once. It is an error for a member to be present more than once. (Otherwise, this is identical to LooseSetClass.)
Subclass of MultiSetClass
ListClass: An ordered multiset. Unlike other loose sets, changing the order of members in a list changes its meaning. (Somewhat akin to rdf:Seq.)
Subclass of MultiSetClass
SequenceClass: An ordered set. Unlike other sets, changing the order of members in a sequence changes its meaning. Unlike other lists, members can appear only once.
Subclass of SetClass and ListClass.

Predicates

allMembersFrom: All members of all instances of this class belong to the specified class.
Domain: CollectionClass
Range: rdfs:Class
minSize: All instances of this class must have at least this many members.
Domain: CollectionClass
Range: (positive integer)
maxSize: All instances of this class must have no more than this many members.
Domain: CollectionClass
Range: (positive integer)
size: All instances of this class must have exactly this many members.
Domain: CollectionClass
Range: (positive integer)

Common logical predicates

This section defines some predicates used to describe collections in the axioms. They extend the system described in RDF-MT. They aren’t necessarily important to understanding the system.

First, two abbreviations for the values of rdf:first and rdf:rest:

first(a, b) ≡ ‹a, b› ∈ IEXT(I(rdf:first)) ∧ a ≠ I(rdf:nil)
rest(a, b) ≡ ‹a, b› ∈ IEXT(I(rdf:rest)) ∧ a ≠ I(rdf:nil)

Second, a predicate distTo(a, b, n) which is true if one can follow n rdf:rest arcs from a to b:

∀a, b: distTo(a, b, 0) ↔ a = b
∀a, b, n: distTo(a, b, n) ↔ n > 0 ∧ (∃c: rest(a, c) ∧ distTo(c, b, n - 1))

Third, length(a, n), which is true if one can reach rdf:nil from a by following n rdf:rest arcs.

length(a, n) ≡ distTo(a, I(rdf:nil), n)

Fourth, memberAt(a, n, i), which is true if one can reach a node b by following n rdf:rest arcs from a and if b has the value i for rdf:first:

memberAt(a, n, i) ≡ ∃b: distTo(a, b, n) ∧ first(b, i)

Fifth, member(a, i), which is true if there is a node b which one can reach by following rdf:rest arcs from a and if the value of rdf:first for b is i:

member(a, i) ≡ ∃n: memberAt(a, n, i)

Sixth, a set of predicates which indicate whether a collection is defined as an instance of a specific type of collection class:

Collection(a) ≡ ∃b: a ∈ ICEXT(b) ∧ b ∈ ICEXT(I(CollectionClass))
LooseSet(a) ≡ ∃b: a ∈ ICEXT(b) ∧ b ∈ ICEXT(I(LooseSetClass))
Set(a) ≡ ∃b: a ∈ ICEXT(b) ∧ b ∈ ICEXT(I(SetClass))

Finally, RestrColl(a, b, c), which is true if a belongs to a collection class d which has the value c for the restriction b:

RestrColl(a, b, c) ≡ ∃d: a ∈ ICEXT(d) ∧ ‹d, c› ∈ IEXT(b)

Indicating a collection’s semantics

The specific meta-class to which a collection belongs indicates which semantic interpretations and syntactic restrictions apply. Members of a CollectionClass must be well-formed. Two instances of a LooseSetClass are considered equivalent if every member of one is also a member of the other. Two instances of a MultiSetClass are equivalent if they contain the same members the same number of times. Two instances of a ListClass are equivalent if they contain the same members in the same order. Instances of a SetClass can contain no duplicates.

Axiom 1 states that an instance of a CollectionClass must be well-formed. (It relies on the predicate WellFormed(a), which is true when a has exactly one value for rdf:first and rdf:rest.)

Axiom 1
∀a, b: Collection(a) → WellFormed(a) ∧ (∃b: rest(a, b) ∧ ¬(∃n: distTo(b, a, n)) ∧ (Collection(b) ∨ b = I(rdf:nil)))

Axiom 2 states that a collection is an instance of a SetClass if and only if it is an instance of a LooseSetClass and contains no duplicates.

Axiom 2
∀a: Set(a) ↔ LooseSet(a) ∧ (∀i, m, n: memberAt(a, m, i) ∧ memberAt(a, n, i) → m = n)

The equivalency semantics implicit in LooseSetClass, MultiSetClass, and ListClass are intended to allow processors to determine whether two values for a given property are the same. For example, based on this graph:

<#x> :hasThingamabobs ( <#a> <#b> <#c> ). <#y> :hasThingamabobs ( <#c> <#a> <#b> ).

If the range of hasThingamabobs is a LooseSetClass but not a ListClass, then #x and #y have the same value for hasThingamabobs.

The predicate SetEquivalent(a, b) is true when every element that is a member of a is also a member of b, and vice versa. Properties whose range is a LooseSetClass and not a MultiSetClass are considered to be equal when SetEquivalent is true for their values.

SetEquivalent(a, b) ≡ ∀i: member(a, i) ↔ member(b, i)

The predicate MultiSetEquivalent(a, b) is true when every element that is a member of a is a member of b the same number of times. Properties whose range is a MultiSetClass and not a ListClass are considered to be equal when MultiSetEquivalent is true for their values. (Note that MultiSetEquivalent is equivalent to SetEquivalent when duplicate members are forbidden, as in a SetClass.)

MultiSetEquivalent(a, b) ≡ something or other…

(Multisets are a late addition to this vocabulary, and it still isn’t clear whether they are actually necessary.)

The predicate ListEquivalent(a, b) is true when a and b contain the same members in the same order (e.g., the first element of a is also the first element of b, etc.). Properties whose range is a ListClass are considered to be equal when ListEquivalent is true for their values.

ListEquivalent(a, b) ≡ ∀n, i: memberAt(a, n, i) ↔ memberAt(b, n, i)

Indicating the class of collection members

The allMembersFrom property allow schema to state that all members of a collection belong to a class.

Axiom 3 states that all members of a collection belong to a class given as the value of allMembersFrom.

Axiom 3
∀a, b, i: RestrColl(a, I(allMembersFrom), b) ∧ member(a, i) → i ∈ ICEXT(b)

Restricting collection size

The minSize, maxSize, and size properties allow schema to state how many elements a collection may have. A graph containing a collection with fewer members than required by minSize or more members than allowed by maxSize is inconsistent with a schema making those requirements.

Axiom 4 states that a collection must have at least as many members as required by minSize.

Axiom 4
∀a, m, n: RestrColl(a, I(minSize), m) ∧ length(a, n) → m ≤ n

Axiom 5 states that a collection must have no more members than allowed by maxSize.

Axiom 5
∀a, m, n: RestrColl(a, I(maxSize), m) ∧ length(a, n) → m ≥ n

size is a sub-property of minSize and maxSize, so declaring a size is equivalent to declaring the minimum and maximum sizes to be the same.

When multiple values of minSize are asserted, the largest has effect. When multiple values of maxSize are asserted, the smallest has effect. If a value for minSize is smaller than that asserted for maxSize, then no collection can be consistent with the restriction.

Determining whether a CollectionClass is a subclass of another CollectionClass

A CollectionClass a must meet four requirements to be considered a subclass of a CollectionClass b.

Its type must be at least as specific (i.e., if b is a ListClass, then a must be a ListClass).
SType(a, b) ≡ ∀c: b ∈ ICEXT(c) → a ∈ ICEXT(c)
Its allMembersFrom restriction must be at least as restrictive.
SClass(a, b) ≡ ∀d: RestrColl(b, I(allMembersFrom), d) → (∃c: RestrColl(a, I(allMembersFrom), c) ∧ ‹c, d› ∈ IEXT(I(rdfs:subClassOf)))
Its minimum length requirement must be at least as long.
SMin(a, b) ≡ ∀n: RestrColl(b, I(minLength), n) → (∃m: RestrColl(a, I(minLength), m) ∧ m ≥ n)
Its maximum length requirement must be no longer.
SMax(a, b) ≡ ∀n: RestrColl(b, I(maxLength), n) → (∃m: RestrColl(a, I(maxLength), m) ∧ m ≤ n)

Axiom 6 states that, when b is a CollectionClass, a is a subclass of b if and only if all four requirements are met.

Axiom 6
∀a, b: b ∈ ICEXT(I(CollectionClass)) → (‹a, b› ∈ IEXT(I(rdfs:subClassOf)) ↔ SType(a, b) ∧ SClass(a, b) ∧ SMin(a, b) ∧ SMax(a, b))

Example

TDL defines two properties for associating tdl:Posts with a tdl:Topic, tdl:hasPosts and tdl:hasPostSequence. The first takes an unordered list of members, and the second takes an ordered list. They might be defined like so:

<rdf:Property rdf:about="hasPosts">
    <rdfs:range>
        <SetClass>
            <allMembersFrom rdf:resource="Post"/>
        </SetClass>
    </rdfs:range>
    <owl:maxCardinality>1</owl:maxCardinality>
</rdf:Property>

<rdf:Property rdf:about="hasPostSequence">
    <rdfs:subPropertyOf rdf:resource="hasPosts"/>
    <rdfs:range>
        <SequenceClass>
            <allMembersFrom rdf:resource="Post"/>
        </SequenceClass>
    </rdfs:range>
    <owl:maxCardinality>1</owl:maxCardinality>
</rdf:Property>

(Technically, Axiom 6 ensures that the range of tdl:hasPostSequence would be at least a SetClass with all members from tdl:Post, but it is unwise to assume that processors will be doing that level of inference.)

The owl:maxCardinality property indicates that a graph which gives more than one value for tdl:hasPosts or tdl:hasPostSequence is inconsistent with this schema. Thus, this graph would be inconsistent with the schema:

<#x> tdl:hasPosts ( <#a> <#b> ). <#x> tdl:hasPosts ( <#a> <#b> <#c> ).

The two collections represent non-equivalent sets, so they must be considered different values of tdl:hasPosts for #x, which is not allowed.

However, a graph with two equivalent values for a property can be considered consistent.

<#x> tdl:hasPosts ( <#c> <#a> <#b> ). <#x> tdl:hasPosts ( <#a> <#b> <#c> ).

Even though two different nodes are given as values for tdl:hasPosts, the sets they represent contain the same members are are therefore considered equivalent.

<#x> tdl:hasPostSequence ( <#c> <#a> <#b> ). <#x> tdl:hasPosts ( <#a> <#b> <#c> ).

This graph provides one value of tdl:hasPostSequence for #x, and also entails the previous example, which provides one value of tdl:hasPosts for #x. Thus, this graph is consistent with the schema.

<#x> tdl:hasPostSequence ( <#c> <#a> <#b> ). <#x> tdl:hasPostSequence ( <#a> <#b> <#c> ).

This graph gives two values for tdl:hasPostSequence which are not equivalent lists and is therefore inconsistent with the schema.

Dave Menendez