I am just about to board my Ryanair flight FR176 back home, but I thought I would take a moment to praise Eircom for providing me with free Wi-Fi to while away the time between check-in and boarding. May their goats be forever lactating, their loins fruitful and their coffee with milk and two sugars.
So I have been meaning to write up my notions of trust in FOAF for a while now… so here they are. For newer readers, information on FOAF is available, as are an introduction to RDF and an N3-specific introduction to RDF.
Firstly, what most people think of as trust is really two things:
- Validated Identity
- Trust (proper)
In order to trust something to some degree you must first have some notion of identity for it. In other words, I cannot say I trust x and y differently and distinctly unless I know that x and y are different (or at least perceive that they are different).
So how do we establish identity? In order to establish identity we must have some mechanism to show distinctness between two entities. At this point it becomes necessary to add information to documents stating who wrote them, with some proof of authorship. The best proof of authorship is some kind of digital signature. Personally I advocate PGP, for numerous reasons that other people have expressed better than I can. Do Google “PGP” (or maybe “GPG”) if you don’t know what I am talking about.
However, part of the problem is that we don’t want our triples tied to any particular format. What we need is a form of the information that we can sign in order to verify the triples without tying them to, say, RDF/XML or N3. Fortunately TriX is “a highly normalized, consistent XML representation for RDF graphs”. My suggestion is that people put their graphs in TriX and then sign a checksum of the TriX graph. This would be compact and easy to distribute as a literal. To check it, the receiver would just convert whatever format of RDF they had received into TriX, create a checksum and then check it against the signed checksum.
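The checksum idea can be sketched roughly as follows. This is only an illustration under loud assumptions: the sorted list of triples below is a stand-in for a real TriX serialization, the graph contains no blank nodes, and the names `canonical_form` and `graph_checksum` are made up for this example. Signing the resulting hex string with PGP is left out.

```python
import hashlib
import json


def canonical_form(triples):
    """Serialize triples so that input order and duplicates do not matter.

    Assumption: this stands in for converting the graph to TriX. Each triple
    is encoded unambiguously as a JSON array, the set of lines is sorted,
    and the result is encoded as UTF-8 bytes.
    """
    lines = sorted({json.dumps(list(t)) for t in triples})
    return "\n".join(lines).encode("utf-8")


def graph_checksum(triples):
    """Return a SHA-256 hex digest of the canonical form of the graph."""
    return hashlib.sha256(canonical_form(triples)).hexdigest()


# The same (hypothetical) graph as it might come out of two different parsers,
# for example one reading RDF/XML and one reading N3, in a different order.
from_rdfxml = [
    ("http://foo.com/#me", "foaf:name", "Some Person"),
    ("http://foo.com/#me", "foaf:homepage", "http://foo.com"),
]
from_n3 = list(reversed(from_rdfxml))

published = graph_checksum(from_rdfxml)  # the value the author would sign
received_ok = graph_checksum(from_n3) == published
```

The receiver never needs the sender's original file, only the triples and the signed checksum literal.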
We have now achieved a situation in which we can verify that a particular graph has been authenticated with a particular PGP key. While we don’t get any implicit notion of trust from a simple digital signature, we do have some form of identification. This means we can make judgements about the behaviour of that key. For example, triples signed with a key that has provided verified accurate information on numerous occasions may require less vetting than triples signed with other keys, or ones not signed at all.
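One way to picture "judging the behaviour of a key" is a simple track record per key fingerprint. The sketch below is hypothetical throughout: the class name, the two vetting levels and the thresholds (`min_history`, `min_accuracy`) are assumptions chosen for illustration, not part of FOAF or PGP.

```python
from collections import defaultdict


class KeyTrackRecord:
    """Counts how often graphs signed by each key turned out to be accurate."""

    def __init__(self):
        self._checked = defaultdict(int)
        self._accurate = defaultdict(int)

    def record(self, fingerprint, was_accurate):
        """Note the outcome of manually vetting one graph signed by this key."""
        self._checked[fingerprint] += 1
        if was_accurate:
            self._accurate[fingerprint] += 1

    def vetting_level(self, fingerprint, min_history=5, min_accuracy=0.9):
        """Return "light" for keys with a good record, otherwise "full".

        Unsigned graphs (fingerprint None) and unknown keys get full vetting.
        The thresholds are arbitrary assumptions.
        """
        if fingerprint is None:
            return "full"
        checked = self._checked.get(fingerprint, 0)
        if checked < min_history:
            return "full"
        accuracy = self._accurate.get(fingerprint, 0) / checked
        return "light" if accuracy >= min_accuracy else "full"


record = KeyTrackRecord()
for _ in range(5):
    record.record("KEY-A", was_accurate=True)  # "KEY-A" is a made-up fingerprint
```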
The next step is to work out who belongs to which key. There are two main architectures for achieving this, although hybrids of both can be used. They are:
- Central Authority / Key server
- Web of trust
The essential point for FOAF in both of these architectures is to establish a connection between an IFP of a person and a signature. For example, by hosting your PGP key on your personal homepage you are connecting yourself with the key. There is of course some debate about which properties should be IFPs. This is highly relevant, since there is a potential for an attacker to use one IFP to fake the authenticity of another.
For example, an attacker could make a signed FOAF graph (file) which indicated that a person, who they are not, had a homepage we shall call http://foo.com and an mbox of firstname.lastname@example.org. The attacker controls http://foo.com but not the mbox address. On http://foo.com they place a PGP key which they claim identifies the person who is the subject of the graph, and this key is used to sign the graph. So far, what the attacker has done is identify a person using the homepage property as an IFP, with the key verified by it. This is not as yet a bad thing. However, the trap is to assume that the association between the key and the homepage is also an association between the mbox and the key. This is not the case. If one takes the authentication of one IFP to authenticate the others, it allows an attacker to use a weaker IFP (i.e. one less commonly used) to masquerade as a stronger one (such as mbox).
The correct behaviour is to treat each individual as identified according to the IFP they were identified via. So if someone used the homepage property to identify themselves, then that is what should be used to identify them. This means that if two graphs refer to the same homepage they should use the same key (or one in an identified chain, more on that later). It does not mean that if someone is identified via homepage then they can also be identified (smushed, etc.) via the mbox property. It is of course possible to provide verification against more than one IFP in a graph.
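The rule can be sketched as a small check before smushing two signed descriptions. Everything here is an assumption for illustration: the `SignedPerson` structure, the `may_smush` function, the fingerprints and the example.org mailbox are all hypothetical, and the key comparison is plain equality rather than a key chain.

```python
from typing import NamedTuple


class SignedPerson(NamedTuple):
    key: str             # fingerprint of the key that signed the graph
    verified_ifp: tuple  # (property, value) the key was actually checked against
    claims: dict         # everything else the graph asserts, unverified


def may_smush(a, b):
    """Merge two descriptions only on the IFP each was verified through.

    Unverified claims (such as an mbox asserted in a graph whose key was
    only checked against a homepage) are deliberately ignored.
    """
    return a.verified_ifp == b.verified_ifp and a.key == b.key


# The attacker's graph: key hosted on http://foo.com, mbox merely claimed.
attacker = SignedPerson(
    key="KEY-ATTACKER",
    verified_ifp=("homepage", "http://foo.com"),
    claims={"mbox": "mailto:victim@example.org"},
)

# The real person, whose key was verified against their mailbox.
victim = SignedPerson(
    key="KEY-VICTIM",
    verified_ifp=("mbox", "mailto:victim@example.org"),
    claims={},
)

# A second graph from whoever controls http://foo.com, signed with the same key.
attacker_again = SignedPerson(
    key="KEY-ATTACKER",
    verified_ifp=("homepage", "http://foo.com"),
    claims={"name": "Anything"},
)
```

Here `may_smush(attacker, victim)` is false even though the attacker's graph mentions the victim's mbox, while the two graphs verified via the same homepage and key can be merged.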
A quick note on keys. Why would we not use keys as an IFP? After all, they seem to be so good at identifying people. Simply stated, keys are transient. An important element of a key’s strength is that an attacker lacks sufficient information (entropy) about the key to break it. As such, the more you use a key, the easier it is to break. This means many people rotate keys (although rotation can also happen because of lost passphrases or lost private keys, among other things). If people rotate their keys, then a key cannot serve as an IFP. One thing that is useful, however, is to provide a chain of keys, from original to current, wherever possible. Whenever a key is about to expire, signing the new public key with the old key provides a chain that allows people to track the changes. This means that people who hold old documents signed with old keys can tie the ownership of new documents, signed with new keys, back to those older documents.
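Following such a chain could look something like the sketch below. It assumes, loudly, that the cryptographic part has already been done: each entry in `successions` stands for a verified signature made by the old key over the new public key. The function name and the fingerprints are hypothetical.

```python
def chain_links(old_key, new_key, successions):
    """Return True if new_key is reachable from old_key via signed successions.

    successions maps an old key fingerprint to the fingerprint of the key it
    signed as its replacement. Each entry is assumed to have been checked
    with PGP beforehand; this function only walks the chain.
    """
    seen = set()
    current = old_key
    while current not in seen:  # the seen set guards against cycles
        if current == new_key:
            return True
        seen.add(current)
        if current not in successions:
            return False
        current = successions[current]
    return False


# A made-up rotation history: KEY-1 signed KEY-2, which later signed KEY-3.
successions = {"KEY-1": "KEY-2", "KEY-2": "KEY-3"}

old_doc_matches_new = chain_links("KEY-1", "KEY-3", successions)
```

The walk only goes forward in time, so a document signed with KEY-3 says nothing about KEY-1 unless the chain from KEY-1 is available.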
Now down to the specific implementations: