phildawes.net/blog/
With all the buzz around the possibility of an ‘RDF-Lite’, I feel compelled to list a few barriers that I think URIs raise for a new user trying to get to grips with RDF metadata creation.
Here they are, in no particular order:
(1) URIs don’t allow ...
3 years ago
Yes:
http://laurentszyster.be/blog/public-names/
of course ;-)
Public Names provide a data model that:
1. Captures simple text articulation as unique
sets of strings in a single semantic field,
for instance (with CRLF added):
17:
6:Public,
5:Names,
,
15:
4:data,
5:model,
,
1:a,
7:provide,
4:that,
2. Allows a simple computer system to validate
a string of bytes as an *unambiguous* text
articulation, for instance:
5:Dawes,4:Phil,
and use them as Unique Resource Identifier
with the required properties for a semantic
application.
Kind Regards,
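The encoding in the example above looks like nested netstrings (`length:payload,`). The following is a minimal sketch under that assumption, not the Public Names implementation itself. The function names `netstring`, `public_name` and `parse` are made up for illustration, and the sketch does not attempt to reproduce any canonical ordering of the strings.

```python
def netstring(payload):
    """Encode bytes as <length>:<payload>, (assumed netstring-style framing)."""
    return str(len(payload)).encode("ascii") + b":" + payload + b","


def public_name(words):
    """Concatenate the encodings of several words (hypothetical helper)."""
    return b"".join(netstring(w.encode("utf-8")) for w in words)


def parse(data):
    """Validate a byte string and return its top-level payloads.

    Raises ValueError if the framing is malformed.
    """
    items = []
    pos = 0
    while pos < len(data):
        colon = data.index(b":", pos)  # ValueError if there is no colon
        length = int(data[pos:colon].decode("ascii"))
        start = colon + 1
        end = start + length
        if data[end:end + 1] != b",":
            raise ValueError("missing terminating comma")
        items.append(data[start:end])
        pos = end + 1
    return items


inner = public_name(["Public", "Names"])
print(inner)                        # the two words, each framed
print(netstring(inner))             # the pair framed again as one unit
print(parse(b"5:Dawes,4:Phil,"))    # validation of the second example
```

Framing the inner pair a second time gives the `17:` prefix seen in the example, since the inner encoding is 17 bytes long.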
3 years ago
(1) Why can't you use blank nodes if you can't use URI References? Resources don't need to be named, and sometimes (like in a database-like environment) most resources will be unnamed.
If you are willing to step up to OWL, then with inverse-functional properties you can still identify things with a "public key" like structure. However, you can do that anyway with any practical RDF application too.
Also, minting URI References is easy. Here's a URIRef: "data:Jimmy_Cerra". It is a little different from English, but we are working with computer languages, not English. Would those people complain about having to write in a language like Japanese? Are their expectations reasonable?
Also, the only requirement is that URI References are semantically uniform across the graphs you use them in. Problems happen when you merge graphs that have different semantics for the same URI Reference, but sometimes the graphs being merged are small and manageable.
If you have to merge with large numbers of graphs or with the whole Semantic Web for all of eternity, then I can see where minting URI References is a problem. But that is a social problem with naming itself and not RDF.
(2) Yes, and no. I've come to the conclusion that the only way to understand the semantics of anything is to ask the author (i.e. human documentation). There is no way to do so via computers. This has been the case since the dawn of internet time (from the RFC specs to XHTML to the Atom Publication Format).
That's one reason nobody likes DTDs, RELAX NG, XML Schema, OWL (sometimes), and others to specify semantics. You can't do so completely for most non-trivial applications, and all those validation technologies are only hints. That's also why everyone loves XML Schema Datatypes: those elements specify semantics rather than provide a framework for specifying semantics.
(3) Just because some people get confused doesn't mean that everyone does. I understand the differences, as do the people I explain them to. Should we throw away calculus because some people don't understand it?
(4) See (1).
I used to be really bugged by those problems... but I think I've found enlightenment. The best way to write semantic web software is to assume, like Socrates and Descartes, that "To know that you do not know is true wisdom". That is, assume the semantics of nothing in any context, and look it up or ask the URI owner.
3 years ago
"URIs are globally scoped, which means they need to mean the same thing in any context."
isn't true, for RDF. URIs don't have meaning; they have denotations. Denotations are assigned ("distributed"), and that can be done in a local scope. In theory, when you merge data, you determine that the same URI has different referents via logical inconsistencies; in practice you have domain experts and data modellers look at and analyse the data (just like you do with relational database integrations).
For me, you left out a most important thing, which is that lots of URIs in the same place are hard to read. QNames win the readability argument.
3 years ago
Jimmy Cerra writes:
Why can't you use blank nodes if you can't use URI References? Resources don't need to be named, and sometimes (like in a database-like environment) most resources will be unnamed.
If you are willing to step up to OWL, then with inverse-functional properties you can still identify things with a “public key” like structure. However, you can do that anyway with any practical RDF application too.
Actually I attempted to follow this approach at work for a while (à la FOAF), and was indeed willing to step up to OWL - my veudas triplestore supported inverse-functional properties for this reason (via a forward-chaining reasoner e.g. see circa Sep 2004 if you're interested!).
It did make things complicated though - IFP smushing was slow, and unless you're going to give people cookie-cutter examples then they really do need to understand IFPs.
e.g. people don't naturally write:
<pre>
<project>
  <name>My Application</name>
  <maintainer>
    <foaf:Person>
      <foaf:mbox>foo@example.com</foaf:mbox>
    </foaf:Person>
  </maintainer>
</project>
</pre>
Unfortunately cookie-cutter examples kind-of miss the point - you might as well be translating people's data into RDF for them. The real goal for me at work was that people could come up with their own data (from their own systems) that could be aggregated and merged usefully, otherwise it's not really worth the trouble.
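For readers unfamiliar with the IFP smushing mentioned above, here is a rough sketch of the idea: nodes that share a value for an inverse-functional property are treated as the same resource and merged. This is not the veudas implementation; the triple representation, the `smush` function, the `maintains` property and the node labels are all assumptions made for illustration.

```python
# Properties assumed to be inverse-functional for this sketch.
IFPS = {"foaf:mbox"}


def smush(triples, ifps=IFPS):
    """Merge subjects that share a value for an inverse-functional property.

    `triples` is an iterable of (subject, predicate, object) strings.
    The first subject seen for a given (property, value) pair becomes
    the representative for every later subject with the same pair.
    """
    triples = list(triples)
    parent = {}  # maps a merged node to the node it was merged into

    def find(node):
        # Follow merge links until reaching a representative.
        while parent.get(node, node) != node:
            node = parent[node]
        return node

    seen = {}  # (property, value) -> first subject seen
    for s, p, o in triples:
        if p in ifps:
            key = (p, o)
            if key in seen:
                first, other = find(seen[key]), find(s)
                if first != other:
                    parent[other] = first
            else:
                seen[key] = s

    # Rewrite subjects and objects to their representatives. Objects that
    # were never merged (including literals) are returned unchanged.
    return {(find(s), p, find(o)) for s, p, o in triples}


data = [
    ("_:a", "foaf:mbox", "mailto:foo@example.com"),
    ("_:a", "foaf:name", "Foo"),
    ("_:b", "foaf:mbox", "mailto:foo@example.com"),
    ("_:b", "maintains", "_:proj"),
]
for triple in sorted(smush(data)):
    print(triple)
```

Even this toy version has to scan every triple and track merges, which hints at why smushing gets slow on a large store.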
...
Also, the only requirement is that URI References are semantically uniform across the graphs you use them in. Problems happen when you merge graphs that have different semantics for the same URI Reference, but sometimes the graphs being merged are small and manageable.
If you have to merge with large numbers of graphs or with the whole Semantic Web for all of eternity, then I can see where minting URI References is a problem. But that is a social problem with naming itself and not RDF.
I think it's a problem with globally scoped naming: the RDF model doesn't allow for any skewing of meaning with context. You can't change society, and global adoption is one of the aims of the semantic web.
To be honest I think this sort of illustrates a wider point: if you're just going to work on small, manageable sets of data, then why bother with complex URI and RDF machinery that inhibits adoption? It strikes me as quite ironic that the very RDF machinery that was intended to facilitate this large-scale aggregation of data actually ends up inhibiting it.
3 years ago
Strictly speaking, (4):
“URIs are globally scoped, which means they need to mean the same thing in any context.”
isn’t true, for RDF. URIs don’t have meaning; they have denotations. Denotations are assigned (“distributed”), and that can be done in a local scope. In theory, when you merge data, you determine that the same URI has different referents via logical inconsistencies; in practice you have domain experts and data modellers look at and analyse the data (just like you do with relational database integrations).
Ok - that makes sense (although I haven't read that anywhere before - but then I'm starting to fall behind with the literature ;-) ).
Which means that there's probably a lot of scope for simplifying RDF - you can't throw a baby out with the bathwater if it wasn't in the bath to begin with.
3 years ago
This will make the distinction clearer to people and will also avoid wasted network traffic when attempts are made to retrieve the resource.
I realise I'm in a minority with respect to this opinion on the use of HTTP URLs but I've yet to see a coherent argument against it.
3 years ago
Surely, if two or more datasets use the same URI to denote different resources then at least one of them is simply wrong - it is not using the URI in the way that the URI's original minter intended. In practice, you need to have your domain experts fix up the data before the merge.
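One narrow way to surface such clashes before a merge is to look for logical inconsistencies of a simple kind: a property assumed to take a single value that ends up with two different values for the same URI. The sketch below assumes triples as plain tuples; the `functional_conflicts` function and the `ex:` names are hypothetical, and a real reasoner would check far more than this.

```python
from collections import defaultdict


def functional_conflicts(graph_a, graph_b, functional_props):
    """Report (subject, property) pairs with more than one value after merging.

    `functional_props` is the set of properties assumed to allow only
    one value per subject. A conflict suggests the two graphs may be
    using the same URI for different things, or that one is wrong.
    """
    values = defaultdict(set)
    for s, p, o in list(graph_a) + list(graph_b):
        if p in functional_props:
            values[(s, p)].add(o)
    return {key: vals for key, vals in values.items() if len(vals) > 1}


a = [("ex:thing1", "ex:birthYear", "1970"), ("ex:thing1", "ex:label", "A")]
b = [("ex:thing1", "ex:birthYear", "1985"), ("ex:thing1", "ex:label", "B")]
print(functional_conflicts(a, b, {"ex:birthYear"}))
```

A report like this only flags candidates; deciding which dataset is wrong still falls to the domain experts.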
3 years ago
> (1) URIs don’t allow you to use existing identity schemes.
Exactly because to do so would be ambiguous. How do you know what identity scheme is being used? You could, say, prefix it with the name of the scheme (e.g. myscheme:12345) -- but then you have to unambiguously identify the scheme name. If the scheme name is unambiguous, then you have a URI anyway.
> (2) HTTP URIs have a load of implicit baggage
It's not a requirement that people use HTTP URIs. I'd be all for throwing away these, but that doesn't mean throwing away the entire URI concept.
> (3) URIs are URLs
Aren't URLs URIs? Same as 2.
> (4) URIs require a level of precision in ‘meaning’ that is hard to attain. URIs are globally scoped, which means they need to mean the same thing in any context.
If this weren't the case, no two RDF documents could ever be merged because you would never know if the authors intended their nodes to denote the same thing. But, like it was pointed out, it's not necessarily a problem if this doesn't occur in practice.
> using URIs collaboratively and successfully requires a non-trivial amount of upfront thought, documentation and proactive consensus building.
Every naming scheme is going to be like that, to some degree. Do URIs actually require more upfront thought than other schemes, though?
3 years ago
> (1) URIs don’t allow you to use existing identity schemes.
Exactly because to do so would be ambiguous. How do you know what identity scheme is being used?
Context tells you this.
> (4) URIs require a level of precision in ‘meaning’ that is hard to attain. URIs are globally scoped, which means they need to mean the same thing in any context.
If this weren’t the case, no two RDF documents could ever be merged because you would never know if the authors intended their nodes to denote the same thing. But, like it was pointed out, it’s not necessarily a problem if this doesn’t occur in practice.
I think when it doesn't happen in practice it's because the people doing the merging know something of the context under which the document is written. You need this anyway - otherwise how do you know that the author of the RDF graph is a reliable source, or even competent in RDF?
Besides - I think this problem does happen in practice.
3 years ago
> using URIs collaboratively and successfully requires a non-trivial amount of upfront thought, documentation and proactive consensus building.
Every naming scheme is going to be like that, to some degree. Do URIs actually require more upfront thought than other schemes, though?
More than localized, context bound schemes - yes. E.g.
PhilDawes name "Phil Dawes"
PhilDawes email pdawes@users.sf.net
didn't require much thought, because it is bound to the scope of this blog comment. It's a bit throwaway, but you still understand what I mean to some degree because you understand something of the context under which I wrote it.
3 years ago
More than localized, context bound schemes - yes.
But the really cute thing about URIs is that they form a sort of federation of separate localized context bound schemes. Each URIRef carries around inside itself both the global name of the local scheme and the name within that scheme - all in a reasonably familiar, readable and compact sequence of characters. So no, I don't think URIRefs can require more upfront thought than localized schemes other than the trivial issue of deciding on the first part of the URI used to prefix names in the scheme - to make them globally unique.
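The "scheme name plus local name" reading of a URIRef can be sketched as a simple split. The rule used here (break at the last `#`, else the last `/`, else the last `:`) is a common convention and an assumption of this sketch, not something the RDF specifications mandate. The `split_uriref` function is hypothetical.

```python
def split_uriref(uriref):
    """Split a URIRef into (namespace, local_name) using a heuristic.

    Tries '#', then '/', then ':' as the separator, taking the last
    occurrence that leaves a non-empty local name.
    """
    for sep in ("#", "/", ":"):
        head, found, tail = uriref.rpartition(sep)
        if found and tail:
            return head + sep, tail
    return "", uriref  # no separator found: treat it all as a local name


for ref in (
    "http://xmlns.com/foaf/0.1/name",
    "http://example.org/ns#Person",
    "data:Jimmy_Cerra",
):
    print(split_uriref(ref))
```

The namespace half plays the role of the globally unique scheme name, and the local half is the name that only needs to be unique within it.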