I wrote a bit about representing structured data in the last post. Here are some ideas for how I plan to index the data.
Indexing graphs as subject ranges
In indexing triples I need to provide indexed lookups for all 6 of the possible triple query patterns:
s->po
sp->o
p-% ... Continue reading »
2 years ago
2 years ago
Didn't you say that at least the subject will be a sequential identifier, though, and so not susceptible to that optimisation?
How many actual indexes will you need to efficiently support that set of queries, given your hierarchical index structure? Only 3?
2 years ago
The latter. Subject identifiers aren't exposed to the client, so there's no way to make statements using them specifically. Instead, to join data from two subjects in different graphs you must use identity by description (i.e. the subject that has these property values), and the person/agent doing the query must know about those values.
Internally the subject IDs can be in the 'object' position to support things like containment. E.g. the XML:
<pre>
<person>
  <name>Phil Dawes</name>
  <email>phil@example.com</email>
  <knows>
    <person>
      <name>Steve</name>
      <email>s@example.com</email>
    </person>
  </knows>
</person>
</pre>
Internally indexed as:
<pre>
#1 name "Phil Dawes"
#1 email "phil@example.com"
#1 tag Person
#1 knows #2
#2 name "Steve"
#2 email "s@example.com"
#2 tag Person
</pre>
but externally you can't refer to them. Does that make sense?
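To make the flattening concrete, here is a minimal Python sketch of how nested XML like the example could be turned into triples with internal sequential subject IDs. It is an illustration under assumptions, not the actual implementation: the `flatten` function, the use of integers for subject IDs, and the rule that a wrapper element such as `<knows>` becomes a predicate pointing at the nested subject are all hypothetical choices made for this example.

```python
import xml.etree.ElementTree as ET
from itertools import count

XML = """<person>
  <name>Phil Dawes</name>
  <email>phil@example.com</email>
  <knows>
    <person>
      <name>Steve</name>
      <email>s@example.com</email>
    </person>
  </knows>
</person>"""

def flatten(elem, triples, ids):
    """Give elem the next sequential subject ID and append its triples.

    Hypothetical helper: integer IDs stand in for the internal #1, #2, ...
    identifiers; string objects are literal values.
    """
    subject = next(ids)
    triples.append((subject, "tag", elem.tag.capitalize()))
    for prop in elem:
        if len(prop):
            # The property wraps nested elements (containment), so the
            # nested subject's ID goes in the object position.
            for child in prop:
                child_id = flatten(child, triples, ids)
                triples.append((subject, prop.tag, child_id))
        else:
            # Plain property: the element text is a literal value.
            triples.append((subject, prop.tag, prop.text))
    return subject

triples = []
root_id = flatten(ET.fromstring(XML), triples, count(1))
for triple in triples:
    print(triple)
```

The subject IDs exist only inside `triples`; nothing in the input XML names them, which matches the point that clients cannot refer to them directly.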
2 years ago
Opaque subject identifiers are even easier to index because they can be picked to be sequential in the index. I.e. subject 3 is at position 3.
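As a rough illustration of "subject 3 is at position 3", the sketch below stores each subject's properties in a plain list slot addressed by its ID, so a subject lookup is a direct array access with no search. The `SubjectTable` class and its method names are hypothetical, and an in-memory list is assumed purely for the example.

```python
class SubjectTable:
    """Hypothetical subject store where the subject ID is the list position."""

    def __init__(self):
        # Slot 0 is left unused so that the first subject gets ID 1.
        self._rows = [None]

    def new_subject(self):
        """Allocate the next sequential subject ID."""
        self._rows.append([])
        return len(self._rows) - 1

    def add(self, subject, predicate, obj):
        """Record a (predicate, object) pair for an existing subject."""
        self._rows[subject].append((predicate, obj))

    def lookup(self, subject):
        """Return all (predicate, object) pairs by direct indexing."""
        return self._rows[subject]

table = SubjectTable()
phil = table.new_subject()
steve = table.new_subject()
table.add(phil, "name", "Phil Dawes")
table.add(phil, "knows", steve)
table.add(steve, "name", "Steve")
```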
Re. number of indexes: I think I'll need at least the following.
s->p->o
p->o->s
o->s->p
So 3 index hierarchies for searches. The subject-id-in-the-object-position mentioned above is a special case, and will probably require its own (relatively small) index o->sp.
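The following Python sketch shows one way three index hierarchies could answer lookups with one or two positions bound: each hierarchy serves the queries whose bound positions form a prefix of its ordering. It is a sketch under assumptions, not the planned design. The `TripleIndex` class, the nested-dictionary layout, and the query routing are hypothetical, and it does not model the separate o->sp index for subject IDs in the object position.

```python
from collections import defaultdict

def nested():
    """Two-level mapping: first key -> second key -> set of third values."""
    return defaultdict(lambda: defaultdict(set))

def scan(index, first, second):
    """Yield (first, second, third) entries matching the bound prefix."""
    level = index.get(first, {})
    seconds = [second] if second is not None else list(level)
    for b in seconds:
        for c in level.get(b, ()):
            yield first, b, c

class TripleIndex:
    def __init__(self):
        self.spo = nested()  # s->p->o
        self.pos = nested()  # p->o->s
        self.osp = nested()  # o->s->p

    def add(self, s, p, o):
        self.spo[s][p].add(o)
        self.pos[p][o].add(s)
        self.osp[o][s].add(p)

    def query(self, s=None, p=None, o=None):
        """Return matching (s, p, o) triples; None means unbound."""
        if s is not None and o is None:   # s bound, or s and p bound
            return set(scan(self.spo, s, p))
        if p is not None and s is None:   # p bound, or p and o bound
            return {(c, a, b) for a, b, c in scan(self.pos, p, o)}
        if o is not None and p is None:   # o bound, or o and s bound
            return {(b, c, a) for a, b, c in scan(self.osp, o, s)}
        if s is None:                     # nothing bound
            raise ValueError("bind at least one of s, p, o")
        # All three bound: a simple membership test.
        found = o in self.spo.get(s, {}).get(p, ())
        return {(s, p, o)} if found else set()

index = TripleIndex()
for triple in [("#1", "name", "Phil Dawes"), ("#1", "tag", "Person"),
               ("#1", "knows", "#2"), ("#2", "name", "Steve"),
               ("#2", "tag", "Person")]:
    index.add(*triple)

people = index.query(p="tag", o="Person")
```

Every combination of one or two bound positions is a prefix of exactly one of the three orderings, which is why three hierarchies are enough for those lookups in this sketch.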