Thursday, 27 May 2010

Progress review with Epimorphics

Our first progress review meeting was held a little later than intended, due to a combination of technical problems with the development and test environment inhibiting progress, and scheduling conflicts.
Notes from the meeting are at http://code.google.com/p/milarq/wiki/20100526_Meeting_Project_Progress.

The scoping experiments generally confirmed the premise on which the project is based, namely that it is the use of ordering in complex queries that gives rise to the most serious query performance problems. Additionally, some unanticipated (though, in hindsight, unsurprising) results were also noted:
  • Even without the cost of sorting, some of the original queries do not meet the sub-second query execution criterion. Engineering solutions have been tested for these cases.
  • Queries involving "joins" impose a significant performance penalty. That is, queries that discover chains of triples cost noticeably more than queries that combine triples with a common subject.
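To make the distinction between the two query shapes concrete, here is a minimal sketch using a toy in-memory triple list. It is an illustration only, not the project's code and not how ARQ evaluates queries; the data, the predicate names, and the functions `index_by_subject`, `star_query` and `chain_query` are all hypothetical.

```python
from collections import defaultdict

# Hypothetical toy data: (subject, predicate, object) triples.
TRIPLES = [
    ("doc1", "title", "Alpha"),
    ("doc1", "author", "person1"),
    ("doc2", "title", "Beta"),
    ("doc2", "author", "person2"),
    ("person1", "name", "Ann"),
    ("person2", "name", "Bob"),
]


def index_by_subject(triples):
    """Group triples as subject -> predicate -> list of objects."""
    index = defaultdict(lambda: defaultdict(list))
    for s, p, o in triples:
        index[s][p].append(o)
    return index


def star_query(index):
    """Common-subject pattern: ?d title ?t . ?d author ?a

    Both values are read from the same subject entry.
    """
    return sorted(
        (t, a)
        for preds in index.values()
        for t in preds.get("title", [])
        for a in preds.get("author", [])
    )


def chain_query(index):
    """Chain pattern: ?d author ?a . ?a name ?n

    Each ?a found must be looked up again as a subject (the "join").
    """
    return sorted(
        (d, n)
        for d, preds in index.items()
        for a in preds.get("author", [])
        for n in index.get(a, {}).get("name", [])
    )


index = index_by_subject(TRIPLES)
star_results = star_query(index)
chain_results = chain_query(index)
```

In the common-subject case every value comes from one subject's entry, whereas the chain case needs a second lookup for each intermediate binding. In this toy the extra lookup is cheap; the sketch only shows where the extra step arises, not how much it costs in a real store.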
The plan for the next month will focus on determining how to provide additional indexing to accelerate queries that depend on ordering information:
  • as a priority, examine ARQ processing with a view to understanding how index ordering information can be used
  • look at the external index mechanisms currently used by ARQ (Lucene plus property function hooks)
From this, we hope to arrive at a well-understood and easily implemented plan for software enhancements to deliver the required query performance.
