After what seems like weeks of struggle (a few days, actually), I have managed to replicate a copy of the CLAROS demo server that passes all test cases using newer Jena libraries.
My problems, it seems, were down to a failure of dataset configuration management. I really should have known better, but sometimes it seems the fundamental lessons need reinforcing. Ho hum.
Anyway, now I can look to loading up the latest data from LIMC, which promises to add some interesting capabilities. And I will maintain (compressed) copies of the test dataset in Subversion!
Saturday, 12 June 2010
Friday, 11 June 2010
JISC project working with outside contractors
I've been reflecting on the ramifications for the MILARQ project of working with outside contractors.
While the initial motivation and justification for working with Epimorphics was for access to their technical expertise concerning Jena, I've noticed that working with an experienced external team is providing valuable views and insights into how to run this kind of project, which I'd like to think can be propagated to other JISC projects over time.
Labels: devcsi, implementation, JISC, methodology, MILARQ, productivity, progressPosts, rapidInnovation, VRERI
MILARQ technical review and planning meeting
This meeting was arranged at relatively short notice (i.e. unplanned) as we were facing some technical questions which we thought would benefit from face-to-face contact. But we also took the opportunity to treat the meeting as a mini sprint review.
The technical issues concerned (a) some identified optimizations which, while effective on a limited test query, we felt might not be effective across the range of queries CLAROS needed to perform, and (b) difficulties in replicating the updated server environment at Oxford.
Important outcomes are:
- The hypothesis that specialized indexes can resolve query performance problems is supported by some concrete evidence
- Substantial performance gains, to the extent that sub-second query performance may be achievable on current data, can be realized by materializing query results and arranging the queries appropriately. This approach will probably not scale well to even larger data volumes (query time is probably linear in data size): isolating values from about 66,000 object records takes over 0.5s
- More care is needed to ensure consistency of source data used for development and testing purposes
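
As a rough illustration of what materializing a query result means here, the following is a minimal sketch and not the MILARQ or Jena code. It contrasts rescanning every record on each request with extracting the distinct values once and serving later requests from the stored result. The record layout, field names, and the names `scan_distinct` and `MaterializedValues` are all hypothetical.

```python
from collections import defaultdict


def scan_distinct(records, field):
    """Linear scan: every call walks all records again."""
    return sorted({r[field] for r in records if field in r})


class MaterializedValues:
    """Extract distinct values per field once, then answer from the stored sets."""

    def __init__(self, records):
        self._values = defaultdict(set)
        for record in records:
            for field, value in record.items():
                self._values[field].add(value)

    def distinct(self, field):
        # .get avoids creating an empty entry for unknown fields
        return sorted(self._values.get(field, ()))


# Hypothetical object records, standing in for the real data
records = [
    {"id": 1, "type": "vase", "period": "archaic"},
    {"id": 2, "type": "gem", "period": "classical"},
    {"id": 3, "type": "vase", "period": "classical"},
]
materialized = MaterializedValues(records)
```

Building the stored result is itself one pass over all the records, so it has to be rebuilt or updated whenever the underlying data changes.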
More detailed notes from the meeting are at http://code.google.com/p/milarq/wiki/20100611_Meeting_Project_Progress
Labels: implementation, JISC, MILARQ, Planning, productivity, progressPosts, rapidInnovation, VRERI
Tuesday, 8 June 2010
Sprint 3 plan
I prepared this a couple of weeks ago, but forgot to announce it: http://code.google.com/p/milarq/wiki/SprintPlan_3.
Following the query performance scoping experiments, the general plan for sprint 3 is to analyze existing software and plan ways to include additional indexing information, and for Oxford to replicate Epimorphics' running version of the CLAROS query service using a more recent version of Jena.
Labels: JISC, MILARQ, Planning, progressPosts, rapidInnovation, VRERI
