2012.02 WP9: Data Management Facilities Development Monthly Activity By Task and Beneficiary
From IMarine Wiki
This WP9 Activity Report describes the activities performed in February 2012 by Beneficiary and Task.
It is part of the February 2012 Activity Report.
T9.1 Data Access and Storage Facilities
FAO has completed the first prototype of the SPARQL plugin for the Tree Manager services, with the latest snapshot available in Nexus repositories. Based on a Jena backend, the prototype supports arbitrary tree-based lookup and queries of RDF sources that expose a standard SPARQL query interface. For both classes of operations, the current prototype translates tree patterns into SPARQL queries that preserve a subset of the mandatory and optional constraints on edges that may be specified in patterns. It then translates the subset of the graphs returned by queries into trees, and returns them to clients. The translation of patterns into SPARQL queries and the translation of graphs into trees raise a number of technical issues. The key issues and the solutions implemented by the plugin include:
- RDF graphs may include cycles and shared nodes, which have no standard encoding in the tree model of the Tree Manager.
  - The plugin addresses this issue by creating trees with 'reference' nodes, i.e. nodes with a distinguished attribute which points back into the tree using local node identifiers.
- RDF graphs have no entry points that may be mapped onto tree roots.
  - This raises no issues in lookup operations, as the plugin can map the identified resource onto the tree root. For queries, the plugin uses the top-level edge constraints of the tree patterns provided by clients. In particular, it maps resources that have predicates that match the constraints onto tree roots. For queries whose patterns do not specify top-level constraints, the plugin falls back to heuristic behaviour and maps resources that are not objects of predicates (i.e. do not close cycles) onto tree roots.
- RDF graphs do not limit the triple-closure of resources that map onto tree roots, and SPARQL queries cannot express this closure in a generic fashion (no recursion).
  - The plugin addresses this issue by synthesising SPARQL queries for the 'bounded closure' of all triples directly or indirectly rooted in a given resource.
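The root heuristic and the 'reference' nodes described above can be illustrated with a minimal Python sketch. It is not the plugin's code (the plugin is built on Jena); the function names, the dictionary layout of tree nodes and the `n1`, `n2`, ... identifier scheme are assumptions made for illustration only, and literals are treated like any other resource for brevity.

```python
from itertools import count


def find_roots(triples):
    """Fallback heuristic: resources that are never the object of a triple."""
    objects = {o for _, _, o in triples}
    roots = []
    for s, _, _ in triples:
        if s not in objects and s not in roots:
            roots.append(s)
    return roots


def to_tree(triples, root):
    """Unfold the graph reachable from `root` into a nested dict.

    A resource met for the second time (shared node or cycle) is emitted as a
    'reference' node that points back to the local id of its first occurrence.
    """
    edges = {}
    for s, p, o in triples:
        edges.setdefault(s, []).append((p, o))

    ids = {}             # resource -> local node identifier (hypothetical scheme)
    next_id = count(1)

    def visit(resource):
        if resource in ids:
            return {"ref": ids[resource]}
        ids[resource] = f"n{next(next_id)}"
        node = {"id": ids[resource], "value": resource, "children": []}
        for predicate, obj in edges.get(resource, []):
            node["children"].append((predicate, visit(obj)))
        return node

    return visit(root)


# Illustrative data: 'alice' and 'bob' form a cycle, and 'bob' is shared.
TRIPLES = [
    ("doc", "author", "alice"),
    ("alice", "knows", "bob"),
    ("bob", "knows", "alice"),
    ("doc", "reviewer", "bob"),
]

if __name__ == "__main__":
    for root in find_roots(TRIPLES):
        print(to_tree(TRIPLES, root))
```

In this example only `doc` qualifies as a root, the cycle between `alice` and `bob` is cut by a reference node, and the second edge into `bob` is also rendered as a reference rather than a duplicated subtree.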
Future versions of the plugin will be able to translate a broader set of patterns and will allow root identification and radius queries to be configured, either when the plugin is bound to SPARQL endpoints or on a per-request basis.
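One way to picture a 'bounded closure' query with a configurable radius is sketched below. The report does not say how the plugin synthesises its queries, so the approach here (a CONSTRUCT query with nested OPTIONAL blocks, one per level) is an assumption, as are the function name and the `?p1`, `?o1`, ... variable naming.

```python
def bounded_closure_query(root_uri: str, radius: int) -> str:
    """Build a SPARQL CONSTRUCT query for triples within `radius` hops of a root.

    Since the query language has no generic recursion, the chain of triple
    patterns is unrolled up to a fixed depth. Deeper levels are wrapped in
    nested OPTIONAL blocks so that shorter paths are still returned.
    """
    if radius < 1:
        raise ValueError("radius must be at least 1")
    if any(c in root_uri for c in '<>"{}|^`\\ '):
        raise ValueError("root_uri contains characters not allowed in an IRI")

    patterns = []
    subject = f"<{root_uri}>"
    for depth in range(1, radius + 1):
        patterns.append(f"{subject} ?p{depth} ?o{depth}")
        subject = f"?o{depth}"  # the next level starts from this level's objects

    template = " . ".join(patterns)

    # Nest from the innermost level outwards.
    where = patterns[-1]
    for triple in reversed(patterns[:-1]):
        where = f"{triple} . OPTIONAL {{ {where} }}"

    return f"CONSTRUCT {{ {template} }} WHERE {{ {where} }}"


if __name__ == "__main__":
    print(bounded_closure_query("http://example.org/r", 2))
```

A larger radius returns more of the graph at the cost of a longer query and a larger result, which is presumably why making it configurable per endpoint or per request is useful.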
None to report.
Completed first prototype of the SPARQL plugin for the Tree Manager services.
CNR has completed the first prototype of the Species Manager Service and has started implementing the related plugins (OBIS, GBIF, Catalogue of Life). The service is under testing; the prototype version is running on the development infrastructure and can be accessed using the species discovery portlet in devportal.
Access to external repositories of biodiversity data such as OBIS, GBIF and Catalogue of Life.
T9.2 Data Transfer Facilities
During M4 of the project, CERN continued to edit the document describing the T9.2 task. A new version is now available at . In particular, the document now includes an analysis of the crucial points of integration of FTS2/FTS3 with respect to gCube Data Transfer, and a first proposed architecture for the gCube Data Transfer.
In parallel, the implementation of the data-transfer.agent has continued, with the aim of implementing a first interaction with the StorageManager and Tree Manager services.
Some feedback has been given to the Storage Manager developers in order to improve the API and fix some bugs. Lastly, the URL Resolution Library implementation released for 2.8.0 by NKUA has been validated, and some changes to the URL definition have been suggested.
- New version of the T9.2 Overview Document edited.
NKUA has been working on incorporating gRS2 into the new gCube Data Transfer Service. The following tasks are in progress in order to achieve this goal:
- Multiple transfer protocols support: The implementation of the general purpose URL resolution library for the protocols http, ftp, sftp, ftps, gridftp, and bittorrent is almost completed. The functionality of the library is implemented as a standard URL mechanism of Java in order to handle all URLs uniformly, as documented last month. The changes suggested by the CERN partners are in progress.
- Supporting HTTP as gRS2 transfer method: The implementation of the point to point proxies for gRS2 is in progress.
Readers are now configurable. In particular, readers can now override the capacity of the buffer specified by the corresponding writer. The overridden capacity argument acts as a hint: whether it is honored depends on the implementation of the underlying reader proxy.
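The 'hint' semantics can be pictured with the following minimal sketch. It does not use or reproduce the gRS2 API: the class names, the `buffer_capacity` method and the `max_capacity` limit are all hypothetical, chosen only to show that the same reader-side request may or may not take effect depending on the proxy behind it.

```python
from typing import Optional


class HonouringProxy:
    """Hypothetical reader proxy that accepts capacity hints up to a limit."""

    def __init__(self, max_capacity: int):
        self.max_capacity = max_capacity

    def buffer_capacity(self, writer_capacity: int, hint: Optional[int] = None) -> int:
        if hint is None:
            return writer_capacity          # no override requested by the reader
        if hint < 1:
            raise ValueError("capacity hint must be positive")
        return min(hint, self.max_capacity)  # honoured, but clamped to the limit


class FixedProxy:
    """Hypothetical reader proxy that always keeps the writer's capacity."""

    def buffer_capacity(self, writer_capacity: int, hint: Optional[int] = None) -> int:
        return writer_capacity               # the hint is silently ignored


if __name__ == "__main__":
    for proxy in (HonouringProxy(max_capacity=100), FixedProxy()):
        print(type(proxy).__name__, proxy.buffer_capacity(50, hint=20))
```

A reader therefore cannot assume that the capacity it asked for is the capacity in use.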
The URL Resolution Library has been released in 2.8.0. The URL Resolution Library is implemented as a standard URL mechanism of Java in order to handle all URLs uniformly and it can be used for every possible file transfer.
During February, Terradue pursued the integration of the Hadoop framework behind the OGC WPS implementation of 52 North. Prototyped during January and February, this framework has proven to be a valuable solution to support geospatial data processing scenarios. Bathymetry data retrieval for 250,000 simulated points was tested and performed well. A demo has been set up on Terradue's development environment with the implementation of a resampling WPS process: MyOcean data (netCDF-CF) passed by reference as a WCS online resource exposed by Thredds can be resampled to a given spatial resolution. A more complete demo will be available for the TCOM in FAO. Work is ongoing on a MyOcean data visualization WPS process that publishes the layers of a given MyOcean product as WMS layers in GeoServer.
Applying the resampling to the bathymetry has proven too CPU-intensive to be done within a typical web session. It has been agreed to store several resolutions of the bathymetry dataset.
A prototype of WPS-Hadoop has been demonstrated with bathymetry data extraction and MyOcean data resampling.
T9.3 Data Assessment, Harmonization and Certification Facilities
The new Gis Viewer component has been integrated with the Time Series Application. After a project refactoring, the Gis Viewer widget has been wrapped with an extension called GCube Gis Viewer.
The GCube Gis Viewer extension retrieves the Gis Viewer configuration information from the Runtime Resources stored in the gCube infrastructure. Moreover, the widget has been integrated with the user workspace: users are now able to save generated images of the visualized layers in their own workspace.
- Use of Runtime Resources instead of hard-coded configurations for the Gis Viewer component;
- Integration of the Gis Viewer component with the Workspace environment.