2012.01 WP9: Data Management Facilities Development Monthly Activity By Task and Beneficiary
From IMarine Wiki
This WP9 Activity Report describes the activities performed in January 2012 by Beneficiary and Task. It is part of the January 2012 Activity Report.
T9.1 Data Access and Storage Facilities
FAO has extensively documented the plugin framework of the new Tree Manager Service, describing the main concepts, design patterns, and individual components of the framework. The documentation has been circulated among interested developers. It will later be migrated to the Developer Guide, in step with the migration of the other components of the Content Management subsystem.
FAO has been working on a Tree Manager plugin for remote SPARQL endpoints. The implementation is ongoing and the code is already under revision control. A working prototype is expected early next month.
- the documentation of the plugin framework of the Tree Manager service.
- the SPARQL plugin prototype.
The CNR team is creating a prototype of the new Species Manager service. The service will be responsible for merging the different sources of information about taxonomy, species occurrence points and other species data. The sources currently under analysis are the GBIF network and the iObis information system.
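The merging role described above can be illustrated with a minimal sketch. It assumes a plugin-per-source design in which every plugin returns occurrence points for a scientific name. The names `SpeciesRecord`, `SourcePlugin`, `StaticPlugin` and `merge_species` are hypothetical and are not taken from the Species Manager prototype; the in-memory plugin only stands in for a remote source such as GBIF or iObis.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterable, List, Protocol, Tuple

Point = Tuple[float, float]  # (latitude, longitude)


@dataclass
class SpeciesRecord:
    """Merged view of one species across several sources (hypothetical)."""
    scientific_name: str
    occurrences: List[Point] = field(default_factory=list)
    sources: List[str] = field(default_factory=list)


class SourcePlugin(Protocol):
    """Assumed plugin contract: a name and a search by scientific name."""
    name: str

    def search(self, scientific_name: str) -> Iterable[Point]: ...


class StaticPlugin:
    """In-memory stand-in for a remote source; illustration only."""

    def __init__(self, name: str, data: Dict[str, List[Point]]):
        self.name = name
        self._data = data

    def search(self, scientific_name: str) -> Iterable[Point]:
        return self._data.get(scientific_name.lower(), [])


def merge_species(scientific_name: str,
                  plugins: Iterable[SourcePlugin]) -> SpeciesRecord:
    """Query every plugin and merge the points, dropping exact duplicates."""
    record = SpeciesRecord(scientific_name)
    seen = set()
    for plugin in plugins:
        points = list(plugin.search(scientific_name))
        if points:
            record.sources.append(plugin.name)
        for point in points:
            if point not in seen:
                seen.add(point)
                record.occurrences.append(point)
    return record
```

With two such plugins holding overlapping points for the same species, `merge_species` returns one record that lists both source names and each distinct point once. A real service would also need to reconcile taxonomic synonyms, which this sketch leaves out.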
None to report.
- Species Service prototype
- User Interface for Species Service prototype
- GBIF Network plugin prototype
- iObis Information System plugin prototype
T9.2 Data Transfer Facilities
During project month 3 (M3), CERN continued to study the possible integration of the EMI FTS service into gCube. To this end, a new version of the service has been installed at https://imarine1.cern.ch:8443/glite-data-transfer-fts/services/FileTransfer (firewalled) and configured for the d4science VO. In the meantime, work has started on a document giving an overview of the task (http://bscw.research-infrastructures.eu/bscw/bscw.cgi/d241094/iMarine%20T9%202%20Overview_v0.1.docx). The document was presented at the first Task conference call and will be the basis for the Wiki page on the Data Transfer facilities specification, which is due at month 6. The activities of the first two months and a plan for M4 were also reported during the conference call. A skeleton of the gCube Data Transfer Agent has been developed following the new gCube Maven project structure. This component will be needed whether the EMI FTS is integrated or a new gCube Data Transfer service is developed.
- New FTS version installed and configured for the D4Science VO.
- T9.2 Overview document drafted.
- First Task conf call organized
- gDTA service skeleton developed.
NKUA has been working on incorporating gRS2 into the new gCube Data Transfer Service.
The following tasks are in progress in order to achieve this goal:
- Adding support for multiple transfer protocols: A general-purpose URL resolution library for a variety of protocols (http, ftp, sftp, ftps, gridftp, bittorrent, etc.) is being implemented and is expected to be completed by next month. Its functionality will be used by the gRS2 method that opens streams on field payload. In this way, a client will be able to transfer data over any of the supported protocols by communicating the URLs of the objects in the corresponding gRS2 fields. The library can also be plugged into the standard URL mechanism of Java so that all URLs are handled uniformly. There is no need to support the protocols used for the transfer of data objects at the transfer level of gRS2 itself, since they are implemented over the current TCP layer and are sometimes in conflict with the nature of gRS2. All protocols other than HTTP(S) (see the next item) will be implemented at the application layer.
- Supporting HTTP as gRS2 transfer method: Point-to-point proxies for gRS2, which will be used to transfer the result set over HTTP(S), are under development. The result set will be delivered in XML format and will also be exposed as an HTTP endpoint. An example of such an endpoint is "http://hostname/gRS2Broker?result_set_uri".
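The two items above can be illustrated with a minimal sketch of scheme-based URL resolution. It is not the NKUA library, which targets Java. The names `UrlResolver`, `default_resolver` and `broker_url` are hypothetical. Only http, https and ftp are wired to a real opener here, through the standard `urllib.request.urlopen`; other schemes would need their own handlers. Percent-encoding the result set URI in the broker endpoint is an assumption, since the report only gives the general form of the endpoint.

```python
from typing import Any, Callable, Dict
from urllib.parse import quote, urlsplit
from urllib.request import urlopen

# An opener takes a URL and returns a readable stream-like object.
Opener = Callable[[str], Any]


class UrlResolver:
    """Dispatches a URL to the opener registered for its scheme."""

    def __init__(self) -> None:
        self._openers: Dict[str, Opener] = {}

    def register(self, scheme: str, opener: Opener) -> None:
        self._openers[scheme.lower()] = opener

    def supports(self, url: str) -> bool:
        return urlsplit(url).scheme.lower() in self._openers

    def open(self, url: str) -> Any:
        scheme = urlsplit(url).scheme.lower()
        try:
            opener = self._openers[scheme]
        except KeyError:
            raise ValueError(f"unsupported scheme: {scheme!r}") from None
        return opener(url)


def default_resolver() -> UrlResolver:
    """Resolver covering the schemes the Python standard library can open."""
    resolver = UrlResolver()
    for scheme in ("http", "https", "ftp"):
        resolver.register(scheme, urlopen)
    return resolver


def broker_url(hostname: str, result_set_uri: str) -> str:
    """Build an endpoint of the form http://hostname/gRS2Broker?result_set_uri.

    Assumption: the result set URI is percent-encoded into the query string.
    """
    return f"http://{hostname}/gRS2Broker?{quote(result_set_uri, safe='')}"
```

A client holding a gRS2 field that carries a URL would call `resolver.open(url)` without knowing which protocol is behind it. Adding sftp or gridftp support then amounts to registering one more opener.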
Following the architecture shown in the picture http://wiki.i-marine.eu/index.php/File:THREDDS_Architecture2.jpg, the activity has produced initial feedback on the use of a THREDDS server through either the WCS or the OPeNDAP protocol. The focus then moved to the WPS block, provided by http://52north.org/maven/project-sites/wps/52n-wps-webapp/ ; the goal is to obtain a full integration between this WPS server and the MapReduce framework provided by Hadoop (http://hadoop.apache.org/mapreduce/). This combination should provide a more scalable and reliable computing architecture for the processing side of the Data Access.
The first tests showed a good integration that looks promising for the management of geospatial data. A prototype running on bathymetry data should become available by mid-February.
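To make the processing model concrete, the following is a minimal single-process sketch of the map, shuffle and reduce steps applied to bathymetry samples. It uses neither the Hadoop API nor the 52°North WPS. The function names, the sample layout `(latitude, longitude, depth)` and the choice of a mean depth per grid cell are all assumptions made for illustration.

```python
from collections import defaultdict
from math import floor
from typing import Dict, Iterable, Iterator, List, Tuple

Sample = Tuple[float, float, float]  # (latitude, longitude, depth in metres)
Cell = Tuple[int, int]               # grid cell index (row, column)


def map_sample(sample: Sample, cell_size: float = 1.0) -> Iterator[Tuple[Cell, float]]:
    """Map step: emit (cell, depth) for one sample."""
    lat, lon, depth = sample
    yield (floor(lat / cell_size), floor(lon / cell_size)), depth


def shuffle(pairs: Iterable[Tuple[Cell, float]]) -> Dict[Cell, List[float]]:
    """Shuffle step: group the emitted depths by cell."""
    groups: Dict[Cell, List[float]] = defaultdict(list)
    for cell, depth in pairs:
        groups[cell].append(depth)
    return groups


def reduce_cell(cell: Cell, depths: List[float]) -> Tuple[Cell, float]:
    """Reduce step: mean depth of one cell."""
    return cell, sum(depths) / len(depths)


def mean_depth_per_cell(samples: Iterable[Sample],
                        cell_size: float = 1.0) -> Dict[Cell, float]:
    """Run map, shuffle and reduce in sequence on one machine."""
    pairs = (pair for s in samples for pair in map_sample(s, cell_size))
    return dict(reduce_cell(cell, depths) for cell, depths in shuffle(pairs).items())
```

In a Hadoop deployment the map and reduce functions would run in parallel on separate nodes and the framework would perform the shuffle. The sketch only shows how a per-cell aggregate decomposes into those steps.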
T9.3 Data Assessment, Harmonization and Certification Facilities
The tickets on Time Series components left open in the D4Science II project have been analyzed in order to produce requirements tickets for the iMarine project. The new tickets will soon be opened on the iMarine issue tracker.
- Tickets about Time Series Environment components coming from the D4Science II project have been analyzed.