2012.10 WP9: Data Management Facilities Development Monthly Activity By Task and Beneficiary
From IMarine Wiki
This WP9 Activity Report describes the activities performed in October 2012 by Beneficiary and Task.
It is part of the Monthly Activity Report.
T9.1 Data Access and Storage Facilities
FAO has been involved in the following activities:
- For the 3rd TCOM, FAO has presented a plan for the integration of Tree Manager in the system, liaising with CNR and NKUA to gather all the required information.
- Following the plan above, FAO has supported CNR in the development of the Tree Manager plugin for the SPD Service, releasing an update to the tree-repository component to better support the plugin. FAO has similarly supported NKUA in the development of a Tree Manager-specific Data Source component of the DTS service, illustrating the use of the Tree Manager client API.
- As part of the plan above, FAO has presented a cross-layer architecture for the HTTP URI resolution of data that is otherwise accessible through ad-hoc SOAP interfaces in the data management layer of the system. Following the meeting, FAO has engaged in a discussion with relevant partners, such as CNR and Athens. As a result, FAO has implemented the entire architecture and released in gCube 2.11 a set of components that fall within the scope of WP8, WP9, and WP11. Among these, FAO has released trees 1.1.0, which supports URI generation for the trees returned by the Tree Manager service, and has tested its interactions with the other components of the new HTTP URI resolution architecture.
- FAO has released version 2.0.1 of the tree-manager-library, which fixes a problem with queries for Tree Manager services.
None to report.
None to report.
Species Discovery Service
CNR improved the data transfer between Species Discovery Service and its plugins.
A new minor release of Storage Manager (1.0.0) has been released in gCube 2.10. The new version adds a storage area called "volatile", which is used to store temporary files.
More details are available at: https://gcube.wiki.gcube-system.org/gcube/index.php/Storage_Management_NEW#storage-manager_v_1.0.1
Species Tree Manager Plugin
CNR developed a plugin of the Tree Manager services that defines and maintains tree views of biodiversity data sources exposed by the Species Manager services.
CNR fixed bugs reported when using the cache that speeds up the response time of the Species Discovery Service.
- Bugs fixed
- Better data transfer between Species Discovery Service and its plugins
- storage-manager-wrapper 1.0.1 release
- storage-manager-core 1.0.1 release
- Species Tree Manager Plugin
T9.2 Data Transfer Facilities
CERN actively developed a first version of the Data Transfer Scheduler portlet (v1.0.0-SNAPSHOT). The first version is able to:
- connect to an available Data Transfer Scheduler service by querying the Information System
- browse data sources (FTP only for now)
- schedule and cancel transfers, and retrieve transfer outcomes from Data Transfer Agents
more details available at :
During the debugging phase of the portlet, some fixes and enhancements have been made on both the Data Transfer Scheduler (v1.1.0-SNAPSHOT) and the Agent side (v1.2.0-SNAPSHOT).
First version of the Data Transfer Scheduler portlet
NKUA has been working on fixing a bug that caused TCPConnectionManager to hang when an empty request was received, as reported in ticket #481. The bug fix is available in version MadgikCommons-1-3-0 in the 2.11.0 release.
In addition, gRS2 has been extended to expose changes to the window size of the RandomReader through the RecordReaderDelegate API, which will be used by the ResoultSetConsumer. This extension is available in version gRS2-2-1-0 in the 2.11.0 release.
The following components have been released in gCube 2.11.0:
We have concluded the delivery of the WPS-hadoop and WPS Client libraries (see below). We have attended the iMarine TCOM at FAO.
Software components developed by Terradue in the context of WP9 for geospatial data processing:
Geospatial Data Processing takes advantage of the OGC Web Processing Service (WPS) specification as a web interface that allows the dynamic deployment of geospatial processes. While OGC WPS provides a clear interface, it does not by itself make the computing resources scalable. To address this, Apache Hadoop MapReduce was selected as the computing technology for the processing resources. The software component org.gcube.data-analysis.wps-hadoop acts as a complete framework to host geospatial processes that are exposed as OGC Web Processing Services and run on Hadoop clusters (or pseudo-clusters). The framework includes demonstration algorithms that other developers or integrators can use as a basis:
- Bathymetry Algorithm: a process to retrieve bathymetry from a netCDF file containing the GEBCO bathymetry;
- Resampler Algorithm: a process that resamples a geospatial layer in netCDF-CF;
- Intersection Algorithm: a simple process, based on the 52 North WPS algorithm, that computes the intersection of two input polygons;
- TIFFUploader Algorithm: a process for visualization purposes that uploads each layer from a map file to a GeoServer WMS instance.
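To illustrate what the Intersection Algorithm computes, the following is a minimal sketch of a two-polygon intersection that uses only the JDK's java.awt.geom classes. It is not the 52 North or WPS-hadoop implementation; the class name PolygonIntersectionSketch, its helper methods, and the sample coordinates are hypothetical and chosen only for illustration.

```java
import java.awt.geom.Area;
import java.awt.geom.Path2D;

/** Illustrative only: intersects two simple polygons with java.awt.geom. */
final class PolygonIntersectionSketch {

    private PolygonIntersectionSketch() {}

    /** Builds a closed polygon from a flat list of x,y coordinate pairs. */
    static Path2D polygon(double... xy) {
        if (xy.length < 6 || xy.length % 2 != 0) {
            throw new IllegalArgumentException("need at least three x,y pairs");
        }
        Path2D.Double path = new Path2D.Double();
        path.moveTo(xy[0], xy[1]);
        for (int i = 2; i < xy.length; i += 2) {
            path.lineTo(xy[i], xy[i + 1]);
        }
        path.closePath();
        return path;
    }

    /** Returns the region shared by both polygons (empty if they are disjoint). */
    static Area intersection(Path2D first, Path2D second) {
        Area result = new Area(first);
        result.intersect(new Area(second));
        return result;
    }

    public static void main(String[] args) {
        // Two hypothetical, partially overlapping squares.
        Path2D a = polygon(0.0, 0.0, 2.0, 0.0, 2.0, 2.0, 0.0, 2.0);
        Path2D b = polygon(1.0, 1.0, 3.0, 1.0, 3.0, 3.0, 1.0, 3.0);
        Area overlap = intersection(a, b);
        System.out.println("empty: " + overlap.isEmpty());
        System.out.println("bounds: " + overlap.getBounds2D());
    }
}
```

A production process would more likely work on geographic geometries through a dedicated geometry library; the planar JDK classes are used here only to keep the sketch self-contained.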
To better test and exploit the WPS-Hadoop algorithms, a Java WPS-Client library was developed. It will also be useful for allowing external services and clients to exploit the WPS side of projects. The client is based on the 52 North Java Client API and contains some convenience classes for interacting with WPS. It provides methods to perform GetCapabilities, DescribeProcess and Execute requests against the WPS-Hadoop server. The package called "demo" contains an example of using the client with IntersectionAlgorithm. Furthermore, the WPS-Client includes demonstration classes to run the Bathymetry Algorithm, Resampler Algorithm, Intersection Algorithm and TIFFUploader Algorithm processes. To complement the library, a command-line tool is also provided in the package.
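The three request types named above can be illustrated by the key-value-pair (HTTP GET) encoding defined in OGC WPS 1.0.0. The sketch below only builds the request URLs; it does not use the WPS-Client or the 52 North Java Client API, whose classes are not shown in this report. The class name WpsRequestUrls, the endpoint URL, the process identifier and the input names are all hypothetical.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.LinkedHashMap;
import java.util.Map;

/** Illustrative only: builds OGC WPS 1.0.0 key-value-pair request URLs. */
final class WpsRequestUrls {

    private WpsRequestUrls() {}

    /** GetCapabilities: asks the server which processes it offers. */
    static String getCapabilities(String endpoint) {
        return endpoint + "?service=WPS&request=GetCapabilities";
    }

    /** DescribeProcess: asks for the inputs and outputs of one process. */
    static String describeProcess(String endpoint, String processId) {
        return endpoint + "?service=WPS&version=1.0.0&request=DescribeProcess"
                + "&identifier=" + encode(processId);
    }

    /** Execute: runs a process; inputs are joined as name=value pairs separated by ';'. */
    static String execute(String endpoint, String processId, Map<String, String> inputs) {
        StringBuilder dataInputs = new StringBuilder();
        for (Map.Entry<String, String> input : inputs.entrySet()) {
            if (dataInputs.length() > 0) {
                dataInputs.append(';');
            }
            dataInputs.append(input.getKey()).append('=').append(input.getValue());
        }
        return endpoint + "?service=WPS&version=1.0.0&request=Execute"
                + "&identifier=" + encode(processId)
                + "&DataInputs=" + encode(dataInputs.toString());
    }

    /** Percent-encodes a parameter value as UTF-8. */
    private static String encode(String value) {
        try {
            return URLEncoder.encode(value, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException("UTF-8 is always supported", e);
        }
    }

    public static void main(String[] args) {
        // Hypothetical endpoint, process identifier and input names.
        String endpoint = "http://example.org/wps/WebProcessingService";
        Map<String, String> inputs = new LinkedHashMap<String, String>();
        inputs.put("polygonA", "POLYGON((0 0,2 0,2 2,0 2,0 0))");
        inputs.put("polygonB", "POLYGON((1 1,3 1,3 3,1 3,1 1))");

        System.out.println(getCapabilities(endpoint));
        System.out.println(describeProcess(endpoint, "IntersectionAlgorithm"));
        System.out.println(execute(endpoint, "IntersectionAlgorithm", inputs));
    }
}
```

Real Execute requests with large or complex inputs are usually sent as XML over HTTP POST; the GET form is shown here because it makes the structure of the three operations easy to compare.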
T9.3 Data Assessment, Harmonization and Certification Facilities
Study of FAO's OpenSDMX project. A meeting with Erik Van Ingen at FAO Headquarters (Meeting report) covered the following subjects:
- SDMX specifications
- Technologies and registry implementations available in the SDMX scenario
- Deployment diagrams and architectures of systems capable of disseminating SDMX documents
- FAO's work around SDMX (OpenSDMX, data.fao.org)
- Fusion Registry capabilities and drawbacks
- Metadata Technology sdmxsource library
- Study of already available SDMX registry implementations: Metadata Technology Fusion Registry, Eurostat Registry.
- Analysis of the capabilities of Fusion Registry and Eurostat Registry.
- Test of Fusion Registry through the usage of SDMX documents, made available from FAO's opensdmx project page.
- Fixed several errors on FAO's SDMX documents in order to make them conform to the standards.
- Study of the Metadata Technology sdmxsource library to see whether it can be leveraged to meet CNR requirements.
- Study of the Spring Inversion of Control framework, which is required to use sdmxsource.
- Started to make contacts with Metadata Technology and Eurostat developers.
Tabular Data Widget:
- Added support for multiple datasourcefactories