2012.05 WP10: Data Consumption Facilities Development Monthly Activity By Task and Beneficiary

From IMarine Wiki

This WP10 Activity Report describes the activities performed in May 2012, by Beneficiary and Task.

It is part of the Monthly Activity Report.

T10.1 Data Retrieval Facilities

NKUA Activities

During this period, NKUA has been involved in the following activities:

  • A conference between WP5 and WP10 was held to discuss the new configuration scheme of the revised Resource Registry. The new versions of all components were initially planned for release in gCube 2.9.0. However, given that a number of services, such as PE2ng and the Data Transfer Agent, will use the Resource Registry in the near future, a dynamic configuration solution was needed to avoid configuring the Resource Registry manually at deployment time. The solution agreed upon in the conference was to have separate components containing the Resource Registry configuration that the different services need at runtime. Since supporting the new scheme would also require a release of the search and index subsystems, it was decided to postpone the release of the Resource Registry. This decision had no adverse effects on other components: the new versions of the Resource Registry components do not include critical fixes, and the enhancements they incorporate are not required by the current versions of the other components released in gCube 2.9.0.
  • A conference on the integration of X-Search with the infrastructure and the gCube Search System was held. As also reported below, it was decided in this conference that (i) X-Search will be closely integrated with the gCube Search System, i.e. it will use ASL-Search and contact the gCube Search System Service via its web service API; and (ii) FORTH and NKUA will collaborate with WP8 to determine where the resources exploited by X-Search fit into the resource model. CNR also suggested that the computational requirements of X-Search should be analysed in collaboration with portal experts in order to select the optimal deployment scheme for X-Search.
  • Further discussions in ticket #363, addressing FORTH's requirement to annotate specific fields as useful for driving certain features of X-Search, led to a jointly accepted solution. This solution translates into a set of enhancements identified in ticket #412. More specifically, the annotation will be performed using the existing "Presentation Info" placeholder of presentable fields. The identified enhancements are the following:
    • The Resource Registry should support the retrieval of field annotations by providing the corresponding queries.
    • The Resource Registry should propagate annotations for presentable fields corresponding to data sources newly introduced into the Information Retrieval process.
    • A GUI will be developed as part of the Search Manager portlet in order to provide an administrative interface for the easy annotation of fields, in the context of WP6.
  • Implementation of the first two of the enhancements listed above has started.
  • Authoring deliverable D10.1: gCube Query Language Specification.
    • The deliverable reached the very last stages of the reviewing phase at the end of this period. A number of comments by the reviewer have been addressed in order to increase the quality of the document.
  • Authoring deliverable D10.3: iMarine Data Consumption Software.
    • The authoring and editing of the deliverable, as well as the reviewing phase, have been completed. A revised version was created addressing the comments of the reviewer. At the end of this period, NKUA was awaiting the approval of the revised version by the reviewer.
  • Release of D10.1 is delayed because the reviewing period, extended in order to meet the expected quality standards, took longer than initially anticipated. Delivery is expected to take place during the first days of the next period.
  • Release of D10.3 is delayed. Approval of the final version is pending.



FORTH Activities

A conference about the integration of XSearch in the infrastructure was held. The conference also discussed how XSearch will communicate with the gCube search system. A relevant ticket has been opened (ticket #361).

Our first intention was to use the ASL HTTP API. However, we agreed that the most efficient way to communicate is to use the client of the search system directly. This means that XSearch will be developed on top of several components (i.e. ASL, search system service, Resource Registry, gRS2). Support from NKUA will be required for development on top of these components.

Furthermore, FORTH will analyse the underlying tasks of XSearch in order to identify their processing requirements and, in particular, the resource-consuming components. If the incurred cost is heavy, we should consider wrapping the XSearch libraries and deploying them as a service in the infrastructure.


none


none

Terradue Activities

The beneficiary should report here a summary of the activities performed in the reporting period


The beneficiary should report here major issues faced in the reporting period and the identified corrective actions, if any.


The beneficiary should report here a bullet list highlighting the main achievements of the reporting period

T10.2 Data Manipulation Facilities

NKUA Activities

After the integration of the Workflow Data Transformation Service Adaptor with the Data Transformation Service, a large-scale testing period followed. During this period many minor bugs were fixed and some improvements were made.

In particular, the Data Transformation Service was extensively tested on image transformations, covering various combinations of MIME types as source and target. Different Data Sources and Data Sinks were also involved in the tests. Scenarios from the production environment were tested, among them thumbnail creation and the creation of rowsets for feeding the full-text and forward indexes from Metadata Collections and Aggregate Metadata Collections.

By the end of this testing period, many fixes had been applied to the Data Transformation Service, together with a few tweaks and improvements. For example, when no transformation is needed, Data Elements are simply forwarded to the Data Sink. In addition, DTS was switched to secure gCore, and the Adaptor was made independent of gCore.

Finally, the new implementation of the Data Transformation Service has been integrated and will be released along with gCube 2.9.

Future steps remain the same. The focus will be on fault tolerance and fault recovery of the transformation process, and on the concurrent execution of the same transformation process on different execution nodes.
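
The pass-through behaviour mentioned above can be illustrated with a minimal sketch. It is not the DTS code: the DataElement class, the transform function and the converter registry keyed by (source MIME type, target MIME type) are hypothetical names introduced only for this example, and a plain list stands in for the Data Sink.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Iterable, List, Tuple


@dataclass(frozen=True)
class DataElement:
    """Hypothetical stand-in for a DTS Data Element."""
    mime_type: str
    payload: bytes


# Hypothetical registry: (source MIME type, target MIME type) -> converter.
Converter = Callable[[bytes], bytes]


def transform(elements: Iterable[DataElement],
              target_mime: str,
              converters: Dict[Tuple[str, str], Converter],
              sink: List[DataElement]) -> None:
    """Write every element to the sink in the target MIME type."""
    for element in elements:
        if element.mime_type == target_mime:
            # No transformation needed: forward the element unchanged.
            sink.append(element)
            continue
        # Raises KeyError if no converter is registered for this pair.
        convert = converters[(element.mime_type, target_mime)]
        sink.append(DataElement(target_mime, convert(element.payload)))
```

In this sketch an element that is already in the target type reaches the sink as the very same object, so no conversion cost is paid for it.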


none


The following components will be available in the upcoming gCube 2.9 release:

  • DataTransformationService
  • DataTransformationLibrary
  • DataTransformationHandlers
  • DataTransformationPrograms
  • WorkflowDTSAdaptor

CNR Activities

The beneficiary should report here a summary of the activities performed in the reporting period


The beneficiary should report here major issues faced in the reporting period and the identified corrective actions, if any.


The beneficiary should report here a bullet list highlighting the main achievements of the reporting period

FAO Activities

The beneficiary should report here a summary of the activities performed in the reporting period


The beneficiary should report here major issues faced in the reporting period and the identified corrective actions, if any.


The beneficiary should report here a bullet list highlighting the main achievements of the reporting period

T10.3 Data Mining and Visualisation Facilities

CNR Activities

During the month of May CNR activities went in several directions:

  • the activities on the Statistical Manager (SM) implementation were reviewed and concentrated on tests of generators and modelers;
  • the interface to the SM is currently in the design phase. CNR is investigating several interfaces of Data Mining systems in order to take the best ideas from each. The systems analyzed were OpenModeler, RapidMiner, Yabi and Mahout;
  • CNR also worked on the management of environmental data to be supplied to the SM. In this scenario the Thredds software was used to retrieve environmental information associated with a given point on the globe; Thredds manages netCDF files containing environmental information over time. A library (Environment Explorer Library) was implemented for retrieving the physical or chemical features associated with given points by querying a GeoNetwork installation. Whether the feature resides on a GeoServer or on a GeoNetwork layer is transparent to the user. More information is available here;
  • CNR implemented the Geo Explorer facility, which is part of the Geospatial Data Visualization domain.
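
The retrieval of a feature value at a given point, described in the list above, can be sketched as a nearest-grid-point lookup. This is only an illustration under simplifying assumptions: the grid is held in memory as nested lists rather than read from a netCDF file, the time dimension is ignored, and the function names nearest_index and feature_at are hypothetical and do not come from the Environment Explorer Library.

```python
from bisect import bisect_left
from typing import Sequence


def nearest_index(axis: Sequence[float], value: float) -> int:
    """Index of the coordinate closest to value (axis sorted ascending)."""
    pos = bisect_left(axis, value)
    if pos == 0:
        return 0
    if pos == len(axis):
        return len(axis) - 1
    before, after = axis[pos - 1], axis[pos]
    # Ties go to the lower coordinate.
    return pos if (after - value) < (value - before) else pos - 1


def feature_at(lats: Sequence[float],
               lons: Sequence[float],
               grid: Sequence[Sequence[float]],
               lat: float,
               lon: float) -> float:
    """Value of the grid cell nearest to (lat, lon); grid[i][j] is at (lats[i], lons[j])."""
    return grid[nearest_index(lats, lat)][nearest_index(lons, lon)]
```

Points outside the grid are clamped to the nearest edge cell here; a real service might instead report that no value is available.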


There have been delays due to the recent gCube 2.9.0 release.


  • Statistical Manager was tested with local computation
  • SM Interface study and design was started
  • Thredds was installed and configured
  • The Environmental Explorer library has been implemented
  • Layers included in netCDF files on Thredds can be indexed on GeoNetwork by an automatic procedure
  • Retrieval of feature values on a certain point by GeoNetwork is possible
  • Retrieval of feature values on a certain point by GeoServer is possible
  • Retrieval of feature values on a certain point by Thredds is possible
  • GeoExplorer was released in gCube 2.9.0

NKUA Activities

The beneficiary should report here a summary of the activities performed in the reporting period


The beneficiary should report here major issues faced in the reporting period and the identified corrective actions, if any.


The beneficiary should report here a bullet list highlighting the main achievements of the reporting period

FAO Activities

The beneficiary should report here a summary of the activities performed in the reporting period


The beneficiary should report here major issues faced in the reporting period and the identified corrective actions, if any.


The beneficiary should report here a bullet list highlighting the main achievements of the reporting period

T10.4 Semantic Data Analysis Facilities

FORTH Activities

A conference on WP10 activities was held. The conference discussed the issues regarding the integration of XSearch in the infrastructure. In particular:

  • We agreed that the best strategy for communication between XSearch and the gCube search system is to use the clients of the search system directly.
  • We will start contributing XSearch code to the iMarine repository. For this reason a dedicated folder has been created (https://svn.d4science-ii.research-infrastructures.eu/gcube/trunk/semantic-search).
  • The new component will also use Maven conventions.
  • FORTH will revise the specification of MS45 and the description of subsystems (XSearch) to comply with the decisions made during the conference. Additionally, some sequence diagrams will be included to illustrate the components involved.
  • We created several tickets so that the activities are better coordinated and more visible.
  • Regarding the interaction of XSearch with SPARQL endpoints that are currently external to the infrastructure (i.e. FLOD), this can be driven by registering a SPARQL endpoint through the Runtime Resources.

The availability of an “extended” description document was also discussed. The idea was to be able to identify the fields of the results that contain information useful for post-search analysis (i.e. text clustering, textual entity mining). Since the agreed approach is to use the search clients directly, this cannot be done at the HTTP level. NKUA anticipates being able to provide a mechanism to semantically annotate the results with specific keywords that XSearch will recognise.
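
As a rough illustration of how such keyword annotations could drive post-search analysis, the sketch below selects the annotated fields of a result record and joins their text. Everything in it is an assumption made for the example: the keyword value, the dictionary-based representation of annotations and records, and the function names are hypothetical and do not reflect the mechanism NKUA will actually provide.

```python
from typing import Dict, List

# Hypothetical keyword marking a field as input for post-search analysis.
ANALYSIS_KEYWORD = "xsearch:analysable"


def fields_for_analysis(annotations: Dict[str, List[str]],
                        keyword: str = ANALYSIS_KEYWORD) -> List[str]:
    """Names of the fields whose annotations contain the keyword."""
    return [name for name, keywords in annotations.items() if keyword in keywords]


def extract_text(record: Dict[str, str], fields: List[str]) -> str:
    """Join the values of the selected fields that are present in the record."""
    return " ".join(record[name] for name in fields if name in record)
```

The joined text would then be handed to the clustering or entity mining step; fields without the keyword, such as identifiers, are left out.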


none


none

FAO Activities

The beneficiary should report here a summary of the activities performed in the reporting period


The beneficiary should report here major issues faced in the reporting period and the identified corrective actions, if any.


The beneficiary should report here a bullet list highlighting the main achievements of the reporting period
