1st TCom Meeting: 15th March 2012 Discussions and Notes

Meeting Agenda

Meeting Participants

GIS infrastructure and GIS Viewer

Gianpaolo Coro (CNR)

Slides: pptx file

FAO is hosting an instance of GeoNetwork. Emmanuel is interested in knowing whether it is possible to federate GeoNetwork instances rather than GeoServer instances.

  • This has to be verified;

Emmanuel's comments

  • Federating GeoNetwork nodes rather than GeoServer instances is already possible with the GeoNetwork technology
  • The technology built in iMarine should consider the use of GeoNetwork nodes as a prerequisite.
    • In the example of FAO, it should actually rely on the layer metadata set by FAO in the FAO GeoNetwork for their layers, and not generate metadata in iMarine by harvesting the FAO GeoServer directly.

The "standard" GIS viewer has some limitations (namely scalability). That is the motivation leading CNR to develop the proposed approach. Also GeoServer has severe limitations in terms of scalability.

OBIS experienced the same issues; their approach was to extend the technology and generate the layers dynamically, i.e. on demand.

GIS Viewer is conceived as a standalone application; it has to be told which GeoServer to use.

The new version of GeoServer introduced the possibility to have a getCapabilities per namespace rather than just one getCapabilities per instance.
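As an illustration of the per-namespace getCapabilities, the sketch below builds both the instance-wide and the scoped request URL. It assumes GeoServer's workspace-scoped ("virtual") OWS endpoints, where the workspace name is inserted into the path; the host and the workspace name `aquamaps` are hypothetical.

```python
from urllib.parse import urlencode


def capabilities_url(base_url, workspace=None, service="WMS"):
    """Build a GetCapabilities URL, optionally scoped to one workspace."""
    root = base_url.rstrip("/")
    if workspace:
        # Scoped endpoint: only the layers of this workspace are advertised.
        root = f"{root}/{workspace}"
    query = urlencode({"service": service, "request": "GetCapabilities"})
    return f"{root}/ows?{query}"


# Hypothetical host and workspace, for illustration only.
GLOBAL_URL = capabilities_url("http://example.org/geoserver")
SCOPED_URL = capabilities_url("http://example.org/geoserver", "aquamaps")
```

A client such as the GIS Viewer could then be pointed at the scoped URL to receive a much smaller capabilities document.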

Re GeoExplorer, it is conceived to be a single access point to all the layers residing in the infrastructure. It is oriented to layer discovery. This view is obtained by federating the GeoServer instances.

To be checked: whether there is an Atom feed behind GeoNetwork.

Layers should be enriched with rich metadata allowing users to evaluate the authoritativeness of the data published. Moreover, there might be a problem of visibility.

  • Visibility issue can be mitigated by publishing layers in diverse storage / workspaces;

Re security, GeoServer supports only the visibility-oriented approach, and not on a per-user basis, i.e. the notion of user is not supported.

THREDDS infrastructure and WPS

Fabrice Brito (Terradue)

Slides: ppt file (same slides presented the previous day)

One of the needs of Edward's scenario is to enrich their data (biological observations) with physical oceanography information.

  • they are interested in managing time, e.g. enriching the discovery mechanism with time-oriented aspects;

SPREAD

Emmanuel Blondel (FAO) via Skype

Slides: pptx file

The presentation focused on:

  • D4Science technology requirements, including:
    • TimeSeries geo-curation
    • Intersection generation & discovery
  • GIS technology requirements, with the case study of accessing Aquamaps default layers in SPREAD

D4Science technology requirements

TimeSeries geo-curation

Marc:

  • please note that the curation phase is expected to be interactive, i.e. the user should have control over it.

Intersection Engine

GeoBatch should be compared with the approach discussed by Terradue.

  • this has been selected because of the intersection engine that is based on GeoBatch


GIS technology

GIS technology overlaps

The use of existing open-source technologies was also mentioned as part of the presentation, with the example of the Geobatch interaction libraries (Geoserver-manager, Geonetwork-manager) vs. GeoserverInteraction.

Aquamaps case study

In the case of accessing Aquamaps default layers in SPREAD, the use of Geoserver SQL view was suggested:

  • to reduce the complexity of Aquamaps default data access & consider a common way to access probability-based layers (single feature source)
  • to mention the need of GIS technology enhancements, e.g. WFSDatastore improvement to support View parameters
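To make the SQL view suggestion concrete, the sketch below builds a WFS GetFeature request that passes values to a parametrised GeoServer SQL view through the `viewparams` request parameter (`name:value` pairs separated by `;`). The host, the layer name and the parameter names `speciesid` and `minprob` are hypothetical; the actual view definition was not discussed.

```python
from urllib.parse import urlencode


def sql_view_wfs_url(base_url, layer, view_params):
    """Build a WFS GetFeature URL for a parametrised SQL view layer."""
    # Assumption: values contain no ';' or ',' (these would need escaping).
    pairs = ";".join(f"{name}:{value}" for name, value in view_params.items())
    query = urlencode({
        "service": "WFS",
        "version": "1.0.0",
        "request": "GetFeature",
        "typeName": layer,
        "viewparams": pairs,
    })
    return f"{base_url.rstrip('/')}/ows?{query}"


# Hypothetical layer and parameters: one feature source serving any species.
EXAMPLE_URL = sql_view_wfs_url(
    "http://example.org/geoserver",
    "aquamaps:species_probability",
    {"speciesid": "Fis-22747", "minprob": 0.5},
)
```

With a single parametrised layer, the same feature source can serve every probability-based layer, which is the "common way" mentioned above.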

Comments

  • CNR highlighted that tables can be huge and that this approach can be affected by limitations;
  • Edward affirmed that there is no limitation with this approach: huge tables can be exposed;
  • This is the approach the OBIS website relies on.

Action: to organise a technical discussion on this;

CNR commented that there is a continuous evolution of the requirements and desiderata characterising this scenario. It is fundamental to reach a shared understanding before planning any activity and / or workplan.

FAO (Blondel) affirmed that there is no change in the requirements. Accessing the Aquamaps default layers through a Geoserver SQL view layer is a proposal addressing the requirement of optimally accessing Aquamaps default layers (both WMS & WFS) in SPREAD. FAO highlighted again that the definition of high-level technology requirements requires access to the technical documentation of the developed components (action agreed at the 21/02/2012 meeting).

The minutes of the last meeting are available here.

Biodiversity cluster: status, opportunities and plan

Federating biodiversity resource providers: the Species Service

Pasquale Pagano (CNR)

Slides: pptx file

Anton: a way to capture the "fuzziness" of the filtering should be added, e.g. a threshold that widens the set of acceptable replies: the query specifies a point, but results lying within a given distance of that point are also accepted.
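A minimal sketch of such a threshold, assuming occurrence records carry `lat` and `lon` fields in decimal degrees (hypothetical field names) and that the tolerance is expressed in kilometres. The great-circle distance is computed with the haversine formula.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two points in degrees."""
    phi1, phi2 = radians(lat1), radians(lat2)
    dphi = phi2 - phi1
    dlam = radians(lon2 - lon1)
    a = sin(dphi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_KM * asin(sqrt(a))


def fuzzy_point_filter(records, lat, lon, tolerance_km):
    """Keep the records lying within tolerance_km of the query point."""
    return [
        record for record in records
        if haversine_km(lat, lon, record["lat"], record["lon"]) <= tolerance_km
    ]
```

A tolerance of 0 reproduces the strict point query, so the threshold can be exposed as an optional query parameter.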

Re GIS visualization, the tool should make it possible to specify the number of layers to be generated, i.e. just one layer, one layer per data source, one layer per data set.

Re credits, they should not be assigned by data source: e.g. it is not appropriate to give the same credit to all datasets coming from a data provider like OBIS; the real originator of the data needs to be cited.

Fabio (FAO): the design of the service should be conceived to promote re-usability. If the service contains "functions" that are meaningful / useful per se, each of them should be designed as a component in its own right.

IRD is another potential data provider.

PESI regional catalogue should be there.

There is no programmatic way for accessing Fishbase and SeaLifeBase.

Facilities for comparing two datasets (including species names) should be envisaged. The criteria for comparing two datasets should be identified, since they are data specific.

  • FAO will show the approach proposed for Vessels;

Performing Data Analysis: the bioclimate case

Gianpaolo Coro (CNR)

Slides: pptx file

Discussion moderated by FAO

Statistical cluster: status, opportunities and plan

Codelist Management and Timeseries Management - Discussion moderated by FAO and CNR

Erik Van Ingen (FAO)

Slides: pptx file

The requirements stemming from BoI / MDM should be validated by the project.

Entity Mapper

Fabio Fiorellato

Slides: []

The proposed approach is very flexible. To some extent it is complementary to FLOD (i.e. it can be used to populate FLOD), while to some extent it competes with FLOD (a client can query either FLOD or such a tool for mappings discovery).

From the "mapper" POV, the infrastructure is conceived as a computing power provider.

Semantic cluster: status, opportunities and plan

FAO/IRD: RDF production from IRD’s large databases

Julien Barde (IRD)

Slides: pdf file

Marc: an important aspect is to clarify the policy governing ontology development and publishing;

  • Yannis: this is a duty of the CoP, it is not a technical matter;

Fabio: is this discussion alluding to the fact that there is ONE place where all the data should go and be represented in RDF, namely FLOD?

Overall: it is fundamental to start discussing these aspects and involve the technical team, e.g. by sending emails through the TCom mailing list.

  • this discussion should be part of the activity taking place in the context of WP3;

FAO/FORTH: FLOD-xSearch interaction

Discussion moderated by FAO

Parallel session on Security

Participants:

  • Lucia Bonelli, Ciro Formisano (ENG)
  • Fabio Simeoni (FAO)
  • Andrea Manzi (CERN)
  • Manuele Simi (CNR)

15 March from 14:30 to 17:30

Authorization

  • how is it possible to match the call to a service with a particular resource and action and then find their related policy?
    • the call carries only the information about the identity of the caller
    • the information related to the resource is extracted from the body of the call with a regular expression found in a configuration file (format to be defined) associated to the called service
    • the information related to the action and/or attributes is extracted from a configuration file (format to be defined) associated to the called service
    • finally, the gCube Security Handler (to be renamed?) creates the request for the authorization interface
  • in the spirit of the new Resource Management with zero dependency, we need to understand how to pass the information needed to SOA3 transparently.
    • for managed clients to managed service calls:
      • only the identity of the caller is required to be carried by the call (taken from a configuration file)
      • in SOA3, an authorization policy defines the possibility to perform an arbitrary action on a certain resource based on a set of attributes, where the resource is whatever has a resource identifier
      • the action and attribute values are meaningless for the Authorization framework
      • in the context of managed services the action is an operation to be invoked on a service (which acts as target resource in this case) and the parameters of the operation can be mapped into attributes
      • the service could use a file to specify the parameters of the call that have to be ported as attributes
      • the role information currently passed in the SOAP header has to be passed in the HTTP header instead, so that it is not SOAP-specific.
      • when applicable, the possibility to pass the role information in the SOAP header will be preserved.
      • since managed services have no direct knowledge of the gCube resources (apart from themselves, and they do not even know that they are one), other types of authorization are not foreseen
    • for managed clients to unmanaged service calls:
      • it is not possible to support authorization
  • gCube Security Handler is the connector between gCube and SOA3
    • gCube Security Handler creates the request for the authorization interface based on the parameters of the message received by gCube
    • the instructions on how the parameters are extracted could be defined in a configuration file
    • gCube Security Handler will presumably be composed of two sub-modules:
      • the first one, a Java library independent of any Globus/gCube code, performs the parameter extraction, then wraps the authorization request and forwards it to SOA3
      • the second one depends on gCube for the current stack, and on the new foundations when they become available
    • an interface between the two modules will be defined; the best solution seems to be a String representing the HTTP message, passed as a parameter between the two modules
    • appropriate hooks have to be provided at client side
      • GCUBERemoteProxy for gCube clients
      • client-managers for managed clients
    • appropriate hooks have to be provided at service side
      • Globus handler
      • Listener compliant with Servlet Specifications
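The extraction steps discussed above can be sketched as follows. This is only an illustration under stated assumptions: the configuration format is still to be defined, so the dictionary layout, the service name `SpeciesService`, the `resourceId` element and the shape of the resulting request are all hypothetical and do not reflect the actual SOA3 interface.

```python
import re

# Hypothetical per-service configuration (the real format is to be defined).
SERVICE_CONFIG = {
    "SpeciesService": {
        # Regular expression locating the resource identifier in the body.
        "resource_pattern": r"<resourceId>([^<]+)</resourceId>",
        # Call parameters that have to be ported as attributes.
        "attribute_params": ["scope"],
    }
}


def build_authz_request(service, operation, caller, body, params):
    """Assemble an authorization request from an incoming call."""
    config = SERVICE_CONFIG[service]
    match = re.search(config["resource_pattern"], body)
    if match is None:
        raise ValueError("no resource identifier found in the call body")
    attributes = {
        name: params[name]
        for name in config["attribute_params"]
        if name in params
    }
    return {
        "subject": caller,            # identity carried by the call
        "resource": match.group(1),   # extracted with the configured regex
        "action": operation,          # operation invoked on the service
        "attributes": attributes,     # selected call parameters
    }
```

Keeping this logic free of any Globus/gCube dependency matches the first sub-module described above; only the hook that obtains the message would be stack-specific.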

Authentication

  • calls coming from the clients need to be authenticated
  • for authentication purpose, each call has to carry the credentials (username/password or DN) of the caller, not the role
  • we cannot rely on a previously verified role
    • each call can arrive from any caller now
    • at service side:
      • we need to associate the credentials to role(s) for each call
      • the credentials can be extracted from the call in the same way we extract parameters, values and resource, i.e. through a regular expression taken from a config file
      • in order to reduce the number of calls to LDAP, a cache of credentials -> roles' associations is needed for previously authenticated credentials
      • all the roles associated to an identity have to be considered for each call
    • at client side:
      • for managed clients: credentials could be found in system properties or in a config file
      • for gCore clients: credentials are propagated from the previous call
  • about the transport mechanism
    • we can pass the credentials inside HTTP headers, but we need to encrypt them if no HTTPS is used
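The cache of credentials -> roles associations mentioned above could look like the sketch below. The expiry time is an assumption added here so that role changes in LDAP are eventually picked up; `lookup` stands for whatever function authenticates the credentials against LDAP and returns the roles (hypothetical, supplied by the caller).

```python
import time


class RoleCache:
    """Cache of credentials -> roles associations with a time-to-live."""

    def __init__(self, lookup, ttl_seconds=300.0, clock=time.monotonic):
        self._lookup = lookup      # e.g. a function querying LDAP
        self._ttl = ttl_seconds    # assumed expiry, not from the notes
        self._clock = clock
        self._entries = {}         # credentials -> (timestamp, roles)

    def roles_for(self, credentials):
        """Return all roles for the credentials, querying LDAP on a miss."""
        now = self._clock()
        entry = self._entries.get(credentials)
        if entry is not None and now - entry[0] < self._ttl:
            return entry[1]
        roles = frozenset(self._lookup(credentials))
        self._entries[credentials] = (now, roles)
        return roles
```

In a real deployment the cache key would more likely be a digest of the credentials than the credentials themselves, so that passwords are not kept in memory.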

Integration in the production environment

  • the solution that has been released for D4Science-II is not yet integrated in some points
  • it is inconvenient to complete the integration of that version now and then adapt it to the new solution
  • it has been agreed to integrate directly the new solution

Liferay DB and sync with LDAP

  • CNR to check whether it is possible to use LDAP directly as the backend for Liferay.

SLA vs Authz Policies

  • there is the need for a component that checks the level of SLA
  • the concept of SLA management is different from the concept of authorization
  • SOA3 provides the possibility to manage only authorization where the authorization policies are those described above
  • SOA3 could support a limited SLA management under two assumptions:
    • an external module should keep the resource usage attributes up to date
    • the only responses accepted from the limited SLA management are "true" or "false"
  • under these assumptions the SLA management can be supported by the authorization framework
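Under the two assumptions above, a limited SLA check reduces to a boolean decision over usage attributes, which is the kind of answer an authorization framework can return. The sketch below illustrates the idea only; the attribute names and the dictionary representation are hypothetical and not part of SOA3.

```python
def sla_allows(usage, limits):
    """Return True only if every limited attribute is within its limit.

    usage:  resource usage attributes, kept up to date by an external module
    limits: maximum values agreed in the SLA for the same attribute names
    """
    return all(usage.get(name, 0) <= limit for name, limit in limits.items())


# Hypothetical attributes, for illustration only.
EXAMPLE_LIMITS = {"requests_per_day": 1000, "storage_gb": 50}
EXAMPLE_USAGE = {"requests_per_day": 420, "storage_gb": 12}
EXAMPLE_DECISION = sla_allows(EXAMPLE_USAGE, EXAMPLE_LIMITS)
```

Anything richer than this yes/no answer, such as reporting how much quota remains, would fall outside what the authorization framework can express.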

Conclusions

  • ENG will elaborate the discussed requirements, together with those that may emerge from the iMarine board next week, and will then provide a workplan
    • AuthZ and AuthN do not have to wait for the new foundations to go into production
      • apart from described technological hooks (that is the Security Handler), the rest of the implementation is not foundations-specific
      • it will be even better to test it on the stable gCore platform
  • M34 will be provided by the end of March

 
