Use Cases for EA-CoP Data Access and Sharing Policies

From IMarine Wiki

Here follows a list of use cases to which the EA-CoP Data Access and Sharing Policies may apply.

Template for the use cases

Strategy

The Strategy chapter positions the Use Case in the broader context of iMarine objectives (draw a link to the relevant Wiki page in case the strategy has already been defined elsewhere):

  • Define the initiative and set the Goal from the user perspective
  • List specific benefits and comparative advantages of using the iMarine platform
  • Describe how the use case can be sustained in the future

Policy

The Policy chapter frames the use case in the community exploitation perspective:

  • List general policy principles related to data resources, access, sharing, and storage
  • Align the use case with the iMarine data sharing Policy (Disclaimer, Copyright, Posting Content, Shared Data, Public Data, Secondary Use, Derivative Work, and Data Citation) and extend these where needed
  • List any negative potential impact of this use case

Guidelines

The policy is extended with Guidelines:

  • Metadata and models implementing the use case; this chapter includes a mapping with the business metadata
  • Editorial workflow
  • Roles and responsibilities in the workflow

This can be aligned with FACP practices on:

  • Objectivity;
  • Reliability and timeliness;
  • Length versus comprehensiveness;
  • Hard copy versus electronic format;
  • Languages and translations;
  • Partnerships.

Code List Management

Code list management is split into two topics, to address the difference between general policies related to the principle of code list management within a community and policies related to the implementation of code list management in a tool (Cotrix). As described in the Cotrix use case below, code list manager policies are quite limited since, for instance, the sharing policies are directly linked to the user profile authorized to create, publish, and disseminate code lists.

In this use case, we address more general policy principles related to code list management within a community, regardless of tools and implementation concerns.

Strategy

Code list management is a key component of data management. It is meant to provide a central repository of quality code lists (reference data, master data) to be shared within a given community. Code list management answers several needs of the community:

  • Some code lists are authoritative for the community. These code lists are maintained and shared by the authoritative institution. The community needs easy access to these code lists in different formats (simple ones such as CSV, more complex ones such as SDMX). The community expects to be alerted in good time when code lists are updated, and to receive a comprehensive description of the changes.
  • The community needs code lists to evolve to address new needs (for instance, new species caught in an area mean that the ASFIS list needs to be updated regularly): users need to know which communication channel is open to the institution responsible for maintaining the code list.
  • Some authoritative code lists need to be enriched by community members, as they do not completely answer the community's classification needs. Either the code list is designed to be enriched (the CPC classification, for instance, reserves blocks of codes so users can define their own classification based on the CPC model), or it is not (in the case of the CWP vessel type classification, a national institution will create its own local vessel type list and map it to the CWP classification).
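The update-alerting need above can be illustrated with a small sketch: given two versions of a code list, compute a description of the changes to send to the community. This is a hypothetical illustration; the function name and the ASFIS-like example codes are assumptions, not part of any iMarine specification.

```python
# Hypothetical sketch: summarizing the changes between two versions of a
# code list, so the community can be alerted with a description of changes.

def diff_code_lists(old, new):
    """Compare two code lists (dicts of code -> label) and report changes."""
    added = sorted(set(new) - set(old))
    removed = sorted(set(old) - set(new))
    relabelled = sorted(c for c in set(old) & set(new) if old[c] != new[c])
    return {"added": added, "removed": removed, "relabelled": relabelled}

# Example: an ASFIS-like species list gains one code and relabels another.
v2012 = {"SKJ": "Skipjack tuna", "YFT": "Yellowfin tuna"}
v2013 = {"SKJ": "Skipjack tuna", "YFT": "Yellowfin tuna (Thunnus albacares)",
         "BET": "Bigeye tuna"}

print(diff_code_lists(v2012, v2013))
```

A real alerting mechanism would attach such a summary to the distributed code list, so every level of the dissemination chain receives the same description of changes.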


Policy

Code list management exposes code lists and their mappings to the community in several electronic formats. These resources are open data; there are no restrictions on the use and sharing of code lists.

Code list quality is enforced with a set of comprehensive metadata describing the code list and ensuring identification of the source, owner, communication channel, authoritativeness, and validity range. The frequency of updates varies from one code list to another: some are very stable and are updated every 10–20 years (gear type, vessel type), while others have a yearly version.

Given some complex distribution mechanisms, these code list metadata must be strongly attached to the code list data. An example of a complex distribution mechanism is the following: the ASFIS code list is published by FAO, collected by DG-Mare, which validates changes (or not) and then distributes it to the EU countries. Each national institution is then in charge of distributing code lists to its partners (local professional organizations, research institutions such as IRD, e-logbook software companies). At each level, a request could be sent to FAO to add new species. The communication channel to FAO/ASFIS (the owner's e-mail) needs to be clearly identified for all actors and available at each level. The example above illustrates the complexity of responsibility in code list dissemination. Each level has its own responsibility for sharing the code list. But as soon as a code list is extracted from the code list management tool, the responsibility for using and sharing it lies with the users, not with the code list management tool or the code list owner.

Code list evolution is deeply linked to good collaboration with code list users. The community submits requests to insert items into the code list.

[policy for Enrichment/mapping]

Guidelines

Code list management requires a set of business metadata that is already covered by the iMarine general policies through the 12 standard Dublin Core publication metadata elements. Additional metadata are required to address the specific needs expressed previously:

  • Version: a code list evolves in time and versions are issued regularly.
  • Authoritativeness: there is a need to extend the DC Source metadata to capture whether this code list is considered authoritative by the community. A context or scope attribute would be sufficient (the scope of the current list being international, regional, national, or local).

Code list management should comply with the general iMarine data sharing policy, especially by making sure that code list metadata are associated and distributed with the code list.
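As a hedged illustration of these guidelines, the sketch below models a code list metadata record carrying a Dublin Core subset plus the two extensions discussed above (version and an authoritativeness scope), and validates it. The field names and the particular required subset are assumptions for illustration only, not the actual iMarine metadata profile.

```python
# Hypothetical validation of code list metadata: a Dublin Core subset plus
# the Version and Authoritativeness/scope extensions described above.

REQUIRED = {"title", "creator", "source", "date"}          # illustrative DC subset
SCOPES = {"international", "regional", "national", "local"}

def validate_metadata(meta):
    """Check that a code list carries the metadata the policy asks for."""
    missing = REQUIRED - meta.keys()
    if missing:
        raise ValueError(f"missing Dublin Core fields: {sorted(missing)}")
    if "version" not in meta:
        raise ValueError("code lists evolve in time: a version is required")
    if meta.get("scope") not in SCOPES:
        raise ValueError(f"scope must be one of {sorted(SCOPES)}")
    return True

asfis_meta = {
    "title": "ASFIS List of Species",
    "creator": "FAO",
    "source": "FAO Fisheries and Aquaculture Department",
    "date": "2013-02-01",
    "version": "2013",
    "scope": "international",   # authoritative at the global level
}
print(validate_metadata(asfis_meta))
```

Attaching such a record to every distributed copy of the code list is one way to keep metadata "strongly attached" to the data through a multi-level dissemination chain.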

Cotrix Code list management

Strategy

Code list management is envisioned as the natural extension of several existing code list management tools, offering specific functionality not yet delivered through other tools.

Code list management in iMarine aims to deliver, through Cotrix, a versatile and flexible environment for the management and versioning of code lists, and for provisioning repositories of existing code lists with quality code lists in their native format.

The product delivering the services is described here: CLM

The process to develop and implement the Cotrix components is described in a separate wiki, as this is done in collaboration with external entities without access to the iMarine wiki environment: cotrixrep

The initial role of the iMarine platform for Cotrix will be to consume the output of the code list manager as SDMX code lists, and to provision the code list manager with SDMX code lists where a new version is required.
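To make the SDMX exchange concrete, here is a minimal sketch of extracting codes from a simplified SDMX-style code list fragment. Real SDMX-ML uses namespaced elements and a much richer structure; the stripped-down XML below is an assumption for illustration, not the actual format exchanged with Cotrix.

```python
# Sketch: reading codes from a simplified, non-namespaced SDMX-style
# code list fragment (illustrative only; real SDMX-ML is namespaced).
import xml.etree.ElementTree as ET

FRAGMENT = """
<Codelist id="CL_GEAR" version="1.0">
  <Code id="TW"><Name>Trawl</Name></Code>
  <Code id="LL"><Name>Longline</Name></Code>
</Codelist>
"""

def read_codes(xml_text):
    """Return a dict of code id -> name from a Codelist fragment."""
    root = ET.fromstring(xml_text)
    return {c.get("id"): c.findtext("Name") for c in root.findall("Code")}

print(read_codes(FRAGMENT))
```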

In later iterations, and under the aegis of WP6, relevant elements of the code list manager will be brought into the iMarine ecosystem, and advanced functionality is expected to consume e-infra resources. Future mapping use cases are described in the code list mapping wiki page.

Policy

Specific policies for the operation of code list management are not required because:

  • The properties/set of quality required to achieve the specific objective are all based on open and accessible resources.
  • The principles for code list management in the realm of Cotrix are beyond the policies offered through the e-infra and other structures. Therefore, no direct impact on CLM development is expected.
  • The principles of the general iMarine data sharing Policy (Disclaimer, Copyright, Posting Content, Shared Data, Public Data, Secondary Use, Derivative Work, and Data Citation) need extending when the output of the CLM is considered. These are met by enriching the CLM products (i.e. code lists with a set of well-defined metadata) on which iMarine policies can be based.
  • The CLM has authentication and security components to which the enforcement of responsible use by the users is outsourced. The various actors involved in the CLM are the Cotrix manager and the Cotrix users. These depend on access rights granted through other systems, and maintained by the Cotrix manager, to share produced code lists. However, these are formally not part of Cotrix.
  • Cotrix has a specific user group defined elsewhere.
  • Users of Cotrix will need support to ingest, manage, version, and share code lists. The support will mostly be provided through on-line and tool-tip texts in English. Since the workflow is rather narrow, the support can be limited.

Guidelines

The policy cannot yet be extended with Guidelines, as it is too early to tell how the tool will behave.

FLOD

Strategy

The FLOD content management focuses initially on enriching the KB with code lists and other master data. KB management is envisioned to relate to the repository management paradigm currently under review at FAO. FLOD would rely on specific functionality not yet delivered through other tools for its content generation and maintenance.

The product delivering the services is described here: [1]

The initial role of the iMarine platform for FLOD will be to consume its content as RDF through its API and SPARQL endpoints. In addition, a human interface is available to discover and browse content. There are no specific restrictions or controls incorporated in the data access services.

Policy

Specific policies for the operation of FLOD are not required, because it is intended to serve as Linked Open Data.

For the development of FLOD content, other paradigms may apply, e.g. QA and QC on the ingestion and maintenance workflows.

Guidelines

The policy cannot yet be extended with Guidelines, as it is too early to tell how the tool will behave.

iMarine EA Linked Open Data Initiative

content to be provided by Claudio Baldassarre, Julien Barde, and Anton Ellenbroek

Strategy

The Strategy chapter positions LOD data access and sharing policies in the broader context of iMarine objectives:

The expected products and services are described here: Ecosystem_Approach_Community_of_Practice:_EA-LOD. To summarize:

  • Define the initiative and set the Goal
The EA-iMarine-LOD, an initiative promoted by FAO, is meant to develop the necessary capacities in the infrastructure to instantiate a network of scientific datasets accurately interlinked within and beyond the iMarine infrastructure, with its focus and core in the EA domain, and to deliver services easing the exploitation of this network of interlinked datasets.

The goal of this initiative is to overcome the challenges of the demanding task of LOD engineering and exploitation by supplying what providers lack: the resources and the technical expertise.

  • Identify the benefits
LOD datasets come to full fruition when they are as densely interlinked as possible, both internally and with external LOD datasets. The return on investment in good-quality LOD engineering is the possibility of becoming part of a fast-growing network of datasets in the EA domain, created by institutions or even citizen scientists. Scenarios of interoperable systems, integrated information retrieval environments, information mash-up environments, data harmonization, and dissemination of public URIs for content annotation at web scale can be engineered on top of the network of distributed LOD datasets, with minimum obtrusion into existing systems.
These benefits are reflected in two products developed in the context of iMarine: the SmartFish Regional Information System (RIS) and the IRD Tuna dynamic fact sheets.
  • Position the role of the iMarine platform in respect of the LOD
The EA to fisheries and conservation of marine living resources requires the combination of data gathered from multiple contexts, and most of these data reference each other in the same way ecosystems relate organisms. When this co-referencing is captured in a consistent, reusable, and shared network of semantic computational relationships, it produces a knowledge asset that did not previously exist. The key role of the iMarine platform is that of mediator for iMarine partners that want to ship their data into the LOD cloud. Because of its catalysing role for data sharing in the EA, the iMarine platform is ideally positioned to produce itself a node of EA LOD, meant to become the reference hub of scientifically accurate data mappings in the EA. The production of the core set of LOD builds on and extends the iMarine code list management facility. Last but not least, the iMarine platform can offer scalability in storage and computing performance, a welcome asset for a technology so demanding in such resources.

Policy

What follows is the first Policy proposal, corresponding to the vision for an EA-iMarine-LOD centered on a core of mapping relationships among the LOD datasets contributed by project partners.

  • Define the properties set of quality(*) required to achieve the specific objectives
Scientific accuracy: the EA-iMarine-LOD should enable the delivery of semantically meaningful connections and mash-up services with the required scientific accuracy. This can be realized thanks to a trustworthy core set of linked entities, generated under the iMarine Governance and policy. The core set will have to implement quality rules with which a sub-set of iMarine VREs might be able to comply, given the scientific value of the VRE operations, operators, and governing policies (e.g. the Code list manager/mapper, the iMarine Geonetwork catalog, the iMarine SDMX registry). The core (or first realm) will hence consist of mapping relationships generated as output of processes taking place in selected iMarine VREs.
Comprehensiveness within the EA: the interlinking and aggregation services should not be limited to the core LOD data published, and will fruitfully benefit from the much wider offer of EA LOD datasets published by iMarine partners or by external organizations, such as IRD's Ecoscope, FAO's FLOD, and FAO's SmartFish. Those EA LOD datasets produced outside of the iMarine governance constitute the second realm, aiming at comprehensiveness within the EA. The iMarine governance will aim at identifying those datasets of interest, or means to generate them, including through iMarine LOD generation services. Comprehensiveness within the EA will be achieved by linking the entities of the core to exposed URIs of the second realm.
Global LOD partnerships: the EA-iMarine-LOD available on the LOD cloud will offer a unique opportunity for iMarine partners to draw interoperability linkages with external datasets (defined in this policy as the third realm) available on the LOD cloud (e.g. Agrovoc), thus offering a great return on investment to the iMarine partners. New partnerships such as iMarine-AgInfra can open up, as LOD itself has the principle of catalyzing relationships towards reference datasets in a specific domain like the EA.
Reliability: reliability is essential for the EA-iMarine-LOD, as it is an integral part of the scientific data processes promoted by iMarine. Reliability builds on a set of factors, including a trusted LOD engineering process, the inherited quality of the provenance data, and the responsibility/reliability of the originator of relationships across datasets. Quality assurance rules can contribute to reliability, such as not imposing mutual responsibility for relationships established unilaterally: isolation can be guaranteed by the technology itself, while mutually implemented references indicate acknowledgement and generally a shared vision of the domain, which can translate into higher reliability.
Stability of references: when a LOD dataset is updated to a new version, the data owner must guarantee the same set of URIs for the entities pre-existing the update, and preferably must guarantee the persistence of the URIs so that clients will continue to find a description of those entities via web protocols (for instance when an entity is referred to by other LOD providers via mapping). Versioning information should be part of the dataset itself, as explicit as possible, to enable machine clients to distinguish data across time.
Timeliness: if the evolution of a LOD dataset impacts incoming referential relationships, as good policy/practice the data owner should have in place notification mechanisms that can monitor broken links drawn by external issuers and alert the issuer. This should be a shared responsibility between the referring and the referred data provider.
  • Define the principles prevailing for the implementation of LOD
Attribution and traceability: meta-information on ownership, publishing rights, and copyrights that applies to the entire dataset should explicitly form part of it, so that machine clients can recover such information in automated processes. Also, if the dataset is linked to a number of other LOD datasets, it should explicitly include the map of such linking, together with other metadata.
Flexible ownership options: LOD dataset creation calls for specific expertise which Partners do not necessarily master, while requiring compatibility with Partners' policies. Flexibility is part of the EA-iMarine-LOD policy, and three options of involvement are proposed when creating a LOD dataset:
  • full delegation: the data provider opens free access to its data repository. LOD engineering and storage run in the e-infrastructure. Once created, the result can be published through the iMarine web channel or through the data provider's web channel.
  • semi-delegation: the data provider opens free access to its data repository. LOD engineering runs on the e-infrastructure, while storage resides in the data provider's infrastructure, where the result of engineering is deployed after completion. Once created, the result will be published through the data provider's web channel.
  • no delegation: the data provider completes all the tasks of LOD engineering in house, and stores and publishes its linked dataset through its web channel.
For each of the three participation options, the data provider is guaranteed to keep authority over the generated linked data, either by the technical means of domain name authority (direct), or through the metadata expression of ownership, rights holder, publication rights, etc. (indirect).
  • Define the responsibilities of the various actors involved
The data provider: by committing to the initiative, the data provider seeks to publish a selected dataset in open and linked format. This gives the opportunity to join a bigger network of datasets in the EA domain, thus enriching its own data. To achieve this RoI, the envisioned investment consists of: engineering LOD data from its existing source (with the support of the iMarine LOD engineering service); instantiating an access mechanism to the LOD dataset created (e.g. a SPARQL endpoint, or dereferenceable URIs); assuring good quality of LOD engineering by adopting standard ontology models; intensifying the network of relationships with other LOD datasets; attaching business metadata to the LOD dataset; and providing support for cross-dataset linking (for instance evaluating proposed mappings and linking back, or exposing an interface for linking discovery).
The Board: generally oversees the EA-iMarine-LOD policy and its evolution. It decides the scopes in which the EA-iMarine-LOD can play a key role in supporting external application development, decides on liaisons with technology providers that may improve the quality of the network (e.g. infrastructure scalability, network design, the LOD engineering process), and decides on the inclusion of new LOD datasets in the first or the second realm.
The (future) Secretariat: is responsible for proposing the policies of participation in the EA-iMarine-LOD initiative, and for selecting the technologies for LOD engineering, in so doing exerting quality control. The Secretariat will also assure the QoS for serving the EA-iMarine-LOD (e.g. it will keep tools up and running); finally, it will create opportunities for training and capacity building around the EA-iMarine-LOD (e.g. application design and development, or development of a LOD dataset from existing data sources).


  • Define the type of collaborations
Collaborations are required among data owners to cross-link the datasets in building the second realm: cross-linking is the responsibility of the source dataset, and linking back from the target dataset improves reliability. Collaboration may also include the identification of the ontologies that provide the semantic relationships to establish the linking.
Communication between the iMarine LOD data owner and the LOD owner in the cloud is required regarding cross-linking of the second data realm towards the LOD cloud, in order to achieve link-back. In this communication, if a seed of mapping already exists, it can be presented for evaluation and endorsed as a link-back. Alternatively, preliminary discussions and candidate mappings can be identified, to then produce a seed of cross-linking and close the loop.
Collaboration is also required between the Board or the Secretariat and the LOD technology providers.
  • Define the kind of support required
This initiative requires that at least the project partners that are already data providers for the VREs, or that are involved in providing reference data for the processes in the VREs, are also involved in the creation of the EA-iMarine-LOD.

The initiative also needs support to realize a vision plan so that the EA-iMarine-LOD is also used outside the project, through liaison management between iMarine and other (EU) projects.
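The stability-of-references principle above can be sketched as a simple check: every entity URI present in a previous version of a dataset must survive into the new version, while the version itself is carried as dataset metadata. The URIs and field names below are illustrative assumptions, not actual iMarine identifiers.

```python
# Hypothetical sketch: verifying URI stability across dataset versions, as
# required by the stability-of-references policy principle.

def check_uri_stability(old_version, new_version):
    """Return the URIs from the old version that the new version broke."""
    return sorted(set(old_version["uris"]) - set(new_version["uris"]))

v1 = {"version": "1.0", "uris": ["http://example.org/id/SKJ"]}
v2 = {"version": "2.0", "uris": ["http://example.org/id/SKJ",
                                 "http://example.org/id/BET"]}
print(check_uri_stability(v1, v2))   # empty list: no references broken
```

A non-empty result would be exactly the situation the timeliness principle addresses: the data owner should notify the issuers of any links pointing at the broken URIs.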

Guidelines

The policy will be extended with Guidelines:

  • Metadata and models

Build on RDF and OWL Metadata standards.

These will implement the EA-iMarine-LOD centered on a core of mapping relationships among the LOD datasets contributed by project partners. The configuration of the EA-iMarine-LOD can develop across three areas or data realms.

The first realm (the inner area in Figure 1) contains the linked data attributed to iMarine. By this we mean data in linked format (i.e. RDF) modeling knowledge surfacing from processes run in the VREs, or from other original activities. Being original, these data cannot have ownership claimed over them by others, but they will have explicit provenance, as they will usually refer to entities from the second realm.

The second realm includes linked data as they are produced with a LOD engineering methodology (supported by iMarine tools) from existing data repositories. These linked datasets are subject to the policies of access and publication as issued by the respective data owner. The datasets in this realm can include relationships referring to entities inward to the first realm, or outward to the third realm.

The third realm is representative of the LOD cloud, i.e. the data space on the web of interconnected linked datasets. The EA-iMarine-LOD will be a small-scale LOD cloud, interconnected with it from its core and through the second-realm data, and specialized in the EA. Linked datasets in the third realm are beyond any form of control or policy applicable by the project, as they exist independently, like any other resource on the web.
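A minimal sketch of how a first-realm mapping statement might look, assuming SKOS for the mapping predicate and Dublin Core Terms for the provenance statement; all `example.org` URIs are hypothetical placeholders, not actual iMarine identifiers.

```python
# Hypothetical sketch: a first-realm mapping triple relating a second-realm
# entity to a third-realm entity, serialized as N-Triples, plus a simple
# provenance statement about the mapping dataset itself.

def ntriple(s, p, o):
    """Serialize one triple of URIs in N-Triples syntax."""
    return f"<{s}> <{p}> <{o}> ."

# A first-realm mapping links a partner species URI to an external one.
mapping = ntriple(
    "http://example.org/flod/species/YFT",            # second realm (partner)
    "http://www.w3.org/2004/02/skos/core#exactMatch", # SKOS mapping predicate
    "http://example.org/external/thunnus-albacares",  # third realm (LOD cloud)
)
provenance = ntriple(
    "http://example.org/imarine/mappings/v1",         # first-realm dataset
    "http://purl.org/dc/terms/creator",
    "http://example.org/imarine",
)
print(mapping)
print(provenance)
```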


  • Editorial workflow

LOD dataset creation in the first and second realms will be supported by the e-infrastructure with a number of services for LOD engineering. The services, invoked in a pipeline, carry out the dataset generation. A second iteration of the same pipeline will produce a new version of the dataset. Executing the services obviously requires access to the data repository to be LOD-fied.

The editorial workflow for this initiative is instantiated by a specific configuration of gCube services for LOD engineering executed in a pipeline. The execution will process a number of transformations sequentially, from the data access exposed by the data provider, and produce a LOD dataset. During this ETL process, the workflow includes provenance or lineage metadata to record the executed processes and the sources and targets of transformations.
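The pipeline idea can be sketched as follows: transformations run sequentially over the extracted records, and each step appends a lineage record so that the resulting dataset carries provenance about the processes that produced it. The step names are illustrative assumptions, not actual gCube service names.

```python
# Hypothetical ETL sketch: a pipeline of transformations that also
# accumulates lineage metadata describing each executed step.

def run_pipeline(records, steps):
    """Apply (name, fn) steps in order, recording lineage per step."""
    lineage = []
    for name, fn in steps:
        records = [fn(r) for r in records]
        lineage.append({"step": name, "records": len(records)})
    return records, lineage

steps = [
    ("extract-labels", lambda r: {**r, "label": r["name"].strip()}),
    ("mint-uris", lambda r: {**r, "uri": f"http://example.org/id/{r['code']}"}),
]
data, lineage = run_pipeline([{"code": "SKJ", "name": " Skipjack tuna "}], steps)
print(data[0]["uri"], [s["step"] for s in lineage])
```

Rerunning the same pipeline against a refreshed source would produce a new dataset version, with the lineage documenting that second iteration.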

To support the cross-linking activity, both source and target datasets must share sources for linking discovery: public SPARQL endpoints exposing the content of the dataset to programmatic query; structured documents containing entities from both datasets (e.g. an OGC metadata layer of species distribution); access through an API for matching discovery; or any other sort of off-line collaboration that can prepare the data for cross-linking.
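As a hedged sketch of linking discovery under the shared-sources assumption, the snippet below proposes candidate cross-links between a source and a target dataset where entity labels coincide; in practice the proposals would be evaluated by the target owner before a link-back is issued. All URIs are hypothetical.

```python
# Hypothetical sketch of linking discovery between two datasets whose
# entities are exposed (e.g. via SPARQL endpoints or exported documents).

def discover_links(source, target):
    """Propose (source_uri, target_uri) pairs whose labels coincide."""
    by_label = {label.lower(): uri for uri, label in target.items()}
    return [(uri, by_label[label.lower()])
            for uri, label in source.items() if label.lower() in by_label]

source = {"http://example.org/a/1": "Yellowfin tuna"}
target = {"http://example.org/b/9": "yellowfin tuna",
          "http://example.org/b/7": "Swordfish"}
print(discover_links(source, target))
```

Label matching is only a seed; real link discovery would combine several evidence sources before a mapping is endorsed.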

  • Roles and responsibilities in the workflow

The committed member may want to participate across a number of levels of involvement when creating their LOD dataset.

The workflow starts with a preparatory phase of LOD dataset engineering for those who do not have the resources, expertise, or tools to underpin such LOD production.

WORMS data use

Strategy

The Strategy chapter positions the concerned Use Case in the broader context of iMarine objectives (draw a link to relevant Wiki page in case the strategy has already been defined elsewhere):

  • Define the initiative and set the Goal
  • Identify the benefits
  • Position the role of the iMarine platform in respect of the concerned Use Case

Policy

The Policy chapter

  • Define the properties/set of quality(*) required to achieve the specific objective
  • Define the principles prevailing for the concerned use case; these principles refer to the general iMarine data sharing Policy (Disclaimer, Copyright, Posting Content, Shared Data, Public Data, Secondary Use, Derivative Work, and Data Citation) and might extend these
  • Define the responsibilities of the various actors involved
  • Define the type of collaborations required
  • Define the kind of support required

Guidelines

The policy is extended with Guidelines:

  • Metadata and models implementing the use case; this chapter includes a mapping with the business metadata
  • Editorial workflow
  • Roles and responsibilities in the workflow

Species Products Discovery

Strategy

The Strategy chapter positions the concerned Use Case in the broader context of iMarine objectives (draw a link to relevant Wiki page in case the strategy has already been defined elsewhere):

  • Define the initiative and set the Goal
  • Identify the benefits
  • Position the role of the iMarine platform in respect of the concerned Use Case

Policy

The Policy chapter

  • Define the properties/set of quality(*) required to achieve the specific objective
  • Define the principles prevailing for the concerned use case; these principles refer to the general iMarine data sharing Policy (Disclaimer, Copyright, Posting Content, Shared Data, Public Data, Secondary Use, Derivative Work, and Data Citation) and might extend these
  • Define the responsibilities of the various actors involved
  • Define the type of collaborations required
  • Define the kind of support required

Guidelines

The policy is extended with Guidelines:

  • Metadata and models implementing the use case; this chapter includes a mapping with the business metadata
  • Editorial workflow
  • Roles and responsibilities in the workflow

Applifish - FAO Species fact sheet

[AppliFish] is an iMarine mobile App based on the FAO Aquatic Species Fact sheets mashed up with information from other consortium partners (AquaMaps, Fishbase, SeaLifeBase, WoRMS, OBIS, IUCN).

It was released late December 2012, and nearly 1000 downloads were registered as of March 1, 2013.

AppliFish was instrumental in the development of a set of default documents that may be re-used for other products.

The documents for Applifish that relate to the iMarine access and sharing policies are:

  • The About page;
  • The Credits page;
  • The Disclaimer.

Strategy

The Strategy of AppliFish concerning data access and sharing was that only open, free, global, and scientifically based data are contained in the App, and that all data owners provided prior consent to the use and context of their data in a mobile app.

The strategy for data access and sharing

  • aims at informing as wide a public as possible about important marine species using publicly shared data,
  • with the benefit of free, portable information access for users, without any costs to the contributing organizations,
  • building on the iMarine platform as the orchestrator and container of data in the App.

Policy

The AppliFish Policies bring open data into a controlled Application; all content is under the control of the App manager. It is thus an example of a derivative work. The policy concern when preparing the app was to ensure that no copyright infringements were made, that each contributor agreed to have its data disseminated jointly with the other sources, and that fair use was recognized and acknowledgments were included. AppliFish was built on the understanding that all mashed-up data are open, and extracted from controlled and quality sources.

The principles are implemented through:

  • Disclaimer, which was developed with the assistance from the FAO Legal Office;

The following issues were debated during the compilation of the Disclaimer:

  • Permissions to use, re-use, and re-distribute the information
  • Identification of the legitimate copyright holder (legal body) and consequently the contact point (although this has no legal implication)
  • Disclaiming any liability of the data owner (e.g. in regard to country boundaries)
  • The full list of sources of information
  • Updating of the disclaimer in case the list of sources varies

  • Copyright, which was derived from the FAO copyright statement;
  • Data Citation is ensured through the AppliFish credits page;
  • Posting Content is not enabled, but will become relevant in future versions.

See below the actual texts for the credits, disclaimer, and copyright notice. A citation mechanism has not been implemented yet; a proposed text is indicated below. A possible CC license is also provided.

AppliFish was conceptualized with the following uses in mind:

  • The only actors are users interested in learning more about an aquatic species, but not for scientific or political uses;
  • For normal usage of the tool, no specific collaborations are required. However, the tool evidences the loose collaboration with FIN, WoRMS, OBIS, and others at the data integration level.
  • The operation of the tool requires no significant support structure. Regular updates of the tool are under the control of the developers.

Guidelines

AppliFish, being fully controlled by FAO, requires the following guidelines for:

  • Editorial workflow: information is produced in large part by the FishFinder team; using the Species fact sheet module, the FIGIS team ensures that the information is properly transferred to iMarine and consequently to AppliFish.
  • Roles and responsibilities in the workflow: FishFinder is the actor responsible for giving clearance to the release of new information through AppliFish. OBIS, AquaMaps, and FB-SLB provided consent to use their data; IUCN public data were used, but no consent was sought.

Source texts for Credits, Disclaimer and Copyright notice, Citation (Draft)

Credits

iMarine services help to combine data from different global and authoritative data providers into informative fact sheets on over 550 marine species. AppliFish species information is built by mashing up data from various sources:

  • FAO aquatic species fact sheets
  • AquaMaps species probability maps
  • AquaMaps species 2050 probability maps
  • Local names from Fishbase, SeaLifeBase and WoRMS
  • Links to data sources, such as OBIS, Fishbase/SeaLifeBase, FAO
  • IUCN species conservation status

The following FAO criteria were used for selecting species available in this app:

  • an annual catch exceeding 10,000 MT
  • importance for local industry or social groups
  • commercial value
  • endangered or vulnerable species
  • importance to aquaculture
  • by-catch
  • potential future importance to fishery

www.i-marine.eu

Disclaimer and Copyright notice

AppliFish is provided "as is" without any warranties of any kind, either express or implied, including but not limited to, warranties of title or implied warranties of merchantability, accuracy, reliability or fitness for a particular purpose. The Parties to the iMarine Consortium make no warranty, either express or implied, as to the accuracy, reliability, or fitness for a particular purpose of the data disseminated through AppliFish.

The designations employed and the data assembled in this application are taken from multiple sources (FAO-FishFinder, WoRMS, Fishbase, SeaLifeBase, IUCN, Aquamaps, OBIS) and may sometimes be contradictory or overlapping. In no event shall the Parties to the iMarine Consortium be liable to you or any other person for any loss of business or profits, or for any indirect, incidental or consequential damages arising out of any use of, or inability to use, AppliFish, even if previously advised of the possibility of such damages, or for any other claim by you or any other person.

Information, text, graphics, etc. provided to you through AppliFish are provided solely as a resource and a convenience to you and do not imply the expression of any opinion whatsoever on the part of any of the Parties to the iMarine Consortium, including but not limited to the legal or development status of any country, territory, city or area or of its authorities, or concerning the delimitation of its frontiers or boundaries.

AppliFish contains copyrighted material and/or other proprietary information and is thus protected by copyright laws and regulations worldwide. Use of any material copied from AppliFish is restricted to informational, non-commercial or personal use only; no modifications to the information are permitted and the source of information must be fully acknowledged; all references to AppliFish data must mention the iMarine name and URL and specify the AppliFish version number. Use for any other purpose is expressly prohibited without written permission of the copyright holders; iMarine may assist in identifying and contacting the legal owner to obtain this permission. Requests for permission to use the information provided by AppliFish must be made in writing to the Chief, Fisheries Statistics and Information Service, FAO, Viale delle Terme di Caracalla, 00153 Rome, Italy; or by e-mail to FI-Inquiries@fao.org.

Citation - DRAFT

Here follows a possible example of Species fact sheet citation:

© FAO, 2013. AppliFish. Albacore. In: iMarine (Data e-Infrastructure Initiative for Fisheries Management and Conservation of Marine Living Resources) [online]. Updated 12 March 2013. [Cited 14 March 2013]. http://www.i-marine.eu/AppliFish/. See "Disclaimer and Copyright Notice" for the whole list of contributors to AppliFish.
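To illustrate, the draft citation above can be assembled programmatically from a few metadata fields. The following is a minimal sketch; the function name and field choices are illustrative only, not part of any existing iMarine or AppliFish API:

```python
from datetime import date

def format_citation(species: str, updated: date, cited: date,
                    url: str = "http://www.i-marine.eu/AppliFish/") -> str:
    """Build an AppliFish fact-sheet citation following the draft format above."""
    return (
        f"© FAO, {updated.year}. AppliFish. {species}. "
        "In: iMarine (Data e-Infrastructure Initiative for Fisheries Management "
        "and Conservation of Marine Living Resources) [online]. "
        f"Updated {updated.strftime('%d %B %Y')}. "
        f"[Cited {cited.strftime('%d %B %Y')}]. {url}"
    )

print(format_citation("Albacore", date(2013, 3, 12), date(2013, 3, 14)))
```

Generating the citation from structured fields, rather than copying free text, keeps the dates and version information consistent with the fact sheet being cited.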

CC License - Proposal

Attribution-NonCommercial-NoDerivs 3.0 Unported

AppliFish by iMarine is licensed under a Creative Commons Attribution-NonCommercial-NoDerivs 3.0 Unported License. Permissions beyond the scope of this license may be available at http://www.i-marine.eu/AppliFish/Disclamer.aspx.

Geospatial data and OGC Web-Services

An alternative title for this Policy use case is "iMarine Geospatial data discovery, access and sharing Policy". This use case was initially promoted mainly by IRD and FAO through the Geospatial cluster.

Strategy

The Strategy chapter positions the geospatial data access and sharing policies in the broader context of iMarine objectives.

The products intended to be shared with and discoverable through the infrastructure are summarized here: Geospatial Cluster - Data Sources. To summarize:

  • Define the initiative and set the Goal

The sharing and discovery of geospatial data products aims to:

  1. instantiate a catalogue of relevant and well-described GIS data coming from organizational spatial data infrastructures, for access within and beyond the i-Marine web portal, with a focus on the marine and fishery domains;
  2. deliver the services to properly discover, access and process the data made available through the catalogue;
  3. when applicable, deliver entity-based sharing and discovery of geospatial data products through semantic repositories/warehouses (Linked Open Data).

The goal is to lead to MoUs on the description, access and sharing of GIS products, whether shared with or produced within the infrastructure, in close connection with initiatives such as the Open Geospatial Consortium and the EU INSPIRE directive.

  • Identify the benefits

The main benefits of the iMarine Geospatial data discovery, access and sharing Policy are as follows:

    • take advantage of existing standard methodologies in terms of data description and cataloguing, and of the underlying implementation solutions, to facilitate the discovery of and access to geospatial data sources published through their respective infrastructures, thus ensuring access to the most up-to-date geospatial information from data owners;
    • provide users with quality metadata that guarantee suitable use;
    • ensure the retrieval, comparison and processing of all geospatial data in a common, standard way;
    • when possible, ensure semantic-based discovery of geospatial data products.
  • Position the role of the iMarine platform with respect to the concerned Use Case

The role of the iMarine platform is to ensure that both external data shared with the infrastructure and i-Marine products themselves are well described, shareable, accessible and processable in a common, internationally recognized, standard fashion. Thanks to its transformation and derivative product services, iMarine should also contribute to the generation and publication of geospatial datasets compliant with the INSPIRE directive. To this end, iMarine has to assist data providers that want to share their data within the iMarine infrastructure, facilitated by guidelines and appropriate services. At this stage of the Policy use case development, one important challenge and key role for the i-Marine platform is to ensure that geospatial data are searchable by thematic domain categories (e.g. species distributions) as classified by the INSPIRE initiative (INSPIRE spatial themes).

Policy

  • Properties

The properties required to achieve a suitable GIS data sharing are as follows:

  1. up-to-dateness: ensure that the data and related metadata are updated on a regular basis, and that the iMarine platform is supplied with the updated data;
  2. discoverability: ensure that the descriptive metadata is annotated with the relevant information so that it can be efficiently discovered, and that discovery is operational;
  3. scientific accuracy: ensure that the descriptive metadata is documented with the completeness required for scientific usage, i.e. that it sufficiently informs target users about the data content, the methodology/protocol followed to obtain it, and its potential fields of use for derivative works;
  4. data accessibility: ensure that the online resources provided through the metadata, or the services themselves, are active and operational and deliver suitable and validated datasets.
  • Principles prevailing for the geospatial data sharing

All GIS datasets shared with the iMarine platform are in principle open data.

The principles are implemented through:

  1. Disclaimer: appended to the metadata by the data owner/provider, if deemed necessary;
  2. Copyright: appended to the metadata by the data owner/provider;
  3. Data Citation: appended to the metadata by the data owner/provider. The citation should be expressed (i) using the citation form inherent to the metadata standard being implemented, and/or (ii) in a textual bibliographic form.

Considerations for geospatial data sharing through semantic repositories/warehouses: when geospatial data products are registered in semantic-based repositories/warehouses, and thus identified by Uniform Resource Identifiers (URIs), sharing such products with other semantic repositories/warehouses should rely on the source repository/warehouse where the products were initially registered. To ensure the uniqueness of their identification, such geospatial data products should not be registered directly with other semantic repositories/warehouses.

  • Responsibilities of the various actors

The fulfillment of the above properties and principles is mainly the responsibility of the data owner and provider. Metadata discoverability has to be ensured by both the data provider and the iMarine data catalogue manager.

  • Types of collaborations required:

Collaboration between the data provider and the iMarine Geospatial Manager and/or Technical working group is required, with a good flow of communication on the protocols and the updates. Collaboration with LOD managers may be required, with the objective of using LOD services to annotate the metadata suitably in order to guarantee its discoverability. For example, such a collaboration was carried out at FAO with the Fishery Linked Open Data for sharing the FAO aquatic species distributions.

  • Kind of support required: the data sharing operation requires the support of an iMarine technical working group to interact with the data provider, give data sharing recommendations, and ensure that the data sharing is operational.

Guidelines

The policy is extended with Guidelines:

  • The following page gives the technical publishing guidelines for GIS data & services providers: Publishing guidelines for GIS Data and Services providers
  • Editorial workflow: The GIS products shared within the iMarine infrastructure are produced and maintained by the data provider (including when the data provider is the iMarine platform itself), which ensures that each GIS data collection, dataset and/or related service is properly described and discoverable by iMarine through the data catalogue. If substantial changes are applied to the data description and/or access, the data provider should inform the iMarine Geospatial Technical working group.
  • Roles and responsibilities in the workflow: In addition to the data maintainer role, the data provider has the role of initially expressing their willingness to share (and promote) their data with the iMarine platform. This is done by informing the iMarine Geospatial Technical working group, which will first analyse whether the data and/or services intended to be shared fulfill sufficient conditions for enabling the sharing, as mainly established by the guidelines. If the conditions are sufficient, it will then instantiate the sharing with the i-Marine catalogue. Where applicable, the technical working group should give recommendations to the candidate data provider.

Specific Rules

This section highlights a set of rules applicable to certain categories of data products. Those data products should then comply with the general rules mentioned above, and with additional rules that complement the policies with specific domain information.

GIS derived products

DRAFT

Strategy

Policy

  • Additional properties
    • Discoverability: as the result of processing one or more GIS products, GIS derived products should be described with metadata including the terms, URIs, etc. used to discover the source GIS products
    • Scientific accuracy: the descriptive metadata should contain complete lineage information, including: i) source GIS product information (e.g. metadata URL(s)), and ii) processing information (e.g. a WPS DescribeProcess request URL)
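The two additional properties above can be sketched as a small data structure: a lineage record that ties a derived product back to its source metadata URL(s) and to the process that produced it. This is only an assumption-laden sketch (the function, field names and example URLs are hypothetical, not an iMarine or ISO 19115 API):

```python
def build_lineage(source_metadata_urls, describe_process_url, statement=""):
    """Assemble a minimal lineage record for a GIS derived product,
    capturing (i) the source GIS product metadata URL(s) and
    (ii) the processing information (e.g. a WPS DescribeProcess URL)."""
    if not source_metadata_urls:
        raise ValueError("a derived product must reference at least one source GIS product")
    return {
        "statement": statement,
        "sources": [{"metadata_url": url} for url in source_metadata_urls],
        "process_step": {"describe_process_url": describe_process_url},
    }

# Illustrative only: the URLs below are placeholders, not real iMarine endpoints.
lineage = build_lineage(
    ["https://example.org/catalogue/metadata/source-product"],
    "https://example.org/wps?service=WPS&request=DescribeProcess&identifier=aggregate",
    statement="Aggregation of the source species distribution onto a global grid",
)
```

Rejecting an empty source list enforces, at the point of creation, the rule that a derived product must always be traceable to at least one source product.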

Guidelines

Taxa-based products

Strategy

Policy

  • Additional properties
    • Discoverability:
      • GIS taxa-based products should have metadata indexed with one or more taxa identifiers, including codes and URIs (for machine reading) and terms (for human reading)
      • In principle, the data provider should be free to use any relevant authoritative taxa reference codelist to index the metadata, on the condition that this codelist is well known to the iMarine infrastructure and mapped to other codelists. In practice, for the time being, it is recommended to index GIS taxa-based products with the World Register of Marine Species (WoRMS)
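The discoverability property above (codes and URIs for machines, terms for humans) can be sketched as a single keyword entry. The following is a minimal, hypothetical example; the LSID pattern follows the WoRMS convention (urn:lsid:marinespecies.org:taxname:&lt;AphiaID&gt;), but the function and the AphiaID value are illustrative only:

```python
def worms_keyword(aphia_id: int, scientific_name: str) -> dict:
    """Build one metadata keyword entry for a GIS taxa-based product,
    pairing machine-readable identifiers (code, URI) with a
    human-readable term.

    The URI follows the WoRMS LSID pattern:
    urn:lsid:marinespecies.org:taxname:<AphiaID>
    """
    return {
        "code": str(aphia_id),
        "uri": f"urn:lsid:marinespecies.org:taxname:{aphia_id}",
        "term": scientific_name,
    }

# Illustrative only: 999999 is a placeholder AphiaID, not a real WoRMS record.
keyword = worms_keyword(999999, "Thunnus alalunga")
```

Keeping code, URI and term together in one entry lets the same keyword serve both the machine-oriented discovery services and human readers of the metadata.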


Guidelines

ICIS

Strategy

The ICIS-related products are described elsewhere on the wiki: ([ICIS]).

The core component of the ICIS data management solution is StatsCube, but it can also use resources from other Cubes.

  • Initiative and Goal:

StatsCube aims to provide statisticians with a working environment to upload (ingest), manipulate (curate) and validate time series. The ICIS goal is to provide a 'personalized' subset of Cube resources to fisheries data managers to harmonize, standardize and process time series coming from different sources. One key exploitation scenario aims to serve the Tuna Atlas community.

The output of ICIS could be anything from high-level indicators to simply validated time series standardized in a common format.

  • Identify the benefits
    • An open platform to harmonize and standardize time series coming from heterogeneous sources.
    • A cost-effective and secure cloud environment to upload and share time series, especially in contexts where the risk of data loss is high.
  • Some specific benefits and comparative advantages of using the iMarine platform are:
    • A cheap virtual environment for data storage, sharing, analysis and backup; this eliminates the infrastructure costs for individual organizations that are probably better off with a rented solution;
    • A standards-based storage solution based on SDMX lowers the threshold for many data owners to start sharing their data in a standard format. Similar storage solutions (FishFrame) can be developed re-using existing components;
    • Unbounded reference datasets, also through semantic technologies, can be made available for data managers;
    • A complete database solution that can support advanced data harmonization procedures, e.g. to convert regional fisheries bodies' formats to a global grid-referenced format;
    • Very powerful computing resources can be mobilized for specific tasks on demand, such as indicator extraction;
    • Flexible toolsets to link time series dimensions to data and allow migration from one set of reference data to another. This can be used to convert a local format into any data exchange format, e.g. from RDF to FishFrame;
    • Extensive capabilities (Statistical Manager, R, WPS) to run statistical processes;
    • Visualization and map display of time series with OGC tools;
    • Reporting and fact-sheet capabilities based on iMarine out-of-the-box solutions.
  • The ICIS use case can be sustained in the future if the solution offers:
    • Low costs; a solution based on ICIS promises advantages in infrastructure costs;
    • Flexible format transformation functions to align local formats with international standards;
    • An open computation environment, where algorithms can be loaded to transform and/or analyze datasets;
    • Reporting tools to extract dynamic and/or on-demand reports in a variety of formats;
    • Automation of, or at least extensive support for, data flows and data validation;
    • and if a community in need of such cost-effective solutions can be identified.

Policy The Policy chapter frames the use case in the community exploitation perspective:

  • The general policy principles related to data resources, access, sharing, and storage that can be mentioned are
    • All data are private, unless otherwise stated by the owner. The owner can decide who can access the data;
    • An act of sharing implies a sharing of responsibility. No enforcement of an owner's policy is possible after a sharing action;
    • All data-sets have a set of metadata that describe provenance, ownership, management, and validity;
    • No mechanism exists to ensure the update of metadata once a data-set is modified;
    • Only data-sets have metadata policies; no explicit metadata policies exist for data;
    • Only some metadata policies are enforced. Most policies, however, rely on responsible use;
    • Since no metadata are attached to data, no extensive data trails can be kept;
    • After sharing, the owner loses all authority to manage access to, and manipulation of, the shared set.
  • Align the use case with the iMarine data sharing Policy (Disclaimer, Copyright, Posting Content, Shared Data, Public Data, Secondary Use, Derivative Work, and Data Citation) and extend these where needed
    • No adaptation of the iMarine policy is required.
  • List any negative potential impact of this use case
    • Converting an existing data-intensive application to rely on ICIS has not yet been undertaken, so no negative impacts have been encountered. Some issues to consider could be:
    • High costs of data conversion;
    • High training costs for staff not used to working with on-line resources;
    • Internet reliability may reduce the availability of the resources in local offices;
    • Existing confidentiality and security arrangements may impede a smooth transition to an on-line solution;
    • The risk of staff leaving projects may severely interfere with actions that require programming or advanced user skills, e.g. indicator extraction or data transformation require adequate knowledge of the system;
    • Drawing benefits from an on-line resource requires a paradigm shift from a local focus to sharing resources. This is not easy to achieve, especially if the focus is on local data and users.


Policy

The Policy chapter

  • ICIS being mainly a tool for statisticians to work on time series, accessing and sharing data falls under the general iMarine data sharing policy. There is no additional particular policy.
  • The statistician uploading and analysing the data within ICIS is the only person responsible for sharing data. Even once validated, the data are not publicly available; they can be made available to a dissemination tool, in which case they fall under the dissemination tool's policies.

Guidelines

The policy is extended with Guidelines:

  • Metadata and models implementing the use case; this chapter includes a mapping with the business metadata
  • Editorial workflow
  • Roles and responsibilities in the workflow

TRENDYLYZER

Strategy

The Strategy chapter positions the concerned Use Case in the broader context of iMarine objectives (draw a link to relevant Wiki page in case the strategy has already been defined elsewhere):

Policy

The Policy chapter

  • At this stage, no specific policy is needed for Trendylyzer

Guidelines

The policy is extended with Guidelines:

  • Trendylyzer relies on OBIS data. The tool is introspective, and thus no guidelines need to be produced.

Species Fact Sheets VRE

Strategy

The Strategy governing the FishFinderVRE product is described here.

Policy

The Policy chapter

  • Define the properties/set of quality(*) required to achieve the specific objective
  • Define the principles prevailing for the concerned use case; these principle refer to the general iMarine data sharing Policy (Disclaimer, Copyright, Posting Content, Shared Data, Public Data, Secondary Use, Derivative Work, and Data Citation) and might extend these
  • Define the responsibilities of the various actors involved
  • Define the type of collaborations required
  • Define the kind of support required

Guidelines

The policy is extended with Guidelines:

  • Metadata and models implementing the use case; this chapter includes a mapping with the business metadata
  • Editorial workflow
  • Roles and responsibilities in the workflow

Environmental Enrichment

Strategy

The Strategy chapter positions the concerned Use Case in the broader context of iMarine objectives (draw a link to relevant Wiki page in case the strategy has already been defined elsewhere):

  • Define the initiative and set the Goal
  • Identify the benefits
  • Position the role of the iMarine platform in respect of the concerned Use Case

Policy

The Policy chapter

  • Define the properties/set of quality(*) required to achieve the specific objective
  • Define the principles prevailing for the concerned use case; these principle refer to the general iMarine data sharing Policy (Disclaimer, Copyright, Posting Content, Shared Data, Public Data, Secondary Use, Derivative Work, and Data Citation) and might extend these
  • Define the responsibilities of the various actors involved
  • Define the type of collaborations required
  • Define the kind of support required

Guidelines

The policy is extended with Guidelines:

  • Metadata and models implementing the use case; this chapter includes a mapping with the business metadata
  • Editorial workflow
  • Roles and responsibilities in the workflow

SmartFish

Strategy

Three existing information systems provide information and statistics on South West Indian Ocean fisheries: WIOFish, a regional knowledge database on fisheries; FIRMS, the Fisheries and Resources Monitoring System, which globally integrates regional knowledge on the state of resources, including SWIO resources; and StatBase, a fisheries statistical database. A lot of information on fisheries is available, but it is scattered across these source systems. The creation of a Fisheries Regional Information System (F-RIS) aims to provide improved access to these data in a portal from which users will be guided to the information/data depending on their interest, in a seamless way. iMarine semantic tools (http://wiki.i-marine.eu/index.php/Top_Level_Ontology) appeared to be a good opportunity to develop an F-RIS based on indexing and annotation features providing generic and contextual search engines to the user, facilitating data and information browsing, without developing a centralized database exchanging data with the 3 source Information Systems (IS). iMarine also provides a sustainable infrastructure solution to host the portal.

Policy

  • Given the architecture foreseen for the F-RIS (a simple portal providing access to data scattered in 3 source Information Systems through advanced search capabilities), the main issues in terms of data access and sharing policies are:

  • Respect of the 3 source IS data access and sharing policies: all data are public, and each source IS requests in its policies that the source be cited when its information/data are used in another context. SmartFish will make sure that sources are clearly indicated in the results page. A link to the page in the source IS will always be available, as only annotations are available at the F-RIS level; the rest of the information/data lies with the source IS. In that respect, the SmartFish F-RIS falls under the general principles of the iMarine data sharing and access policies.
  • Access to the 3 source IS by the indexing and annotation services: the situation differs from one system to another, mainly owing to technical constraints. In theory, these indexing and annotation services are free to access information/data; some restrictions from WIOFish have been discussed and cleared. Technical solutions are being discussed to overcome the technical limitations on directly accessing WIOFish and StatBase data.
  • Close collaboration with, and support from, the 3 source IS data managers are required to reach a good level of integration of all the systems in the SmartFish network, with the following responsibilities:
    • Source IS data manager: should keep SmartFish updated on any change in the structure of the data / policies;
    • SmartFish F-RIS manager: should ensure the 3 source IS data sharing policies are enforced in the F-RIS.

Whether iMarine should be credited for hosting the portal and for providing indexing and annotation services remains to be decided; this could be done in the About page, where iMarine could be mentioned.

Guidelines

Given that the F-RIS doesn't upload or create any data, there is no need for any guidelines on metadata definition, data upload or publication. These remain the responsibility of the 3 source Information Systems.

SPREAD

Strategy

The SPREAD product and links to background documents are available here.

Policy

The Policy chapter

  • Define the properties/set of quality(*) required to achieve the specific objective
  • Define the principles prevailing for the concerned use case; these principle refer to the general iMarine data sharing Policy (Disclaimer, Copyright, Posting Content, Shared Data, Public Data, Secondary Use, Derivative Work, and Data Citation) and might extend these
  • Define the responsibilities of the various actors involved
  • Define the type of collaborations required
  • Define the kind of support required

Guidelines

The policy is extended with Guidelines:

  • Metadata and models implementing the use case; this chapter includes a mapping with the business metadata
  • Editorial workflow
  • Roles and responsibilities in the workflow

