2012.01 WP8: iMarine Data e-Infrastructure Enabling Technology Development Monthly Activity By Task and Beneficiary
This WP8 Activity Report describes the activities performed in January 2012, by Beneficiary and Task. It is part of the January 2012 Activity Report.
T8.1 iMarine Data e-Infrastructure Enabling-technology Development
CNR has designed and developed the following new component:
common-utils-encryption: a general-purpose library for encrypting and decrypting XML documents (or parts of them) and Strings. It uses a symmetric key based on the standard AES algorithm. The key is expected to be available on the local classpath; optionally, it can be passed programmatically to the methods exposed by the Encrypters. The library builds on top of the Apache XML Security for Java library and the XML Encryption standard.
The primary usage of the facilities offered by this library is to protect sensitive data inside the resources published in the Information System. To this end, the library managing the serialization of gCube Resources, common-resources, has been changed to exploit the encryption and decryption functionalities. Among other benefits, this will allow services to move their local configurations to the IS, thereby supporting the dynamic nature of the infrastructure and the remote management of services.
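The symmetric encryption scheme described above can be sketched with the standard Java Cryptography Architecture. This is an illustration only: the class name, constructor, and key handling below are assumptions, not the actual common-utils-encryption API, which additionally loads the key from the classpath and handles XML documents via Apache XML Security.

```java
import javax.crypto.Cipher;
import javax.crypto.KeyGenerator;
import javax.crypto.SecretKey;
import java.nio.charset.StandardCharsets;
import java.util.Base64;
import java.util.Arrays;

// Hypothetical sketch of symmetric AES String encryption/decryption.
class StringEncrypter {

    private final SecretKey key;

    StringEncrypter(SecretKey key) {
        this.key = key;
    }

    // Encrypts a String with the shared symmetric key; the result is
    // Base64-encoded so it can be embedded in an XML serialization.
    String encrypt(String plaintext) throws Exception {
        Cipher cipher = Cipher.getInstance("AES");
        cipher.init(Cipher.ENCRYPT_MODE, key);
        byte[] encrypted = cipher.doFinal(plaintext.getBytes(StandardCharsets.UTF_8));
        return Base64.getEncoder().encodeToString(encrypted);
    }

    // Decrypts a previously encrypted String with the same key.
    String decrypt(String ciphertext) throws Exception {
        Cipher cipher = Cipher.getInstance("AES");
        cipher.init(Cipher.DECRYPT_MODE, key);
        byte[] decrypted = cipher.doFinal(Base64.getDecoder().decode(ciphertext));
        return new String(decrypted, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws Exception {
        // The real library expects the key on the local classpath;
        // here one is generated for illustration only.
        SecretKey key = KeyGenerator.getInstance("AES").generateKey();
        StringEncrypter encrypter = new StringEncrypter(key);
        String secret = encrypter.encrypt("jdbc:postgresql://host/db");
        System.out.println(encrypter.decrypt(secret));
    }
}
```

A service could apply this to individual configuration values before publishing a resource profile, keeping the rest of the serialization in clear text.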
Both libraries will be part of the forthcoming gCube 2.8.0 release.
Brainstorming activities with FAO towards the definition and distribution of the new responsibilities within the enabling technology continued throughout the reporting period. Cross-task conferences (between T8.1 and T8.4) have been held and recorded. Integration paths for introducing the new Resource Model in the existing enabling services have been identified. The ResourceClientPublisher API and ResourceDiscovery API have been redesigned, and the IS-Registry and Collector partially rethought accordingly. Implementation will start in the next period.
Following the activities in WP7 (also supported by WP8) on the gCube mavenization, the Software Repository underwent a deep analysis by a new developer at CNR. The service will be redesigned and refactored to meet the new requirements that emerged from the mavenization. In particular, it will remove any proprietary form of packaging (Service Archive) and of storage inside its backend. Instead, it will be integrated with the cluster of Maven repositories identified by WP7.
- design and implementation of common-utils-encryption
- developer's guide for common-utils-encryption
- new version of common-resources for encrypting sensitive data inside gCube Resources released
- definition of integration paths for introducing the new Resource Model in the existing enabling services
FAO has developed two new prototype libraries whose functionality falls within the scope of T8.1:
common-ghn-client: a generic lightweight framework for the management of software clients, from pure clients running in a bare JVM to services that act as clients (i.e. issue outgoing calls) from within servlet containers. The framework is centred on the notion of a ClientContainer that acts as a lifetime manager for an open-ended set of client services, where each client service offers a specific management function. The client container finds its configuration in the file system (indicated by a system property) or, alternatively, in a client-config.xml classpath resource. The configuration is comprised of the JAXB serialisation of one or more client services. The container scans all the classpath archives (jars and folders) that include a service client marker file (possibly empty) for implementations of the ClientService interface. It then feeds these implementations to a JAXB context and uses it to deserialise them from the container configuration. Finally, it starts and stops all the discovered services when it is started or stopped in turn. The client container can be bootstrapped in a number of ways. In a servlet container, it is expected that a standard listener associated with the Web Application will start it programmatically (see common-ghn-service below). In a plain JVM it can be started from the command line by an instrumentation agent automatically invoked before application code (using standard Java mechanisms). The library includes the instrumentation agent. A first key client service is described below; future ones may be handled within this generic framework.
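The container lifecycle described above can be sketched as follows. This is a minimal illustration, not the library's actual API: the real common-ghn-client discovers ClientService implementations by scanning classpath archives for a service client marker file and deserialises their configuration through JAXB, whereas here services are registered programmatically for brevity.

```java
import java.util.ArrayList;
import java.util.List;

// Each client service offers one specific management function and is
// driven by the container's lifecycle. (Illustrative interface.)
interface ClientService {
    void start();
    void stop();
}

// Hypothetical sketch of the ClientContainer: a lifetime manager for an
// open-ended set of client services.
class ClientContainer {

    private final List<ClientService> services = new ArrayList<>();
    private boolean started;

    // Stands in for classpath scanning + JAXB deserialisation.
    void register(ClientService service) {
        services.add(service);
    }

    // Starting the container starts every discovered client service.
    void start() {
        for (ClientService s : services) s.start();
        started = true;
    }

    // Stopping the container stops them in turn.
    void stop() {
        for (ClientService s : services) s.stop();
        started = false;
    }

    boolean isStarted() {
        return started;
    }
}
```

In a servlet container the start() call would come from a standard Web Application listener; in a plain JVM, from the instrumentation agent shipped with the library.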
common-ghn-proxy: a client service for the client container (as discussed above), which uses an embedded HTTP proxy to manage outgoing calls issued by the client code. When started by the client container, the service starts the proxy at a given port (default or custom) and then configures the JVM to pass all socket-based communication through it. A list of interceptors within the proxy can then transform the outgoing calls. A first interceptor takes care of injecting the scope in the outgoing call; others will follow (e.g. to inject security information). The ability to access, from within the proxy, information which is only available in the caller's thread is obtained by configuring a custom SocketFactory that returns proxies of the standard socket implementation in the JVM. The socket proxy intercepts calls to the connect() method and communicates to the proxy the local port of the socket. The proxy calls back all its interceptors in the caller's thread, which can then access information in that thread (e.g. the scope as the value of an (inheritable) ThreadLocal). When the call reaches the interceptors, the information associated with the client port can be fetched and used to transform the call. The approach has been tested to work with zero dependencies on client code with a number of popular HTTP APIs, including the standard java.net API and higher-level APIs that can be connected to it, such as RestEasy. The approach has been shown not to work with Apache's HttpClient API, due to the fact that this API disregards JVM-wide proxy settings. However, it has been shown to work when Apache's API is connected to higher-level APIs that compensate for this problem; notably, RestEasy does not compensate when connected to Apache's API. Overall, the approach appears to support zero-dependency management of outgoing calls in a large number of scenarios. The scenarios that are not supported will require changes to the code, though these will not raise gCube dependencies (just the enabling of proxy settings in the HTTP API used by the code).
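The thread-local mechanics on which the scope injection relies can be sketched as follows. This shows only the inheritable thread-local part of the scheme; the real common-ghn-proxy additionally correlates the socket's local port with the calling thread. The ScopeProvider and ScopeInterceptor names, and the header format, are assumptions made for illustration.

```java
// Holds the caller's scope in an inheritable thread-local.
class ScopeProvider {

    // InheritableThreadLocal, so threads spawned by the caller inherit
    // its scope and outgoing calls they issue carry the right value.
    private static final InheritableThreadLocal<String> scope =
            new InheritableThreadLocal<>();

    static void set(String s) {
        scope.set(s);
    }

    static String get() {
        return scope.get();
    }
}

// A proxy interceptor, called back in the caller's thread, can then read
// the scope and inject it into the outgoing call, e.g. as an HTTP header.
class ScopeInterceptor {
    String header() {
        return "gcube-scope: " + ScopeProvider.get();
    }
}
```

Because the interceptor runs in the caller's thread, no change to client code is needed: the scope set once per thread travels with every outgoing call.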
- analysis and experimentation for services' transparent resource management at run time
- prototype libraries common-ghn-client and common-ghn-proxy
T8.2 iMarine Data e-Infrastructure Policy-oriented Security Facilities
The list of activities to be carried out in Task 8.2 has been defined. In particular, the definition of a pluggable Security Module, starting from D4Science's Security Module, has been identified as the major activity of the task. The priorities are:
- integration of the Scoping module with the Authorization module
- definition of the features of the policies in coordination with WP5
- bringing the secure infrastructure into the production environment once service integration is completed
- Data Encryption according to the requirements provided by WP9
These priorities were agreed at the T8.2 telco on 20/1/2012: more information available here.
The integration of the Scoping module with the Authorization module is considered by ENG the starting point for the definition of a complete, pluggable Security Module. Part of the January activities has been devoted to defining a technical proposal for the implementation of a module providing Authorization functionalities, Scoping functionalities (both based on current gCore modules) and, potentially, other security-related functionalities. An extensible solution could start from the Security Controller mechanism, which was based on the sequential call of an Authentication Controller and an Authorisation Controller. This model could be generalized into a Security Workflow which calls a sequence of Tasks performing security-related operations. Every task could implement an atomic operation such as authentication, authorization or scoping (which can be considered a kind of authorization). The old Security Controller interface must be separated from the Java SOAP API and must obtain the needed information from a data structure agnostic of the transmission protocol used.
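The workflow generalization outlined above could take a shape along the following lines. This is a sketch of the idea only: the interface and class names, and the SecurityRequest fields, are assumptions, not the proposed design.

```java
import java.util.Arrays;
import java.util.List;

// One atomic security operation (authentication, authorization,
// scoping, ...), operating on a protocol-agnostic request so the
// workflow does not depend on the Java SOAP API.
interface SecurityTask {
    void apply(SecurityRequest request) throws SecurityException;
}

// Minimal protocol-agnostic carrier of the information tasks need.
class SecurityRequest {
    final String principal;
    final String scope;

    SecurityRequest(String principal, String scope) {
        this.principal = principal;
        this.scope = scope;
    }
}

// The Security Workflow calls its tasks in sequence, generalizing the
// fixed Authentication Controller + Authorisation Controller chain.
class SecurityWorkflow {

    private final List<SecurityTask> tasks;

    SecurityWorkflow(SecurityTask... tasks) {
        this.tasks = Arrays.asList(tasks);
    }

    // Tasks run in order; the first failure aborts the workflow.
    boolean run(SecurityRequest request) {
        try {
            for (SecurityTask t : tasks) t.apply(request);
            return true;
        } catch (SecurityException e) {
            return false;
        }
    }
}
```

New security functionalities would then be added as further tasks in the sequence, without touching the workflow itself.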
No changes are proposed for role attributes, which should be carried by SAML assertions: an exploration activity could be performed to understand whether a good standard exists for transporting these assertions in the HTTP header.
- activities planning and prioritization
- sketched design for evolving the Security Subsystem as a pluggable and extensible module integrated with the Scope Management
T8.3 Workflow Management Facilities
NKUA has been working on the implementation of a node selection and collocation policy library, exploiting the limited functionality previously embedded in the PE2ng adapter employed by the search service and implementing additional methods and policies. The purpose of this library is to:
- Support the upcoming execution abstraction layer which will be integrated into the PE2ng engine.
- Support the enhancement of the execution planner of the Search System by providing options for additional execution policies and by supporting execution optimizations.
All implemented functionality exploits information on infrastructure resources, more specifically hosting nodes, which is described in an implementation-independent manner and made available by components such as the Resource Registry or an Information System.
A client can use a Node Selector either to immediately select the most suitable execution node or to perform an assessment of the set of candidate nodes. This assessment orders nodes in a sequence of decreasing suitability for a given task and enables the client to select any node among the candidates based on its own criteria. Collocation Policies can be as simple as assigning all tasks to the local node or to a single remote node, or more complex, in which case they can be used in conjunction with node selectors. In the latter case, and depending on the behavior of each policy, additional parameters can be used which regulate how much the policy can diverge from the optimal selection as defined by the underlying node selector. As an example, one of the implemented policies, which tries to assign as many tasks as possible to the same node, uses a penalty for each task together with a global threshold: when the score of a node falls below the threshold, a different node has to be selected. This type of parametrization makes policies flexible enough to be used by applications with workflows governed by a variety of conditions, while guaranteeing that they still operate in the general direction specified by their behavior.
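The penalty/threshold mechanism just described can be sketched as follows. In the real library the next node would come from the underlying node selector; here nodes are numbered sequentially, and the class name and parameter values are assumptions made for illustration.

```java
// Hypothetical sketch of the policy that assigns as many tasks as
// possible to the same node, constrained by a per-task penalty and a
// global threshold.
class MaximumCollocationPolicy {

    private final double initialScore;
    private final double penaltyPerTask;
    private final double threshold;
    private double score;
    private int currentNode = 0;

    MaximumCollocationPolicy(double initialScore, double penaltyPerTask,
                             double threshold) {
        this.initialScore = initialScore;
        this.penaltyPerTask = penaltyPerTask;
        this.threshold = threshold;
        this.score = initialScore;
    }

    // Returns the node for the next task: each assignment costs a
    // penalty, and once the node's score falls below the threshold a
    // different node must be selected.
    int assignNextTask() {
        if (score < threshold) {
            currentNode++;          // in reality: ask the node selector
            score = initialScore;
        }
        score -= penaltyPerTask;
        return currentNode;
    }
}
```

Tuning the penalty and threshold controls how far the policy may drift from the node selector's optimal choice while still packing tasks together.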
Up until now, the following have been implemented and tested:
- Node Selectors
- Random, which selects nodes at random and assigns the same score to all of them
- LRU, which selects the least recently used node and scores nodes based on their timestamps
- Cost based, which scores nodes based on a cost function. The cost function consists of cost factors, each comprising a coefficient, a hosting node property, and an indication of whether the best value is the minimum or the maximum. Examples of cost factors are available memory, CPU speed, CPU load, etc.
- Best, which is simply a specialization of Cost based node selector with a set of commonly used cost factors.
- Collocation policies
- Local, which assigns all processing elements to the local node
- Single node, which assigns all processing elements to a single node (whether local or remote)
- Maximum, which attempts to assign as many processing elements as possible to the same node
- Minimum, which attempts to assign processing elements to different nodes, ideally spanning the whole set of available resources
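The cost-based selector described in the list above can be sketched as follows. Class and property names are assumptions; the real library reads hosting-node properties from the Resource Registry or Information System rather than from plain maps.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.Map;

// One cost factor: a coefficient, the hosting node property it reads,
// and whether the best value is the maximum (e.g. free memory) or the
// minimum (e.g. CPU load).
class CostFactor {
    final String property;
    final double coefficient;
    final boolean higherIsBetter;

    CostFactor(String property, double coefficient, boolean higherIsBetter) {
        this.property = property;
        this.coefficient = coefficient;
        this.higherIsBetter = higherIsBetter;
    }
}

// Hypothetical sketch of the cost-based node selector.
class CostBasedSelector {

    private final List<CostFactor> factors;

    CostBasedSelector(List<CostFactor> factors) {
        this.factors = factors;
    }

    // Scores one node, described here as a map of node properties.
    double score(Map<String, Double> node) {
        double total = 0;
        for (CostFactor f : factors) {
            double value = node.getOrDefault(f.property, 0.0);
            total += f.coefficient * (f.higherIsBetter ? value : -value);
        }
        return total;
    }

    // Assessment: orders candidate nodes by decreasing suitability,
    // letting the client pick any of them by its own criteria.
    List<Map<String, Double>> assess(List<Map<String, Double>> nodes) {
        List<Map<String, Double>> ordered = new ArrayList<>(nodes);
        ordered.sort(
            Comparator.comparingDouble((Map<String, Double> n) -> score(n))
                      .reversed());
        return ordered;
    }
}
```

The "Best" selector of the list above would then be this selector preconfigured with a set of commonly used cost factors.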
T8.4 Resource Model
FAO has produced a first draft of the classification of software resources in the new Resource Model, along with a high-level strategy to manage them under the new foundations. Some technical pointers on the implementation of the strategy are also included. Overall, the draft illustrates the new vision of a transparent and standards-based management for a broader class of software resources within and outside the system.
- a first draft of the classification of software resources in the new Resource Model, along with a high-level strategy to manage them under the new foundations;
CNR continued to support FAO in the drafting of the software area of the new Resource Model. As in the previous reporting period, CNR gave feedback, posed requirements coming from the current and future enabling layer, and validated the new directions.