
WP 4 - Issues involved in extending the Functionality of Sakai/ CHEF

This work package aimed to establish the issues involved in extending the functionality of Sakai/ CHEF, particularly to use Web services for distributed development and deployment. There are some key VRE questions that need to be clearly formulated and which, to our knowledge, are outside the scope of the current Sakai project. One involves the skill set required to extend Sakai/ CHEF functionality and whether this is available in the UK. The degree to which Sakai/ CHEF is standards compliant also needs to be investigated, with particular attention to standards relating to integration of external tools and content. Sakai is focussing on the OKI OSIDs, but will also need to work with other interfaces. It is anticipated that there will be an inverse relationship between such compliance and the need for substantial extensions to the framework software. A final and key question is how Sakai can be used as a front end to distributed services as envisaged in a service-oriented VRE. This work package contributed to a report (Evaluation Report 2) and developed both an architecture, as explained in Section 6, and prototype software wrappers to extend the functionality of the Sakai/ CHEF framework. These wrappers could serve as useful templates for other extensions.

The architecture has been proposed following discussions with Sakai technical director Charles Severance and others and was partly instantiated for some of the test services listed in WP 3, e.g. the InfoPortal Web service portlet. The outcomes of these discussions are described along with our conclusions in Section 6.

Work carried out includes:

Task | Title | Responsibility | Description
4.1 | WS Architecture | Both | Develop and agree WS-based Open Service Architecture linked to Sakai. Collaboration with Charles Severance (U. Michigan, USA). Input from Geoffrey Fox (U. Indiana) and other members of the GGF Grid Computing Environments research group.
4.2 | Implement | Both | Instantiate this architecture to access some of the Grid tools in WP 3.
4.3 | Technical Spec | Both | Write up technical specification of this work package so that other services can be incorporated via portlets.

An initial technical specification following a number of discussions with relevant developers was presented to the JCSR Committee in a paper on 16th June 2004 [45]. The outcomes of additional discussions with the Sakai developers and the Indiana Xportlets group are described below.

This, together with work on the ReDReSS and other e-Science projects, has indicated the need for extensions to the existing Sakai framework and for tools additional to those already provided in CHEF/ Sakai. Our bids to the JISC VRE call focused on achieving these goals and importing a large suite of tools to create a VRE.

This report has focussed on CHEF/ Sakai as a user-facing delivery mechanism for services in a Virtual Research Environment. It has described the underlying Java technology and how it is evolving. The portal framework is however only one component of the e-Research environment. During the course of our work we have therefore needed to address other aspects of VRE architectures which dictate how Sakai or any other 2nd generation portal framework is to be used alongside other components and delivery mechanisms in a Grid such that the many and diverse services supported by JISC and research institutions can be made available.

VRE Architecture

We provide a top-down approach to a possible VRE architecture.

Figure 8 shows how major components in a federated VRE architecture could be linked. In developing this architecture we coined the acronym HIVE: Highly Integrated Virtual Environment. This inherits many aspects from CCLRC's prototype Integrated e-Science Environment (IeSE) [16,26].

Figure 8: Federated Components in a VRE


User/ Application:
Consumer of delivered services via tools and applications;
Tool Server:
User facing part of the system. Browser, programming library, desktop icons etc.
Tool Host:
The tools server can be Web or desktop based. It will delegate authentication to the HIVE server and thus permit single sign-on across remote toolsets;
HIVE Server:
HIVE server provides access to integration services such as authentication, workflow, registries. It can handle federated services;
Shibboleth Server:
Will provide the authentication services to the HIVE server. It could be part of a federation and thus provide trust-based access to all the tools hosted for all researchers in the federation's institutions;
VO Management:
Provides information about users, their roles and project affiliation. It can extend to resources and services;
Registry:
Holds details of services and provides templates to access them along with relevant semantic information. There may be a number of registries handling different types of services. ETF's UDDI and JISC's IESR are examples;
Services:
Multiple services provide access to end resources and applications. They should be language agnostic and can wrap heritage applications and facilities.

In this architecture there can be multiple instances of each component serving slightly different functions. Web or Grid services using simple protocols connect the components, while more complex protocols are used for data delivery.
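The way these components delegate to one another can be illustrated with a minimal Python sketch. It is not taken from Sakai, Shibboleth or any HIVE implementation: every class and method name (AuthServer, VOManager, Registry, HiveServer, ToolHost) is hypothetical, and in-memory dictionaries stand in for what would be remote Web or Grid service calls.

```python
"""Illustrative sketch only: in-memory stand-ins for the federated components.

Every name here is hypothetical; a real deployment would replace the method
calls with Web or Grid service invocations.
"""
import secrets


class AuthServer:
    """Stands in for the Shibboleth server: checks credentials."""

    def __init__(self, credentials):
        self._credentials = dict(credentials)  # uid -> password (toy example)

    def authenticate(self, uid, password):
        return self._credentials.get(uid) == password


class VOManager:
    """Stands in for VO management: maps users to roles."""

    def __init__(self, roles):
        self._roles = {uid: set(r) for uid, r in roles.items()}

    def roles_of(self, uid):
        return self._roles.get(uid, set())


class Registry:
    """Holds service details: name -> (endpoint, role required to use it)."""

    def __init__(self):
        self._services = {}

    def register(self, name, endpoint, required_role):
        self._services[name] = (endpoint, required_role)

    def lookup(self, name):
        return self._services[name]


class HiveServer:
    """Integration point: authentication, role checks and registry lookup."""

    def __init__(self, auth, vo, registry):
        self._auth, self._vo, self._registry = auth, vo, registry
        self._sessions = {}  # session key -> uid

    def sign_on(self, uid, password):
        if not self._auth.authenticate(uid, password):
            raise PermissionError("authentication failed")
        key = secrets.token_hex(16)
        self._sessions[key] = uid
        return key

    def resolve(self, session_key, service_name):
        """Return the endpoint if the session's user holds the required role."""
        uid = self._sessions.get(session_key)
        if uid is None:
            raise PermissionError("unknown session")
        endpoint, required_role = self._registry.lookup(service_name)
        if required_role not in self._vo.roles_of(uid):
            raise PermissionError("role not authorised")
        return endpoint


class ToolHost:
    """A Web or desktop tool host that delegates sign-on to the HIVE server."""

    def __init__(self, hive):
        self._hive = hive

    def open_service(self, session_key, service_name):
        return self._hive.resolve(session_key, service_name)
```

In this sketch a user signs on once with the HiveServer and can then present the same session key to any number of ToolHost instances, which is the single sign-on behaviour described for the Tool Host component.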

Integration Services

Within a portal or other framework a number of internal services are needed to address issues of coordination of tools (portlets) within an overall framework. Methods can be provided as an "internal" class library which sits alongside the portlet API and service APIs (the model part of the MVC paradigm). Each framework could have the same or a different set of tools, but the way they are integrated may differ between user groups - similar services are required to allow different frameworks to inter-operate. These services could be federated and available via Web services calls to specialised servers elsewhere in a Virtual Research Environment.

Research issues are implied with most of these services. Some simple ones, such as managing the look and feel of the portal, personalisation and accessibility are provided directly by the portlet container. This is a specialisation of portals and not required in other frameworks. Services which are not specialised, e.g. single sign-on, should not be limited to the portlet API. Some example integration services are now listed:

Session Management:
management of a session key and issues related to single sign-on and session activities. This involves database access for storing and retrieving other items relevant to the session. The user can authenticate and start a new session or revert to a previous one. The service can open and close sessions and log the state of a session from a state handle. Rollback and replay, including personal workflow, can be available.

Authentication using MyProxy:
MyProxy is a repository of valid proxy certificates for authenticated users. The portal can download these for delegation to trusted external services. The service can also check that certificates etc. are still valid and refresh them if not. Part of the integration API would enable storing and retrieval of the proxy in the portal database for later use. This will be done using the session key and uid (e.g. DN or unique e-mail address). Having the cert associated with the session key enables authorisation issues to be tackled, e.g. using a subsidiary cert or other method. Does the same user in a different session imply working in a different role?

VO Management:
a virtual organisation could be based around a project (as described by UDDI [18]) which would typically have its own portal and mini-Grid. VO users are real people who have been authenticated and have received a digital identity (certificate). They are then given rights based on the roles they are taking in this VO and thus can be authorised to access services. A prototype schema is given in a separate paper [23].

Integrated State:
manage database storage and retrieval of state information by portlet id. Research needs to develop concepts of integrated state. State can be used as an event trigger. State needs to be logged for session manager/ workflow. What states can portlets and services have which are meaningful for rollback and replay?

Service and Portlet Location:
registry input, query and lookup of remote services and portlets. This requires semantic support, i.e. what does the service do and why? It also supports identification and location of internal portlets with unique keys (portlet id) for use in IPC etc.

Portal Preferences:
build up a "preferred set" of services, portlets etc. based on usage, e.g. from registries. This service can also log semantic information and build a related ontology. It extends the idea of a workspace toolset enabling dynamic and semantic/ function-driven choice.

Semantic/ Ontology Support:
semantic and knowledge-based information about services and portlets in the framework. These are used for decision support and choice augmenting stored preferences. This does not cover generic semantic issues which would need separate tools.

Workflow:
directed links between components (typically graph based). Event mechanism used to trigger actions within portals and attached services. Graphs in the portal will be mostly pre-defined, but with constrained facilities to swap in and out components and provide additional inputs at decision points. Again, not completely generic workflow, but based on instances in the preferences list and their states.

Trails and Personalisation:
logging of usage for off-line mining and analysis, e.g. for developers to improve presentation, ease of use, and optimisation, e.g. by aggregation of low-level tools and services. This tracks state and component changes for each session. Basically same information as session management service.

IPC - Inter Portlet Communication and Event Management:
a message-based communication mechanism between portlets, possibly with event triggers and asynchronous handler. Could be interrupt driven? Use a message queue in the database with associated triggers. Example: Completion of a computational job or background query triggers portlet to send SMS or SMTP message or view results. Could therefore give the user a flag, a chat entry, mobile phone message or an e-mail.
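Two of the services listed above, session management and inter-portlet messaging, can be sketched in a few lines of Python. This is a toy illustration under stated assumptions rather than a design for Sakai or CHEF: the class names SessionManager and MessageQueue and all of their methods are hypothetical, and plain dictionaries and a deque stand in for the portal database and its triggers.

```python
"""Illustrative sketch only: toy versions of two integration services.

All names are hypothetical. Dictionaries and a deque stand in for the portal
database and its triggers.
"""
import uuid
from collections import defaultdict, deque


class SessionManager:
    """Opens and closes sessions and logs portlet state per session."""

    def __init__(self):
        self._sessions = {}  # session key -> {"uid", "open", "log"}

    def open(self, uid):
        key = uuid.uuid4().hex
        self._sessions[key] = {"uid": uid, "open": True, "log": []}
        return key

    def close(self, key):
        self._sessions[key]["open"] = False

    def log_state(self, key, portlet_id, state):
        self._sessions[key]["log"].append((portlet_id, state))

    def replay(self, key):
        """Return the logged state changes in order, e.g. for replay."""
        return list(self._sessions[key]["log"])

    def rollback(self, key, steps=1):
        """Discard the most recent `steps` state changes and return the rest."""
        log = self._sessions[key]["log"]
        if steps > 0:
            del log[-steps:]
        return list(log)


class MessageQueue:
    """Message-based inter-portlet communication with event triggers."""

    def __init__(self):
        self._pending = deque()
        self._handlers = defaultdict(list)  # event name -> callbacks

    def subscribe(self, event, handler):
        self._handlers[event].append(handler)

    def publish(self, event, payload):
        """Queue a message; nothing is delivered until dispatch() runs."""
        self._pending.append((event, payload))

    def dispatch(self):
        """Deliver queued messages to subscribers; return the number delivered."""
        delivered = 0
        while self._pending:
            event, payload = self._pending.popleft()
            for handler in self._handlers[event]:
                handler(payload)
                delivered += 1
        return delivered
```

For the job-completion example in the text, a portlet would publish a message such as "job.completed" and separate handlers subscribed to that event could send an e-mail or an SMS. Delivery is deferred until dispatch() is called, which approximates the asynchronous handling mentioned above.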

Some key research issues in implementing the above services include:

Identification of user/ session/ portlet/ services: name-value pairs such as sessionKey, uid, portletId and serviceId. This is the same information as is typically put in a cookie in 1st generation portals. It can however also be used by non-portal based tools by building it into the method calls.

State definition: a pre-defined set of states needs to be identified. Can this be done? This could be the key to using the event mechanisms and session logging. Do portlets and/ or services make clear state changes?
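To make these two issues concrete, the following Python sketch carries the four identifiers as name-value pairs and checks state changes against a fixed table. The identifier names come from the text; everything else, including the CallContext class, the particular states and the allowed transitions, is an assumption made for illustration, since the text leaves open whether such a set can be defined.

```python
"""Illustrative sketch only: identifiers as name-value pairs plus a fixed state set.

The state names and allowed transitions are assumptions made for this example.
"""
from dataclasses import dataclass, asdict
from enum import Enum


@dataclass(frozen=True)
class CallContext:
    """The four identifiers named in the text, passed with every method call."""
    sessionKey: str
    uid: str
    portletId: str
    serviceId: str

    def as_pairs(self):
        """Name-value pairs, usable in a cookie or as call parameters."""
        return asdict(self)


class State(Enum):
    IDLE = "idle"
    RUNNING = "running"
    COMPLETED = "completed"
    FAILED = "failed"


# Hypothetical transition table: which state changes count as "clear".
ALLOWED = {
    State.IDLE: {State.RUNNING},
    State.RUNNING: {State.COMPLETED, State.FAILED},
    State.COMPLETED: {State.IDLE},
    State.FAILED: {State.IDLE},
}


def change_state(current, new, context, log):
    """Validate a transition and append it to the session log."""
    if new not in ALLOWED[current]:
        raise ValueError(f"illegal transition {current.name} -> {new.name}")
    log.append({**context.as_pairs(), "from": current.value, "to": new.value})
    return new
```

Because every log entry carries the same identifiers, the log could in principle feed both the event mechanism and session replay, whether the caller is a portlet or a non-portal tool.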

Extending the Portal Architecture

This evaluation and consideration of the wider implications and how to implement a Virtual Research Environment for the UK have raised some interesting architectural questions. Similar questions have been raised within the JISC VLE programme, e.g. see the paper by Bill Olivier [61]. The key to both VLE and VRE deployment is to ensure the maximal use of existing resources via a re-usable set of distributed services delivered through a variety of mechanisms such as portals and desktop applications.

Figure 9 shows a modification of the portal tool pluggability picture from Charles Severance. It gives the additional API for the integration services and the link to remote resources via Web services.

Figure 9: Pluggable portal components

Over the few months that we have been involved in discussions with JISC and the Sakai developers, in particular identifying and classifying services and debating the use of an open service architecture, the picture has changed. In line with our suggestions the Sakai team are now proposing a stronger emphasis on WSRP as shown in Figure 10.

Figure 10: New Big Picture

Ultimately this permits aggregation of portal content, tools and worksites managed autonomously as shown in Figure 11. The fact that WSRP is used also permits portlet content to be rendered in other environments, e.g. via a Swing interface using WSRP4J. Examples of interfaces could then be built in Swing, Matlab, VTk, Iris Explorer, etc. which are all popular.

Figure 11: Portal Integration Architecture

This design work was presented in the joint VRE-VLE discussion group at the JISC-CETIS Annual Conference, Oxford, 3-5/11/04. It was fed back to the conference in a plenary session by Charles Severance. Work is ongoing to establish a shared framework with an agreed set of service names and definitions for the JISC areas and to develop the technologies to deliver them to end users. See and below.

Appropriateness of Sakai in a VRE

We here note some comments from the commercial sector about portal deployment and usage. Key to this is a consideration of Return On Investment (ROI). Several reviews have indicated that this is a critical business factor in adopting portal technology as access to services and their provision can then be consolidated.

According to Gartner in 2003 [89], portal technology will become a key component of software suites. While Gartner identifies two types of suites, an article by C. White [88] suggests there will be three types, see Figure 12:

Figure 12: The Evolution of the Enterprise Portal - Software Suites

Intelligent Business Suite - This suite is a packaged solution that integrates the key features of an independent portal (categorization, search and personalization) with business intelligence tools, collaboration services and a content management system. This suite places emphasis on out-of-the box solutions for information access and sharing, content management and collaboration. It is similar to what Gartner calls a Smart Enterprise Suite (SES). For some organizations, a key requirement for such a suite would be that it can run on and exploit the integration features of an application server suite in areas such as application integration and Web services.

Application Package Suite - This suite provides an out-of-the box solution that integrates an application vendor's operational application and business intelligence packages into a portal environment. The suite also provides collaboration services and an integration bus that enables third-party product integration. This suite puts a strong emphasis on pre-packaged solutions. If the application vendor also markets an application server suite, then it is likely that the application package suite will be developed and integrated with that product.

Application Server Suite - This suite brings together the four key integration infrastructure technologies (i.e. user interface, business process, application and data) in a single package that is combined with an application server and collaboration services. This suite places emphasis on application integration and on an infrastructure for building an integrated business environment. It is similar to what Gartner calls an Application Platform Suite (APS).

According to White, portal technology will become a key component of other software solutions. In developing a portal strategy, the needs of the business should be the primary focus; but it is also important to take into account other IT strategies in areas such as business intelligence, content management, collaboration and application integration. New and evolving technologies such as XML and Web services are also likely to play key roles as portals evolve into software suites.

These statements clearly reflect our investigations into the use of portals in a Virtual Research Environment. In particular the ability to combine application suites in a single portal environment, now using portlets and an additional integration API as shown in Figure 9, mirrors the need to combine services from e-Learning, Digital Information and e-Research.

Portals have for a long time also been identified as a focus for delivery of a Virtual Learning Environment. An interview with Chuck Severance discussing the similarities and differences between the approach adopted by Sakai and the JISC Framework Programme is recorded on the CETIS Web site: This includes a high-level comparison created by Mark Norton in discussion with CETIS staff, Figure 13.

The HIVE approach to e-Research presented here can be applied in many other contexts. In e-Learning we could for example use it to construct a Grid of distributed content that could be aggregated in the ways required dynamically by the user for each learning situation they face. The content of each HIVE server could be watermarked to identify its origin. This use of the HIVE would require the development of new tools, e.g. cross searching tools. Eventually we could use the HIVE to coalesce the appropriate combinations of information, e-Learning, e-Research, e-Collaboration, e-Management, e-Authoring and e-Publishing, e-Leisure tools as required by our current activity.

Services in other Frameworks

Clearly a VRE is more than just a portal! We believe that portal and portlet technology has a major role to play in delivering VRE services and tools, particularly the collaboration tools, to end users, but existing applications, e.g. GUIs, must also access VRE functionality. To do this, a lightweight toolkit is required which can be downloaded onto workstations and PCs and which avoids the "client problem" of bloated and difficult software installations and associated firewall issues.

A need has been identified within the e-Science community for a client toolkit which can provide very light-weight but extensible access to Grid resources. GROWL: Grid Resources on Workstation Library is a prototype library which could be used for this evaluation work. We are initially creating libraries in C/ C++ and R, interfacing to a set of existing services derived from HPCPortal, DataPortal and InfoPortal which are part of IeSE. It should be possible to install this library on a variety of client workstations with a minimum of additional software. The library is targeted at existing applications in physics, chemistry and statistics.

Presentation of services through both portlets into the portal framework and also as a programming library can be achieved using language-agnostic Web service client interfaces to the VRE exposed in a service oriented architecture. Early work with GROWL at Daresbury and Lancaster has shown that this is feasible in C and R languages in addition to Java. "Heritage" applications can in this way be Grid enabled or themselves exposed (wrapped) as remote services. Other examples exist, such as GEMLCA from University of Westminster [75], WSRF::Lite from RealityGrid [76] and gLite from EGEE [77]. These are responses to the recent discussion of lightweight toolkits by existing and potential Grid users. They are also crucial steps in bringing specialist applications into play for a wider research community and for achieving inter-disciplinary research agendas.
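The language-agnostic point can be illustrated by building a Web service request by hand. The Python sketch below assembles and parses a SOAP 1.1 message using only the standard library. It does not describe GROWL or the actual HPCPortal, DataPortal or InfoPortal interfaces: the function names build_request and parse_response are hypothetical, and the service namespace, operation name and parameters are supplied by the caller as illustrative values.

```python
"""Illustrative sketch only: a language-neutral SOAP 1.1 request built by hand.

The helper names are hypothetical and do not describe the actual HPCPortal,
DataPortal or InfoPortal interfaces.
"""
import xml.etree.ElementTree as ET

SOAP_ENV = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 namespace


def build_request(service_ns, operation, params):
    """Return a SOAP envelope (bytes) calling `operation` with string params."""
    envelope = ET.Element(f"{{{SOAP_ENV}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_ENV}}}Body")
    call = ET.SubElement(body, f"{{{service_ns}}}{operation}")
    for name, value in params.items():
        ET.SubElement(call, name).text = str(value)
    return ET.tostring(envelope, encoding="utf-8")


def parse_response(xml_bytes):
    """Return {child tag: text} for the first element inside the SOAP Body."""
    root = ET.fromstring(xml_bytes)
    body = root.find(f"{{{SOAP_ENV}}}Body")
    result = body[0]
    # Strip any namespace prefix of the form "{uri}" from the child tags.
    return {child.tag.split("}")[-1]: child.text for child in result}
```

Since the message is plain XML, typically sent over HTTP, an equivalent exchange could be produced from C, R or Java, and the same service could sit behind a portlet or a workstation library.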

Back-end Resources and Services required in a VRE

A lot of effort in this project has gone into considering the integration of existing resources and services through the architecture proposed here. This proposal has emerged alongside very similar ones for the JISC VLE and IE programmes. All teams are in contact with the same American and European developers. A list of UK resources which should be integrated into the VRE was included in paper [45] and is repeated in Appendix G. Similarly the services which we believe should be available via a VRE are listed in Appendix H and a fuller version is being kept up to date on the ETF Web site. Interestingly, the Integrative Biology project has identified a very similar set among its project deliverables to enable a new insight into how chemical and biological phenomena affect the behaviour of whole organs [78].

It is important for the tools in the VRE to be able to access all the underlying resources that a particular group of collaborating researchers would require. These include computer systems, databases, data/ information collections, application codes and instruments for on-line observation and data recording and annotation. Some of the resources to which the tools that we are developing will provide immediate access include:

• HPCPortal and InfoPortal functionality applied to NGS, JCSR Grid clusters and researchers' own facilities;
• Network Monitoring tools;
• DataPortal cross search tools for scientific data;
• CCLRC's Atlas Data Store;
• HPCx capability compute facility;
• Sample experimental facilities on Daresbury Synchrotron Radiation Source;
• NGS Nodes;
• ReDReSS training and awareness content and services.

Indirect access would be provided to:

• RDN data via SPP cross search tools;
• MIMAS, Manchester Information and Associated Services;
• CSAR computing and data facilities.

We would also seek to bring in other resources and services over a longer period. Further information is provided in Appendix G.

Rob Allan 2005-05-09