We could define the term ``Campus Cloud'' to embrace the use of ICT on the Campus, including computational resources, e-research portals, virtual meetings and information management. Networking, e-mail, video conferencing and calendaring are considered separately by the Campus ICT group, which has also set up a ``connections'' portal. The term Cloud is more often used, and defined differently, in the context of utility computing. It nevertheless shares some characteristics of a Campus Grid, in that both provide common user interfaces and catalogues for the selection of resources and applications (services).
This document gives an overview and introduction to the DSIC project and is provided as input to the Campus Grid SIG.
© STFC 2008-10. Neither the STFC nor its collaborators accept any responsibility for loss or damage arising from the use of information contained in any of their reports or in any communication about their tests or investigations.
There is a strong history of computer use for research purposes at Daresbury Laboratory for accelerator design and control, data analysis and, more recently, modelling and simulation. Development of expertise in all these areas is exemplified by the advanced proposal for the Hartree Centre, a gateway technology centre for computational science and engineering focussing on research grand challenges and knowledge exchange. Such developments are underpinned by expertise in networking, which is particularly strong at the Laboratory. Other related development work was carried out in the e-Science Programme and included deployment of Access Grid, Computational Grid (NW-GRID) and Portal technologies. A recent book explains the philosophy of using all these technologies in what is now known as a Virtual e-Research Environment.
There have been a number of discussions about extending ICT services for research, information management and innovation across the whole of the Daresbury Science and Innovation Campus. Plans are under way to deploy an ``intelligent'' network fabric with the capability to provide roaming access across site for both commercial and academic users. The network will route them via the appropriate providers. A separate document describes this and explores ways in which it might be used .
Here we describe and explore the growing set of requirements to exploit this advanced fabric for diverse computational and collaborative research purposes.
The North West Grid (NW-GRID) service comprises clusters of high performance computers together with a group of experts in operational support and computational science. The team offers customer support, can work with a wide variety of simulation, modelling and data analysis applications, and enables users to access unrivalled resources for their projects and research.
NW-GRID is a collaboration between Daresbury Laboratory and the Universities of Lancaster, Liverpool and Manchester plus the Proudman Oceanographic Laboratory (POL) and University of Central Lancashire (UCLAN). With the support of NWDA funding for the core sites, we together established a computational Grid comprising high performance computing systems coupled by a high speed private fibre network. The original infrastructure was deployed in 2006, with additions in 2007 and in the spring of 2008. Sun systems with dual core and quad core AMD Opteron processors integrated by Streamline Computing provide the core of the infrastructure. The NW-GRID offers world class services founded in the deployment and exploitation of Grid middleware technologies, enabling the capabilities of the Grid to be realised in leading edge research applications, primarily in computational science and engineering. Over the past few years, services were offered to many research projects in the region and have resulted in publications in high profile journals such as Nature. With the confidence gained from these successes NW-GRID is now being made available to non-academic users in the region .
NW-GRID is an important infrastructure for the North West science strategy, and the project resonates strongly with the key elements of the NWDA's regional strategy: working with targeted emerging sectors in the environment, bio-technology and pharmaceutical, and complex materials areas; establishing the North West as a global player in Grid technologies and e-research; and embedding e-competencies across the region's business, academic and industrial base.
The high performance compute clusters at the core sites are complemented by a high speed private network which can be enhanced and configured to meet all the requirements for secure access and data transfer between clusters and storage systems. All systems are supported by appropriate local disk storage and data backup.
Computational power in itself is of no direct benefit. Where NW-GRID creates real value for projects is the combined access to hardware, open source and commercial applications and expert knowledge from the partner sites. NW-GRID offers a pay-as-you-go service for commercial access to computational resources, application licenses and expertise with a number of pricing models to meet customers' growing requirements.
Currently NW-GRID has completed simulation and modelling projects in the following sectors. Separate technical case studies are available for each and some additional information is available from the Web site.
By subscribing to NW-GRID you can use its high performance computer systems for your projects. For applications and further information, please visit the NW-GRID Web site at http://www.nw-grid.ac.uk. Information about some of the work carried out on NW-GRID is described in two publications [19,20] and a report .
A separate document is available which lists regional initiatives using high performance computational resources .
The following computing systems are available to NW-GRID users as a multi-institutional Grid or to local academic users (and commercial subject to discussion) on DSIC. For further information about commercial access and the services available contact John Bancroft on 01925 603148 or Michael Gleaves on 01925 603710 or visit the DaComS Web site http://www.dacoms.ac.uk.
For more information about the Condor Pools and related resources see http://www.grids.ac.uk/twiki/bin/view/GridAndHPC.
The overall architecture of the DSIC Campus Grid is shown in Figure 2. This depicts the main cluster resources and also the pools of Linux and Windows workstations accessible via the Condor master nodes. We currently take a ``federated'' approach to Campus Grid deployment, with departments responsible for their own pools. Flocking between them is, however, permitted and encouraged. This architecture is closely based on work done in the NERC funded e-Minerals e-science project in collaboration with the University of Cambridge.
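Flocking between federated pools is configured in Condor itself. As a hedged sketch (the host names below are invented for illustration, not the actual DSIC machines), a department might add something like the following to its condor_config so that local jobs overflow to a neighbouring pool and jobs from that pool are accepted in return:

```
# Submitting side: pools to try when local resources are busy
FLOCK_TO   = condor.theory.dl.ac.uk, condor.ci.dl.ac.uk

# Executing side: remote submit machines allowed to send jobs here
FLOCK_FROM = condor.theory.dl.ac.uk, condor.ci.dl.ac.uk
```

Each department keeps full control of its own pool; flocking only takes effect when both sides opt in through these settings.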
Data storage is not currently a high priority on DSIC except for the HPCx service which had its own disc sub-system, tape robot and off site backup archive. Discussions are however under way to host similar services for POL and VEC and potentially for commercial users. There are huge gains to be made via economy of scale and we see ourselves ideally placed, working with appropriate vendors, to offer future data storage services over the network infrastructure already in place.
Sakai is our framework of choice for delivering portal services. It is the second most widely used open source portal framework and has been designed to support both learning and research involving up to tens of thousands of users. It is thus an ``enterprise class'' service [10,18,11,2]. Sakai complements our other Campus Grid activities by acting as a single sign-on container for a range of Web 2.0 style shared tools such as: resource folders, e-mail archive, blog, wiki, calendar, online chat, RSS news reader, search, and interfaces to project specific applications. Most importantly Sakai can be used to support ``virtual organisations'' through its worksites and role based access control .
A number of Sakai instances hosted on blade2 are currently in use as follows. These consortia mostly use Sakai as an information management and collaboration system for collaborating user groups and their projects.
A number of other servers are running Sakai for development and demonstration purposes, for example: bonny.dl.ac.uk (NeISS development); clyde.dl.ac.uk (Sakai build and test for Steve Swinsburg); dee.dl.ac.uk (test site for the Daresbury drawing office and programmes group); and congo.dl.ac.uk (gateway centres oversight group).
Similar services have been offered to other gateway centres and to the North West Virtual Engineering Centre.
All core NW-GRID sites and many UK and overseas Universities are equipped with Access Grid rooms for virtual meetings. The A1 AG room is used for the fortnightly NW-GRID Technical Board and Operations Board meetings, and for meetings with the NGS, HPCx, HECToR and other project partners, for instance the National Centre for e-Social Science.
The T22 combined Access Grid and video conferencing suite in the Tower at Daresbury is a state-of-the-art facility. It has an AG enabled training room next door (T23) and a conference/ training suite nearby which can hold 60 people (or 2x groups of 30 if partitioned). AG and video conferences can be broadcast into this space. This facility was part funded by NWDA and is available for use on the Daresbury Science and Innovation Campus.
It is important to include information management in this discussion. This includes the library services, many of which are now on-line with subscription and delivery managed as appropriate using resolvers. Publications of STFC staff, collaborators and facilities users are available through the ePubs repository which is a valuable science research knowledge base at http://epubs.cclrc.ac.uk. Discovery and access tools need to be provided alongside research and commerce tools, for instance ePubs must be accessible from portals and other information systems. We explored these issues in consultancy carried out for JISC during 2007 . Institutional Repositories are discussed in a book by Cathy Jones from the e-Science Centre .
During the period when e-Science was a strong research topic on the Daresbury site we supported an Oracle DB service. The systems to do this are still in place, but we no longer have DBA staff so have moved critical services to other sites, for instance the National Grid Service runs an Oracle server from University of Manchester. We believe that DSIC should have a campus wide Oracle service again to support longer term service developments, particularly in information management.
The name Grid or Cloud could be chosen. They are similar in meaning but subject to differences of interpretation. Cloud has been associated with the provision of services ``on demand'' as offered by Google, Amazon, Microsoft, etc. who host massive server farms for this purpose. Our meaning is clearly different, but we could use the same term to capture all the services offered over the campus ICT infrastructure as described here.
Cloud computing gained attention in 2007 as it became a popular solution to the problem of horizontal scalability . Our use of Cloud computing naturally evolves from our experience of NW-GRID and the Campus Condor pools and portals described above.
Matching technology to applications.
The purpose of a Cloud interface for DSIC is to allow systems as described above to be introduced and removed dynamically and made accessible independently of where they are physically located, e.g. they may be at vendor sites or university partners sites such as NW-GRID or located on the Campus, such as in the Tower, the main Computer Room or the Cockcroft Institute. The interface should allow access to a rich variety of computing and storage systems.
The term Cloud Computing derives from the common depiction of the Internet, or IP availability, as a cloud in technology architecture diagrams. The computing resources being accessed are typically owned and operated by a third party provider on a consolidated basis in data centre locations, in our case typically somewhere on the Campus. Target consumers are not concerned with the underlying technologies used to achieve the increase in server capability; the Cloud simply provides services on demand. In our case, however, consumers will be concerned with the architecture and will target their applications to the most appropriate system available at the time, usually to get the best performance. Grid computing is a technology approach to managing a Cloud, and one with which we have a lot of experience, building on NW-GRID and projects such as eMinerals . In effect, all Clouds are managed by a Grid, but not all Grids manage a Cloud. More specifically, a Compute Grid and a Cloud are synonymous, while a Data Grid and a Cloud can be different. We also use the term Campus Grid, through which we extend the Cloud to cover pools of desktop systems, possibly using novel scheduling algorithms such as spare-cycle harvesting and backfill. We could also refer to this as Integrated Computing.
Critical to the notion of cloud computing is the automation of many management tasks: if a system requires human intervention to allocate tasks to resources, it is not a Cloud.
A compute cluster can offer cost effective services for specific applications, but may be limited to a single type of computing node with all nodes running a common operating system. Alternatively, the canonical definition of Grid is one that allows any type of processing engine to enter or leave the system dynamically. This is analogous to an electrical power grid on which any given generating plant might be active or inactive at any given time. This can be achieved by physically connecting or removing distributed servers or by virtualisation. Since we support many ``heritage'' applications which are of the traditional MPI parallel type we will keep the notion of clusters and currently support physical rather than virtual resource dynamics. This can however include dual booting of certain servers.
Potential advantages of any Cloud or Grid computing approach include:
The architecture behind cloud computing, see Figure 4, is a massive network of ``cloud servers'' interconnected as in a Grid. Virtualisation could be used to maximize the utilisation of the computing power available per server, e.g. to better match the overall workload.
A front end interface such as a Portal allows a user to select a service from a catalogue. This request gets passed to the system management which finds the correct resources and then calls the provisioning services which allocates resources in the Cloud. The provisioning service may deploy the requested software stack or application as well, e.g. via licensing on-demand.
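The request flow just described can be sketched in a few lines of Python. This is only an illustration of the pattern (the service names, resource types and allocation rule are all invented), not the actual system management or provisioning software:

```python
# Sketch of the portal request flow: a user selects a service from a
# catalogue, system management finds a matching free resource, and the
# provisioning step marks it as allocated. All names are illustrative.

catalogue = {
    "dl_poly": {"needs": "mpi-cluster"},
    "r-stats": {"needs": "condor-pool"},
}

resources = [
    {"name": "nw-grid-node", "type": "mpi-cluster", "free": True},
    {"name": "campus-pool", "type": "condor-pool", "free": True},
]

def provision(service: str) -> str:
    """Find a free resource matching the service's requirements and allocate it."""
    needs = catalogue[service]["needs"]
    for res in resources:
        if res["type"] == needs and res["free"]:
            res["free"] = False          # provisioning: resource now allocated
            return res["name"]
    raise RuntimeError(f"no free resource of type {needs}")

print(provision("dl_poly"))  # allocates the MPI cluster for this request
```

In a real deployment the provisioning step would also deploy the requested software stack or licence on demand, as noted above.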
We have considered the use of MOAB from Cluster Resources for some of the above tasks .
Cloud storage is a model of networked data storage where data is stored on multiple virtual servers, generally hosted by third parties, rather than being hosted on dedicated servers. Hosting companies operate large data centers, and people who require their data to be hosted buy or lease storage capacity from them and use it for their storage needs. The data center operators, in the background, virtualise the resources according to the requirements of the customer and expose them as virtual servers, which the customers can themselves manage.
We have achieved this in the past using SRB, the Storage Resource Broker from SDSC , which provides a virtual file system interface to distributed storage ``vaults''. Physically, the resource may thus span multiple servers. In our case storage services are provided for users of DSIC compute resources and other local initiatives such as POL and NW-VEC, e.g. via NW-GRID. We note that SRB will in the future become iRODS and that other solutions, such as AFS, are available.
The middleware infrastructure used on the DSIC Campus Grid is a combination of Globus , Condor  and SRB . We have made a large investment in developing a ``lightweight Grid infrastructure'' building on this middleware and allowing users to do data management and submit computational Grid jobs from their desktop workstations (which might be also resources in the Condor pool). The software which integrates this infrastructure is now referred to as the G-R-Toolkit .
The G-R-Toolkit combines the best software developed at STFC during its e-Science Programme from 2001 to 2007. It allows users of many applications in computational research to manage their high performance computing and data and information management tasks directly from their desktop systems. Components of G-R-T, some of which are available separately, include:
GROWL Scripts - facilitate management of digital certificates and access to datasets on remote Grid resources.
SRB Client and RCommands - desktop tools to manage stored datasets and metadata.
RMCS - uses Condor DAGMan to create and enact workflows to integrate data management and remote computation.
R, Perl and Python framework - scripting interfaces suitable for many research domains from bio-informatics to chemistry.
AgentX - a sophisticated semantic toolset using domain specific ontologies to link applications with ASCII, XML and DB formats.
G-R-T C library - Web service clients appropriate for application programming.
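RMCS drives its workflows through Condor DAGMan. As a rough illustration of the pattern it automates (the submit-file names here are hypothetical), a hand-written DAG that stages data in from SRB, runs the computation and stages the results back out might look like:

```
# workflow.dag -- stage-in, compute, stage-out as a DAGMan workflow
JOB  stage_in   stage_in.sub
JOB  compute    compute.sub
JOB  stage_out  stage_out.sub
PARENT stage_in  CHILD compute
PARENT compute   CHILD stage_out
```

Such a file would be submitted with condor_submit_dag; RMCS generates and enacts the equivalent workflow on the user's behalf so that data management and remote computation are integrated without hand-written DAGs.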
G-R-T uses Grid middleware to perform its tasks on behalf of the user. Well known technology is re-deployed on a dedicated intermediate server (rmcs.dl.ac.uk), including Web Services, Condor, Globus, SRB and MyProxy.
G-R-T will work alongside and extend existing toolkits. It has a ``plug-in'' capability allowing Grid client functionality to be imported into applications such as Matlab, Stata, Materials Studio and others. G-R-T is written in a modular style using Web services to achieve a Service Oriented Architecture, a widely adopted pattern in software engineering. This enables its client side to be re-factored or extended to suit most research requirements.
Components of the G-R-Toolkit were developed by Rob Allan, Adam Braimah, Phil Couch, Dan Grose, John Kewley and Rik Tyer in STFC's Grid Technology Group and their partners, see http://www.grids.ac.uk/twiki/bin/view/GridAndHPC/GRToolkit.
GROWL is the Grid Resources On Workstation Library  development of which was funded in a JISC VRE-1 project.
GrowlScripts is a set of useful command line scripts which were developed by John Kewley during and after the GROWL VRE-1 project.
RMCS is the Remote My Condor Submit developed in the NERC funded e-Minerals project .
AgentX was developed by Phil Couch in the e-CCP project funded by STFC.
RCommands were developed by Rik Tyer to enhance RMCS by facilitating logging of metadata records associated with computational jobs.
MultiR and SabreR were developed by Dan Grose at University of Lancaster based on GROWL but written in the R language. They have been applied to longitudinal statistical analysis , bio-informatics and geography.
RMCS and RCommands have also been deployed by Jonathan Churchill on the National Grid Service, see http://wiki.ngs.ac.uk/index.php?title=Category:Community_Software.
The software is currently deployed by hand as outlined on the Wiki pages at http://www.grids.ac.uk/twiki/bin/view/GridAndHPC.
We thank Jonny Smith, formerly at the Cockcroft Institute and now with Tech-X, who set up most of the CI Condor infrastructure and contributed to previous versions of this document.
We thank Mark Calleja and Martin Dove at University of Cambridge for continuing inspiration and encouragement.
This document was generated using the LaTeX2HTML translator Version 2008 (1.71)
The translation was initiated by Rob Allan on 2010-08-31