Version 3.1
We could define the term ``Campus Cloud'' to embrace the use of ICT on the Campus including computational resources, e-research portals, virtual meetings and information management. Networking, e-mailing, video conferencing and calendaring are considered separately by the Campus ICT group who have also set up a ``connections'' portal. The Cloud is however defined differently by its frequent use in the context of utility computing. It does share some characteristics of a Campus Grid by providing common user interfaces and catalogues for selection of resources and applications (services).
This document gives an overview and introduction to the DSIC project and is provided as input to the Campus Grid SIG.
© STFC 2008-11. Neither the STFC nor its collaborators accept any responsibility for loss or damage arising from the use of information contained in any of their reports or in any communication about their tests or investigations.
There is a strong history since 1962 of computer use for research purposes at Daresbury Laboratory [7] for accelerator design and control, data analysis and since the late 1970s for modelling and simulation. Development of expertise in all these areas is exemplified by the advanced proposal for the Hartree Centre [25], which is a gateway technology centre for computational science and engineering focussing on research grand challenges and knowledge exchange. Such developments are underpinned by expertise in networking which is particularly strong at the Laboratory. Other related development work was carried out in the e-Science Programme and included deployment of Access Grid, Computational Grid (NW-GRID) and Portal technologies. A recent book [2] explains the philosophy of using all these technologies in what is now known as a Virtual e-Research Environment.
There have been a number of discussions about extending ICT services for research, information management and innovation across the whole of the Daresbury Science and Innovation Campus. Plans are under way to deploy an ``intelligent'' network fabric with the capability to provide roaming access across site for both commercial and academic users. The network will route them via the appropriate providers. A separate document describes this and explores ways in which it might be used [17].
We here describe and explore the growing set of requirements to exploit this advanced fabric for diverse computational and collaborative research purposes.
Web site: http://www.nw-grid.ac.uk.
The North West Grid (NW-GRID) service comprises clusters of high performance computers and a group of experts in operational support and computational science. These experts offer user and application support and can work with a wide variety of simulation, modelling and data analysis applications, enabling you to access and use these resources for your projects and research.
NW-GRID was originally a collaboration between Daresbury Laboratory and the Universities of Lancaster, Liverpool and Manchester plus the Proudman Oceanographic Laboratory (POL) and University of Central Lancashire (UCLAN). With the support of NWDA funding for the core sites, we together established a computational Grid comprising high performance computing systems coupled by a high speed private fibre network. The original infrastructure was deployed in 2005, with additions in 2007 and in the spring of 2008. Sun systems with dual core and quad core AMD Opteron processors integrated by Streamline Computing provided the core of the infrastructure. The NW-GRID offers world class services founded in the deployment and exploitation of Grid middleware technologies, enabling the capabilities of the Grid to be realised in leading edge research applications, primarily in computational science and engineering. Over the past few years, services have been offered to many research projects in the region and have resulted in publications in high profile journals such as Nature. With the confidence gained from these successes NW-GRID was made available to non-academic users in the region [1].
NW-GRID was an important infrastructure for the North West science strategy and the project resonated strongly with the key elements of the NWDA's regional strategy, in particular in working with targeted emerging sectors in the environment, bio-technology and pharmaceutical and complex materials areas, establishing the North West as a global player in Grid technologies, e-research and in embedding e-competencies across the region's business, academic and industrial base.
The high performance compute clusters at the core sites were complemented by a high speed private network which could be enhanced and configured to meet all the requirements for secure access and data transfer between clusters and storage systems. All systems are supported by appropriate local disk storage and data backup.
A number of collaborative initiatives have been established on the Daresbury Science and Innovation Campus which complement or build upon the NW-GRID.
The Cockcroft Institute
Web site: http://www.cockcroft.ac.uk/.
The Cockcroft Institute is an international centre for Accelerator Science and Technology (AST) in the UK. It was proposed in Sep'2003 and officially opened by the UK Minister for Science, Lord Sainsbury, in Sep'2006. It is a joint venture between the Universities of Lancaster, Liverpool and Manchester, the Science and Technology Facilities Council (STFC at the Daresbury and Rutherford Appleton Laboratories) and the North West Development Agency (NWDA). The Institute is located in a purpose built building on the Daresbury Science and Innovation Campus adjacent to the Daresbury Laboratory and the Daresbury Innovation Centre, and has established satellite centres in each of the participating universities.
The Institute provides the intellectual focus, educational infrastructure and the essential scientific and technological facilities for accelerator science and technology research and development, which will enable UK scientists and engineers to take a major role in innovating future tools for scientific discoveries and in the conception, design, construction and use of the world’s leading research accelerators for the foreseeable future.
Computational modellers based in the Cockcroft Institute have access to NW-GRID resources and contribute to the Condor pool of the DSIC Campus Grid.
VEC: The Virtual Engineering Centre
Web site: http://virtualengineeringcentre.com/.
The VEC is a University of Liverpool project partially funded by NWDA and ERDF located at Daresbury Laboratory. It was established to assist the North West UK aerospace sector and wider industry by providing a focal point for world class virtual engineering technology, research, education and best practice with the aim of improving business performance throughout the supply chain.
The VEC was thus set up to help both small companies and larger organisations embrace ICT in the rapid design life cycle. Companies bringing their expertise together and using computational modelling can reduce the time from idea to product. The manufacturing process and test phases can be modelled prior to implementation and some validation carried out in silico to avoid expensive and time consuming prototyping and testing. Use of high definition 3D visualisation is a key element to share information and allows parties to walk through all the stages in the process. Numerical models are computed using NW-GRID resources and a computational cluster at University of Liverpool. The visualisation facility is also made available to other NW-GRID partners subject to discussion.
The VEC plays multiple roles and is pivotal to a successful collaboration and uptake of virtual engineering methodology. It negotiates and provides a suite of licensed software running on top end computational (NW-GRID) and visualisation resources thus providing a facility which small firms can access. It provides consultancy through its in house researchers and software experts, plus it can call upon other experts at University and the Laboratory. It also acts as a business broker, potentially bringing partners together as driven by the needs of the European aerospace industry.
KCMC: The Knowledge Centre for Materials Chemistry
Web site: http://www.materialschemistry.org/kcmc/.
The Knowledge Centre for Materials Chemistry is a virtual centre of expertise providing multi-disciplinary research and innovative knowledge transfer based on world class capabilities in applied materials chemistry.
It acts as a single point of contact for companies of all sizes to access a substantial range of facilities and expertise in applied materials chemistry at four leading academic institutions at Bolton, Liverpool and Manchester Universities and the Science and Technology Facilities Council at Daresbury.
A core capability of KCMC is computational modelling for which NW-GRID resources are used.
Daresbury Laboratory is well placed to host and exploit high performance computing and data management systems. It has a large secure computer hall (former home of the UK national facility HPCx), good electricity supply and UPS, high speed network fabric and experts in managing and operating computer systems plus software development, computational science and legal and business management (the latter through STFC Innovations Ltd.).
A number of computer systems are hosted for specific purposes; a current example is a Fujitsu system on loan for a software development project.
Other systems are hosted for the EPSRC Distributed Computing Programme and are available for all UK academic researchers for evaluation and software development. These include the Woodcrest and Harpertown clusters, Cell cluster and IBM Power-7 system which are listed below.
OCF enCore Service
In Dec'2010, OCF signed a collaboration agreement with Daresbury Laboratory to make use of its available processing power from a new shared IBM iDataPlex cluster. STFC Daresbury Laboratory makes around 50% of its processing power available to OCF's customers (from an initial set of 20 nodes). Through an SLA driven agreement, the requirements of OCF's customers will receive priority on the server cluster and the size of the cluster can be grown if there is sufficient demand and funding. OCF is thus aiming to sign a network of academic and research partners to contribute a constant and significant level of compute power to meet projected customer demand.
The enCore service managed by OCF is aimed at UK businesses of any size and from any sector. It can act as an overflow service for businesses to meet a temporary requirement for more processing power. It can, for example, enable design consultants from SMEs to pitch for larger projects than the limitations of their own IT infrastructure would otherwise allow. It serves as an OCF courtesy service for customers whilst they wait for tenders to complete or for a new dedicated HPC system to arrive. It can also act as a direct hardware replacement for businesses seeking to reduce their capital expenditure (a cloud pay on demand model).
University of Huddersfield
A partnership with the University of Huddersfield has enabled the first step in extending the size of the partner cloud based on the iDataPlex system. Purchase of an additional set of 20 nodes will enable an ``elastic computing'' cloud like model to be implemented. A flexible queuing system based on LSF will prioritise jobs from the various partners but enable them to access as much of the system as they require within agreed time allocations.
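The prioritisation idea can be illustrated with a minimal Python sketch. It is not LSF configuration and not the scheme actually deployed; it only assumes that each partner has an agreed allocation in node-hours and that jobs from the partner with the smallest used fraction should run first. The partner labels, figures and the function name `order_jobs` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Job:
    partner: str       # hypothetical partner label
    node_hours: float  # node-hours requested by the job


def order_jobs(pending, allocation, used):
    """Order pending jobs by each partner's used fraction of its allocation.

    Jobs that would take a partner beyond its agreed allocation are held
    back; the rest are sorted so the least-served partner goes first.
    """
    runnable = [
        job for job in pending
        if used.get(job.partner, 0.0) + job.node_hours <= allocation[job.partner]
    ]
    # sorted() is stable, so submission order is kept within a partner.
    return sorted(
        runnable,
        key=lambda job: used.get(job.partner, 0.0) / allocation[job.partner],
    )


# Illustrative figures only (assumptions, not real allocations).
allocation = {"partner_a": 1000.0, "partner_b": 1000.0}
used = {"partner_a": 800.0, "partner_b": 100.0}
pending = [
    Job("partner_a", 50.0),
    Job("partner_b", 200.0),
    Job("partner_a", 500.0),  # would exceed partner_a's allocation
]
ordered = order_jobs(pending, allocation, used)
```

In this sketch a partner can use as much of the system as it asks for, provided it stays inside its allocation; a production scheduler would also take into account queue time, job size and running jobs.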
Computational power in itself is of no direct benefit. Where NW-GRID creates real value for projects is the combined access to hardware, open source and commercial applications and expert knowledge from the partner sites. NW-GRID offers a pay-as-you-go service for commercial access to computational resources, application licenses and expertise with a number of pricing models to meet customers' growing requirements.
Currently NW-GRID has completed simulation and modelling projects in the following sectors. Separate technical case studies are available for each and some additional information is available from the Web site.
By subscribing to NW-GRID you can use its high performance computer systems for your projects. For applications and further information, please visit the NW-GRID Web site at http://www.nw-grid.ac.uk. Information about some of the work carried out on NW-GRID is described in two publications [20,21] and a report [1] which are available from the Web site.
A separate document is available which lists regional initiatives using high performance computational resources [4].
The following computing systems are (or have been) available to NW-GRID users as a multi-institutional Grid or to local academic users (and commercial subject to discussion) on DSIC. For further information about commercial access and the services available contact John Bancroft on 01925 603148 or Michael Gleaves on 01925 603710 or visit the DaComS Web site http://www.dacoms.ac.uk. For access to the OCF enCore service, contact Jerry Dixon on 07508 033900.
For more information about the Condor Pools and related resources see http://www.grids.ac.uk/twiki/bin/view/GridAndHPC.
The overall architecture of the DSIC Campus Grid is shown in Figure 3. This depicts the main cluster resources and also the pools of Linux and Windows workstations accessible via the Condor master nodes. We currently take a ``federated'' approach to Campus Grid deployment with departments responsible for their own pools. Flocking between them is however permitted and encouraged. This architecture is closely based on work done in the NERC funded e-Minerals e-science project in collaboration with University of Cambridge.
A separate report on Clouds for e-Research is available [3].
The Grid infrastructure enables access to all of this hardware in a seamless fashion that can be highly automated. The Grid is now a key infrastructure for the North West science strategy and the NW-GRID project resonates strongly with the key elements of the NWDA’s regional strategy in particular in working with targeted emerging sectors in the environment, bio-technology and pharmaceutical and complex materials areas, establishing the North West as a global player in Grid technologies, e-research and in embedding e-competencies across the region’s business, academic and industrial base.
The compute clusters at the partner sites are complemented by a high speed private network which can be enhanced and configured to meet all your requirements for secure access and data transfer between clusters and storage systems. All systems are supported by appropriate disk storage and data backup.
It is worth noting that Daresbury Laboratory is also a member of NorthGrid, with Universities of Lancaster, Liverpool, Manchester and Sheffield. Daresbury provides networking expertise.
Daresbury Laboratory (DL) and Campus (DSIC) networking can be used by commercial traffic.
The networking available out of the DSIC Campus is a 1Gb/s light path from Daresbury Laboratory to Manchester through Net North West (NNW). There is also a 1Gb/s link to Liverpool through NNW for resilience. This is used as the failover for the Manchester link within the current network configuration.
There is a fibre based connection between the Daresbury Innovation Centre (DIC) building and Daresbury Laboratory currently operating at 100Mb/s; the bottleneck is a router in the DIC building. Fibre into the Daresbury router could provide faster connections if connectivity within DIC were upgraded.
Commercial traffic from DSIC is routed out to a commercial ISP connected to NNW Manchester based on the originating IP range. Currently internet access for companies in DIC is provided by this route.
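Routing by originating IP range can be sketched in a few lines of Python using the standard `ipaddress` module. This is only an illustration of the decision rule, not the router configuration in use. The address range shown is a reserved documentation range, and the route labels are hypothetical, since the real DSIC ranges and provider names are not given here.

```python
import ipaddress

# Hypothetical range standing in for addresses assigned to commercial tenants.
COMMERCIAL_RANGES = [ipaddress.ip_network("192.0.2.0/24")]


def select_route(source_ip: str) -> str:
    """Choose an upstream route from the originating address of the traffic."""
    addr = ipaddress.ip_address(source_ip)
    if any(addr in net for net in COMMERCIAL_RANGES):
        return "commercial-isp"   # hypothetical label for the commercial ISP
    return "academic-default"     # hypothetical label for all other traffic


route_for_tenant = select_route("192.0.2.17")
route_for_other = select_route("198.51.100.5")
```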
There is a wireless connection on the Daresbury Laboratory site as mentioned above.
There are several current opportunities to enhance this network; see Figure 4.
Data storage is not currently a high priority on DSIC except for the HPCx service which had its own disc sub-system, tape robot and off site backup archive. Discussions are however under way to host similar services for POL and VEC and potentially for commercial users. There are huge gains to be made via economy of scale and we see ourselves ideally placed, working with appropriate vendors, to offer future data storage services over the network infrastructure already in place.
The current technical provision is through a combination of Panasas for high performance storage on clusters such as DL1 and IBM GPFS for distributed hierarchical file management. It is likely that the GPFS will be enhanced in the future.
[growing requirement to be addressed]
Sakai is our framework of choice for delivering portal services. It is the second most widely used open source portal framework and has been designed to support both learning and research involving up to tens of thousands of users. It is thus an ``enterprise class'' service [11,19,12,2]. Sakai complements our other Campus Grid activities by acting as a single sign on container for a range of Web 2.0 style shared tools such as: resource folders, e-mail archive, blog, wiki, calendar, online chat, RSS news reader, search, and interfaces to project specific applications. Most importantly Sakai can be used to support ``virtual organisations'' through its worksites and role based access control [13].
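The way a worksite with role based access control can represent a ``virtual organisation'' is sketched below in Python. This is a toy model for illustration and does not use or reproduce the Sakai API; the class, role and permission names are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Worksite:
    """A toy worksite: each member holds one role, and roles grant permissions."""
    name: str
    role_permissions: dict = field(default_factory=dict)  # role -> set of permission names
    members: dict = field(default_factory=dict)           # user id -> role

    def is_allowed(self, user: str, permission: str) -> bool:
        role = self.members.get(user)
        if role is None:
            return False  # non-members of the virtual organisation get no access
        return permission in self.role_permissions.get(role, set())


# Hypothetical project worksite with two roles.
site = Worksite(
    name="example-project",
    role_permissions={
        "maintain": {"read", "write", "manage_members"},
        "access": {"read"},
    },
    members={"alice": "maintain", "bob": "access"},
)
```

Membership of the worksite defines who belongs to the virtual organisation, and the role a member holds decides which shared tools and actions are open to them.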
A number of Sakai instances hosted on blade1 are currently in use as follows. These consortia mostly use Sakai as an information management and collaboration system for collaborating user groups and their projects.
A number of other servers are running Sakai for development and demonstration purposes, such as bonny.dl.ac.uk (NeISS development); clyde.dl.ac.uk (Sakai build and test); dee.dl.ac.uk (test site for Daresbury drawing office and programmes group); and congo.dl.ac.uk (gateway centres oversight group).
Similar services have been offered to other gateway centres and to the VEC.
[Information TBA]
[Information TBA]
All core NW-GRID sites and many UK and overseas Universities are equipped with Access Grid rooms for virtual meetings. The A1 AG room is used for the fortnightly NW-GRID Technical Board and Operations Board meetings, for meetings with the NGS, HPCx, HECToR and other project partners, for instance NCeSS.
The T22 combined Access Grid and video conferencing suite in the Tower at Daresbury was an additional state-of-the-art facility. It had an AG enabled training room next door (T23) and a conference/training suite nearby which can hold 60 people (or two groups of 30 if partitioned). AG and video conferences can be broadcast into this space. This facility was part funded by NWDA and is available for use on the Daresbury Science and Innovation Campus. T22 and T23 were re-commissioned for use by the Detector Systems Centre in 2010 so the AG facilities there are no longer available.
It is important to include information management in this discussion. This includes the library services, many of which are now on line with subscription and delivery managed as appropriate using resolvers. Publications of STFC staff, collaborators and facilities users are available through the ePubs repository which is a valuable science research knowledge base at http://epubs.cclrc.ac.uk. Discovery and access tools need to be provided alongside research and commerce tools, for instance ePubs must be accessible from portals and other information systems. We explored these issues in consultancy carried out for JISC during 2007 [6]. Institutional Repositories are discussed in a book by Cathy Jones from the STFC e-Science Centre [16].
During the period when e-Science was a strong research topic on the Daresbury site we supported an Oracle DB service. We no longer have DBA staff or systems to support this, so have moved critical services to other sites, for instance the National Grid Service runs an Oracle server from University of Manchester. We believe that DSIC should have a campus wide Oracle service again to support longer term service developments, particularly in information management.
The middleware infrastructure used on the DSIC Campus Grid is a combination of Globus [15], Condor [18] and SRB [8]. We have made a large investment in developing a ``lightweight Grid infrastructure'' building on this middleware and allowing users to do data management and submit computational Grid jobs from their desktop workstations (which might be also resources in the Condor pool). The software which integrates this infrastructure is now referred to as the G-R-Toolkit [24].
The G-R-Toolkit combines the best software developed at STFC during its e-Science Programme from 2001-2007. It allows users of many applications in computational research to manage their high performance computing and data and information management tasks directly from their desktop systems. Components of G-R-T, some of which are available separately, include:
| GROWL Scripts - facilitate management of digital certificates and access to datasets on remote Grid resources. | SRB Client and RCommands - desktop tools to manage stored datasets and metadata. |
| RMCS - uses Condor DAGMan to create and enact workflows to integrate data management and remote computation. | R, Perl and Python framework - scripting interfaces suitable for many research domains from bio-informatics to chemistry. |
| AgentX - a sophisticated semantic toolset using domain specific ontologies to link applications with ASCII, XML and DB formats. | G-R-T C library - Web service clients appropriate for application programming. |
G-R-T uses Grid middleware to perform its tasks on behalf of the user. Well known technology is re-deployed on a dedicated intermediate server (rmcs.dl.ac.uk), including Web Services, Condor, Globus, SRB and MyProxy.
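To illustrate the kind of workflow the RMCS component listed above hands to Condor DAGMan, the Python sketch below writes a DAGMan input description for a linear chain of steps: stage data in, compute, stage results out. It is not the RMCS implementation; the node names and submit file names are hypothetical. It uses only the basic `JOB` and `PARENT ... CHILD` DAGMan keywords.

```python
def build_dag(steps):
    """Return DAGMan input text for a linear chain of (node_name, submit_file) steps."""
    # One JOB line per node, naming the Condor submit file that runs it.
    lines = [f"JOB {name} {submit_file}" for name, submit_file in steps]
    # Each step may only start once the previous one has completed.
    for (parent, _), (child, _) in zip(steps, steps[1:]):
        lines.append(f"PARENT {parent} CHILD {child}")
    return "\n".join(lines) + "\n"


# Hypothetical three-step workflow combining data management and computation.
steps = [
    ("stage_in", "stage_in.sub"),
    ("compute", "compute.sub"),
    ("stage_out", "stage_out.sub"),
]
dag_text = build_dag(steps)
```

The resulting text could be saved to a file and passed to `condor_submit_dag`, assuming the three submit files exist.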
G-R-T will work alongside and extend existing toolkits. It has a ``plug-in'' capability allowing Grid client functionality to be imported into applications such as Matlab, Stata, Materials Studio and others. G-R-T is written in a modular style using Web services to achieve a Service Oriented Architecture, a widely adopted pattern in software engineering. This enables its client side to be re-factored or extended to suit most research requirements.
Components of the G-R-Toolkit were developed by Rob Allan, Adam Braimah, Phil Couch, Dan Grose, John Kewley and Rik Tyer in STFC's Grid Technology Group and their partners, see http://www.grids.ac.uk/twiki/bin/view/GridAndHPC/GRToolkit.
GROWL is the Grid Resources On Workstation Library [23], development of which was funded in a JISC VRE-1 project.
GrowlScripts is a set of useful command line scripts which were developed by John Kewley during and after the GROWL VRE-1 project.
RMCS is Remote My Condor Submit, developed in the NERC funded e-Minerals project [21].
AgentX was developed by Phil Couch in the e-CCP project funded by STFC.
RCommands were developed by Rik Tyer to enhance RMCS by facilitating logging of metadata records associated with computational jobs.
MultiR and SabreR were developed by Dan Grose at University of Lancaster based on GROWL but written in the R language. They have been applied to longitudinal statistical analysis [10], bio-informatics and geography.
RMCS and RCommands have also been deployed by Jonathan Churchill on the National Grid Service, see http://wiki.ngs.ac.uk/index.php?title=Category:Community_Software.
The software is currently deployed by hand as outlined on the Wiki pages at http://www.grids.ac.uk/twiki/bin/view/GridAndHPC.
This work was in part funded by the North West Development Agency, the UK e-Science Programme, EPSRC's SLA with Daresbury Laboratory, STFC and University of Huddersfield.
We thank Jonny Smith, formerly at the Cockcroft Institute and now with Tech-X, who set up most of the CI Condor infrastructure and contributed to previous versions of this document.
For many stimulating discussions we thank the following: Violeta Holmes and Ibad Kureshi of University of Huddersfield; Gillian Mirray, Tony Robotham and James Forrest of the VEC; Rick Anderson of KCMC.
We thank our former colleagues who worked on the e-Science software development and deployment in many Grid projects: Rik Tyer, Phil Couch, Adam Braimah, Andy Richards, Asif Akram, Dave Meredith, Xiaobo Yang, Xiao Dong Wang, Grahame Winter, Ronan Keegan, Jens Thomas and Jamie Rintelman.
Finally we thank Mark Calleja and Martin Dove at University of Cambridge for inspiration and encouragement over a long period.
We thank other members of NW-GRID and the UK Campus Grid SIG for sharing information.
This document was generated using the LaTeX2HTML translator Version 2008 (1.71)
The translation was initiated by Rob Allan on 2011-05-16