-- RobAllan - 17 Aug 2006

Digital Repositories as a Mechanism for the Capture, Management and Dissemination of Chemical Data

Simon Coles (School of Chemistry, University of Southampton)

Abstract

Recently the funding councils in the UK stated that "the data underpinning the published results of publicly-funded research should be made available as widely and rapidly as possible".

Thirty years ago a research student would present about five crystal structures as their Ph.D. thesis; with modern technologies and good crystals, this can now be achieved in a single morning. This data overload is seen across the whole spectrum of analytical chemistry. Additionally, the general route for the publication of analytical data is coupled with, and often governed by, the underlying chemistry; it is therefore subject to the lengthy peer review process and tied to the timing of the publication as a whole. This bottleneck in the dissemination of analytical chemistry data hinders the potential growth of databases and the data mining studies that rely on these collections. In addition, publication in the mainstream literature still offers only indirect (and often subscription-controlled) access to a subset of the data acquired during an experiment.

The work of the eBank-UK project (http://www.ukoln.ac.uk/projects/ebank-uk/) has addressed this problem by establishing an institutional data repository that supports, manages and disseminates metadata relating to the crystal structure data it contains (i.e. all the files generated during a crystal structure determination experiment). This process alters the traditional method of peer review by making crystal structure data openly available, so that the reader or user may check its correctness and validity directly. The repository (http://ecrystals.chem.soton.ac.uk) makes available all the raw, derived and results data from a crystallographic experiment with little further researcher effort after the creation of a normal completed structure in a laboratory archive. Not only does this approach allow rapid release of crystal structure data into the public domain, but it can also provide mechanisms for value-added services that allow rapid discovery of the data for further studies and reuse, whilst ownership of the data is retained by the creator. Funding has just been secured to scope the development of a federation of such repositories across different software platforms, different universities and different countries.

The Repository for the Laboratory (R4L) project (http://r4l.eprints.org) builds on the eBank-UK concept of capturing all the data generated during an analytical chemistry experiment as an approach to laboratory data management. This is achieved by embedding the continuous, automatic deposition of data and metadata into a repository within the analytical chemistry experiment workflow. Thus, as an experiment is performed, an entry is seamlessly created in the laboratory repository. This record may be formatted in the jump-off page as an experimental report of the data: it may serve as a formal report for use in the publication process, or it may provide a mechanism for interacting with the data for analysis purposes.
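
As a rough illustration of deposition embedded in the experiment workflow, the sketch below deposits each step's output into a repository entry as soon as that step finishes. It is not the R4L or eBank-UK software: the class and function names (RepositoryEntry, LabRepository, run_experiment), the in-memory storage and the metadata fields are all hypothetical, chosen only to make the idea concrete.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class RepositoryEntry:
    """One record in the laboratory repository (hypothetical structure)."""
    experiment_id: str
    metadata: dict
    files: list = field(default_factory=list)


class LabRepository:
    """In-memory stand-in for a laboratory repository (illustrative only)."""

    def __init__(self):
        self.entries = {}

    def deposit(self, experiment_id, filename, stage, metadata=None):
        # Create the entry the first time an experiment deposits anything.
        entry = self.entries.setdefault(
            experiment_id, RepositoryEntry(experiment_id, {})
        )
        # Record the file together with its stage and a deposition timestamp.
        entry.files.append({
            "name": filename,
            "stage": stage,
            "deposited": datetime.now(timezone.utc).isoformat(),
        })
        # Merge any metadata produced by this step into the entry.
        entry.metadata.update(metadata or {})
        return entry


def run_experiment(repo, experiment_id, steps):
    """Run each workflow step and deposit its output as soon as it exists.

    `steps` is a list of (stage, callable) pairs; each callable returns
    a (filename, metadata) tuple. This calling convention is an assumption.
    """
    for stage, produce in steps:
        filename, metadata = produce()
        repo.deposit(experiment_id, filename, stage, metadata)
    return repo.entries[experiment_id]
```

A workflow with raw, derived and results stages could then be passed as, for example, [("raw", collect), ("derived", reduce_data), ("results", refine)], where each of those hypothetical callables returns a file name and a metadata dictionary. The researcher takes no separate deposit action; the entry grows as the experiment proceeds.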

Topic revision: r1 - 17 Aug 2006 - 08:21:49 - RobAllan