RCommands user guide

Introduction to the RCommands

The RCommands framework provides a set of scriptable commands to associate metadata with files stored within a distributed file system such as the Storage Resource Broker (SRB), a set of FTP servers, or a set of files available over HTTP. They enable the creation of metadata to be semi-automated. The RCommands insert and modify metadata held within a central metadata server.

The RCommands have been developed by Rik Tyer (http://www.e-science.clrc.ac.uk/web/staff/richard_tyer) of the CCLRC eScience Centre (http://www.e-science.clrc.ac.uk/web/), Daresbury Laboratory UK (http://www.cclrc.ac.uk/Activity/DL).

Data organisation

The RCommands assume a three-layer hierarchy for the data:

  • The study level. This is the over-arching level under which you will group all files concerned with one particular piece of work. An example might be a study of sea surface temperatures in the North Atlantic Ocean. If you use the PDF publication files as your data, together they might represent a single study called "escience".
  • The dataset level. This grouping will consist of a set of files associated with one aspect of the study. For example, in a study of sea surface temperatures, it might be one season or one region. If you use the PDF publication files above, we have already separated these into possible datasets ("grid computing", "data management", "collaborative tools" and "applications").
  • The data object level. This will consist of a single file or a natural collection of files (such as the complete set of files produced by a single computation). If you use the PDF publication files, each file will be a data object.

One important point should be noted: the study and dataset levels are completely abstract. In contrast, the data objects correspond to URIs that point to real objects, including (but not exclusively so) files or collections of files in the SRB.

Users should not feel constrained by this hierarchy. For example, you may feel that your whole life's work is one study, so that this level has little meaning. On the other hand, you may feel that any one study should only have data objects. This hierarchy has many interpretations and should be used in the way that best suits the investigator.

It is possible to add metadata to each of these levels. Within the framework of the RCommands, each level will have an ID number that is used in the scriptable RCommands.

The commands

There are only ten RCommands, with detailed descriptions provided below.

  • Rinit: starts an RCommand session, and is needed in order to read information from configuration files.
  • Rpasswd: changes your password that is associated with your access to the metadata server.
  • Rcreate: creates a metadata object, i.e. a study, dataset or data object.
  • Rannotate: adds a description to a study or dataset, or a metadata parameter (name/value pair) to a dataset or data object.
  • Rls: lists the different entities within the metadata database.
  • Rget: displays the metadata associated with a particular entity.
  • Rrm: removes entities from the metadata database.
  • Rchmod: adds or removes investigators to or from a study.
  • Rsearch: searches the metadata associated with studies and datasets for name/value pairs or keyword descriptions.
  • Rexit: ends an RCommand session and has the primary effect of cleaning away hidden files created during the session.

Usage

Username

You will need a username to provide you with access to the RCommands database: this will be provided by the database manager.

Create the configuration files

You need to create a file named ~/.rcommands/rcommand.config, which has the form

username = <username>
password = <password>
cacertdir = /etc/grid-security/certificates
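The configuration file can also be written from a script. The following is a minimal sketch, not part of the RCommands themselves: the helper name write_rcommand_config is hypothetical, and it assumes the file holds one "key = value" setting per line. Because the file contains a password, the sketch restricts access to the owner.

```shell
# Sketch: write an RCommands configuration file.
# Assumption: the file holds one "key = value" setting per line.
# The function body runs in a subshell, so the umask change stays local.
write_rcommand_config() (
    config_dir=$1    # normally "$HOME/.rcommands"
    username=$2      # supplied by the database manager
    password=$3
    umask 077        # the file contains a password: owner access only
    mkdir -p "$config_dir" || exit 1
    cat > "$config_dir/rcommand.config" <<EOF
username = $username
password = $password
cacertdir = /etc/grid-security/certificates
EOF
)
```

A typical call would be write_rcommand_config "$HOME/.rcommands" followed by your username and password.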

Initiating an RCommand session

You initiate an RCommand session using the Rinit command. You can test that all is well by typing the Rls command: it will return a message telling you about any studies you have. To get information about other commands, you can type the command name with no arguments, use the Unix man command, or look at the information below.
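In a script it is convenient to make sure that every session opened with Rinit is also closed with Rexit. The following is a minimal sketch under that assumption; run_in_session is a hypothetical helper name, and it relies only on the documented behaviour that each RCommand returns zero on success. It runs in the current shell, because the session key is specific to the shell instance in which Rinit was executed.

```shell
# Sketch: run one command inside an RCommands session.
# run_in_session is a hypothetical helper name, not part of the RCommands.
run_in_session() {
    Rinit || return 1   # non-zero exit status: the session did not start
    "$@"                # the command (or shell function) to run in the session
    status=$?
    Rexit               # always end the session, even if the command failed
    return "$status"    # report the exit status of the command that was run
}
```

For example, run_in_session Rls starts a session, lists your studies, and ends the session again.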

Creating a study

First use the Rcreate command to create a study level. To use Rcreate you will need to give the study a name, add a description, and assign it to a topic, via:

Rcreate -n name -k description -t topicID

First you should think about the topic. You can list all topics by the command

Rls -t

Choose a topic and note its number; this will be the topicID. Then run the Rcreate command to create a study. The name and description can contain more than one word if enclosed in quotes. For example, to create a study containing a set of workshop papers, we might use:

Rcreate -n "Workshop papers" -k "Papers for workshop" -t 4

We can check that this has worked by running the Rls command. This will return information like


StudyID: 1026 Name: Workshop papers

where the StudyID number will differ for different people. Now we can look at this in more detail using the Rget command:

Rget -s studyID

where studyID is your StudyID number. For the example above:

Rget -s 1026

gives


StudyID: 1026 Name: Workshop papers Description: Papers for workshop Created by: martin dove Status: In Progress Start_date: 07-01-2006
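Later commands take the StudyID as an argument, so a script needs to pick it out of the Rls output. The following is a minimal sketch; study_ids is a hypothetical helper name, and it assumes that each study is reported on a line that starts with "StudyID" followed by the ID number, as in the example output above. The exact output layout of your installation may differ.

```shell
# Sketch: print the StudyID numbers found in Rls output read from stdin.
# Assumption: each study appears on a line that starts with "StudyID",
# followed by the ID number, as in the example output above.
study_ids() {
    sed -n 's/^StudyID[^0-9]*\([0-9][0-9]*\).*/\1/p'
}
```

It would be used as a filter, for example Rls | study_ids, which prints one ID per line.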

Adding datasets with metadata

Now we want to add some data sets to the study. Following the example of pdf publications, we could create some datasets by

Rcreate -s 1026 -n "Papers on grid computing"
Rcreate -s 1026 -n "Papers on data management"
Rcreate -s 1026 -n "Papers on collaborative tools"
Rcreate -s 1026 -n "Papers on escience applications"

Each invocation will create a DatasetID, which is echoed to the screen. Now check on the results of these commands by

Rls -s 1026

This will show you the DatasetID for each dataset (again, different users will get different numbers). You can look at any one dataset by using the command

Rget -d datasetID

where you use the appropriate DatasetID number.
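When a study has many datasets, the repeated Rcreate calls can be scripted as a loop. The following is a minimal sketch; create_datasets is a hypothetical helper name, and it only issues the "Rcreate -s studyID -n name" form shown above, stopping at the first call that reports an error.

```shell
# Sketch: create one dataset per name inside an existing study.
# create_datasets is a hypothetical helper name; it only calls
# "Rcreate -s studyID -n name" as described above.
create_datasets() {
    study_id=$1
    shift
    for dataset_name in "$@"; do
        # Stop at the first Rcreate call that returns a non-zero status.
        Rcreate -s "$study_id" -n "$dataset_name" || return 1
    done
}
```

For the example study this would be called as create_datasets 1026 "Papers on grid computing" "Papers on data management", and so on for the remaining names.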

Now we will add some metadata to each dataset. For this we use the Rannotate command. In my example, running Rls -s 1026 gives


Dataset ID: 26 Dataset Name: Papers on grid computing Parent StudyID: 1026
Dataset ID: 27 Dataset Name: Papers on data management Parent StudyID: 1026
Dataset ID: 28 Dataset Name: Papers on collaborative tools Parent StudyID: 1026
Dataset ID: 29 Dataset Name: Papers on escience applications Parent StudyID: 1026

We can use the Rannotate command in two ways. First, we can add a description to the dataset. My example is

Rannotate -d 29 -k "Collection of papers on escience applications"

Second, we can add some name/value pairs. My example is

Rannotate -d 29 -p topic=escience
Rannotate -d 29 -p topicarea=applications

Running the Rget -d 29 command to view the metadata gives


DatasetID: 29 Name: Papers on escience applications Parent StudyID: 1026 Created by: martin dove Creation_date: 07-01-2006 Description: Collection of papers on escience applications

Note that this shows the description but not the name/value pairs. To see the name/value pairs, use the command Rget -d 29 -p, which yields:


Parameter Name: topic Parameter Value: escience
Parameter Name: topicarea Parameter Value: applications

You can repeat this for other datasets, and you can add whatever name/value pairs you like.
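Attaching several name/value pairs to one dataset can be scripted in the same spirit. The following is a minimal sketch; annotate_dataset is a hypothetical helper name. It issues one "Rannotate -d datasetID -p name=value" call per pair and, as an extra precaution that is not part of the RCommands, rejects arguments that do not look like name=value.

```shell
# Sketch: attach several name/value pairs to one dataset.
# annotate_dataset is a hypothetical helper name; it only calls
# "Rannotate -d datasetID -p name=value" as described above.
annotate_dataset() {
    dataset_id=$1
    shift
    for pair in "$@"; do
        case $pair in
            # At least one character before "=": pass it to Rannotate.
            ?*=*) Rannotate -d "$dataset_id" -p "$pair" || return 1 ;;
            *)    echo "not a name=value pair: $pair" >&2; return 1 ;;
        esac
    done
}
```

The two Rannotate calls in the example above would then become annotate_dataset 29 topic=escience topicarea=applications.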

Adding data objects with metadata

Finally we reach the point where we can add metadata to the data objects. You need to first have data somewhere, and in our case our data are in the SRB. The data object can either be a file or a collection of files within the SRB. The command for adding metadata to a data object is

Rcreate -u url -d datasetID -n name

The url specifies where the file is and has the form

srb:////

In general: is composed of

/home/.//.../.

An example might be

srb://Test/home/nieessrb40.srbdom/test.dat

The datasetID gives the dataset that you want to associate the file with, and name is the name you want to give the data object.
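Registering many files from one SRB collection can be scripted as a loop over file names. The following is a minimal sketch; register_files is a hypothetical helper name, and it assumes that each file's URL is the collection URL followed by a slash and the file name, and that the file name is also a suitable data object name.

```shell
# Sketch: register several files from one collection as data objects.
# register_files is a hypothetical helper name.
# Assumption: each file's URL is "<collection URL>/<file name>".
register_files() {
    collection_url=$1
    dataset_id=$2
    shift 2
    for file_name in "$@"; do
        # The file name doubles as the data object name.
        Rcreate -u "$collection_url/$file_name" -d "$dataset_id" -n "$file_name" || return 1
    done
}
```

Using the example URL above, register_files srb://Test/home/nieessrb40.srbdom 29 test.dat would register test.dat as a data object in dataset 29.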

You then add metadata with the Rannotate command in the same way that you added name/value pair metadata to the dataset:

Rannotate -o dataObjectID -p name=value

where you get the dataObjectID from the dataset using the command Rls -d datasetID. Hopefully by now you are getting more familiar with the various ID labels: studyID, datasetID and now dataObjectID for the study, dataset and data object respectively.

As before, you can use the Rget command to get the metadata from a data object:

Rget -o dataObjectID -p

Searching on the metadata

The power of metadata comes down to what you do with it! The RCommands provide for this with the Rsearch command. There are several ways to use this command:

Rsearch -s studyID -p name=value
Rsearch -d datasetID -p name=value
Rsearch -d datasetID -k keyword
Rsearch -o dataObjectID -k keyword

Once you have created enough metadata you can experiment with the Rsearch command.

Syntax of the RCommands

Rinit ...

Starts an RCommands session.

Usage

Rinit [-v]

Description

Rinit reads the configuration information from ~/.rcommands/rcommand.config, authenticates with the RCommand server, and then obtains a session key, which is stored in ~/.rcommands/rcommand.. This session key is valid for one hour and is specific to the shell instance within which Rinit was executed.

Options

  • -v: Prints version string and exits

Exit status

Rinit returns zero on success or non zero if there is an error.

Files

  • ~/.rcommands/rcommand.config: RCommand configuration information
  • ~/.rcommands/rcommand.: Session key for shell with

Rpasswd ...

Changes the RCommand password

Usage

Rpasswd [-v]

Description

Rpasswd changes the user's RCommand password both on the RCommand server and within the user configuration file.

Options

  • -v: Prints version string and exits

Exit status

Rpasswd returns zero on success or non zero if there is an error.

Files

  • ~/.rcommands/rcommand.config: RCommand configuration information

Rcreate ...

Creates metadata objects

Usage

Rcreate -v
Rcreate -n name -k description -t topicID
Rcreate -s studyID -n name
Rcreate -d datasetID -n name -u url

Description

Rcreate creates a study, dataset or data object.

Options

-v : Prints version string and exits
-s studyID : StudyID to create dataset in
-d datasetID : DatasetID to create data object in
-n name : Name of study, dataset or data object
-k description : Description of study
-t topicID : Initial topic ID for study
-u url : URL of data object

Exit status

Rcreate returns zero on success or non zero if there is an error.

Rannotate ...

Attaches a parameter (name/value pair) to either a dataset or a data object

Usage

Rannotate -v
Rannotate -s studyID -t topicID
Rannotate [-s studyID | -d datasetID] -k description
Rannotate [-d datasetID | -o dataObjectID] -p name=value

Description

Rannotate is used to annotate different entities within the metadata database. Topics can be assigned to studies, and parameters (name/value pairs) can be attached to either a dataset or a data object. The description field of a study or dataset can be updated using the -k flag.

Options

-v : Prints version string and exits
-s studyID : Specifies study to annotate
-d datasetID : Specifies dataset to annotate
-o dataObjectID : Specifies data object to annotate
-t topicID : Specifies topic to add to study
-k description : Description to add to either study or dataset
-p name=value : Name/value pair to add to dataset or data object

Exit status

Rannotate returns zero on success or non zero if there is an error.

Rls ...

Lists different entities within metadata database

Usage

Rls [-v | -c | -t]
Rls -s studyID
Rls -d datasetID

Description

Rls lists entities within the metadata database. With no arguments, it will list all studies where the user is either the originator or an investigator. With -c or -t options, it lists the people or topics, respectively, within the database. The -s option will list the data sets within the specified study, while the -d option will list the data objects within the specified dataset.

Options

-v : Prints version string and exits
-s : Lists datasets within a given study.
-d : Lists data objects within a given dataset.
-o : Shows metadata corresponding to a given data object.
-t : Lists topics within database.
-c : Lists people (colleagues/collaborators) within database.

Exit status

Rls returns zero on success or non zero if there is an error.

Rget ...

Displays metadata associated with a particular entity

Usage

Rget -v
Rget -s studyID [-c|-t]
Rget [-d datasetID | -o dataObjectID] [-p]

Description

Rget shows metadata associated with metadata objects or their parameters.

Options

-v : Prints version string and exits
-s studyID : Selects study to show metadata
-d datasetID : Selects dataset to show metadata
-o dataObjectID : Selects data object to show metadata
-c : If used with -s, will list investigators associated with study
-t : If used with -s, will list topics associated with study
-p : If used with -d or -o, will list parameters associated with either dataset or data object

Exit status

Rget returns zero on success or non zero if there is an error.

Rrm ...

Removes different entities from metadata database

Usage

Rrm [-v]
Rrm -s studyID -t topicID
Rrm [-d datasetID | -o dataObjectID] [-p paramName]

Description

Rrm removes entities or parameters from within the metadata database.

Options

-v : Prints version string and exits
-s : Specifies study to remove topics from.
-d : Specifies dataset to remove, or dataset parameter if used in conjunction with -p option.
-o : Specifies data object to remove, or data object parameter if used in conjunction with -p option.
-t : Specifies topic to remove from topic list.
-p : Used with -d or -o options in order to remove dataset or data object parameters.

Exit status

Rrm returns zero on success or non zero if there is an error.

Rchmod ...

Adds or removes investigators to/from a study

Usage

Rchmod -v
Rchmod -s studyID [+c|-c] personID

Description

Rchmod adds or removes investigators to/from a study.

Options

-v : Prints version string and exits
-s studyID : StudyID to modify investigator list
-c personID : Removes corresponding person from study list
+c personID : Adds corresponding person to study list

Exit status

Rchmod returns zero on success or non zero if there is an error.

Rsearch ...

Searches dataset and data objects for parameters

Usage

Rsearch -v
Rsearch -u url
Rsearch -t topicID
Rsearch [-s studyID | -d datasetID] -p name=value
Rsearch [-d datasetID | -o dataObjectID] -k keyword

Description

Rsearch searches for entities within the metadata database. It can search for studies by topic, for keywords within study and/or dataset metadata, for specified parameters attached to datasets and/or data objects, and for data objects with a specific URL.

Options

-v : Prints version string and exits
-u url : Searches for data object with specified url
-s studyID : Specifies study to search
-d datasetID : Specifies dataset to search
-o dataObjectID : Specifies data object to search
-t topicID : Searches for studies with this topicID
-p name=value : Parameter to search for
-k keyword : Specifies keyword to search for

Exit status

Rsearch returns zero on success or non zero if there is an error.

Rexit ...

Finishes an RCommand session

Usage

Rexit [-v]

Description

Rexit removes the shell session file (~/.rcommands/rcommand.) and contacts the RCommand server to invalidate the session key. If Rexit is not used, the session key will expire one hour after it was created.

Options

-v : Prints version string and exits

Files

  • ~/.rcommands/rcommand.config - RCommand configuration information
  • ~/.rcommands/rcommand. - Session key for shell with

Exit status

Rexit returns zero on success or non zero if there is an error.

Topic revision: r1 - 19 Jan 2009 - 10:55:54 - RobAllan
 