RMCS: REMOTE MY_CONDOR_SUBMIT

References [#!globus!#,#!globus2!#,#!mcs!#,#!mcs-user!#,#!bruin!#,#!calleja!#,#!seagul!#,#!rmcs!#,#!rmcs-arch!#].

Summary

RMCS, RCommands and RGem together comprise a suite of software tools that enables efficient use of spare and distributed computing capacity within an organisation or a federation of organisations (a mini-Grid). Its particular advantage is that a user can easily submit many calculations at once from a desktop machine, which is especially useful for scenario analysis and parameter-sweep applications such as those found in finance, flood prediction, and environmental, chemical or medical research. The software supports the generation of the required input data sets, the submission of the applications to the pool of available resources, the collection of the results, the extraction and cataloguing of key parameters from the results, and their presentation on request in tabular or graphical form. Access to this functionality can also be integrated into existing products such as Matlab, Materials Studio and other graphical user interfaces. If required, the software can provide well-documented audit trails for any cycle of calculations.

In short, RMCS is a Grid computing solution developed to provide a functional and accessible interface to cluster computing and the associated data/metadata management capabilities.

Functionality Overview

RMCS is currently principally aimed at cluster computing, i.e. dedicated resources on a mini-Grid.

The key benefits provided by RMCS are a high degree of automation and a high degree of abstraction and versatility in how that functionality is presented to the user.

Automation

The automation essentially removes most of the "bookkeeping" tasks associated with running calculations, e.g. manually selecting machines, moving files to and from machines, monitoring a job's progress and collating results.

The user provides the framework with a single control file that contains a list of machines and the locations of the application to run and its input files. The framework selects a machine based on certain criteria, copies the executable and input files to the chosen cluster, and then submits the job to that machine's queuing system. RMCS tracks each job's progress and gives a single view of the user's calculations, which will often be running on many different machines. When the calculations finish, the output files are uploaded to a central file store, and key metadata relating to each calculation is extracted and stored in a database. As a result, once a set of calculations completes, all the output files are stored in a single location and a key subset of the results is retrievable from the metadata database for immediate analysis.

In many cases, custom tools were developed to help generate the input files themselves and to set up the control files for the Grid. This made it trivial to run sets of independent calculations relating to a single problem.

Benefits to Automation:

Abstraction and Versatility

The use of grid middleware provides a common interface to different computational resources, i.e. the details of the underlying system and its batch queuing system are hidden from the user. The user can drive the system using a single control file (most of which could be generated if required, e.g. via a "wizard" interface).

Typically a user of grid technology would require significant firewall configuration, which is not always possible within a corporate environment or even at home (e.g. behind a NAT router). RMCS uses a standard software architecture (a three tier system) in order to solve this. In essence, the user talks to a central server that has all the required network configuration, which then proxies the user's requests to the grid infrastructure.

The way that the grid functionality is exposed to the user makes it possible to create a range of interfaces and programming libraries. This means that the functionality could be integrated into existing interfaces and custom interfaces for specific tasks could be readily constructed. Benefits to abstraction:

RMCS provides a client interface to the My_Condor_Submit package developed at University of Cambridge in the e-Minerals Project [#!e-minerals!#,#!calleja!#].

My_Condor_Submit (MCS) [#!bruin!#,#!mcs!#] is a tool developed by the eMinerals project to allow simplified job submission to remote Grid resources, with built-in meta-scheduling and load balancing and with both data and metadata management. The meta-scheduling is implemented within MCS itself, job submission is handled by Condor-G, and metadata capture and storage are handled by the RCommands and the SRB SCommands, respectively. The Globus Toolkit is used to provide security but creates a data I/O problem, which MCS also solves. MCS jobs can be submitted from any client machine that has the Condor-G, Globus and SRB clients installed.

Some work done using MCS and RMCS is described in [#!alfredsson!#,#!AHM2007:NWGRIDMiddleware!#,#!petit!#].

Attributes

Version: 0.1.2
Public calls:
Public modules:
Other software required: SOAP::Lite, gSOAP
Other modules required: GROWL Scripts
Date: 2006-9
Origin: P.R. Tyer, CCLRC Daresbury Laboratory. Re-written by R.J. Allan, Daresbury Laboratory
Remote services required: SRB and MCS servers
Restrictions: Only one SRB connection can be active at once per program compiled with this library.
Language: Perl, C
Conditions on external use: Standard, see separate chapter

How to use the Package

The client package is included with the G-R-Toolkit and is configured to communicate with an available RMCS server.

The existing MCS (my_condor_submit) framework, as developed at the University of Cambridge for the e-Minerals project, is essentially a 2-tier system, with the submit node forming the first tier and all back-end resources (collectively referred to as "The Grid") forming the second tier. The aim of RMCS is to move to the 3-tier model shown in Figure 6.1. The three-tier model removes the need for extensive firewall configuration and for Grid middleware to be installed on the desktop client machines. Also, since the client layer will be very thin and consist mostly of Web service invocation code, it will be much easier to integrate the functionality into existing applications.

Figure 6.1: High level logical architecture showing system as a standard 3 tier application

Workflow and Job States

The steps in the RMCS workflow are as follows.

  1. Pre-scheduling actions
  2. Pre-processing actions
  3. Main job
  4. Post-processing actions

This workflow is implemented using Condor DAGMan, and the RMCS script has a quasi-Condor interface.
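As an illustration only, the four workflow steps can be expressed as a linear DAGMan input file, in which each step is the parent of the next. The sketch below generates such a file using the standard DAGMan JOB and PARENT ... CHILD keywords. The node names and submit-file names are hypothetical; the files that RMCS actually generates are not described here.

```python
# Hypothetical node and submit-file names, chosen for illustration only.
STAGES = [
    ("PRE_SCHEDULE", "pre_schedule.sub"),
    ("PRE_PROCESS", "pre_process.sub"),
    ("MAIN", "main_job.sub"),
    ("POST_PROCESS", "post_process.sub"),
]


def build_dag(stages):
    """Return DAGMan input text that runs the given stages strictly in order."""
    lines = [f"JOB {name} {submit_file}" for name, submit_file in stages]
    # Each stage is declared as the parent of the stage that follows it.
    for (parent, _), (child, _) in zip(stages, stages[1:]):
        lines.append(f"PARENT {parent} CHILD {child}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(build_dag(STAGES), end="")
```

With this layout DAGMan starts the main job only after both preparatory steps have succeeded, and starts post-processing only after the main job has completed.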

Figure 6.2: RMCS Architecture.

Figure 6.3: RMCS Server Architecture.

[To be re-written.] The different states that a job can be in are shown in the table below. Figure 6.4 shows the possible state transitions within the system. The RENEW-PROXY state arises when the system requires a valid proxy to carry out a task, but the current user proxy in the MyProxy repository has expired. It is perhaps better thought of as a blocking mask on an existing state, e.g. the job state is actually RUNNING, but the RENEW-PROXY flag is set. The MyProxy username and password are stored within the persistent storage framework for the lifetime of the job. This allows the R-MCS system to download a fresh delegate if the one downloaded originally has expired.

Figure 6.4: State transitions within R-MCS system.

Job State       Description
SUBMIT-PENDING  User has called the submit job web service, but the system has not yet acted on it.
CANCEL-PENDING  User has called the cancel job web service, but the system has not yet acted on it.
QUEUING         Job is in a batch queue.
RUNNING         Job is running.
FINISHED        Job has finished.
SUBMIT-FAILED   System has acted on a job in the SUBMIT-PENDING state, but submission failed.
CANCEL-FAILED   System has acted on a job in the CANCEL-PENDING state, but cancellation failed.
JOB-FAILED      Previously RUNNING job has failed. The system does not currently distinguish between FINISHED and JOB-FAILED.
RENEW-PROXY     Mask that can apply to the RUNNING, CANCEL-PENDING and QUEUING states.
CLEANME         Internal state indicating that temporary files and database entries associated with a job are to be purged.
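The idea of RENEW-PROXY as a mask rather than a state of its own can be sketched as follows. This is a minimal model under stated assumptions, not the server's implementation: the class and function names are hypothetical, and the active and inactive groupings are taken from the rmcs_cancel and rmcs_remove descriptions.

```python
from dataclasses import dataclass
from enum import Enum


class JobState(Enum):
    SUBMIT_PENDING = "SUBMIT-PENDING"
    CANCEL_PENDING = "CANCEL-PENDING"
    QUEUING = "QUEUING"
    RUNNING = "RUNNING"
    FINISHED = "FINISHED"
    SUBMIT_FAILED = "SUBMIT-FAILED"
    CANCEL_FAILED = "CANCEL-FAILED"
    JOB_FAILED = "JOB-FAILED"
    CLEANME = "CLEANME"


# States that rmcs_cancel treats as active.
ACTIVE_STATES = frozenset({
    JobState.RUNNING, JobState.QUEUING,
    JobState.CANCEL_PENDING, JobState.SUBMIT_PENDING,
})
# States that rmcs_remove treats as inactive.
INACTIVE_STATES = frozenset({
    JobState.FINISHED, JobState.JOB_FAILED,
    JobState.CANCEL_FAILED, JobState.SUBMIT_FAILED,
})
# States to which the RENEW-PROXY mask can apply.
MASKABLE_STATES = frozenset({
    JobState.RUNNING, JobState.CANCEL_PENDING, JobState.QUEUING,
})


@dataclass
class Job:
    job_id: int
    state: JobState
    renew_proxy: bool = False  # the mask; the underlying state is kept

    def flag_proxy_expired(self):
        """Set the RENEW-PROXY mask without changing the underlying state."""
        if self.state not in MASKABLE_STATES:
            raise ValueError(f"RENEW-PROXY cannot mask {self.state.value}")
        self.renew_proxy = True

    def displayed_state(self):
        """Return the state a user would see while the mask is set."""
        return "RENEW-PROXY" if self.renew_proxy else self.state.value
```

Keeping the flag separate from the state means that clearing it, for example after rmcs_update, returns the job to exactly the state it was in before the proxy expired.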

Specification of RMCS

The initial set of command-line tools is summarised in the table below, and each tool is then described in a section of the same name. Note that this is only an initial set; the tools can, and hopefully will, be revised and enhanced in response to user feedback.

Command              Description                                                           Usage
rmcs_cancel          Cancels job                                                           rmcs_cancel -j jobID
rmcs_cleanup         Cleans up sessionId directory on RMCS server                          rmcs_cleanup
rmcs_getoutput       Downloads a file from the sessionId directory on the RMCS server      rmcs_getoutput
rmcs_init            Creates new sessionId and downloads certificate to RMCS server        rmcs_init
rmcs_listdir         Lists files in the sessionId directory on the RMCS server             rmcs_listdir
rmcs_putinput        Uploads a file into the sessionId directory on the RMCS server        rmcs_putinput
rmcs_remove          Removes job details of finished/failed job                            rmcs_remove -j jobID
rmcs_rmfile          Removes a named file from sessionId directory on the RMCS server      rmcs_rmfile
rmcs_session         Prints current RMCS sessionId                                         rmcs_session
rmcs_select_session  Selects an old RMCS sessionId for re-use                              rmcs_select_session -s sessionId
rmcs_status          Gets job details of either specified job or all user's jobs           rmcs_status -j jobID
rmcs_submit          Submits MCS file                                                      rmcs_submit -i MCS Filename -n jobName
rmcs_update          Uses existing sessionId and downloads new certificate to RMCS server  rmcs_update

Session information is managed by a file called .sessionId, which is stored in the working directory. This permits the use of multiple sessions, each with its own associated local files. The sessionId can be printed using the rmcs_session command, and an old session can be reinstated using rmcs_select_session -s oldSessionId.
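A typical cycle (start a session, upload inputs, submit, check status) might be scripted along the following lines. The sketch only builds the argument lists and does not run anything. The flags are those given in the usage column above; passing the file name to rmcs_putinput as a plain positional argument is an assumption, and the helper name and the file names in the example are hypothetical.

```python
import shlex


def session_commands(mcs_file, job_name, input_files):
    """Return argument lists for one submit cycle, in the order they would run."""
    commands = [["rmcs_init"]]
    for path in input_files:
        # Assumption: the file name is given as a plain positional argument.
        commands.append(["rmcs_putinput", path])
    commands.append(["rmcs_submit", "-i", mcs_file, "-n", job_name])
    # Without options, rmcs_status reports on all of the user's jobs.
    commands.append(["rmcs_status"])
    return commands


if __name__ == "__main__":
    for argv in session_commands("job.mcs", "sweep01", ["input.dat"]):
        print(shlex.join(argv))
```

Because the session is tied to the .sessionId file in the working directory, all of these commands would need to be run from the same directory.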

rmcs_cancel

NAME rmcs_cancel - Cancels active RMCS job or jobs
SYNOPSIS rmcs_cancel -v

rmcs_cancel -j jobID | jobID range

DESCRIPTION rmcs_cancel cancels active jobs, i.e. those in states RUNNING, QUEUING, CANCEL-PENDING or SUBMIT-PENDING.
OPTIONS -j Specifies the job(s) to cancel by jobID. A range string may combine hyphenated ranges (-) and comma-separated (,) values.

-v Prints version string and exits

EXIT STATUS rmcs_cancel returns zero on success or non zero if there is an error.
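The exact grammar of a jobID range string is not specified here. Assuming that a hyphen denotes an inclusive range and a comma separates elements, a string such as 3-5,8 could be expanded as in the following sketch. The function name is hypothetical and this is not the parser used by rmcs_cancel.

```python
def parse_job_ranges(spec):
    """Expand a range string such as "3-5,8" into a sorted list of job IDs."""
    job_ids = set()
    for part in spec.split(","):
        part = part.strip()
        if not part:
            raise ValueError(f"empty element in range string: {spec!r}")
        if "-" in part:
            # Assumption: "low-high" is an inclusive range of job IDs.
            low_text, high_text = part.split("-", 1)
            low, high = int(low_text), int(high_text)
            if low > high:
                raise ValueError(f"descending range: {part!r}")
            job_ids.update(range(low, high + 1))
        else:
            job_ids.add(int(part))
    return sorted(job_ids)


if __name__ == "__main__":
    print(parse_job_ranges("3-5,8"))
```

Collecting the IDs in a set means that overlapping elements, such as 3-5,4, name each job only once.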

rmcs_cleanup

NAME rmcs_cleanup
SYNOPSIS Cleans up sessionId directory on RMCS server
DESCRIPTION Cleans up sessionId directory on RMCS server
OPTIONS  
EXIT STATUS  

rmcs_getoutput

NAME rmcs_getoutput
SYNOPSIS downloads a named file from the RMCS server sessionId directory
DESCRIPTION downloads a named file from the RMCS server sessionId directory
OPTIONS file name
EXIT STATUS  

rmcs_init

NAME rmcs_init
SYNOPSIS this script initialises a new GRT RMCS session
DESCRIPTION this script initialises a new GRT RMCS session. It can only be used if the user has a valid Grid proxy certificate in the NGS MyProxy repository. Use growl-login -m to do this.
OPTIONS none
EXIT STATUS  

rmcs_listdir

NAME rmcs_listdir
SYNOPSIS lists all files in the server RMCS sessionId directory
DESCRIPTION lists all files in the server RMCS sessionId directory
OPTIONS  
EXIT STATUS  

rmcs_putinput

NAME rmcs_putinput
SYNOPSIS uploads a named file to the RMCS server sessionId directory
DESCRIPTION uploads a named file to the RMCS server sessionId directory
OPTIONS file name
EXIT STATUS  

rmcs_remove

NAME rmcs_remove - Removes terminated / failed job details from RMCS system
SYNOPSIS rmcs_remove -v

rmcs_remove -j jobID | jobID range

rmcs_remove -all

DESCRIPTION rmcs_remove removes details of inactive jobs, i.e. those in states FINISHED, JOB-FAILED, CANCEL-FAILED or SUBMIT-FAILED, from the RMCS system.
OPTIONS -j Specifies the job(s) to remove by jobID. A range string may combine hyphenated ranges (-) and comma-separated (,) values.

-v Prints version string and exits

-all Will remove all user's inactive jobs

EXIT STATUS rmcs_remove returns zero on success or non zero if there is an error.

rmcs_rmfile

NAME rmcs_rmfile
SYNOPSIS removes a named file from the sessionId directory on the RMCS server
DESCRIPTION  
OPTIONS  
EXIT STATUS  

rmcs_select_session

NAME rmcs_select_session
SYNOPSIS selects an RMCS sessionId for re-use
DESCRIPTION  
OPTIONS  
EXIT STATUS  

rmcs_session

NAME rmcs_session
SYNOPSIS Prints out current RMCS sessionId
DESCRIPTION  
OPTIONS  
EXIT STATUS  

rmcs_status

NAME rmcs_status - Gets details of either user jobs, active machines or active vaults from RMCS server
SYNOPSIS rmcs_status -v

rmcs_status [-j jobID]

rmcs_status -s state

rmcs_status -l

rmcs_status -m

rmcs_status -server

DESCRIPTION To get details of all of the user's jobs, use rmcs_status without options. With the -s option, all jobs in a particular state can be found. The -j option gives detailed information on a single job.

To list active machines within the RMCS system, use the -l option. Similarly, active SRB vaults can be found using the -m option.

OPTIONS -j jobID Gets detailed information on a single job

-l Lists active machines known to RMCS server

-m Lists active SRB vaults known to RMCS server

-s state Lists user's jobs in a particular state

-server Shows current number of jobs in SUBMIT-PENDING state and active jobs on server

-v Prints version string and exits

EXIT STATUS rmcs_status returns zero on success or non zero if there is an error.

rmcs_submit

NAME rmcs_submit - Submits a job to the RMCS system
SYNOPSIS rmcs_submit -v

rmcs_submit -i MCS file [-n job name] [-e] [-S]

DESCRIPTION rmcs_submit submits a job to the RMCS system. The user must specify a valid MCS file using the -i option, and optionally an arbitrary job name using the -n option. The command will prompt the user for their MyProxy password (unless -S is used).
OPTIONS -i MCS filename Specifies MCS file to use for this job

-n job name Label this job with an arbitrary name

-e RMCS server will email on job completion

-S MyProxy password will be read directly from stdin rather than prompting interactively

-v Prints version string and exits

EXIT STATUS rmcs_submit returns zero on success or non zero if there is an error.
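The -S option makes it possible to drive rmcs_submit from a script. The following sketch passes the MyProxy password on standard input using Python's subprocess module. The helper names are hypothetical, and it is an assumption that the password should be terminated by a newline.

```python
import subprocess


def build_submit_args(mcs_file, job_name=None, email=False):
    """Build the rmcs_submit argument list for a non-interactive submission."""
    args = ["rmcs_submit", "-i", mcs_file, "-S"]
    if job_name:
        args += ["-n", job_name]
    if email:
        args.append("-e")
    return args


def submit(mcs_file, password, job_name=None, email=False):
    """Run rmcs_submit, feeding the password on stdin, and return its exit status."""
    result = subprocess.run(
        build_submit_args(mcs_file, job_name, email),
        input=password + "\n",  # assumption: newline-terminated password
        text=True,
    )
    # Zero indicates success; any other value indicates an error.
    return result.returncode
```

Supplying the password on stdin keeps it out of the process argument list, where other users of the machine could otherwise see it.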

rmcs_update

NAME rmcs_update - Updates MyProxy details associated with user's current jobs and clears the proxy expired flag
SYNOPSIS rmcs_update [-v]
DESCRIPTION rmcs_update prompts the user for a new MyProxy password, which is then associated with the user's current jobs.
OPTIONS -v Prints version string and exits
EXIT STATUS rmcs_update returns zero on success or non zero if there is an error.
FILES /.rmcs/rmcs.config - RMCS configuration information

Error Codes

my_condor_submit has many error codes, each relating to a different exit point in the script. Some result from invalid input, while others relate to network problems or to a failure to find a suitable machine on which to execute the submitted job. The table below lists each error code and message and explains the problem. Note that some messages are truncated in the table compared with the string shown to the user, because additional information can be output at runtime; for example, the line of the input file that caused an error may be printed but is not included in the table.

Error code Message Meaning
     
0 N/a Job submitted successfully (Not really an error)
1 Usage message Used to output the my_condor_submit command usage (Not really an error)
2 You have specified an invalid debug statement when calling my_condor_submit. Argument two to my_condor_submit must be 'debug' Used if you have called my_condor_submit in a way other than the correct usage.
3 Can't open the metascheduling database on any available server. my_condor_submit failed to connect to any of the databases it knows about and so couldn't metaschedule.
4 Can't open the specified input file my_condor_submit was unable to open the specified input file. Does it definitely exist?
5 You have entered an empty string as the pathToExe line Your input file contains a pathToExe statement with nothing after the = character. Enter a path and try again
6 You can only specify one 'pathToExe' line per submission More than one pathToExe statement was detected within your input file. Only one is allowed so please remove any extra entries. NOTE: This error message is deprecated and may be removed from later versions of my_condor_submit
7 You can only specify one 'numOfProcs' line per submission More than one numOfProcs statement was detected within your input file. Only one is allowed so please remove any extra entries
8 You can only specify one 'preferredMachineList' line per submission More than one preferredMachineList statement was detected within your input file. Only one is allowed so please remove any extra entries
9 Unable to open specified input file my_condor_submit was unable to open the specified input file. Does it definitely exist?
10 You have specified an empty directory name with an 'Sdir' directive. Please specify a valid directory name and try again An Sdir directive was detected with nothing after the = character. Enter a path and try again.
11 You must specify an 'Sdir' line before the 'Sput' line Your input file contains out of order statements. An Sput line must occur after an Sdir line.
12 You have specified an empty list of files to upload to the SRB with the following line. Please specify either a list of files or '*' and try again An Sput statement was detected with nothing after the = character. Enter a list of files and try again.
13 You must specify an 'Sdir' line before the 'Sget' line Your input file contains out of order statements. An Sget line must occur after an Sdir line.
14 You have specified an empty list of files to retrieve from the SRB with the following line. Please specify either a list of files or '*' and try again. An Sget statement was detected with nothing after the = character. Enter a list of files and try again.
15 You entered an Sforce line with a value other than 'true' or 'false' Sforce lines may only have true or false as their arguments.
16 You entered an empty 'RHome' line this doesn't make sense. An empty RHome line was detected in your input file. Please enter a path after the = character or remove the line and try again.
17 You can only have one RStudyID line per job submission Multiple RStudyID lines were detected within your input file. Only one is allowed so please remove any extra entries
18 You can only have one RDatasetName line per job submission Multiple RDatasetName lines were detected within your input file. Only one is allowed so please remove any extra entries
19 You must specify a valid alphanumeric value for RDatasetName on the line my_condor_submit detected an RDatasetName line with a value which is not made up of alphanumeric characters. Any non alphanumeric characters are not allowed, so please remove them.
20 You can only have one RDatasetID line per submission Multiple RDatasetID lines were detected within your input file. Only one is allowed so please remove any extra entries
21 You have specified an empty string to be used as the description of a data object created by the Rcommands with the line: An empty RDesc line was detected in your input file. Please enter a string after the = character or remove the line and try again.
22 You have specified an RDesc line without a preceeding Sdir line. Please rearrange your input file and try again Your input file contains out of order statements. An RDesc line must occur after an Sdir line. NOTE: This error message is deprecated and may be removed from later versions of my_condor_submit
23 You have specified a GetEnvMetadata line without a preceeding Sdir line. Please rearrange your input file and try again Your input file contains out of order statements. A GetEnvMetadata line must occur after an Sdir line
24 You entered an unexpected value for GetEnvMetadata, possible values are 'true' or 'false' GetEnvMetadata lines may only have true or false as their arguments.
25 You have specified a AgentXDefault line without a preceeding Sdir line. Please rearrange your input file and try again Your input file contains out of order statements. An AgentXDefault line must occur after an Sdir line
26 You have specified a AgentX line without a preceeding Sdir line. Please rearrange your input file and try again Your input file contains out of order statements. An AgentX line must occur after an Sdir line
27 Malformed AgentX line detected Your input contains an AgentX line which has an invalid format. Please adjust the line and try again
28 You entered an empty output filename with the line Your input contains an Output statement with nothing after the = character. Enter a filename and try again
29 For my_condor_submit to function properly you must set Transfer_Output to 'false' Your input contains a Transfer_Output line with a value other than false, which is invalid. The only permitted value is false.
30 You have specified an unexpected value for Transfer_Error - you must enter either 'true' or 'false' Your input file contains a Transfer_Error line with an invalid value. You must specify Transfer_Error to be either true or false
31 You are attempting to have my_condor_submit meta-schedule for you, so Transfer_Executable must be set to 'false' If meta-scheduling then your executable must come from the SRB and so Transfer_Executable must be set to false.
32 You have specified an unexpected value for Transfer_Executable - you must enter either 'true' or 'false' Your input file contains a Transfer_Executable line with an invalid value. You must specify Transfer_Executable to be either true or false.
33 You specified a universe other than globus - this is incorrect. Please change to globus and try again my_condor_submit is only able to submit to globus enabled execution machines and as such specifying an universe line with a value other than globus does not make sense. Please change the line and try again.
34 You have specified a globusscheduler line but are also attempting to meta schedule. This doesn't make sense as meta-scheduling will decide where to run for you. Please remove the globusscheduler line and try again It doesn't make sense to ask my_condor_submit to meta-schedule for you and to specify a machine to submit to using the globusscheduler line. Therefore either remove the globusscheduler line or the meta-scheduling lines.
35 The jobmanager specified is not one that is supported by my_condor_submit at this time You have specified that you want to submit to a jobmanager that is not supported by my_condor_submit with a globusScheduler line. Are you sure you want to submit to this jobmanager?
36 You must either specify both a study ID *AND* a dataset name *OR* a dataset ID when using the RCommands Metadata can either be created in an existing dataset in which case you must specify the ID of an existing dataset to be used with an RDatasetID line; or it can be created in a newly created dataset in which case you must specify both the ID of an existing study with a RStudyID line and a name for the created dataset with a RDatasetName line.
37 You must specify a name to be used when creating a dataset within the specified study Your input file must contain both a RStudyID and RDatasetName statement or just a RDatasetID statement if you are going to collect metadata as part of your submitted job. Currently your input file does not satisfy this requirement.
38 You have specified that metadata should be collected as part of the run, however you have not correctly specified a dataset to be used to store the created metadata Your input file specifies that metadata should be collected but you have not included either an RDatasetID statement or the combination of both RStudyID and RDatasetName statements required to tell my_condor_submit where to store the metadata collected.
39 You must specify either a specific machine to run on using a globusscheduler line, or you must specify at least a directory to get your executable from using a pathToExe line if you want my_condor_submit to meta-schedule for you. Please specify one of these lines and try again Your input file does not specify where you wish to submit the job to (which can be set using the globusscheduler statement) or that you want my_condor_submit to choose somewhere to submit for you (which requires at least a pathToExe statement to be specified). Enter the type of statement needed to perform the type of submission you require and try again.
40 You have specified that metadata be captured for the following SRB directory but you have not specified the required name for the data object that will be created. Each specified Sdir entry that has associated metadata must also have its own RDesc line to create the data object into which the metadata will be stored
41 You must specify jobType to equal either performance or throughput if you want my_condor_submit to meta-schedule for you. Your input file contains a JobType element which has an invalid value. You must specify performance to submit to a cluster or throughput to submit to a condor pool.
42 You must specify numOfProcs to be a number not equal to 0 if you want my_condor_submit to meta-schedule for you. Your input file contains a numOfProcs statement whose value is not a number greater than 0 as required by my_condor_submit.
43 There was an error retreiving the list of machines to submit to from the database There was a problem connecting to the central my_condor_submit databases. Please try again shortly.
44 There are currently no machines to submit to. Please try again later None of the machines within your specified preferredMachineList can currently be submitted to by my_condor_submit. This could be due to planned maintenance on the machines in question. Use my_condor_submit -l to see all machines that are currently available to be submitted to.
45 my_condor_submit was unable to decide on a machine to submit to - its possible that all machines are refusing connections from you. Check that you can submit via globus directly and then try again All of the machines within your specified preferredMachineList statement that my_condor_submit is currently allowed to submit to appear to be refusing connections via globus. Check that you can submit a fork job to each of the machines manually before attempting to submit this job again.
46 There was an error retrieving the architecture and relevant queue command of a machine from the database. Please try again later. There was a problem querying the central databases for the execution machine's architecture and queue status command. Could be caused by network congestion / a busy period with the database, try again after a few minutes wait.
47 There was a problem retrieving the name of the machine ranked 1 from the database my_condor_submit encountered an error querying the central databases, wait a few minutes and try again
48 There was a problem working out how many machines there are my_condor_submit encountered an error when querying the central database. Please wait a few minutes and try again
49 There was a problem getting the previous rank of the machine just submitted to. my_condor_submit encountered an error when querying the central database as part of the machine reordering after successfully choosing a machine to submit to. The job will not have been submitted however so wait for a few minutes and try again
50 There was a problem moving the machine just submitted to, to the bottom of the rankings my_condor_submit encountered an error when updating the central database as part of the machine reordering after successfully choosing a machine to submit to. The job will not have been submitted however so wait for a few minutes and try again
51 There was an error adjusting the machine ranking within the database my_condor_submit encountered an error when updating the central database as part of the machine reordering after successfully choosing a machine to submit to. The job will not have been submitted however so wait for a few minutes and try again
52 Unable to retrieve your environment from the chosen execute machine. Ensure that you are able to run jobs there before submitting again to There was an error running /usr/bin/env on the remote machine to retrieve necessary information.
53 You cannot specify an agent-x line containing Invalid AgentX line detected within the input file. The invalid section will be output along with the message and must be fixed before resubmitting.
54 The specified jobmanager is not supported by my_condor_submit at this time when creating the main job file in create_main_condor_job(). Specified jobmanager my_condor_submit got to the stage of creating the main section of the job to be submitted and has found that the specified jobmanager cannot be submitted to currently. If you specified the jobmanager using a globusscheduler statement then please change the jobmanager. However if metascheduling then please report the error
55 You must have a valid proxy valid for at least 2 hours to submit using my_condor_submit please generate a new proxy using grid-proxy-init and submit your job again Condor will only submit globus jobs if the user has a proxy valid for at least 2 hours. my_condor_submit has found that your proxy is valid for less time than this, you should extend your proxy and then resubmit.
56 There was an error submitting the dag to condor An error occurred when submitting the job to condor. A more specific error message as returned by condor should be output at the same time.
57 There was a problem retrieving the list of available SRB vaults from the database As part of the SRB vault load balancing my_condor_submit encountered an error when retrieving the list of available vaults from the central database. Please wait a few minutes and try again
58 You have specified a MetadataString line without a preceeding Sdir line. Please rearrange your input file and try again Your input file contains out of order statements. All MetadataString elements should be preceded by an Sdir statement. Rearrange your input file and submit again
59 Malformed MetadataString line detected There was a problem parsing a MetadataString statement. Reformat the string to comply with the format specified in the user manual and submit again
60 Invalid MetadataString line detected. Unequal numbers of parameter names and values parsed from your input lines There was a problem parsing all of the MetadataString statements within the input file - more metadata names were extracted than values. Check your input file and submit again.
61 Unable to parse your agentX line when trying to consider your specified refinement An invalid AgentX statement was detected within your input file. Check that the line complies with the syntax specified within the my_condor_submit user manual and then submit again
62 You specified an Error line with an empty filename This relates to an error within my_condor_submit. Please contact my_condor_submit@eminerals.org with the exact error message output.
63 Unrecognised jobmanager specified by the metascheduling database Querying the central database resulted in my_condor_submit wanting to submit to a jobmanager that it does not know how to submit to. Please try again and if the problem persists contact my_condor_submit@eminerals.org
64 Unable to find a valid grid-proxy. Please create one using grid-proxy-init and try again my_condor_submit checked for a valid user proxy and could not find one. Ensure that you have a valid proxy (by running grid-proxy-init) and try again
65 Empty filename specified within an Sget line. Please check the Sget lines specified and try again my_condor_submit encountered an empty list of files to get when creating the pre job. Check your input file and try again
66 Differing numbers of possible database hosts and ports detected - check internal my_condor_submit variables and try again my_condor_submit detected an error with its database connection code. Please contact my_condor_submit@eminerals.org with details of the error
67 No databases specified to try to connect to - check internal my_condor_submit variables and try again. my_condor_submit detected an error with its database connection code. Please contact my_condor_submit@eminerals.org with details of the error
68 The specified input file doesn't exist. File my_condor_submit was unable to find the input file specified when it was called. Please check that the file exists and try again
69 debug method called with no argument An internal debugging message error was detected. Please try again with my_condor_submit debugging turned off
70 addTracebackData called with an empty or no arguments An internal debugging message error was detected. Please contact my_condor_submit@eminerals.org with this error and your input file.
71 removeTracebackData called when no traceback items to remove An internal error within my_condor_submit has been found. Please contact my_condor_submit@eminerals.org with the error message and your input file.
72 You entered an empty input filename with the line: You specified an input statement within your input file that has nothing after the = character. Please check your input file and try again.
73 Problem retrieving the list of cluster machines that can be submitted to. An error was encountered contacting the central database. Please wait a few minutes and try again
74 Problem retrieving the list of condor pools that can be submitted to currently An error was encountered contacting the central database. Please wait a few minutes and try again
75 You specified more than one pathToExe line. Only one may be specified per submission. Second line specified You may only have up to one pathToExe line within your input file but more than one was detected. Please remove any extra occurrences and try again.
76 You have specified more than one SRBHome line. Only one may be specified per submission You may only have up to one SRBHome line within your input file but more than one was detected. Please remove any extra occurrences and try again.
77 You have specified more than one Sput line for one of your specified Sdir lines. Each Sdir may have a maximum of one Sput line Multiple Sput lines were detected within one Sdir line in your input file. Only one Sput can be specified per Sdir. Please remove the extra Sput statements and try again.
78 You have specified more than one Sget line for one of your specified Sdir lines. Each Sdir may have a maximum of one Sget line Multiple Sget lines were detected within one Sdir line in your input file. Only one Sget can be specified per Sdir. Please remove the extra Sget statements and try again.
79 You have specified more than one Rdesc line for one of your specified Sdir lines. Each Sdir may have a maximum of one RDesc line Multiple RDesc lines were detected within one Sdir line in your input file. Only one RDesc can be specified per Sdir. Please remove the extra RDesc statements and try again.
80 You have specified more than one GetEnvMetadata line for one of your specified Sdir lines. Each Sdir may have a maximum of one GetEnvMetadata line Multiple GetEnvMetadata lines were detected within one Sdir line in your input file. Only one GetEnvMetadata can be specified per Sdir. Please remove the extra GetEnvMetadata statements and try again.
81 You have specified more than one AgentXDefault line for one of your specified Sdir lines. Each Sdir may have a maximum of one AgentXDefault line Multiple AgentXDefault lines were detected within one Sdir line in your input file. Only one AgentXDefault can be specified per Sdir. Please remove the extra AgentXDefault statements and try again.
82 You may only specify one JobType line per submission Multiple JobType lines were detected within your input file. Only one is allowed so please remove any extra entries
83 You may only specify one Sforce line per submission Multiple Sforce lines were detected within your input file. Only one is allowed so please remove any extra entries
84 You must not specify a dataset name and a dataset id within the same submission Your input file contains conflicting statements regarding the dataset to be used within the metadata database. Please refer to the metadata section within the my_condor_submit user guide.
85 You must specify an RStudyID line when using a RDatasetName line A specified RDatasetName statement must be accompanied by a RStudyID statement within your input file.
86 You may only specify a numeric RStudyID with the RStudyID line The RStudyID statement specified within your input file includes a non-numeric study ID. This ID must be a number relating to an existing study within the metadata database.
87 You may only specify a numeric RDatasetID with the RDatasetID line The RDatasetID statement specified within your input file includes a non-numeric dataset ID. This ID must be a number relating to an existing dataset within the metadata database.
88 You may only specify one executable per submission Your input file contains multiple executable statements. Only one is allowed per submission so please remove the extra entries.
89 You must specify at least one collection in the SRB to transfer files to / from using an Sdir line For my_condor_submit to function correctly you must specify at least one SRB collection to transfer to / from. As such you must have at least one Sdir entry within your input file.
90 You may only specify upto one RHome line per submission Your input file contains multiple RHome statements. Only one is allowed per submission so please remove the extra entries.
91 You have specified an RDesc line without a preceeding Sdir line - this is not valid Your input file contains out of order statements. An RDesc line must occur after an Sdir line.
92 You may only specify upto one AgentXHome line per submission Your input file contains multiple AgentXHome statements. Only one is allowed per submission so please remove the extra entries.
93 You may only specify upto one AgentXLibs line per submission Your input file contains multiple AgentXLibs statements. Only one is allowed per submission so please remove the extra entries.
94 You may only specify upto one Output line per submission Your input file contains multiple Output statements. Only one is allowed per submission so please remove the extra entries.
95 You may only specify upto one Input line per submission Your input file contains multiple Input statements. Only one is allowed per submission so please remove the extra entries.
96 You may only specify upto one Transfer_Error line per submission Your input file contains multiple Transfer_Error statements. Only one is allowed per submission so please remove the extra entries.
97 You may only specify upto one Transfer_Executable line per submission Your input file contains multiple Transfer_Executable statements. Only one is allowed per submission so please remove the extra entries
98 You must specify notification to be one of 'always', 'complete', 'error' or 'never' Notification lines may only have always, complete, error or never as their arguments. my_condor_submit detected a line with a different value. Check your input file and try again.
99 You may only specify upto one Notification line per submission Your input file contains multiple Notification statements. Only one is allowed per submission so please remove the extra entries.
100 You may only specify upto one Error line per submission Your input file contains multiple Error statements. Only one is allowed per submission so please remove the extra entries.
101 You may only specify upto one globusRSL line per submission Your input file contains multiple globusRSL statements. Only one is allowed per submission so please remove the extra entries.
102 You may only specify upto one globusScheduler line per submission Your input file contains multiple globusScheduler statements. Only one is allowed per submission so please remove the extra entries.
103 You may only specify upto one transfer_input_files line per submission Your input file contains multiple transfer_input_files statements. Only one is allowed per submission so please remove the extra entries.
104 You have specified a malformed transfer_input_files line. Please check and try again Your input contains a Transfer_input_files line which has an invalid format. Please adjust the line and try again
105 You may only specify upto one x509_user_proxy line per submission Your input file contains multiple x509_user_proxy statements. Only one is allowed per submission so please remove the extra entries.
106 You may only specify upto one Sdirect line per job submission Your input file contains multiple Sdirect statements. Only one is allowed per submission so please remove the extra entries.
107 You must specify Sdirect to be either 'true' or 'false' Sdirect lines may only have true or false as their arguments. my_condor_submit detected a line with a different value. Check your input file and try again.
108 SRecurse lines must always be preceeded at some point by an Sdir line Your input file contains out of order statements. An SRecurse line must occur after an Sdir line.
109 Only one SRecurse line may be specified per Sdir line Your input file contains multiple SRecurse statements within one Sdir statement. Only one is allowed per Sdir so please remove the extra entries.
110 You must specify SRecurse to be either 'true' or 'false' SRecurse lines may only have true or false as their arguments. my_condor_submit detected a line with a different value. Check your input file and try again.
111 You have entered an empty string for the value of an SRBHome line. This is not allowed my_condor_submit has detected an empty SRBHome line within your input file which is not allowed. You must specify a path or remove the line from your input file before trying again.
112 You have entered an empty string for the value of an AgentXDefault line. This is not allowed my_condor_submit has detected an empty AgentXDefault line within your input file which is not allowed. You must specify a path or remove the line from your input file before trying again.
113 You have entered an empty string for the value of an AgentXHome line. This is not allowed my_condor_submit has detected an empty AgentXHome line within your input file which is not allowed. You must specify a path or remove the line from your input file before trying again.
114 You have entered an empty string for the value of an AgentXLibs line. This is not allowed my_condor_submit has detected an empty AgentXLibs line within your input file which is not allowed. You must specify a path or remove the line from your input file before trying again.
115 You have entered an empty string for the value of an Executable line. This is not allowed my_condor_submit has detected an empty Executable line within your input file which is not allowed. You must specify a filename before trying again.
116 You have entered an empty string for the value of a GlobusRSL line. This is not allowed my_condor_submit has detected an empty GlobusRSL line within your input file which is not allowed. You must specify some details or remove the line before trying again.
117 You have entered an empty string for the value of a GlobusScheduler line. This is not allowed my_condor_submit has detected an empty GlobusScheduler line within your input file which is not allowed. You must specify a machine and jobmanager to use or remove the line and include relevant meta-scheduling lines before trying again.
118 There was an error retrieving the path to your homespace on the chosen execute machine. Please try submitting again my_condor_submit was unable to determine the path to your homespace on the remote machine. This means that the SRB transfers cannot take place. Try submitting the job again to retrieve the required path.
119 You have entered an empty string for the value of a x509_user_proxy line. This is not allowed my_condor_submit has detected an empty x509_user_proxy line within your input file which is not allowed. You must specify some details or remove the line before trying again.
120 An AgentX line was specified with a type of path that cannot be recognised Your input file contains an AgentX line with a path that my_condor_submit did not recognise. Change your path to conform to the line specification in the user manual and try again.
121 You have entered more than one environment lines. This is not allowed, please change your input file and try again my_condor_submit input files can contain a maximum of one environment line. Remove any additional environment lines and try again.
122 You have entered an empty string for the value of an environment line. This is not allowed. If an environment line is specified then it must have a value. The value you specified was an empty string; remove the line or enter a value and try again.
123 You have entered a globusRSL line which specifies a number of processors to be used with a count section. However this number is different to the specified numOfProcs line. Please correct this and try again. You have specified contradictory values for the number of processors your job requires in the globusRSL and numOfProcs lines. Check the two lines specify the same value and try again.
124 You have specified an invalid value for job_type within the globusRSL line. For jobs requiring one processor, job_type should be set to single. For jobs wanting more than one processor, job_type should be set to mpi. Please adjust your globusRSL line to remove the (job_type=x) section and try again. (my_condor_submit will work out the correct job_type if your globusRSL line does not specify it) Remove the value specified for job_type within your globusRSL line. my_condor_submit will work out the correct value for you.
125 You have entered more than one postEnvironment lines. This is not allowed, please change your input file and try again Your input file contains more than one postEnvironment line. Remove any additional lines and try again.
126 You have entered an empty string for the value of a postEnvironment line. This is not allowed. Your input file contains an empty postEnvironment line; remove the line or enter a value for the line and try again.
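
Several of the errors above concern the structure of the input file: statements that may appear only once, statements that must follow an Sdir line, and statements that accept only true or false. The following Python sketch shows how such checks could be run locally before submission. It is not the my_condor_submit implementation. The error numbers are taken from the table above, but the function name check_input and the parsing rules (key = value lines, # comments, case-insensitive keywords) are assumptions made for illustration.

```python
# Illustrative pre-check only; NOT the my_condor_submit implementation.
# Error numbers come from the table above. The parsing rules (key = value
# lines, '#' comments, case-insensitive keywords) are assumptions.

ONCE_PER_SUBMISSION = {"jobtype": 82, "sforce": 83, "executable": 88}
ONCE_PER_SDIR = {"sput": 77, "sget": 78, "rdesc": 79}
NEEDS_SDIR = {"metadatastring": 58, "rdesc": 91, "srecurse": 108}
BOOLEAN = {"sdirect": 107, "srecurse": 110}


def check_input(text):
    """Return a sorted list of error numbers found in an input file."""
    errors = set()
    seen_global = set()
    seen_in_sdir = None  # stays None until the first Sdir line is read
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = (part.strip() for part in line.split("=", 1))
        key = key.lower()
        if key == "sdir":
            seen_in_sdir = set()  # a new Sdir group starts here
            continue
        if key in NEEDS_SDIR and seen_in_sdir is None:
            errors.add(NEEDS_SDIR[key])
        if key in ONCE_PER_SUBMISSION:
            if key in seen_global:
                errors.add(ONCE_PER_SUBMISSION[key])
            seen_global.add(key)
        if key in ONCE_PER_SDIR and seen_in_sdir is not None:
            if key in seen_in_sdir:
                errors.add(ONCE_PER_SDIR[key])
            seen_in_sdir.add(key)
        if key in BOOLEAN and value.lower() not in ("true", "false"):
            errors.add(BOOLEAN[key])
    if seen_in_sdir is None:
        errors.add(89)  # no Sdir line anywhere in the file
    return sorted(errors)
```

Only a subset of the rules is covered. Checks that need the central database or a grid proxy (for example errors 63, 64, 73 and 74) cannot be reproduced locally in this way.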

Examples

These examples are from the e-Minerals Web site http://www.eminerals.org/tools/mcs.html.

Example 1

The first three lines give information about the executable (its name and location within the SRB) and the standard GlobusRSL command. The three lines with names beginning with S provide the interaction with the SRB. The Sdir line passes the name of the SRB collection containing the files, and the Sget * and Sput * lines instruct MCS to download and upload all files respectively. The lines beginning with R concern the interaction with the metadata database through the RCommands. The RDatasetId parameter passes the identification number of the metadata dataset in which data objects are to be stored. The Rdesc command creates a data object with the specified name; its associated URL within the SRB is created automatically by MCS.

There is no jobType specified, so this would default to ``throughput'', appropriate for running on a Condor resource without meta-scheduling. [Actually this is not true: jobType needs to be specified.]

For more information see Chapter 9.

[frame=single]
# Specify the name of the executable to run
Executable = gulp

# Specify where the executable should get stdin from and put stdout to
GlobusRSL = (stdin=andalusite.dat)(stdout=andalusite.out)

# Specify an SRB collection to get the relevant executable from
pathToExe = /home/codes.eminerals/gulp/

# Specify a metadata dataset to create all metadata within
RDatasetId = 55

# Specify a directory to get files from, put
# files to and relate to metadata created below
Sdir = /home/user01.eminerals/gulpminerals/
Sget = *
Sput = *

# Creates and names a metadata data object
Rdesc = "Gulp output from andalusite at ambient conditions"

# Specify metadata to get from files with Agent-x - get environment
# and default metadata only
AgentXDefault = andalusite.xml
GetEnvMetadata = True
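
For a parameter sweep, one input file of this form is needed per calculation. The Python sketch below generates such files from a template that uses only the directives shown in Example 1. The helper names render_input and render_sweep, and the structure names passed to them, are hypothetical. The sketch assumes each run differs only in its stdin, stdout and XML file names and in its description.

```python
# Hypothetical helpers: build input-file text using only the directives
# shown in Example 1. Function names and parameters are illustrative.

TEMPLATE = """\
Executable = gulp
GlobusRSL = (stdin={name}.dat)(stdout={name}.out)
pathToExe = /home/codes.eminerals/gulp/
RDatasetId = {dataset_id}
Sdir = {sdir}
Sget = *
Sput = *
Rdesc = "Gulp output from {name} at ambient conditions"
AgentXDefault = {name}.xml
GetEnvMetadata = True
"""


def render_input(name, dataset_id, sdir):
    """Return the text of one input file for the given structure name."""
    return TEMPLATE.format(name=name, dataset_id=dataset_id, sdir=sdir)


def render_sweep(names, dataset_id, sdir):
    """Map each structure name to the text of its own input file."""
    return {name: render_input(name, dataset_id, sdir) for name in names}
```

Each value in the returned dictionary can be written to its own file and submitted separately, for example with render_sweep(["andalusite", "kyanite"], 55, "/home/user01.eminerals/gulpminerals/").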

Example 2

This is a more complex example than Example 1, and includes the components of Example 1. Near the top, the script contains parameters for the metascheduling task, including a list of specified resources to be used (preferredMachineList) and the type of job (jobType). The script involves creation of a metadata dataset. It also contains commands to use AgentX to obtain metadata from the XML file. In this case, the study concerns an investigation of how the energy of a molecule held over a mineral surface varies with its z coordinate and the repeat distance in the z direction (latticeVectorC).

This is a so-called ``performance'' job so uses MCS meta-scheduling.

For more information see Chapter 9.

[frame=single]
# Specify the executable to run
Executable = siesta
# Instruct condor to not tell us the outcome from the job by email
Notification = NEVER

# Specify which file to use for stdin and stdout
GlobusRSL = (stdin=chlorobenzene.dat)(stdout=chlorobenzene.out)

# Force overwriting when uploading / downloading files
SForce = true

# Specify an SRB collection to get the relevant executable from
pathToExe = /home/codes.eminerals/siesta/
# Specify a list of machines that we are happy to submit to
preferredMachineList = lake.bath.ac.uk lake.esc.cam.ac.uk lake.geol.ucl.ac.uk pond.esc.cam.ac.uk
# Specify the type of machine to be submitted to
# (throughput for a condor pool and performance for a cluster)
jobType = performance
# Specify how many processors to use on the remote machine
numOfProcs = 1

# Specify a metadata study to create a dataset within
RStudyId = 1010
# Create and name a metadata dataset to contain data objects
RDatasetName = "chlorobenzene on clay surface"

# Specify an SRB collection to do some transfers to / from
Sdir = /home/user01.eminerals/clay_surface/
# Specify that we want to get every file from within this collection
Sget = *

# Specify another SRB collection to do some transfers to / from
Sdir = /home/user01.eminerals/chlorobenzene
# Specify that we want to put all local files into the specified collection
Sput = *

# Creates and names a metadata data object
Rdesc = "chlorobenzene molecule on clay surface: first test"
# Specify metadata to get with Agent-x (Tied to the previous Sdir line)
# Get environment metadata
GetEnvMetadata = true
# Get default metadata from the specified file
AgentXDefault = pcbprimfixed.xml
# Get z coordinate information and store as zCoordinate in the metadata
# database
AgentX = zCoordinate, pcbprimfixed.xml:/molecule[1]/atom[last]/zCoordinate
# Get lattice vector information and store in the metadata database
AgentX = latticeVectorA, pcbprimfixed.xml:/Module/LatticeVector[1]
AgentX = latticeVectorB, pcbprimfixed.xml:/Module/LatticeVector[2]
AgentX = latticeVectorC, pcbprimfixed.xml:/Module/LatticeVector[3]
# Get the final energy from the file and store in the metadata database
AgentX = finalEnergy, pcbprimfixed.xml:/Module[last]/PropertyList[title='Final Energy']/Property[dictRef='siesta:Etot']
# Store an arbitrary string of metadata
MetadataString = arbString1, "First test of molecule height and z separation"

# Leave the code's stderr on the remote machine, to be uploaded to the SRB
# at job end
Transfer_Error = false

# End the file (taken from the condor input file)
queue
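
In Example 2 the transfer and metadata lines are tied to the Sdir line that precedes them, and each AgentX line pairs a metadata name with a file and a path inside that file. The Python sketch below makes that grouping explicit. It is an illustration only: the function names, the set of Sdir-scoped keywords (inferred from the error table and the comments in the example) and the assumed AgentX layout of name, file:path are not taken from the MCS source.

```python
# Illustrative only. The Sdir-scoped keyword set is inferred from the
# error table and the comments in Example 2, and is an assumption.
SDIR_SCOPED = {
    "sget", "sput", "rdesc", "getenvmetadata", "agentxdefault",
    "agentx", "metadatastring", "srecurse",
}


def parse_agentx(value):
    """Split 'name, file.xml:/path' into (name, file, path)."""
    name, _, target = value.partition(",")
    # Split on the first colon only; later colons belong to the path.
    filename, _, path = target.strip().partition(":")
    return name.strip(), filename, path


def group_by_sdir(text):
    """Return a list of (sdir, [(key, value), ...]) in file order."""
    groups = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = (part.strip() for part in line.split("=", 1))
        if key.lower() == "sdir":
            groups.append((value, []))
        elif groups and key.lower() in SDIR_SCOPED:
            groups[-1][1].append((key, value))
    return groups
```

Applied to Example 2, group_by_sdir would return two groups: the clay_surface collection with its Sget line, and the chlorobenzene collection with the Sput, Rdesc, AgentX and MetadataString lines that follow it.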

Rob Allan 2009-11-10