This chapter describes the input file format for RMCS, which is based on the format used by Condor, DAGMan and Condor-G. The information is taken from the MCS developers' release notes for v1.4.0 and v1.4.1, last updated in September 2007.
The format of the my_condor_submit input file is heavily based on the condor input file format, and many of the input lines come directly from condor_submit input files. The only difference between a condor_submit input file and a my_condor_submit input file is that the latter can take a few extra input lines. All lines recognised by my_condor_submit are listed below, and the contexts in which they can be used are described in the following sections. Any line that is not recognised by my_condor_submit will cause a warning to be given.
Specifies if the job is an MPI or serial executable. Note that some resource job managers are likely to contain bugs when used for single-processor MPI jobs or multi-processor "serial" jobs (used to maximise memory, or for executables that manage their own inter-thread communication).
This specifies any command line arguments needed to run the job.
Notes: Arguments given by this method replace any arguments from the globusRSL line. The value is currently parsed as a single string; this should be fixed when the parser is re-worked.
Runs a user specified command or script at the end of the preScript.
Runs a user specified command or script at the start of the postScript.
These four tags allow the user to select subsets of the machines in the resource database to submit to. Three of the tags were new in version 1.4. The interaction is possibly non-obvious: the four tags are combined with a logical AND, with the proviso that the defaults are to exclude nothing and to include all machines and all Grids. Note that in addition to these commands, it is possible for the Grid administrators to disable machines for all users (for example, to prevent failing machines from being used). MCS will exit with an error if no machines can be found according to the preferences expressed using the tags below.
This line is used to specify a list of resources to metaschedule to.
This selects a subset of machines to schedule to.
Lists machines not to be submitted to.
This excludes a subset of machines from consideration for scheduling.
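The way the four selection tags combine can be illustrated with a minimal Python sketch. This is not the MCS implementation: the function name, its parameters and the machine-to-Grid dictionary are all hypothetical, and it assumes that the "subsets" are Grids, as the mention of "all Grids" above suggests.

```python
def select_machines(resources, preferred_machines=None, preferred_grids=None,
                    excluded_machines=(), excluded_grids=()):
    """Return the machines that pass all four filters (logical AND).

    resources maps machine name -> Grid name (hypothetical structure).
    A preferred filter left as None means "include everything";
    an empty excluded filter means "exclude nothing".
    """
    selected = []
    for machine, grid in resources.items():
        if preferred_machines is not None and machine not in preferred_machines:
            continue
        if preferred_grids is not None and grid not in preferred_grids:
            continue
        if machine in excluded_machines or grid in excluded_grids:
            continue
        selected.append(machine)
    if not selected:
        # Mirrors the documented behaviour: exit with an error if nothing matches.
        raise RuntimeError("no machines match the selection tags")
    return selected
```

With no tags given, every machine in the dictionary is returned; each additional tag can only narrow the result.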
The following lines all relate to staging of data and executables to the execute machine or retrieval of program output from the execute machine.
This specifies the SRB path to the ``Executable''. This path is not the SRB full path to the executable. The 'architecture' string of the machine the job runs on is appended to the pathToExe at job run time. For example if
    Executable = ossia.x
    pathToExe = /ngs/home/joe-bloggs.ngs/test
    preferredMachineList = vidar.ngs.manchester.ac.uk-serial ngs.rl.ac.uk-serial
and the job runs on ngs.rl.ac.uk-serial, then RMCS will look to upload and run the executable in SRB at /ngs/home/joe-bloggs.ngs/test/linux-64-serial/ossia.x.
The mapping from machine name (e.g. ngs.rl.ac.uk-serial) to architecture can be found in the 'architecture' column of the MCS Grid Hosts table.
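The path construction described above can be sketched in Python as follows. The dictionary stands in for the 'architecture' column of the MCS Grid Hosts table and contains only the one mapping implied by the example; the function name is hypothetical and this is not the RMCS code itself.

```python
import posixpath

# Hypothetical excerpt of the 'architecture' column of the MCS Grid Hosts table.
ARCHITECTURES = {"ngs.rl.ac.uk-serial": "linux-64-serial"}


def resolve_executable(path_to_exe, executable, machine, architectures=ARCHITECTURES):
    """Build the SRB path of the executable for the machine the job runs on."""
    arch = architectures[machine]  # KeyError if the machine is unknown
    # The architecture string is appended to pathToExe, then the executable name.
    return posixpath.join(path_to_exe, arch, executable)


print(resolve_executable("/ngs/home/joe-bloggs.ngs/test", "ossia.x",
                         "ngs.rl.ac.uk-serial"))
```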
This specifies a collection to get files from or to upload files to within the SRB as part of the job submission.
A suggested change is to remove the "S" from the command name, with Sdir remaining as a synonym that triggers a warning.
This specifies whether data transfer to / from the SRB should be directly between the execute machine and the SRB vault. Direct transfer leads to much improved performance but requires extra firewall holes. Set this to false if you are unable to transfer directly between your chosen execute resource and all of the SRB vaults. All e-Minerals vaults and execution resources allowed direct transfers.
This line specifies whether to overwrite local / SRB files when getting / putting files. A value of ``true'' will allow overwriting and ``false'' will not allow overwriting. Note a value of `false' will cause my_condor_submit to fail with an error if files being retrieved / uploaded already exist.
This specifies a list of files to retrieve from the previously specified collection within the SRB at the start of the submitted job. Note that wildcards (*) are now properly supported and can be used as they would be with any Linux command-line command. Recursion is also allowed (i.e. subdirectories are downloaded) if a related SRecurse line is specified for the containing Sdir line.
Files will only be retrieved recursively if used in conjunction with the SRecurse line described below.
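As an illustration of shell-style wildcard semantics only, the following Python sketch filters a flat list of file names against a list of patterns using the standard fnmatch module. The function name and the sample names are hypothetical; the matching actually performed by the Scommands may differ in detail, and recursion is not modelled here.

```python
import fnmatch


def match_collection(names, patterns):
    """Return the names that match at least one shell-style pattern.

    names is a flat list of file names in a collection (no subdirectories).
    The original order of names is preserved.
    """
    matched = []
    for name in names:
        if any(fnmatch.fnmatchcase(name, pattern) for pattern in patterns):
            matched.append(name)
    return matched
```

For example, the patterns `*.out` and `input.cml` applied to a collection holding `run1.out`, `run2.out`, `input.cml` and `notes.txt` would select the first three names.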
Suggested changes are to: allow any number per Dir block; remove the "S" from the command name, with Sget remaining as a synonym that triggers a warning; specify wildcard arguments better; and expand wildcards at submit time.
This line is used to specify the location of the Scommands on the machine on which the Sput / Sget commands are called. You need not specify this line if the Scommands are in /home/srbusr/SRB3_3_1/utilities/bin.
This specifies a list of files to put into the previously specified collection within the SRB at the end of the submitted job. Wildcards (*) are now properly supported and can be used in the same manner as with normal Linux command-line commands. Directories can be uploaded recursively when used in conjunction with the SRecurse line described below.
Suggested changes are to: allow any number per Dir block; remove the "S" from the command name, with Sput remaining as a synonym that triggers a warning; and specify wildcard arguments better.
This line specifies whether to recursively upload / download files to / from the SRB. Used in conjunction with wildcards in Sget / Sput commands.
Turns on architecture-specific download / upload for this dir block.
The following lines all relate to obtaining and uploading of metadata to the e-Minerals metadata database. It is worth noting that metadata parameters are limited in length (currently to 50 characters for the value and 30 for the name). MCS will detect cases where this limit will be exceeded and attempt to warn the user to minimise the risk of loss of data integrity. This warning is achieved by inserting "***TRUNCATED DATA***" at the end of the stored string in the database and writing a warning to out.err which includes the original (un-truncated) string.
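The truncation behaviour for metadata values can be sketched as below. The limits and the marker text come from the description above; everything else is an assumption made for illustration, in particular that the marker itself must fit within the 50-character limit and the wording of the warning. MCS may handle these details differently.

```python
MAX_NAME_LEN = 30    # documented limit for the parameter name
MAX_VALUE_LEN = 50   # documented limit for the parameter value
MARKER = "***TRUNCATED DATA***"


def prepare_value(value, limit=MAX_VALUE_LEN, marker=MARKER):
    """Return (stored_value, warning); warning is None if nothing was cut."""
    if len(value) <= limit:
        return value, None
    # Assumption: the marker counts towards the limit, so keep only the
    # leading characters that leave room for it.
    stored = value[: limit - len(marker)] + marker
    # The warning (written to out.err by MCS) carries the un-truncated string.
    warning = "metadata value truncated; original value: " + value
    return stored, warning
```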
This line is used to instruct my_condor_submit to collect other data values from within a CML file and store them as metadata. The annotation will be created with the name as specified as the nameForAnnotation part of the line and will be retrieved from the file specified by the filename part of the line. The value will then be selected by evaluating the path specified by the rest of the line. A full description of this evaluation is given below.
This line is used to instruct my_condor_submit to extract metadata from a specified CML file. The metadata extracted will consist of all of the parameters within the first parameterList element within the CML file - this will typically consist of simulation input parameters. Also all of the metadata elements within the first metadataList will be extracted. All metadata extracted will be stored as annotations on the created data object. In addition an attempt is made to locate a UUID stored in the file. If this is found and it passes (partial) validation then this is stored in the database. Otherwise a null UUID (00000000-0000-0000-0000-000000000000) is stored.
This line is used to instruct my_condor_submit as to where AgentX should look for its mappings and ontology if the default location is not to be used. This location must have the same directory structure as that seen at the default location.
This line is used to instruct my_condor_submit as to where AgentX is installed on the execute machine if not in the default location or in a location that my_condor_submit does not know about.
This line is used to instruct my_condor_submit as to whether or not it should collect metadata regarding the submission and execution environments of the jobs which will then be stored within the metadata database. All metadata collected will be stored as annotations on the created data object.
This line is used to instruct my_condor_submit to store a specified string of metadata with a specified name within the created metadata data object. The string will be given the name as specified by name and will have value as specified by value.
This line is used to specify the ID of an existing dataset to contain the created data object, which will in turn contain all of the collected metadata. This line must be used instead of the RStudyID and RDatasetName lines.
This line is used to specify a string to be used as the name of a created dataset to contain the created data object which will in turn contain all of the collected metadata. This line must be used in conjunction with the RStudyID line instead of the RDatasetID line.
This line is used to specify the name to be given to the created data object within the metadata database. A data object with name equal to this line and URL equal to the preceding Sdir line will be created to contain all harvested metadata.
This line is used to instruct my_condor_submit as to where the RCommand binaries are installed if they are not in the default location or a location that my_condor_submit already knows about.
This line is used to specify the ID of a study in which to create a dataset to contain the created data object which will in turn contain all of the collected metadata. This line must be used in conjunction with the RDatasetName line instead of the RDatasetID line.
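The two mutually exclusive ways of identifying the target dataset (RDatasetID alone, or RStudyID together with RDatasetName) can be checked with a small Python sketch. The function, its return values and the dictionary representation of the input file are hypothetical; only the three line names come from the text above.

```python
def dataset_target(lines):
    """Decide how the dataset is identified from a dict of input-file lines.

    Returns ("existing", dataset_id) or ("create", study_id, dataset_name).
    Raises ValueError for any other combination of the three lines.
    """
    has_id = "RDatasetID" in lines
    has_study = "RStudyID" in lines
    has_name = "RDatasetName" in lines
    if has_id and not (has_study or has_name):
        return ("existing", lines["RDatasetID"])
    if has_study and has_name and not has_id:
        return ("create", lines["RStudyID"], lines["RDatasetName"])
    raise ValueError(
        "specify either RDatasetID, or RStudyID together with RDatasetName"
    )
```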
The following lines all relate to meta-scheduling across the e-Minerals minigrid resources.
This line is used to specify the type of job being submitted which must be either `performance' or `throughput'. Choosing `performance' results in the job being submitted to a cluster machine while choosing `throughput' will submit to a condor pool.
This line is used to specify the number of processors to be used on the remote machine.
The following lines are all standard condor input file tags that my_condor_submit understands and will accept as part of its input file.
This line is used to specify the name of the file to which stderr should be redirected for the main part of the submitted job i.e. the stderr from the actual job execution rather than data-staging sections of the submission.
This line is used to specify the name of the executable to be run for the main part of the submitted job, i.e. the actual job execution rather than the data-staging sections of the submission.
This line is used to specify additional arguments to the main part of the submitted job. It can be used to specify stdin, stdout and stderr for the main section of the job if desired.
    GlobusRSL = (stdin=file.in)(stdout=file.out)(arguments=-f example_argument)
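To show how a value of this form breaks down into its parts, here is a minimal Python sketch that splits a flat sequence of (name=value) pairs into a dictionary. It is not a full Globus RSL parser (nested relations and quoted values are not handled) and it is not how my_condor_submit itself reads the line; the function name is hypothetical.

```python
import re

# One "(name=value)" pair: the name has no spaces, "=" or parentheses;
# the value runs up to the closing parenthesis.
_RSL_PAIR = re.compile(r"\(\s*([^=()\s]+)\s*=\s*([^()]*)\)")


def parse_simple_rsl(rsl):
    """Split a flat RSL string such as "(a=1)(b=2)" into a dict."""
    return {name: value.strip() for name, value in _RSL_PAIR.findall(rsl)}


print(parse_simple_rsl(
    "(stdin=file.in)(stdout=file.out)(arguments=-f example_argument)"
))
```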
This line is used to specify a particular machine and jobmanager to submit to and can only be used when not meta-scheduling. This line can be used to submit to a machine that my_condor_submit does not know about as long as the specified jobmanager is one which my_condor_submit supports.
This line is used to specify the name of a file to be used for stdin for the main part of the submitted job. Can be used instead of the (stdin) section of the globusRSL line.
This line if specified will be ignored by my_condor_submit which will instead use the default value.
This line is not currently supported by the NGS RMCS server pending debug.
This line is used to specify whether you want condor to notify you by email of the status of the main part of the submitted job once it finishes. Possible values are 'always', 'complete', 'error' or 'never'.
This line is used to specify a name for the file to be used for stdout for the main part of the submitted job. This file will be left on the remote machine to be uploaded using a relevant Sput line if desired.
This line is used within condor to tell it to submit the job. It is accepted by my_condor_submit for compatibility, but it is not needed and will simply be ignored if specified.
This line is used to specify whether to return the stderr from the execution machine to the local machine (a value of `true') or leave it on the execution machine (a value of `false') to be uploaded using an appropriate Sput line.
This line is used to specify whether my_condor_submit should transfer the executable from the local machine to the execute machine rather than using the SRB. This doesn't make sense for meta-scheduled jobs. A value of ``true'' will transfer the file from the local machine while ``false'' will not.
This line is used to specify a set of files that should be sent with the executable to the execution node within a condor pool. This line does not make sense when submitting to anything other than a condor jobmanager and so will be ignored by my_condor_submit in this case. The files will be transferred from the condor pool's submit node (to which my_condor_submit submits its job) to the relevant execution machine after they have been downloaded using the pre stage of the my_condor_submit job.
This line would be used to specify whether to return the main job's stdout file to the submission machine. However this does not make sense within the my_condor_submit context and so is ignored and the output file is always left on the remote machine to be uploaded with a relevant Sput line.
This line is included to provide backward compatibility with older versions of my_condor_submit and is used to tell condor that it should use Globus to submit to the remote execution machine. The only permissible value is `Globus'.
This line is used to specify the location of the user's X.509 proxy certificate should it not be in the location specified by the X509_USER_PROXY environment variable. The value specified here will override the value retrieved from grid-proxy-info and the environment variable. This line is designed to allow my_condor_submit to be used when the user has gsissh'd into a submit machine.