Using the Dartmouth Discovery Cluster


The system administrators of the Discovery cluster have recently installed the OpenAFS client on all computational nodes (in addition to the head nodes). This means that we can now seamlessly run DBIC applications on discovery.

Obtaining Tokens on Discovery Headnode

You can now manually obtain DBIC AFS tokens from the discovery.dartmouth.edu headnodes:

[jdobson@discovery ~]$ klog jed@dbic.dartmouth.edu
Password:
[jdobson@discovery ~]$ tokens

Tokens held by the Cache Manager:

User's (AFS ID 10167) tokens for afs@dbic.dartmouth.edu [Expires Aug  8 18:20]
   --End of list--
[jdobson@discovery ~]$ 

This token will not propagate from the headnode to the compute (execution) nodes, but it will allow you to access and copy your data from the dbic.dartmouth.edu cell to your local Discovery home directory.
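
For example, once you hold a token on the headnode, you can copy a directory from the DBIC cell into your Discovery home directory with an ordinary cp. The AFS path below is illustrative; substitute the actual location of your data:

[jdobson@discovery ~]$ cp -r /afs/dbic.dartmouth.edu/usr/jed/raw_data ~/raw_data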

Access to Applications

All of the DBIC applications and scripts stored in AFS are shared read-only with the whole Dartmouth community. This means that you do not need an AFS token to access applications and scripts. To use the DBIC environment, you simply need to source the DBIC start-up script within your Torque submission script. Here is an example of lines to add to your submission script:

PROFILE_SCRIPT=/afs/dbic.dartmouth.edu/DBIC/etc/profile
if [ -f ${PROFILE_SCRIPT} ]; then
    source ${PROFILE_SCRIPT}
fi

After sourcing the profile script, your PATH will be modified to place DBIC applications before all other system and Discovery-local applications. This means that you can run, for example, FSL or AFNI commands without specifying the full path:

[jdobson@discovery ~]$ which fslversion
/afs/dbic.dartmouth.edu/usr/local/common/bin/fslversion
[jdobson@discovery ~]$ fslversion
4.1.9
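
Putting these pieces together, a minimal Torque submission script might look like the following sketch. The PBS resource requests and the fslmaths command are illustrative; substitute your own job:

#!/bin/bash
#PBS -N dbic_example
#PBS -l nodes=1:ppn=1
#PBS -l walltime=01:00:00

# Source the DBIC environment if the AFS profile script is reachable
PROFILE_SCRIPT=/afs/dbic.dartmouth.edu/DBIC/etc/profile
if [ -f ${PROFILE_SCRIPT} ]; then
    source ${PROFILE_SCRIPT}
fi

# Illustrative FSL command; the input data lives in your Discovery home directory
fslmaths ~/raw_data/input.nii.gz -s 2 ~/raw_data/smoothed.nii.gz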

Using IPython

Minimal testing indicates that the DBIC Python and IPython installations work correctly from the Discovery cluster nodes. You should be able to import most libraries, packages, and functions. If you cannot, please report the name of the package.

[jdobson@discovery ~]$ echo "from mvpa2.suite import *; mvpa2.test(limit='clf', verbosity=4, exit_=False)" | ipython
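
If a particular import fails, a quick way to pin down the problem is to try importing the package directly with the DBIC Python (assuming you have sourced the DBIC profile script; nibabel is just an example package name):

[jdobson@discovery ~]$ python -c "import nibabel; print(nibabel.__version__)"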

Access to Data

There are three ways to gain access to your data stored on the DBIC AFS file servers. The first requires no authentication; the other two involve obtaining an AFS token, either directly on the Discovery nodes or on a DBIC system and then passing it to Discovery.

Sharing Your Data with Discovery

The DBIC AFS cell defines an entity named 'dartmouth-ip' that matches all Dartmouth IP addresses (public and private). You can assign rights (via an Access Control List, or ACL) to this entity with the fs command. You will have to run this command on a DBIC node or after manually obtaining an AFS token on Discovery.

[jed@dexter ~]$ fs sa directory_name dartmouth-ip rl
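
You can verify that the ACL was applied with fs listacl (the directory name is illustrative); the output should include an entry granting 'rl' rights to dartmouth-ip:

[jed@dexter ~]$ fs listacl directory_name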

The 'rl' ACL is the minimum needed to grant read-only access to the Discovery cluster. This will enable you to copy data from the DBIC AFS cell to Discovery. You will not, however, be able to write it back without obtaining a token. You might want to use this ACL to share raw data, run an analysis on the Discovery cluster, and then copy the results back manually from Discovery to a DBIC system. Alternatively, you might simply obtain a token on the Discovery cluster and copy your data locally before submitting your scripts. When your scripts finish, you can log in to Discovery again, obtain a token, and copy the data back to the DBIC AFS cell.
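
For example, after your jobs finish, writing results back might look like this (paths are illustrative):

[jdobson@discovery ~]$ klog jed@dbic.dartmouth.edu
Password:
[jdobson@discovery ~]$ cp -r ~/results /afs/dbic.dartmouth.edu/usr/jed/results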

Obtaining a Token on Discovery

Unlike the Sun Grid Engine configuration on the DBIC systems, the Discovery cluster cannot automatically pass an AFS token obtained on the head node to the computational nodes. You can use the 'klog' application provided by the OpenAFS package to obtain tokens. You can include a line that obtains your token within your submission script, but this requires exposing your DBIC password in plain text on the Discovery cluster, which is not ideal. To do this, add the following command to your Torque submission script:

[jdobson@discovery ~]$ echo "secretpassword" | klog -principal DBIC_USER_NAME -cell dbic.dartmouth.edu -pipe

Note that you will need to provide your DBIC user name as the argument to the -principal option; this might be different from your Discovery username. The cell name must be given explicitly because the local AFS cell on Discovery is not dbic.dartmouth.edu.
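
If you accept the risk, a slightly safer variant keeps the password out of the script itself by reading it from a file that only you can read (the file name is illustrative):

# Create the file once and restrict its permissions: chmod 600 ~/.dbic_password
klog -principal DBIC_USER_NAME -cell dbic.dartmouth.edu -pipe < ~/.dbic_password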

Passing Tokens from DBIC systems to Discovery (Experimental)

To test our experimental job submission, you will first need credentials to log in to the Discovery cluster and some basic scripts already established. You should select scripts that already run successfully on DBIC or Social Brain Sciences (SBS) systems before attempting to run jobs on Discovery.

You must submit jobs from dexter.dartmouth.edu. Before you attempt this, please ask for your home directory on dexter to be set as a local directory. This is needed to allow jobs to move smoothly between dexter and the Discovery cluster via Condor without requiring access to your AFS home directory. Once your home directory has been configured correctly, the 'pwd' command should return a local (non-AFS) directory:

[jed@dexter ~] > pwd
/home/jed

After connecting to dexter.dartmouth.edu with ssh, you will need to configure the BOSCO environment. BOSCO enables transparent job submission via Condor, over SSH, to the Torque job manager running on Discovery.

[jed@dexter ~] > bosco_quickstart 
...
Type the submit host name for the BOSCO resource and press [ENTER]: 

In response to the question about the submit host, enter 'discovery.dartmouth.edu'. You will be asked for your username on the remote system. Enter your discovery username.
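
After bosco_quickstart completes, jobs are submitted from dexter with condor_submit. A minimal submit file might look like the following sketch, assuming BOSCO has registered discovery.dartmouth.edu as a PBS/Torque resource (the file and job names are illustrative):

# myjob.sub - minimal Condor submit file for the BOSCO resource
universe      = grid
grid_resource = batch pbs YOUR_DISCOVERY_USERNAME@discovery.dartmouth.edu
executable    = myjob.sh
output        = myjob.out
error         = myjob.err
log           = myjob.log
queue

[jed@dexter ~] > condor_submit myjob.sub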