Running I/O-Intensive Jobs on the Cluster
September 5, 2006
1. IMPORTANT UPDATE
With the addition of the new node, store2, and its 2.6 TB of RAIDed
storage, the cluster now has three globally accessible file systems:
- /home3, with approximately 2.6 TB of storage
- /home2, with approximately 0.5 TB of storage
- /home, with approximately 1.0 TB of storage
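For reference, free space on each of these file systems can be checked
from head with the standard df command (a quick sketch; the exact
numbers and device names in the output will of course differ):

    # Show available space on the three globally accessible file systems
    df -h /home /home2 /home3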
Per the discussion below, ONLY /home3 and /home2 ARE TO BE USED BY
PARALLEL JOBS DOING MORE THAN A VERY SMALL AMOUNT OF OUTPUT.
Users are encouraged to use /home3 whenever possible, since I/O to
that node should be faster than I/O to /home2.
Note that directories of the form /home3/<user>, where <user> is any
valid user of the cluster, should exist and be owned by <user>
(e.g. /home3/fransp exists and is owned by fransp, etc.).
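As a quick sanity check (a suggested sketch, not an official
procedure), you can confirm that your /home3 directory exists and is
owned by you before pointing jobs at it:

    # Verify that your /home3 directory exists and that you own it
    ls -ld /home3/$USER
    # The owner field in the listing should be your own username; if the
    # directory is missing or owned by someone else, contact Matt.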
As always, contact Matt immediately should you have any questions about
this, or any other, aspect of cluster operation.
January 22, 2004
1. The Problem
The cluster has only two globally accessible file systems:
- /home, with approximately 1.0 Terabyte of storage
- /home2, with approximately 0.5 Terabyte of storage
Both of these file systems are NFS-mounted by all of the compute nodes;
/home is physically attached to head, while /home2 is attached to
store, a machine whose existence may be news to many of you.
If many of the compute nodes are performing intensive I/O to user
directories on /home, then the head node can easily become overloaded
handling the NFS traffic. In such a case, as many of you have already
noticed, head becomes very unresponsive for interactive tasks, such as
compilation, file editing, or even execution of simple commands such
as ls.
Over the past few months, such occurrences have become more frequent,
with load averages on head periodically hitting 10 or more, and the
time has come to address this increasingly unacceptable situation.
2. The Proposed Solution
Any and all users who are running jobs on the cluster, particularly
parallel jobs or concurrent batches of serial jobs, that are doing
substantial amounts of I/O (i.e. generating more than a few megabytes
of data), should now adopt one or both of the following strategies:
- Run your code and have its output generated in your /home2/$USER
directory. These directories have recently been created for all users.
This will off-load NFS traffic from head to store, but since store is
not used for interactive purposes, a high NFS load on it should not be
as serious an issue as it is for head. Note that /home2 is mounted on
head, so files created on /home2 via jobs running on the compute nodes
can be propagated to the outside world via scp from head. (See the
sketch following this list.)
- Use local storage on the nodes: Each node has approximately 60
Gbytes of "scratch" disk space available via /var/scratch. Upon
logging in to any compute node, the directory /var/scratch/$USER will
be created if it doesn't already exist, and you can then perform I/O
to files and directories within that scratch directory. Files can
subsequently be transferred to your /home or /home2 directory via cp.
Note that there is currently no mechanism for directly transferring
data from /var/scratch on the compute nodes to the outside world,
since the nodes are on a private internal network. (Also illustrated
in the sketch below.)
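The following is a minimal sketch of both strategies for a hypothetical
executable called my_sim; the program name, its location, and the run
directory names are placeholders, not part of any official cluster
setup:

    # Strategy 1: write output directly to your /home2 directory
    mkdir -p /home2/$USER/run01
    cd /home2/$USER/run01
    /home/$USER/bin/my_sim > my_sim.out 2>&1

    # Strategy 2: write to local scratch on the compute node, then copy
    # the results back to NFS storage when the run is done
    mkdir -p /var/scratch/$USER/run02
    cd /var/scratch/$USER/run02
    /home/$USER/bin/my_sim > my_sim.out 2>&1
    cp -r /var/scratch/$USER/run02 /home2/$USER/

    # From head, results on /home2 can then be copied off the cluster,
    # e.g. (hypothetical destination):
    # scp -r /home2/$USER/run01 you@your.workstation.example.org:/some/path/

Either way, the important point is that the heavy I/O during the run
lands on /home2 or on local scratch rather than on /home.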
3. Additional Comments
- Whichever strategy you adopt, you must actively and responsibly
manage your disk usage. At a minimum this means checking the
{/home,/home2} and /var/scratch usage pages regularly to ensure that
you are not using more than your fair share of NFS storage, and
ensuring that:
  - You do not fill up /var/scratch on any of the nodes.
  - You clean up your scratch directories on the nodes when the
    data is no longer required.
  (See the housekeeping sketch at the end of this section.)
- Users who are CONFIDENT that they are not part of the NFS
overloading problem can continue to perform I/O to /home as before.
- WARNING!!: At least for the time being, NO backups
will be made of /home2, and there are no plans to ever back up /var/scratch
on any of the compute nodes.
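As a rough sketch of the kind of housekeeping this implies (the
directory names below are only examples), the usual du, df, and rm
commands are all that is needed:

    # How much NFS storage am I using?
    du -sh /home/$USER /home2/$USER

    # How full is local scratch on this compute node?
    df -h /var/scratch

    # Remove a scratch run directory once its data is no longer needed
    # or has been copied elsewhere
    rm -rf /var/scratch/$USER/old_run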
Management realizes that these new directives may lead to some
inconvenience for users, but again, since the head node is frequently
close to unusable these days, some action needs to be taken.
As usual, if you have any questions or comments about this new
policy, or if you encounter problems, contact Matt immediately.