2008-07-15

Subject checkout on shared volume

In our lab, we have five MacBook Pros that could theoretically all be used at once to test subjects in a single experiment. In the past, we have gotten into trouble when, due to experimenter error, the same subject slot was run on more than one computer. To get past that problem, we want to keep the experiment setup hierarchy on a shared volume and come up with some way for all of the computers to share it. Obviously, there must be some method to prevent two computers from trying to use the same resources. The simplest way is to set a lock at the filesystem level, marking the subject as "taken", and to release the lock afterwards. However, the most straightforward way to do the sharing, using one of Apple's group iDisks, has no locking mechanism. You can't even make files read-only. The lockfile(1) program, when asked to create a lockfile on an iDisk, gives up and suggests praying instead.

I did come up with a locking mechanism. The idea is to use a reserved folder. A computer that wants to lock the resource waits until the folder is empty, then writes a file named after its ID into the folder (the ID could be, for example, the ethernet address of en0). After a short delay, the computer checks whether there is exactly one file in the folder, namely its own ID. If so, it has the lock. If there is more than one, it removes its ID, waits a short but random period, and tries again. The only problem with this mechanism is that it is very slow on an already slow filesystem like the iDisk.
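
Here is a rough sketch of that procedure in Python, assuming the shared volume is mounted as an ordinary directory and that the reserved folder already exists. The function names, the delays, and the retry limit are all made up for illustration; the right delay for an iDisk would have to be found by experiment.

```python
import os
import random
import time


def acquire_folder_lock(lock_dir, my_id, settle=0.5, max_backoff=2.0, attempts=20):
    """Try to take the lock represented by lock_dir; return True on success."""
    marker = os.path.join(lock_dir, my_id)
    for _ in range(attempts):
        # Wait until nobody else appears to hold or contend for the lock.
        if os.listdir(lock_dir):
            time.sleep(random.uniform(0, max_backoff))
            continue
        # Announce ourselves by writing a file named after our ID.
        with open(marker, "w") as f:
            f.write(my_id)
        # Give the other machines' writes time to show up (assumed delay).
        time.sleep(settle)
        if os.listdir(lock_dir) == [my_id]:
            return True
        # Someone else wrote an ID too: withdraw, back off, and retry.
        os.remove(marker)
        time.sleep(random.uniform(0, max_backoff))
    return False


def release_folder_lock(lock_dir, my_id):
    """Release the lock by removing our ID file from the reserved folder."""
    os.remove(os.path.join(lock_dir, my_id))
```

Note that the check assumes nothing else ever lands in the reserved folder; any stray file there would keep everyone waiting.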

After pondering this for a while, I thought of another approach. Instead of setting a lock before accessing the subject slot, you randomly choose the "next" subject to test, and then rename it to a name with your ID. For example, if the subject is called "12", and if your ID is aa.bb.cc.dd, then you would simply "mv 12 12-incomplete-aa.bb.cc.dd". Then wait a short time and see if "12-incomplete-aa.bb.cc.dd" exists. If it does, you now own subject 12; if not, try again. (If the locked name doesn't exist, it means that a race occurred and another computer locked it between the time you found it and the time you did the mv command.)
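
The rename approach could be sketched like this, again assuming the shared volume behaves like a normal mounted directory. The function name is hypothetical, and the sketch assumes that unclaimed subjects are exactly the entries whose names contain no hyphen (such as "12").

```python
import os
import random
import time


def claim_subject(root, my_id, settle=0.5):
    """Claim one unclaimed subject under root; return its new name, or None."""
    # Assumption: a name without a hyphen has not been claimed yet.
    free = [name for name in os.listdir(root) if "-" not in name]
    random.shuffle(free)  # random order makes collisions less likely
    for name in free:
        claimed = "%s-incomplete-%s" % (name, my_id)
        try:
            os.rename(os.path.join(root, name), os.path.join(root, claimed))
        except OSError:
            # The subject vanished: another computer renamed it first.
            continue
        # Wait a short time, then confirm that our name is the one that stuck.
        time.sleep(settle)
        if os.path.exists(os.path.join(root, claimed)):
            return claimed
    return None
```

A call such as claim_subject("/Volumes/shared/experiment", "aa.bb.cc.dd") would return something like "12-incomplete-aa.bb.cc.dd", or None when nothing is left to test. The path is only an example.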

The random selection is somewhat important, but not critical. If you just go in a fixed order, all it means is that there is a slightly greater probability that a given computer will have to try more than once to get a subject.

Once the subject is locked, testing proceeds. When it is complete, the name is changed again to, e.g., "12-complete-aa.bb.cc.dd". Note that it is still locked, in a sense, since it will not appear in the list for testing.
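
The final rename is simple enough, but for completeness here is a small sketch. The function name is hypothetical, and it assumes the "subject-incomplete-ID" naming from the example, with no hyphen in the subject name itself.

```python
import os


def mark_complete(root, claimed_name):
    """Rename 'N-incomplete-ID' to 'N-complete-ID' and return the new name."""
    # Split into at most three parts so that hyphens in the ID are preserved.
    subject, state, owner = claimed_name.split("-", 2)
    if state != "incomplete":
        raise ValueError("not an in-progress subject: %r" % claimed_name)
    done = "%s-complete-%s" % (subject, owner)
    os.rename(os.path.join(root, claimed_name), os.path.join(root, done))
    return done
```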

One other brief note: it might make sense for each subject on the remote volume to be an archive, for example in tar.gz format. This would facilitate copying it onto the MacBook Pro. A question to be resolved is whether data is placed into the archive or somewhere else on the remote volume.
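
If the archive route is taken, packing and unpacking a subject might look something like the following, using Python's standard tarfile module. The function names are made up, and this sidesteps the open question of where the data ends up.

```python
import os
import tarfile


def pack_subject(subject_dir, archive_path):
    """Write subject_dir into a gzip-compressed tar archive."""
    # normpath strips any trailing slash so basename is not empty.
    top = os.path.basename(os.path.normpath(subject_dir))
    with tarfile.open(archive_path, "w:gz") as tar:
        tar.add(subject_dir, arcname=top)


def unpack_subject(archive_path, dest_dir):
    """Extract a subject archive into dest_dir on the local machine."""
    with tarfile.open(archive_path, "r:gz") as tar:
        tar.extractall(dest_dir)
```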
