Quoting Mike Miller <mbmiller at taxa.epi.umn.edu>:

> We are thinking about putting together a cluster of maybe 10 machines,
> presumably using GNU/Linux. Do any of you have experience with this?
>
> Some of the things I'm wondering about include the appropriate
> configuration of machines -- isn't it better in terms of cost/benefit to
> buy fewer dual quad-core machines than more single CPU machines,
> especially if the jobs are not very memory-intensive?
>
> We certainly want to use shared disks, but is there any problem with
> booting all the computers from the same network drive? That seems like a
> good idea to me rather than to have separate HDDs in the machines, but I'm
> not sure how it is done.
>
> What free software is available for managing jobs, e.g., batch queuing?
>
> FYI ... The idea is to use these machines for our genetic analyses --
> maybe 600,000 SNPs on 7,500 people, but this mostly consists of running
> one SNP at a time on some collection of traits. I don't think the memory
> requirements are too great unless we try to load a lot of the data at
> once.
>
> Mike
>

You might want to take a look at what the folks at UW are doing with Condor:

http://www.cs.wisc.edu/condor/

That being said, I'm sure you're not the only one at the U who's looking to do this. I have to imagine there are quite a few folks at it already. Are there any internal peer groups or other ways to collaborate with campus folks?

Josh
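
P.S. Since the work is one SNP at a time, it should split cleanly into independent batch jobs, whichever queuing system ends up being used. Below is a rough sketch of that split, not anything Condor-specific. The function name snp_chunks is made up for illustration, and the figure of 80 job slots is only an assumption (10 machines x 2 sockets x 4 cores, taken from the numbers in the original post).

```python
def snp_chunks(n_snps, n_jobs):
    """Split SNP indices 0..n_snps-1 into n_jobs contiguous, near-equal
    half-open ranges (start, stop). Hypothetical helper for illustration."""
    if n_jobs <= 0:
        raise ValueError("n_jobs must be positive")
    base, extra = divmod(n_snps, n_jobs)
    chunks = []
    start = 0
    for job in range(n_jobs):
        # The first `extra` jobs each take one additional SNP so the
        # remainder is spread evenly instead of landing on the last job.
        size = base + (1 if job < extra else 0)
        chunks.append((start, start + size))
        start += size
    return chunks


if __name__ == "__main__":
    # 600,000 SNPs comes from the original post; 80 slots is an assumption
    # (10 dual quad-core machines, one job per core).
    chunks = snp_chunks(600_000, 80)
    print(len(chunks), "jobs; first:", chunks[0], "last:", chunks[-1])
```

Each (start, stop) pair could then be passed as arguments to one queued job, so every job reads only its own slice of the SNP data and memory use per job stays small.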