Setting up F@H on Linux
2007-11-11 tutorial
I decided today that I should reformulate my tutorials section to be an extension of my blog. It's a lot less headache all around, I think. I'll fold the whopping one I already have in later, though. For now, it's time for a new one. This tutorial will run through how to set up the Folding At Home client on a Linux computer.
For those not in the know, F@H is an initiative by Stanford University to simulate protein folding. The idea behind it is that many diseases are caused by protein misfolding. If we can understand how proteins fold naturally, then we can hopefully develop cures for diseases like Alzheimer's, Parkinson's, ALS, and certain cancers. Unfortunately, the calculations to simulate and study the process of protein folding require mucho computer horsepower - lots of memory and processor throughput. Happily for the folks studying protein folding, modern consumer PCs are quite capable of performing folding calculations on proteins in a reasonable timeframe - one protein may take up to a week. The great folks at Stanford have developed and distributed a client that can be downloaded for several different architectures and OSes, which saves a lot of supercomputing time. You can find out more at the Stanford folding site.
As I mentioned, Folding at Home has clients for Windows 32 and 64 bit, Linux x86 and x86_64, Mac OS X PPC, OS X Intel, certain ATI GPUs, and the PlayStation 3. If you own a PlayStation 3 with system software version 1.6 or later, Sony has actually bundled F@H on it. As of the writing of this post, the PS3 has pushed the total computing power of F@H to over a petaflop. As a matter of fact, PS3s alone account for 1.2 petaflops of processing power... that's a lot of calculations! However, I'm not here to sell you a PS3. I'm here to tell you how to fold on your computer.
With that in mind, we'll need to do a little brainstorming first. What kind of computer do you want to fold with? Your two basic options on Linux are the 32-bit client and the SMP client. The SMP client requires a 64-bit OS, and also either a multi-core processor or multiple processors. I'll be sure to mention the finer points of this as I come to them. Note that if you have a multi-core processor, but only a 32-bit OS (like Slackware, for example), then you can't run SMP. Hey, I don't make the rules... this limitation is the one and only reason I started running Kubuntu 64 on my laptop.
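If you're not sure which camp your machine falls into, a quick check along these lines may help. This is only a rough sketch: can_run_smp is a name I made up for illustration, and it assumes that uname -m reporting x86_64 means you have a 64-bit OS, which is usually but not always true.

```shell
#!/bin/sh
# Hypothetical helper (not part of the F@H client): succeeds only when the
# architecture is x86_64 and there is more than one processor core.
can_run_smp() {
    arch="$1"     # machine architecture, e.g. the output of `uname -m`
    cores="$2"    # number of processor cores currently online
    [ "$arch" = "x86_64" ] && [ "$cores" -gt 1 ]
}

arch="$(uname -m)"
cores="$(getconf _NPROCESSORS_ONLN)"

if can_run_smp "$arch" "$cores"; then
    echo "arch=$arch cores=$cores: the SMP client should be an option"
else
    echo "arch=$arch cores=$cores: plan on the regular client"
fi
```

On a 32-bit install, uname -m typically prints something like i686, and the check falls through to the regular client.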
Okay, now that we know what we're up against, we should probably start by downloading the client software from the Stanford website. Don't download it from anywhere else... this is science, after all, and while they aren't averse to the idea of open-source software, they don't want anybody messing with the results. Stanford assigns points to users (they don't really mean anything, but it's fun to see how you're doing against the rest of the world) and doesn't want some unscrupulous user rewriting their client to falsify results in order to boost point totals. I agree wholeheartedly; I wouldn't want this software to lead to another Therac-25 situation. Anyway, digression aside, download the latest Linux client. As of right now, it's version 6.00 beta1. It says that it's 64 bit only, but it lies through its teeth; it will run on a 32-bit OS, it just won't run with the SMP flag in place.
Got it downloaded? Good. Now, flip over to your download directory, and move fah6 into its own directory. I don't think you can mess up the client, but it's better to be safe than sorry. Now that you have everything in place, run the client. You'll need to have root privileges for this, so if you need to use sudo, then do so. Note that if everything is in place to run SMP, then you should do so; it runs faster, and it produces more useful information (and Stanford assigns more points accordingly.) To do so, run the client with ./fah6 -smp.
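For the command-line inclined, the shuffle above might look something like this. Treat it as a sketch: install_client is a helper name I invented, and the paths in the example are guesses at where your download landed and where you want the client to live.

```shell
#!/bin/sh
# Hypothetical helper: move a downloaded client binary into its own
# directory and make sure it is executable.
install_client() {
    src="$1"      # where the download landed
    dest="$2"     # directory the client will live in
    mkdir -p "$dest" &&
    mv "$src" "$dest/fah6" &&
    chmod +x "$dest/fah6"
}

# Example with assumed paths (adjust to your system):
#   install_client ~/Downloads/fah6 ~/fah
#   cd ~/fah
#   ./fah6 -smp     # 64-bit OS with multiple cores
#   ./fah6          # everything else
```

Starting the client from inside its own directory keeps the files it creates out of the rest of your downloads.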
The client will ask you some questions. First, it wants to know your username; if you want to, you can run anonymously, but creating a username lets you keep track of your points. On the client download page, you may have noticed a box that lets you search for a username. If you want to, you can use the results from that box to choose a username, and then enter it in the client.
Next, it will ask about a team number; again, this is your preference. I fold for team 11314, the Extreme Overclocking team. You can pick a team on the Stanford team results page, or you can randomly choose a team number, or you can pick a team from an organization, software project, or website you are affiliated with (there are teams from Slashdot, Majorgeeks.com, Ubuntu, Penny Arcade, Hack-a-day, Gentoo...), or you can choose not to fold with a team.
Next, it will ask for your passkey; this is explained more at the Stanford website, and again, it's optional.
The next question asks whether the client should ask before it sends or fetches work. I personally run my folding client all the time, and it may work for a week without me stopping it. Now, I can complete about one work unit a day, so it's to my benefit not to have to manually tell it to send the work it's completed or to fetch a new work unit. Thus, I said no here, but if you have limited internet bandwidth or a dial-up connection, you may want to say yes.
The next question deals with using a proxy. I personally don't use one, so I said no. I assume that you would know if you need a proxy, so use your own discretion here. If you really want to know, you can either use the magic of Google to find out more, or you can ask me, and I'll exercise my Google-fu for your benefit.
The next question is kind of misleading. It asks about the size of work units to process, and mentions memory requirements. The real problem is not the memory requirements, but the processing time. I run F@H on a 2.2 GHz Intel Core 2 Duo, and on a 1.6 GHz Sempron. I'm doing a test to see how the Sempron handles big work units; if it gets them done in a timely fashion, I'll keep doing them. If it bombs out, I'll either fold with normal units or even small ones, to get them done faster, which in turn speeds up the research. Use your discretion on what you think is best for your machine.
The next question is about advanced options. I firmly believe that everybody should go through these once, just so they know what to expect. You can usually choose the default options, so don't feel obligated to change something just because I recommend it.
The next question deals with core priority; your two choices are "Low" and "Idle". By default, Folding at Home only uses the CPU cycles you're not using. It does everything it can to get out of the way if you suddenly need all your processor for certain tasks. You can change this to Low, and it will theoretically perform some amount of work at all times. I really haven't noticed a difference on this, so Your Mileage May Vary here. The default is Idle, and that should work just fine.
The next question is kind of a strange one. Certain processors have special instructions built in that are meant to improve performance. Some you may have heard of include MMX, SSE, SSE2, SSE3. There are historical situations in which certain processors had flawed implementations of these instructions, and therefore this flag was developed. Long story short, you probably don't need to worry about this flag.
The next question is about checkpoints. Checkpoints occur when the client saves all the work it's been doing, so if the power goes out or the client is killed, you don't lose a lot of work. The default should be fine for most folks, but you can extend it if you are looking to maximize disk life, or shorten it if you worry about power outages or other such things.
The next question deals with memory. F@H has been pretty good about determining how much memory is available, but if this number is way off from what's actually in your machine, or if you prefer that the client only uses a certain amount, then you can change it manually here. Note that F@H will never use all your memory, so don't think that you won't be able to do anything else while running it.
The -advmethods flag can be a good thing or bad thing. If you're working with a powerful machine, you can get more resource-intensive workunits by saying yes to this question. On the other hand, this can cause you to miss deadlines, and it really taxes your computer. Be wise about the choice you make here.
If your system clock is really screwy, your computer can abort work units if it thinks that it missed a deadline, even if it actually hasn't. I haven't yet figured out how to make the F@H client display the correct time on the timestamps, although I suspect it's going off UTC time instead of the local clock in my computer. If you need to set this to yes, you'll likely know it.
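If you want to test the UTC hunch on your own machine, you can print the current time both ways and compare it against the timestamps the client shows. This is just a sanity check using the standard date command; the variable names are mine, and nothing here touches the client.

```shell
#!/bin/sh
# Print the current time in UTC and in the local timezone, side by side.
utc_now="$(date -u '+%Y-%m-%d %H:%M:%S')"
local_now="$(date '+%Y-%m-%d %H:%M:%S %Z')"

echo "UTC:   $utc_now"
echo "Local: $local_now"
```

If the client's timestamps line up with the UTC line rather than the local one, the offset is just your timezone and the clock itself is probably fine.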
The next question is the main reason I took you on this trip: the machine ID. As I mentioned, I fold on two machines. My main machine got ID 1; my second machine got ID 2. If you fold on more than one machine, you will want to set unique numbers for each of them.
Okay, we've walked through all the questions, so your machine should go to work downloading a new work unit and get started on it. Here are a few tips that may save you some grief:
- To re-run the configuration, you can start the client with -config to reconfigure and then start folding, or -configonly to just configure and quit.
- To properly stop the client in the middle of a work unit, use Ctrl-C to stop the process. I recommend giving it about 30 seconds to finish what it's doing and write everything to the disk before closing the terminal or restarting/shutting down the computer.
- Folding at Home can, theoretically, degrade gaming performance. Either don't game on your folding machine, or shut down the client while you game. Alternatively, it could work fine. This is where your experience trumps my advice.
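If you get tired of changing into the client directory every time, a small wrapper can do it for you. This is a sketch under assumptions: FAH_DIR and the fah function are names I made up, and ~/fah is only a guess at where you put the client. The flags in the examples are the ones covered in the tips above.

```shell
#!/bin/sh
# Assumed location of the client; change this to wherever you put fah6.
FAH_DIR="${FAH_DIR:-$HOME/fah}"

# Hypothetical wrapper: run the client from its own directory, passing
# along whatever flags you give it. The subshell keeps your current
# directory unchanged.
fah() {
    ( cd "$FAH_DIR" && ./fah6 "$@" )
}

# Examples:
#   fah -config        # reconfigure, then start folding
#   fah -configonly    # reconfigure and quit
#   fah -smp           # normal SMP run; stop it with Ctrl-C and give it
#                      # about 30 seconds to write everything to disk
```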
Well, that pretty much sums it up. If you need any clarification, let me know in the comments. I hope to see you folding soon, and thanks for your interest in helping to wipe out diseases!