[Sorry if this is a duplicate ... My reply early this morning was sent from the "wrong" address.... -Scott]

Heh. As a former employee of Sendmail and someone who's done a lot of research and implementation on high-volume mail servers, I'd have a lot to say/type on the matter ... except for that darned tendinitis recovery problem.

>>>>> "mb" == Michael Burns <sextus at visi.com> writes:

mb> Anything, in my experience, is better than sendmail. qmail and
mb> postfix are two.

Er, Sendmail 8.11 and 8.12 have a "multiple queue directory" feature that hasn't received much publicity but can rival Postfix's speed, despite Sendmail still being a fork()-happy resource pig. :-)

Back in May 2001, Sendmail's message delivery record on a *single* machine in *real* Internet conditions (i.e. not in a perfect lab environment) was 2.98 million messages/hour. That's about 825 messages/second, with an average message size of 4.9KB.

Rob Kolstad had a USENIX paper a few years ago about tuning Sendmail. Much of what it covers is still valid, despite changing times and MTAs. I can't think of other public email tuning resources off the top of my head, though they do exist if you dig.

For high-volume delivery, touching disk == performance death. Period. So avoid touching disk if you can. Most machines don't have non-volatile memory, so pure solid-state disks or hybrid SSDs with hard-disk backing store (and small batteries to keep the drive alive long enough to flush cache memory contents to that disk) are the way to go.

But wait! Those things are expensive! Yup.

But what if I need to queue more stuff than that? Then don't store it on SSD: move it elsewhere. Sendmail has a "fallback host" feature: if delivery fails on the first attempt, forward the message to the fallback. The fallback has a good disk system for delivery queue storage, but doesn't need SSD.
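For the curious, a minimal sendmail.mc sketch of the two features I just mentioned. The path and hostname are placeholders, not a recommendation; check the cf/README shipped with your 8.11/8.12 release for the exact semantics:

```m4
dnl Spread the queue across multiple directories; the q* directories
dnl (q1, q2, ...) must already exist under /var/spool/mqueue.
define(`QUEUE_DIR', `/var/spool/mqueue/q*')dnl
dnl Hand messages that fail on the first delivery attempt to the
dnl fallback host with the big conventional-disk queue.
define(`confFALLBACK_MX', `fallback.example.com')dnl
```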
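A quick sanity check on that record, using only the figures quoted above:

```python
# Back-of-envelope check on the May 2001 delivery-record numbers.
msgs_per_hour = 2.98e6      # 2.98 million messages/hour
avg_msg_bytes = 4.9 * 1024  # 4.9KB average message size

msgs_per_sec = msgs_per_hour / 3600
bytes_per_sec = msgs_per_sec * avg_msg_bytes

print(f"{msgs_per_sec:.0f} msgs/sec")         # -> 828 msgs/sec
print(f"{bytes_per_sec / 2**20:.1f} MB/sec")  # -> 4.0 MB/sec
```

Note what that says: even at record pace, the raw data rate is a trivial ~4MB/sec. The bottleneck is the per-message queue-file and metadata operations, not bandwidth.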
You *know* (or *hope*) most of your messages are delivered on the first attempt, and you *know* the small fraction remaining are going to have to wait, so you tune the fallback hardware much differently than your first-attempt server(s). I don't know if other MTAs have the fallback feature, but for high-volume outgoing delivery, it's wonderful.

Gotta run, but, er, my $0.02 on other advice I've read on this list:

* File system choice makes a huge difference. You must decide how important it is to recover queued messages in the event of an OS crash or hardware failure. Friends don't let friends who care about data integrity use ext2fs. (See Dug Song's comments on this in a recent /.-publicized interview.) GFS, spiffy as it is, won't give you the file ops you need for 825 msgs/second. Softupdates (which *are* quite data safe when using fsync(), as paranoid app writers have to be) or Veritas's VxFS, plus SSD or RAID designed and tuned for random disk I/O, are quite marvelous.

* RAID: You're I/O bound by random disk activity, reads & writes, not bulk data throughput. If you have a 500GB RAID array for queueing, you have too much disk space, but you can't help that because you can't buy 2GB disks anymore. Striping spreads the seek activity the most. Mirror on top of that if you care about crash recovery. If you use parity, you deserve what you get. If you stripe that array over 8 60GB drives, what a deal. But (I'm exaggerating here to make a point!) you're better off striping it across 60 8GB drives.

* 15K RPM drives won't help nearly as much as lots of slower-spinning drives will, and you won't have to worry about your machine room catching fire.

* Your drives shouldn't be IDE drives, unless you want to deserve what you get.

* Most people don't pay attention to their SMTP server's DNS servers. Silly people! How on earth do you expect to push huge volumes of email without being able to resolve high volumes of DNS records?
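(The striping arithmetic a couple of bullets up, sketched with an assumed ~100 random IOPS per spindle; that per-drive figure is a made-up ballpark, not a measurement:)

```python
# Aggregate random-I/O capacity scales with spindle count, not with
# total gigabytes. ~100 seeks/sec per drive is an assumed ballpark.
IOPS_PER_DRIVE = 100

few_big = 8 * IOPS_PER_DRIVE      # 8 x 60GB drives  -> 800 random ops/sec
many_small = 60 * IOPS_PER_DRIVE  # 60 x 8GB drives  -> 6000 random ops/sec

print(few_big, many_small)  # -> 800 6000
```

Same 480GB either way, but the 60-spindle array does 7.5x the seeks per second — which is also why a pile of slower drives beats a handful of 15K RPM ones.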
And cache that info damn well & quickly, despite whatever efforts your MTA makes? Silly, silly....

* Don't bother configuring your MTA (or the servers it runs on) to use or provide IDENT protocol services (RFC 1413).

* Mosix's process migration probably won't help, because most MTAs fork processes often and those processes are short-lived. The resources & time you spend migrating a process can be much more than it's worth.

* Most file systems have some sort of structure similar to FFS's "cylinder group". Configure the stripe width on your RAID subsystem such that all cylinder groups fall across *all* drives. It's amazing how often sysadmins screw this up. "Hey, Dave, why does the blinkenlight on that one RAID member blink so much more often than any of the others?"

* Your boxes have room for more RAM? And you haven't bought more yet? What are you thinking? Each forked process, open file descriptor, buffered disk page, socket, pipe, DNS cache entry, ad infinitum takes space, right?

-Scott

------- End of Forwarded Message