Archives: February 2007

Getting servers in line

I spent a lovely weekend morning setting up monitoring on servers - yes, what fun :-)

I like all my servers to run logcheck, smartmontools, sysstat, and lm-sensors.

logcheck means watching your email every hour, and adding in yet more ignore rules for things your server thinks it’s perfectly OK to spit out.
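
For illustration, a local ignore rule might look something like the one below. The file name local-smartd is made up, and the pattern assumes the usual syslog prefix plus the message smartd logs when a drive's temperature attribute changes; check it against the actual lines in your own reports before relying on it.

/etc/logcheck/ignore.d.server/local-smartd:

```
# Hypothetical rule: silence smartd's routine temperature-change notices.
# Assumes the standard syslog timestamp/host prefix and attribute 194.
^\w{3} [ :0-9]{11} [._[:alnum:]-]+ smartd\[[0-9]+\]: Device: /dev/[hs]d[a-z], SMART Usage Attribute: 194 Temperature_Celsius changed from [0-9]+ to [0-9]+$
```

Each line is an extended regular expression matched against a whole log line, so anything it does not match still lands in the hourly mail.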

smartmontools means waiting to see which attributes it’s going to complain about, making sure it’s set up to mail you about bad sectors, and fitting all of this inside the 128-character line-length limit.
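
As a rough sketch of what such a line can look like (the device name and mail address are assumptions, not my actual config):

/etc/smartd.conf:

```
# /dev/sda is an assumed device name; root@localhost is a placeholder address.
# -a      monitor everything, including pending/uncorrectable sector counts
# -I 194  don't report every change in attribute 194 (temperature)
# -m      mail warnings to this address
/dev/sda -a -I 194 -m root@localhost
```

Further -I options can be added for whichever attributes turn out to be noisy on a particular drive.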

And lm-sensors, well, that takes a lot of tweaking: getting all the alarms to stop ringing, labelling the right temperatures, and ignoring the disconnected pins.
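
The labelling and ignoring part looks roughly like this. The chip name and the feature names are assumptions for a hypothetical board; use whatever the sensors command reports on your own hardware.

/etc/sensors.conf:

```
# Hypothetical chip and feature names, for illustration only.
chip "w83627hf-*"
    # Give the temperature inputs meaningful names
    label temp1 "Motherboard"
    label temp2 "CPU"
    # Hide inputs that aren't wired to anything on this board
    ignore in5
    ignore fan3
```

Limits that trigger false alarms can be adjusted with set statements, though the feature names they take depend on the lm-sensors version.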

Ugh, it’s painful work, but it helps in the long run…

Postfix + SMTP-AUTH

I finally found a good blog post on the subject of getting Postfix to do SMTP-AUTH via SASL.

I went one step further, and instead of moving /var/run/saslauthd/ to the Postfix chroot, I did a bind mount:

/etc/fstab:

/var/run/saslauthd /var/spool/postfix/var/run/saslauthd none bind 0     0

Postfix was announcing methods like CRAM-MD5 which can’t be supported by the PAM backend, so I restricted them down to PLAIN and LOGIN (over TLS only, obviously):

/etc/postfix/sasl/smtpd.conf:

pwcheck_method: saslauthd
mech_list: plain login
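
For completeness, the Postfix side needs SASL switched on and the “TLS only” restriction enforced. A minimal sketch of the relevant lines, assuming the TLS certificates are already configured elsewhere in the file (these are standard Postfix parameters, not a copy of my config):

/etc/postfix/main.cf:

```
# Enable SASL authentication in the SMTP server
smtpd_sasl_auth_enable = yes
# Only offer AUTH once the session is encrypted
smtpd_tls_auth_only = yes
# Refuse anonymous logins (this is also the default)
smtpd_sasl_security_options = noanonymous
```

With smtpd_tls_auth_only set, PLAIN and LOGIN passwords never cross the wire in the clear.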

Now, it’s working nicely, and I can IMAPS and SMTP-AUTH-TLS to my mail server from anywhere.

Shared #clug server?

tea and tumbleweed were chatting one evening about having a shared dedicated server in London for #clug (or clug in general) members to use as MX, shell box, etc.

Basically, Xen would be out of the question because cheap boxes don’t have enough RAM, so the box would have to be shared. This would mean agreeing on a common MTA (HORRORS!), and sharing root between a group of people… :-)

Some options:

tumbleweed has a “Value” server with uk2.net, whose lease expires on 22 Feb. If 11 other people were interested, we could renew it, for the cost of about R600 pa each. This would save on setup costs.

Anyone keen? We have a wiki page on the subject, add yourself.

Horrific performance with 3ware RAID

I’ve been enjoying our server at UK2.net. It’s a pretty speedy machine (although a little light on RAM - I suspect that they don’t want people running Xen), and it’s connected to a fat pipe. But I’ve been experiencing a lot of bad lockups.

I traced the problem to postmapping the uceprotect.net RBL file. They recommend that you rsync this file from them and then postmap it into a fast lookup database for Postfix, rather than using their DNSRBL service. But running the postmap was taking my box 40 mins. The same operation on a loaded, lower-spec, 2-year-old server took 2 mins (yes, this server also has RAID1 on the volume concerned). On my UK2 box, while the postmap was running, the machine became totally unresponsive, and it could take a minute or two to log in, serve a web page, or even execute a basic command like ps.

Clearly something wasn’t right, and it was something in the IO system. The only suspect left was the 3ware RAID controller (it’s an 8006-2, doing RAID-1). I know these controllers have a big buffer, so I looked on the 3ware website for tuning guidance. I followed it to the letter, and things didn’t really improve. I tried the deadline scheduler, and tweaking the buffers, but it only got marginally better.

Personally, I’ve always used software RAID, even for RAID-5, and I’ve never had bad performance like that. And having the RAID in a portable format has really helped with recovery in the past. I understand that Windows monkeys have to use hardware RAID (because their software RAID sucks so much), but is this kind of performance normal?

I’ve asked UK2 to chuck my controller and give me software RAID :-)

Update

I’ve now got software RAID 1, and postmap runs in 25 seconds. Down from 40 mins, that’s what I call a nearly 100x speed improvement :-)

Oh, and the system is totally responsive while the postmap runs.