I recently spent a good bit of time investigating a mystery that I thought
might be of interest to LULA & majordomo folks. I am running a number of
majordomo lists off my Red Hat Linux 5.2 machine. This weekend, I sent mail
to one of them and noticed that it was never delivered. I examined the logs
and determined that, for some reason, the list mail was being queued instead
of processed. Other mail was going through fine.
What was happening?
After snooping, I found that queued mail typically goes into the directory
/var/spool/mqueue. Inside this directory were a number of files that I easily
verified were my queued mail. I won't go into the details here, but
there are two files per queued message (a control file and the data or message
file).
Referencing my nutshell Sendmail book (thank god I was smart enough to buy
this monster), I determined that sendmail could be run in immediate
queue processing mode, with verbose output on, with the command:
sendmail -q -v
This command produced no output, and the queued mail was not sent. I
explored all the logs (majordomo, /var/log/maillog) and couldn't find
anything helpful. Finally, I turned to sendmail's debugging mode, which took
a number of hops around the Sendmail book.
After a bit of investigation I began to suspect that this had something to
do with the load on my mail server. I eventually determined by running
sendmail -v -q -d3.1 that load average support was built into my sendmail,
as the diagnostic returned:
getla(): 8.03. The 8.03 represented the average load calculation for my
machine.
sendmail -v -q -d3.30 ("Show result of decision to queue") turned out to be
the diagnostic that helped. What I learned is that sendmail has built-in
queueing logic that checks the machine load average against predefined
limits. If the load average reaches or exceeds the predefined load limit
(QueueLA, default 8), then sendmail goes into a secondary "queue factor"
calculation. A message is delivered right away only if its priority number
is no higher than the result of this calculation, which is derived from the
QueueFactor default of 600000.
Running the -d3.30 debugger should return something like:
getla(): 8.08
shouldqueue: CurrentLA=8, pri=30000: FALSE (CurrentLA < QueueLA)
As you can see, my load average had crept up past the default of 8. The key
was that sendmail was no longer reporting the line FALSE (CurrentLA <
QueueLA). Instead it was showing me details on the queued mail, including
priority (pri) numbers in the 800k range and the diagnostic "TRUE (by
calculation)".
Section 34.8.49 of the Sendmail book explains that when the machine load
average reaches the configured QueueLA (x) threshold, a message is queued if:
msgpri > q / (la - x + 1)
where q is the QueueFactor and la is the current load average. I won't go
further into this calculation, but it turns out that my bulk mail had a
higher msgpri (a larger number means a lower priority) than the result of
the calculation. Thus, regular mail with normal priority was being
processed, but my mailing list messages were not.
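To make the rule above concrete, here is a rough Python sketch of the
decision as I understand it from the book. It is an illustration only, not
sendmail's actual code: the function name should_queue is made up, the
defaults of 8 and 600000 are the ones quoted above, and any other checks
sendmail may make are left out.

```python
def should_queue(msgpri, load_avg, queue_la=8, queue_factor=600000):
    """Return True if a message would be queued instead of delivered.

    Hypothetical helper that mirrors the rule msgpri > q / (la - x + 1).
    """
    # Below the QueueLA threshold nothing is held back.
    if load_avg < queue_la:
        return False
    # At or above the threshold, the allowed priority number shrinks
    # as the load average climbs.
    threshold = queue_factor / (load_avg - queue_la + 1)
    return msgpri > threshold


# A pri of 30000 (as in the sample output) still goes out at load 8.
print(should_queue(30000, 8))
# A pri in the 800k range is held back at load 8 with the defaults.
print(should_queue(800000, 8))
# Raising QueueLA to 30 lets the same message through.
print(should_queue(800000, 8, queue_la=30))
```

With the defaults, a load average of 8 gives a threshold of 600000, so
anything with a pri above that sits in the queue until the load drops or
QueueLA is raised.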
This was easily fixed by going to the /etc/sendmail.cf file and locating the
QueueLA config item, UNCOMMENTING it (that little oversight cost me some
time), and upping the number. I pushed mine up to 30.
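Since the commented-out line is what tripped me up, here is a small Python
sketch that reports whether QueueLA is actually active in a config file. It
assumes the option appears as a line of the form "O QueueLA=8" (or
"#O QueueLA=8" when commented out); the helper name find_queue_la is made
up, and your sendmail.cf may be laid out differently.

```python
import re

# Matches "O QueueLA=8" and the commented-out "#O QueueLA=8" (assumed format).
QUEUE_LA_LINE = re.compile(r"^(#?)\s*O\s+QueueLA\s*=\s*(\d+)")


def find_queue_la(cf_text):
    """Return (value, active) for the first QueueLA line, or None if absent.

    active is False when the line is commented out with a leading '#'.
    """
    for line in cf_text.splitlines():
        match = QUEUE_LA_LINE.match(line)
        if match:
            return int(match.group(2)), match.group(1) == ""
    return None


if __name__ == "__main__":
    # Sample text only; to check a real file, read /etc/sendmail.cf instead.
    sample = "# load average at which we just queue messages\n#O QueueLA=8\n"
    print(find_queue_la(sample))
    print(find_queue_la("O QueueLA=30\n"))
```

If the second field comes back False, the setting is still commented out
and sendmail is using its built-in default no matter what number is on the
line.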
I restarted sendmail (kill -HUP sendmail_process#) and all was again well
with my system. I was surprised to find that there was no FAQ item on this
in either the sendmail or majordomo FAQs, so I decided to document this
journey in hopes it might save others some time.