On Tue, 14 Mar 1995, Thomas Leavitt wrote:
> One of our users, who maintains a relatively large mailing list, has
> had his subscriber lists truncated by Majordomo at least twice. Needless to
> say, this is quite distressing.
>
> I have been told by another list-owner at Netcom that this has happened
> to her a number of times as well... from this, I suspect this phenomenon
> is likely to occur when the load on the server the list is running
> on peaks, since Netcom machines are chronically overloaded, and the
> truncations on our site have occurred when we've been dealing with rogue processes
> spamming the system.
>
> We are working on a custom solution to this problem, but I thought
> I'd raise the issue here, and see if anyone else had experienced it,
> or has identified the cause (we added a number of error checks after
> the first occurrence, but these seem to have been ignored when the
> phenomenon repeated itself).

This has happened once before with HotWired's "HotFlash" list - it's quite
distressing to wake up one day and find a 40,000 member list cut in half! We
were able to rebuild it from backups and by reprocessing the majordomo log,
but it definitely brings up the point that majordomo just isn't designed
to deal with lists of more than a couple of thousand people. Everything
takes so long that you end up with conflicting processes, system hangs
because there are 12 concurrent processes all competing to edit the same
file, and a sysadmin who is forced to kill processes when they are causing
the machine to swap.

Subscribes and unsubscribes take the longest. Part of the culprit is the
pattern matching, which uses a rather complex
regular expression. If you turn off mungedomain and turn on strip, it
helps, but I had to hack the code to get subscription time down to a
reasonable level (on a 486/50 running BSDI and acting as the main mail
server, from 3 minutes to 20 seconds). However, unsubscriptions take a
longer time than subscriptions because unsubscriptions require rewriting
the whole file for *each* unsubscribe request, instead of just appending
to the file.
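
To make the cost difference concrete, here is a minimal sketch in Python (not majordomo's actual Perl code; the function names and the one-address-per-line file layout are assumptions for illustration). Subscribing appends a line, while unsubscribing has to copy every remaining line into a new file:

```python
import os
import tempfile

def subscribe(list_path, address):
    """Append one address; the cost does not depend on the list size."""
    with open(list_path, "a") as f:
        f.write(address + "\n")

def unsubscribe(list_path, address):
    """Remove one address by copying every other line to a new file.

    The cost grows with the size of the list and is paid for each request.
    Returns True if the address was found and removed.
    """
    target = address.strip().lower()
    removed = False
    directory = os.path.dirname(os.path.abspath(list_path))
    # Hypothetical choice: write a temp file next to the list, then swap it in.
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "w") as out, open(list_path) as src:
        for line in src:
            if line.strip().lower() == target:
                removed = True
                continue
            out.write(line)
    os.replace(tmp_path, list_path)  # rename over the old list in one step
    return removed
```

Writing to a temporary file and renaming it into place is one way to avoid leaving a half-written list behind if the process is killed mid-rewrite; whether that matches the cause of the truncations discussed here is not established.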

Anyways, with some modifications we've been able to support 45,000 people
but we're looking for something that can handle 4 to 10 times that, and
even with majordomo running on its own Pentium 90 or Sparc 10 I don't
think it's designed to handle that. So, we're about to start
modifications to majordomo along the lines of the following:

Instead of modifying one huge list, majordomo splits the main list into a
configurable number of sublists based on the domain name of the address.
I.e., for hotflash, in the lists directory there'd be a hotflash directory
with files in it like "edu", "com", "uk", etc - thus, when majordomo has to
remove a .uk address it only has to modify the subfile "hotflash/uk". This
will also solve the locking problem: right now we'll have 5 or 6 majordomos
competing for a lock on the same large file, whereas with sublists they
don't necessarily have to compete. Since .edu and .com can be huge, one
could allow "y.edu" to hold all the sites which end with "y.edu", etc.
Notice that mungedomain could still work with this if one wanted. The
heuristic would be defined by a configuration option in hotflash.config,
something like

    Segments = edu,n.com,com,uk

Notice that all *n.com hosts would go into "n.com", and all other .com hosts
into "com". When the heuristic changes majordomo would have to be told
somehow.
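
A rough sketch of how the segment lookup could work, in Python purely for illustration (the function names, the "other" fallback file for domains that match no segment, and the rule that a dotless segment like "com" only matches a whole final label are all assumptions, not part of the proposal above). The longest matching segment wins, so "n.com" is tried before "com":

```python
import os

def parse_segments(value):
    """Parse the right-hand side of a 'Segments = ...' option."""
    return [s.strip().lower() for s in value.split(",") if s.strip()]

def segment_for(address, segments, default="other"):
    """Pick the sublist name for an address; longest matching segment wins."""
    domain = address.rsplit("@", 1)[-1].strip().lower().rstrip(".")
    for seg in sorted(segments, key=len, reverse=True):
        if "." in seg:
            # "n.com" takes every host whose name ends in "n.com"
            if domain.endswith(seg):
                return seg
        elif domain == seg or domain.endswith("." + seg):
            # "com" takes hosts whose last label is exactly "com"
            return seg
    return default  # assumed catch-all file for unmatched domains

def sublist_path(lists_dir, list_name, address, segments):
    """Path of the subfile holding this address, e.g. hotflash/uk."""
    return os.path.join(lists_dir, list_name, segment_for(address, segments))
```

With the example option above, an address at sun.com would land in "n.com", one at wired.com in "com", and one at ox.ac.uk in "uk". An unsubscribe then only needs to lock and rewrite that one subfile.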

We do essentially this very thing with the hotflash list to speed up
sendmail delivery - when majordomo hands the message off to sendmail,
/etc/aliases is set up such that 39 parallel sendmail processes are fired
up on subsections of the list, organized by domain since most sites can
accept mail for all their users in one transaction. It'd be nice to get
the same benefit for the subscription/unsubscription action as well.
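
The grouping idea can be illustrated with a short Python sketch (this is not the actual /etc/aliases setup; the function name and the greedy balancing rule are assumptions). Addresses are grouped by domain, and each domain is kept whole inside one chunk so a site still receives its mail in a single transaction:

```python
from collections import defaultdict

def split_by_domain(addresses, n_chunks):
    """Split addresses into n_chunks lists without splitting any domain."""
    by_domain = defaultdict(list)
    for addr in addresses:
        by_domain[addr.rsplit("@", 1)[-1].lower()].append(addr)
    chunks = [[] for _ in range(n_chunks)]
    # Largest domains first, each one into the currently smallest chunk.
    for domain in sorted(by_domain, key=lambda d: (-len(by_domain[d]), d)):
        min(chunks, key=len).extend(by_domain[domain])
    return chunks
```

Each resulting chunk could then be handed to its own delivery process; with n_chunks set to 39 this would mirror the number of parallel sendmail processes mentioned above.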

Now here's the kicker - we'd *love* to contribute this back to the public
domain to be integrated into the next release of majordomo, but it's not
clear what the majordomo code maintainers' priority is for this type of
thing. It doesn't make much sense for us to release it unless it can
make it in, since the benefit we get by releasing it is that it'll be
used by others and checked for bugs, security holes, etc. If we were to
release it as a separate patch, we'd have to both 1) modify it each time
a new release of majordomo occurred to make sure the patches fit in
cleanly and 2) support it ourselves instead of letting the community
support it. We've made enough other mods ourselves that we've stayed at
1.92 (implementing the security patches by hand) rather than move to
1.93, but I don't want to be doing that forever to keep up with new
functionality in new majordomo releases.

Is there a consensus that this is something the majordomo community could
use? Would it be possible to work on a pre-release of 1.94 so that our
patches can be more in sync with the current efforts? We've done enough
smaller hacks (which I'll outline in separate patches) that to get up to
1.94 we'll have to spend some time, but I think it'll be worth it. Those
other hacks include things like
1) making INDEX and GET not-list-related (yes, turning it into a file
server - make that an option in the config at least! :)
2) allowing moderators to put Approved: headers at the top of the message
body instead of in the mail headers, for those with mailers that can't edit
headers
3) having majordomo just die when accessed from particular hosts -
majordomo wages war on other autoreply daemons every now and then (vacation
programs that don't do the right thing, i.e. send only one reply a day,
etc), so being able to look in a list and find regular expressions to
match the address against is very useful
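
As an illustration of hack 3, here is a minimal Python sketch of checking a sender against a list of regular expressions (the file format of one pattern per line with "#" comments, and the function names, are assumptions; the real hack lives in majordomo's Perl code):

```python
import re

def load_patterns(lines):
    """Compile one case-insensitive regex per non-blank, non-comment line."""
    patterns = []
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        patterns.append(re.compile(line, re.IGNORECASE))
    return patterns

def is_blocked(sender, patterns):
    """True if the sender address matches any pattern in the block list."""
    return any(p.search(sender) for p in patterns)
```

A request whose sender is blocked would simply be dropped without a reply, which is what breaks the loop with the other autoreply daemon.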

Again, these are things we're willing to contribute to the majordomo
distribution. Finally, we're working on a script to automate processing
of mail to list-owner, recognizing bouncing addresses and removing them
if they bounce too much or are completely wrong, which we think will be
able to eliminate about 90% of our work in that area (which now takes
about 10 man-hours per week, as roughly 1% of mailing list addresses go
bad per week - this is true for all lists I run).
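
The removal rule for such a script might look like the following Python sketch (the threshold of 3, the function name, and the idea that bounces have already been parsed into (address, permanent) pairs are all assumptions; parsing real bounce messages is the hard part and is not shown):

```python
from collections import Counter

def addresses_to_remove(bounces, max_bounces=3):
    """Decide which addresses to drop from the list.

    bounces: iterable of (address, permanent) pairs, where permanent is True
    for an address that is completely wrong (e.g. unknown user).
    """
    counts = Counter()
    remove = set()
    for address, permanent in bounces:
        addr = address.strip().lower()
        if permanent:
            remove.add(addr)          # completely wrong: remove right away
        else:
            counts[addr] += 1
            if counts[addr] >= max_bounces:
                remove.add(addr)      # bounced too often
    return remove
```

For scale, 1% per week of a 45,000 address list is about 450 addresses going bad each week.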

What do people think?

Brian