Great Circle Associates Majordomo-Users
(January 1999)
 


Subject: Re: Capacity planning
From: Chuq Von Rospach <chuqui @ plaidworks . com>
Date: Wed, 27 Jan 1999 21:26:44 -0800
To: "Robert A. Hayden" <rhayden @ mr . net>, Tim Daigle <Tim_Daigle @ cw . com>
Cc: Majordomo-Users @ GreatCircle . COM
In-reply-to: <Pine.LNX.3.95.990127083018.4920B-100000@geek.net>
References: <85256705.006FA350.00@cwexternal.cw.com>

At 8:36 AM -0600 1/27/99, Robert A. Hayden wrote:
>> Could it support 500,000 mail messages a day?
>> What is the upper limit roughly?
>
> It's tough to say because it depends on what your demographics are.

It depends on a number of things. From what I can see, network speed 
is the key, followed by memory, processor speed, and disk throughput.

It also depends on your message load, what you're distributing and 
how it's distributed. 500K messages isn't a tough thing in some 
circumstances, really tough in others. It all depends on what you're 
doing.

Majordomo has huge capacities, once you figure out how to tweak it.

A few hints, given what I understand of this.

First, what Robert says is very true -- lots of memory will help, but 
only to the point that you stuff your pipe to the internet full. That 
needs to be monitored as well, and it *is* very possible to stuff a 
pipe during peak loads and see performance go to heck without 
realizing why your performance trails off. Ditto if you overload 
memory -- you start paging or swapping, and the delivery processes 
thrash each other instead of delivering.

What I did was spend some time modelling both network throughput and 
memory usage. A real rough way to judge network throughput is to use 
ping, and ramp up your mail delivery until you see packets starting 
to drop out of the connection. That implies that the pipe is 
overstuffed. This, of course, assumes the pipe is mostly dedicated to 
your usage; true in my case here at Plaidworks, not true at Apple.

Ditto with RAM. Ramp up delivery until you start seeing memory used 
up and paging rates go up. That'll tell you what you can stuff into 
RAM without thrashing.

Whichever of these maxes out first, that's your limit. For mail 
delivery, one of these is very likely going to run out first -- not 
CPU, if you have anything reasonable pushing the bits about. Disk 
matters in other ways, but it's not a big issue for mail 
DELIVERY.
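
A rough sketch of how those ramp-up measurements could be turned into
a number, assuming you've logged ping loss and paging rate at each
concurrency step. The function name, the thresholds, and the sample
figures are all made up for illustration; none of them come from my
actual test runs.

```python
def delivery_limit(samples, max_loss_pct, max_page_rate):
    """Return the highest concurrency seen before either signal trips.

    samples: list of (concurrency, ping_loss_pct, page_rate) tuples,
    recorded while ramping delivery up step by step.
    """
    limit = 0
    for concurrency, loss_pct, page_rate in sorted(samples):
        if loss_pct > max_loss_pct or page_rate > max_page_rate:
            break  # the pipe is overstuffed or the box has started paging
        limit = concurrency
    return limit


# Invented measurements: (sendmails running, % ping loss, pages/sec).
samples = [(20, 0.0, 0.0), (40, 0.0, 0.0), (60, 0.0, 3.0), (80, 2.5, 40.0)]
print(delivery_limit(samples, max_loss_pct=1.0, max_page_rate=10.0))
```

What counts as "too much" loss or paging is a judgment call for your
own site; the point is only that the first signal to trip sets the
limit.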

Whatever you do, stay under that value, but stay JUST far enough 
under that value to not cross it. Maximum throughput happens just at 
that level, and you want to stay in that zone as much as possible. 
What I've done is built a custom queueing system for sendmail with a 
perl script that runs once a minute, checks how many sendmails are 
running, and spawns sendmails if it's below my limit. This ramps 
delivery up to max very quickly, and also avoids over-ramping it, 
things that are tough to manage with sendmail's standard systems. On 
my systems (200 MHz PowerPC based Apple Network Server AIX boxes with 
256Megs of RAM), 80 concurrent sendmails is about all she wrote. On 
my new machines, which are Sun Enterprise 250's, the numbers will be 
much higher, but we're just starting to install those beasts.
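
The top-up logic is simple enough to sketch. This is not my perl
script, just a minimal Python illustration of the same idea, and it
makes assumptions: that a POSIX "ps" is available, that counting
every process named sendmail is close enough, and that "sendmail -q"
(a single queue run) is the right thing to spawn on your setup. The
function names are hypothetical.

```python
import subprocess

MAX_SENDMAILS = 80  # ceiling found by ramp-up testing; site-specific


def sendmails_to_spawn(running, limit=MAX_SENDMAILS):
    """How many new queue runners are needed to get back up to the limit."""
    return max(0, limit - running)


def count_running(name="sendmail"):
    """Count processes by command name (assumes a POSIX ps)."""
    out = subprocess.run(["ps", "-e", "-o", "comm="],
                         capture_output=True, text=True, check=True).stdout
    return sum(1 for line in out.splitlines() if line.strip().endswith(name))


def top_up(spawn_command=("sendmail", "-q")):
    """Spawn queue runners until the limit is reached; never go over it."""
    for _ in range(sendmails_to_spawn(count_running())):
        subprocess.Popen(list(spawn_command))

# Call top_up() from cron once a minute.
```

Because it only ever fills the gap between what's running and the
limit, it ramps up within a minute and can't overshoot by more than
whatever started between the count and the spawn.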

Let me take a step back and point out a few things that probably 
aren't obvious unless you've been running large lists for a while.

First, there are three areas where performance issues get critical in 
dealing with large lists of addresses.

1) Delivery -- see above. Also use something like bulk_mailer to 
break up your list into smaller batches, so they parallelize well. 
It makes no sense to build a system for 80 sendmails if you load 
everything into one sendmail batch. That single-threads it again. Ugh.

2) Queueing -- Majordomo has to queue the messages into sendmail. For 
large lists, majordomo->bulk_mailer->sendmail->end_user is how this 
stuff goes (roughly). But if you look at this, you still have a 
single-threading problem here. And sendmail, by default, does 
DNS lookups during queueing (if you want to watch this stuff, track 
down the bulk_mailer process, find the output file it generates in 
/var/spool/mqueue, and tail -f the xf* file...)

	Now, sendmail has an option to defer DNS stuff until delivery. I 
haven't experimented with it yet, but it's on the list to try. But 
with 500,000 addresses, single-threading them INTO the mail queue is 
just as deadly as single-threading them OUT of the mail queue. So 
I've been spending the last month or so implementing, testing, and 
tweaking sublisted systems. This means a "mailing list" is actually 
lots of smaller lists, all linked together. Unfortunately, majordomo 
doesn't support this, so you have to wire your own around the edges. 
The speed differences are amazing, though, as long as you're careful 
about how they're implemented. Basically, if you have a list 
"fred_list", it really feeds into N sub-lists (fred_list-1 through 
fred_list-N), so that when you mail to "fred_list", it really spawns 
N parallel bulk_mailers instead of one, each spewing out parallel 
queues into sendmail for parallel delivery.

	Side note: running a caching-only DNS server costs you ~25 megs 
of RAM, but speeds things up amazingly. I'm going to experiment in a 
few weeks with a dedicated DNS server feeding my list servers, but I 
expect the on-host server will be faster. So build that into your 
expectations and run it, or you'll waste a lot of energy speeding up 
a system that spends most of its time waiting for your DNS. And 
you're likely to overload the DNS servers that are set up to handle 
all your other stuff as well...

3) Admin updates -- As your address list grows, so will your admin 
hassles. There's nothing quite like getting 1000 sub/unsub requests a 
day, where your majordomo is processing them once every ten minutes 
(at best). You get creative, fast. But as your list grows, so does 
the amount of list churn as addresses add, drop and change. Majordomo 
stores all this as flat files, meaning you read/write the file for 
each change. Okay if it's 40K. Not okay if it's five megs.

Sublisting is a key here. First, you can parallelize your admin 
updates to some degree (but it's here that disk IO contention starts 
nuking you -- on my new Sun, we're mirroring and striping the disks, 
and using multiple disk heads and various other speedups), but more 
importantly, where it might take you ten minutes to update a single 
monolithic mj address file, you can make that same update on one of 
the sublists in 15-20 seconds, so even single-threaded your admin 
speeds way up.
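
To put rough numbers on that, here's the arithmetic as a few lines of
Python. It assumes the worst case, where every one of 1000 daily
requests costs a full update of the address file, and it uses the
slow end of the 15-20 second range for sublists. The function name is
just for illustration.

```python
def admin_hours_per_day(requests_per_day, seconds_per_update):
    """Wall-clock hours needed to apply a day's requests one at a time."""
    return requests_per_day * seconds_per_update / 3600


monolithic = admin_hours_per_day(1000, 10 * 60)  # ten minutes per update
sublisted = admin_hours_per_day(1000, 20)        # 20 seconds per update
print(f"monolithic: {monolithic:.1f} h/day, sublisted: {sublisted:.1f} h/day")
```

Under those assumptions the monolithic file needs far more than 24
hours of update time per day, so it can never keep up, while the
sublisted version fits in a day even single-threaded.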

On the other hand, sublisting creates all sorts of horrors on the 
admin side, from how users get spread across the sublists, how you 
avoid duplicates across multiple sublists, and how you can get users 
to unsubscribe from all of this without causing brain cramp or massive 
admin overhead. (hint: "unzubscribe *" ain't it. On my big system, 
that takes over 20 minutes of CPU time to process a single request, 
and about 2.5 hours of real time. Not practical. I've written a perl 
script that simulates this, tracks down the address and rewrites the 
email with the proper user commands and does it in about 2 CPU 
minutes, which shows you how brutally inefficient the unzub* thing 
is...)
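
The idea behind that script can be sketched in a few lines. This is
an illustration in Python, not the perl I actually run: it assumes
the sublist memberships are already loaded into memory as a dict, and
that matching addresses case-insensitively is acceptable. The
function name is hypothetical.

```python
def unsubscribe_commands(address, sublists):
    """Build one unsubscribe command per sublist that holds the address.

    sublists: dict mapping list name -> iterable of subscriber addresses.
    Matching ignores case and surrounding whitespace (an assumption);
    the command uses the address exactly as it is stored on the list.
    """
    wanted = address.strip().lower()
    commands = []
    for name in sorted(sublists):
        for member in sublists[name]:
            if member.strip().lower() == wanted:
                commands.append(f"unsubscribe {name} {member.strip()}")
    return commands


# Example data, invented for illustration.
sublists = {
    "fred_list-1": ["a@example.com"],
    "fred_list-2": ["B@Example.com", "c@example.com"],
}
print(unsubscribe_commands("b@example.com", sublists))
```

The win comes from doing one lookup and then handing majordomo a
targeted command for the one sublist that matters, instead of making
it walk every list.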

Right now, we sub everyone onto a single main list, then when it 
grows large enough, migrate those users to the sublists and de-dupe 
the lists. That seems to minimize the hassles. Each sublist's info is 
customized so that the unzub info is hooked to that specific sublist, 
and that seems to work well so far. There are also panic buttons 
that allow users to interface with that unzub* script.
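
One way the migrate-and-de-dupe step could look, as a hedged sketch.
Assigning each address to a sublist by hashing it is a technique
swapped in here for illustration, not a description of how my own
migration picks sublists, and the function names are hypothetical. It
also ignores anyone already sitting on a sublist.

```python
import zlib


def sublist_for(address, list_name, n_sublists):
    """Pick a sublist name deterministically from the address."""
    key = address.strip().lower().encode("utf-8")
    bucket = zlib.crc32(key) % n_sublists
    return f"{list_name}-{bucket + 1}"


def migrate(main_members, list_name, n_sublists):
    """Spread the main list across sublists, dropping duplicate addresses."""
    sublists = {f"{list_name}-{i}": [] for i in range(1, n_sublists + 1)}
    seen = set()
    for address in main_members:
        key = address.strip().lower()
        if not key or key in seen:
            continue  # blank line or an address we've already placed
        seen.add(key)
        sublists[sublist_for(key, list_name, n_sublists)].append(address.strip())
    return sublists


members = ["a@example.com", "A@example.com ", "b@example.com", "c@example.com"]
result = migrate(members, "fred_list", 3)
```

Because the same address always hashes to the same sublist, a
duplicate can't end up on two sublists, and a lookup for an unsub
request knows which file to open without searching all of them.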

Those are the three areas to watch. The one that's likely to kill you 
is the admin update side. Haven't even mentioned bounce processing 
yet. My stuff is still under development as we're still approaching 
efficient delivery. It's doing pretty well, but I'm honestly 
still working out the kinks. And if all this sounds rather 
complicated, it is -- and I'm leaving stuff out. But these are the 
highlights.

An easier answer, of course, is simply to buy really huge boxes and a 
huge network pipe, but not all of us have unlimited budgets, and that 
only works to a degree: no matter how much computer you buy, if 
you succeed at this stuff, it won't be enough. (On one of my boxes, I 
made guesses about subscribe size for the next calendar year that 
it's beginning to look like I'll hit in February. That is good, sort 
of... grin)




--
Chuq Von Rospach (Hockey fan? <http://www.plaidworks.com/hockey/>)
Apple Mail List Gnome (mailto:chuq@apple.com)
Plaidworks Consulting (mailto:chuqui@plaidworks.com) 
<http://www.plaidworks.com/> + <http://www.lists.apple.com/>

Featuring Winslow Leach at the Piano!

