Just getting in the middle of this. A user asked me a similar question and I
cringed at the performance problems.
However... if the mail came in to majordomo with multiple addresses that were
lists on the same majordomo server, then you could do "union" processing of
the lists. That should overcome most of the performance issues.
But then... you will have to be careful about lists with vastly differing
configuration parameters. For example, if mail was sent to both "fvwm" and
"fvwm-workers", and everyone in "fvwm-workers" was also in "fvwm", and
"fvwm-workers" was archived or digested, would those messages be archived or
digested once or twice?
There is also the end-user case where filtering by list name is used for
archival or warning purposes. If mail were sent to both "staff" and
"staff-pager", would one not get the page?
But these could probably be worked around, and it would make more users happy.
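Just to make the union idea concrete, here's a rough sketch (in Python, not majordomo's actual Perl internals; the list names and data layout are made up for illustration):

```python
# Sketch of "union" delivery: when a message is addressed to several
# lists on the same server, merge their subscriber sets so each person
# gets exactly one copy.  Hypothetical data layout, not majordomo code.

def union_recipients(lists):
    """Return the deduplicated union of several lists' subscribers."""
    seen = set()
    for subscribers in lists.values():
        for addr in subscribers:
            seen.add(addr.lower())  # compare addresses case-insensitively
    return sorted(seen)

lists = {
    "fvwm": ["alice@example.org", "bob@example.org"],
    "fvwm-workers": ["bob@example.org", "carol@example.org"],
}
print(union_recipients(lists))
```

The per-list configuration problem shows up exactly here: once the sets are merged, there's no longer one obvious list whose archive/digest settings apply to the copy bob receives.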
My 2 cents....
On 11 Feb 1998, Jason L Tibbitts III wrote:
: Much technical rambling follows:
: >>>>> "CL" == Christopher Lindsey <firstname.lastname@example.org> writes:
: CL> What about using a combination of the Message-Id: and username?
: That's a database search per address per list per message, plus a database
: write per address per list per non-duplicate message. You could do it, but
: I'm agonizing over speed as it is. I'd say you'd increase delivery times
: by an order of magnitude, probably worse for large lists. (And this global
: per-site database would end up being huge.)
: Now a combination of both methods using the already-present Message-ID
: caches could give the best of all of them, but I can see a bad race
: condition and it would still be dog slow to compute the list differences.
: The way to do it, then, would be to have a per-list variable containing
: other lists to avoid duplicates with. (This is an optimization attempt.)
: Then when a message is received, you snoop the message-id caches of the
: other lists to find out which lists' addresses you exclude. (Therein lies
: the race condition; note that the most common case will have all of the
: different messages coming in at once. I see no way to avoid it without
: some kind of semi-global locking mechanism, which is doable but serializes
: some things and thus eats into performance even more.)
: Now you have two paths, one of which is easy (code-wise) and sucks lots of
: RAM, the other painfully slow. You can expand all of the exclude lists
: into one big address list and pass that into the delivery engine (which
: conveniently already supports exclude lists). Everything else happens
: automatically. This requires that the exclude list sit around in RAM. It
: _could_ be internally optimized to just a hash lookup per address during
: delivery, at the expense of more memory. How big are your lists? It's
: probably not bad for lists under 10000 or so members. There are additional
: (serious) optimizations available if the lists are sorted in some manner.
: The other path is to check for membership in the exclude lists while you're
: building the RCPT batches inside the delivery engine. This would be slow
: but doesn't keep the exclude list in memory.
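(A rough sketch of that second path, again hypothetical Python rather than the real delivery engine: the exclude lists are consulted while each RCPT batch is built, so nothing extra stays in memory, at the cost of re-scanning the exclude lists per recipient:)

```python
# Sketch of the slow path: test membership against each exclude list
# while building RCPT batches, instead of holding one merged set.
# Worst case, every exclude list is re-scanned for every recipient.

def build_rcpt_batches(recipients, exclude_lists, batch_size=100):
    """Group non-excluded recipients into batches for SMTP RCPT commands."""
    batches, current = [], []
    for addr in recipients:
        # the slow part: a linear scan of each exclude list in turn
        if any(addr.lower() in (e.lower() for e in lst)
               for lst in exclude_lists):
            continue
        current.append(addr)
        if len(current) == batch_size:
            batches.append(current)
            current = []
    if current:
        batches.append(current)
    return batches
```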
: If you're really interested in taking a stab at this, I'll be happy to
: point you straight into the code. I have no doubts that most of this isn't
: even all that difficult; I just don't think it can be done with reasonable
: speed. I could be wrong.
: BTW, how does the other software that you think will do it do it?
: Also, sorry about the flu.
: - J<