Arnold de Leon <arnold@Synopsys.COM> writes:
# The problem is with the way majordomo does locking.
# If you have a large number of requests come together the
# machine ends up thrashing.
# Here are the changes I made to the locking to make it friendlier
# to the machine.
This is pretty deep in the guts of Majordomo, so I've redirected
this reply to Majordomo-Workers.
Clearly something needs to be done about locking and thrashing
in Majordomo, but I'm not convinced that this change (exponential
backoff) won't cause more problems than it solves.
On my machine, each running Majordomo seems to account for about 8 MB
of swap space, between the sendmail process and the perl process. Keep
a few of these waiting around, and you're talking about a _lot_ of swap
space tied up.
On the other hand, I currently have to be careful not to issue more than
2 or 3 simultaneous "approve" commands, or the load on the machine goes
through the roof and it starts thrashing anyway.
The current locking mechanism doesn't even begin to preserve ordering
of near-simultaneous requests. An exponential backoff is going to make
the situation even worse for processes that are waiting for locks; they'll
try less often for the lock, and end up waiting even longer.
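To make the waiting-time concern concrete, here is a minimal sketch that compares retry schedules for a fixed polling interval and for exponential backoff. The numbers (1-second base delay, doubling factor, 60-second timeout, lock released at t=16) are illustrative assumptions, not values taken from Majordomo, and `retry_times` and `first_attempt_at_or_after` are hypothetical helpers.

```python
def retry_times(base, factor, limit):
    """Return the times (seconds since start) at which a waiter retries the lock.

    base, factor and limit are illustrative parameters, not Majordomo settings.
    """
    times = []
    t = 0
    delay = base
    while t + delay <= limit:
        t += delay
        times.append(t)
        delay *= factor
    return times


def first_attempt_at_or_after(times, release):
    """Return the first retry time at or after the moment the lock is released."""
    for t in times:
        if t >= release:
            return t
    return None  # the waiter timed out before it could notice the release


fixed = retry_times(base=1, factor=1, limit=60)    # poll every second
backoff = retry_times(base=1, factor=2, limit=60)  # 1, 2, 4, 8, ... second gaps

release = 16  # assumed moment at which the current holder drops the lock
fixed_wait = first_attempt_at_or_after(fixed, release)
backoff_wait = first_attempt_at_or_after(backoff, release)
print(len(fixed), len(backoff), fixed_wait, backoff_wait)
```

Under these assumptions the backoff waiter makes far fewer attempts (less load on the machine), but it does not notice the released lock until its next scheduled retry, well after the fixed-interval waiter would have picked it up.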
Essentially, what Majordomo has now is a spin lock. A better system would
be either a wait lock, or some sort of queuing mechanism. A wait lock
would not solve the swap space problem mentioned above, and you'd have to
be careful of deadlock. Queuing would have to be done by Majordomo, not
by the mail system, if you want to preserve ordering, and that adds a fair
bit of complexity to Majordomo. On the other hand, I've long wished for
a generalized queuing package written in perl, for a variety of different
uses; anybody want to take up the challenge of writing such a package (or
integrating an existing one) for Majordomo?
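For comparison with a spin lock, here is a minimal sketch of a wait lock, in which the process blocks in the kernel until the lock is free instead of polling. It uses `flock` from Python's standard `fcntl` module on a Unix system; this is a swapped-in illustration, not what Majordomo does (Majordomo uses shlock), and `wait_lock` and the lock path are hypothetical names. Note that `flock` makes no promise about the order in which waiters are served, so this does not address the ordering problem, and each blocked process still holds its memory while it waits.

```python
import fcntl
import os
import tempfile
from contextlib import contextmanager


@contextmanager
def wait_lock(path):
    """Hold an exclusive lock on path, sleeping (not spinning) until it is free."""
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o600)
    try:
        # Blocks until no other open file description holds the lock.
        fcntl.flock(fd, fcntl.LOCK_EX)
        try:
            yield fd
        finally:
            fcntl.flock(fd, fcntl.LOCK_UN)
    finally:
        os.close(fd)


# Usage: the lock file lives in a scratch directory for this example.
lock_path = os.path.join(tempfile.mkdtemp(), "list.lock")
with wait_lock(lock_path):
    result = "held"  # critical section, e.g. updating a list file
```

Taking more than one such lock at a time is where the deadlock risk comes in; acquiring locks in a fixed order is the usual way to avoid it.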
Another change that needs to be made to the locking is what happens
with permission problems. If it's a permission problem that's causing
shlock to fail, it should say so and abort, rather than trying until it
times out and then simply reporting "lock failure".
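The distinction between "lock is busy" and "lock can never be taken" can be sketched as follows. This is not shlock or Majordomo's code; it is a minimal illustration that creates the lock file with `O_CREAT | O_EXCL` and treats the two failure modes differently. `acquire`, `LockPermissionError`, and the retry parameters are hypothetical names and values.

```python
import os
import tempfile
import time


class LockPermissionError(Exception):
    """Raised when the lock file can never be created, so retrying is pointless."""


def acquire(path, attempts=5, delay=0.01):
    """Try to create the lock file; return True on success, False on timeout."""
    for _ in range(attempts):
        try:
            fd = os.open(path, os.O_CREAT | os.O_EXCL | os.O_WRONLY, 0o600)
        except FileExistsError:
            # Someone else holds the lock: this is ordinary contention, so retry.
            time.sleep(delay)
            continue
        except PermissionError as exc:
            # Retrying cannot help here: report the real cause and abort at once.
            raise LockPermissionError(
                f"cannot create lock {path}: {exc.strerror}"
            ) from exc
        os.write(fd, str(os.getpid()).encode())
        os.close(fd)
        return True
    return False


# Usage: the second attempt times out because the first still holds the lock.
lock = os.path.join(tempfile.mkdtemp(), "list.lock")
first = acquire(lock)
second = acquire(lock, attempts=2)
os.unlink(lock)
```

With this split, a timeout still means plain contention, while a permission problem surfaces immediately with its own message instead of a generic "lock failure".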
Brent Chapman | Great Circle Associates | Call or email for info about
Brent@GreatCircle.COM | 1057 West Dana Street | upcoming Internet Security
+1 415 962 0841 | Mountain View, CA 94041 | Firewalls Tutorial dates