[ Chan Wilson writes: ]
>
> Well, shoot, it's not like it's a big change to do exponential
> backoff, now folks:
That's not the point. A design decision such as this has far reaching
impact. Unless you plan to delay 1.94 another month while this is
thoroughly tested, it's irresponsible to put it in now.
> Dave> Your method is trying to reduce the system load caused by lock
> Dave> contention by deferring it to near-infinity.
>
> No it 'taint.
Yes, 'tis. :-b :-)
> An exponential backoff, especially one a tad random like above, gives
> the following sequence:
Then you agree with me that a random factor is necessary.
> For the situation John has, and quite likely the majority of Majordomo
> servers out there, the exponential backoff is much better. Especially
> with Linux creaping out into the home and home domains becoming
> prevalent, we may be back to a time when system resources became
> unavailable for a long time.
Even if we accept the assumption that poorly tuned systems are in the
majority and that Mj should target that environment (which I *don't*
accept, BTW), the situation John describes is still the exception rather
than the norm. Even your implementation recognizes the fact that pure
exponential backoff only compounds the problem unless it incorporates a
random factor.
> [...] envision a scenario where many Majordomo requests come in at
> once, each to the same list.
I'll even settle for a few requests instead of many. One gets the lock
and the rest sleep exponentially longer while processing progresses.
Eventually some go to sleep for the period of time just less than
twice the time it takes to process one of the requests (since the
previous sleep time was just under the processing time). Meanwhile,
other requests for the same list arrive, find the lock free and begin
processing. The sleepers awaken, find the lock busy again and sleep for
even longer. Meanwhile, more requests arrive, etc, until finally the
Rip-Van-Requests must be abandoned because they can't get the lock and
have reached the try and/or sleep limit. This scenario doesn't require
huge sudden peak loads, just a moderate peak in a somewhat steady
stream of requests. All the system has to do is get a little behind and
starving requests become a certainty.
While the lengthening time would seem to allow more processing to get
done and supposedly increase the chances of the lock being freed, in
fact it completely reprioritizes requests such that older requests
effectively drop in priority faster the older they get. What's more,
practical limits force the number of retries to a relatively small
maximum unless the exponentiality is abandoned at some maximum delay
time. Why not just make the range of the random factor increase quasi-
exponentially with each retry? Or even check the system load and range
the random delay based on the load? The busier the system, the longer
the potential delay (talk about loading the system!).
Please understand what I'm saying here: Exponential backoff is not a
panacea for the boundary conditions it's trying to address and *purely*
exponential backoff creates its own problems. This is a complicated
problem and requires careful modeling and analysis. It shouldn't be
lightly incorporated into 1.94 at the last minute like this without that
analysis and testing. I should think your followup about forgetting to
change the number of retries should convince you of that.
--
Dave Wolfe *Not a spokesman for Motorola*
Motorola MMTG 6501 Wm. Cannon Dr. W. OE112 Austin TX 78735-8598
Follow-Ups:
References:
|
|