On Fri, Jun 12, 1998 at 03:22:46PM -0500, Jason L Tibbitts III <email@example.com> wrote:
> >>>>> "BS" == Bryan Fullerton <firstname.lastname@example.org> writes:
> Drat, and I still can't repeat it. Could you give me the domain of the
> failing host? I can't seem to find one that triggers this behavior.
It's an intermittent thing - the problem was resolved an hour or so after
the failed message finally died, because the remote network came back up.
That's the problem - this is going to happen intermittently, just because
of how people have their nameservers set up.
Basically, to duplicate it you need a domain that is known to the root
nameservers (i.e. registered with the InterNIC), and your nameserver
must be able to reach the root nameservers but not any of the primary
or secondary nameservers for the domain. This can happen if people
ignore the NIC's advice and put all the nameservers for a domain on
the same network, and then that network goes down or loses
connectivity for some reason.
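
A rough way to spot domains at risk of this is to check whether all of
their nameserver addresses fall inside one network. The sketch below is
only an illustration: the function names are made up, the addresses are
documentation examples, and treating a /24 as "the same network" is an
assumption that will not match every real topology.

```python
import ipaddress


def nameserver_networks(ns_addrs, prefix_len=24):
    """Return the set of networks (at prefix_len) that contain the given IPs.

    prefix_len=24 is an assumed stand-in for "the same network".
    """
    return {
        ipaddress.ip_network(f"{addr}/{prefix_len}", strict=False)
        for addr in ns_addrs
    }


def single_point_of_failure(ns_addrs, prefix_len=24):
    """True if every nameserver address sits in one network (or none are given)."""
    return len(nameserver_networks(ns_addrs, prefix_len)) <= 1


if __name__ == "__main__":
    # Hypothetical addresses: both nameservers share 192.0.2.0/24.
    print(single_point_of_failure(["192.0.2.10", "192.0.2.53"]))
    # Hypothetical addresses: nameservers on two different networks.
    print(single_point_of_failure(["192.0.2.10", "198.51.100.53"]))
```

A domain for which this returns True can still be registered and
delegated from the roots while being completely unresolvable whenever
that one network is cut off.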
> I have some ideas; the first thing is to up the default select timeout.
> Right now if nothing comes back for 60 seconds (the default timeout) then
> majordomo assumes it's not going to hear back. What amount of time is a
> reasonable wait? Perhaps I could, say, quintuple it when sending an RCPT?
What timeout does sendmail use? Or is the timeout in the resolver? (In an
effort to figure this out, I've posted a message to comp.mail.sendmail
asking the same thing.)
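
For what it's worth, the "quintuple it for RCPT" idea could look roughly
like the sketch below. This is not majordomo's code (which is Perl); it
is a Python illustration with made-up names (timeout_for,
wait_for_reply), and only the 60 second default and the factor of five
come from the message quoted above. Five times 60 seconds is 300
seconds, which happens to match the 5 minute minimum that RFC 1123
suggests for the RCPT command.

```python
import select
import socket

DEFAULT_TIMEOUT = 60   # seconds; the default mentioned in the thread
RCPT_MULTIPLIER = 5    # the "quintuple it" suggestion


def timeout_for(command, base=DEFAULT_TIMEOUT):
    """Pick a select timeout based on the SMTP verb being sent."""
    verb = command.split(None, 1)[0].upper() if command.strip() else ""
    return base * RCPT_MULTIPLIER if verb == "RCPT" else base


def wait_for_reply(sock, timeout):
    """Wait up to `timeout` seconds for data; return bytes, or None on timeout."""
    readable, _, _ = select.select([sock], [], [], timeout)
    if not readable:
        return None  # nothing came back in time
    return sock.recv(4096)


if __name__ == "__main__":
    # A local socket pair stands in for the connection to the mail server.
    client, server = socket.socketpair()
    print(timeout_for("RCPT TO:<user@example.org>"))
    print(wait_for_reply(client, 0.1))   # server is silent, so this times out
    server.sendall(b"250 OK\r\n")
    print(wait_for_reply(client, 1))
    client.close()
    server.close()
```

Keeping the short default for everything else means only the step that
can block on a slow DNS lookup at the remote end gets the longer wait.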
> Perhaps also after failing on an RCPT for some number of times we could
> just drop it from the list. The problem here is that then we'd have to
> find some way to communicate this to the owner. Without any bounces the
> address would stick around and complicate things for an eternity. (I have
> some schemes for managing multiple outbound delivery connections in
> parallel, but these may make for more work than is useful.)
> BTW, are you running a caching nameserver? I find it odd that this doesn't
> get negatively cached anywhere.
Not sure. Perhaps because the root nameservers are reachable? I don't
know enough about what conditions are needed to cause a negative cache.
> It was trying really hard to get through. The assumption is that failures
> when talking to hosts are temporary things, so (after doing the deal with
> the backup hosts) it will try again and keep trying until it finally just
> gives up.
Hmm... can the backup hosts be set? Is this part of the delivery options
stuff? I haven't really looked at that part of the config yet, so it's
just using my main server to deliver mail, but I could dump it to an
upstream mail server (mail.uunet.ca or something) if mine's not working.
http://www.samurai.com http://www.feh.net http://www.icomm.ca
"One Code to rule them all, one Code to bind them
In the land of Redmond where the Shadows lie."
- Joe Thompson, with apologies to Tolkien