Nice posting, Jay, thanks.
I think I missed the original poster's intent when I began discussing
operating firewalls in parallel. If the other system is just a backup
for emergencies, then the setup is much simpler.
Operating the systems in parallel, however, gives you excellent uptime at
100% capacity and some time at 50% capacity while one unit is down. There
would be almost no time at 0% capacity. That said, operating in
parallel is not without complications. I've heard of some software
known as 'LifeKeeper' that is capable of migrating services from a
failed server to one of N other servers. I'm sure that there are
other products.
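To put rough numbers on the 100% / 50% / 0% capacity split, here is a
small sketch. It assumes two identical firewalls that share the load and
fail independently. The 5000-hour MTTF is borrowed from Jay's message
below; the 4-hour MTTR and the function names are assumptions made up
purely for illustration.

```python
HOURS_PER_YEAR = 24 * 365  # 8760, ignoring leap years


def unit_availability(mttf_hours, mttr_hours):
    """Steady-state availability of one box: MTTF / (MTTF + MTTR)."""
    return mttf_hours / (mttf_hours + mttr_hours)


def parallel_capacity_hours(mttf_hours, mttr_hours):
    """Expected hours per year at each capacity level for two identical,
    independently failing units running in parallel."""
    a = unit_availability(mttf_hours, mttr_hours)
    both_up = a * a            # both units working: 100% capacity
    one_up = 2 * a * (1 - a)   # exactly one unit working: 50% capacity
    none_up = (1 - a) ** 2     # both units down: 0% capacity
    return {
        "100%": both_up * HOURS_PER_YEAR,
        "50%": one_up * HOURS_PER_YEAR,
        "0%": none_up * HOURS_PER_YEAR,
    }


if __name__ == "__main__":
    # MTTR of 4 hours is an assumed figure, not a measured one.
    for level, hours in parallel_capacity_hours(5000, 4).items():
        print(f"{level:>4} capacity: {hours:10.4f} hours/year")
```

The independence assumption is the weak point: a shared power feed, a
common configuration error, or the same software bug on both boxes would
make the 0% figure worse than this estimate.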
Mark Riggins
Jay Clark wrote:
>
>
> > We are asked to warrant a >99,5% uptime for a firewall system in a financial
> > organization. We're trying to figure out what's the best way to manage such
>
> A 99.5% availability target gives you around 43.8 hours per year of
> allowable downtime. Given a non-redundant configuration, this can
> probably be accomplished by having the system self-test and report to a
> server running on another platform.
>
> If the server misses an "I'm OK" it would then alert your NOC or your
> maintenance tech via a pager.
>
> Given 24x7 coverage at the NOC or by the maintenance tech, and assuming
> the failure is a software failure (or you keep a damn complete set of
> spares), you can probably meet this reliability figure.
>
> You just have to assume an MTTF (mean time to fail) and an MTTR (mean time
> to repair) and run some numbers to see if you will probably not have to
> pay up under the warranty provisions.
>
> (I love probability)
>
> A better way is to set up a fully redundant system with a soft fail to
> the backup. With this setup the lack of an "I'm OK" message would
> trigger the switchover, which would include making sure the normal
> equipment is down and bringing the standby on line.
>
> This drops your MTTR to less than 5 minutes, and if you have an MTTF of
> around 5000 hours your reliability skyrockets.
>
> In microwave transmission systems I have designed systems with 99.999%
> availability as the normal standard and six "9's" on special request, and
> the same methods should work for your application.
>
> If you go with the redundant system, and if the 5000 hour mean time to
> fail is a valid number for the equipment under consideration, then you
> would be able to warrant 99.99% availability and _not_ have to incur the
> cost of the 24x7 maintenance coverage.
>
> Eliminating the cost of maintenance coverage should pay back the cost of
> the redundant system in a couple of months.
>
> <g> life is simple when ya can afford to do it right the first time.
>
>
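The figures quoted above can be checked with a few lines of Python. This
is only a sketch: it assumes the usual steady-state formula
availability = MTTF / (MTTF + MTTR) and a 365-day year, and the function
names are made up for illustration. The 99.5% target, the 5000-hour MTTF
and the 5-minute switchover come from the quoted message.

```python
HOURS_PER_YEAR = 24 * 365  # 8760, ignoring leap years


def allowed_downtime_hours(target_pct):
    """Yearly downtime budget implied by an availability target in percent."""
    return (1 - target_pct / 100) * HOURS_PER_YEAR


def availability_pct(mttf_hours, mttr_hours):
    """Steady-state availability in percent: MTTF / (MTTF + MTTR)."""
    return 100 * mttf_hours / (mttf_hours + mttr_hours)


def max_mttr_hours(mttf_hours, target_pct):
    """Longest average repair time that still meets the target on average.
    Derived by solving target = MTTF / (MTTF + MTTR) for MTTR."""
    a = target_pct / 100
    return mttf_hours * (1 - a) / a


if __name__ == "__main__":
    print(f"Downtime budget at 99.5%: {allowed_downtime_hours(99.5):.1f} h/year")
    print(f"Max MTTR at 5000 h MTTF:  {max_mttr_hours(5000, 99.5):.1f} h")
    # 5-minute automatic switchover, expressed in hours
    print(f"Availability with 5 min MTTR: {availability_pct(5000, 5 / 60):.4f}%")
```

These are long-run averages, not guarantees for any single year: one slow
repair can use up most of the 43.8-hour budget, which is why the warranty
question comes down to the spread of repair times and not just the mean.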