I tend to agree with Marcus when it comes to testing and the
philosophy of code development. My concept is that one writes code
to "fail-soft"; that is, the default failure mode will be the one
which prevents damage. My experience in looking at reams of system
and application code over the past many years is that most people
do not write code which fails soft.
(I use "fail-soft" instead of "fail-safe". "Fail-safe" used to
refer to a mechanism where everything has to be a "go" before you
go to the next step (multiple keys, a "green-board", etc.). "Fail-soft"
is different in that it recognizes that unexpected things will
happen after the "go".)
In essence, writing "fail-soft" code means that you check for
error conditions, you check for the conditions you expect, and you
assume that anything remaining represents an error condition.
Depending on the sensitivity of the application, "fail-soft" code
may validate that the system request it just made really happened by
re-checking the results of the call.
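As a rough sketch of that idea in Python (the reply codes and the names verdict_for and write_and_verify are made up for illustration, not taken from any real firewall): the first function handles the conditions it expects and treats everything left over as an error that lands on the harmless side; the second re-reads what it just wrote instead of assuming the call worked.

```python
import os

def verdict_for(reply):
    """Map a reply code (hypothetical vocabulary) to an allow/deny decision."""
    if reply == "ALLOW":   # expected condition
        return True
    if reply == "DENY":    # expected condition
        return False
    # Anything remaining (None, a typo, a new code after an upgrade) is
    # treated as an error, and the error path is the one that prevents damage.
    return False

def write_and_verify(path, data):
    """Write bytes to path, then re-read them to confirm the write happened."""
    try:
        with open(path, "wb") as f:
            f.write(data)
            f.flush()
            os.fsync(f.fileno())
    except OSError:
        return False
    try:
        with open(path, "rb") as f:
            return f.read() == data   # re-check the result of the call
    except OSError:
        return False
```

Whether the extra read-back is worth the cost is the judgment call described above; for a rules file it probably is, for a scratch file probably not.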
Most problems that I've seen with systems and applications
center on people assuming that calls always work
or that the information they're using is correct.
So, I test code in all the various conditions I can think
of; then I make a judgment call on my exposure if there is an
unexpected failure somewhere and, if that is serious enough, add a
test to detect the inconceivable. Frequently, the inconceivable
is detected when there is a system upgrade.
This is especially true with code related to security, such as
firewalls. One doesn't need to test all the "what-ifs"; one does need to
test all the valid allowable scenarios and then reasonably try to
verify that the "fail-soft" code is working by sending (or forcing)
unexpected responses (ex: change the file to a named pipe). You
can't verify that you don't have bugs; you can reasonably verify that
you will handle the unexpected in a reasonable manner. You can also
enforce checks for common "don't do that" situations (ex: changing a file to
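The named-pipe case mentioned above can be checked for directly. A minimal sketch in Python, assuming a Unix-like system; open_regular_file is a hypothetical helper name. It opens the path without blocking, then inspects the open descriptor, so a file swapped for something else is refused rather than read.

```python
import os
import stat

def open_regular_file(path):
    """Return a read-only file descriptor for path only if it is a regular
    file; return None for anything else (pipe, directory, missing, ...)."""
    # O_NONBLOCK keeps open() from hanging on a named pipe; O_NOFOLLOW
    # refuses a symbolic link. Both fall back to 0 where unavailable.
    flags = (os.O_RDONLY
             | getattr(os, "O_NONBLOCK", 0)
             | getattr(os, "O_NOFOLLOW", 0))
    try:
        fd = os.open(path, flags)
    except OSError:
        return None
    try:
        # Check the descriptor we actually hold, not the name.
        if stat.S_ISREG(os.fstat(fd).st_mode):
            return fd
    except OSError:
        pass
    os.close(fd)
    return None
```

To exercise the fail-soft path, replace the test file with a pipe (os.mkfifo) or a directory and confirm the function returns None.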
Since one frequently can't, in this age of binary released
code, validate that interface calls don't have "hidden options" in
them, I tend to clear unused fields and verify that the call seems
to work as documented. At some point you have to
trust your vendor.
Lastly, the firewall needs to log what happened. In the
absence of any other guideline, the firewall should log that it was
called and how it terminated. Then, if you ever see a start message
without the following termination message, you have problems.
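One way this could look, as a sketch only: the message format ("start id=..." / "end id=... outcome=...") and the names run_logged and unmatched_starts are assumptions for illustration, not any real firewall's log format. The wrapper always emits a termination line, and the scanner reports any start that never got one.

```python
def run_logged(conn_id, handler, emit=print):
    """Run handler(), logging that it was called and how it terminated."""
    emit("start id=%s" % conn_id)
    outcome = "unknown"
    try:
        handler()
        outcome = "ok"
    except Exception as exc:
        outcome = "error:" + type(exc).__name__
    finally:
        # Emitted on every exit path, including ones not anticipated above.
        emit("end id=%s outcome=%s" % (conn_id, outcome))
    return outcome

def unmatched_starts(lines):
    """Return the ids that have a start line but no following end line."""
    open_ids = []
    for line in lines:
        parts = line.split()
        if len(parts) >= 2 and parts[1].startswith("id="):
            ident = parts[1][3:]
            if parts[0] == "start":
                open_ids.append(ident)
            elif parts[0] == "end" and ident in open_ids:
                open_ids.remove(ident)
    return open_ids
```

If the process is killed outright, no termination line is written at all, which is exactly the case the scan is meant to surface.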