Against Software Reliability?


In my earlier post about Software Reliability Models, I did not go into depth about the different approaches to software reliability.

My main point in that initial post was to establish that software reliability modeling “is a thing”, and that functional safety engineers should seek to understand it and, potentially, to use it.

In today’s post, I am still not going to go deep into the methods. This post intends to provide a little background and history on software reliability. Unfortunately, as the title of this post suggests, this history seems to reveal more problems than solutions. It will take a little patience to wade through the literature toward solutions.

This post draws heavily from the work of Dr. Bev Littlewood, who is one of the pioneers of software reliability. All of the linked papers are very readable, and I highly encourage you to read them all.

IEC 61508 Gives You Options

Functional safety engineers know that IEC 61508 (like its child standards IEC 61511 and ISO 26262) does not require you to quantify so-called systematic failures, including software failures.

What most don’t know is that IEC 61508 gives you the option to quantify software failures in Part 7 Annex D. In my experience, not many folks took up that option. Did you ever wonder why?

The methods outlined in Annex D are fairly straightforward, although the prerequisites can be hard to satisfy. Software reliability is modeled as either a Bernoulli trial (for low demand systems) or a homogeneous Poisson process (for high demand or continuous systems). It was Dr. Littlewood’s 2016 criticism of the Annex D methods that first introduced me to his large body of prior work.

I was pleased to see that one of the papers Littlewood references is a 1991 paper that demonstrates the infeasibility of quantifying the reliability of safety critical software via traditional methods. I made a similar point about validating IEC 61511 safety system performance 25 years later.

Both Dr. Littlewood’s paper and the referenced papers make a compelling case that validating reliability for ultra-high reliability systems is a non-trivial problem.
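
To get a feel for the scale of the problem, here is a minimal Python sketch of the textbook failure-free testing arithmetic for the two models mentioned above. It is an illustration only, not a reproduction of Annex D or of the calculations in the referenced papers. It assumes zero failures are observed, that demands are independent and representative of real operation, and that the failure rate is constant. The function names are hypothetical.

```python
import math


def failure_free_demands(pfd_target: float, confidence: float) -> int:
    """Bernoulli model: smallest number of consecutive failure-free demands n
    such that 1 - (1 - pfd_target)**n >= confidence.
    Assumes independent, operationally representative demands."""
    # log1p keeps precision when pfd_target is very small
    return math.ceil(math.log(1.0 - confidence) / math.log1p(-pfd_target))


def failure_free_hours(rate_target: float, confidence: float) -> float:
    """Homogeneous Poisson model: failure-free operating time t (hours)
    such that 1 - exp(-rate_target * t) >= confidence.
    Assumes a constant failure rate (per hour)."""
    return -math.log(1.0 - confidence) / rate_target


if __name__ == "__main__":
    for pfd in (1e-2, 1e-4, 1e-7):
        n = failure_free_demands(pfd, 0.99)
        print(f"PFD {pfd:g} at 99% confidence: {n} failure-free demands")

    hours = failure_free_hours(1e-9, 0.99)
    print(f"1e-9 per hour at 99% confidence: {hours:.3g} failure-free hours "
          f"(about {hours / 8760:.3g} years)")
```

Under these assumptions, a PFD target of 1e-4 at 99% confidence needs roughly 46,000 failure-free demands. A failure rate target of 1e-9 per hour needs roughly 4.6 billion failure-free hours, which is on the order of half a million years of testing on a single unit.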

No Solution Exists

If the approach outlined in IEC 61508 is infeasible for most applications, what can you do? We can go back to Dr. Littlewood’s earlier work to get some ideas.

One of the seminal works in software reliability is Littlewood’s 1993 paper Validation of Ultrahigh Dependability for Software-Based Systems. A key observation from the paper is:

The validity of our results then depends on the validity of the modelling assumptions. There is always a chance that reality will violate some of these assumptions: we need to form an idea of the probability of this happening. Examples of such assumptions are:
• in reliability growth modelling: regularity of the growth process, and in particular, realism of the test data with respect to the intended operational environment;
• in process-based statistical arguments: representativeness of the new product with respect to the known population from which statistical extrapolations are made;
• in structural reliability modelling: correctness of the parameters, completeness of the model (e.g., independence assumptions, transitions in a Markov model, branches in a fault tree);
• in proofs of correctness: conformance of the specification to the informal engineering model.

An even more concise and bold summary may be found in the abstract (emphasis added):

It appears that engineering practice must take into account the fact that no solution exists, at present, for the validation of ultra-high dependability in systems relying on complex software.

One of my favorite quotes (and one of the most controversial) from the paper was targeted at the DO-178 standard for avionics software. However, the same comment could be made about the IEC 61508 family of standards:

This seems to mean that a very low failure probability is required but that, since its achievement cannot be proven in practice, some other, insufficient method of certification will be adopted.

In summary, no solution existed, but we tried to do it anyway, even if the methods were insufficient. That was 1993. Have we gotten any better in 25 years?

Important note: The ultra-high reliability systems they were considering had PFD targets in the range of 10^-7 to 10^-9, equivalent to IEC 61508 “SIL 6” or better (the standard stops at SIL 4). For systems with less demanding targets, the result probably changes from “no solution” to “very hard to achieve”.

From Infeasible to Feasible?

To help answer that question, we can go back to Littlewood. Someone asked him the above question, and the result was the 2011 paper “Validation of Ultra-High Dependability… – 20 years on”. (Note: If you’re short on time, read this paper first, since it concisely summarizes the 1993 paper.)

After summarizing the original paper, the paper reflects on the progress of the last twenty years, including some areas where the author sees that little progress has been made.

IEC 61508 did not exist in 1993, so it is no surprise that the paper calls out the standard for continuing the DO-178B approach of using “good practices” to qualitatively justify performance targets that are never quantitatively verified.

The paper also laments the fact that there is still some controversy regarding using probabilistic approaches for software dependability. Critics claim that probabilistic methods can be a “dangerous temptation for self-delusion”. This strikes me as a case of blaming the tool instead of the fool.

This line of argument leads into one of the key statements of the paper:

Treating this “epistemic” uncertainty rigorously and formally seems necessary, and using probabilities brings the advantages of a unified treatment of the different sources of uncertainty. Such a probabilistic argument may then sometimes show that we have limited grounds for confidence in a system before deployment (e.g. confidence that this flight control system has a failure rate better than 10^-9 per hour). This is a benefit, not a defect, of the probabilistic approach, if risk assessment practices are to be beneficial for the engineering profession and the public.

Rigorously treating epistemic uncertainty? That caused my ears to prick up. Long-time readers of my blogs know where this is going…

Wait for it…

You got it: Bayesian inference. Indeed, when we look at Littlewood’s other work, we can find Bayesian inference being used to construct “multi-legged” dependability arguments that quantitatively combine different types of evidence from different sources. After a long discussion of all the problems, this offers hope of practical solutions. Stay tuned for more on that subject.
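
As a small taste of the idea, here is a minimal Python sketch of a single Bayesian update. It is not Littlewood’s multi-legged method, just the simplest conjugate case: a Beta(1, b) prior on the PFD, updated with n failure-free demands, gives a Beta(1, b + n) posterior. The function name and the `prior_b` parameter are hypothetical, and the same independence and representativeness assumptions apply as for any statistical testing argument.

```python
import math


def confidence_pfd_below(target: float, n_failure_free: int,
                         prior_b: float = 1.0) -> float:
    """Posterior probability that PFD <= target.

    Assumes a Beta(1, prior_b) prior on the PFD and n_failure_free
    independent demands with zero failures. The posterior is then
    Beta(1, prior_b + n_failure_free), whose CDF has the closed form
    1 - (1 - p)**(prior_b + n_failure_free).
    prior_b = 1 is the uniform prior; a larger prior_b encodes stronger
    prior belief in a low PFD and must be justified by other evidence.
    """
    exponent = (prior_b + n_failure_free) * math.log1p(-target)
    return -math.expm1(exponent)  # equals 1 - exp(exponent), computed stably


if __name__ == "__main__":
    for n in (0, 1000, 3000, 10000):
        c = confidence_pfd_below(1e-3, n)
        print(f"{n:>6} failure-free demands: P(PFD <= 1e-3) = {c:.4f}")
```

With a uniform prior, 3,000 failure-free demands give roughly 95% confidence that the PFD is no worse than 1e-3. The point of the sketch is that the prior is explicit: any confidence gained from something other than testing has to be stated and defended as a number, not assumed silently.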

“Measure what can be measured, and make measurable what cannot be measured.” – Galileo

Wrap Up

This post went beyond software reliability “is a thing” and talked about some of the issues that have been encountered in the past trying to quantitatively estimate and predict software reliability.

Unfortunately, this post was long on problems and short on solutions. Next time, we will take a look at the “multi-legged” argument approach and see how Bayesian inference helps with the solution.

Thanks for reading! If you liked this, please share it on LinkedIn and follow us for updates.

Stephen Thomas, PE, CFSE

Stephen is the founder and editor of He is a functional safety expert with over 26 years of experience.  He is currently a system safety engineer with a leading developer of autonomous vehicle technology. He is a member of the IEC 61508 and IEC 61511 functional safety committees. He is a member of the non-profit CFSE Advisory Board advising the exida CFSE program. He is the Director of Education & Professional Development for the International System Safety Society and an associate editor for the Journal of System Safety.
