Software Reliability Models

"All models are wrong, but some are useful"

- aphorism in statistics

Engineers are comfortable with hardware reliability. We routinely talk about failure rates and MTBF. If you are deeply involved in that world, you may talk about Weibull distributions or physics-of-failure models.
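For readers outside that world, here is a minimal sketch of the two hardware models just mentioned. The function names and every number are my own illustrative assumptions, not values from any handbook or standard.

```python
import math

def exponential_reliability(t_hours, mtbf_hours):
    """Probability of surviving to t_hours at a constant failure rate of 1/MTBF."""
    return math.exp(-t_hours / mtbf_hours)

def weibull_reliability(t_hours, eta_hours, beta):
    """Weibull survival function: R(t) = exp(-(t / eta) ** beta)."""
    return math.exp(-((t_hours / eta_hours) ** beta))

# Hypothetical numbers, for illustration only.
mtbf = 100_000.0   # hours
mission = 8_760.0  # one year of continuous operation

print(f"Constant rate: R = {exponential_reliability(mission, mtbf):.4f}")

# beta < 1 models early-life failures, beta = 1 reduces to the constant-rate
# case above, and beta > 1 models wear-out.
for beta in (0.5, 1.0, 3.0):
    r = weibull_reliability(mission, mtbf, beta)
    print(f"Weibull beta={beta}: R = {r:.4f}")
```

With a shape parameter of 1 the Weibull model collapses to the familiar constant-failure-rate case, which is why MTBF alone is enough for so many handbook calculations.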

There are even thick handbooks where you can simply look up the numbers. If TÜV says your hardware MTBF is eleventy kajillion years, you must be good. AMIRITE? But I digress...

Things get murkier once we start talking about software. The same engineers who glibly accept hardware failure rates will balk if you ask about the probability of software failure. Common objections include:

"Software doesn't have failures. It only has human errors."
"There is no way to model software reliability. It's too random"
"Software failures are not random, they are systematic."

Can you spot the fallacies in each of these statements?

The last one is my favorite because it closely resembles the language used in the international functional safety standards, including IEC 61508, IEC 61511, and ISO 26262.

The IEC 61508 Approach

I have already made my case that the distinction between so-called random and systematic failures does not hold up. I will not rehash it here, but you can find it in my earlier post, "IEC 61511 is Wrong About Systematic Failures".

One of my key arguments in that earlier post is that the purpose of Safety Integrity Levels (or ASILs) is to accurately estimate the likelihood that a safety-critical system will fail, and we should estimate all significant sources of failure, including software.

My issue with the functional safety standards is that they assume (either implicitly or explicitly, depending on the standard) that it is impossible to quantitatively predict or estimate the reliability of software.

Therefore, the standards address software reliability qualitatively, through structured work processes, rigorous verification, and thorough validation. All of these are good practices and prerequisites for high-reliability software.

This qualitative approach alone may be adequate for relatively simple industrial controllers or embedded automotive ECUs, but for complex machine-learned autonomous software systems, do we need to do more? History suggests yes.

Modeling Software Reliability

Software reliability models have a long history and have been used successfully in many applications across industries. For example, NASA was estimating software failure rates as far back as 1978.

Michael Lyu's 2002 paper, Software Reliability Theory, provides a thorough overview of developments in the field, starting with seminal works in the early 1970s.

In that 2002 paper, Lyu identifies over 20 different probabilistic software reliability models. That count covers only the classical statistical models and does not include the Bayesian models. Although there were far fewer of them, Bayesian models also started development in the early 1970s.
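To give a flavor of what a classical model looks like, here is a minimal sketch of one well-known example, the Goel-Okumoto reliability growth model. I picked it only because it is compact; I am not claiming it is the model NASA used or the one Lyu recommends. The parameter values are made up, and in practice they would be estimated from observed failure data.

```python
import math

def expected_failures(t, a, b):
    """Mean value function m(t) = a * (1 - exp(-b * t))."""
    return a * (1.0 - math.exp(-b * t))

def failure_intensity(t, a, b):
    """Failure intensity lambda(t) = dm/dt = a * b * exp(-b * t)."""
    return a * b * math.exp(-b * t)

def reliability(x, t, a, b):
    """Probability of no failure in (t, t + x] after testing up to time t."""
    return math.exp(-(expected_failures(t + x, a, b) - expected_failures(t, a, b)))

# Hypothetical parameters, for illustration only.
a = 120.0  # expected total number of failures eventually observed
b = 0.02   # detection rate per fault, per hour of testing

for t in (0, 50, 100, 200):
    print(
        f"t={t:>3} h  m(t)={expected_failures(t, a, b):6.1f}  "
        f"lambda(t)={failure_intensity(t, a, b):.3f}/h  "
        f"R(10 h | t)={reliability(10, t, a, b):.3f}"
    )
```

The point of the sketch is the shape of the curve: as testing time accumulates and faults are removed, the failure intensity decays and the predicted reliability over the next 10 hours climbs.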

As an interesting historical note, Lyu explains the disparity between the two modeling approaches this way:

"It seems, however, that the Bayesian approach suffers from its complexity and from the difficulty in choosing appropriate distributions for the parameters. Added to this is the fact that most software engineers do not have the required statistical background to completely understand and appreciate Bayesian models. The latter is perhaps the main reason why these models have not enjoyed the same attention as the classical models (there are almost 5 times as many classical models as Bayesian models, and they are used in a great majority of the practical applications)"

What a difference a couple of decades makes! Bayesian analysis is now far more mainstream. A 2015 literature review by R. Wahono revealed that Naive Bayes was the most commonly published method for software defect prediction.
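For the curious, here is a toy sketch of what Naive Bayes defect prediction looks like: a from-scratch Gaussian Naive Bayes classifier that flags modules as likely defective based on two code metrics. This is not Wahono's method or any specific published model. The metrics (lines of code, cyclomatic complexity) and all of the training data are made up purely for illustration.

```python
import math
from collections import defaultdict

def fit(rows, labels):
    """Estimate a prior and per-feature mean/variance for each class."""
    by_class = defaultdict(list)
    for row, label in zip(rows, labels):
        by_class[label].append(row)
    model = {}
    for label, members in by_class.items():
        n = len(members)
        columns = list(zip(*members))
        means = [sum(col) / n for col in columns]
        # A small floor keeps the variance strictly positive.
        variances = [
            sum((x - m) ** 2 for x in col) / n + 1e-9
            for col, m in zip(columns, means)
        ]
        model[label] = (n / len(rows), means, variances)
    return model

def log_gaussian(x, mean, var):
    """Log of the normal density with the given mean and variance."""
    return -0.5 * math.log(2 * math.pi * var) - (x - mean) ** 2 / (2 * var)

def predict_proba(model, row):
    """Posterior class probabilities, treating features as independent."""
    log_post = {}
    for label, (prior, means, variances) in model.items():
        log_post[label] = math.log(prior) + sum(
            log_gaussian(x, m, v) for x, m, v in zip(row, means, variances)
        )
    top = max(log_post.values())  # subtract the max for numerical stability
    weights = {label: math.exp(lp - top) for label, lp in log_post.items()}
    total = sum(weights.values())
    return {label: w / total for label, w in weights.items()}

# Made-up training data: (lines of code, cyclomatic complexity) per module.
modules = [(120, 4), (200, 6), (150, 5), (900, 30), (1100, 42), (750, 25)]
defective = [False, False, False, True, True, True]

model = fit(modules, defective)
print(predict_proba(model, (1000, 35)))  # a large, complex module
print(predict_proba(model, (130, 5)))    # a small, simple module
```

Note that this predicts which modules are likely to contain defects, which is a related but different question from estimating a failure rate in operation.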

Wrap Up

After 50 years, software reliability prediction continues to be an active field of scientific research.

The models may not be simple, and they may not be accurate in all circumstances. However, software reliability is a real field of study with a long history of literature. Functional safety engineers ignore it at their peril!

The IEC 61508 and ISO 26262 standards were developed around relatively simple industrial and automotive embedded controls. Without comprehensive software reliability models, are the prescriptive qualitative methods in the standards adequate for the complex machine-learned software in Level 5 self-driving cars? Call me a skeptic.

Don't miss the follow-up post: Against Software Reliability?