ChatGPT and reliability

The day before my last post—the one about ChatGPT trying to solve beam bending problems—Adam Wuerl, an aerospace engineer at Blue Origin, wrote up his similar adventure with ChatGPT. Adam was trying to get it to solve a reliability problem:

My problem: some hardware has a per-use reliability of X. How many uses can I expect before it fails? In probability speak, what is the expected value of the discrete probability distribution that models a sequential series of successful trials that terminates in a failure?

As with my structural analysis problems, ChatGPT failed to come up with the correct answer to this problem, mainly because it flipped the definitions of success and failure partway through the solution. Despite that, Adam was able to extract enough useful information from ChatGPT’s bumbling and prolix answers to remind himself how to do the problem, and he solved it on his own.

Adam was happier with this than I would have been, but I understand his point of view. He wasn’t making up a problem as a test—he really wanted the answer, and he wanted it quickly. He didn’t have the textbook he would have used to help him solve the problem, and he wasn’t confident in the terms needed to do a successful Google search, so ChatGPT acted as a fuzzy search engine, taking his question and spewing out enough verbiage to trigger Adam’s own buried knowledge. It didn’t come up with an answer that would get a good homework grade, but it prompted Adam’s own intelligence, which got the job done.

Adam’s post triggered something in me, too: a desire to derive the formula for the expected number of uses of a device before failure. As Adam says, we’ll start with the assumption that each use of the device constitutes a Bernoulli trial, meaning that the trials are independent and the probability of failure is the same for each trial. This assumption, which works very well for things like flipping coins and rolling dice, may not model hardware failure perfectly, but it’s a good place to start if you don’t have any further information.

We’ll start by defining p as the probability of failure for each use of the device. Then we can start working out the probabilities of various sequences. In the following table, the sequences use S for a successful use and F for a failure. All of the sequences end with F.

Sequence    Probability
F           $p$
SF          $(1-p)\,p$
SSF         $(1-p)^2\,p$
SSSF        $(1-p)^3\,p$
etc.        etc.

We’ll call the random number of consecutive successes before a failure N, so the probability of N equaling a given number n is

$$\Pr(N = n) = (1-p)^n\,p$$

The expected value of N is therefore

$$\mu_N = \sum_{n=0}^{\infty} n\,(1-p)^n\,p = p \sum_{n=0}^{\infty} n\,(1-p)^n$$

Unlike Jacob Bernoulli, I’m not good at working out infinite series. So I opened Mathematica and gave the command

p*Sum[n*(1 - p)^n, {n, 0, Infinity}]

which returned

$$\frac{1-p}{p}$$

You could do the same thing in Wolfram Alpha using the “natural language” command

sum of p*n*(1-p)^n for n from 0 to infinity
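If neither Wolfram product is handy, a quick numerical check is also possible. The following is a minimal Python sketch, not a symbolic solution: it adds up a finite number of terms of the series and compares the total with the closed form. The function name `expected_successes`, the 10,000-term cutoff, and the sample value p = 0.02 are my own choices for illustration.

```python
def expected_successes(p, terms=10_000):
    """Partial sum of p * n * (1 - p)**n for n = 0 .. terms - 1."""
    return p * sum(n * (1 - p) ** n for n in range(terms))


p = 0.02  # sample per-use probability of failure (an assumed value)
print(expected_successes(p))  # truncated series
print((1 - p) / p)            # closed form returned by Mathematica
```

The terms shrink geometrically, so the truncated tail is negligible for this p, and both lines should print a value of about 49, in agreement with (1 - p)/p.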

This is the expected number of successes before a failure. There’s a similar question that leads to a slightly different answer: What is the expected number of trials at which the first failure occurs? Here, we’ll use M as the random number of trials until the first failure, and

$$\Pr(M = m) = (1-p)^{m-1}\,p$$

Therefore the expected number of trials until the first failure is

$$\mu_M = \sum_{m=1}^{\infty} m\,(1-p)^{m-1}\,p = \frac{1}{p}$$

where again, I used Mathematica to do the infinite series.
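For anyone who would rather see the series done by hand, here is a sketch of one standard route: differentiate the geometric series term by term. The shorthand $q = 1 - p$ is introduced only for this aside, and the step assumes $0 < p \le 1$ so that the series converges.

$$\sum_{m=1}^{\infty} m\,q^{m-1} = \frac{d}{dq} \sum_{m=0}^{\infty} q^m = \frac{d}{dq}\,\frac{1}{1-q} = \frac{1}{(1-q)^2} = \frac{1}{p^2}$$

Multiplying by the leading factor of $p$ gives $p \cdot 1/p^2 = 1/p$, the same expected number of trials until the first failure.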

This value tends to be used more often in engineering reliability calculations and has been given the name return period. For example, if a flood of a certain level or higher has a 2% chance of occurring in any given year (and we assume that more than one such flood in a single year is impossible), the return period for that kind of flood is

$$\frac{1}{0.02} = 50$$

and people will call it a 50-year flood. Structures are often designed (with a factor of safety) against the 50-year flood or the 50-year wind.
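The 50-year figure can also be checked by brute force. What follows is a rough Monte Carlo sketch in Python under the same Bernoulli assumption, with each year treated as one independent trial. The function names, the run count, and the seed are arbitrary choices of mine, not anything from the reliability literature.

```python
import random


def trials_until_first_failure(p, rng):
    """Count Bernoulli trials up to and including the first failure."""
    m = 1
    while rng.random() >= p:  # this trial succeeded (probability 1 - p)
        m += 1
    return m


def mean_return_period(p, runs=50_000, seed=1):
    """Average the trial count over many simulated histories."""
    rng = random.Random(seed)
    total = sum(trials_until_first_failure(p, rng) for _ in range(runs))
    return total / runs


print(mean_return_period(0.02))  # should land near 1/0.02 = 50
```

The printed average will wander a little with the seed and the number of runs, but it should stay close to 50.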

Since M is basically one more trial than N, you’d expect $\mu_M$ to be one larger than $\mu_N$. And that’s exactly right:

$$\mu_M - \mu_N = \frac{1}{p} - \frac{1-p}{p} = \frac{1 - (1-p)}{p} = \frac{p}{p} = 1$$

Sometimes the math works out just as you think it should.