_images/logo.png

Working with fitted distributions

Probability distributions can be created by specifying the parameters or by fitting the model to data. The main difference between the two distribution objects is the presence of additional parameters required to generate the confidence bounds on the plot. This document shows how a fitted distribution can be used to make certain predictions, with a focus on extracting values from the confidence bounds.

The additional parameters that are added to the distribution object when it is created by a Fitter are the standard error of each parameter (eg. alpha_SE, beta_SE) and the covariance between the parameters (e.g. Cov_alpha_beta). The confidence interval is also required, though this defaults to 0.95 for 95% confidence bounds. It may be specified using the CI argument.

The confidence bounds available are bounds on time or bounds on reliability. For an detailed explaination of how these are calculated, please see the theory document on confidence intervals. The important thing to note here is that they are different, and which one you should use depends on what you want to know. If you want to know the system reliability at a certain time, then you are specifying time (CI_x) and seeking bounds on reliability (CI_type=’reliability’). If you want to know the time that the system will reach a certain reliability, then you are specifying reliability (CI_y) and seeking bounds on time (CI_type=’time’). Note that these are paired, so CI_y only works with CI_type=’time’ and CI_x only works with CI_type=’reliability’. The only exception to this is the Exponential Distribution which does not accept CI_type since bounds on time and reliability are identical.

Example 1

In this example, we see how the confidence bounds on time or reliability should be used based on what we are trying to predict.

from reliability.Distributions import Weibull_Distribution
from reliability.Fitters import Fit_Weibull_2P
import matplotlib.pyplot as plt

dist = Weibull_Distribution(alpha=500,beta=3)
data = dist.random_samples(10,seed=1)
fit = Fit_Weibull_2P(failures=data,show_probability_plot=False,print_results=False)

plt.figure(figsize=(10,7))
plt.subplot(121)
arrow_x = 25
arrow_y = 0.025

X_lower,X_point,X_upper = fit.distribution.CDF(CI_type='time',CI_y=0.7)
plt.arrow(x=0,y=0.7,dx=X_upper,dy=0,color='red',head_width=arrow_y,head_length=arrow_x,length_includes_head=True)
plt.arrow(x=X_lower,y=0.7,dx=0,dy=-0.7,color='red',head_width=arrow_x,head_length=arrow_y,length_includes_head=True)
plt.arrow(x=X_point,y=0.7,dx=0,dy=-0.7,color='red',head_width=arrow_x,head_length=arrow_y,length_includes_head=True)
plt.arrow(x=X_upper,y=0.7,dx=0,dy=-0.7,color='red',head_width=arrow_x,head_length=arrow_y,length_includes_head=True)
plt.xlim(0,dist.quantile(0.99))
plt.ylim(0,1.1)
plt.text(x=0,y=0.705,s='CI_y=0.7',va='bottom')
plt.text(x=X_lower,y=0.035,s='lower',va='bottom',ha='right',rotation=90)
plt.text(x=X_point,y=0.035,s='point',va='bottom',ha='right',rotation=90)
plt.text(x=X_upper,y=0.035,s='upper',va='bottom',ha='right',rotation=90)
plt.title('Weibull CDF\nConfidence bounds on time')

plt.subplot(122)
Y_lower,Y_point,Y_upper = fit.distribution.CDF(CI_type='reliability',CI_x=300)
plt.arrow(x=300,y=0,dx=0,dy=Y_upper,color='red',head_width=arrow_x,head_length=arrow_y,length_includes_head=True)
plt.arrow(x=300,y=Y_lower,dx=-300,dy=0,color='red',head_width=arrow_y,head_length=arrow_x,length_includes_head=True)
plt.arrow(x=300,y=Y_point,dx=-300,dy=0,color='red',head_width=arrow_y,head_length=arrow_x,length_includes_head=True)
plt.arrow(x=300,y=Y_upper,dx=-300,dy=0,color='red',head_width=arrow_y,head_length=arrow_x,length_includes_head=True)
plt.xlim(0,dist.quantile(0.99))
plt.ylim(0,1.1)
plt.text(x=301,y=0.001,s='CI_x=300',va='bottom')
plt.text(x=40,y=Y_lower+0.002,s='lower')
plt.text(x=40,y=Y_point+0.002,s='point')
plt.text(x=40,y=Y_upper+0.002,s='upper')

plt.title('Weibull CDF\nConfidence bounds on reliability')
plt.tight_layout()
plt.show()
_images/confidence_bounds_traceV2.png

Example 2

There are 5 plots available (PDF, CDF, SF, HF, CHF) and confidence bounds are only available for 3 of them (CDF, SF, CHF). The following example shows how these plots can be generated, as well as extracting the confidence bounds (the red, purple, and blue points) using CI_x and CI_y. You can also turn off the confidence bounds if you set plot_CI=False.

import matplotlib.pyplot as plt
from reliability.Distributions import Weibull_Distribution
from reliability.Fitters import Fit_Weibull_2P

dist = Weibull_Distribution(alpha=50,beta=2)
data = dist.random_samples(10,seed=1)
fit = Fit_Weibull_2P(failures=data,show_probability_plot=False,print_results=False)
CI_x = [dist.quantile(0.25),dist.quantile(0.5),dist.quantile(0.75)]
CI_y = [0.25,0.5,0.75]

plt.figure(figsize=(16,9))
plt.subplot(3,5,1)
fit.distribution.PDF()
plt.title('PDF')

plt.subplot(3,5,2)
fit.distribution.CDF(plot_CI=False)
plt.title('CDF')

plt.subplot(3,5,7)
lower, point, upper = fit.distribution.CDF(CI_y=CI_y,CI_type='time')
plt.scatter(lower,CI_y,color='blue')
plt.scatter(point,CI_y,color='purple')
plt.scatter(upper,CI_y,color='red')
plt.title('CDF time')

plt.subplot(3,5,12)
lower, point, upper = fit.distribution.CDF(CI_x=CI_x,CI_type='rel')
plt.scatter(CI_x,lower,color='blue')
plt.scatter(CI_x,point,color='purple')
plt.scatter(CI_x,upper,color='red')
plt.title('CDF reliability')

plt.subplot(3,5,3)
fit.distribution.SF(plot_CI=False)
plt.title('SF')

plt.subplot(3,5,8)
lower, point, upper = fit.distribution.SF(CI_y=CI_y,CI_type='time')
plt.scatter(lower,CI_y,color='blue')
plt.scatter(point,CI_y,color='purple')
plt.scatter(upper,CI_y,color='red')
plt.title('SF time')

plt.subplot(3,5,13)
lower, point, upper = fit.distribution.SF(CI_x=CI_x,CI_type='rel')
plt.scatter(CI_x,lower,color='blue')
plt.scatter(CI_x,point,color='purple')
plt.scatter(CI_x,upper,color='red')
plt.title('SF reliability')

plt.subplot(3,5,4)
fit.distribution.HF()
plt.title('HF')

plt.subplot(3,5,5)
fit.distribution.CHF(plot_CI=False)
plt.title('CHF')

plt.subplot(3,5,10)
CI_y_chf = [1,3,5]
lower, point, upper = fit.distribution.CHF(CI_y=CI_y_chf,CI_type='time')
plt.scatter(lower,CI_y_chf,color='blue')
plt.scatter(point,CI_y_chf,color='purple')
plt.scatter(upper,CI_y_chf,color='red')
plt.title('CHF time')

plt.subplot(3,5,15)
lower, point, upper = fit.distribution.CHF(CI_x=CI_x,CI_type='rel')
plt.scatter(CI_x,lower,color='blue')
plt.scatter(CI_x,point,color='purple')
plt.scatter(CI_x,upper,color='red')
plt.title('CHF reliability')

plt.suptitle('Example of the range of plots available for a fitted distribution')
plt.tight_layout()
plt.show()
_images/range_of_plots_available.png

Example 3

Now, let’s get into a realistic example. We begin by importing the automotive dataset from the Datasets module. This dataset provides failure times (in miles) and right censored times (in miles) for a fleet of vehicles on test. We will fit a Weibull Distribution, and then from the fitted distribution, we want to know the system reliability after 100000 miles.

from reliability.Datasets import automotive
from reliability.Fitters import Fit_Weibull_2P
import matplotlib.pyplot as plt

fit = Fit_Weibull_2P(failures=automotive().failures,right_censored=automotive().right_censored,show_probability_plot=False)

# we want to know the system reliability after 100000 miles
lower,point,upper = fit.distribution.SF(CI_x=100000,CI_type='reliability',CI=0.8)
plt.scatter([100000,100000,100000],[lower,point,upper],color='black')

print('Failures:',automotive().failures)
print('Right censored:',automotive().right_censored)
print('')
print("The 80% reliability estimates at 100000 miles are:")
print("lower bound: "+"{:.2%}".format(lower))
print("point estimate: "+"{:.2%}".format(point))
print("upper bound: "+"{:.2%}".format(upper))
plt.show()


'''
Results from Fit_Weibull_2P (95% CI):
Analysis method: Maximum Likelihood Estimation (MLE)
Optimizer: TNC
Failures / Right censored: 10/21 (67.74194% right censored)

Parameter  Point Estimate  Standard Error  Lower CI  Upper CI
    Alpha          134651         42767.3   72252.9    250937
     Beta         1.15443         0.29614  0.698249   1.90863

Goodness of fit    Value
 Log-likelihood -128.974
           AICc  262.376
            BIC  264.816
             AD  35.6052

Failures: [5248, 7454, 16890, 17200, 38700, 45000, 49390, 69040, 72280, 131900]
Right censored: [3961, 4007, 4734, 6054, 7298, 10190, 23060, 27160, 28690, 37100, 40060, 45670, 53000, 67000, 69630, 77350, 78470, 91680, 105700, 106300, 150400]

The 80% reliability estimates at 100000 miles are:
lower bound: 33.83%
point estimate: 49.20%
upper bound: 62.87%
'''
_images/automotive_bounds.png