Fit_Weibull_2P_grouped

class reliability.Fitters.Fit_Weibull_2P_grouped(dataframe=None, show_probability_plot=True, print_results=True, CI=0.95, force_beta=None, quantiles=None, method='MLE', optimizer=None, CI_type='time', downsample_scatterplot=True, **kwargs)

Fits a two-parameter Weibull distribution (alpha, beta) to the data provided. This class is similar to Fit_Weibull_2P, but it accepts a dataframe, which allows efficient handling of grouped (repeated) data.

Parameters:

dataframe (dataframe) – a pandas dataframe of the appropriate format. See the example in Notes.
show_probability_plot (bool, optional) – True or False. Default = True
print_results (bool, optional) – Prints a dataframe of the point estimate, standard error, Lower CI and Upper CI for each parameter. True or False. Default = True
method (str, optional) – The method used to fit the distribution. Must be either ‘MLE’ (maximum likelihood estimation), ‘LS’ (least squares estimation), ‘RRX’ (Rank regression on X), or ‘RRY’ (Rank regression on Y). LS will perform both RRX and RRY and return the better one. Default is ‘MLE’.
optimizer (str, optional) – The optimization algorithm used to find the solution. Must be either ‘TNC’, ‘L-BFGS-B’, ‘nelder-mead’, or ‘powell’. The default optimizer is ‘TNC’. Unlike the other Fitters, there is no option to try all of these optimizers. If the optimizer fails, the initial guess will be returned.
CI (float, optional) – The confidence level used to estimate confidence limits on the parameters. Must be between 0 and 1. Default is 0.95 for 95% CI.
CI_type (str, optional) – The type of confidence bounds (on time or on reliability) shown on the plot. Must be either ‘time’, ‘reliability’, or ‘none’. Use ‘none’ to turn off the confidence intervals. Default is ‘time’. Some flexibility in names is allowed (e.g. ‘t’, ‘time’, ‘r’, ‘rel’, ‘reliability’ are all valid).
force_beta (float, int, optional) – Forces beta to the specified value. Used in ALT probability plotting. If specified, it must be > 0.
quantiles (bool, str, list, array, None, optional) – quantiles (y-values) to produce a table of quantiles failed with lower, point, and upper estimates. Default is None which results in no output. To use default array [0.01, 0.05, 0.1,…, 0.95, 0.99] set quantiles as either ‘auto’, True, ‘default’, ‘on’. If an array or list is specified then it will be used instead of the default array. Any array or list specified must contain values between 0 and 1.
downsample_scatterplot (bool, int, optional) – If True or None, and there are over 1000 points, then the scatterplot will be downsampled by a factor. The default downsample factor will seek to produce between 500 and 1000 points. If a number is specified, it will be used as the downsample factor. Default is True. This makes plotting faster when there are very large numbers of points. It only affects the scatterplot, not the calculations.
kwargs – Plotting keywords that are passed directly to matplotlib for the probability plot (e.g. color, label, linestyle)

Returns:

alpha (float) – the fitted Weibull_2P alpha parameter
beta (float) – the fitted Weibull_2P beta parameter
alpha_SE (float) – the standard error (sqrt(variance)) of alpha
beta_SE (float) – the standard error (sqrt(variance)) of beta
Cov_alpha_beta (float) – the covariance between alpha and beta
alpha_upper (float) – the upper CI estimate of alpha
alpha_lower (float) – the lower CI estimate of alpha
beta_upper (float) – the upper CI estimate of beta
beta_lower (float) – the lower CI estimate of beta
loglik (float) – Log Likelihood (as used in Minitab and Reliasoft)
loglik2 (float) – LogLikelihood*-2 (as used in JMP Pro)
AICc (float) – Akaike Information Criterion (corrected for small sample size)
BIC (float) – Bayesian Information Criterion
AD (float) – the Anderson Darling (corrected) statistic (as reported by Minitab)
distribution (object) – a Weibull_Distribution object with the parameters of the fitted distribution
results (dataframe) – a pandas dataframe of the results (point estimate, standard error, lower CI and upper CI for each parameter)
goodness_of_fit (dataframe) – a pandas dataframe of the goodness of fit values (Log-likelihood, AICc, BIC, AD).
quantiles (dataframe) – a pandas dataframe of the quantiles with bounds on time. This is only produced if quantiles is not None. Since quantiles defaults to None, this output is not normally produced.
probability_plot (object) – the axes handle for the probability plot. This is only returned if show_probability_plot = True

Notes

If the fitting process encounters a problem, a warning will be printed. This may be caused by the chosen distribution being a very poor fit to the data or by the data being heavily censored. If a warning is printed, consider trying a different optimizer.

Requirements of the input dataframe:

The column titles MUST be ‘category’, ‘time’, ‘quantity’.
The category values MUST be ‘F’ for failure or ‘C’ for censored (right censored).
The time values are the failure or right censored times.
The quantity is the number of items at that time. This must be specified for all values even if the quantity is 1.
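The requirements above can be checked before fitting. The following is a minimal sketch only: check_grouped_dataframe is a hypothetical helper written for this illustration and is not part of the reliability package. It tests only the rules stated above (column titles, category values, and a quantity on every row).

```python
import pandas as pd

REQUIRED_COLUMNS = ["category", "time", "quantity"]


def check_grouped_dataframe(df):
    """Hypothetical helper (not part of reliability): raise ValueError if
    df breaks one of the input requirements listed in the Notes."""
    missing = [c for c in REQUIRED_COLUMNS if c not in df.columns]
    if missing:
        raise ValueError(f"missing required columns: {missing}")
    # Only 'F' (failure) and 'C' (right censored) are allowed
    bad = set(df["category"]) - {"F", "C"}
    if bad:
        raise ValueError(f"category must be 'F' or 'C', found: {sorted(map(str, bad))}")
    # A quantity is required on every row, even when it is 1
    if df["quantity"].isna().any():
        raise ValueError("quantity must be specified for every row")
    return True


df = pd.DataFrame(
    {"category": ["F", "F", "C"], "time": [24, 39, 50], "quantity": [1, 2, 3]}
)
check_grouped_dataframe(df)
```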

Example of the input dataframe:

category	time	quantity
F	24	1
F	29	1
F	34	1
F	39	2
F	40	1
F	42	3
F	44	1
C	50	3
C	55	5
C	60	10

This is easiest to achieve by importing data from Excel. An example of this is:

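import pandas as pd
from reliability.Fitters import Fit_Weibull_2P_grouped

# Raw string so the backslashes in the Windows path are not read as escape sequences
filename = r'C:\Users\Current User\Desktop\data.xlsx'
df = pd.read_excel(io=filename)  # columns must be 'category', 'time', 'quantity'
Fit_Weibull_2P_grouped(dataframe=df)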
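If the data is held as ungrouped lists of times and not in a spreadsheet, a dataframe in the required format can also be built in code. The sketch below is one possible approach, not a feature of the library: group_raw_times is a hypothetical helper name, and the input lists are the example table from above written out one item per entry.

```python
import numpy as np
import pandas as pd


def group_raw_times(failures, right_censored):
    """Hypothetical helper: collapse repeated times into rows of
    category / time / quantity, the format described in the Notes."""
    parts = []
    for category, times in (("F", failures), ("C", right_censored)):
        # np.unique returns the sorted distinct times and how often each occurs
        unique_times, counts = np.unique(times, return_counts=True)
        parts.append(
            pd.DataFrame(
                {"category": category, "time": unique_times, "quantity": counts}
            )
        )
    return pd.concat(parts, ignore_index=True)


failures = [24, 29, 34, 39, 39, 40, 42, 42, 42, 44]
right_censored = [50] * 3 + [55] * 5 + [60] * 10
df = group_raw_times(failures, right_censored)
print(df)
```

The resulting df has the same rows as the example table and can be passed as the dataframe argument in the same way as the dataframe read from Excel.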

static LL(params, T_f, T_rc, Q_f, Q_rc)

static LL_fb(params, T_f, T_rc, Q_f, Q_rc, force_beta)

static logR(t, a, b)

static logf(t, a, b)
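The static methods above are listed without descriptions. As a rough sketch only, the code below assumes from their names and arguments that logf and logR are the log of the Weibull PDF and the log of the Weibull survival function, and that LL combines them into a negative log-likelihood weighted by the quantities. The function names are illustrative and this is not the library's source code.

```python
import numpy as np


def weibull_logf(t, a, b):
    # Log of the two-parameter Weibull PDF (a = alpha, b = beta)
    return np.log(b / a) + (b - 1) * np.log(t / a) - (t / a) ** b


def weibull_logR(t, a, b):
    # Log of the two-parameter Weibull survival function
    return -((t / a) ** b)


def grouped_neg_loglik(params, T_f, T_rc, Q_f, Q_rc):
    # Assumed form: each distinct time contributes once, weighted by its
    # quantity, so repeated observations never need to be expanded.
    a, b = params
    ll_f = np.sum(Q_f * weibull_logf(T_f, a, b))      # failures
    ll_rc = np.sum(Q_rc * weibull_logR(T_rc, a, b))   # right censored
    return -(ll_f + ll_rc)


# The example table from the Notes, split into failures and right censored
T_f = np.array([24, 29, 34, 39, 40, 42, 44], dtype=float)
Q_f = np.array([1, 1, 1, 2, 1, 3, 1])
T_rc = np.array([50, 55, 60], dtype=float)
Q_rc = np.array([3, 5, 10])

params = (50.0, 3.0)  # arbitrary trial values, not fitted estimates
nll_grouped = grouped_neg_loglik(params, T_f, T_rc, Q_f, Q_rc)

# Same data with every item listed individually (all quantities equal to 1)
nll_expanded = grouped_neg_loglik(
    params, np.repeat(T_f, Q_f), np.repeat(T_rc, Q_rc), 1, 1
)
print(np.isclose(nll_grouped, nll_expanded))
```

Under these assumptions the grouped and expanded forms give the same value, which is why the grouped input format is efficient for data with many repeated times.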