Fit_Weibull_Mixture
- class reliability.Fitters.Fit_Weibull_Mixture(failures=None, right_censored=None, show_probability_plot=True, print_results=True, CI=0.95, optimizer=None, downsample_scatterplot=True, **kwargs)
Fits a mixture of two Weibull_2P distributions (the gamma location parameter is not fitted). Right censoring is supported, though care should be taken to ensure that there still appear to be two groups when plotting only the failure data. A second group cannot be made from a mostly or totally censored set of samples. Use this model when you think there are multiple failure modes acting to create the failure data.
- Parameters:
failures (array, list) – An array or list of the failure data. There must be at least 4 failures, but it is highly recommended to use another model if you have fewer than 20 failures.
right_censored (array, list, optional) – The right censored data. Optional input. Default = None.
show_probability_plot (bool, optional) – True or False. Default = True
print_results (bool, optional) – Prints a dataframe of the point estimate, standard error, Lower CI and Upper CI for each parameter. True or False. Default = True
optimizer (str, optional) – The optimization algorithm used to find the solution. Must be either ‘TNC’, ‘L-BFGS-B’, ‘nelder-mead’, or ‘powell’. If an optimizer is specified, only that optimizer is used. To try all of them and return the best result, specify ‘best’. The default behaviour is to try each optimizer in order (‘TNC’, ‘L-BFGS-B’, ‘nelder-mead’, and ‘powell’) and stop once one of the optimizers finds a solution. If the optimizer fails, the initial guess will be returned. For more detail see the documentation.
CI (float, optional) – The confidence level used to estimate the confidence limits on the parameters. Must be between 0 and 1. Default is 0.95 for a 95% CI.
downsample_scatterplot (bool, int, optional) – If True or None, and there are over 1000 points, then the scatterplot will be downsampled by a factor. The default downsample factor will seek to produce between 500 and 1000 points. If a number is specified, it will be used as the downsample factor. Default is True. This functionality makes plotting faster when there are very large numbers of points. It only affects the scatterplot, not the calculations.
kwargs – Plotting keywords that are passed directly to matplotlib for the probability plot (e.g. color, label, linestyle)
- Returns:
alpha_1 (float) – the fitted Weibull_2P alpha parameter for the first (left) group
beta_1 (float) – the fitted Weibull_2P beta parameter for the first (left) group
alpha_2 (float) – the fitted Weibull_2P alpha parameter for the second (right) group
beta_2 (float) – the fitted Weibull_2P beta parameter for the second (right) group
proportion_1 (float) – the fitted proportion of the first (left) group
proportion_2 (float) – the fitted proportion of the second (right) group. Same as 1-proportion_1
alpha_1_SE (float) – the standard error (sqrt(variance)) of the parameter
beta_1_SE (float) – the standard error (sqrt(variance)) of the parameter
alpha_2_SE (float) – the standard error (sqrt(variance)) of the parameter
beta_2_SE (float) – the standard error (sqrt(variance)) of the parameter
proportion_1_SE (float) – the standard error (sqrt(variance)) of the parameter
alpha_1_upper (float) – the upper CI estimate of the parameter
alpha_1_lower (float) – the lower CI estimate of the parameter
alpha_2_upper (float) – the upper CI estimate of the parameter
alpha_2_lower (float) – the lower CI estimate of the parameter
beta_1_upper (float) – the upper CI estimate of the parameter
beta_1_lower (float) – the lower CI estimate of the parameter
beta_2_upper (float) – the upper CI estimate of the parameter
beta_2_lower (float) – the lower CI estimate of the parameter
proportion_1_upper (float) – the upper CI estimate of the parameter
proportion_1_lower (float) – the lower CI estimate of the parameter
loglik (float) – Log Likelihood (as used in Minitab and Reliasoft)
loglik2 (float) – LogLikelihood*-2 (as used in JMP Pro)
AICc (float) – Akaike Information Criterion (corrected for small sample sizes)
BIC (float) – Bayesian Information Criterion
AD (float) – the Anderson Darling (corrected) statistic (as reported by Minitab)
distribution (object) – a Mixture_Model object with the parameters of the fitted distribution
results (dataframe) – a pandas dataframe of the results (point estimate, standard error, lower CI and upper CI for each parameter)
goodness_of_fit (dataframe) – a pandas dataframe of the goodness of fit values (Log-likelihood, AICc, BIC, AD).
probability_plot (object) – the axes handle for the probability plot. This is only returned if show_probability_plot = True
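The relationship between loglik, loglik2, AICc and BIC can be illustrated with the standard textbook formulas. The sketch below is not taken from the library's source: the function name `goodness_of_fit` is hypothetical, `k=5` assumes the five fitted parameters listed above (alpha_1, beta_1, alpha_2, beta_2, proportion_1), and `n` is assumed to be the total number of observations (failures plus right censored). Returning infinity when the AICc correction is undefined is a choice made for this example only.

```python
import math

def goodness_of_fit(loglik, n, k=5):
    """Return (loglik2, AICc, BIC) computed from a log-likelihood.

    loglik: the maximised log-likelihood
    n: number of observations (assumed: failures + right censored)
    k: number of fitted parameters (assumed: 5 for a two-Weibull mixture)
    """
    loglik2 = -2 * loglik  # "LogLikelihood*-2"
    if n - k - 1 > 0:
        # AIC plus the small-sample correction term
        aicc = 2 * k + loglik2 + (2 * k**2 + 2 * k) / (n - k - 1)
    else:
        aicc = math.inf  # correction undefined for so few observations
    bic = k * math.log(n) + loglik2
    return loglik2, aicc, bic

# Illustrative numbers only, not the output of a real fit
loglik2, aicc, bic = goodness_of_fit(loglik=-375.0, n=100)
print(loglik2, aicc, bic)
```

Lower AICc and BIC values indicate a better trade-off between fit and model complexity, which is useful when comparing the mixture against a single-distribution fit to the same data.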
Notes
This differs from the Weibull Competing Risks model: here the overall Survival Function is the proportion-weighted sum of the individual Survival Functions, whereas in the Weibull Competing Risks model it is their product.
Mixture Model: \(SF_{model} = (proportion_1 \times SF_1) + ((1-proportion_1) \times SF_2)\)
Competing Risks Model: \(SF_{model} = SF_1 \times SF_2\)
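To make the difference between the two formulas concrete, here is a minimal sketch that evaluates both survival functions with plain Python. It does not use the library. The helper names (`weibull_sf`, `mixture_sf`, `competing_risks_sf`) and the parameter values are hypothetical and chosen only for illustration.

```python
import math

def weibull_sf(t, alpha, beta):
    """Survival function of a two-parameter Weibull distribution."""
    return math.exp(-((t / alpha) ** beta))

def mixture_sf(t, a1, b1, a2, b2, p1):
    """Mixture model: proportion-weighted sum of the two survival functions."""
    return p1 * weibull_sf(t, a1, b1) + (1 - p1) * weibull_sf(t, a2, b2)

def competing_risks_sf(t, a1, b1, a2, b2):
    """Competing risks model: product of the two survival functions."""
    return weibull_sf(t, a1, b1) * weibull_sf(t, a2, b2)

# Illustrative parameters: (alpha_1, beta_1, alpha_2, beta_2)
params = (10.0, 3.0, 40.0, 4.0)
for t in (0.0, 5.0, 20.0, 40.0):
    print(t, mixture_sf(t, *params, p1=0.3), competing_risks_sf(t, *params))
```

The mixture survival function always lies between the two component survival functions, while the competing risks survival function is never larger than the smaller of the two.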
Similar to the competing risks model, you can use this model when you think there are multiple failure modes acting to create the failure data.
Whilst some failure modes may not be fitted as well by a Weibull distribution as they may be by another distribution, it is unlikely that a mixture of data from two distributions (particularly if they are overlapping) will be fitted noticeably better by other types of mixtures than would be achieved by a Weibull mixture. For this reason, other types of mixtures are not implemented.
If the fitting process encounters a problem, a warning will be printed. This may be caused by the chosen distribution being a very poor fit to the data or by the data being heavily censored. If a warning is printed, consider trying a different optimizer.
- static LL(params, T_f, T_rc)
- static logR(t, a1, b1, a2, b2, p)
- static logf(t, a1, b1, a2, b2, p)
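The three static methods above are listed without descriptions. As a rough guide to what functions with these argument lists typically compute, the sketch below writes out a mixture log-likelihood with right censoring in plain Python. It is an assumption, not the library's implementation. The names `mixture_logf`, `mixture_logR` and `neg_log_likelihood` are hypothetical, and returning the negative log-likelihood (so that a minimizer can be applied) is a choice made for this example.

```python
import math

def weibull_pdf(t, alpha, beta):
    """Probability density of a two-parameter Weibull distribution."""
    return (beta / alpha) * (t / alpha) ** (beta - 1) * math.exp(-((t / alpha) ** beta))

def weibull_sf(t, alpha, beta):
    """Survival function of a two-parameter Weibull distribution."""
    return math.exp(-((t / alpha) ** beta))

def mixture_logf(t, a1, b1, a2, b2, p):
    """Log of the mixture density (the contribution of an observed failure)."""
    return math.log(p * weibull_pdf(t, a1, b1) + (1 - p) * weibull_pdf(t, a2, b2))

def mixture_logR(t, a1, b1, a2, b2, p):
    """Log of the mixture survival function (the contribution of a right censored item)."""
    return math.log(p * weibull_sf(t, a1, b1) + (1 - p) * weibull_sf(t, a2, b2))

def neg_log_likelihood(params, T_f, T_rc):
    """Negative log-likelihood for failure times T_f and right censored times T_rc."""
    a1, b1, a2, b2, p = params
    ll = sum(mixture_logf(t, a1, b1, a2, b2, p) for t in T_f)
    ll += sum(mixture_logR(t, a1, b1, a2, b2, p) for t in T_rc)
    return -ll

# Illustrative data and parameters only
print(neg_log_likelihood([10.0, 3.0, 40.0, 4.0, 0.3], T_f=[6.0, 9.0, 35.0, 42.0], T_rc=[50.0]))
```

Failures contribute through the density and right censored items through the survival function, which is why a group that is mostly censored carries too little information to be identified as a separate component.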