# Mixture_Model

class reliability.Distributions.Mixture_Model(distributions, proportions=None)

The mixture model is used to create a distribution that contains parts from multiple distributions. This allows for a more complex model to be constructed as the sum of other distributions, each multiplied by a proportion (where the proportions sum to 1). The model is obtained using the sum of the cumulative distribution functions:

$$CDF_{total} = (CDF_1 \times p_1) + (CDF_2 \times p_2) + (CDF_3 \times p_3) + \dots + (CDF_n \times p_n)$$
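The weighted sum above can be sketched in a few lines of plain Python. This is only an illustration of the formula, not the library's implementation: the helper names `weibull_cdf` and `mixture_cdf` and the parameter values are assumptions chosen for the example.

```python
import math

def weibull_cdf(x, alpha, beta):
    """CDF of a two-parameter Weibull distribution (scale alpha, shape beta)."""
    if x <= 0:
        return 0.0
    return 1.0 - math.exp(-((x / alpha) ** beta))

def mixture_cdf(x, component_cdfs, proportions):
    """Weighted sum of the component CDFs; the proportions must sum to 1."""
    if not math.isclose(sum(proportions), 1.0):
        raise ValueError("proportions must sum to 1")
    return sum(p * cdf(x) for cdf, p in zip(component_cdfs, proportions))

# Two illustrative components (the parameter values are arbitrary).
components = [
    lambda x: weibull_cdf(x, alpha=10.0, beta=1.5),
    lambda x: weibull_cdf(x, alpha=50.0, beta=3.0),
]
proportions = [0.3, 0.7]

cdf_at_20 = mixture_cdf(20.0, components, proportions)
```

Because the proportions sum to 1, the mixture CDF at any point is a weighted average of the component CDFs at that point.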

The output API is similar to the other probability distributions (Weibull, Normal, etc.) as shown below.

Parameters:
• distributions (list, array) – List or array of probability distribution objects used to construct the model.

• proportions (list, array) – List or array of floats specifying how much of each distribution to add to the mixture. The sum of proportions must always be 1.

Returns:

• name (str) – ‘Mixture’

• name2 (str) – ‘Mixture using 3 distributions’. The exact name depends on the number of distributions used.

• mean (float)

• variance (float)

• standard_deviation (float)

• skewness (float)

• kurtosis (float)

• excess_kurtosis (float)

• median (float)

• mode (float)

• b5 (float)

• b95 (float)

Notes

An equivalent form of this model is to sum the PDFs, weighted by the same proportions. The SF is obtained as 1 - CDF. Note that you cannot simply sum the HF or CHF, as doing so would be equivalent to the competing risks model. Because the mixture is a weighted average of its components, the mixture model will always lie somewhere between the constituent models.
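A small numerical sketch may make the point about the hazard function concrete. It assumes two Weibull components with arbitrary parameters and uses hypothetical helper names; the mixture HF is computed as PDF/SF of the mixture and compared with a naive proportion-weighted sum of the component HFs.

```python
import math

PARAMS = [(10.0, 1.5), (50.0, 3.0)]   # (alpha, beta) pairs, illustrative only
PROPORTIONS = [0.3, 0.7]

def weibull_pdf(x, alpha, beta):
    return (beta / alpha) * (x / alpha) ** (beta - 1) * math.exp(-((x / alpha) ** beta))

def weibull_sf(x, alpha, beta):
    return math.exp(-((x / alpha) ** beta))

def mixture_pdf(x):
    # Equivalent form of the model: weighted sum of the component PDFs.
    return sum(p * weibull_pdf(x, a, b) for p, (a, b) in zip(PROPORTIONS, PARAMS))

def mixture_sf(x):
    # SF = 1 - CDF, which reduces to the weighted sum of the component SFs.
    return sum(p * weibull_sf(x, a, b) for p, (a, b) in zip(PROPORTIONS, PARAMS))

def mixture_hf(x):
    # The hazard must be derived from the combined PDF and SF.
    return mixture_pdf(x) / mixture_sf(x)

def component_hfs(x):
    return [weibull_pdf(x, a, b) / weibull_sf(x, a, b) for a, b in PARAMS]

x = 20.0
naive_weighted_hf = sum(p * h for p, h in zip(PROPORTIONS, component_hfs(x)))
true_mixture_hf = mixture_hf(x)
```

At `x = 20` the two values differ noticeably, while `true_mixture_hf` stays between the two component hazards.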

This model should be used when a data set cannot be modelled by a single distribution, as evidenced by the shape of the PDF, CDF or probability plot (points do not form a straight line). Unlike the competing risks model, this model requires the proportions to be supplied.

As this process is additive for the survival function, and may accept many distributions of different types, the mathematical formulation quickly gets complex. For this reason, the algorithm combines the models numerically rather than analytically, so there are no simple formulas for many of the descriptive statistics (mean, median, etc.). The accuracy of the model also depends on xvals: if the xvals array is small (<100 values) then the answer will be ‘blocky’ and inaccurate. The variable xvals is only accepted for PDF, CDF, SF, HF and CHF. The other methods (like random samples) use the default xvals for maximum accuracy. When xvals is not given, 1000 values are generated by default. Consider this carefully when specifying xvals in order to avoid inaccuracies in the results.
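To illustrate why the number of x-values matters for a numerical model, the sketch below approximates the mean of a two-component Weibull mixture by integrating the SF on an evenly spaced grid, once with 10 points and once with 1000. The integration scheme (trapezoidal rule), the upper limit of 200 and the helper names are assumptions made for this example; the library's internal algorithm may differ.

```python
import math

PARAMS = [(10.0, 1.5), (50.0, 3.0)]   # (alpha, beta) pairs, illustrative only
PROPORTIONS = [0.3, 0.7]

def mixture_sf(x):
    return sum(p * math.exp(-((x / a) ** b)) for p, (a, b) in zip(PROPORTIONS, PARAMS))

def mean_from_grid(n_points, upper=200.0):
    """Approximate the mean as the trapezoidal integral of the SF on an even grid."""
    step = upper / (n_points - 1)
    sf = [mixture_sf(i * step) for i in range(n_points)]
    return step * (sum(sf) - 0.5 * (sf[0] + sf[-1]))

# Closed-form mean of the mixture: proportion-weighted sum of the Weibull means.
exact_mean = sum(p * a * math.gamma(1 + 1 / b) for p, (a, b) in zip(PROPORTIONS, PARAMS))

coarse_error = abs(mean_from_grid(10) - exact_mean)     # sparse grid
fine_error = abs(mean_from_grid(1000) - exact_mean)     # dense grid
```

The sparse grid gives a visibly larger error than the dense one, which is the same effect that makes plots and statistics 'blocky' when too few xvals are supplied.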

CDF(xvals=None, xmin=None, xmax=None, show_plot=True, plot_components=False, **kwargs)

Plots the CDF (cumulative distribution function)

Parameters:
• show_plot (bool, optional) – True or False. Default = True

• plot_components (bool) – Option to plot the components of the model. True or False. Default = False.

• xvals (array, list, optional) – x-values for plotting

• xmin (int, float, optional) – minimum x-value for plotting

• xmax (int, float, optional) – maximum x-value for plotting

• kwargs – Plotting keywords that are passed directly to matplotlib (e.g. color, linestyle)

Returns:

yvals (array, float) – The y-values of the plot

Notes

The plot will be shown if show_plot is True (which it is by default) and len(xvals) >= 2.

If xvals is specified, it will be used. If xvals is not specified but xmin and/or xmax are specified then an array with 200 elements will be created using these limits. If nothing is specified then the range will be based on the distribution’s parameters.

CHF(xvals=None, xmin=None, xmax=None, show_plot=True, plot_components=False, **kwargs)

Plots the CHF (cumulative hazard function)

Parameters:
• show_plot (bool, optional) – True or False. Default = True

• plot_components (bool) – Option to plot the components of the model. True or False. Default = False.

• xvals (array, list, optional) – x-values for plotting

• xmin (int, float, optional) – minimum x-value for plotting

• xmax (int, float, optional) – maximum x-value for plotting

• kwargs – Plotting keywords that are passed directly to matplotlib (e.g. color, linestyle)

Returns:

yvals (array, float) – The y-values of the plot

Notes

The plot will be shown if show_plot is True (which it is by default) and len(xvals) >= 2.

If xvals is specified, it will be used. If xvals is not specified but xmin and/or xmax are specified then an array with 200 elements will be created using these limits. If nothing is specified then the range will be based on the distribution’s parameters.

HF(xvals=None, xmin=None, xmax=None, show_plot=True, plot_components=False, **kwargs)

Plots the HF (hazard function)

Parameters:
• show_plot (bool, optional) – True or False. Default = True

• plot_components (bool) – Option to plot the components of the model. True or False. Default = False.

• xvals (array, list, optional) – x-values for plotting

• xmin (int, float, optional) – minimum x-value for plotting

• xmax (int, float, optional) – maximum x-value for plotting

• kwargs – Plotting keywords that are passed directly to matplotlib (e.g. color, linestyle)

Returns:

yvals (array, float) – The y-values of the plot

Notes

The plot will be shown if show_plot is True (which it is by default) and len(xvals) >= 2.

If xvals is specified, it will be used. If xvals is not specified but xmin and/or xmax are specified then an array with 200 elements will be created using these limits. If nothing is specified then the range will be based on the distribution’s parameters.

PDF(xvals=None, xmin=None, xmax=None, show_plot=True, plot_components=False, **kwargs)

Plots the PDF (probability density function)

Parameters:
• show_plot (bool, optional) – True or False. Default = True

• plot_components (bool) – Option to plot the components of the model. True or False. Default = False.

• xvals (array, list, optional) – x-values for plotting

• xmin (int, float, optional) – minimum x-value for plotting

• xmax (int, float, optional) – maximum x-value for plotting

• kwargs – Plotting keywords that are passed directly to matplotlib (e.g. color, linestyle)

Returns:

yvals (array, float) – The y-values of the plot

Notes

The plot will be shown if show_plot is True (which it is by default) and len(xvals) >= 2.

If xvals is specified, it will be used. If xvals is not specified but xmin and/or xmax are specified then an array with 200 elements will be created using these limits. If nothing is specified then the range will be based on the distribution’s parameters.

SF(xvals=None, xmin=None, xmax=None, show_plot=True, plot_components=False, **kwargs)

Plots the SF (survival function)

Parameters:
• show_plot (bool, optional) – True or False. Default = True

• plot_components (bool) – Option to plot the components of the model. True or False. Default = False.

• xvals (array, list, optional) – x-values for plotting

• xmin (int, float, optional) – minimum x-value for plotting

• xmax (int, float, optional) – maximum x-value for plotting

• kwargs – Plotting keywords that are passed directly to matplotlib (e.g. color, linestyle)

Returns:

yvals (array, float) – The y-values of the plot

Notes

The plot will be shown if show_plot is True (which it is by default) and len(xvals) >= 2.

If xvals is specified, it will be used. If xvals is not specified but xmin and/or xmax are specified then an array with 200 elements will be created using these limits. If nothing is specified then the range will be based on the distribution’s parameters.

inverse_SF(q)

Inverse survival function calculator

Parameters:

q (float, list, array) – Quantile to be calculated. Must be between 0 and 1.

Returns:

x (float) – The inverse of the SF at q.
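Since a mixture generally has no closed-form inverse, the inverse SF has to be found numerically. The following is a minimal sketch using bisection on an illustrative Weibull mixture; the function name `inverse_sf`, the search bounds and the strict `0 < q < 1` check are assumptions for this example and are not taken from the library.

```python
import math

PARAMS = [(10.0, 1.5), (50.0, 3.0)]   # (alpha, beta) pairs, illustrative only
PROPORTIONS = [0.3, 0.7]

def mixture_sf(x):
    return sum(p * math.exp(-((x / a) ** b)) for p, (a, b) in zip(PROPORTIONS, PARAMS))

def inverse_sf(q, lo=0.0, hi=1000.0, tol=1e-9):
    """Find x such that SF(x) = q by bisection; SF is non-increasing in x."""
    if not 0.0 < q < 1.0:
        raise ValueError("q must be strictly between 0 and 1")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if mixture_sf(mid) > q:
            lo = mid      # survival still above q, so the answer lies to the right
        else:
            hi = mid      # survival at or below q, so the answer lies to the left
    return 0.5 * (lo + hi)

x_10 = inverse_sf(0.10)   # x-value at which 10% of the population survives
x_50 = inverse_sf(0.50)   # median of the mixture
```

Because SF = 1 - CDF, the inverse of the CDF at `q` is the same x-value as the inverse of the SF at `1 - q`.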

mean_residual_life(t)

Mean Residual Life calculator

Parameters:

t (int, float) – Time (x-value) at which mean residual life is to be evaluated

Returns:

MRL (float) – The mean residual life
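The mean residual life at time t is conventionally defined as the integral of the SF from t to infinity, divided by SF(t). The sketch below evaluates that definition numerically for an illustrative Weibull mixture; the truncation limit, the grid size and the helper names are assumptions for this example rather than the library's method.

```python
import math

PARAMS = [(10.0, 1.5), (50.0, 3.0)]   # (alpha, beta) pairs, illustrative only
PROPORTIONS = [0.3, 0.7]

def mixture_sf(x):
    return sum(p * math.exp(-((x / a) ** b)) for p, (a, b) in zip(PROPORTIONS, PARAMS))

def mean_residual_life(t, upper=400.0, n_points=4001):
    """MRL(t) = (integral of SF from t to infinity) / SF(t), via the trapezoidal rule.

    The integral is truncated at `upper`, where the SF is negligible for these parameters.
    """
    if not 0.0 <= t < upper:
        raise ValueError("t must satisfy 0 <= t < upper")
    step = (upper - t) / (n_points - 1)
    sf = [mixture_sf(t + i * step) for i in range(n_points)]
    area = step * (sum(sf) - 0.5 * (sf[0] + sf[-1]))
    return area / mixture_sf(t)

mrl_at_0 = mean_residual_life(0.0)    # equals the mean of the distribution
mrl_at_20 = mean_residual_life(20.0)  # expected remaining life given survival to t = 20
```

A useful sanity check is that the MRL at t = 0 matches the mean of the distribution.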

plot(xvals=None, xmin=None, xmax=None)

Plots all functions (PDF, CDF, SF, HF, CHF) and descriptive statistics in a single figure

Parameters:
• xvals (list, array, optional) – x-values for plotting

• xmin (int, float, optional) – minimum x-value for plotting

• xmax (int, float, optional) – maximum x-value for plotting

Returns:

None

Notes

The plot will be shown. No need to use plt.show(). If xvals is specified, it will be used. If xvals is not specified but xmin and/or xmax are specified then an array with 200 elements will be created using these limits. If nothing is specified then the range will be based on the distribution’s parameters. No plotting keywords are accepted.

quantile(q)

Quantile calculator

Parameters:

q (float, list, array) – Quantile to be calculated. Must be between 0 and 1.

Returns:

x (float) – The inverse of the CDF at q. This is the value x for which the probability that a random variable from the distribution is < x equals q.

random_samples(number_of_samples, seed=None)

Draws random samples from the probability distribution

Parameters:
• number_of_samples (int) – The number of samples to be drawn. Must be greater than 0.

• seed (int, optional) – The random seed passed to numpy. Default = None

Returns:

samples (array) – The random samples

Notes

This is the same as rvs in scipy.stats
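One common way to sample from a mixture is to pick a component at random according to the proportions and then draw from that component. The sketch below shows this with Python's standard `random` module and two illustrative Weibull components. It is not necessarily how the library draws its samples (which passes the seed to numpy), and `mixture_samples` is a hypothetical name.

```python
import random

PARAMS = [(10.0, 1.5), (50.0, 3.0)]   # (alpha, beta) pairs, illustrative only
PROPORTIONS = [0.3, 0.7]

def mixture_samples(number_of_samples, seed=None):
    """Draw samples by choosing a component per sample, then sampling from it."""
    if number_of_samples < 1:
        raise ValueError("number_of_samples must be greater than 0")
    rng = random.Random(seed)          # local generator so the seed is reproducible
    samples = []
    for _ in range(number_of_samples):
        # Choose one component with probability equal to its proportion.
        alpha, beta = rng.choices(PARAMS, weights=PROPORTIONS, k=1)[0]
        # random.weibullvariate takes the scale (alpha) first, then the shape (beta).
        samples.append(rng.weibullvariate(alpha, beta))
    return samples

samples = mixture_samples(20000, seed=1)
sample_mean = sum(samples) / len(samples)
```

With a fixed seed the draw is repeatable, and the sample mean approaches the proportion-weighted mean of the components as the sample size grows.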

stats()

Descriptive statistics of the probability distribution. These are the same as the statistics shown using .plot() but printed to the console.

Parameters:

None

Returns:

None