
chi2test

class reliability.Reliability_testing.chi2test(distribution, data, significance=0.05, bins=None, print_results=True, show_plot=True)

Performs the chi-squared goodness-of-fit test to determine whether the hypothesis that the data come from the specified distribution can be accepted or rejected at the specified level of significance.

This test is not a means of comparing candidate distributions (which can be done with AICc, BIC, and AD). It only allows us to accept or reject the hypothesis that the data come from one specified distribution.

Parameters:
  • distribution (object) – A distribution object created using the reliability.Distributions module.

  • data (array, list) – The data that are hypothesised to come from the distribution.

  • significance (float, optional) – The significance level, which is the complement of the confidence level: a significance of 0.05 corresponds to 95% confidence. Must be between 0 and 0.5. Default = 0.05.

  • bins (array, list, string, optional) – An array or list of the bin edges used to group the data OR a string naming one of numpy's bin edge methods. String options are ‘auto’, ‘fd’, ‘doane’, ‘scott’, ‘stone’, ‘rice’, ‘sturges’, or ‘sqrt’. Default = ‘auto’ (used when bins is None). For more information on these methods, see the numpy documentation: https://numpy.org/doc/stable/reference/generated/numpy.histogram_bin_edges.html

  • print_results (bool, optional) – If True the results will be printed. Default = True.

  • show_plot (bool, optional) – If True a plot of the distribution and histogram will be generated. Default = True.

Returns:

  • chisquared_statistic (float) – The chi-squared statistic.

  • chisquared_critical_value (float) – The chi-squared critical value.

  • hypothesis (string) – ‘ACCEPT’ or ‘REJECT’. If chisquared_statistic < chisquared_critical_value then we can accept the hypothesis that the data come from the specified distribution; otherwise the hypothesis is rejected.

  • bin_edges (array) – The bin edges used. If bins is a list or array then bin_edges = bins. If bins is a string then this output gives the bin edges that were calculated by that method.

Notes

The result is sensitive to the choice of bins: the same data can be accepted with one set of bin edges and rejected with another. For this reason, it is recommended to leave bins as the default value.
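
To see why the bins matter, the sketch below computes a Pearson chi-squared statistic by hand for one data set under two different sets of bin edges. It uses only the Python standard library and is not the library's implementation. The helper chi2_statistic, the data, the bin edges, and the degrees-of-freedom convention (number of bins minus number of distribution parameters minus 1) are all assumptions made for illustration. The critical values are standard chi-squared table entries for 0.05 significance.

```python
from bisect import bisect_right
import math
import random
from statistics import NormalDist


def chi2_statistic(data, bin_edges, dist):
    """Pearson chi-squared statistic for data grouped by bin_edges under dist."""
    n_bins = len(bin_edges) - 1
    observed = [0] * n_bins
    for x in data:
        if not bin_edges[0] <= x <= bin_edges[-1]:
            raise ValueError("bin_edges must encompass all of the data")
        # Right-open bins, except the last one, which includes its upper edge.
        i = min(bisect_right(bin_edges, x) - 1, n_bins - 1)
        observed[i] += 1
    # Expected counts come from the probability mass the distribution
    # assigns to each bin, renormalised so the masses sum to 1.
    cdf = [dist.cdf(edge) for edge in bin_edges]
    mass = [hi - lo for lo, hi in zip(cdf, cdf[1:])]
    total = sum(mass)
    expected = [len(data) * m / total for m in mass]
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Hypothetical data: 200 samples from a normal distribution.
rng = random.Random(0)
data = [rng.gauss(100, 15) for _ in range(200)]
dist = NormalDist(mu=100, sigma=15)

binnings = {
    "coarse": [-math.inf, 85, 100, 115, math.inf],
    "fine": [-math.inf, 70, 80, 90, 100, 110, 120, 130, math.inf],
}
# Tabulated chi-squared critical values at 0.05 significance, keyed by
# degrees of freedom.
critical_95 = {1: 3.841, 5: 11.070}

for name, edges in binnings.items():
    # Assumed convention: bins - parameters (mu, sigma) - 1.
    dof = (len(edges) - 1) - 2 - 1
    stat = chi2_statistic(data, edges, dist)
    print(name, "statistic:", round(stat, 3), "critical value:", critical_95[dof])
```

Both the statistic and the critical value change with the bin edges, so whether the two binnings lead to the same conclusion depends on the data.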