Circular Error Probable

From ShotStat
Revision as of 14:20, 1 March 2014 by Armadillo (talk | contribs) (Known variances and correlation)

Circular Error Probable estimators (CEP)

The Circular Error Probable \(CEP(p)\) for \(p \in [0,1)\) is the estimated radius of the smallest circle that is expected to cover proportion \(p\) of the shot group. If systematic accuracy bias is ignored, the circle is centered on the observed group center. If systematic accuracy bias is taken into account, the circle is centered on the point of aim, which will usually be offset from the observed group center.

Several different methods for estimating \(CEP(p)\) have been proposed which are based on different assumptions about the underlying distribution of coordinates (see the CEP literature overview for references):

  • The general correlated normal estimator (DiDonato & Jarnagin, 1961a; Evans, 1985) is based on the assumption of bivariate normality of the shot coordinates. It allows the \(x\)- and \(y\)-coordinates to be correlated and have different variances. If systematic accuracy bias is ignored, this estimate can be based on the closed-form solution for the distribution of radial error in the bivariate normal distribution re-written in polar coordinates (radius and angle; Hoyt, 1947; Paris, 2009). If systematic accuracy bias is taken into account, numerical integration of the multivariate normal distribution around an offset circle is required. The calculation of the correlated normal estimator is difficult and requires numerical approaches only available in specialized software.
  • The Grubbs-Pearson estimator (Grubbs, 1964) shares its assumptions with the general correlated normal estimator. It is based on the Pearson three-moment central \(\chi^{2}\)-approximation (Imhof, 1961; Pearson, 1959) of the cumulative distribution function of radial error in bivariate normal variables. Its advantage is that it is much easier to calculate than the exact distribution and does not require special software. For \(p \geq 0.25\), the approximation is very close to the true cumulative distribution function; for \(p < 0.25\) and some distribution shapes, it can diverge from it.
  • The Grubbs-Patnaik estimator (Grubbs, 1964) differs from the Grubbs-Pearson estimator insofar as it is based on the Patnaik two-moment central \(\chi^{2}\)-approximation (Patnaik, 1949) of the true cumulative distribution function of radial error. For \(p < 0.5\) and some distribution shapes, the approximation can diverge from the true cumulative distribution function.
  • The Ethridge (1983) estimator is not based on the assumption of bivariate normality of \((x,y)\)-coordinates but uses a robust unbiased estimator for the median radius (Hogg, 1967). This estimator "assumes that the square root of the radial miss distances follows the logarithmic generalized exponential power distribution" (Williams, 1997). It is only available for \(p = 0.5\).
  • The modified RAND R-234 estimator is an early example of a CEP estimator. It is based on lookup tables that were later fitted with a regression model to accommodate systematic accuracy bias (Pesapane & Irvine, 1977). It assumes a mostly circular distribution of \((x,y)\)-coordinates. In its original form it was only available for \(p = 0.5\), but McMillan & McMillan (2008) proposed an extension to levels \(p = 0.9\) and \(p = 0.95\) based on numerical simulations.
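As an illustration of how simple the Grubbs-Patnaik calculation is compared with the exact distribution, the following is a minimal sketch under stated assumptions. It matches the mean and variance of squared radial error with a scaled central \(\chi^{2}\) distribution, which is the standard two-moment construction; whether every detail agrees with Grubbs (1964) is not confirmed by this page. The function names (`chi2_cdf`, `chi2_quantile`, `cep_grubbs_patnaik`) and the parameter names are hypothetical, and the \(\chi^{2}\) quantile is computed with a simple series and bisection only to avoid external dependencies.

```python
import math


def chi2_cdf(x, df):
    """Central chi-square CDF via the power series of the regularized
    lower incomplete gamma function P(df/2, x/2)."""
    if x <= 0.0:
        return 0.0
    a, z = df / 2.0, x / 2.0
    term = 1.0 / a
    total = term
    k = 1
    while term > 1e-16 * total:
        term *= z / (a + k)
        total += term
        k += 1
    return total * math.exp(a * math.log(z) - z - math.lgamma(a))


def chi2_quantile(p, df):
    """Invert chi2_cdf by bisection (slow but dependency-free)."""
    lo, hi = 0.0, max(df, 1.0)
    while chi2_cdf(hi, df) < p:  # expand until the quantile is bracketed
        hi *= 2.0
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if chi2_cdf(mid, df) < p:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def cep_grubbs_patnaik(var_x, var_y, cov_xy=0.0, bias=(0.0, 0.0), p=0.5):
    """Two-moment chi-square approximation of CEP(p).

    var_x, var_y, cov_xy: (assumed known) variances and covariance.
    bias: offset of the group center from the point of aim; (0, 0)
    means systematic accuracy bias is ignored.
    """
    bx, by = bias
    # Mean and variance of the squared radial error x^2 + y^2.
    m = var_x + var_y + bx * bx + by * by
    v = (2.0 * (var_x ** 2 + var_y ** 2 + 2.0 * cov_xy ** 2)
         + 4.0 * (bx * bx * var_x + 2.0 * bx * by * cov_xy + by * by * var_y))
    df = 2.0 * m * m / v      # degrees of freedom of the matching chi-square
    scale = v / (2.0 * m)     # scale factor so that mean and variance agree
    return math.sqrt(scale * chi2_quantile(p, df))


# Circular case: the approximation is exact and equals sqrt(2 ln 2), about 1.1774.
print(cep_grubbs_patnaik(1.0, 1.0))
# Elliptical, correlated case at the 90% level.
print(cep_grubbs_patnaik(4.0, 1.0, cov_xy=0.5, p=0.9))
```

In practice a statistics library's \(\chi^{2}\) quantile function would replace the two helper functions; they are spelled out here only so that the sketch runs on its own.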

Comparing CEP estimators

When comparing the relative merits of different CEP estimators, it is important to distinguish two situations:

  • The true variances of \(x\)- and \(y\)-coordinates as well as their true correlation are known. In practice, this is never the case.
  • The variances and the correlation of \((x,y)\)-coordinates have to be estimated from a sample with a limited number of observations.

Known variances and correlation

With known variances and correlation, the CEP estimator based on the exact bivariate normal distribution is superior to the Grubbs estimators (which are only approximations), to the Rayleigh estimator (which is a special case of the general bivariate normal estimator with stricter assumptions, namely uncorrelated coordinates with equal variances), and to the RAND estimator. The correlated normal estimator generalizes to three-dimensional data, can accommodate systematic accuracy bias, and is available for arbitrary values of \(p\). When its assumptions are met, its only downside is that it is difficult to compute without specialized software.
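To give an idea of what the exact calculation involves, here is a minimal sketch for the simplest case only: known variances and covariance, two dimensions, and no systematic accuracy bias. It does not use the closed-form Hoyt expression; instead it integrates the bivariate normal density in polar coordinates, doing the radial integral analytically and the angular integral with the trapezoidal rule, then inverts the resulting distribution function by bisection. The function names and the number of angular steps are assumptions made for this example, and the covariance matrix is assumed to be positive definite.

```python
import math


def radial_cdf(r, var_x, var_y, cov_xy=0.0, n_theta=720):
    """P(radial error <= r) for a centered bivariate normal distribution."""
    # Rotating to the principal axes leaves radial distances unchanged,
    # so only the eigenvalues of the covariance matrix matter.
    half_trace = 0.5 * (var_x + var_y)
    spread = math.hypot(0.5 * (var_x - var_y), cov_xy)
    lam1, lam2 = half_trace + spread, half_trace - spread
    total = 0.0
    for i in range(n_theta):
        theta = 2.0 * math.pi * i / n_theta
        g = math.cos(theta) ** 2 / lam1 + math.sin(theta) ** 2 / lam2
        # Closed-form radial integral of rho * exp(-rho^2 g / 2) on [0, r].
        total += (1.0 - math.exp(-0.5 * r * r * g)) / g
    # Trapezoidal rule over one full period (step 2 pi / n_theta) times
    # the density constant 1 / (2 pi sqrt(lam1 * lam2)).
    return total / (n_theta * math.sqrt(lam1 * lam2))


def cep_correlated_normal(var_x, var_y, cov_xy=0.0, p=0.5):
    """Radius r with radial_cdf(r) = p, found by bisection."""
    lo, hi = 0.0, math.sqrt(var_x + var_y)
    while radial_cdf(hi, var_x, var_y, cov_xy) < p:  # bracket the root
        hi *= 2.0
    for _ in range(60):
        mid = 0.5 * (lo + hi)
        if radial_cdf(mid, var_x, var_y, cov_xy) < p:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


# Circular case: agrees with the known value sqrt(2 ln 2), about 1.1774.
print(cep_correlated_normal(1.0, 1.0))
# Correlated case with unequal variances.
print(cep_correlated_normal(4.0, 1.0, cov_xy=0.5))
```

The case with systematic accuracy bias is harder because the circle is no longer centered on the distribution, so the radial integral loses its closed form; that case is not covered by this sketch.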

The Grubbs-Pearson estimator has the theoretical advantage over the Grubbs-Patnaik estimator that the approximating distribution matches the true distribution not only in mean and variance but also in skewness. However, the Grubbs-Pearson estimator does not seem to be used in comparison studies (see below), only the Grubbs-Patnaik estimator. Both Grubbs estimators generalize to three-dimensional data, can accommodate systematic accuracy bias, and are available for arbitrary values of \(p\).
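The difference between the two approximations can be made concrete by writing down the moments they match. The notation here is introduced for illustration and is not taken from the cited papers: \(\Sigma\) is the covariance matrix of the coordinates, \(\mu\) is the offset of the group center from the point of aim (zero if bias is ignored), and \(m\), \(v\), \(\kappa_{3}\) are the mean, variance, and third central moment of the squared radial error \(r^{2}\).

\[
\begin{aligned}
m &= \operatorname{tr}(\Sigma) + \mu^{\top}\mu, \\
v &= 2\operatorname{tr}(\Sigma^{2}) + 4\,\mu^{\top}\Sigma\,\mu, \\
\kappa_{3} &= 8\operatorname{tr}(\Sigma^{3}) + 24\,\mu^{\top}\Sigma^{2}\mu .
\end{aligned}
\]

A two-moment match uses only \(m\) and \(v\), while a three-moment match also uses \(\kappa_{3}\):

\[
\begin{aligned}
\text{two moments:}\quad & r^{2} \approx \frac{v}{2m}\,\chi^{2}_{n}, & n &= \frac{2m^{2}}{v}, \\
\text{three moments:}\quad & r^{2} \approx m + \sqrt{\frac{v}{2n}}\,\bigl(\chi^{2}_{n} - n\bigr), & n &= \frac{8v^{3}}{\kappa_{3}^{2}} .
\end{aligned}
\]

In both cases \(CEP(p)\) is the square root of the \(p\)-quantile of the approximating distribution. For uncorrelated coordinates with equal variances and no bias, both give \(n = 2\) and reproduce the exact distribution. These are the standard moment-matching forms; the exact presentation in Grubbs (1964) may differ.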

The modified RAND R-234 estimator can accommodate moderate systematic accuracy bias but does not generalize to three-dimensional data and is limited to the 50% CEP.

The Ethridge estimator stands out because it does not require bivariate normality of the \((x,y)\)-coordinates. It generalizes to three-dimensional data and can accommodate systematic accuracy bias, but it is limited to the 50% CEP.

Unknown variances and correlation

When variances and correlation have to be estimated from limited samples, the situation is different. It seems common to plug the sample estimates into the formula for the distribution of radial error that applies when all parameters are known, but the resulting distribution is then no longer strictly valid. The reason is that even if the estimates are unbiased, the uncertainty from estimating the variances and correlation is not reflected in the distribution formulae (compare the t-test with the z-test). With growing sample size, the sampling distribution of radial error asymptotically approaches the one for known variances and correlation.
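The plug-in step itself can be sketched as follows. This is only an illustration of which sample quantities get substituted for the unknown parameters; the function name `plug_in_parameters` and the dictionary keys are hypothetical. The variances and covariance use the usual \(n - 1\) denominator.

```python
import math


def plug_in_parameters(points):
    """Sample estimates that are substituted for the unknown parameters.

    points: sequence of (x, y) shot coordinates.
    Returns the group center, the unbiased sample variances, the sample
    covariance, and the sample correlation.
    """
    n = len(points)
    if n < 2:
        raise ValueError("need at least two shots to estimate variances")
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    syy = sum((y - mean_y) ** 2 for _, y in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    var_x, var_y, cov_xy = sxx / (n - 1), syy / (n - 1), sxy / (n - 1)
    # The correlation is undefined if one coordinate does not vary at all.
    if var_x > 0.0 and var_y > 0.0:
        corr = cov_xy / math.sqrt(var_x * var_y)
    else:
        corr = float("nan")
    return {
        "center": (mean_x, mean_y),
        "var_x": var_x,
        "var_y": var_y,
        "cov_xy": cov_xy,
        "corr": corr,
    }


# Made-up five-shot group, purely for illustration.
shots = [(0.3, -0.1), (-0.5, 0.4), (0.1, 0.2), (0.8, -0.6), (-0.2, 0.0)]
print(plug_in_parameters(shots))
```

With only a handful of shots these estimates vary considerably from group to group, and it is this variability that the known-parameter distribution formulae do not account for.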

For small samples the question therefore is which estimator has the best characteristics, i.e., gives a good approximation to the true \(CEP(p)\). This question has been studied, e.g., by Williams (1997). A related question is which estimator is most robust to a very small number of outliers (fliers) that may result from clear operator error. See the literature overview for more comparison studies.