Difference between revisions of "Circular Error Probable"

From ShotStat
m (Unknown variances and correlation)
Line 1: Line 1:
 
<p style="text-align:right"><B>Previous:</B> [[Measuring Precision]]</p>
 
<p style="text-align:right"><B>Previous:</B> [[Measuring Precision]]</p>
  
= Circular Error Probable estimators (CEP) =
+
= Circular Error Probable (CEP) =
  
The Circular Error Probable <math>CEP(p)</math> for <math>p \in [0,1)</math> is the estimated radius of the smallest circle that is expected to cover proportion <math>p</math> of the shot group. If systematic accuracy bias is ignored, the center of the circle is set to coincide with the observed group center. If systematic accuracy bias is taken into account, the center of the circle is set to coincide with the point of aim, probably offset from the observed group center.
+
The Circular Error Probable <math>CEP(p)</math> for <math>p \in [0,1)</math> is the estimated radius of the smallest circle that is expected to cover proportion <math>p</math> of the shot group.
  
 +
== Systematic Accuracy Bias ==
 +
Some approaches to estimating CEP conflate the question of precision with the question of accuracy, or "sighting in."
 +
 +
The simpler case only tries to estimate precision, and computes CEP about the sample center.
 +
 +
The general case allows that the point-of-aim is offset from the true center point-of-impact.  In the literature this is referred to as ''systematic accuracy bias''.  Including systematic accuracy bias sets the center of the circle to the point of aim, which means the sample center will probably be offset from that and CEP will be correspondingly larger.
 +
 +
== Estimators ==
 
Several different methods for estimating <math>CEP(p)</math> have been proposed which are based on different assumptions about the underlying distribution of coordinates (see the [[CEP_literature|CEP literature overview]] for references and the [[Measuring_Tools#shotGroups_Analysis_Package|shotGroups]] package for a free open source implementation):
 
Several different methods for estimating <math>CEP(p)</math> have been proposed which are based on different assumptions about the underlying distribution of coordinates (see the [[CEP_literature|CEP literature overview]] for references and the [[Measuring_Tools#shotGroups_Analysis_Package|shotGroups]] package for a free open source implementation):
  
* The general correlated normal estimator ([[CEP_literature#DiDonato1961a|DiDonato & Jarnagin, 1961a]]; [[CEP_literature#Evans1985|Evans, 1985]]) is based on the assumption of [[Measuring_Precision#General_Bivariate_Normal|bivariate normality]] of the shot coordinates. It allows the <math>x</math>- and <math>y</math>-coordinates to be correlated and have different variances. If systematic accuracy bias is ignored, this estimate can be based on the closed-form solution for the distribution of radial error in the bivariate normal distribution re-written in polar coordinates (radius and angle; [[CEP_literature#Hoyt1947|Hoyt, 1947]]; [[CEP_literature#Paris2009|Paris, 2009]]). If systematic accuracy bias is taken into account, [[CEP_literature#algos|numerical integration]] of the multivariate normal distribution around an offset circle is required. The calculation of the correlated normal estimator is difficult and requires numerical approaches only available in [[Measuring_Tools#shotGroups_Analysis_Package|specialized software]].
+
* The general correlated normal estimator ([[CEP_literature#DiDonato1961a|DiDonato & Jarnagin, 1961a]]; [[CEP_literature#Evans1985|Evans, 1985]]) is based on the assumption of [[Measuring_Precision#General_Bivariate_Normal|bivariate normality]] of the shot coordinates. It allows the '''''x'''''- and '''''y'''''-coordinates to be correlated and have different variances. This estimate can be based on the closed-form solution for the distribution of radial error in the bivariate normal distribution re-written in polar coordinates (radius and angle; [[CEP_literature#Hoyt1947|Hoyt, 1947]]; [[CEP_literature#Paris2009|Paris, 2009]]).
 +
** If systematic accuracy bias is taken into account, [[CEP_literature#algos|numerical integration]] of the multivariate normal distribution around an offset circle is required. The calculation of the correlated normal estimator is difficult and requires numerical approaches only available in [[Measuring_Tools#shotGroups_Analysis_Package|specialized software]].
  
* The Grubbs-Pearson estimator ([[CEP_literature#Grubbs1964|Grubbs, 1964]]) shares its assumptions with the general correlated normal estimator. It is based on the Pearson three-moment central <math>\chi^{2}</math>-approximation ([[CEP_literature#Imhof1961|Imhof, 1961]]; [[CEP_literature#Pearson1959|Pearson, 1959]]) of the cumulative distribution function of radial error in bivariate normal variables. This approach has the advantage that its calculation is much easier than the exact distribution and does not require special software. For <math>p \geq 0.25</math>, the approximation to the true cumulative distribution function is very close but can diverge from it for <math>p < 0.25</math> and some distribution shapes.
+
* The Grubbs-Pearson estimator ([[CEP_literature#Grubbs1964|Grubbs, 1964]]) shares its assumptions with the general correlated normal estimator. It is based on the Pearson three-moment central <math>\chi^{2}</math>-approximation ([[CEP_literature#Imhof1961|Imhof, 1961]]; [[CEP_literature#Pearson1959|Pearson, 1959]]) of the cumulative distribution function of radial error in bivariate normal variables. This approach has the advantage that its calculation is much easier than the exact distribution and does not require special software. For <math>p \geq 0.25</math>, the approximation to the true cumulative distribution function is very close but can diverge from it for <math>p < 0.25</math> and for highly elliptic distributions.
  
* The Grubbs-Patnaik estimator ([[CEP_literature#Grubbs1964|Grubbs, 1964]]) differs from the Grubbs-Pearson estimator insofar as it is based on the Patnaik two-moment central <math>\chi^{2}</math>-approximation ([[CEP_literature#Patnaik1949|Patnaik, 1949]]) of the true cumulative distribution function of radial error. For <math>p < 0.5</math> and some distribution shapes, the approximation can diverge from the true cumulative distribution function.
+
* The Grubbs-Patnaik estimator ([[CEP_literature#Grubbs1964|Grubbs, 1964]]) differs from the Grubbs-Pearson estimator insofar as it is based on the Patnaik two-moment central <math>\chi^{2}</math>-approximation ([[CEP_literature#Patnaik1949|Patnaik, 1949]]) of the true cumulative distribution function of radial error. For <math>p < 0.5</math> and for highly elliptic distributions the approximation can diverge significantly from the true cumulative distribution function.
  
 
* The Grubbs-Liu estimate was not proposed by Grubbs but can be constructed following the same principle as his original estimators. It differs from them insofar as it is based on the recent [[CEP_literature#Liu2009|Liu, Tang, and Zhang (2009)]] four-moment non-central <math>\chi^{2}</math>-approximation of the true cumulative distribution function of radial error.
 
* The Grubbs-Liu estimate was not proposed by Grubbs but can be constructed following the same principle as his original estimators. It differs from them insofar as it is based on the recent [[CEP_literature#Liu2009|Liu, Tang, and Zhang (2009)]] four-moment non-central <math>\chi^{2}</math>-approximation of the true cumulative distribution function of radial error.
Line 19: Line 28:
 
* The [[CEP_literature#Ethridge1983|Ethridge (1983)]] estimator is not based on the assumption of bivariate normality of <math>(x,y)</math>-coordinates but uses a robust unbiased estimator for the median radius ([[CEP_literature#Hogg1967|Hogg, 1967]]). This estimator "assumes that the square root of the radial miss distances follows the logarithmic generalized exponential power distribution." ([[CEP_literature#Williams1997|Williams, 1997]]). It is only available for <math>p = 0.5</math>.
 
* The [[CEP_literature#Ethridge1983|Ethridge (1983)]] estimator is not based on the assumption of bivariate normality of <math>(x,y)</math>-coordinates but uses a robust unbiased estimator for the median radius ([[CEP_literature#Hogg1967|Hogg, 1967]]). This estimator "assumes that the square root of the radial miss distances follows the logarithmic generalized exponential power distribution." ([[CEP_literature#Williams1997|Williams, 1997]]). It is only available for <math>p = 0.5</math>.
  
* The modified [[CEP_literature#RAND1952|RAND R-234]] estimator is an early example of CEP and is based on lookup tables that have later been fitted with a regression model to accomodate systematic accuracy bias ([[CEP_literature#Pesapane1977|Pesapane & Irvine, 1977]]). It assumes a mostly circular distribution of <math>(x,y)</math>-coordinates. In its original form it was only available for <math>p = 0.5</math>, but [[CEP_literature#McMillan2008|McMillan & McMillan (2008)]] proposed an extension to levels <math>p = 0.9</math> and <math>p = 0.95</math> based on numerical simulations.
+
* The modified [[CEP_literature#RAND1952|RAND R-234]] estimator is an early example of CEP and is based on lookup tables that have later been fitted with a regression model to accommodate systematic accuracy bias ([[CEP_literature#Pesapane1977|Pesapane & Irvine, 1977]]). It assumes a mostly circular distribution of <math>(x,y)</math>-coordinates. In its original form it was only available for <math>p = 0.5</math>, but [[CEP_literature#McMillan2008|McMillan & McMillan (2008)]] proposed an extension to levels <math>p = 0.9</math> and <math>p = 0.95</math> based on numerical simulations.
  
 
= Comparing CEP estimators =
 
= Comparing CEP estimators =
  
When comparing the relative merits of different CEP estimators, it is important to distinguish two situations:
+
If the true variances of the '''''x'''''- and '''''y'''''-coordinates as well as their covariance are known, then the closed-form general correlated normal estimator is ideal.
* The true variances of ''x''- and ''y''-coordinates as well as their true correlation are known. In practice, this is never the case.
 
* The variances and the correlation of <math>(x,y)</math>-coordinates have to be estimated from a sample with limited number of observations.
 
 
 
== Known variances and correlation ==
 
 
 
With known variances and correlation, the CEP estimator based on the exact bivariate normal distribution is superior to the Grubbs estimators (as they are only approximations), to the Rayleigh estimator (as this is a special case of the general bivariate normal estimator with stricter assumptions), and to the RAND estimator. The correlated normal estimator generalizes to three-dimensional data, can accomodate systematic accuracy bias, and is available for arbitrary values of <math>p</math>. When its assumptions are met, the only downside is the difficulty of computation without [[Measuring_Tools#shotGroups_Analysis_Package|specialized software]].
 
 
 
The Grubbs-Pearson estimator has the theoretical advantage over the Grubbs-Patnaik estimator that the approximating distribution matches the true distribution not only in mean and variance but also in skewness. However, the Grubbs-Pearson estimator does not seem to be used in [[CEP_literature#compStudies|comparison studies]], only the Grubbs-Patnaik estimator. All Grubbs estimators generalize to three-dimensional data, can accomodate systematic accuracy bias, and are available for arbitrary values of <math>p</math>. They are easy to calculate with standard software as long as the central <math>\chi^{2}</math>-distribution is available (e.g., in [[Measuring_Tools#Spreadsheet_Analysis|spreadsheets]]).
 
  
If systematic accuracy is taken into account, the Grubbs-Liu estimator has the theoretical advantage over the Grubbs-Pearson estimator that the approximating distribution matches the true distribution not only in mean, variance, and skewness but also in kurtosis. If systematic accuracy bias is ignored, the Grubbs-Liu estimator is equivalent to the Grubbs-Pearson estimator. Its calculation is less complicated than the exact correlated normal estimator but requires the non-central <math>\chi^{2}</math>-distribution. This distribution might not be available in standard tools like spreadsheets, but is implemented in all statistics packages.
+
When we are confident in asserting a bivariate normal model for shot dispersion the Grubbs estimators are excellent approximations for reasonable values of '''''p''''' and ellipticity.
 +
* To date most [[CEP_literature#compStudies|comparison studies]] have only used the Grubbs-Patnaik estimator.
 +
* The Grubbs-Pearson estimator has the theoretical advantage over the Grubbs-Patnaik estimator that the approximating distribution matches the true distribution not only in mean and variance but also in skewness.  Both the Grubbs-Pearson and Grubbs-Patnaik estimators are easy to calculate with standard software as long as the central <math>\chi^{2}</math>-distribution is available (as it is, for example, in [[Measuring_Tools#Spreadsheet_Analysis|spreadsheets]]).
 +
* If systematic accuracy bias is taken into account, the Grubbs-Liu estimator has the theoretical advantage over the Grubbs-Pearson estimator that the approximating distribution matches the true distribution not only in mean, variance, and skewness but also in kurtosis. If systematic accuracy bias is ignored, the Grubbs-Liu estimator is equivalent to the Grubbs-Pearson estimator. Its calculation is less complicated than the exact correlated normal estimator but requires the ''non-central'' <math>\chi^{2}</math>-distribution. This distribution might not be available in general tools like spreadsheets, but it is implemented in all statistics packages.
  
By assuming non-correlated bivariate normal data with equal variances of the <math>(x,y)</math>-coordinates, the Rayleigh estimator has more restrictive assumptions than the general bivariate normal and Grubbs estimators. It generalizes to three-dimensional data, can accomodate systematic accuracy bias, and is available for arbitrary values of <math>p</math>. It is easy to calculate in closed form with standard tools like [[Measuring_Tools#Spreadsheet_Analysis|spreadsheets]].
+
One shortcoming of the Grubbs estimators is that it is not possible to incorporate the confidence intervals of the variance estimates into the CEP estimate. This is particularly relevant to small samples where the variance estimates themselves are subject to considerable error.
  
The modified RAND-234 estimator can accomodate moderate systematic accuracy bias but does not generalize to three-dimensional data, and is limited to the 50% CEP.
+
In the special case where we assume uncorrelated bivariate normal data with equal variances the [[Closed_Form_Precision|Rayleigh estimator]] does provide true confidence intervals, and it is easy to calculate using [[Measuring_Tools#Spreadsheet_Analysis|spreadsheets]].
  
The Ethridge estimator stands out because it does not require bivariate normality of the <math>(x,y)</math>-coordinates. It generalizes to three-dimensional data, can accomodate systematic accuracy bias, but is limited to the 50% CEP.
+
The Ethridge estimator stands out because it does not require bivariate normality of the <math>(x,y)</math>-coordinates. It generalizes to three-dimensional data and can accommodate systematic accuracy bias, but it is limited to the 50% CEP.
  
== Unknown variances and correlation ==
+
== Small Samples ==
  
When variances and correlation of <math>(x,y)</math>-coordinates have to be estimated from limited samples, the situation is different. While it seems common to plug in the sample estimates into the formula for the distribution of radial error when all parameters are known, the resulting distribution is not strictly valid anymore. The reason is that even if the estimates are unbiased, the uncertainty from estimating the variances and correlation are not reflected in the distribution formulae (compare with the case of the [http://en.wikipedia.org/wiki/T-test ''t''-test] vs. the [http://en.wikipedia.org/wiki/Z-test ''z''-test]). With growing sample-size, the so-called sampling distribution of radial error will be asymptotically the same as with the case of know variances/correlation.
+
For small samples, it matters more which estimator is least biased and most efficient. This question has been studied, e.g., by [[CEP_literature#Williams1997|Williams (1997)]].
  
For small samples, the question therefore is which estimator has the best characteristics: On average, it should give a good approximation to the true <math>CEP(p)</math> (no bias), and it should have little estimation variance even with moderate sample sizes(good efficiency). This question has been studied, e.g., by [[CEP_literature#Williams1997|Williams (1997)]]. A related question is which estimator is most robust to a very small number of outliers (fliers) that may result from clear operator error. See the literature overview for more [[CEP_literature#compStudies|comparison studies]].
+
A related question is which estimator is most robust to a very small number of outliers (fliers) that may result from clear operator error. See the literature overview for more [[CEP_literature#compStudies|comparison studies]].

Revision as of 12:35, 4 March 2014

Previous: Measuring Precision

Circular Error Probable (CEP)

The Circular Error Probable \(CEP(p)\) for \(p \in [0,1)\) is the estimated radius of the smallest circle that is expected to cover proportion \(p\) of the shot group.

Systematic Accuracy Bias

Some approaches to estimating CEP conflate the question of precision with the question of accuracy, or "sighting in."

The simpler case only tries to estimate precision, and computes CEP about the sample center.

The general case allows the point of aim to be offset from the true center of the points of impact. In the literature this is referred to as systematic accuracy bias. Including systematic accuracy bias centers the circle on the point of aim; because the sample center will probably be offset from that point, CEP will be correspondingly larger.
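As a rough illustration of the two centering choices, the following Python sketch computes a purely empirical radius about either the sample center or a given point of aim. The function name `empirical_cep` and the sample coordinates are hypothetical, and the plain empirical quantile used here is not one of the estimators discussed on this page; it only shows how the choice of center changes the radius.

```python
import math

def empirical_cep(points, p=0.5, center=None):
    """Smallest observed radius covering at least proportion p of the shots.

    If center is None, radii are measured from the sample center (precision
    only). Passing the point of aim as center includes any systematic offset.
    """
    if center is None:
        cx = sum(x for x, _ in points) / len(points)
        cy = sum(y for _, y in points) / len(points)
    else:
        cx, cy = center
    radii = sorted(math.hypot(x - cx, y - cy) for x, y in points)
    k = max(math.ceil(p * len(radii)), 1)  # number of shots the circle must cover
    return radii[k - 1]

# Hypothetical group whose center (2, 1) is offset from the point of aim (0, 0).
shots = [(1.0, 0.0), (3.0, 0.0), (1.0, 2.0), (3.0, 2.0)]
about_center = empirical_cep(shots, 0.5)
about_aim = empirical_cep(shots, 0.5, center=(0.0, 0.0))
```

For this hypothetical group the radius about the point of aim is larger than the radius about the sample center, which is the effect of including the offset.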

Estimators

Several different methods for estimating \(CEP(p)\) have been proposed which are based on different assumptions about the underlying distribution of coordinates (see the CEP literature overview for references and the shotGroups package for a free open source implementation):

  • The general correlated normal estimator (DiDonato & Jarnagin, 1961a; Evans, 1985) is based on the assumption of bivariate normality of the shot coordinates. It allows the x- and y-coordinates to be correlated and have different variances. This estimate can be based on the closed-form solution for the distribution of radial error in the bivariate normal distribution re-written in polar coordinates (radius and angle; Hoyt, 1947; Paris, 2009).
    • If systematic accuracy bias is taken into account, numerical integration of the multivariate normal distribution around an offset circle is required. The calculation of the correlated normal estimator is difficult and requires numerical approaches only available in specialized software.
  • The Grubbs-Pearson estimator (Grubbs, 1964) shares its assumptions with the general correlated normal estimator. It is based on the Pearson three-moment central \(\chi^{2}\)-approximation (Imhof, 1961; Pearson, 1959) of the cumulative distribution function of radial error in bivariate normal variables. This approach has the advantage that its calculation is much easier than the exact distribution and does not require special software. For \(p \geq 0.25\), the approximation to the true cumulative distribution function is very close but can diverge from it for \(p < 0.25\) and for highly elliptic distributions.
  • The Grubbs-Patnaik estimator (Grubbs, 1964) differs from the Grubbs-Pearson estimator insofar as it is based on the Patnaik two-moment central \(\chi^{2}\)-approximation (Patnaik, 1949) of the true cumulative distribution function of radial error. For \(p < 0.5\) and for highly elliptic distributions the approximation can diverge significantly from the true cumulative distribution function.
  • The Grubbs-Liu estimate was not proposed by Grubbs but can be constructed following the same principle as his original estimators. It differs from them insofar as it is based on the recent Liu, Tang, and Zhang (2009) four-moment non-central \(\chi^{2}\)-approximation of the true cumulative distribution function of radial error.
  • The Ethridge (1983) estimator is not based on the assumption of bivariate normality of \((x,y)\)-coordinates but uses a robust unbiased estimator for the median radius (Hogg, 1967). This estimator "assumes that the square root of the radial miss distances follows the logarithmic generalized exponential power distribution." (Williams, 1997). It is only available for \(p = 0.5\).
  • The modified RAND R-234 estimator is an early example of CEP and is based on lookup tables that have later been fitted with a regression model to accommodate systematic accuracy bias (Pesapane & Irvine, 1977). It assumes a mostly circular distribution of \((x,y)\)-coordinates. In its original form it was only available for \(p = 0.5\), but McMillan & McMillan (2008) proposed an extension to levels \(p = 0.9\) and \(p = 0.95\) based on numerical simulations.
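The moment matching behind the Grubbs-Patnaik and Grubbs-Pearson estimators can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the shotGroups implementation: systematic accuracy bias is ignored, the covariance entries are treated as known, and all function names are hypothetical. The chi-square quantile is computed with a simple series and bisection so that the block needs only the standard library.

```python
import math

def chi2_cdf(x, df):
    """Chi-square CDF via the power series of the regularized incomplete gamma."""
    if x <= 0:
        return 0.0
    a, z = df / 2.0, x / 2.0
    term = total = 1.0 / a
    k = 0
    while abs(term) > 1e-15 * abs(total):
        k += 1
        term *= z / (a + k)
        total += term
    return total * math.exp(a * math.log(z) - z - math.lgamma(a))

def chi2_ppf(p, df):
    """Chi-square quantile by bisection on chi2_cdf (requires 0 <= p < 1)."""
    if not 0.0 <= p < 1.0:
        raise ValueError("p must lie in [0, 1)")
    lo, hi = 0.0, 1.0
    while chi2_cdf(hi, df) < p:
        hi *= 2.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if chi2_cdf(mid, df) < p:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def squared_radius_moments(sxx, syy, sxy=0.0):
    """Mean, variance and third central moment of r^2 = x^2 + y^2 (centered case)."""
    tr, det = sxx + syy, sxx * syy - sxy ** 2
    disc = math.sqrt(max(tr * tr - 4.0 * det, 0.0))
    l1, l2 = (tr + disc) / 2.0, (tr - disc) / 2.0  # eigenvalues of the covariance
    return l1 + l2, 2.0 * (l1 ** 2 + l2 ** 2), 8.0 * (l1 ** 3 + l2 ** 3)

def cep_grubbs_patnaik(p, sxx, syy, sxy=0.0):
    """Two-moment fit: r^2 is approximated by a scaled central chi-square."""
    m, v, _ = squared_radius_moments(sxx, syy, sxy)
    df = 2.0 * m * m / v
    return math.sqrt(v / (2.0 * m) * chi2_ppf(p, df))

def cep_grubbs_pearson(p, sxx, syy, sxy=0.0):
    """Three-moment fit: a shifted, scaled chi-square that also matches skewness."""
    m, v, m3 = squared_radius_moments(sxx, syy, sxy)
    df = 8.0 * v ** 3 / m3 ** 2
    r2 = m + math.sqrt(v) * (chi2_ppf(p, df) - df) / math.sqrt(2.0 * df)
    # The shifted approximation can go negative for small p; clamp (a sketch choice).
    return math.sqrt(max(r2, 0.0))
```

For a circular distribution (equal variances, no correlation) both approximations reduce to the Rayleigh result; they differ once the variances are unequal.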

Comparing CEP estimators

If the true variances of the x- and y-coordinates as well as their covariance are known, then the closed-form general correlated normal estimator is ideal.

When we are confident in asserting a bivariate normal model for shot dispersion, the Grubbs estimators are excellent approximations for reasonable values of p and ellipticity.

  • To date most comparison studies have only used the Grubbs-Patnaik estimator.
  • The Grubbs-Pearson estimator has the theoretical advantage over the Grubbs-Patnaik estimator that the approximating distribution matches the true distribution not only in mean and variance but also in skewness. Both the Grubbs-Pearson and Grubbs-Patnaik estimators are easy to calculate with standard software as long as the central \(\chi^{2}\)-distribution is available (as it is, for example, in spreadsheets).
  • If systematic accuracy bias is taken into account, the Grubbs-Liu estimator has the theoretical advantage over the Grubbs-Pearson estimator that the approximating distribution matches the true distribution not only in mean, variance, and skewness but also in kurtosis. If systematic accuracy bias is ignored, the Grubbs-Liu estimator is equivalent to the Grubbs-Pearson estimator. Its calculation is less complicated than the exact correlated normal estimator but requires the non-central \(\chi^{2}\)-distribution. This distribution might not be available in general tools like spreadsheets, but it is implemented in all statistics packages.

One shortcoming of the Grubbs estimators is that it is not possible to incorporate the confidence intervals of the variance estimates into the CEP estimate. This is particularly relevant to small samples where the variance estimates themselves are subject to considerable error.

In the special case where we assume uncorrelated bivariate normal data with equal variances, the Rayleigh estimator does provide true confidence intervals, and it is easy to calculate using spreadsheets.
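A minimal Python sketch of a Rayleigh-based CEP with a confidence interval is given below. It assumes the equal-variance, uncorrelated model, ignores systematic accuracy bias, and estimates the center from the sample, so that the sum of squared radii divided by the true variance follows a chi-square distribution with 2(n-1) degrees of freedom. The function names are hypothetical, and the plug-in point estimate shown is one simple choice (it slightly underestimates the true value on average in small samples).

```python
import math

def chi2_cdf_even(x, k):
    """CDF of a chi-square variable with 2k degrees of freedom (closed form)."""
    z = x / 2.0
    term = total = 1.0
    for j in range(1, k):
        term *= z / j
        total += term
    return 1.0 - math.exp(-z) * total

def chi2_ppf_even(q, k):
    """Quantile of chi-square with 2k degrees of freedom, by bisection."""
    lo, hi = 0.0, 1.0
    while chi2_cdf_even(hi, k) < q:
        hi *= 2.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if chi2_cdf_even(mid, k) < q:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def rayleigh_cep(points, p=0.5, conf=0.95):
    """Return (estimate, lower, upper) for CEP(p) under the Rayleigh model."""
    n = len(points)
    mx = sum(x for x, _ in points) / n
    my = sum(y for _, y in points) / n
    ss = sum((x - mx) ** 2 + (y - my) ** 2 for x, y in points)
    k = n - 1                                # ss / sigma^2 has 2k degrees of freedom
    scale = math.sqrt(-2.0 * math.log(1.0 - p))  # Rayleigh quantile factor
    alpha = 1.0 - conf
    sigma = math.sqrt(ss / (2.0 * k))
    sigma_lo = math.sqrt(ss / chi2_ppf_even(1.0 - alpha / 2.0, k))
    sigma_hi = math.sqrt(ss / chi2_ppf_even(alpha / 2.0, k))
    return scale * sigma, scale * sigma_lo, scale * sigma_hi

# Hypothetical four-shot group.
est, lower, upper = rayleigh_cep([(1.0, 1.0), (-1.0, 1.0), (1.0, -1.0), (-1.0, -1.0)])
```

With only four shots the interval is wide, which illustrates why the variance uncertainty matters for small samples.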

The Ethridge estimator stands out because it does not require bivariate normality of the \((x,y)\)-coordinates. It generalizes to three-dimensional data and can accommodate systematic accuracy bias, but it is limited to the 50% CEP.

Small Samples

For small samples, it matters more which estimator is least biased and most efficient. This question has been studied, e.g., by Williams (1997).
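Bias and efficiency can be checked by simulation. The Python sketch below is only an illustration under stated assumptions and does not reproduce the design of any cited study: it draws many hypothetical five-shot groups from a circular bivariate normal distribution, applies a simple Rayleigh plug-in estimate of the 50% CEP, and reports the relative bias and relative spread of the estimates. All names, the group size, and the seed are arbitrary choices.

```python
import math
import random
import statistics

def rayleigh_cep50(points):
    """Plug-in Rayleigh estimate of the 50% CEP about the sample center."""
    n = len(points)
    mx = sum(x for x, _ in points) / n
    my = sum(y for _, y in points) / n
    ss = sum((x - mx) ** 2 + (y - my) ** 2 for x, y in points)
    sigma = math.sqrt(ss / (2.0 * (n - 1)))
    return sigma * math.sqrt(2.0 * math.log(2.0))

def simulate(n_shots=5, n_groups=2000, sigma=1.0, seed=1):
    """Return (relative bias, relative standard deviation) of the estimates."""
    rng = random.Random(seed)
    true_cep = sigma * math.sqrt(2.0 * math.log(2.0))
    estimates = []
    for _ in range(n_groups):
        group = [(rng.gauss(0.0, sigma), rng.gauss(0.0, sigma))
                 for _ in range(n_shots)]
        estimates.append(rayleigh_cep50(group))
    bias = statistics.mean(estimates) - true_cep  # average error (bias)
    spread = statistics.stdev(estimates)          # estimation variability
    return bias / true_cep, spread / true_cep

rel_bias, rel_sd = simulate()
```

The same loop can be repeated with a different estimator substituted for `rayleigh_cep50` to compare estimators at a given sample size.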

A related question is which estimator is most robust to a very small number of outliers (fliers) that may result from clear operator error. See the literature overview for more comparison studies.