Difference between revisions of "User talk:Herb"

From ShotStat
(Case 1, Equal variances and uncorrelated)
 
(48 intermediate revisions by the same user not shown)
 
Herb's talk page
 
  
= Modeling Dispersion =
  
Before considering the mathematical models that will be used for the actual statistical analysis, let's consider the assumptions of the various dispersion models, and hence the underlying functions that describe how shots are dispersed. The [http://en.wikipedia.org/wiki/Normal_distribution Normal distribution] is the broadly assumed probability model for a single random variable, and it is characterized by its mean <math>(\bar{x})</math> and standard deviation <math>(\sigma)</math>. Since we are interested in shot dispersion on a two-dimensional target, we will assume the two-dimensional analog of the Normal distribution, the [http://en.wikipedia.org/wiki/Bivariate_normal_distribution Bivariate Normal Distribution]. This distribution describes, at least approximately, the dispersion of gunshots about their center point (<math>\bar{h}</math>, <math>\bar{v}</math>). The bivariate normal distribution also has separate parameters for the standard deviation in each dimension, <math>\sigma_h</math> and <math>\sigma_v</math>, as well as a correlation parameter ''ρ''. The full bivariate normal distribution is thus:<br >
&nbsp;&nbsp;&nbsp;&nbsp;<math>
 
    f(h,v) =
 
      \frac{1}{2 \pi  \sigma_h \sigma_v \sqrt{1-\rho^2}}
 
      \exp\left(
 
        -\frac{1}{2(1-\rho^2)}\left[
 
          \frac{(h-\bar{h})^2}{\sigma_h^2} +
 
          \frac{(v-\bar{v})^2}{\sigma_v^2} -
 
          \frac{2\rho(h-\bar{h})(v-\bar{v})}{\sigma_h \sigma_v}
 
        \right]
 
      \right)
 
  </math>
 
  
For the overall equation note that the following restrictions apply:<br />
&nbsp;&nbsp;&nbsp;&nbsp;<math>-1 < \rho < 1</math><br />
 
&nbsp;&nbsp;&nbsp;&nbsp;<math> \sigma_h>0 </math> and <math> \sigma_v>0 </math>
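
As a rough illustration, the density above can be evaluated directly. The following is only a minimal sketch: the function name <code>bivariate_normal_pdf</code> and its argument names are made up for this example and are not part of any ShotStat tool. Because the formula divides by <math>1-\rho^2</math>, the sketch rejects <math>\rho = \pm 1</math>, where the density is not defined.

```python
import math

def bivariate_normal_pdf(h, v, h_bar, v_bar, sigma_h, sigma_v, rho):
    """Evaluate the bivariate normal density f(h, v) at one point.

    All names here are illustrative; they mirror the symbols in the formula.
    """
    if sigma_h <= 0 or sigma_v <= 0:
        raise ValueError("sigma_h and sigma_v must be positive")
    if not -1 < rho < 1:
        raise ValueError("rho must lie strictly between -1 and 1")
    # Standardized offsets from the center of impact
    zh = (h - h_bar) / sigma_h
    zv = (v - v_bar) / sigma_v
    one_minus_rho2 = 1.0 - rho * rho
    norm = 1.0 / (2.0 * math.pi * sigma_h * sigma_v * math.sqrt(one_minus_rho2))
    quad = (zh * zh + zv * zv - 2.0 * rho * zh * zv) / one_minus_rho2
    return norm * math.exp(-0.5 * quad)
```

At the center of a round, uncorrelated group with <math>\sigma_h = \sigma_v = 1</math> the function returns <math>1/(2\pi)</math>, which is the peak of the standard bivariate normal density.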
 
  
Since we are primarily interested in the dispersion component, the overall assumption is that the weapon is properly sighted so that the center of impact is the same as the point of aim. In practice this can be achieved with a simple translation of the horizontal and vertical coordinates from absolute values to values relative to the average point of impact. Therefore the terms controlled by [[FAQ#How_many_shots_do_I_need_to_sight_in.3F|sighting in the gun]] (<math>\bar{h}</math> and <math>\bar{v}</math>) drop out in the simplification of the dispersion equation.
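
The translation to the average point of impact can be sketched as follows. This is an illustration only; the helper name <code>center_on_impact</code> and the representation of shots as <code>(h, v)</code> tuples are assumptions made for the example.

```python
from statistics import mean

def center_on_impact(shots):
    """Translate (h, v) shot coordinates so the average point of impact is the origin.

    `shots` is a list of (h, v) tuples in any consistent unit (hypothetical layout).
    Returns the center of impact and the translated shots.
    """
    h_bar = mean(h for h, _ in shots)
    v_bar = mean(v for _, v in shots)
    centered = [(h - h_bar, v - v_bar) for h, v in shots]
    return (h_bar, v_bar), centered
```

After this step the centered coordinates average to zero on each axis, so the <math>\bar{h}</math> and <math>\bar{v}</math> terms vanish from the density.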
 
  
== Simplifications into cases ==
  
Looking at the overall equation, two simplifying choices can be made, each between a pair of mutually exclusive alternatives:
  
* '''Either''' <math>\sigma_h = \sigma_v</math> (equal variances) '''or''' <math>\sigma_h \neq \sigma_v</math> (unequal variances).
: Obviously if we could measure both <math>\sigma_h</math> and <math>\sigma_v</math> with very high precision (e.g., 6 significant figures), then the two quantities would never really be equal. But in many cases the assumption is reasonable. In reality, since shooters typically collect only a small amount of data, statistical tests will fail to detect a difference unless the difference is great. In such cases the shot pattern would be noticeably elliptical, not round.
 
 
 
* '''Either''' <math>\rho = 0</math> (uncorrelated) '''or''' <math>\rho \neq 0</math> (correlated).
 
 
 
The pair of mutually exclusive assumptions thus results in four cases for analytical evaluation.
 
 
 
== Error Propagation ==
 
 
 
In general shooting has a number of different error sources. The "basic" assumption is that the error sources are normally distributed and independent. Thus for example the vertical error component would be something like:<br>
 
&nbsp;&nbsp;&nbsp;&nbsp;<math>\sigma_v^2 = \sigma_{v,Weapon}^2 + \sigma_{v,Ammunition}^2 + \sigma_{v,shooter}^2</math><br />
 
and there would be similar equations for <math>\sigma_h^2</math> and <math>\sigma_{RSD}^2</math>.
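
A small numeric sketch of this addition of variances, assuming independent error sources as stated above. The function name <code>combine_sigmas</code> and the example values are invented for illustration.

```python
import math

def combine_sigmas(*sigmas):
    """Combine independent error sources: variances add, so sigmas add in quadrature."""
    return math.sqrt(sum(s * s for s in sigmas))

# Made-up example: weapon 3.0, ammunition 0.5, shooter 0.5 (same units)
total = combine_sigmas(3.0, 0.5, 0.5)
```

With these made-up values the total is about 3.08, barely larger than the 3.0 contributed by the largest source alone.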
 
 
 
Notes:
 
# The error sources can't be measured independently. Only the total error is observable. This is a very important consideration since the total error thus limits the precision which any of the individual error factors can be determined.
 
# If any one of the error sources is more than about three times the sum of the other two, then that error source essentially controls the overall error of the measurement.
 
# These error sources could be decomposed further into "finer" errors. For instance consider the multitude of variables that a handloader controls when loading a cartridge.
 
# The fact that a very small amount of data is typically collected by a shooter greatly limits the overall precision of the mathematical analysis. In statistics this is known as the "small sample" problem. For example to get "good" estimates for the normal distribution parameters <math>\bar{x}</math> and <math>\sigma</math>, 30 measurements are desired.
 
# In general, for a given sample size of ''n'' measurements the mean, <math>\bar{x}</math>, is determined relatively (i.e., as a percentage) much more precisely than the standard deviation, <math>\sigma</math> (or other dispersion factor).
 
 
 
== Stringing ==
 
 
 
To the extent that either <math>\sigma_h \neq \sigma_v</math> or <math>\rho \neq 0</math> then elliptical shot groups will result instead of circular shot groups. If the shot groups are not round then we have three options. We can:
 
# Isolate the error source experimentally and remove it (for instance weigh gunpowder carefully to remove vertical stringing).
 
# Use a mathematical model for analysis that allows for stringing.
 
# Scale the raw data to remove the stringing.
 
 
 
Obviously the experimental reason for stringing may not be obvious or easy to remove. Experimental designs to isolate and quantify the source of the stringing are possible, but are beyond this basic discussion at this point.
 
 
 
There isn't any theoretical ballistic requirement that requires correlation between the horizontal and vertical dispersion of gunshots.  Therefore, most statistical measures implicitly assume <math>\rho = 0</math>. In general if <math>\rho \neq 0</math>, then there would be an elliptical shaped group with the major axis oriented at some angle to the horizontal or vertical axis.
 
 
 
We do know that targets can often exhibit vertical or horizontal stringing as evidenced by an elliptical shaped group along the vertical or horizontal axis respectively. Obviously in such cases <math>\sigma_h \neq \sigma_v</math>.
 
 
 
:(1) The primary source of horizontal stringing is crosswind. 
 
::If we measure the wind while shooting we can bound and remove a “wind correction” term from that axis.  E.g., "Suppose the orthogonal component of wind is ranging at random from 0-10mph during the shooting.  Given lag-time ''t'' this will expand the no-wind horizontal dispersion at the target by <math>\sigma_{wind}</math>."<ref>Wind deflection is a function of the ballistic curve and distance, but can be expressed as a simple product of the cross-wind velocity and lag time.  For more information on the "lag rule" see Bryan Litz, ''Applied Ballistics for Long Range Shooting, 2<sup>nd</sup> Edition'' (2011) A4; or Robert McCoy, ''Modern Exterior Ballistics, 2<sup>nd</sup> Edition'' (2012) 7.27.</ref>  Since variances are additive we could adjust <math>\sigma_h</math> via the equation <math>{\sigma'}_h^2 = \sigma_h^2 - \sigma_{wind}^2</math>.
 
:(2) The primary source of vertical stringing is muzzle velocity.
 
:: We can actually measure muzzle velocity with a chronograph and then correct for that source of variation. E.g., if the standard deviation of muzzle velocity is <math>\sigma_{mv}</math> then, given the bullet's ballistic model for the given target distance, the vertical spread attributable to that is some <math>f(\sigma_{mv})</math>.  Here too we can remove this known source of dispersion from our samples via the equation <math>{\sigma'}_v^2 = \sigma_v^2 - f(\sigma_{mv})^2</math>.  This adjustment is shown in several of the examples:
 
::* [[22LR CCI 40gr HV 40-shot 100-yard Example]]
 
::* [[300BLK Subsonic 20-shot 100-yard Example]]
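
Both corrections above subtract a known variance from the observed one. A minimal sketch of that step follows. The function name <code>remove_known_sigma</code> is hypothetical, and the known spread (wind or muzzle velocity) is assumed to be already expressed as a standard deviation at the target, in the same units as the observed spread.

```python
import math

def remove_known_sigma(sigma_total, sigma_known):
    """Return the adjusted sigma after removing a known, independent error source.

    Implements sigma'^2 = sigma_total^2 - sigma_known^2.
    """
    diff = sigma_total ** 2 - sigma_known ** 2
    if diff < 0:
        # With small samples the estimate of the total can fall below the known term
        raise ValueError("known sigma exceeds the observed total sigma")
    return math.sqrt(diff)
```

For example, an observed spread of 5 with a known contribution of 3 leaves an adjusted spread of 4. The explicit check matters in practice because both inputs are estimates, so the difference of variances can come out negative.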
 
 
 
= Four Special Cases for Dispersion =
 
 
 
Neglecting flyers, and assuming perfect aim, the overall assumptions about shot dispersion result in four cases for statistical analysis.
 
 
Note that in the cases below the assumptions use ''approximately equal to'' <math>(\approx)</math> rather than ''strictly equal to'' (=). This is an acknowledgement that we are dividing the cases into ones that are close enough to be useful, even though they most certainly are not exact. There is absolutely no method by which the true population values for <math>\bar{h}, \bar{v},\sigma_h</math> and <math>\sigma_v</math> can be determined. We can only get experimental estimates from calculations based on sample data for the factors <math>\mu_h, \mu_v, s_h</math> and <math>s_v</math> and these estimates are at best only good to a scant few significant figures. Thus the difference between ''approximately equal to'' and ''strictly equal to'' is really under experimental control. In other words, we can theoretically make the measurements as precise as we want by collecting more data, but practically there are limits.
 
 
 
==Case 1, Equal variances and uncorrelated ==
 
Given: <br />
 
#<math>\sigma_h \approx \sigma_v</math><br />
 
#<math>\rho \approx 0</math><br />
 
# The radial distance for each shot, ''i'', is <math>r_i = \sqrt{(h_i - \mu_h)^2 + (v_i - \mu_v)^2}</math>
 
then the  mathematical formula for the dispersion distribution would be the Rayleigh distribution:<br />
 
&nbsp;&nbsp;&nbsp;&nbsp;<math>f(r) = \frac{r}{\sigma_{RSD}^2} e^{-r^2/(2\sigma_{RSD}^2)}, \quad r \geq 0,</math> and <math>\sigma_{RSD}</math> is the distribution shape factor known as the Radial Standard Deviation.<br />
 
 
 
This is really the best case for shot dispersion. Shot groups would be round. Strictly, for the Rayleigh distribution to apply, then  <math>\sigma_h = \sigma_v</math>, in which case <math>\sigma_{RSD} = \sigma_h = \sigma_v</math>. For the "loose" application of the Rayleigh distribution to apply, then <math>\sigma_{RSD} \approx (\sigma_h + \sigma_v)/2</math>
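
The radial distance and the Rayleigh density above can be sketched in a few lines. The function names are invented for this illustration. The cumulative distribution <math>1 - e^{-r^2/(2\sigma_{RSD}^2)}</math> and the median <math>\sigma_{RSD}\sqrt{2\ln 2}</math> are standard properties of the Rayleigh distribution rather than formulas taken from this page.

```python
import math

def radial_distances(shots, center):
    """Radial distance r_i of each (h, v) shot from the given center."""
    ch, cv = center
    return [math.hypot(h - ch, v - cv) for h, v in shots]

def rayleigh_pdf(r, sigma):
    """Rayleigh density f(r) with shape factor sigma (the radial standard deviation)."""
    if r < 0:
        return 0.0
    return (r / sigma ** 2) * math.exp(-r ** 2 / (2 * sigma ** 2))

def rayleigh_cdf(r, sigma):
    """Probability that a shot lands within radius r of the center."""
    if r < 0:
        return 0.0
    return 1.0 - math.exp(-r ** 2 / (2 * sigma ** 2))

def rayleigh_median(sigma):
    """Radius expected to contain half of the shots under the Rayleigh model."""
    return sigma * math.sqrt(2 * math.log(2))
```

Under these assumptions the median radius is about 1.18 times <math>\sigma_{RSD}</math>, which is the radius of a circle expected to contain 50% of the shots.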
 
 
 
 
 
The following statistical measurements are appropriate:
 
* Circular Error Probable (CEP)
 
* Covering Circle Radius (CCR)
 
* Group Size (GS)
 
* Figure of Merit (FOM)
 
* Mean Radius (MR)
 
* Rayleigh Distribution
 
 
 
Notes:
 
# Note that in this case the FOM and the Group Size are different measurements.
 
# The group size would only depend on the two shots most distant in separation. The FOM would depend on two to four shots. For a large number of shots we'd typically expect four different shots to define the extremes for horizontal and vertical deflection.
 
# For the CCR, GS and FOM measurements a target with a ragged hole would be acceptable, but for the rest of the measures the {''h,v''} positions of each shot must be known.
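
The difference between these measures can be made concrete with a short sketch. The definitions used here are assumptions for illustration: Group Size as the largest center-to-center distance between any two shots, FOM as the average of the horizontal and vertical extents, and Mean Radius as the average distance from the center of impact. The function names are made up.

```python
import math
from itertools import combinations
from statistics import mean

def group_size(shots):
    """Largest distance between any two shots (depends on two shots only)."""
    return max(math.dist(a, b) for a, b in combinations(shots, 2))

def figure_of_merit(shots):
    """Average of the horizontal and vertical extents (depends on two to four shots)."""
    hs = [h for h, _ in shots]
    vs = [v for _, v in shots]
    return ((max(hs) - min(hs)) + (max(vs) - min(vs))) / 2

def mean_radius(shots):
    """Average radial distance from the center of impact (uses every shot)."""
    h_bar = mean(h for h, _ in shots)
    v_bar = mean(v for _, v in shots)
    return mean(math.hypot(h - h_bar, v - v_bar) for h, v in shots)
```

For the two-shot group <code>[(0.0, 0.0), (3.0, 4.0)]</code> these give a Group Size of 5.0, an FOM of 3.5 and a Mean Radius of 2.5, so the three numbers are not interchangeable even for the same target.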
 
 
 
==Case 2, Equal variances and correlated ==
 
Given:<br />
 
#<math>\sigma_h \approx \sigma_v</math><br />
 
#<math>\rho \neq 0</math><br />
 
# The {''h,v''} position of each shot must be known. 
 
 
 
The following statistical measurement is appropriate:
 
* Elliptic Error Probable
 
 
 
==Case 3, Unequal variances and uncorrelated ==
 
Given:<br />
 
# <math>\sigma_h \neq \sigma_v</math><br />
 
# <math>\rho \approx 0</math><br />
 
# The {''h,v''} position of each shot must be known.
 
then the mathematical formula for the dispersion distribution would be:<br />
 
&nbsp;&nbsp;&nbsp;&nbsp;<math>
 
    f(h,v) =
 
      \frac{1}{2 \pi s_h s_v}
 
      \exp\left(
 
        -\frac{1}{2}\left[
 
          \frac{h^2}{s_h^2} +
 
          \frac{v^2}{s_v^2}
 
        \right]
 
      \right)
 
  </math>
 
 
 
The following statistical measurements are appropriate:
 
* Diagonal
 
* Individual Horizontal and Vertical variances
 
 
 
In this case the horizontal and vertical standard deviations could be determined independently from the horizontal and vertical measurements respectively.
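
A minimal sketch of estimating the two standard deviations independently, one axis at a time. The function name is hypothetical, and the sample standard deviation (n − 1 divisor) from Python's <code>statistics</code> module is an assumption about which estimator is wanted.

```python
from statistics import stdev

def axis_standard_deviations(shots):
    """Estimate s_h and s_v separately from the (h, v) shot positions.

    Each axis is treated on its own, which is only justified when rho is about 0.
    """
    s_h = stdev([h for h, _ in shots])
    s_v = stdev([v for _, v in shots])
    return s_h, s_v
```

Note that <code>stdev</code> subtracts the sample mean itself, so the shots do not need to be translated to the center of impact first.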
 
 
 
==Case 4, Unequal variances and correlated ==
 
Given:<br />
 
#<math>\sigma_h \neq \sigma_v</math><br />
 
#<math>\rho \neq 0</math><br />
 
# The {''h,v''} position of each shot must be known.
 
then the mathematical formula for the dispersion distribution would be:<br />
 
&nbsp;&nbsp;&nbsp;&nbsp;<math>
 
    f(h,v) =
 
      \frac{1}{2 \pi s_h s_v \sqrt{1-\rho^2}}
 
      \exp\left(
 
        -\frac{1}{2(1-\rho^2)}\left[
 
          \frac{h^2}{s_h^2} +
 
          \frac{v^2}{s_v^2} -
 
          \frac{2\rho h v}{s_h s_v}
 
        \right]
 
      \right)
 
  </math>
 
 
 
This is really the most complex case for shot dispersion. Shot groups would be elliptical or egg-shaped. The mathematical analysis would require the full version of the bivariate Gaussian distribution.
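
Deciding between the correlated and uncorrelated cases requires an estimate of ''ρ''. One common choice, assumed here rather than prescribed by this page, is the sample (Pearson) correlation coefficient computed from the shot positions. The function name is made up for the example.

```python
import math
from statistics import mean

def sample_correlation(shots):
    """Pearson correlation between the h and v coordinates of the shots."""
    h_bar = mean(h for h, _ in shots)
    v_bar = mean(v for _, v in shots)
    s_hh = sum((h - h_bar) ** 2 for h, _ in shots)
    s_vv = sum((v - v_bar) ** 2 for _, v in shots)
    s_hv = sum((h - h_bar) * (v - v_bar) for h, v in shots)
    # Undefined (division by zero) if all shots share the same h or the same v
    return s_hv / math.sqrt(s_hh * s_vv)
```

With the small samples typical of shooting, this estimate is noisy, so a value near zero does not by itself prove that the true ''ρ'' is zero.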
 

Latest revision as of 01:48, 5 June 2015

Herb's talk page

Stack question?

I am interested in modeling the precision of shooting at a target. The general case is to assume that the fundamental distribution is the bivariate normal distribution, with h and v being the horizontal and vertical axes of the target.

This would allow for elliptical or round shot groups. However, assuming that there is no correlation between the axes, that the variances along the two axes are equal, and that the coordinate system is translated to the center of impact (the average shot position along each axis), the distribution reduces to the Rayleigh distribution.


As I understand it, the correlation between the two axes is measured using "ordinary" linear least squares. That is, one axis is assumed to be the independent variable and the residual error is measured perpendicular to that axis. In other words, all the error is in the dependent axis.

Consider using the total least squares line. Imagine the shots being marked on a transparent sheet where the center of impact was at the origin. As the shot pattern was rotated around the origin the correlation line would stay the same relative to the shot pattern.
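
The rotation argument can be checked numerically. The sketch below assumes the shots are already centered on the origin and uses the standard principal-axis formula for the total least squares direction, <math>\theta = \tfrac{1}{2}\operatorname{atan2}(2S_{hv},\, S_{hh} - S_{vv})</math>. The function names and sample points are invented for illustration.

```python
import math

def scatter_sums(shots):
    """Sums of squares and cross products for shots centered on the origin."""
    s_hh = sum(h * h for h, _ in shots)
    s_vv = sum(v * v for _, v in shots)
    s_hv = sum(h * v for h, v in shots)
    return s_hh, s_vv, s_hv

def ols_slope(shots):
    """Ordinary least squares slope of v on h (all error assigned to v)."""
    s_hh, _, s_hv = scatter_sums(shots)
    return s_hv / s_hh

def tls_angle(shots):
    """Angle of the total least squares line, i.e. the major principal axis."""
    s_hh, s_vv, s_hv = scatter_sums(shots)
    return 0.5 * math.atan2(2 * s_hv, s_hh - s_vv)

def rotate(shots, phi):
    """Rotate every shot about the origin by phi radians."""
    c, s = math.cos(phi), math.sin(phi)
    return [(h * c - v * s, h * s + v * c) for h, v in shots]

# Made-up, already centered sample
shots = [(-2.0, -0.5), (-1.0, -0.5), (0.0, 0.0), (1.0, 0.5), (2.0, 0.5)]
phi = math.pi / 6
turned = rotate(shots, phi)
```

For this sample the total least squares angle of <code>turned</code> equals the original angle plus <code>phi</code>, while the angle of the ordinary least squares line does not shift by exactly <code>phi</code>. When the group is perfectly round (<math>S_{hh} = S_{vv}</math> and <math>S_{hv} = 0</math>) no direction is preferred, so the total least squares line is not uniquely defined.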

What would be the result of using total least squares to fit the regression line, with regard to the bivariate normal distribution reducing to the Rayleigh distribution?