Difference between revisions of "Excluding Worst"

From ShotStat
m (Fixing image links again)
(Clarified contaminated normal, marked worst shots with triangles)
 

Latest revision as of 21:09, 7 June 2016

Extreme Spread Excluding Worst Shot

Extreme spread is, by definition, sensitive to extreme events (outliers). One way to make this estimator more robust is simply to exclude the worst shot. The worst shot is always one of the two impacts that define the extreme spread. If it is not clear which of the two to exclude, measure the extreme spread without each one in turn and take the smaller number.
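
The procedure above can be sketched in a few lines of Python. This is only an illustration, not code from the original analysis: the function names are made up for this example, and shots are assumed to be (x, y) impact coordinates in any consistent unit.

```python
import math
from itertools import combinations


def extreme_spread(shots):
    """Largest center-to-center distance between any two shots."""
    return max(math.dist(a, b) for a, b in combinations(shots, 2))


def extreme_spread_excluding_worst(shots):
    """Extreme spread after dropping whichever of the two defining
    impacts leaves the smaller group."""
    if len(shots) < 3:
        raise ValueError("need at least 3 shots to exclude one")
    # Indices of the two impacts that define the extreme spread.
    i, j = max(
        combinations(range(len(shots)), 2),
        key=lambda pair: math.dist(shots[pair[0]], shots[pair[1]]),
    )
    # Re-measure without each of them in turn and keep the smaller number.
    candidates = [shots[:k] + shots[k + 1:] for k in (i, j)]
    return min(extreme_spread(group) for group in candidates)


group = [(0, 0), (1, 0), (0, 1), (10, 0)]  # made-up group with one flyer
print(extreme_spread(group), extreme_spread_excluding_worst(group))
```

For this made-up group the plain extreme spread is about 10.05, while the value without the flyer at (10, 0) is about 1.41.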

To avoid bias, it is important to exclude the worst shot from every group, not just from the ones with obvious outliers.

Monte Carlo simulations show that this approach does not work very well in a small group (because every shot is important), but in a 10-shot group it makes perfect sense.

Here are some examples. The worst shot is marked with a red triangle. The solid red line is the extreme spread. The dashed blue line is the extreme spread excluding the worst shot.

Ex0-1.png

By convenient coincidence, the extreme spread of a 10-shot group after excluding the worst shot is about the same as the extreme spread of a regular 5-shot group.

Ex1-1.png
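
A rough way to check this coincidence is a small Monte Carlo run like the one below. It is a sketch under assumptions not stated in the article: shots follow a circular bivariate normal distribution with unit standard deviation, and the seed, trial count, and all function names are arbitrary choices made for this example.

```python
import math
import random
from itertools import combinations


def extreme_spread(shots):
    """Largest center-to-center distance between any two shots."""
    return max(math.dist(a, b) for a, b in combinations(shots, 2))


def excluding_worst(shots):
    """Extreme spread after dropping one of the two defining impacts,
    whichever leaves the smaller group."""
    i, j = max(
        combinations(range(len(shots)), 2),
        key=lambda pair: math.dist(shots[pair[0]], shots[pair[1]]),
    )
    return min(extreme_spread(shots[:k] + shots[k + 1:]) for k in (i, j))


def simulate_group(n, rng, sigma=1.0):
    """n shots from a circular normal distribution (assumed model)."""
    return [(rng.gauss(0, sigma), rng.gauss(0, sigma)) for _ in range(n)]


def mean_metric(metric, n, trials, rng):
    """Average of a group-size metric over many simulated n-shot groups."""
    return sum(metric(simulate_group(n, rng)) for _ in range(trials)) / trials


rng = random.Random(1)  # fixed seed only to make the run repeatable
trials = 2000
es5 = mean_metric(extreme_spread, 5, trials, rng)
es10 = mean_metric(extreme_spread, 10, trials, rng)
es10_trimmed = mean_metric(excluding_worst, 10, trials, rng)

print(f"5-shot extreme spread:          {es5:.2f}")
print(f"10-shot extreme spread:         {es10:.2f}")
print(f"10-shot excluding worst shot:   {es10_trimmed:.2f}")
```

The comparison is between means only. The charts above show the full distributions, which this sketch does not reproduce.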

In the presence of outliers this metric performs even better. The example below shows a contaminated normal distribution: to simulate outliers, a random 3% of shots were drawn from a distribution with a standard deviation three times higher than usual.

Ex2-1.png
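
One possible way to draw shots from such a contaminated normal distribution is sketched below. The article does not say how the 3% were selected, so this version assumes that each shot independently has a 3% chance of being an outlier. The function names and default parameters are illustrative only.

```python
import random


def contaminated_shot(rng, sigma=1.0, outlier_fraction=0.03, outlier_scale=3.0):
    """One (x, y) impact. With probability outlier_fraction the shot is
    drawn with a standard deviation outlier_scale times larger (assumption:
    outliers are chosen independently per shot)."""
    s = sigma * outlier_scale if rng.random() < outlier_fraction else sigma
    return (rng.gauss(0, s), rng.gauss(0, s))


def contaminated_group(n, rng):
    """A group of n shots from the contaminated distribution."""
    return [contaminated_shot(rng) for _ in range(n)]


rng = random.Random(7)  # arbitrary seed, for repeatability only
group = contaminated_group(10, rng)
print(group)
```

Groups generated this way can be passed to any group-size metric in place of shots from the plain normal distribution.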

Any ideas on what to call this metric? Moderate spread? Less extreme spread? Trimmed group size?