handling outliers

NewAmerica

Banned
Mandarin
Does "handling outliers" mean "dealing with abnormal values"?

**********
One way to do this is to write a program that creates alternative data sets by, for example, adding random noise or a hidden offset, moving participants to different experimental groups or hiding demographic categories. Researchers handle the fake data set as usual — cleaning the data, handling outliers, running analyses — while the computer faithfully applies all of their actions to the real data. They might even write up the results. But at no point do the researchers know whether their results are scientific treasures or detritus. Only at the end do they lift the blind and see their true results — after which, any further fiddling with the analysis would be obvious cheating.

Source: Nature, "How scientists fool themselves – and how they can stop"
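For readers wondering what such a program might look like, here is a minimal sketch of the "hidden offset" idea from the quoted passage. It is an illustration only, not the method used in the article: the class name `BlindedData`, the `max_offset` parameter, and the choice of a single constant offset are all assumptions made for this example.

```python
import random
from statistics import mean


class BlindedData:
    """Hypothetical helper: holds the real values and a hidden offset."""

    def __init__(self, real_values, seed=None, max_offset=10.0):
        rng = random.Random(seed)
        self._real = list(real_values)
        # The offset stays hidden from the researcher until unblinding.
        self._offset = rng.uniform(-max_offset, max_offset)

    def blinded(self):
        """Return the shifted data set the researcher works with."""
        return [v + self._offset for v in self._real]

    def unblind(self, analysis):
        """Apply the finished analysis function to the real data."""
        return analysis(self._real)


data = BlindedData([4.1, 3.9, 4.3, 4.0], seed=1)
blinded_result = mean(data.blinded())   # what the researcher sees while working
true_result = data.unblind(mean)        # revealed only once, at the end
```

The researcher settles every choice (cleaning, outlier handling, the analysis itself) while looking only at `blinded()`, and calls `unblind` once when the analysis is final.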
 
  • owlman5

    Senior Member
    English-US
    That seems like a reasonable interpretation of what "handling outliers" should mean in that text. But I'm not sure how researchers determine abnormal values in a fake data set. If the data set is supposed to represent the values that a study of a population or group of people would reveal, I'm not even sure that the term "abnormal values" has any meaning in discussing the data set.
     

    entangledbank

    Senior Member
    English - South-East England
    Presumably the fake dataset should look statistically real. It could have outliers: values far from the main range. I'm not sure, but I believe it's considered acceptable to just ignore the most extreme values at each end if they don't fit. I don't know how else they might 'handle' them.
     

    kentix

    Senior Member
    English - U.S.
    If things aren't completely random, data points should form some sort of (at least rough) pattern. Outliers are far from that pattern. There are techniques for how to include or exclude those points, or give different weight to them, in a responsible manner, scientifically.
     

    JulianStuart

    Senior Member
    English (UK then US)
    kentix said:
    "If things aren't completely random, data points should form some sort of (at least rough) pattern. Outliers are far from that pattern. There are techniques for how to include or exclude those points, or give different weight to them, in a responsible manner, scientifically."
    👍👍
    The selection and application of those techniques is what is meant by "handling".
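To make "handling" concrete, here is a small sketch of two common options: excluding outliers, or keeping them but limiting their influence. It assumes the interquartile-range rule (Tukey's fences, with the conventional multiplier 1.5) as the definition of an outlier. That is only one of several accepted rules, and the function names are made up for this example.

```python
from statistics import quantiles


def iqr_fences(values, k=1.5):
    """Return the (low, high) bounds beyond which a value counts as an outlier."""
    q1, _, q3 = quantiles(values, n=4)  # quartile cut points
    spread = q3 - q1
    return q1 - k * spread, q3 + k * spread


def exclude_outliers(values, k=1.5):
    """Option 1: drop every value outside the fences."""
    low, high = iqr_fences(values, k)
    return [v for v in values if low <= v <= high]


def clip_outliers(values, k=1.5):
    """Option 2: keep every value, but pull extreme ones back to the fence."""
    low, high = iqr_fences(values, k)
    return [min(max(v, low), high) for v in values]


readings = [10, 11, 12, 13, 14, 15, 100]
kept = exclude_outliers(readings)   # the value 100 is dropped
clipped = clip_outliers(readings)   # the value 100 is replaced by the upper fence
```

Which option is appropriate depends on the study. The point of the blinding described in the article is that the researcher commits to one of them before seeing how it affects the real result.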
     