I'm not good enough at this to be explaining it, but statistical significance is based on probabilities.

If humans have a 10% chance of having cancer at any given moment, and a planet has exactly one human on it, the probability for that planet is still 1/10, not 100% or 0% depending on whether that one person happens to have cancer. Having a single sample from that population of one doesn't increase your confidence about whether his cancer represents a cancer cluster.

Or something like that. If you need a sample of 30 people to achieve a certain degree of confidence, you get the same confidence whether that sample comes from a population of 30 people or a population of 100,000. We aren't trying to establish what percentage of people have cancer at any given moment; we are trying to establish whether this population matches what we know about the percentage who "should" have cancer.
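Here's a rough sketch of that idea in Python, in case numbers help. It assumes each person independently has the 10% base rate from above, so the number of cases follows a binomial distribution; the function name `binomial_tail` and the "7 cases out of 30" figure are made up for illustration. Note that population size never appears anywhere in the calculation, only the sample size and the case count.

```python
from math import comb

def binomial_tail(n, k, p):
    """P(X >= k) when X ~ Binomial(n, p): the chance of seeing k or more
    cases among n people if everyone has the same base rate p."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

# One person who has cancer: under a 10% base rate this is a 1-in-10 event,
# so it tells you almost nothing about a "cluster".
one_person = binomial_tail(1, 1, 0.10)

# Hypothetical numbers: 7 cases in a sample of 30 people.
# The result is the same whether the 30 came from a town of 30 or of 100,000.
thirty_people = binomial_tail(30, 7, 0.10)

print(f"1 person, 1 case:   P = {one_person:.3f}")
print(f"30 people, 7 cases: P = {thirty_people:.3f}")
```

A small tail probability means the sample is hard to explain by the base rate alone; a large one (like the single-person case) means it is entirely consistent with it.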

I'm really bad at explaining this.

The more times you flip a coin, the more confidence you'll have in your understanding of that coin's probabilities. But for a given confidence level, you don't need to flip the coin 100x more times; you already know. There might be a population of a million coin flips, but you just need a sample of a certain size to be confident that your sample is representative.
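To put rough numbers on the coin example, here's a small sketch using the usual normal-approximation margin of error for a proportion. The helper name `margin_of_error` and the choice of z = 1.96 (about 95% confidence) are my assumptions, and it assumes independent flips. The only input is the number of flips in the sample; the size of the "population" of flips isn't a parameter at all.

```python
from math import sqrt

def margin_of_error(n, p=0.5, z=1.96):
    """Approximate half-width of a confidence interval for a proportion,
    estimated from n independent flips (z = 1.96 is roughly 95%)."""
    return z * sqrt(p * (1 - p) / n)

# The margin shrinks with the square root of the sample size:
# 100x more flips only buys a 10x tighter estimate.
for flips in (100, 1_000, 10_000):
    print(f"{flips:>6} flips: +/- {margin_of_error(flips):.4f}")
```

So once the margin is as tight as you need, extra flips add very little.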


