I'll have a dialogue with xnull about this. We can probably fix the nice raters versus mean raters problem completely, but the issue of having only a small number of raters still exists. And we can't just call in all of our friends as raters, because then they would be biased towards us. Ves's score in this contest was significantly higher, by an entire point, which was enough for me, but the other 3 contestants were not that different. I ran a few Z tests and found that Ves's result was significant at 90% confidence, while the others were significant only at much lower confidence levels. A typical psychology experiment looks for 95% confidence, but then again, this isn't science, it's opinion, so I don't know...
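For anyone who wants to poke at the numbers themselves, here is a rough sketch of what a Z test on one contestant's ratings could look like. The ratings, the overall mean, and the overall standard deviation below are made-up placeholders, not the actual contest scores, and `z_test` is just a name picked for illustration. It assumes a two-sided test against a known overall mean and standard deviation, which may not match how the real test was set up.

```python
from math import sqrt
from statistics import NormalDist, mean


def z_test(sample, pop_mean, pop_sd):
    """Two-sided z test of a sample mean against a known mean and sd.

    Returns the z statistic and the two-sided p-value.
    """
    n = len(sample)
    standard_error = pop_sd / sqrt(n)
    z = (mean(sample) - pop_mean) / standard_error
    p = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p


# Hypothetical placeholder data: five raters scoring one contestant.
ratings = [8, 9, 7, 9, 8]
overall_mean = 7.0  # assumed mean rating across all contestants
overall_sd = 1.2    # assumed standard deviation of all ratings

z, p = z_test(ratings, overall_mean, overall_sd)
print(f"z = {z:.2f}, p = {p:.3f}")
print("significant at 90% confidence:", p < 0.10)
print("significant at 95% confidence:", p < 0.05)
```

With only a handful of raters, a t test is usually the safer choice, since the standard deviation has to be estimated from very few ratings.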
Maybe we are all thinking about this too hard.