
>implies that somebody can produce quality results in 50 minutes…

This is as much a random fluke as it is sampling bias. Akin to someone either getting lucky, or coming from a very similar environment. It isn't a measurement that guarantees someone is competent; rather, it guarantees they're coming from just as incompetent an environment as you are.




> With 90 minutes of driving data or monitoring more car components, they could pick out the correct driver fully 100 percent of the time.

With 90 minutes of data, they could pick out one driver out of a sample size of 15. How meaningful is that, really?
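
One way to ground that question: with only 15 candidates, even blind guessing sets a nontrivial baseline, so "fully 100 percent" is less startling than it sounds against such a small hypothesis space. A back-of-envelope sketch (only the group size of 15 comes from the comment):

    # With only 15 candidate drivers, random guessing already names the
    # right driver 1 time in 15 -- the hypothesis space is tiny.
    chance_baseline = 1 / 15
    print(f"{chance_baseline:.1%}")  # ~6.7% accuracy with no data at all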


This means they heard the difference - it is not possible to be significantly less accurate than 50% without hearing a difference.

With small sample sizes, being far off is quite likely. For example, the chance of getting at most 11 of 31 fair 50/50 guesses right (roughly a third of them) is about 1 in 14.
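
That 1-in-14 figure is easy to check against the binomial distribution; a minimal sketch, with the cutoff of 11 correct out of 31 taken from the comment above:

    from math import comb

    # P(at most 11 correct out of 31 fair 50/50 guesses)
    n = 31
    p_tail = sum(comb(n, k) for k in range(12)) / 2**n  # k = 0..11
    print(p_tail)  # ~0.0754, i.e. roughly 1 in 14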


> Seems like it could be within a margin of error (3 v 5%).

There is no margin of error when your sample is the entire population.


What would it mean, statistically, if it's getting it right less than 50% of the time?

I think the 50% part should be ignored and the message is "when dealing with an individual sample, the general probability distribution isn't that useful"? I don't actually agree with that; I'm just trying to steel-man the point a bit. I do kind of see what he means.

I wrote this below, but several things are clear here:

- This isn't a quote and should be taken with a grain of salt. Oversimplification, poor wording, and basic misunderstanding on the part of the author are at fault.

- We don't know what the model's outputs are. If they are simply SUCCEED / FAIL, then yes, 50% correct is not very helpful (unless of course it is right more than 50% of the time on big winners). If the outputs are more granular (likelihood of success, expected ROI, etc.), then being "right" means a lot less and, to the extent that it does mean something, being right 50% of the time is much more helpful.

Imagine being right 50% of the time guessing about getting through airport security. If your guesses are "WILL" or "WON'T", then 50% is terrible. If your guesses are like "through in 23 min 53 sec", then 50% is incredible. If your guesses are like "a 70% chance of being through in 15-20 minutes", what does "right" even mean?
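
A toy simulation makes the first two cases concrete (the uniform wait-time distribution and all the numbers here are invented purely for illustration):

    import random

    random.seed(0)

    # Hypothetical wait times, uniform over 0-60 minutes (in seconds).
    waits = [random.uniform(0, 3600) for _ in range(100_000)]

    # Binary guess ("through in under 30 min"): chance alone gets ~50%,
    # so a 50% hit rate carries no information.
    binary_hits = sum(w < 1800 for w in waits) / len(waits)

    # To-the-second guess ("23 min 53 sec"): chance gets ~1/3600,
    # so a 50% hit rate would be astonishing.
    exact_hits = sum(int(w) == 23 * 60 + 53 for w in waits) / len(waits)

    print(f"binary guess right: {binary_hits:.1%}")        # ~50%
    print(f"exact-second guess right: {exact_hits:.3%}")   # ~0.028%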


>95% accuracy is not good enough, those 5% of cases are what most of their training is for, missing it can mean a lost human life.

Isn't this very context-dependent? E.g., a delay in a lung cancer diagnosis may be a very big deal, but much less so for something like prostate cancer.


> The odds of both of those people entering the same piece of data incorrectly is tiny.

This is just not correct. There may be a systematic reason why they are making the same mistake (e.g., a mispronounced word), in which case increasing the confidence interval does not increase the accuracy. Check out the concepts of accuracy, precision, etc. from the physical sciences.
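
A rough sketch of the point, with invented error rates: double entry crushes independent typos, but correlated, systematic errors largely survive it.

    import random

    random.seed(1)

    TRIALS = 1_000_000
    indep = systematic = 0
    for _ in range(TRIALS):
        # Independent errors: each clerk mistypes 2% of the time, so
        # both being wrong is rare (0.02 * 0.02 = 0.04%).
        if random.random() < 0.02 and random.random() < 0.02:
            indep += 1
        # Systematic errors: 5% of words are mispronounced on the
        # recording, and each clerk mishears such a word 60% of the
        # time, so both-wrong is ~45x more common (0.05 * 0.6 * 0.6).
        if random.random() < 0.05:
            if random.random() < 0.60 and random.random() < 0.60:
                systematic += 1

    print(indep / TRIALS)       # ~0.0004
    print(systematic / TRIALS)  # ~0.018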


While I know what you're getting at, it is possible to have X > 50% of a sample be better (or worse) than the average -- just being pedantic :)

That's not a 5% chance that it's a fluke. It means you'd only see this result 5% of the time if it were a fluke.

It's a big difference.
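
A quick simulation of that distinction; a sketch, assuming a fair-coin null and a roughly 5% two-sided cutoff:

    import random

    random.seed(2)

    # Under the null ("it's a fluke": a fair coin, no real effect), a
    # test at roughly the 5% level still flags ~5% of experiments as
    # significant. That is P(result | fluke) ~= 5%, which says nothing
    # directly about P(fluke | result).
    def experiment(n=100):
        heads = sum(random.random() < 0.5 for _ in range(n))
        return heads <= 40 or heads >= 60  # two-sided cutoff, ~5% level

    runs = 100_000
    false_alarms = sum(experiment() for _ in range(runs)) / runs
    print(f"significant under the null: {false_alarms:.1%}")  # ~5.7%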


> 1. Presumably we are most interested in the 6% that are more accurate.

Why? At 94% failure, this 6% sounds like lucky guessing by a computer.


A test with a 50% chance of being wrong isn't a test

Is it just me, or does this sentence make no mathematical sense at all?

"If you’re running squeaky clean A/B tests at 95% statistical significance and you run 20 tests this year, odds are one of the results you report (and act on) is going to be straight up wrong."


> with a 99% confidence interval

Confidence intervals assume a random sample. This wasn’t a random sample.


>the 60% was from 7.6% right to 12.5% right.

That seems like an absurd metric for judging the improvement in a test. It is misleading at both extremes: going from 1% to 5% is not really a 400% improvement, and going from 99% to 100% is a drastic improvement despite registering as only ~1% by this metric.
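
For reference, the relative-change arithmetic behind those figures (a minimal sketch):

    def relative_improvement(before, after):
        """Relative change in the 'right' rate -- the metric being criticized."""
        return (after - before) / before

    print(f"{relative_improvement(0.076, 0.125):.0%}")  # ~64%, the quoted "60%"
    print(f"{relative_improvement(0.01, 0.05):.0%}")    # 400%, yet still mostly wrong
    print(f"{relative_improvement(0.99, 1.00):.1%}")    # ~1.0%, yet now perfect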


> However the fact that our first n=1 sample happened to be red (and not something else) gives a small (and varying) amount of confidence towards red-heavier mixes rather than the red-scarce ones.

I wouldn't characterize this as a small amount of confidence, as the conditional distribution of the mix-rate after the first sample differs drastically from the prior.

Originally each mix-rate has probability 1/101. After the sample, a mix with n reds in it has probability 2n/(100·101), i.e. n/5050.
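
That posterior is easy to verify exactly; a sketch assuming a bag of 100 balls with the number of reds uniform on 0..100:

    from fractions import Fraction

    # Uniform prior over mix-rates: n reds out of 100, n = 0..100.
    prior = [Fraction(1, 101)] * 101

    # P(first draw is red) = sum over n of (n/100) * prior[n]
    p_red = sum(Fraction(n, 100) * prior[n] for n in range(101))

    # Bayes: P(n | red) = (n/100) * prior[n] / P(red) = 2n / (100*101)
    posterior = [Fraction(n, 100) * prior[n] / p_red for n in range(101)]

    print(p_red)           # 1/2
    print(posterior[100])  # 2/101 -- red-heavy mixes gain the most
    print(posterior[0])    # 0 -- the all-non-red mix is ruled out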


50% means we can't "accurately" identify them at all. The article mentions that it is effectively like a random coin flip, but the title is misleading.

And it certainly does not imply "one chance in 100 that the result is a fluke". Will science journalists never learn how to translate a p-value into English?

> 2/40 poached employees stealing data is almost margin-of-error levels

This doesn't seem to be the sort of thing that can be measured with a margin of error.

