
> measured at arbitrary precision, your probability of being average is 0

You're talking about a site that currently measures with a precision of 1 bit, in which case your probability of being average is basically 100% compared to either extreme.




> with an average error

Eww, that's not how it works. The goal isn't to make the average land near the limit, since errors of -100 and 100 average out to an average error of 0! The goal is to make sure the measurements have a distribution that gives some reasonable confidence that they're within the limit.

This is PTT: https://www.spcpress.com/pdf/DJW244.pdf
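As a minimal sketch of that point (made-up numbers, not anything from the linked paper): two wildly wrong measurements can have an average error of zero, so what you actually care about is the spread of the errors relative to the limit.

    import statistics

    # Made-up measurement errors: both far outside a +/-10 limit,
    # yet their mean is exactly zero.
    errors = [-100, 100]
    limit = 10

    print(statistics.mean(errors))   # 0.0    -- "average error" looks perfect
    print(statistics.stdev(errors))  # ~141.4 -- the spread tells the real story

    # What actually matters: what fraction of measurements fall within the limit?
    within = sum(abs(e) <= limit for e in errors) / len(errors)
    print(within)                    # 0.0 -- none of them do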


> That's an absolute difference of 2.7%. Again, 100% random data.

I think I get what you're going for here -- you're trying to simulate a coin flip? -- but what you've actually done is made successive draws from a uniform random number generator. The software is designed to return numbers that fall along the interval [0,1) with equal probability. Thresholding the numbers and dividing their counts is not a meaningful transformation; the result is still just a uniformly distributed random number. It's like...the ratio of heads in two identical, unfair coins or something.

If all "random numbers" were uniform like this, then no, we wouldn't expect an X% difference to be any more or less likely based on the magnitude of the underlying sample. But when we're talking about something like a a population mean, then the behavior of the errors on estimates is very different indeed, and most estimates cluster around the true (aka population) value:

https://online.stat.psu.edu/stat415/lesson/9/9.4

As the sample size for an experiment of this sort gets larger, the bell curve of expected errors gets sharper and sharper, and it becomes increasingly unlikely to see errors >= X, for any value of X. In the limit of large N, the distribution of sample errors around a known mean approaches a normal distribution:

https://www.jmp.com/en_us/statistics-knowledge-portal/t-test...

For what it's worth, the expected proportion of N heads in M coin flips is modeled using the binomial distribution, which is also bell-shaped and illustrates the same idea:

https://en.wikipedia.org/wiki/Binomial_distribution
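A quick simulation of that shrinking bell curve (my own sketch, not from any of the links, using a fair coin as the example): the spread of the sample proportion around the true value falls off like 1 over the square root of the number of flips.

    import numpy as np

    rng = np.random.default_rng(0)
    p_true = 0.5  # a fair coin, purely for illustration

    for n in (100, 10_000, 1_000_000):
        # 10,000 repeated experiments, each flipping the coin n times
        heads = rng.binomial(n, p_true, size=10_000)
        errors = heads / n - p_true
        # The spread of the estimation error shrinks roughly as 1/sqrt(n)
        print(n, errors.std(), 0.5 / np.sqrt(n))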


> 1. Presumably we are most interested in the 6% that are more accurate.

Why? At 94% failure, this 6% sounds like lucky guessing by a computer.


>A couple more clicks after that, and we’re looking at a summarized version of a bill tackling cybersecurity that the software has considered and rendered a judgment on, when it comes to the probability that it will become law. We’re not talking a rough estimate. There’s a decimal: 78.1 percent.

No way in hell is it producing probabilities calibrated to 3 significant digits. If some kind of calibration testing produced evidence that it's even calibrated to within 1 percent, I would be shocked.
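For reference, checking calibration would look something like this sketch (made-up forecasts and outcomes, nothing from the site): bucket the predicted probabilities and compare each bucket's average prediction with the observed frequency of bills that actually became law.

    import numpy as np

    # Made-up forecasts and outcomes -- placeholders, not the site's data.
    predicted = np.array([0.781, 0.62, 0.95, 0.10, 0.33, 0.781, 0.55, 0.88])
    became_law = np.array([1, 1, 1, 0, 0, 0, 1, 1])

    # Five calibration buckets: compare mean prediction to observed pass rate.
    bins = np.linspace(0.0, 1.0, 6)
    idx = np.digitize(predicted, bins) - 1
    for b in range(5):
        mask = idx == b
        if mask.any():
            print(f"bin {bins[b]:.1f}-{bins[b+1]:.1f}: "
                  f"predicted {predicted[mask].mean():.3f}, "
                  f"observed {became_law[mask].mean():.3f}, n={mask.sum()}")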


> Which answer do you give? Whatever your software tells you (e.g., 87.14234%) or a number made of a small and fixed number of significant digits (e.g., 87%).

> The latter is the right answer in almost all instances.

and

> Most human beings are happy with a 1% error margin.

While he doesn't say it as a rule outright, the author repeatedly uses a "small and fixed" two significant digits. He later says this:

> You must choose the number of significant digits deliberately.

which I agree with, but is at odds with the "small and fixed" dictum with which he leads off the post.


> Pretty good chance that's not true

Or a 23.5% chance that's not true...


Didn't azakai just say that?

> That has a standard deviation of 0.5 in the very worst case

followed by

> the standard deviation of a random sample scales like 1 over the square root of the sample size, so 0.5 divided by 10 => 0.05 (5%).
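A quick numerical check of that back-of-envelope math (my own sketch, assuming the worst case of a 0/1 variable split 50/50 and a sample of 100, since 0.5 divided by sqrt(100) gives the quoted 0.05):

    import numpy as np

    rng = np.random.default_rng(42)
    n = 100  # sample size implied by "0.5 divided by 10"

    # Worst case: a 0/1 variable split 50/50, whose standard deviation is 0.5.
    samples = rng.integers(0, 2, size=(100_000, n))
    sample_means = samples.mean(axis=1)

    print(sample_means.std())   # ~0.05, matching the quoted figure
    print(0.5 / np.sqrt(n))     # 0.05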


> Seems like it could be within a margin of error (3 v 5%).

There is no margin of error when your sample is the entire population.


> Stop saying: “We’ve reached 95% statistical significance.”

> And start saying: “There’s a 5% chance that these results are total bullshit.”

Argh, no, no, no and no!

95% significance is NOT 95% probability! When you select a confidence level of 95%, the probability that your results are nonsense is ZERO or ONE. There is no probability statement associated with it. Just because something is unknown does not mean that you can make a probability statement about it, and the mathematics around statistical testing all depend on the assumption that the parameter being tested is not random, merely unknown...

Rather, 95% statistical significance means we got this number from a procedure that produces the right thing 95% of the time, but we have no idea whether this particular number we got is correct or not.

UNLESS!

Unless you're doing Bayesian stats. But in that case your procedure looks completely different and produces very different probability intervals instead of confidence intervals, and you don't talk about statistical significance at all, but about raw probabilities.
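A toy simulation of that frequentist reading (my own sketch, with an arbitrary true mean and normal data): the 95% is a long-run property of the interval-producing procedure, while any single interval simply does or doesn't contain the truth.

    import numpy as np

    rng = np.random.default_rng(1)
    true_mean = 10.0          # fixed but "unknown" parameter
    n, trials = 30, 10_000

    covered = 0
    for _ in range(trials):
        sample = rng.normal(true_mean, 2.0, size=n)
        se = sample.std(ddof=1) / np.sqrt(n)
        lo, hi = sample.mean() - 1.96 * se, sample.mean() + 1.96 * se
        covered += lo <= true_mean <= hi

    # Roughly 95% of intervals cover the true mean; each individual interval
    # either does or doesn't -- there is no 95% probability attached to it.
    print(covered / trials)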


This headline seems factually incorrect given what the post actually claims was said. What's described isn't a 10% error rate, but rather 90% precision. It seems like the actual thing which would be called (type I) error rate isn't discussed at all.

I can only imagine that the reddit post was written with that title in bad faith/to promote fearmongering. It seems like this nuance escaped the majority of the commenters on HN as well.


> The odds of both of those people entering the same piece of data incorrectly is tiny.

This is just not correct. There may be a systematic reason why they are both making the same mistake (e.g. a mispronounced word), in which case increasing the confidence intervals does not increase the accuracy. Check out the concepts of accuracy, precision, etc. from the physical sciences.


The claim I saw in the article is 98% precision, which doesn't actually tell us the predictive value without the base rate, which seems to be all over the place.

>> 6% seems pretty good.

That's 6% on the NIST dataset. Typically, results get much worse on real-world datasets, not least because trying to get good results on the same dataset year after year leads to subtle bias. Don't forget this is a dataset that's been around for 16 years, now.


> The correct number is somewhere around 50%

Do you have some source for this? I see random numbers being thrown around a lot, would be nice to have a citation for yours.


Is it just me, or does this sentence make no mathematical sense at all?

"If you’re running squeaky clean A/B tests at 95% statistical significance and you run 20 tests this year, odds are one of the results you report (and act on) is going to be straight up wrong."


> no, the correct figure is 78%

Not credible. There should be some odd number of tenths: 78.3% is clearly more credible than 78%


> <1% difference is statistical noise

It’s interesting how you described that as exactly 0.1F.

I have a very different way of looking at data. When I see a single significant digit I assume minimal precision. So for something reported as 0.1, assuming a ~95% chance it's 0.1 +/- 0.05 would be optimistic; assuming it could be anywhere from 0.01 to 1 is the pessimistic reading.

I suspect this is a common issue with both scientific and technological reporting for a general audience.


> Representing bot activity as less than 5% but using a sample size of 100, when you are claiming millions of active accounts, can't easily be seen as honest

The margin of error of a sample of a given size does not depend on the size of the universe from which the sample is drawn; a sample size of 100 gives too big an MoE to support a claim of “less than 5%” at typically acceptable levels of confidence, but the size of the universe of accounts is irrelevant to it.
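A back-of-the-envelope check of both halves of that (my own sketch, using the usual normal-approximation formula): the population size never enters, and at n = 100 the interval is far too wide to support a tight "less than 5%" claim.

    import math

    def margin_of_error(p, n, z=1.96):
        # Approximate 95% margin of error for an estimated proportion.
        # Note: only p and n appear -- the population size never does.
        return z * math.sqrt(p * (1 - p) / n)

    print(margin_of_error(0.05, 100))     # ~0.043, i.e. +/- 4.3 points on a 5% estimate
    print(margin_of_error(0.05, 10_000))  # ~0.0043, the kind of sample a tight claim needs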

