I'm not familiar enough with this realm to comment on the veracity of the claims, but it could very well be:
"Posting benchmark results is bad because it quickly becomes a race to the wrong solution. Someone misrepresented our performance in a benchmark, here are the actual results."
"Posting benchmark results is bad because it quickly becomes a race to the wrong solution. But somebody showed us sucking on a benchmark, so here's our benchmark results showing we're better."
This title seems like an exaggeration of what is claimed in the article. The article states that they benchmarked their solver in a biased way that made it look like it performed better than it did, not that they faked performance data altogether.
No, that's because some people have no first principles or trust at all, and instead of validating they just make stuff up ;-)
I was trying to point out that disputing things is fine, but the whole basis of a website where benchmarks are uploaded is trust and reputation over time, to the point where enough other people can re-run the same tests on their machines (once they get them) to validate the results. Heck, you might almost call that science!
Right now, there aren't additional results and you can't easily reproduce them because the machines aren't widespread or available. But we can take the track record and reputation of the site and application and use that to judge the integrity of the published benchmarks as "likely correct".
Not really. It's a lose/lose proposition to publish benchmarks, especially for something so environment- and dataset-dependent. Either real-world performance is way below the benchmark, leading to all kinds of storm and strife, or it's way above, and then you lose credibility and people wonder why you bothered anyway.
This is kind of the issue with an interested party/vendor running benchmarks like these. Be it by pure dumb luck or malfeasance, you are much more likely to configure and be knowledgeable about your own product than the others, and to toss out responses and results that are wildly inaccurate or misleading.
It's nice to see someone tell you that their benchmark is flawed and why. Most try to pass off their benchmark as the end-all-be-all measurement of whatever they are testing.
> Benchmarking requires expertise that, it turns out, very few people have. I don't think I even have enough skills to do it correctly and meaningfully.
Very important and often overlooked point.
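To make that point concrete, here is a minimal sketch (TypeScript, runnable in Node or Bun) of one of the most common pitfalls: timing a single cold run versus warming up and taking the median of many samples. The workload() function is a hypothetical stand-in, not anything from the article, and real benchmarking has many more traps (GC pauses, dead-code elimination, machine noise) than this illustrates.

    // Hypothetical workload: a stand-in for whatever is being measured.
    function workload(): number {
      let sum = 0;
      for (let i = 0; i < 1_000_000; i++) sum += Math.sqrt(i);
      return sum;
    }

    // Naive: one cold run, so JIT warm-up and startup noise are baked into the number.
    function naiveBench(): number {
      const start = performance.now();
      workload();
      return performance.now() - start;
    }

    // Slightly less naive: warm up first, then report the median of many samples.
    function medianBench(samples = 30, warmup = 10): number {
      for (let i = 0; i < warmup; i++) workload();
      const times: number[] = [];
      for (let i = 0; i < samples; i++) {
        const start = performance.now();
        workload();
        times.push(performance.now() - start);
      }
      times.sort((a, b) => a - b);
      return times[Math.floor(times.length / 2)];
    }

    console.log("naive (cold) ms:", naiveBench().toFixed(2));
    console.log("median of warmed runs ms:", medianBench().toFixed(2));

The two numbers routinely differ, and which one you publish changes the story you tell.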
But I wonder, why not forbid public dissemination of inaccurate, non-reproducible benchmarks?
> Spreading wrong performance information can hurt a business.
"But unclear about their benchmarking method". That can be said for every single performance claim they make. There's a rather distinct lack of objective facts.
The post and heading are written to attribute this to malice without offering any proof. It seems likely, given how new Bun is, that the benchmark writers simply lacked familiarity with it.
"Posting benchmark results is bad because it quickly becomes a race to the wrong solution. Someone misrepresented our performance in a benchmark, here are the actual results."