It's nice to see someone tell you that their benchmark is flawed and why. Most try to pass off their benchmark as the be-all and end-all measurement of whatever they are testing.
I'm not familiar enough with this realm to comment on the veracity of the claims, but it could very well be true.
"Posting benchmark results is bad because it quickly becomes a race to the wrong solution. Someone misrepresented our performance in a benchmark, here are the actual results."
> also, this comparison would not register as a proper “benchmark” as it’s not even close to how you would perform a proper benchmark. it’s more of a data point.
I would prefer to not have to argue about that in court...
> Please - verify, verify, verify and think critically about what you read.
If you're going to excoriate someone for an improper benchmark, then provide one of your own and advise your audience to "verify," it might be wise to include instructions for how to reproduce your results.
Not really. It's a lose/lose proposition to publish benchmarks, especially for something that's so environment- and dataset-dependent. Either real-world performance comes in way below the benchmark, leading to all kinds of storm and strife, or way above it, and then you lose credibility and people wonder why you bothered in the first place.
> Doing an entire benchmark on that basis without making that clear is misleading.
A benchmark which provides source code and test data should never be categorized as "misleading." At worst it puts too much trust in the average reader's understanding of things. But look at it this way: if you littered every page that included a benchmark with all the relevant caveats, it'd be a total mess.
Or as David Simon says, "fuck the average viewer."
> just not by the 50% the test might show but it's still going to be 5% better.
Sort of, indeed. Yet when you look at any promotional/marketing material, you see all those phallic bar graphs and just how much bigger one bar is than the next. Beyond that, heavy cache utilization hides an inferior memory subsystem (latency/throughput), and the latter tends to be quite important in the real world. Overall, benchmarks/tests that use a handful of MB as their dataset and finish in hundreds of milliseconds should not be treated as representative for most use cases.
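To make the cache point concrete, here's a minimal sketch (my own illustration, not from the article; it assumes a 64-bit Linux/glibc machine with a few MB of last-level cache, built with something like `gcc -O2`). It times the same summation over a working set that fits in cache and one that clearly doesn't, and the per-byte throughput it reports differs dramatically:

```c
/* Sketch: small vs. large working sets. The small one mostly measures the
   cache hierarchy; the large one is dominated by DRAM bandwidth/latency.
   Sizes and rep counts are arbitrary illustrative choices. */
#include <stdio.h>
#include <stdlib.h>
#include <time.h>

static double sum_gbps(size_t n_bytes, int reps) {
    size_t n = n_bytes / sizeof(long);
    long *a = malloc(n * sizeof(long));
    if (!a) { perror("malloc"); exit(1); }
    for (size_t i = 0; i < n; i++) a[i] = (long)i;

    volatile long sink = 0;          /* keep the sum from being optimized away */
    struct timespec t0, t1;
    clock_gettime(CLOCK_MONOTONIC, &t0);
    for (int r = 0; r < reps; r++) {
        long s = 0;
        for (size_t i = 0; i < n; i++) s += a[i];
        sink += s;
    }
    clock_gettime(CLOCK_MONOTONIC, &t1);
    double secs = (t1.tv_sec - t0.tv_sec) + (t1.tv_nsec - t0.tv_nsec) / 1e9;
    free(a);
    return (double)n_bytes * reps / secs / 1e9;  /* GB/s of data touched */
}

int main(void) {
    /* ~4 MB: likely cache-resident on a modern CPU; ~1 GB: forced out to DRAM. */
    printf("small (4 MB) : %6.1f GB/s\n", sum_gbps(4UL << 20, 256));
    printf("large (1 GB) : %6.1f GB/s\n", sum_gbps(1UL << 30, 1));
    return 0;
}
```

A benchmark built around the first case can make a chip with a weak memory subsystem look great; only the second case (or a realistic dataset) shows how it behaves once the working set stops fitting in cache.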
> But the reality is, nearly no one even does comprehensive OS benchmarking anymore - so there isn't really a good alternative source to use.
Imho benchmarks only get you so far.
If you are at all serious about performance, you should do your own testing and benchmarking. Benchmarks from other people should only help you select candidate platforms on which to run your own benchmarks.
There is no such thing as a _universally_ "relevant benchmark". People will never agree on a single testing suite, since performance varies too much between workloads, and they have no reason to believe the test manufacturer is impartial.
> Since most benchmarks are unscientific, this one is as good as any.
> 1) You seem to be using "unscientific" as nothing more than an insult.
More like preemptively declaring that I don't want to deal with the HN crowd replying with "but benchmarks are worthless".
> 2) Your conclusion "this one is as good as any" doesn't follow from your premise that "most benchmarks are unscientific" -- "this one" might well be different from "most benchmarks".
IMO shootout benchmarks are better than random benchmarks on the web, as they detail the testing environment, provide the source code, and can be recreated.
"Well these tests don't meet my expectations so lets keep testing using other benchmarks until I'm validated".