> A nitpick, perhaps, but isn't that three orders of magnitude?
Perhaps the example was a best-case, and the usual improvement is about 10x. (That or 'order of magnitude' has gone the way of 'exponential' in popular use. I don't think I've noticed that elsewhere, though.)
Accurate, but a half truth. The prize was for a 10% improvement, but before that solution was produced they had already improved by 8.4%. The headline makes it sound like the improvement from zero to 10% was not worth the engineering cost, but really it was the improvement from 8.4% to 10% which cost too much.
It also just so happened to quintuple his returns. Not exactly an apples-to-apples comparison when you have a 4% (realistic) vs. a 10% (unrealistic) return built into the model.
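For a rough sense of why the assumed return dominates, here's a quick sketch (the ~30-year horizon is my own round number, purely for illustration):

    # Sketch: how the assumed annual return drives the projected outcome
    # (the 30-year horizon is an assumed, illustrative figure)
    years = 30
    realistic = 1.04 ** years    # ~3.2x growth at 4%/yr
    optimistic = 1.10 ** years   # ~17.4x growth at 10%/yr
    print(optimistic / realistic)  # ~5.4 -- roughly a quintupling of the projected result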
> but in the grand scheme of things, 1% absolute improvement may not be such a game-changer, especially if it comes at the cost of other relevant metrics like model complexity, developer sanity or performance
fasttext makes errors about 10% of the time, and our approach makes errors about 5% of the time. It's certainly fair to say (although nitpicky) that "accuracy" isn't quite the right term here (I should have said "half the error").
But as for your general sigh/rant... absolute improvement is very rarely the interesting measure. Relative improvement tells you how much your existing systems will change. So if your error goes from 5% to 4%, then you have 20% fewer errors to deal with than you used to.
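In code form, with the same illustrative numbers:

    # Absolute vs. relative improvement in error rate (illustrative numbers)
    old_error = 0.05   # 5% error
    new_error = 0.04   # 4% error
    absolute = old_error - new_error                 # 0.01 -> "1% absolute improvement"
    relative = (old_error - new_error) / old_error   # 0.20 -> "20% fewer errors"
    print(f"absolute: {absolute:.1%}, relative: {relative:.1%}")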
An interesting example: the Kaggle Carvana segmentation competition had a lot of competitors complaining that the simple baseline models were so accurate that the competition was pointless (it was very easy to get 99% accuracy). The competition administrator explained, however, that the purpose of the segmentation model was automatic image pasting into new backgrounds, where every mis-classified pixel leads to visible artifacts (and across a million+ pixels, even a low error rate is a lot of bad pixels!).
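Quick back-of-the-envelope (the ~1-megapixel image size is my assumption):

    # Even 99% per-pixel accuracy leaves many bad pixels in a large image
    pixels = 1_000_000        # assumed ~1-megapixel image
    accuracy = 0.99
    print(f"{pixels * (1 - accuracy):,.0f} misclassified pixels")  # 10,000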
> we examine the contribution of more computing power to better outcomes
No, they pick a set of problems where computational methods are known to have a beneficial impact and then plot every advance in those fields against increased amounts of computing. Since the amount of computing power used rises monotonically, and Elo score / Go performance / weather-prediction success also trends monotonically, the correlation is pretty high. However, computing power is not the only thing that rose mostly monotonically during that time. At best they derived an upper bound on the contribution of more computing power to better outcomes.
For example, in Mixed Integer Linear Programming, studies were done to measure algorithmic vs. hardware speedup: "On average, we found out that for solving LP/MILP, computer hardware got about 20 times faster, and the algorithms improved by a factor of about nine for LP and around 50 for MILP, which gives a total speed-up of about 180 and 1,000 times, respectively." https://arxiv.org/abs/2206.09787 This methodology would attribute the full 1,000x effect to the increase in FLOPs alone.
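To spell out the decomposition (the factors are the ones quoted above):

    # Total speed-up = hardware factor x algorithmic factor (figures from the quoted study)
    hardware = 20
    algo_lp, algo_milp = 9, 50
    print(hardware * algo_lp, hardware * algo_milp)   # ~180, ~1000
    # A compute-vs-progress regression sees only the ~1000x total and would
    # credit all of it to hardware/FLOPs, though hardware supplies just 20x.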
And just a methodological concern: taking the logarithm of one axis is a non-linear transformation, so doing a linear fit afterwards gives a distorted measure of the distance between fit and data, depending on the data. This effect was not discussed. It does mess with the R value, so I would not feel comfortable using that R value to derive an attribution.
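A minimal sketch of what I mean (synthetic data, numpy only): the fit minimizes log-space residuals, and the R^2 you compute there is not the R^2 of the same curve in the original space.

    import numpy as np

    rng = np.random.default_rng(0)
    x = np.linspace(1, 10, 50)
    # noisy exponential growth with multiplicative noise
    y = 2.0 * np.exp(0.5 * x) * rng.lognormal(sigma=0.3, size=x.size)

    def r_squared(y_true, y_pred):
        ss_res = np.sum((y_true - y_pred) ** 2)
        ss_tot = np.sum((y_true - y_true.mean()) ** 2)
        return 1 - ss_res / ss_tot

    # linear fit on the log-transformed axis (what a semi-log plot effectively does)
    slope, intercept = np.polyfit(x, np.log(y), 1)
    y_hat = np.exp(intercept + slope * x)

    print("R^2 in log space:     ", r_squared(np.log(y), intercept + slope * x))
    print("R^2 in original space:", r_squared(y, y_hat))
    # The two numbers differ, so a log-space R tells you little about how much
    # of the variance in the untransformed outcome the fit actually explains.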
They demonstrate an improvement from the previous 5% to 7%.