
I'm a scientist from a field outside ML who knows that ML can contribute to science. But I'm also really sad to see false claims in papers. For example, a good scientist can read an ML paper, see claims of 99% accuracy, and then probe further to figure out what the claims really mean. I do that a lot, and I find that accuracy inflation and careless mismanagement of data mar most "sexy" ML papers. To me, that's what's going to lead to a new AI winter.



The other day someone lamented that you can't get published as an honest ML researcher, because other researchers are claiming to render whole professions obsolete all the time...

This is a worryingly good point. Most ML papers represent real results, so we're not going to see a replication crisis in that sense, but I've heard fears of another AI winter arriving when people realize that the sum of our reported gains vastly exceeds our actual progress.

Hyperparameter tuning is one big concern here; we know it yields good results at the cost of lots of work, so there's a temptation to sic grad students on extensive tuning but attribute the gains to the technique when publishing. Dataset bias is another, since nets trained on CIFAR or ImageNet keep turning out to embed dataset-specific artifacts.
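To make the first concern concrete, here's a minimal sketch (my own toy example with scikit-learn; the dataset, model, and grid are arbitrary choices) of how tuning against the test set inflates the number you report, compared with tuning on a separate validation split:

    # Toy illustration: tuning a hyperparameter on the test set vs. on a validation split.
    from sklearn.datasets import load_breast_cancer
    from sklearn.model_selection import train_test_split
    from sklearn.svm import SVC

    X, y = load_breast_cancer(return_X_y=True)
    X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
    X_tr, X_val, y_tr, y_val = train_test_split(X_train, y_train, random_state=0)

    grid = [0.01, 0.1, 1.0, 10.0, 100.0]

    # Honest protocol: choose C on the validation split, report one number on the test set.
    best_C = max(grid, key=lambda C: SVC(C=C).fit(X_tr, y_tr).score(X_val, y_val))
    honest = SVC(C=best_C).fit(X_train, y_train).score(X_test, y_test)

    # Leaky protocol: choose C by peeking at the test set, then report that same best score.
    leaky = max(SVC(C=C).fit(X_train, y_train).score(X_test, y_test) for C in grid)

    print(f"validation-tuned test accuracy: {honest:.3f}")
    print(f"test-set-tuned 'accuracy':      {leaky:.3f}  (optimistically biased)")

The leaky number is never lower than the honest one, by construction, and with bigger grids the gap only grows.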

Ironically, I'm not sure all this increases the threat of FAANG taking over AI advancement. It suggests that lots of our numerical gains are brute-forced or situational, and that there's more benefit in work on new models than the raw error percentages would imply.


What pains me is that these are exactly the articles that make people outside the field say "AI fails to deliver", "AI is just a bunch of ifs", or "AI is just snake oil; the new winter is coming".

Most genuinely interesting ML/DL projects are not things the public interacts with directly, so you end up with a ton of inconsequential stuff like this article, and the real progress is just not visible.


ML publication is a complete mess right now.

Anyone can claim anything as long as they do a write-up and include some equations and pretty plots.

It was hard enough 5 years ago to pick out the handful of good papers from the sea of bad research. Now it's getting near impossible.


I know there are a lot of ML researchers and practitioners here - and unfortunately I have only very shallow experience with reading ML papers, especially the recent output.

But I have to ask: how do you get a feel for whether the content is actually correct and not just quackery? The improvements from old models to new ones are usually in the 1% range, and the models are complex. More often than not the papers also lack code, implementation details, experiment procedures, etc.
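For what it's worth, the only sanity check I know how to do myself is a back-of-the-envelope one (toy numbers below: I'm assuming a ~10,000-example test set and a reported 95% accuracy):

    # Rough sanity check: is a 1% accuracy gain larger than the sampling noise
    # of the test set itself? The numbers here are illustrative assumptions.
    import math

    n = 10_000    # assumed test set size
    acc = 0.95    # assumed reported accuracy
    stderr = math.sqrt(acc * (1 - acc) / n)
    print(f"95% interval on the accuracy estimate: +/- {1.96 * stderr:.4f}")
    # ~ +/- 0.0043, so test-set noise alone is about half the size of a 1% "improvement"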

Basically, I have no idea whether the paper is reproducible, whether the results are cherry-picked from hundreds or thousands of runs, whether the paper is just cleverly disguised BS with pumped-up numbers to get grants, and so on.
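The cherry-picking worry in particular is easy to simulate: if run-to-run noise is around one percentage point (a made-up figure, purely for illustration), reporting the best of a couple hundred seeds looks like a real gain:

    # Toy simulation of "best of many seeds" reporting; the noise level is an
    # assumption for illustration, not a measured figure.
    import numpy as np

    rng = np.random.default_rng(0)
    true_accuracy = 0.90
    runs = rng.normal(loc=true_accuracy, scale=0.01, size=200)  # 200 re-runs, different seeds

    print(f"mean over runs:  {runs.mean():.3f}")  # what the method actually does
    print(f"best single run: {runs.max():.3f}")   # what a cherry-picked table would report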

As it is right now, I can only rely on expert assurance from those who peer-review these papers - but even then, in the back of my mind, I wonder whether they've had time to review a paper rigorously. The output of ML/AI papers these days is staggering, and the systems are so complex that I'd be impressed if a single post-doc or researcher had time to reproduce the results.


Some of what you mentioned flew over my head, but I agree that in many cases journals don't even seem to validate things.

Some machine learning papers in the AI field are a bit better: they disclose the training data and in some cases even link the Jupyter notebook for reproducibility. Modern AI, though, lacks open honesty about what training data was used, which I fundamentally dislike and also see as anticompetitive. The inability to validate the methodology, or to scrutinize the training data, is what disturbs me. I personally think that AI dumbed down with safeguards and censorship harms not only the user but also future progress, and the inability to analyze what data a model is trained on is just as concerning, since backdoors and bias can be introduced deliberately in ways that harm both people and privacy.

Take a look at this chemistry video (disclosure: I'm subbed to this channel and many like it). https://www.youtube.com/watch?v=hVJXOi933cc

One major thing that some might gloss over is the sheer cost of stuff.

The benefit, as seen by PhD researchers, seems to be the ability to research topics or ideas that have no immediate commercial use. I agree they definitely focus on and pressure these people to publish results... and that between science YouTubers and business people, the implications of the research get stretched to seem like a miracle or some grand discovery. While those are cool, it's just as important to make the little discoveries, because they broaden our understanding of reality, how everything works, and what is possible.

Check out this really good explanation regarding this problem. https://www.youtube.com/watch?v=czjisEGe5Cw


Even in ML, it's common knowledge that the long tail of papers demonstrates brittle effects that don't really replicate or generalize, and that they often run non-comparable evaluations, fiddle with hyperparameters to fit the test data, use various evaluation tricks (Goodhart's law) to improve the metrics, sometimes don't cite better prior work, etc. Industry people definitely know not to just take a random ML paper and believe it has any use for applications.

This isn't to say there are no good works, but in a field that produces more than 10,000 papers per year, the bulk can't be all that great. Still, academics have to keep their jobs, PhD students have to graduate, etc., so everyone keeps pretending.


If you're interested in machine learning/deep learning/artificial intelligence, for example, I've got bad news about a lot of those papers that appear at NIPS...

One day, an AI will generate a truly groundbreaking research paper, and nobody will read it because they've seen too many bogus ones.

I think people should read at least slightly reputable ML papers instead of AI papers by crazy autodidactic AI experts who are making up all of their terms and won't stop telling you it's going to take over the world because of some math they did once.

Like some of the other ML/AI posts that made it to the top page today, this research too does not give any clear way to reproduce the results. I looked through the pre-print page as well as the full manuscript itself.

Without reproducibility and transparency in the code and data, the impact of this research is ultimately limited. No one else can recreate, iterate, and refine the results, nor can anyone rigorously evaluate the methodology used (besides giving a guess after reading a manuscript).

The year is 2019, and many are finally realizing it's time to back up your results with code, data, and some kind of specification of the computing environment you used. Science is about sharing your work for others in the research community to build on. Leave the manuscript as the pretty formality.
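A minimal version of the environment part (the file name and fields below are just placeholders, not any standard) is to pin your seeds and dump the environment details next to the results:

    # Minimal sketch: record seeds and the computing environment alongside results.
    import json
    import platform
    import random
    import sys

    import numpy as np

    SEED = 1234
    random.seed(SEED)
    np.random.seed(SEED)

    run_info = {
        "python": sys.version,
        "platform": platform.platform(),
        "numpy": np.__version__,
        "seed": SEED,
    }

    with open("run_info.json", "w") as f:
        json.dump(run_info, f, indent=2)

It's not full reproducibility, but it's the kind of breadcrumb that makes replication attempts possible at all.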


I'd say the opposite, as a member of a group at my university that reviews ML papers. First off, right now there seems to be a drive to explain many phenomena in ML, in particular why neural networks are good at what they do. A large body of these papers reaches the point of basically "they are good at modeling the functions that they are good at modeling". The other type of paper you see is researchers drinking the group-theory kool-aid and trying to explain everything through it. At one point we got 4 papers from 4 different groups that tried to do exactly that. All of them were flawed, either in their mathematics or in their assumptions (assumptions that will most likely never hold, like linearity or your data set lying on a manifold). Speaking of math, many papers use very high-level mathematics (functional analysis, homotopy theory) essentially to hide their errors, since nobody bothers to verify it.

Anecdotally: I've tried to replicate some recent AI/ML papers and failed. So have some of my acquaintances.

It's somewhat heart-warming to read the comments here about machine learning. I did my PhD in machine learning from 2007 to 2012, and the main reason I left research was because of the widespread fraud.

Most papers reported improved performance over some other methods on very specific data sets, but source code was almost never provided. Once, I dug so deeply into a very highly cited paper that I understood not only that the results were faked, but precisely the tricks that were used to fake them.

I believe scientific fraud arises primarily from two causes:

- Publish or perish. Everyone's desperate to publish. Some Principal Investigators have a new paper roughly every other week!

- Careerism. For some highly ambitious people, publishing papers comes before everything else, even if that means committing fraud. This happens even with highly successful researchers, who have the occasional brilliant, highly cited paper, but who also publish a lot of incremental, dubious work.

P.S. Mildly off-topic, but I love the Ethereum research community at https://ethresear.ch/ , precisely because it is so open and transparent! I wish an equivalent community existed for machine learning.


There's also "pseudoscience". Unfortunately all too common in ML, where the ratio of published papers to genuine new developments is (charitably) at least 2:1.

I agree, but all of that is also true of a lot of papers in a lot of other fields. I've seen papers about better complex simulations of natural processes - I don't think I've ever seen one with complete code. I've seen papers that described the experiment and the results of their analysis with hand-wavy descriptions of which algorithms and statistical methods they used - but no code.

My point is that the findings of this article are not necessarily indicative of machine learning being junk science; more likely they point to a systemic problem across research, where there's an incentive to publish the paper but not the data, code, models, or even enough information of any kind to successfully replicate the experiment more often than 10-15% of the time.

I remember when LIGO first detected a black-hole collision, they released the raw data and the Jupyter notebooks that were actually used to go from the raw data to the published results. Everyone was floored. It shouldn't be that amazing - it should be the standard. For every field.


Well, I am not defending the paper's thesis, but now it's time to realize that we are in a new AI winter where progress has stopped. Sure, we can make accuracy progress on tasks that were under-researched before, and we do make extremely slow accuracy gains (with increasingly diminishing returns) on core tasks. But the returns are diminishing so fast that, in terms of applications, progress has stopped for core AI tasks such as NLU.

However, there is still some hope: the vast majority of papers bring an innovation but almost never attempt to merge or synergize with other papers' innovations. If human resources were allocated to merging the top 10 papers on a given task, I'm sure it would lead to a major accuracy improvement.


It's very weird to me that a team of researchers who are credibly doing good work and are making an effort to clearly explain their research garners such suspicion.

Are you similarly skeptical of Deepmind, Distill, or Two Minute Papers?

Honestly, the norm of providing two versions of your research (arXiv paper and blog post) makes ML easier to follow and is something I would like more academics to attempt.


I think if the data is valid, open access, and reproducible, it shouldn't matter if the paper was written by AI.
