
I didn't ask if they'd be able to make do. I asked if they'd be satisfied.

Also, wrt

> Again, ML models are not deterministic

ML models are absolutely deterministic if you have the discipline to make them so (which becomes necessary in higher-scale ML work, where the hardware itself is stochastically flaky).
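
For a concrete picture of that discipline, here's a minimal sketch assuming a PyTorch stack; the seed value is arbitrary and the library choice is just one example, not the only way to do it:

```python
import os
import random

import numpy as np
import torch

def make_deterministic(seed: int = 42) -> None:
    # Seed every RNG a training run might touch.
    random.seed(seed)
    np.random.seed(seed)
    torch.manual_seed(seed)  # seeds CPU and all CUDA devices

    # Force deterministic kernels; this raises if an op has no
    # deterministic implementation, instead of silently diverging.
    torch.use_deterministic_algorithms(True)
    torch.backends.cudnn.benchmark = False

    # Required by cuBLAS for deterministic matmuls on CUDA >= 10.2.
    os.environ["CUBLAS_WORKSPACE_CONFIG"] = ":4096:8"

make_deterministic()
# Two runs from this point should be bit-identical on the same
# hardware, drivers, and library versions.
```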




I guess computers are much more deterministic than what ML requires to be useful.

ML, in a very rough way, can be seen as:

1. We have observations and conclusions.

2. We don't know exactly how those observations lead to the conclusions.

3. The assumed procedure that leads from the observations to the conclusions is called the model.

4. With enough (observation, conclusion) pairs, we can train a model that is good enough to make good decisions on future observations (sketched in code below).

The problem for traditional computer science is that the system is so deterministic that we know EXACTLY how it works at the instruction level, while ML is good at dealing with problems that are inherently probabilistic.
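
To make points 1-4 concrete, here's a toy sketch with scikit-learn standing in for "the model"; the dataset is synthetic and purely for illustration:

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# 1-2: pairs of (observation, conclusion) whose exact relationship we don't know
X, y = make_classification(n_samples=1000, n_features=10, random_state=0)

# 3: an assumed procedure from observations to conclusions -- the model
model = LogisticRegression(max_iter=1000)

# 4: with enough pairs, train it and judge it on unseen observations
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
model.fit(X_train, y_train)
print(f"accuracy on unseen observations: {model.score(X_test, y_test):.2f}")
```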


Traditional computing is good at being precise at the cost of being rigid; ML computing is good at being flexible at the cost of being ambiguous.

I don't trust ML computing to be precise in the same way I don't trust humans to be precise (we can all potentially write code with bugs), because the process is fundamentally non-deterministic.

The way we solve this with humans is to create tools that test and verify what we make, backed by some sort of proof.

I guess once ML computing can get feedback from these tools, the situation will improve?


Large ML models tend to be uncorrectably non-deterministic simply from doing lots of floating-point math in parallel. Addition and multiplication of floats are not associative - you may get different results depending on the order in which you add/multiply numbers.
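
You can see the order-dependence with nothing more than plain Python floats:

```python
# Non-associativity in one line: grouping changes the result.
print((0.1 + 0.2) + 0.3 == 0.1 + (0.2 + 0.3))  # False

# The same effect at parallel-reduction scale: summing identical
# numbers in a different order typically gives a different total.
import random

random.seed(0)
xs = [random.uniform(-1, 1) for _ in range(100_000)]
print(sum(xs) == sum(reversed(xs)))  # typically False
print(sum(xs) - sum(reversed(xs)))   # tiny but nonzero difference
```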

Yes, creating the actual ML model and curating its input takes effort, a lot of it in fact. Taking someone else's ML model and shitting prompts into it does not.

Ironically, these sorts of "statistics can't..." arguments are themselves sort of theoretically bankrupt.

Either the thing you want to do is impossible or else a learned model can do it at least almost as well as... idk what the alternative even is, something not learned?

Taking this to the extreme, on an example where I have first-hand experience: I would never recommend replacing your compiler passes with transformers. The latter will be buggier, at best marginally faster, and it will take several orders of magnitude more effort to get them to work well enough for production. ML isn't the right tool. But, I mean, you can do it. You shouldn't. But you can.

To be fair to the article, the title is "won't", not "can't". And that's easier to believe, at least for me. It's not that X is unattainable using ML; it's that some other approach will get there first.


To paraphrase Kahan: it's not interesting to me whether a method is accurate or not, but whether you can predict how accurate it will be. So if ML methods can predict that they're right 98% of the time, then we can build this into our systems, even if we don't understand how they work.

Deterministic methods can produce a result with a single run; ML methods will need an ensemble of results to show the same confidence. It's possible that, at the end of the day, the difference in cost might not be that high over time.
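
A rough sketch of what "an ensemble of results" could look like; the model choice and agreement measure here are illustrative, not a specific recommendation:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

X, y = make_classification(n_samples=1000, random_state=0)

# Train N independently-seeded ensemble members on the same data.
members = [
    RandomForestClassifier(n_estimators=50, random_state=seed).fit(X, y)
    for seed in range(5)
]

preds = np.stack([m.predict(X) for m in members])  # shape (5, n_samples)
votes = preds.mean(axis=0)                 # fraction voting for class 1
confidence = np.maximum(votes, 1 - votes)  # agreement with the majority
print(f"mean per-example agreement: {confidence.mean():.3f}")
# A deterministic method answers in one run; the ensemble pays for N runs
# to attach a confidence estimate to its answer.
```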


Perhaps using ML to craft the deterministic rules and then having a human go over them is the sweet spot.

I don't think you understand how ML models work at all.

It's also not a binary choice between "ML" and a "hand-crafted model".

The most successful applications of ML in my own career have always come from judiciously building small task-specific models that fit inside a broader system, usually with quite a lot of consideration given to the model's input data as well.

Similarly, I have never seen a "just throw a model at it" project succeed in a finite amount of time.


> ML-based systems will always make errors

Sure, but the errors should be randomly distributed. This is stats 101. Any decent ML practitioner will check for this before releasing a model.
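
One common way to run that stats-101 check for a regression model is to look for structure in the residuals on held-out data; the model and dataset below are placeholders for illustration:

```python
from scipy.stats import pearsonr
from sklearn.datasets import make_regression
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

X, y = make_regression(n_samples=500, n_features=5, noise=10.0, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

model = LinearRegression().fit(X_train, y_train)
pred = model.predict(X_test)
residuals = y_test - pred

# If errors are random, residuals should show no structure against predictions.
r, p = pearsonr(pred, residuals)
print(f"residual/prediction correlation: r={r:.3f}, p={p:.3f}")
# |r| near 0 is consistent with randomly distributed errors; clear structure
# here would suggest systematic bias rather than random error.
```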


I think you make an interesting point: ML models and algorithms are two different things.

I also think it's reasonable to ask for a model so you can test and validate it yourself.


Thanks for that! Some people I work with are constantly asking for ML; they invoke it like it's magic that will figure shit out by itself. Then when I push back asking how they would make the decisions themselves, their answers tend to be along the lines of "it's ML, it should figure it out by itself", and when I ask about the data to be used, "it should adapt itself and find the data". Getting to a heuristic in the first place is so hard.

Reminds me of the book "Everything Is Obvious", where they ran a few experiments showing that in complex systems, advanced prediction systems built on many available and seemingly relevant variables are only marginally better (2 to 4% in the experiments) than the simplest heuristics you can use. They interpreted that as a limit of predictability: systems of sufficient complexity behave with a seemingly irreducible random component.


My counterclaim would be that doing ML in code is actually easier these days than doing stuff in Excel (and a 1080 is sufficient). So why not solve the problem with a sufficiently easy-to-use tool? Your other point is valid: if it can be solved deterministically, then you should probably do that (but I'd probably not reach for C, since I don't need the C-efficiency hammer right away).

I overall agree with the sentiment on delivery and needing to deal with a variety of issues, but one nit:

If it were my business or my team I would want an ML Engineer to relentlessly knock down any barriers to getting models into production.

My favorite is when they knock down the question of whether a certain project even needs ML to focus on getting ML into production.

Many teams fail at ML because there was an essential task that nobody wanted to do that didn't get done.

Many teams fail at ML because there was a task that didn't need ML (or at least anything more than a linear model) that is made opaque to anyone but an ML engineer after they implement it.


An ML model might be able to, even if you can’t.

They should, and it shouldn't be difficult. I think the issue is more that ML gives a researcher more tools to get any arbitrary result if they know what they're looking for.

What difference would this make? Using the ML model's output as input would merely make the model think it is indeed correct.

The ML models would not be ruined beyond how they already are. And if their current state is insufficient to force anything, this would just be more of the same.


The best reason to hire a person to do something is that they'll give you what you need, not what you asked for. An ML model does not do this.

It's hard enough making an ML model consistent with itself.
