
> all existing AI systems are obviously halting computations simply because they are acyclic dataflow graphs

No, they aren't. Think about LSTMs, for example.
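To make that concrete, here's a minimal sketch (using PyTorch's nn.LSTMCell purely as an illustration; the framework choice is my assumption) of why an LSTM is not an acyclic dataflow graph: the hidden and cell states are fed back into the same cell at every step, so the computation has a cycle over time.

```python
import torch
import torch.nn as nn

# A single LSTM cell; its state is fed back into itself at every step.
cell = nn.LSTMCell(input_size=8, hidden_size=16)

h = torch.zeros(1, 16)  # hidden state
c = torch.zeros(1, 16)  # cell state
for t in range(5):
    x_t = torch.randn(1, 8)      # input at step t
    h, c = cell(x_t, (h, c))     # outputs loop back in: the recurrence
```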




> I must have missed the part when it started doing anything algorithmically. I thought it’s applied statistics, with all the consequences of that.

This is a common misunderstanding. Transformers are actually Turing complete:

* On the Turing Completeness of Modern Neural Network Architectures, https://arxiv.org/abs/1901.03429

* On the Computational Power of Transformers and its Implications in Sequence Modeling, https://arxiv.org/abs/2006.09286


> When one is coming from Erlang, and sees an artificial neural network diagram/topology, one cannot help but see the 1-to-1 mapping between the Erlang concurrency model, and neural networks; it’s almost eerie.

Is that some kind of a joke?

A quick google image search for 'Erlang concurrency models' [1] reveals nothing that has an eerie similarity to a NN.

The only noteworthy thing is that there are graph diagrams spread here and there. (Please tell me you're not unaware that the graph is among the most fundamental constructs in computer science, something that pops up in pretty much every CS topic in one form or another.)

[1] https://www.google.com/search?q=Erlang+concurrency+model&tbm...


> There is the biological analogue, which has inspired neural network

I'm somewhat aware, but the computational graph seems like the most generic computational idea I can think of, and I'm surprised it isn't explored more.

> Another ancestor would be the Data-Flow paradigm:

Oh yeah, data flow is definitely another one.


> What so-called neural networks do should not be confused with thinking, at least not yet.

I disagree:

I think neural networks are learning an internal language in which they reason about decisions, based on the data they’ve seen.

I think tensor DAGs correspond to an implicit model for some language, and we just lack the tools to extract that. We can translate reasoning in a type theory into a tensor DAG, so I’m not sure why people object to that mapping working the other direction as well.


>The problem is data movement, not the computation.

This is why I said the problem is passing them around.

>the fact of the matter is that they're still far behind RNNs/LSTMs.

Technologies that are far more developed purely based on the number of man-years poured into them.


> as I scientist I am very uninterested in AI based on neural nets because of the lack of explication

Neural nets are more like reflexes than reasoning. Most of them are feed-forward, and some don't even have memory and can't resolve references. So it's unfair to expect a job best done with graphs or memory-attention to be done by a rudimentary system that only knows how to map X to y.

But they are not totally unexplainable: you can take gradients with respect to the input to see which parts of the input most influenced the output, and then perturb those inputs to see how the output changes.
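As a rough sketch of both techniques, with a hypothetical toy classifier standing in for the network (PyTorch is just my assumed tooling here):

```python
import torch
import torch.nn as nn

# Hypothetical toy classifier standing in for "a net that maps X to y".
model = nn.Sequential(nn.Linear(10, 32), nn.ReLU(), nn.Linear(32, 2))
x = torch.randn(1, 10, requires_grad=True)

# 1) Gradients on the input: which features most influenced the output?
score = model(x)[0, 1]        # score for class 1
score.backward()
saliency = x.grad.abs()       # larger magnitude = more influence

# 2) Perturb the most influential feature and see how the output changes.
top = saliency.argmax()
x_perturbed = x.detach().clone()
x_perturbed[0, top] += 1.0
delta = model(x_perturbed)[0, 1] - score.detach()
print(saliency, delta)
```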


> The neural engine is small and inference only

Why is it inference only? At least the operations are the same...just a bunch of linear algebra
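For what it's worth, inference really is just matrix multiplies plus elementwise nonlinearities; it's training that additionally needs gradients and weight updates, which is presumably what such an engine leaves out. A minimal sketch with made-up weights:

```python
import numpy as np

rng = np.random.default_rng(0)
W1, b1 = rng.standard_normal((8, 16)), np.zeros(16)  # hypothetical weights
W2, b2 = rng.standard_normal((16, 4)), np.zeros(4)

def infer(x):
    h = np.maximum(x @ W1 + b1, 0.0)  # ReLU
    return h @ W2 + b2                # logits: nothing but linear algebra

print(infer(rng.standard_normal((1, 8))))
```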


> The problem you describe (model architecture will work for which problem) is not lacking rigour, but due to the Turing complete expressive power of neural networks.

Non-recurrent neural networks are not Turing-complete in any sense.


>I think these days with neural nets being better understood perhaps we dont fall into this thought trap so much.

From what I've read, the designers of AI/ML systems are less and less able to definitively explain how the algorithm works, or how the system is going to respond to a given input. I suppose for 'sentient' AI that's the goal, but I find it a bit scary if we get a result from a system and nobody can tell you why or how it was computed.


> doesn’t it then mean that whatever process is encoded in the NN, it should both be possible to represent in some more efficient representation...?

Not if NNs are complex systems[1] whose useful behavior is emergent[2] and therefore non-reductive[3]. In fact, my belief is that if NNs and therefore also LLMs aren't these things, they can never be the basis for true AI.[4]

---

[1] https://en.wikipedia.org/wiki/Complex_system

[2] https://en.wikipedia.org/wiki/Emergence

[3] https://en.wikipedia.org/wiki/Reductionism, https://www.encyclopedia.com/humanities/encyclopedias-almana..., https://academic.oup.com/edited-volume/34519/chapter-abstrac...

[4] Though being these things doesn't guarantee that they can be the basis for true AI either. It's a minimum requirement.


> LLMs can form new memories dynamically. Just pop some new data into the context.

No, that's an illusion.

The LLM itself is static. The context acts as a sort of temporary memory that doesn't affect the learned weights of the network at all.
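A toy sketch of what I mean, with a hypothetical next-token model standing in for the LLM (names and sizes are made up): the parameters never change during generation; only the context tensor grows, and the "memory" vanishes once the context is discarded.

```python
import torch
import torch.nn as nn

# Hypothetical stand-in for a frozen language model.
vocab, dim = 100, 32
embed = nn.Embedding(vocab, dim)
head = nn.Linear(dim, vocab)

context = torch.tensor([1, 5, 7])            # "pop some new data into the context"
with torch.no_grad():                         # no weights are updated here
    for _ in range(3):
        hidden = embed(context).mean(dim=0)   # crude pooling over the context
        next_tok = head(hidden).argmax()
        context = torch.cat([context, next_tok.unsqueeze(0)])

# The "memory" lives only in `context`; embed/head parameters are unchanged.
```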

I don't get why people who don't understand what's happening keep insisting that these systems are some sci-fi conception of AI. They're not. At least not yet.


> we may likely discover that a huge portion of [a human brain] is redundant

Unless one's understanding of the algorithmic inner workings of a particular black-box system is very good, it is likely not possible to discard any of its state, or even to implement any kind of meaningful error detection if you do discard some.

Given the sheer size and complexity of a human brain, I feel it is actually very unlikely that we will understand its inner workings to such a significant degree anytime soon. I'm not optimistic, because so far we have no idea how even laughably simple (in comparison) AI models work[0].

[0] "God Help Us, Let's Try To Understand AI Monosemanticity", https://www.astralcodexten.com/p/god-help-us-lets-try-to-und...


> AI doesn’t operate like that

Wrong. AI models use various regularization schemes that essentially introduce "flaws" and noise into the process. These flaws are closer to optimal than whatever the brain does, but they are there.
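Dropout and weight decay are the standard examples; a minimal sketch (hypothetical layer sizes, PyTorch assumed): in training mode the same input produces different outputs because activations are randomly zeroed.

```python
import torch
import torch.nn as nn

layer = nn.Sequential(nn.Linear(16, 16), nn.ReLU(), nn.Dropout(p=0.5))
# Weight decay penalizes large weights during training (not stepped here).
opt = torch.optim.SGD(layer.parameters(), lr=0.01, weight_decay=1e-4)

layer.train()             # dropout active: deliberately noisy forward passes
x = torch.randn(4, 16)
print(layer(x)[0, :4])
print(layer(x)[0, :4])    # different result for the same input
```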


>> your comment about runtime complexity does not make much sense when there exist problems which provably cannot be solved in linear time.

Look, the human mind obviously manages to solve such problems in sub-linear time. We can do language, image processing and a bunch of other things, still much better than our algorithms. And that's because our algorithms are going about it the dumb way, trying to learn approximations of a probably infinite process from data, which is impossible to do in linear time or better. In the short term, sure, throwing lots of computing power at that kind of problem speeds things up. In the long term it just bogs everything down.

Take vision, for instance (my knowledge of image processing is very shaky, but still). CNNs have made huge strides in image recognition and the like, and they're wonderful and magickal, but the human mind still does everything a CNN does, in a fraction of the time and with added context and meaning on top. I look at an image of a cat and I know what a cat is. A CNN identifies an image as being the image of a cat and... that's it. It just maps a bunch of pixels to a string. And it takes the CNN a month or two to train at the cost of a few thousand dollars; it takes me a split second at the cost of a few calories.
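The "maps a bunch of pixels to a string" point, as a sketch (a hypothetical tiny CNN, not any real classifier):

```python
import torch
import torch.nn as nn

# Hypothetical tiny CNN: it maps a pixel tensor to a label index, which we
# look up in a string table. There is no notion of what a "cat" is.
cnn = nn.Sequential(
    nn.Conv2d(3, 8, kernel_size=3, padding=1), nn.ReLU(),
    nn.AdaptiveAvgPool2d(1), nn.Flatten(),
    nn.Linear(8, 3),
)
labels = ["cat", "dog", "aardvark"]

image = torch.randn(1, 3, 64, 64)  # stand-in for real pixels
print(labels[cnn(image).argmax(dim=1).item()])
```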

It would take me less than a second to learn to identify a new animal, or any thing, from an image, and you wouldn't have to show me fifteen hundred different images of the same thing in different contexts, different lighting conditions or different poses. If you show me an image of an aardvark, even a bad-ish drawing of one, I'll know an aardvark when I see it _in the flesh_, with very high probability and very high confidence. Hell, case in point: I know what an aardvark is because I saw one in a Pink Panther cartoon once.

What we do when we train with huge datasets and thousands of GPUs is just wasteful, it's brute forcing and it's dumb. We're only progressing because the state of the art is primitive and we can make baby steps that look like huge strides.

>> It is dangerous to discourage research on that simplistic basis

It's more dangerous to focus all research efforts on a dead end.


> The model's intuition doesn't work like a human's

The model doesn't have intuition; it is just a series of computations.


> But this argument that because of some very theoretical view on the problem the current engineering solutions should be abandoned

That's not the author's argument at all. Even the subtitle of the article says "We’ll need both deep learning and symbol manipulation to build AI."


> build a system of models, each of which simplifies the function of a particular neural region to something computationally tractable

I think I've worked on too many large software systems, but my guess?

The human brain is the world's worst spaghetti code and this will be impossible.


> That’s not what I’m talking about. This is a basic analysis topic:

It's the same basic flaw: requiring continuous functions. Not all functions are continuous, therefore this is not sufficient.

> And you’re still ignoring the cybernetics, and perceptrons movement I keep referring to which was more than 100 years ago, and informed by Turing.

What about them? As long as they're universal, they can all simulate brains. Anything after Church and Turing is just window dressing. Notice how none of these new ideas claimed to change what could in principle be computed, only how much easier or more natural this paradigm might be for simulating or creating brains.


> So the real question is this: do those features (spiking, conduction delays) actually make biological neural networks capable of computing something that Turing Machines and Artificial Neural Networks cannot?

Aren't Turing machines proven to be able to compute anything computable, so isn't this already known to be a "no"?
