
There is more of a point to actual learning than to creating a language model.



Language models show that language is predictive, not rule based.

Every bit of research in the last 30 years shows this.

If you want to ignore that and use rule-based approaches, that’s fine, but a) you’re wrong and b) it doesn’t work.

/shrug

Of course people want rules, they think it’s a shortcut to learning.

…but there is no shortcut to learning. You have to do the hard work.


Reflection is the thing. Language models don't reflect.

And I don't see any fundamental reason to believe that language models can learn or reason at all.

I feel like the big language models have proved this style of learning a language is the wrong approach.

I learnt Japanese; I studied it for 4 years and spent a year in Japan.

You know what worked?

Lots of examples of people using particles.

What did not work?

Text books explaining what the particles do.

A grammatical study of particles is only useful after you’ve gained an understanding of when you should use them from shed loads of examples.

It helps you refine the finer points of when to use them, technically and in formal writing.

For early learning, I posit it’s next to useless.

Language is not a well designed programming language full of orthogonal concepts.

This has long been an argument, but language models really nail down the fact that a probabilistic, “similar to existing examples” approach to language is categorically superior to attempting to construct semantically correct statements from “rules”.


Large language models are also not databases of text.

It's a language model, not a mathematical model

It's a language model. It models language, not knowledge.

Language models aren't built to do that, and if you want to make predictions or calculations, they're probably not the best choice.

This is by no means a practical exercise... it's there to demonstrate the capabilities of a large language model, not to reduce boilerplate.

It is difficult to see an argument that the output of a language model is not derived from the language model, other than that people would prefer it wasn't.

But there isn't such a thing as a raw model, is there? In order to receive anything from a language model, it has to 'learn' some objective. And this objective has to be imposed from above.

Yeah, the issue is you can generate data, but it won’t be good data. Training over random strings won’t teach you language, but it’s technically data.

What's really interesting is that these models are using some non-trivial portion of all easily accessible human writing -- yet humans learn language really well with significantly less input data. What's missing in the field to replicate human performance in learning?

Humans use language to accomplish tasks in their environment - establishing relationships, making deals, coaxing others, etc. By contrast, all neural language models do is predict the next word as a function of the previous words. So far, these language models have nothing at all to do with language learning. They're only valuable insofar as they advance downstream engineering tasks like machine translation.
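The next-word-prediction objective described above can be illustrated with a toy sketch. This is a simple bigram count model in Python — the corpus, function names, and single-word context window are all invented for illustration; real neural language models condition on much longer contexts and learn a parameterized distribution rather than raw counts:

```python
from collections import Counter, defaultdict

def train_bigram_model(corpus):
    """Count how often each word follows each other word."""
    counts = defaultdict(Counter)
    for sentence in corpus:
        words = sentence.split()
        for prev, nxt in zip(words, words[1:]):
            counts[prev][nxt] += 1
    return counts

def predict_next(counts, prev_word):
    """Return the most frequent continuation of prev_word, or None."""
    if prev_word not in counts:
        return None
    return counts[prev_word].most_common(1)[0][0]

corpus = [
    "the cat sat on the mat",
    "the cat ate the fish",
    "the dog sat on the rug",
]
model = train_bigram_model(corpus)
print(predict_next(model, "the"))  # "cat" — the most frequent word after "the"
print(predict_next(model, "sat"))  # "on"
```

Even this crude model "predicts" rather than applying grammar rules: it only reflects the statistics of its examples, which is the distinction the comment is drawing.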

Large language models do not an intelligence make. We have a very, very long way to go.

A language model is just a language model. It may well be an important part of an AI at some point, but it’s not going to be the whole thing.

Not all language tasks, even, are going to be best handled by these models.

Yeah, that's why just updating the weights on the models such as they are doesn't work. But they're right that it's desirable to have some sort of online learning, whether on top of a frozen language model, or through some not yet invented way to do it end to end.

I don't know where this idea comes from that we can get more out of language models than what we put in. Thinking we can process any amount of data and get a competent surrogate mind out of it borders on magical thinking.
