
You're just describing information theory. Every message is contextualized within the codebooks used to encode/decode it. Sometimes we call that codebook "language", sometimes it uses another representation.



Information theory has a meaning that is (I think) distinct from the one you are giving it. https://en.wikipedia.org/wiki/Information_theory

Pornhub was just an exaggerated example to illustrate how different the domains of words can be across languages.

I got curious in general about how one could encode all the knowledge in a system, whatever the system.

In my mind I perceive this question as an extension of the information theory concept of measuring transmitted information.


Makes sense, I thought it was something completely unrelated to information theory.

> Every byte of information, if it actually "informs", contains with it an implicit agenda and context

What about math? You think that has an agenda?


I'm assuming it's just the general term for a 'piece of transmitted information', no matter what the medium.

Data isn't the carrier, it isn't the signal (information), and it certainly isn't the meaning (interpretation). A reasonable first approximation is that data is the _message_.

If all information has to be accounted for and stored somewhere, and context is part of the information, then you can't store any information without storing all information, everywhere. Because every bit of information exists inside the context of the entire universe.

Appropriate that you're talking about idiosyncratic, domain-specific usage of terminology: I'm not sure what you mean by "information theoretic" here. I've always heard it used to refer specifically to the information theory of Claude Shannon.

Information in the "entropy" sense is objective and meaningless. Meaning only exists within a context. If we think of "data" as representing information, then "interpreters" bring us context and therefore meaning.
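To make the "entropy sense" concrete, here's a minimal sketch (the example strings are made up for illustration): Shannon entropy depends only on symbol frequencies, so a sentence and a meaningless anagram of it score identically.

```python
from collections import Counter
from math import log2

def shannon_entropy(message: str) -> float:
    """Bits per symbol, computed from symbol frequencies alone.
    Any two texts with the same frequency profile get the same
    entropy -- the measure is blind to meaning."""
    counts = Counter(message)
    total = len(message)
    return -sum((c / total) * log2(c / total) for c in counts.values())

# Same entropy, very different "meaning" to a human reader:
print(shannon_entropy("attack at dawn"))
print(shannon_entropy("datta canw tka"))  # same characters, shuffled
```

The interpreter (a human reader, a decoder with a codebook) is what turns one of those strings into something meaningful; the entropy number can't tell them apart.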

Or, what you mean to say: a bit of controlled information may only be classified by virtue of its context.

>Scientists started with written texts from 17 languages, including English, Italian, Japanese, and Vietnamese. They calculated the information density of each language in bits—the same unit that describes how quickly your cellphone, laptop, or computer modem transmits information.

I feel like the whole "bits" calculation is a neat way to get into the media, but not actually related to "information density".

Edit: Been informed I'm deeply ignorant on Information Theory.


“Information theory” might be a misnomer.

I would say that data is "encoded information". But then again is there any other kind?

They're using information in a technical sense, in the sense of "information theory" and maybe "Kolmogorov complexity".

I love maths when it takes complicated problems and models them in simple terms. I think "Information Theory" actually does the opposite - it's taken a problem about *data encoding* and articulated that problem as something that can tell us about the properties and nature of *information* itself. The theory implies that pieces of information are comparable with other pieces of information but this is only true inside a strict set of boundaries defining what can possibly be encoded.

Investigating the nature of information and finding ways to measure it is a subject that will never sit comfortably in the sciences. That's okay, let's not force it! Maybe it's time we reconsider labeling this with the very grand title of "Information Theory".


Information theory addresses semantics through mutual information. 'Meaning' entails reference, and when A refers to B they have mutual information. Mutual information is how Shannon measures the carrying capacity of a channel.
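A small sketch of the mutual-information point, using the identity I(A;B) = H(A) + H(B) - H(A,B). The joint distribution below is invented for illustration; it just makes A and B correlated so that one "refers to" (carries information about) the other.

```python
from math import log2

def entropy(dist):
    """Shannon entropy in bits of a probability distribution."""
    return -sum(p * log2(p) for p in dist if p > 0)

# Hypothetical joint distribution p(a, b) over two binary variables,
# skewed toward agreement so A and B share information.
joint = {(0, 0): 0.4, (0, 1): 0.1, (1, 0): 0.1, (1, 1): 0.4}

# Marginal distributions of A and B.
p_a = [sum(p for (a, _), p in joint.items() if a == x) for x in (0, 1)]
p_b = [sum(p for (_, b), p in joint.items() if b == y) for y in (0, 1)]

# I(A;B) = H(A) + H(B) - H(A,B): how much knowing B tells us about A.
mi = entropy(p_a) + entropy(p_b) - entropy(joint.values())
print(mi)  # positive, because A and B are correlated
```

If the joint distribution factored into independent marginals, `mi` would come out to zero: no reference, no shared information.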

From someone who has dabbled in information theory (the real one), I am just as confused as you. What I have observed in the past decade is that calling things “information theory of something” makes it somehow more palatable for a broader audience.

It's the metadata. Not the message contents.

This hits on an interesting point.

There is an entropy limit to the message, but the message isn't actually the only data.

One thing humans are great at is integrating existing knowledge into a messy situation and intuiting more than is available just from the raw message.

I.e. The message has an entropy limit, but the message isn't the whole dataset.
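One way to put numbers on "the message isn't the whole dataset": a receiver's prior knowledge acts like a conditioning distribution, and the message carries fewer bits of surprise for them. The distributions below are made up for illustration.

```python
from math import log2

def entropy(dist):
    """Shannon entropy in bits of a probability distribution."""
    return -sum(p * log2(p) for p in dist if p > 0)

# Hypothetical: four possible messages, uniform on the wire.
H_message = entropy([0.25] * 4)  # 2 bits of uncertainty

# A receiver with outside knowledge assigns skewed probabilities
# ("it's Monday, the meeting is almost certainly at 3pm"), so the
# same message resolves less uncertainty for them.
H_given_context = entropy([0.85, 0.05, 0.05, 0.05])

assert H_given_context < H_message
```

The entropy limit bounds what the raw message can carry; the rest of what a human "intuits" is supplied by everything they already knew.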

