
Yes! Like 'data' - it's singular ffs, so say 'the data in this paper shows' not 'the data show'. Looking at you TWiV!!

Maybe if it still makes sense semantically to replace 'data' with 'data points', then the plural is ok. However mostly we are talking about a 'wodge' of data, with discussion, caveats, etc, more than a simple list of 'datums'.




I'm pretty prescriptivist when it comes to grammar, but I also think that "data" should be referred to in the way you describe, and for those reasons. When people say "data" they generally mean "dataset", which is unambiguously singular. It just doesn't make sense to say "dataset" when "data" will do.

I would argue that prescriptivism in English still refers solely to English. The idea that since data is plural in Latin, it must also be plural in English doesn't follow for me. Plenty of words change plurality when being taken into another language, e.g.

* Laos is plural in French but singular in English.

* Cherise is singular in French, but was rebracketed as a plural in English (and then transformed to cherry as the singular).


Wow, super interesting. Apparently that is called "back-formation", and applies to other words like "caper", "pea", and "burgle" (from "burglar"). [1]

1: https://en.wikipedia.org/wiki/Back-formation


"Data" as a grammatical plural seems to be more common in UK English than in US English.

I think the way to properly pluralize it is by adding a unit, the same way we'd do with words like "information" (bits of information), "work" (Joules of work), or other meta/physical quantities (rays of sunshine, pangs of heartache, etc.).


What gives you the idea that data is singular? And then what is datum?

Ya, what is a datum? Is that like a piece of data?

I've heard that data is actually supposed to be plural, but that "correct" usage is so far outside of my experience that it just seems weird and confusing.

The one I actually find annoying though is when people use "an" before a word that starts with a (pronounced) h, like "an historic building". It's like, dude, I know you're trying to seem all smart and superior and it isn't working. (Of course, if they said it with a french accent and a silent h that would make more sense.)


> The one I actually find annoying though is when people use "an" before a word that starts with a (pronounced) h, like "an historic building". It's like, dude, I know you're trying to seem all smart and superior and it isn't working. (Of course, if they said it with a french accent and a silent h that would make more sense.)

This is something that has puzzled me as well. My wife, a historian (not "an historian!") will write this way, and when I've asked her about it she says it's just the proper way. When I looked into the history of it, I concluded that it probably made sense for the British (who say "historian" with a somewhat silent "H"), and probably for early Americans. But for those of us who speak Standard American English, it seems somewhere between unnecessary and stuffy.

I wonder how spell checking software deals with this? IIRC the exception applies to words starting with "H" where the emphasis is not on the first syllable (emphasis on the first syllable would result in the "H" being enunciated).
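For what it's worth, here is a rough Python sketch of the rule as I remember it. It is not how any real spell checker works as far as I know; the word lists, the `article` function, and the `traditional_h` flag are all made up for illustration.

```python
# Hypothetical sketch: pick "a" or "an" for the word that follows.
# Both word lists are illustrative assumptions, not data from a real tool.

# Words where the initial "h" is silent, so "an" is uncontroversial.
SILENT_H = {"hour", "honor", "honest", "heir"}

# Words starting with "h" whose stress falls on the second syllable.
# These are the disputed cases ("a historic" vs. "an historic").
UNSTRESSED_H = {"historic", "historical", "historian", "habitual"}


def article(word, traditional_h=False):
    """Return "a" or "an" for the given word.

    traditional_h=True applies the older convention of using "an"
    before an "h" word that is not stressed on its first syllable.
    """
    w = word.lower()
    if w in SILENT_H:
        return "an"
    if w in UNSTRESSED_H:
        return "an" if traditional_h else "a"
    # Fallback on spelling only; this gets words like "university" wrong,
    # because the real rule depends on sound, not on the first letter.
    return "an" if w.startswith(("a", "e", "i", "o", "u")) else "a"


for w in ["hour", "history", "historic", "apple"]:
    print(article(w), w, "|", article(w, traditional_h=True), w)
```

Note that "history" gets "a" either way, since its stress is on the first syllable; only the words in the second list flip when the flag is set.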


I don't know for sure, but I suspect (in keeping with the fetishization of Latin that motivates many prescriptivist rules, like the injunction to never split an infinitive) that this arises from Ancient Greek[0], wherein there is not a separate letter "H" but rather a modification ("breath", if I remember rightly?) on the vowel. If you (mis)apply this rule to English, "hospital" starts with a vowel.

[0] this might be true in Modern Greek as well, I don't know - but English professors don't have such a raging boner for the modern version.


> like the injunction to never split an infinitive

Whoa, whoa, I believe you mean "the injunction never to split an infinitive"! /s



How do you pronounce the sentence

“This is a historic building”

out loud? I don’t pronounce the h. I propose we start writing it as “an ‘istoric” to correct the mismatch.


I say hiss-storic but have always used "an" as the sound still fits the aural pattern rule in my head. It never even occurred to me that people would ever think it was wrong or pretentious to use "an". Though, I was always conscious that "a historic" wasn't wrong, either; yet it still looks and sounds off to me, like someone is deliberately exaggerating a hard h sound.

I say "historic" with more of an "H" than "hour" (which I pronounce the same as "our").

If I may ask, which dialect of English do you speak that the "h" is silent in "historic"?

I’m from New England, but not like Boston or Rural Maine, so I don’t have one of those super strong paak tha caa accents.

It isn’t always silent, just when preceded by… a vowel I think? I dunno, I’m in public so I don’t want to look like a weirdo trying out test sentences.


> The one I actually find annoying though is when people use "an" before a word that starts with a (pronounced) h, like "an historic building". It's like, dude, I know you're trying to seem all smart and superior and it isn't working.

Do what now? I wonder what accent you have and which ones you’ve regularly heard spoken. For me, using “an” is essentially mandatory, because there is absolutely no “h” sound at the beginning of “historic” at normal speaking speed. It’s just like “it has been an honor” or “it will take an hour.” The idea that it’s being done deliberately to sound superior is hysterical.


> I wonder what accent you have and which ones you’ve regularly heard spoken

Not parent, but I have the same experience as parent. I am from CA and have lived in PA and DC as well. I also studied linguistics in college and so have a reasonably well-tuned ear for this.

It is absolutely not "hysterical" to say that people do this to sound superior. The vast majority of Americans who say "an historic" on a regular basis that I've come across are historians. I have hung out with plenty of other well-educated and pedantic folks (in linguistics, economics, and law), and even they don't do this. YMMV, of course. From where do you hail?


The first time I realised that Americans drop the "h" in herb I thought it was some hoity-toity affectation, but apparently it's the way you say it! (Aussie here and I think Brits and Kiwis keep the H )

We are weird about herbs! Ask me how I say "basil"...

Out of curiosity, does that mean you pronounce "historic" substantially different from "history"? And how about "the historic" as opposed to "a(n) historic"?

I'm not a native speaker, but the thought of essentially pronouncing "the 'istory of ..." or "the 'istoric ..." feels a bit comical, but perhaps that's what you do to be consistent?


With “historic” the stress is on the second syllable. With “history” the stress is on the first syllable.

True. Interestingly, while I'm pretty sure that I automatically do that correctly, I don't perceive it as a substantial difference.

So perhaps part of the explanation is around how speakers of different languages / accents perceive emphasis.


That's interesting, I don't think I've heard a native English speaker say historic with a silent h. Which part of the world are you from?

To be fair, if the h is silent then it actually makes sense.


> Ya, what is a datum? Is that like a piece of data?

Yes. It’s like a fact. It’s a thing that’s known. The more common modern equivalent would be “data point”.


Interestingly there seems to be a trend to refer to an individual fact point as a "factoid". I wonder where that originated.

EDIT: I just looked it up and a factoid is a manufactured fact, coined by Norman Mailer in 1973. It seems many use it differently. https://en.wikipedia.org/wiki/Factoid#%3A%7E%3Atext%3DThe_te...


By analogy with the example from the article, water, data : datum :: water : drop.

It's a mass noun, per the article. Considered grammatically singular.

Just because it can be singular doesn't mean that it must be singular.

I think the difference is in the perspective. The data is the substance you process, while the data are the observations or measurements you analyze. People who use "data" as plural tend to be more interested in the content than in operating the machinery.

Data are plural when you can also refer to a single datum. Data is singular if it's a mass noun, like water.

There is a similar conundrum with media / medium.


‘Data’ is the Latin plural, so the traditional prescriptivist advice to say ‘the data are’ is based on the assumption that the word is grammatically plural, not any argument about semantics. There are actually lots of grammatically plural nouns that are used to refer to things that can’t be split into discrete chunks. For example, while ‘the news’ in English has been reanalyzed as singular, it remains ‘las noticias’ in Spanish - without any implication that a specific number of discrete news items are being referred to.
