
> What specifically is the definition of race, and how does that definition help with mapping population movement?

Quoting from OED, 2nd Ed., a race is "a group of persons, animals, or plants, connected by common descent or origin".

> they exist on a spectrum.

Possibly, but that doesn't tell you anything useful about whether there are useful categories or not. It depends significantly on the shape of the distribution whether useful categories arise in the first place, how many and so on. Fixating on the connectivity of the kernel would be ditching an awful lot of information. That this connectivity exists is a trivial consequence of the fact that we're not talking about different species.

> For example, check out this plot of genetic differences in Europe. The differences smoothly smear out over the geography.

Yes, and that tells you a lot about the population of Europe. So clearly this is useful.




> large-scale population movements ... population genetics ... Those groups are racial categories.

The tool that you're using to map population migrations isn't race, it's genetic similarity of geographically close populations. What specifically is the definition of race, and how does that definition help with mapping population movement?

Furthermore, those are not "categories" because they exist on a spectrum. You measure similarity as a continuous value. There doesn't happen to be natural demarcation lines between different "racial categories". Populations are composed of lots of different people who tend to be similar, but who don't fall into categories. Genetic similarities cross populations, and human populations don't tend to have neat boundaries.

For example, check out this plot of genetic differences in Europe. The differences smoothly smear out over the geography. The other anchors of Beijing, Tokyo, and Yoruba are indeed separate, but that is because they are far away. You'd continue to have a spectrum connecting those geographies if the chart also measured the genetics of the connecting geography.

https://en.wikipedia.org/wiki/Genetic_history_of_Europe#/med...
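The sampling point above can be sketched with made-up numbers. This is only an illustration under stated assumptions: the trait, the positions (0.0, 0.5, 1.0), the noise level, and the helper names `sample_cline` and `largest_gap` are all hypothetical and not taken from the linked chart. It simulates a trait that changes smoothly with position, then compares sampling three distant sites against sampling evenly along the whole gradient.

```python
import random

def largest_gap(values):
    """Return the largest gap between neighbouring values after sorting."""
    ordered = sorted(values)
    return max(b - a for a, b in zip(ordered, ordered[1:]))

def sample_cline(positions, noise=0.05, seed=0):
    """Hypothetical trait that changes linearly with position, plus noise."""
    rng = random.Random(seed)
    return [p + rng.gauss(0.0, noise) for p in positions]

# Sampling only three far-apart sites (hypothetical positions 0.0, 0.5, 1.0).
sparse_sites = [0.0] * 100 + [0.5] * 100 + [1.0] * 100
# Sampling the same number of individuals evenly along the whole gradient.
dense_sites = [i / 299 for i in range(300)]

sparse_gap = largest_gap(sample_cline(sparse_sites))
dense_gap = largest_gap(sample_cline(dense_sites))

print(f"largest gap, three distant sites: {sparse_gap:.3f}")
print(f"largest gap, even sampling:       {dense_gap:.3f}")
```

In this toy setup the same underlying gradient looks like separated groups when only distant sites are sampled, and the empty space largely closes when the sites in between are included. It says nothing about how real data behave.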


>There really shouldn't be a debate that there are genetic differences amongst 'races'. It shouldn't be something that is "taboo" to discuss or talk about, since it is a factually correct statement to make.

I disagree. Who defines what a "race" is?

Divide a population into two groups arbitrarily, and you'll find statistical differences.

Pick any metric, and you'll get clusters.

There are so many data scientists here on HN - so I hope many do understand how sensitive clustering is to the metric, algorithm, and other parameters!

And for everyone else: race is like colors in a rainbow. Yes, the rainbow is not monochromatic. And yet there is no division in the continuous spectrum in nature.
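A minimal sketch of the clustering point, assuming synthetic data: the values are uniform random draws with no built-in groups, and `kmeans_1d` is a hypothetical bare-bones Lloyd's algorithm written for this illustration, not any library's implementation.

```python
import random
from statistics import mean

def kmeans_1d(values, k, iters=30):
    """Plain Lloyd's algorithm in one dimension (illustrative, not optimized)."""
    lo, hi = min(values), max(values)
    # Spread the initial centers evenly across the observed range.
    centers = [lo + (hi - lo) * (i + 0.5) / k for i in range(k)]
    groups = []
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for v in values:
            nearest = min(range(k), key=lambda i: abs(v - centers[i]))
            groups[nearest].append(v)
        # Keep the old center if a group happens to be empty.
        centers = [mean(g) if g else c for g, c in zip(groups, centers)]
    return centers, groups

rng = random.Random(0)
# A continuum with no built-in groups: uniform draws on [0, 1).
data = [rng.random() for _ in range(2000)]

for k in (2, 3, 5):
    centers, groups = kmeans_1d(data, k)
    print(k, [round(c, 2) for c in centers], [len(g) for g in groups])
```

The algorithm returns exactly as many groups as it is asked for, each with a different mean, even though the data were generated as a featureless continuum. Whether real genomic data behave like this toy case is a separate empirical question.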


> Could you list what races you think exist, and some kind of paper that establishes a scientific method for where the dividing lines in the genetic gradients are drawn and why such a line needs to be drawn at all?

Let's define our terms. By "race", I mean the classic continental groupings of people whose ancestors come from areas centered on Europe and the near east, east Asia, and Sub-Saharan Africa. When we say "race", we're talking about that category to which people readily self identify and to which we can easily assign others. When we say "race exists", it means that these categories are not arbitrary.

More generally, they're big ancestor clusters. You can see the clusters yourself if you take genome corpora and run principal component analysis (or other grouping algorithms) on them. If you select k=3, the classic continental races come out of the data.

I'm not sure how much more real a taxonomic classification scheme can get. We use the same genomic approach for organizing the rest of life.

With more groupings, you get finer-grained population clusters nested inside the larger ones. If you look for enough clusters, you start seeing an "Irish" grouping. Can we agree that, say, the Irish, the Italians, and the Slavs are distinct hereditary groups? Can we agree that they're more similar to each other than, say, the Irish are to the Pygmies?

You could, in principle, put everyone into her own cluster. Sure, at k=7e9, race doesn't exist. But that's not a very useful classification scheme, because it ignores the reality that there are high-level classifications that we can see with our own eyes.

> How do you convince yourself you're not one of them and therefore actually the cause of the very problem you lambast scientists for?

That's an excellent question. Epistemology is hard. The best we can do is try to explain observations using the best-predicting theories we can find. I reject the "race does not exist" theory because it fails to explain observable facts about the world. This theory requires, in order to explain our observations, elaborate systems of oppression. It's full of epicycles. Even so, it fails to predict the result of studies like the Minnesota Transracial Adoption Study.

The "race corresponds to allele clusters" theory makes better predictions. It explains heritable and persistent differences in measurable characteristics. It agrees with genomic observations. It requires no hidden assumptions. This latter theory isn't politically correct, but this status can't affect its truth value. We delude ourselves about things all the time.

Look: the traditional continental race classification scheme is a crude folk theory. It's very embarrassing for science when even a crude folk theory makes better predictions than the best theory to come out of the academy.

Eppur si muove.


> The issue is that racial categories are completely arbitrary to begin with

It's not entirely arbitrary. If you run a clustering algorithm on human genome data you get clusters that match quite well the usual definition of races and ethnicities.


>This concept of a "race", in terms of genetics, makes sense for classification purposes, just like all taxonomical classifications of living things.

I could classify computers based on the case they are in, and I would see some interesting trends, given the extent to which computer casing is decided by the people who build the computer and the fact that very few entities are responsible for the majority of computers. But does it really make sense to use it when you can compare them based on other factors and combinations of factors, like the internal hardware, manufacturer, and year made?

Race is extremely oversimplified and groups too many people together, and comes with a lot of historical baggage in not just its usage, but the ways we considered certain groups members of certain races.

I think humans can be grouped based on common genetics, but race focuses too much on certain phenotypes instead of being based on overall genotypes. Instead of looking at European or African descent, what about descent of historical population groups? Are groups from northern Africa closer to those from southern Europe or Southern Africa?

Groups that share phenotypes are more likely to have similar genotypes on average than those that don't, but why depend on this when we now have access to the genotypes?


> but it is not based on any formal, rigourous analytical division of intrinsic properties

Actually it often is. For example, one way to determine species is as the largest group of organisms in which any two individuals of the appropriate sexes or mating types can produce fertile offspring. That's a far more rigorous and biologically based definition than race is. And taxonomists are looking at actual genes now. That is a formal, rigorous, analytical division based on intrinsic properties. So already there it is a poor comparison. Species is far more useful as an analytical tool because it is based in biology.

Scientists (in biology related fields) don't use race anymore as a category because it is an outdated and failed model of ancestry differences in the human population. In fact, race often gets the biology completely wrong. Because society, not science, determines someone's race.

Race is mostly useful to society and so will reflect the various motivations and ideologies of a society. People who choose to continue to use a failed scientific model as a useful analytical tool most often are pushing an ideology, and very often a racist one. The choice to use a failed model over more accurate categorization should always be suspect.

So the comparison to species is a very poor one. Taxonomists are broadly not ideologically motivated in their decisions like society is. And taxonomists continually refine their categorizations based on new scientific information, including genetic information. So species categorizations continue to get more and more accurate and increasingly based in science. That's why species continues to be a useful analytical tool. It is largely based in science and society has no say. Race is the opposite.


>I looked at how well self-reported ancestry corresponds to hapmap populations.

>The mapping is very noisy.

>"race" as such isn't important

Sounds like quite the leap to reach the conclusion that you're trying to make.


> Only smart people could be convinced of such foolishness. A classification system as simple as 'race' may not provide an accurate taxonomy for all scientific purposes. It does however provide a sensible abstraction which has persisted uniformly across cultures and time because of its utility.

Notions of race (either at the high level of what race in general is, or at the lower level of what defines specific races) are not, even approximately, consistent across either space or time. I mean, I get that it makes building arguments much easier if you can start with your conclusion and just invent whatever premises are convenient to support it, but...


> In this particular case, while there does exist a set of alleles against which you can cluster populations in a way that maps roughly to the set of races we define socially, that only represents < 10% of the genetic variation across populations. Basically, that means you need to arbitrarily toss out > 80% of the dimensionality of your data for that clustering to meaningfully map against what we socially construe as "race".

This doesn't make sense to me. Can you provide more information?

I think you're referring to the fact that genetic differences between individuals within a group are larger than differences between groups. But that does not invalidate the differences between groups, or mean that we have to make "arbitrary" decisions to measure those differences.
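To make the within-group versus between-group distinction concrete, here is a small sketch with invented numbers. The trait, the group means (0.0 and 0.5), and the spread (1.0) are assumptions chosen only for illustration and describe no real population. It splits the total variance of a pooled sample into a within-group part and a between-group part.

```python
import random
from statistics import mean, pvariance

rng = random.Random(0)
# Hypothetical trait: both groups have spread 1.0, means differ by 0.5.
group_a = [rng.gauss(0.0, 1.0) for _ in range(5000)]
group_b = [rng.gauss(0.5, 1.0) for _ in range(5000)]

pooled = group_a + group_b
total_var = pvariance(pooled)
# Within-group variance: average of the two group variances (equal sizes).
within_var = (pvariance(group_a) + pvariance(group_b)) / 2
# Between-group variance: what remains once within-group spread is removed.
between_var = total_var - within_var

print(f"difference in means:  {mean(group_b) - mean(group_a):.2f}")
print(f"between-group share:  {between_var / total_var:.1%}")
print(f"within-group share:   {within_var / total_var:.1%}")
```

In this synthetic case most of the variance sits within the groups, yet the difference in means is still measurable. The two statements are arithmetically compatible. The sketch does not speak to how large either quantity is in real data, or to what causes it.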

I don't agree with the rest of your post but don't see the point of going into detail.


> there is no evidence for the existence of a race that does not rely on referencing the idea of a race. the idea is self-referential. to understand this, you must understand philosophy.

Huh? Race is defined by a combination of genetics in the form of phenotypes and culture. I don't see the circular paradox here.

> another way of saying this is that assigning a person to a race doesn't give us any more information than the information we used to classify him in the first place.

But we can? We can say that Chinese are more likely to be shorter than africans. That asians are more likely to have black hair than blonde hair or that africans are more likely to have higher testosterone than european whites or indians.

Ask any doctor or pharmacist and they can list you a few ways they are taught about how to treat people of different races differently because of biological differences.

We can even go further and say for instance that ashkenazi jews have a higher average IQ compared to Africans due to a large part their genetic differences and not purely environmental reasons.

The bell curve which was published a few decades ago proves many of these kind of claims so I don't know where you get this idea that we cannot determine anything useful from race.

> ok, I'll bite: what defines a race? what test do I apply to a specimen in order to sort him/her into a race category?

I gave a definition previously which is pretty accepted, for how you test for a race the answer is that's pretty easy, 99% of the time you can just ask someone and they will tell you.

We can prove this through cluster analysis, where you take a sample set of people and first ask them what race they are from a set number of selections. They then can read SNPs from each of the participants and hand that to a computer without the self reported data the people provided.

What you see is that even with 100 random SNPs you have a near perfect correlation between what people say their race is and the computer correctly grouping them into the race they chose.

If you want to see some sources, here is an article that links off and summarizes the studies I am referencing http://thealternativehypothesis.org/index.php/2016/04/15/329...


>"The trouble with "race" is that it's pretty meaningless biologically;"

One of the things that shocked me about taking my first Bioinformatics class was how casually the professor talked about race. It seems like the concept of race is well accepted in the field, at least as it denotes large haplogroups of the human species. If you gave me the mitochondrial DNA from an Indian, then I am pretty sure that I could tell you it came from an Indian.

Also, I think well-designed statistical studies would easily meet all your objections.

As to whether or not such studies would be useful - I think they would. A common public policy problem is the disparate performance of students from different racial groups in school. As long as public policy sees racial groups as meaningful, it is probably useful to study them.


> "african" is not a category that has any meaningful genetic definition.

If you want me to be specific, the negroid, is that specific enough for you? I assume not.

> it has value in organizing information. another system of organizing information might be harmful. suppose you had a system of organizing information that weighted irrelevant data and disregarded relevant data. would it have positive value or negative value?

Again, you haven't proved that race can't predict. You haven't even tried to disprove what I'm saying, you just keep banging on about this stupid argument that "well what is the color orange? Is it red or is it yellow? or gold even?".

Should we throw away colors because someone might see a yellow and another sees gold? Are colors useless now?

> the computer is just doing what the programmer told it to do based on a bunch of genetic categories that are artificially constructed. you get a guy's genes. what makes you say that guy is asian? oh thats right, a hand wavy mix of phenotypes and culture. thats entirely objective, dont know what I was thinking.

So you agree that cluster analysis proves race can be objectively measured? Thanks.

> you're going to classify him as a hippie because he smokes weed and predict that he smokes weed based on the fact that he is a hippie. and you dont see the circularity?

No, see, you are just strawmanning. You know damn well I said self reported hippie and I was making a point about self reporting vs actions.

> actually it is much more fluid than you seem to think. parts of africa and asia are geographically close, and have mingled for millennia.

Ok but you agreed with the cluster analysis work I linked to that shows we can still put them into nice little groups with around 99% correctness so I don't see what legs you have left to stand on here.

> thats more an artifact of the limitations of your studies, if you know anything at all about this subject you should know just how genetically diverse the human species is.

The more SNPs the studies add, the clearer the race picture is. How on earth could you get the opposite picture from the studies?

Are you even reading the studies or do you just assume they say what you believe? Because at this point you're just strawmanning over and over and it's just not a productive conversation.


> The reason i'm speaking in probabilities is because that is what the facts are, that we have more variance within races than between.

thank you, exactly.

> africans are genetically predisposed to be taller.

"african" is not a category that has any meaningful genetic definition.

> genes play a predictive role in attributes like height and IQ through twin studies, but we have already begun to find genes that can predict physical attributes like height.

none of which means that "race" has any meaningful existence.

>Category theory is made up, does that make it not real? Something that has no value whatsoever?

it has value in organizing information. another system of organizing information might be harmful. suppose you had a system of organizing information that weighted irrelevant data and disregarded relevant data. would it have positive value or negative value?

> you seem to believe race is subjective

race is intersubjective, as is language and culture.

> to the point it has 0 predictive value

what are you trying to predict? either the trait that you select for when you delineate a sub-population is connected to the variable you predict, in that case you define the category by the specific trait, not some giant bag the size of a continent; or the trait you select for when you delineate the sub-population IS NOT connected to the variable you predict, so you're spouting nonsense. there is no third option.

> we can say that what people see as race almost perfectly corresponds to what a computer would sort them into based on purely genetics.

the computer is just doing what the programmer told it to do based on a bunch of genetic categories that are artificially constructed. you get a guy's genes. what makes you say that guy is asian? oh thats right, a hand wavy mix of phenotypes and culture. thats entirely objective, dont know what I was thinking.

> We can say that a self proclaimed hippie is more likely to smoke weed than a self proclaimed Muslim or Christian. Do you believe we cannot do this?

you're going to classify him as a hippie because he smokes weed and predict that he smokes weed based on the fact that he is a hippie. and you dont see the circularity?

> Race isn't some wavy concept where you have asians thinking of themselves as africans

actually it is much more fluid than you seem to think. parts of africa and asia are geographically close, and have mingled for millennia.

> Sure you can point to exceptions but they are just that, not even 1% of the people in these studies when using more than just a dozen SNPs

thats more an artifact of the limitations of your studies, if you know anything at all about this subject you should know just how genetically diverse the human species is.


>I should probably also mention that one interesting and politically relevant result of these population studies is that the concept of race doesn't really have a solid statistical base, because we have found the variation inside the group to be so large compared to the variation across groups, that there is no unambiguous way to cluster the human population.

If you concentrate on a large pool of genes, yes. But then you might come to the same findings when studying dogs or cats and conclude that breeds don't have a solid statistical base.

With statistics everything is provable, given careful selection of samples and traits.


> do the same thing for things like "hippie", "biker", "nerd"

while race is not the correct term to use there, breeding two nerds or two bikers isn't likely to produce another nerd or biker, while the result of breeding two people of a common ancestry has a likely outcome.

> exactly, its a made up category, i.e. not a real thing.

race is a made up category, but you can't just hand wave all of it:

> what is it about those categories that makes it necessary for you to speak in probabilities [..] you would be able to rigorously tell me a relationship between gene(s) and features.

just because something hasn't been explored in its entirety doesn't make it less convincing of an argument. besides, genetics is rooted in statistics, because of recessive/dominant expressions of traits, so you have to talk in probabilities.


> As you must know, the ordinary understanding of race is based upon external appearance, primarily melanin: black, white, yellow, brown, Indian, and so on.

This is exactly my point. Race means different things in different contexts. Colloquially, the word is used as you used it here. But it has other definitions, like the one I posted. All words are like this.

> If you redefine race as a genetic population, then you will end up with new population categories that have absolutely nothing to do with race as it is ordinarily understood

I'm not redefining it. It has been defined this way by other people. There are real scientists who use the word race like this.

But I agree: different definitions of the word "race" will result in different race categories. When a scientist uses the term "race" they mean something different than someone using the word race colloquially.


> We can say that Chinese are more likely to be shorter than africans.

whats a chinese? whats an african? and what is it about those categories that makes it necessary for you to speak in probabilities and not certainties? if those categories accurately described features of reality, and were not mere artifacts that stem from your worldview, you would be able to rigorously tell me a relationship between gene(s) and features.

> Race is defined by a combination of genetics in the form of phenotypes and culture.

> for how you test for a race the answer is that's pretty easy, 99% of the time you can just ask someone and they will tell you.

exactly, its a made up category, i.e. not a real thing.

> We can prove this through cluster analysis, where you take a sample set of people and first ask them what race they are from a set number of selections. They then can read SNPs from each of the participants and hand that to a computer without the self reported data the people provided.

you can do the same thing for things like "hippie", "biker", "nerd" etc. it doesn't make those categories leave the intersubjective realm.


> which combinations of genes that express the features that we attribute to race

If an arbitrarily large grouping of genotype combinations is necessary to categorize people, perhaps that categorization scheme is not useful? I would imagine that the number of "races" generated via the mechanism you posit would measure in the hundreds or thousands, rendering it unrelated in practice to the word "race" as it is used.


> and the simplistic categorization of humans into ~5 races corresponds well to clusters on a genetic map

Are you familiar with principal component analysis? That’s what was used to generate the very clearly clustered charts that you linked to. It’s useful for analysis because it exaggerates the density and distinctness of clusters based on major features. But what that means is precisely that those clusters are not real! You’re mistaking a visual/analytical aid for raw data. A bit like looking at a false-color diagram of the human body and concluding based on it that blood is blue.
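For readers who have not used principal component analysis, here is a rough sketch of what the first axis of such a plot captures. It uses synthetic data and makes no claim about the linked charts. Everything here is an assumption for illustration: 20 dimensions, two groups of 200 that differ only in the first coordinate (shifted by -1.5 and +1.5), and a hypothetical helper `top_component` that finds the leading axis by power iteration.

```python
import random

def top_component(rows, iters=200):
    """Leading principal axis, its share of variance, and per-row scores."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]
    # Covariance matrix of the centered data.
    cov = [[sum(r[i] * r[j] for r in centered) / n for j in range(d)]
           for i in range(d)]
    vec = [1.0] * d
    for _ in range(iters):
        # Power iteration: repeatedly apply the covariance matrix and renormalize.
        nxt = [sum(cov[i][j] * vec[j] for j in range(d)) for i in range(d)]
        norm = sum(x * x for x in nxt) ** 0.5
        vec = [x / norm for x in nxt]
    eigval = sum(vec[i] * sum(cov[i][j] * vec[j] for j in range(d))
                 for i in range(d))
    total = sum(cov[i][i] for i in range(d))
    scores = [sum(c[j] * vec[j] for j in range(d)) for c in centered]
    return vec, eigval / total, scores

rng = random.Random(0)
dims = 20
rows, labels = [], []
for label, shift in ((0, -1.5), (1, 1.5)):
    for _ in range(200):
        row = [rng.gauss(0.0, 1.0) for _ in range(dims)]
        row[0] += shift  # only the first coordinate carries the group signal
        rows.append(row)
        labels.append(label)

axis, explained, scores = top_component(rows)
mean0 = sum(s for s, l in zip(scores, labels) if l == 0) / 200
mean1 = sum(s for s, l in zip(scores, labels) if l == 1) / 200
print(f"variance explained by PC1: {explained:.1%}")
print(f"group means on PC1: {mean0:.2f} vs {mean1:.2f}")
```

In this toy setup the first component separates the two groups while accounting for only a minority of the total variance, so a plot of the leading components shows the strongest pattern in the data and leaves the rest out of view. How to interpret that for real genomic data is the point under dispute in this thread, and the sketch does not settle it.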
