Thursday, 12 February 2015

Population Genetics

A story of 69 ancient Europeans

A new study on the bioRxiv includes data on 69 ancient Europeans (remember when we got excited in anticipation of the single genome of the Iceman? That was only three years ago) and adds plenty of new info to chew on for those of us interested in prehistory.

Two Near Eastern migrations into Europe

In 2011, I observed that West Eurasian populations were too close (measured by Fst) to allow for long periods of differentiation between them. By implication, there must have been a "common source" of ancestry uniting them, which I placed in a "womb of nations" of the Neolithic Near East. I proposed that migrations out of this core area homogenized West Eurasians, writing:
In Arabia, the migrants would have met aboriginal Arabians, similar to their next door-neighbors in East Africa, undergoing a subtle African shift (Southwest_Asians). In North Africa, they would have encountered denser populations during the favorable conditions of MIS 1, and by absorbing them they would became the Berbers (Northwest_Africans). Their migrations to the southeast brought them into the realm of Indian-leaning people, in the rich agricultural fields of the Mehrgarh and the now deserted oases of Bactria and Margiana. Across the Mediterranean and along the Atlantic facade of Europe, they would have encountered the Mesolithic populations of Europe, and through their blending became the early Neolithic inhabitants of the Mediterranean and Atlantic coasts of Europe (Mediterraneans). And, to the north, from either the Balkans, the Caucasus, or the trans-Caspian region, they would have met the last remaining Proto-Europeoid hunters of the continental zone, becoming the Northern Europeoids who once stretched all the way to the interior of Asia.
The new paper confirms the last two of these migrations. The remainder involve parts of the world from which no ancient DNA has been studied.

The first migration (early Neolithic) is already uncontroversial, but the paper includes data from Spanish early farmers that are also Sardinian- and LBK-like. The "Sardinian" Iceman was no fluke. It is now proven that not only the LBK but also the Spanish Neolithic came from the same expansion of Mediterranean populations which survives in Sardinia. The authors write:
Principal components analysis (PCA) of all ancient individuals along with 777 present-day West Eurasians4 (Fig. 2a, SI5) replicates the positioning of present-day Europeans between the Near East and European hunter-gatherers4,20, and the clustering of early farmers from across Europe with present day Sardinians3,4,27, suggesting that farming expansions across the Mediterranean to Spain and via the Danubian route to Hungary and Germany descended from a common stock.
The second migration went into eastern Europe:
The Yamnaya differ from the EHG by sharing fewer alleles with MA1 (|Z|=6.7) suggesting a dilution of ANE ancestry between 5,000-3,000 BCE on the European steppe. This was likely due to admixture of EHG with a population related to present-day Near Easterners, as the most negative f3-statistic in the Yamnaya (giving unambiguous evidence of admixture) is observed when we model them as a mixture of EHG and present-day Near Eastern populations like Armenians (Z = -6.3; SI7).
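For readers unfamiliar with the f3 test quoted above, here is a minimal sketch of the idea, not the paper's actual pipeline. The statistic f3(C; A, B) averages (c - a)(c - b) over SNPs, where a, b and c are allele frequencies in the two candidate sources and the target; a clearly negative value is what signals admixture. The allele frequencies below are invented purely for illustration, and the real analysis also corrects for finite sample size and attaches a Z-score, neither of which is done here.

```python
# Toy illustration of the f3 admixture test: f3(C; A, B) = mean((c - a) * (c - b)).
# All allele frequencies are randomly generated for illustration only;
# they are NOT data from the paper.
import random


def f3(target, source_a, source_b):
    """Naive f3 estimate over SNPs (no bias correction, no standard error)."""
    products = [
        (c - a) * (c - b) for c, a, b in zip(target, source_a, source_b)
    ]
    return sum(products) / len(products)


random.seed(0)
n_snps = 1000
source_a = [random.random() for _ in range(n_snps)]  # hypothetical source A
source_b = [random.random() for _ in range(n_snps)]  # hypothetical source B

# A target built as an exact 50/50 mixture of the two sources.
mixed = [0.5 * a + 0.5 * b for a, b in zip(source_a, source_b)]

# A target whose frequencies are independent of both sources.
unrelated = [random.random() for _ in range(n_snps)]

f3_mixed = f3(mixed, source_a, source_b)
f3_unrelated = f3(unrelated, source_a, source_b)

print("f3(mixed; A, B)     =", f3_mixed)      # negative: mixture signal
print("f3(unrelated; A, B) =", f3_unrelated)  # positive: no mixture signal
```

For the exact mixture, each term equals -0.25 * (a - b)^2, so the average cannot be positive. That is why a strongly negative f3 counts as unambiguous evidence of admixture, while a positive value proves nothing either way.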
The EHG (Eastern European Hunter-Gatherers) are likely Proto-Europeoid foragers and the Yamnaya (a Bronze Age Kurgan culture) were a mixture of the EHG and something akin to Armenians. The "attraction" of later groups to the Near East is clear in the PCA: hunter-gatherers on the left side, the Near East (as grey dots) on the right side, and Neolithic/Bronze Age/modern Europeans in the middle. The second migration may very well be related to the Uruk expansion and the presence of gracile Mediterranoids and robust Proto-Europeoids in the Yamna:
The Yamna population generally belongs to the European race. It was tall (175.5cm), dolichocephalic, with broad faces of medium height. Among them there were, however, more robust elements with high and wide faces of the proto-Europoid type, and also more gracile individuals with narrow and high faces, probably reflecting contacts with the East Mediterranean type (Kurts 1984: 90).
The authors present a table of Fst values which confirms the homogenizing influence of migrations from the Near East. The WHG group has an Fst of 0.086 with Armenians, but the LBK farmers only 0.023. The EHG group has an Fst of 0.067 with Armenians, but the Yamnaya steppe people only 0.030. Someone might argue that it is the Armenians who received genes from Europe, but the same pattern holds even for the Bedouins, for whom admixture with Europeans seems far-fetched: 0.106 to 0.043 and 0.093 to 0.060. It is now clear that the "glue" that kept West Eurasian populations from drifting very far apart was migration from the Near East.
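The arithmetic behind this comparison can be laid out in a few lines. This is only a sketch using the Fst values quoted above. Assigning the Bedouin figures to the WHG-to-LBK and EHG-to-Yamnaya pairs is my reading of the order in which they are listed, and the dictionary layout and names are made up for the example.

```python
# Fst values as quoted in the post (from the paper's table).
# The Bedouin pairings follow the order "0.106 to 0.043 and 0.093 to 0.060",
# which is assumed here to mean WHG -> LBK and EHG -> Yamnaya.
fst = {
    "Armenian": {"WHG": 0.086, "LBK": 0.023, "EHG": 0.067, "Yamnaya": 0.030},
    "Bedouin": {"WHG": 0.106, "LBK": 0.043, "EHG": 0.093, "Yamnaya": 0.060},
}

# Each pair is (earlier hunter-gatherers, later group in the same region).
transitions = [("WHG", "LBK"), ("EHG", "Yamnaya")]

drops = {}
for near_east_pop, values in fst.items():
    for before, after in transitions:
        drop = round(values[before] - values[after], 3)
        drops[(near_east_pop, before, after)] = drop
        print(f"{before} -> {after} vs {near_east_pop}: "
              f"{values[before]:.3f} -> {values[after]:.3f} (drop {drop:.3f})")
```

In all four comparisons the later group sits closer to the Near Eastern reference than the hunter-gatherers who preceded it, whichever reference is used.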

The (partial) demise of the farmers

It seems that the legacy of the early farmers suffered two hits, which is why they have persisted as the major component of ancestry only in Sardinia and (to a lesser degree) in southern Europe. The first blow came during the Neolithic:
Middle Neolithic Europeans from Germany, Spain, Hungary, and Sweden from the period ~4,000-3,000 BCE are intermediate between the earlier farmers and the WHG, suggesting an increase of WHG ancestry throughout much of Europe.
And the coup de grâce after the 5kya mark:
We estimate that these two elements each contributed about half the ancestry each of the Yamnaya (SI6, SI9), explaining why the population turnover inferred using Yamnaya as a source is about twice as high compared to the undiluted EHG. The estimate of Yamnaya related ancestry in the Corded Ware is consistent when using either present populations or ancient Europeans as outgroups (SI9, SI10), and is 73.1 ± 2.2% when both sets are combined (SI10). [...] The magnitude of the population turnover that occurred becomes even more evident if one considers the fact that the steppe migrants may well have mixed with eastern European agriculturalists on their way to central Europe. Thus, we cannot exclude a scenario in which the Corded Ware arriving in today’s Germany had no ancestry at all from local populations.

Confirmation of the Bronze Age Indo-European invasion of Europe

In 2012 I used the paltry data from a handful of ancient DNA samples to observe that in ADMIXTURE modern Europeans had a West Asian genetic component (peaking in "Caucasus" and "Gedrosia") that pre-5kya Europeans lacked. I proposed that the Bronze Age migration of the Indo-Europeans spread this component:
But there is another component present in modern Europe, the West_Asian which is conspicuous in its absence in all the ancient samples so far. This component reaches its highest occurrence in the highlands of West Asia, from Anatolia and the Caucasus all the way to the Indian subcontinent. [...] Nonetheless, some of the legacy of the earliest Indo-European speakers does appear to persist down to the present day in the genomes of their linguistic descendants, and I predict that when we sample later (post 5-4kya) individuals we will finally find the West_Asian piece that is missing from the European puzzle.
This prediction is now confirmed:
This pattern is also seen in ADMIXTURE analysis (Fig. 2b, SI6), which implies that the Yamnaya have ancestry from populations related to the Caucasus and South Asia that is largely absent in 38 Early or Middle Neolithic farmers but present in all 25 Late Neolithic or Bronze Age individuals. This ancestry appears in Central Europe for the first time in our series with the Corded Ware around 2,500 BCE (SI6, Fig. 2b, Extended Data Fig. 1).
I was a little puzzled by the "Ancient North Eurasians" recently proposed as a "third ancestral population" for Europeans: it seemed to be a tertium quid that spread after 5kya, but with a very different geography from the "West Asian" component. But:
These results can be explained if the new genetic material that arrived in Germany was a composite of two elements: EHG and a type of Near Eastern ancestry different from that which was introduced by early farmers (also suggested by PCA and ADMIXTURE; Fig. 2, SI5, SI6).
So, it seems that there is no contradiction after all and both EHG (which is related to "Ancient North Eurasians") and another type of Near Eastern ancestry (=West_Asian) arrived after 5kya.

1939 strikes back

It is amazing how well this was anticipated by Carleton Coon in 1939. Back then much of West Eurasia was an archaeological/anthropological terra incognita: there was no radiocarbon dating, no DNA, no computers, not even serious multivariate statistics. And yet:
We shall see, in our survey of prehistoric European racial movements, that the Danubian agriculturalists of the Early Neolithic brought a food-producing economy into central Europe from the East. They perpetuated in the new European setting a physical type which was later supplanted in their original home. Several centuries later the Corded people, in the same way, came from southern Russia but there we first find them intermingled with other peoples, and the cultural factors which we think of as distinctively Corded are included in a larger cultural equipment. [...] On the basis of the physical evidence as well, it is likely that the Corded people came from somewhere north or east of the Black Sea. The fully Neolithic crania from southern Russia which we have just studied include such a type, also seen in the midst of Sergi's Kurgan aggregation. Until better evidence is produced from elsewhere, we are entitled to consider southern Russia the most likely way station from which the Corded people moved westward.
And in 2015:
Our results support a view of European pre-history punctuated by two major migrations: first, the arrival of first farmers during the Early Neolithic from the Near East, and second of Yamnaya pastoralists during the Late Neolithic from the steppe (Extended Data Fig. 5).
In 1939:
Linguistically, Indo-European is probably a relatively recent phenomenon, which arose after animals had been tamed and plants cultivated. The latest researches find it to be a derivative of an initially mixed language, whose principal elements were Uralic, called element A, and some undesignated element B which was probably one of the eastern Mediterranean or Caucasic languages. [...] Somewhere in the plains of southern Russia or central Asia, the blending of languages took place which resulted in Indo-European speech. This product in turn spread and split, and was further differentiated by mixture with the languages of peoples upon whom it, in one form or other, was imposed. Some of the present Indo-European languages, in addition to these later accretions from non-Indo-European tongues, contain more of the A element than others, which contain more of the B. The unity of the original "Indo-Europeans," could not have been of long duration, if it was ever complete.
In 2015:
These results can be explained if the new genetic material that arrived in Germany was a composite of two elements: EHG and a type of Near Eastern ancestry different from that which was introduced by early farmers (also suggested by PCA and ADMIXTURE; Fig. 2, SI5, SI6). We estimate that these two elements each contributed about half the ancestry each of the Yamnaya (SI6, SI9), explaining why the population turnover inferred using Yamnaya as a source is about twice as high compared to the undiluted EHG.
The EHG is still flimsy, as it consists of only two individuals, from Karelia and Samara, who are very similar to each other. It's hard not to imagine that the hunter-gatherer from Russian Karelia (outside any proposed PIE homeland) would be speaking a language similar to that of his Samara counterpart. Did they both speak "element A", and was PIE formed when the "southern" steppe hunter-gatherers came into contact with "element B" people from the Caucasus? Short of a time machine, we can never say for sure. This might very well be an answer to the conundrum of Uralic/Proto-Kartvelian borrowings. There is simply no geographical locale in which these two language families neighbor each other: Northwest and Northeast Caucasian speakers and the pesky Greater Caucasus intervene. But maybe there was no such locale, and these borrowings aren't due to some "PIE people" living adjacent to Uralic and Proto-Kartvelian speakers but to the "PIE people" being a mix of an element A (EHG) that was (or interacted with) Uralic and another element B (Armenian-like) that was (or interacted with) Proto-Kartvelian.

Urheimat (or not?)

The authors of the current paper are agnostic about the PIE homeland:
We caution that the location of the Proto-Indo-European9,27,29,30 homeland that also gave rise to the Indo-European languages of Asia, as well as the Indo-European languages of southeastern Europe, cannot be determined from the data reported here (SI11). Studying the mixture in the Yamnaya themselves, and understanding the genetic relationships among a broader set of ancient and present-day Indo-European speakers, may lead to new insight about the shared homeland.
Whatever the ultimate answer turns out to be, it seems that Coon was right that "The unity of the original "Indo-Europeans," could not have been of long duration, if it was ever complete." If PIE=EHG (as Anthony and Ringe suggest), then "from the crib", PIE got half its ancestry from a non-IE, Near Eastern source. Conversely, if PIE=Near East (as I suggested) then "from the crib", PIE got half of its ancestry from a non-IE, Eastern European source. The "Yamnaya" seems to max out in Norwegians at around half, which means that they are about a quarter Proto-Indo-European genetically, regardless of which theory is right.
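The "about a quarter" figure is simple multiplication, sketched below with the round numbers used in this post: the roughly 50/50 split of the Yamnaya, around half Yamnaya ancestry in Norwegians, and the paper's 73.1% Yamnaya-related estimate for the Corded Ware. The function and variable names are made up for the example, and the inputs are point estimates with no error bars carried through.

```python
# Back-of-the-envelope ancestry arithmetic using the figures quoted in the post.
YAMNAYA_ELEMENT_SHARE = 0.5  # each element (EHG or Near Eastern) is ~half of Yamnaya


def element_share(yamnaya_fraction, element_in_yamnaya=YAMNAYA_ELEMENT_SHARE):
    """Share of a population's ancestry from ONE of the two Yamnaya elements,
    given the population's overall Yamnaya-related fraction."""
    return yamnaya_fraction * element_in_yamnaya


norwegian_yamnaya = 0.5      # "around half", the post's rough figure
corded_ware_yamnaya = 0.731  # 73.1% (the paper gives +/- 2.2%, ignored here)

norwegian_element = element_share(norwegian_yamnaya)
corded_ware_element = element_share(corded_ware_yamnaya)

print(f"Norwegians:  {norwegian_element:.2%} from either element")
print(f"Corded Ware: {corded_ware_element:.2%} from either element")
```

Whichever element is identified with the Proto-Indo-Europeans, the multiplication comes out the same, which is the point of "regardless of which theory is right".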

These two possibilities (as well as the third one of PIE being neither-nor, but rather a linguistic mixture of the languages of the EHG and Near East) are testable. The Anthony/Ringe version of the steppe hypothesis predicts pre-Yamnaya expansions from the steppe. Whether these happened and what their makeup was can be tested: if they did occur and they did lack "Near Eastern" ancestry, then the steppe hypothesis will be proven. PIE in the Near East, on the other hand, predicts that some PIE languages (certainly the Anatolian ones) will be a "within the Near East" expansion. If such migrations did occur and they lacked "EHG" ancestry, then some variant of the Gamkrelidze/Ivanov model will be proven. Or, the truth might be that wherever Indo-Europeans arrive they carry a blend of "West Asian" and "EHG", supporting the third possibility. Time will tell.

In the interim, I am curious about how much Yamnaya ancestry existed in different parts of Europe (all of the post-5kya samples in this study come from Germany, with a couple from Hungary). In northern Europe, all populations seem to have less Yamnaya ancestry than the Corded Ware: there it must have declined. But, modern Hungarians have more than Bronze Age Hungarians: there it must have increased.

Germany and a slice of Hungary is a very narrow window through which to see the whole of Europe, and these results must be tested by looking at samples from beyond the "heartland". I do hope that some kind of Moore's law operates in the world of ancient DNA, and in three more years we'll be reading studies about thousands of ancient individuals.

bioRxiv doi:
Massive migration from the steppe is a source for Indo-European languages in Europe

Wolfgang Haak, Iosif Lazaridis, Nick Patterson, Nadin Rohland, Swapan Mallick, Bastien Llamas, Guido Brandt, Susanne Nordenfelt, Eadaoin Harney, Kristin Stewardson, Qiaomei Fu, Alissa Mittnik, Eszter Banffy, Christos Economou, Michael Francken, Susanne Friederich, Rafael Garrido Pena, Fredrik Hallgren, Valery Khartanovich, Aleksandr Khokhlov, Michael Kunst, Pavel Kuznetsov, Harald Meller, Oleg Mochalov, Vayacheslav Moiseyev, Nicole Nicklisch, Sandra L. Pichler, Roberto Risch, Manuel A. Rojo Guerra, Christina Roth, Anna Szecsenyi-Nagy, Joachim Wahl, Matthias Meyer, Johannes Krause, Dorcas Brown, David Anthony, Alan Cooper, Kurt Werner Alt, David Reich

We generated genome-wide data from 69 Europeans who lived between 8,000-3,000 years ago by enriching ancient DNA libraries for a target set of almost four hundred thousand polymorphisms. Enrichment of these positions decreases the sequencing required for genome-wide ancient DNA analysis by a median of around 250-fold, allowing us to study an order of magnitude more individuals than previous studies and to obtain new insights about the past. We show that the populations of western and far eastern Europe followed opposite trajectories between 8,000-5,000 years ago. At the beginning of the Neolithic period in Europe, ~8,000-7,000 years ago, closely related groups of early farmers appeared in Germany, Hungary, and Spain, different from indigenous hunter-gatherers, whereas Russia was inhabited by a distinctive population of hunter-gatherers with high affinity to a ~24,000 year old Siberian6. By ~6,000-5,000 years ago, a resurgence of hunter-gatherer ancestry had occurred throughout much of Europe, but in Russia, the Yamnaya steppe herders of this time were descended not only from the preceding eastern European hunter-gatherers, but from a population of Near Eastern ancestry. Western and Eastern Europe came into contact ~4,500 years ago, as the Late Neolithic Corded Ware people from Germany traced ~3/4 of their ancestry to the Yamnaya, documenting a massive migration into the heartland of Europe from its eastern periphery. This steppe ancestry persisted in all sampled central Europeans until at least ~3,000 years ago, and is ubiquitous in present-day Europeans. These results provide support for the theory of a steppe origin of at least some of the Indo-European languages of Europe.