UDK 902(4)"631/634":577-2
Documenta Praehistorica XXXIII (2006)

Y-chromosome haplogroup I prehistoric gene flow in Europe

Siiri Rootsi
Estonian Biocentre and Tartu University, Estonia
sroots@ebc.ee

ABSTRACT - To investigate which aspects of contemporary human Y-chromosome variation in Europe are characteristic of primary colonization, late-glacial expansions from refuge areas, Neolithic dispersals or more recent events in gene flow, haplogroup I was analyzed. The analysis of Hg I Y chromosomes revealed several sub-clades with distinct geographic distributions. Sub-clade I1a accounts for most of Hg I in Scandinavia, with a rapidly decreasing frequency towards the East European Plain and the Atlantic fringe; but microsatellite diversity reveals that the Iberian Peninsula/Southern France refugial area could be the source region of the early spread of both I1a and the less common I1c. I1b* extends from the eastern Adriatic to Eastern Europe, and declines noticeably towards the southern Balkans, and abruptly towards North Italy. This clade probably diffused after the Last Glacial Maximum from a homeland in the Balkans or Eastern Europe. In contrast, I1b2 most probably arose in southern France/Iberia, underwent a post-glacial expansion, and marked the human colonization of Sardinia about 9000 years ago.

IZVLEČEK - Da bi raziskali, kateri aspekti sodobne človeške variacije kromosoma Y v Evropi so značilni za primarno kolonizacijo, pozno glacialno širitev iz refugijev, neolitske širitve in bolj recentne dogodke v genskem zapisu, smo analizirali haploskupine I. Analize kromosoma Y haploskupine I, so pokazale sub-klade z jasnimi geografskimi porazdelitvami. Sub-klad I1a označuje Hg I v Skandinaviji, s hitrim pojemanjem v smeri proti Vzhodnoevropski ravnici in obrobju Atlantika.
Mikrosatelitska raznolikost je pokazala, da je bil Iberijski polotok/južna Francija verjetni refugij in izvorno področje, iz katerega sta se razširila oba sub-klada I1a in manj pogosti I1c. I1b* se razteza od vzhodnega Jadrana do vzhodne Evrope in opazno pojema v smeri proti južnemu Balkanu ter strmo upade v smeri proti severni Italiji. Ta klad se je verjetno razširil po zadnjem glacialnem maksimumu iz domovine na Balkanu ali v vzhodni Evropi. I1b2 najverjetneje izhaja iz južne Francije/Iberije in kaže post-glacialno širitev in človekovo kolonizacijo Sardinije pred okoli 9000 leti.

KEY WORDS - phylogeny of Y-chromosomal markers; haplogroup I sub-clades; late-glacial expansions; Neolithic dispersals

Introduction

Although there is sound evidence that the majority of present-day European genes descend from indigenous Palaeolithic ancestors, it is to be expected that the landscape of genetic variation that existed before the Last Glacial Maximum (LGM) was profoundly re-shaped during and after the LGM (Richards et al. 2000; Semino et al. 2000). The aim of this investigation was to apply the phylogeographic approach to Y-chromosomal haplogroup I, the only known Y-chromosomal haplogroup that probably arose in Europe in Palaeolithic times, and which is still common and widespread there, whilst being virtually absent elsewhere. The distribution of haplogroup I was intriguing, with its two high-frequency peaks in distant parts of Europe (the Balkan region and Scandinavia). The large-scale study of haplogroup I by Rootsi et al. (2004) therefore concentrated on achieving a better phylogenetic and phylogeographic resolution of this haplogroup, informative for the reconstruction of long-distance gene flows in space and time.

Results and Discussion

It has been shown earlier that a high frequency of hg I is characteristic of two distant and distinct regions: around the Dinaric Alps (Semino et al. 2000; Barać et al.
2003) and in Nordic populations of Scandinavia (Semino et al. 2000; Passarino et al. 2002; Tambets et al. 2004).

In a study by Rootsi et al. (2004), concentrating on haplogroup I phylogeography and phylogeny, more than seven thousand individuals from Europe and the surrounding regions were assessed for the marker M170, which defines hg I.

The 1104 Y chromosomes that showed the derived M170 C allele, drawn from 48 European populations and 12 populations from surrounding regions, were further genotyped with a set of markers (M253, P37, M26 and M223) that define distinct sub-clades of I: I1a, I1b, I1b2 and I1c, respectively.

Thanks to the new informative markers, the improved phylogenetic resolution of hg I made it possible to reveal distinct phylogeographical patterns for sub-clades I1a, I1b and I1c, which jointly cover about 95% of hg I individuals. Sub-clade I1a is widely distributed in northern Europe, with its highest frequencies in Scandinavia: in Norwegians, Swedes and Saami it accounts for 88-100% of hg I individuals, and its frequency decreases rapidly towards both the East European Plain and the northwestern coastal areas of Europe. Combined analysis of STR diversity and the relative proportion of the I1a sub-clade among all I lineages suggests that France, or possibly more precisely the Franco-Cantabrian refugial area, could have been the source region of the spread of I1a during the post-LGM re-colonization of Europe. The same may apply to the spread of the less common sub-clade I1c. This scenario is also supported by a high positive correlation (0.75) between the geographic distributions of I1a and I1c. I1c covers a wide range in Europe, with the highest frequencies in north-west coastal Europe and a lower frequency elsewhere (Fig. 1).

A totally different distribution pattern can be seen for I1b*, which is the most frequent haplogroup I clade in Eastern Europe and the Balkans.
It reaches its highest frequencies in Croatian and Bosnian populations, encompassing about 80-90% of hg I there. When frequencies in different regions of Croatia were compared (Barać et al. 2003), clear and significant differences became apparent between the three southern islands, with higher frequency, and the mainland and the northerly island of Krk, with lower frequency. More than half of the sampled Croatian hg I individuals, 126 out of 221 (57%), share an identical STR haplotype, which was named the Dinaric Modal Haplotype. This haplotype was not present in the 102 hg 2 chromosomes (according to Jobling's nomenclature) reported by Helgason et al. (2000); the most frequent haplotype among them was labeled the Nordic Haplotype (Barać et al. 2003).

The phylogenetic network of hg I STR haplotypes points to characteristic haplotype patterns in different sub-clades, which allows us to identify possible founder haplotypes for the different sub-clades and to calculate their possible expansion times according to the method described in Zhivotovsky et al. (2004). The estimates suggest that the expansion phase of I1a and I1b occurred around the early Holocene. Only the less frequent sub-clade I1c shows an earlier age for its STR variation, suggesting that the corresponding mutation arose earlier.

High frequency combined with the high diversity of sub-clade I1b in the Croatian population (both mainland and island populations) suggests that during the LGM there might have been a refugium nearby. To our knowledge, the placing of the western Balkans on the list of human refugia during the LGM has not so far been unambiguously confirmed by archaeologists. However, the northern part of the Adriatic Sea, including the Dalmatian Islands, was at that time dry land, being covered by water only much later, at the boundary of the Holocene (references within Barać et al. 2003).
Therefore, one may speculate that a wealth of traces of human occupancy of the area lies submerged. Nevertheless, it is justified to suggest that the present-day western Adriatic was the reservoir of M170 (I1b) lineages, as well as a starting point for the spread of these lineages during the post-glacial re-colonization of Europe. Meanwhile, the star-like pattern of both I1a and I1b* STR haplotypes might be explained by simultaneous re-colonization of Europe from different refugia.

The I1b* sub-clade dissipates very rapidly west of the Balkans, being virtually absent among Italian, French and Swiss populations, but extends eastwards at notable frequencies, mostly in the north Balkans and among Slavic-speaking populations, including more eastern Ukrainians. This finding suggests that I1b* may have expanded from a glacial refuge area, which may have been located in the Balkans. As indicated above, there is only limited archaeological evidence for such a refugium in this region at present. Nevertheless, data on the re-occupation of northern Europe from the Balkan region by mammals such as the brown bear Ursus arctos (Taberlet and Bouvet 1994) and the European hedgehog Erinaceus europaeus (Hewitt 2000), birds such as the European great tit Parus major (Kvist et al. 1999; Kvist 2000) and insects such as the meadow grasshopper Chorthippus parallelus (Hewitt 2000) support its existence indirectly.

Fig. 1. A - Median-joining network of combined haplotypes of six STR loci (DYS19, 388, 390, 391, 392, 393) in 25 populations/584 individuals (Norwegians, Estonians, Saami, Swedes, Hungarians, Czechs and Slovaks, Poles, Ukrainians, Croats, Bosnians, Macedonians, Albanians, Greeks, Moldavians, Gagauz, Turks, Italians, Sardinians, French, Dutch, Andalusians, Bearnais, Basques and Swiss). Only haplotypes with frequency >1 were used. Nodes indicate haplotypes, with sizes proportional to their frequency (the smallest node corresponds to 1 individual and appears only in case of overlap between sub-clades; otherwise haplotypes with frequency >1 are presented). Haplotypes of different sub-clades are indicated with different patterns. The most frequent haplotype of the I1a sub-clade (14-14-23-10-11-13) corresponds to the earlier named Nordic Haplotype, and the dominant haplotype in I1b* is the so-called Dinaric Modal Haplotype (16-13-24-11-11-13) according to Barać et al. (2003). The possible founder lineage for the third sub-clade, I1c (15-13-23-10-12-14), was revealed in Rootsi et al. (2004). B - Median-joining network of I1b(xI1b2) and I1b2 lineages based on seven STRs (DYS19, 388, 390, 391, 392, 393 and YCAIIa,b). I1b* (YCAIIa,b mostly 21,21 alleles) and I1b2 (YCAIIa,b; 21,10 alleles) lineages are clearly separated from each other by the difference in YCAIIa,b haplotypes. They differ by 10 repeats in YCAIIb allele length, which is probably the result of a single mutational event rather than step-by-step deletions, in particular because no intermediate repeat variants have been detected.

It seems somewhat less likely, though not impossible, that I1b* was preserved during the LGM in the much better documented Periglacial refugium in present-day Ukraine. It appears less likely because (a) not only its frequency but also its diversity is higher in the Adriatic region; and (b) a branch of I1b*, I1b2-M26, has a clearly western pattern of distribution, being totally absent in Ukrainians.

On the other hand, the clearly visible difference between the distribution patterns of I1b* and its offshoot I1b2 suggests that their separation may have occurred even before the LGM, whereas isolation, genetic drift during the LGM, re-colonization and an unknown number of putative more recent demographic events created the pattern that one observes among extant populations.
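As an illustration of how the STR haplotypes quoted in the Fig. 1 caption can be handled numerically, the following is a minimal Python sketch, not the analysis code of the original study. It compares the three modal or possible founder haplotypes locus by locus, and shows the general form of an age estimate based on the average squared difference (ASD) from a founder haplotype, in the spirit of the method cited from Zhivotovsky et al. (2004). The function names, the toy sample and the mutation-rate constant (the value commonly quoted from that paper, which is not stated in this text) are assumptions introduced here for illustration only.

```python
# Illustrative sketch only: not the analysis code of Rootsi et al. (2004).

# Loci in the order used in the Fig. 1 caption.
LOCI = ("DYS19", "DYS388", "DYS390", "DYS391", "DYS392", "DYS393")

# Modal / possible founder haplotypes quoted in the Fig. 1 caption.
MODAL = {
    "I1a": (14, 14, 23, 10, 11, 13),   # Nordic Haplotype
    "I1b*": (16, 13, 24, 11, 11, 13),  # Dinaric Modal Haplotype
    "I1c": (15, 13, 23, 10, 12, 14),   # possible founder lineage
}

# ASSUMPTION: effective mutation rate per locus per 25-year generation,
# the figure commonly quoted from Zhivotovsky et al. (2004).
EFFECTIVE_RATE = 6.9e-4
GENERATION_YEARS = 25


def step_distance(h1, h2):
    """Sum of absolute repeat-count differences over all loci."""
    return sum(abs(a - b) for a, b in zip(h1, h2))


def asd_to_founder(haplotypes, founder):
    """Average squared repeat difference from the founder,
    averaged over chromosomes and loci."""
    total = sum((a - f) ** 2 for h in haplotypes for a, f in zip(h, founder))
    return total / (len(haplotypes) * len(founder))


def expansion_age_years(haplotypes, founder):
    """ASD to the founder divided by the per-generation rate, in years."""
    generations = asd_to_founder(haplotypes, founder) / EFFECTIVE_RATE
    return generations * GENERATION_YEARS


# HYPOTHETICAL toy sample (not data from the study): three copies of the
# Dinaric Modal Haplotype plus two one-step neighbours.
founder = MODAL["I1b*"]
toy_sample = [
    founder,
    founder,
    founder,
    (17, 13, 24, 11, 11, 13),  # +1 repeat at DYS19
    (16, 13, 24, 10, 11, 13),  # -1 repeat at DYS391
]

for a, b in (("I1a", "I1b*"), ("I1a", "I1c"), ("I1b*", "I1c")):
    print(a, "vs", b, "steps:", step_distance(MODAL[a], MODAL[b]))
print("toy ASD:", asd_to_founder(toy_sample, founder))
print("toy age (years):", expansion_age_years(toy_sample, founder))
```

The toy age has no bearing on the expansion times discussed in the text; it only shows that the estimate scales linearly with the accumulated STR variance around the presumed founder. The large YCAIIb difference between I1b* and I1b2 is deliberately left out, since the caption argues it reflects a single mutational event and a stepwise distance would misrepresent it.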
Meanwhile, the extremely high incidence of I1b2 among Sardinians (about 40%) can be explained by the presence of carriers of the I1b2 lineage among the first inhabitants of the island early in the Holocene and by the influence of genetic drift thereafter.

A certain degree of similarity between the distribution patterns of some mtDNA haplogroups, in particular V (Torroni et al. 1998; Torroni et al. 2001), and those of Y-chromosomal hg I sub-branches has been suggested by Rootsi et al. (2004). These findings show that the distribution patterns characteristic of Y-chromosomal haplogroup I sub-clades are supported by parallel evidence from other genetic markers and probably indicate more general patterns in past human demographic movements.

REFERENCES

BARAĆ L., PERIČIĆ M., KLARIĆ I. M., ROOTSI S., JANIČIJEVIĆ B., KIVISILD T., PARIK J. et al. 2003. Y chromosomal heritage of Croatian population and its island isolates. European Journal of Human Genetics 11: 535-42.

HELGASON A., SIGURÐARDÓTTIR S., NICHOLSON J., SYKES B., HILL E. W., BRADLEY D. G., BOSNES V. et al. 2000. Estimating Scandinavian and Gaelic ancestry in the male settlers of Iceland. American Journal of Human Genetics 67: 697-717.

HEWITT G. 2000. The genetic legacy of the Quaternary ice ages. Nature 405: 907-913.

KVIST L., RUOKONEN M., LUMME J. & ORELL M. 1999. The colonization history and present-day population structure of the European great tit (Parus major major). Heredity 82: 495-502.

KVIST L. 2000. Phylogeny and phylogeography of European Parids. Dissertation. Oulu University.

PASSARINO G., CAVALLERI G. L., LIN A. A., CAVALLI-SFORZA L. L., BØRRESEN-DALE A. L., UNDERHILL P. A. 2002. Different genetic components in the Norwegian population revealed by the analysis of mtDNA and Y chromosome polymorphisms. European Journal of Human Genetics 10: 521-9.

RICHARDS M., MACAULAY V., HICKEY E., VEGA E., SYKES B., GUIDA V., RENGO C. et al. 2000. Tracing European founder lineages in the Near Eastern mtDNA pool. American Journal of Human Genetics 67: 1251-1276.

ROOTSI S., MAGRI C., KIVISILD T., BENUZZI G., HELP H., BERMISHEVA M., KUTUEV I. et al. 2004. Phylogeography of Y-chromosome haplogroup I reveals distinct domains of prehistoric gene flow in Europe. American Journal of Human Genetics 75: 128-37.

SEMINO O., PASSARINO G., OEFNER P. J., LIN A. A., ARBUZOVA S., BECKMAN L. E., DE BENEDICTIS G. et al. 2000. The genetic legacy of Paleolithic Homo sapiens sapiens in extant Europeans: a Y chromosome perspective. Science 290: 1155-9.

TABERLET P. and BOUVET J. 1994. Mitochondrial DNA polymorphism, phylogeography, and conservation genetics of the brown bear (Ursus arctos) in Europe. Proceedings of the Royal Society of London Series B 255: 195-200.

TAMBETS K., ROOTSI S., KIVISILD T., HELP H., SERK P., LOOGVÄLI E. L., TOLK H. V. et al. 2004. The Western and Eastern Roots of the Saami - the Story of Genetic "Outliers" Told by Mitochondrial DNA and Y Chromosomes. American Journal of Human Genetics 74: 661-82.

TORRONI A., BANDELT H.-J., MACAULAY V., RICHARDS M., CRUCIANI F., RENGO C., MARTINEZ-CABRERA V. et al. 2001. A signal, from human mtDNA, of post-glacial recolonization in Europe. American Journal of Human Genetics 69: 844-852.

TORRONI A., BANDELT H.-J., D'URBANO L., LAHERMO P., MORAL P., SELLITTO D., RENGO C. et al. 1998. mtDNA analysis reveals a major late Paleolithic population expansion from southwestern to northeastern Europe. American Journal of Human Genetics 62: 1137-52.

ZHIVOTOVSKY L. A., UNDERHILL P. A., CINNIOGLU C., KAYSER M., MORAR B., KIVISILD T., SCOZZARI R. et al. 2004. The effective mutation rate at Y chromosome short tandem repeats, with application to human population-divergence time. American Journal of Human Genetics 74: 50-61.