Who We Are and How We Got Here




  Copyright © 2018 by David Reich and Eugenie Reich

  All rights reserved. Published in the United States by Pantheon Books, a division of Penguin Random House LLC, New York, and distributed in Canada by Random House of Canada, a division of Penguin Random House Canada Limited, Toronto.

  Pantheon Books and colophon are registered trademarks of Penguin Random House LLC.

  Library of Congress Cataloging-in-Publication Data

  Name: Reich, David [date], author.

  Title: Who we are and how we got here : ancient DNA and the new science of the human past / David Reich.

  Description: First edition. New York : Pantheon Books, [2018]. Includes bibliographical references and index.

  Identifiers: LCCN 2017038165. ISBN 9781101870327 (hardcover). ISBN 9781101870334 (ebook).

  Subjects: LCSH: Human genetics—Popular works. Genomics—Popular works. DNA—Analysis. Prehistoric peoples. Human population genetics. BISAC: SCIENCE/Life Sciences/Genetics & Genomics. SCIENCE/Life Sciences/Evolution. SOCIAL SCIENCE/Anthropology/General.

  Classification: LCC QH431 .R37 2018. DDC 572.8/6—dc23.

  LC record available at lccn.loc.gov/2017038165.

  Ebook ISBN 9781101870334

  www.pantheonbooks.com

  Cover design by Oliver Uberti

  Illustrations and map by Oliver Uberti

  v5.2

  For Seth and Leah

  Contents

  Cover

  Title Page

  Copyright

  Dedication

  Map

  Acknowledgments

  Introduction

  Part I The Deep History of Our Species

  1 How the Genome Explains Who We Are

  2 Encounters with Neanderthals

  3 Ancient DNA Opens the Floodgates

  Part II How We Got to Where We Are Today

  4 Humanity’s Ghosts

  5 The Making of Modern Europe

  6 The Collision That Formed India

  7 In Search of Native American Ancestors

  8 The Genomic Origins of East Asians

  9 Rejoining Africa to the Human Story

  Part III The Disruptive Genome

  10 The Genomics of Inequality

  11 The Genomics of Race and Identity

  12 The Future of Ancient DNA

  Notes on the Illustrations

  Notes

  About the Author

  30 Population Mixtures

  The mixture of highly differentiated populations is a recurrent process in our history. This map provides a key to thirty great mixture events discussed in this book. (Locations are not meant to be precise.)

  CHAPTER 2

  2a 54,000–49,000 years ago

  All non-Africans

  Neanderthals + modern humans

  CHAPTER 3

  3a >70,000 ya

  Siberian Denisovans

  Superarchaic lineage + Neanderthal-related lineage

  3b 49,000–44,000 ya

  Papuans and Australians

  Denisovans + modern humans

  CHAPTER 4

  4a 19,000–14,000 ya

  Magdalenian expansion

  Aurignacian + Gravettian lineages

  4b >14,000 ya

  Late Near Eastern hunter-gatherers

  Basal Eurasians + early Near Eastern hunter-gatherers

  4c ~14,000 ya

  Bølling-Allerød expansion

  Southwest + Southeast European hunter-gatherers

  4d 8,000–3,000 ya

  Copper and Bronze Age Near East

  Iranian + Levantine + Anatolian farmers

  CHAPTER 5

  5a 9,000–5,000 ya

  First European farmers

  Local hunter-gatherers + Anatolian farmers

  5b 9,000–5,000 ya

  Steppe pastoralists

  Iranian farmers + local hunter-gatherers

  5c 5,000–4,000 ya

  Northern European Bronze Age

  Eastern European farmers + steppe pastoralists

  5d >3,500 ya

  Aegean Bronze Age

  Iranian farmers + European farmers

  5e 3,500 ya – present

  Present-day Europeans

  Northern + Southern European Bronze Age populations

  CHAPTER 6

  6a >4,000 ya

  Ancestral South Indians

  Iranian farmers + indigenous Indian hunter-gatherers

  6b 4,000–3,000 ya

  Ancestral North Indians

  Steppe pastoralists + Iranian farmers

  6c 4,000–2,000 ya

  Present-day Indians

  Ancestral South Indians + Ancestral North Indians

  CHAPTER 7

  7a >15,000 ya

  First Americans

  Ancient North Eurasians + East Asians

  7b 5,000–4,000 ya

  Paleo-Eskimos

  Far Eastern Siberians + First Americans

  7c >4,000 ya

  Amazonians

  Population Y + First Americans

  7d 2,000–1,000 ya

  Na-Dene speakers

  Paleo-Eskimos + First Americans

  7e 2,000–1,000 ya

  Neo-Eskimos

  Far Eastern Siberians + First Americans

  CHAPTER 8

  8a 5,000–4,000 ya

  Austroasiatic speakers

  Yangtze River Ghost Population + indigenous Southeast Asian hunter-gatherers

  8b 5,000–3,000 ya

  Tibetans

  Yellow River Ghost Population + Tibetan hunter-gatherers

  8c 5,000–1,000 ya

  Present-day Han Chinese

  Yellow + Yangtze River Ghost Populations

  8d 4,000–1,000 ya

  Southwest Pacific islanders

  Papuans + East Asians

  8e 3,000–2,000 ya

  Present-day Japanese

  Mainland farmers + local hunter-gatherers

  CHAPTER 9

  9a >8,000 ya

  Malawi hunter-gatherers

  East + South African foragers

  9b 4,000–1,000 ya

  Bantu expansion

  Cameroon source population + local groups throughout eastern and southern Africa

  9c >3,000 ya

  East African pastoralists

  Levantine farmers + East African foragers

  9d >2,000 ya

  Present-day West Africans

  At least two ancient African lineages

  9e 2,000–1,000 ya

  Present-day Khoe-Kwadi herders

  East African pastoralists + indigenous San

  Acknowledgments

  First thing first. This book emerged out of a year of intense collaboration with my wife, Eugenie Reich. We researched the book together, prepared the first drafts of the chapters together, and talked about the book incessantly as it matured. This book would not have come into being without her.

  I am grateful to Bridget Alex, Peter Bellwood, Samuel Fenton-Whittet, Henry Louis Gates Jr., Yonatan Grad, Iosif Lazaridis, Daniel Lieberman, Shop Mallick, Erroll McDonald, Latha Menon, Nick Patterson, Molly Przeworski, Juliet Samuel, Clifford Tabin, Daniel Reich, Tova Reich, Walter Reich, Robert Weinberg, and Matthew Spriggs for close critical readings of the entire book.

  I thank David Anthony, Ofer Bar-Yosef, Caroline Bearsted, Deborah Bolnick, Dorcas Brown, Katherine Brunson, Qiaomei Fu, David Goldstein, Alexander Kim, Carles Lalueza-Fox, Iain Mathieson, Eric Lander, Mark Lipson, Scott MacEachern, Richard Meadow, David Meltzer, Priya Moorjani, John Novembre, Svante Pääbo, Pier Palamara, Eleftheria Palkopoulou, Mary Prendergast, Rebecca Reich, Colin Renfrew, Nadin Rohland, Daniel Rozas, Pontus Skoglund, Chuanchao Wang, and Michael Witzel for critiques of individual chapters. I also thank Stanley Ambrose, Graham Coop, Dorian Fuller, Éadaion Harney, Linda Heywood, Yousuke Kaifu, Kristian Kristiansen, Michelle Lee, Daniel Lieberman, Michael McCormick, Michael Petraglia, Joseph Pickrell, Stephen Schiffels, Beth Shapiro, and Bence Viola for reviewing sections of the book for accuracy.

  I am grateful to Harvard Medical School, the Howard Hughes Medical Institute, and the National Science Foundation, all of which generously supported my science while I was working on this project, and viewed it as complementary to my primary research.

  I finally thank several people who repeatedly encouraged me to write this book. I resisted the idea for years because I did not want to distract myself from my science, and because for geneticists papers are the currency, not books. But my mind changed as my colleagues grew to include archaeologists, anthropologists, historians, linguists, and others eager to come to grips with the ancient DNA revolution. There are many papers I did not write, and many analyses I did not complete, because of the time I needed to write this book. I hope that those who read the book will emerge with a new perspective on who we are.

  Introduction

  This book is inspired by a visionary, Luca Cavalli-Sforza, the founder of genetic studies of our past. I was trained by one of his students, and so it is that I am part of his school, inspired by his vision of the genome as a prism for understanding the history of our species.

  The high-water mark of Cavalli-Sforza’s career came in 1994 when he published The History and Geography of Human Genes, which synthesized what was then known from archaeology, linguistics, history, and genetics to tell a grand story about how the world’s peoples got to be the way they are today.1 The book offered an overview of the deep past. But it was based on what was known at the time and was therefore handicapped by the paucity of genetic data then available, which were so limited as to be nearly useless compared to the far more extensive information from archaeology and linguistics. The genetic data of the time could sometimes reveal patterns consistent with what was already known, but the information they provided was not rich enough to demonstrate anything truly new. In fact, the few major new claims that Cavalli-Sforza did make have essentially all been proven wrong. Two decades ago, everyone, from Cavalli-Sforza to beginning graduate students such as myself, was working in the dark ages of DNA.

  Cavalli-Sforza made a grand bet in 1960 that would drive his entire career. He bet that it would be possible to reconstruct the great migrations of the past based entirely on the genetic differences among present-day peoples.2

  Through study after study over the subsequent five decades, Cavalli-Sforza seemed to be well on the path to making good on his bet. When he started his work, the technology for studying human variation was so poor that the only possibility was to measure proteins in the blood, using variations like the A, B, and O blood types that are tested by physicians to match blood donors to recipients. By the 1990s, he and his colleagues had assembled data from more than one hundred such variations in diverse populations. Using these data they were able to reliably cluster individuals by continent based on how often they matched each other at these variations: for example, Europeans have a high rate of matching to other Europeans, East Asians to East Asians, and Africans to Africans. In the 1990s and 2000s, they brought their work to a new level by moving beyond protein variation and directly examining DNA, our genetic code. They analyzed a total of about one thousand individuals from around fifty populations spread across the planet, examining variation at more than three hundred positions in the genome.3 When they told their computer—which had no knowledge of the population labels—to cluster the individuals into five groups, the results corresponded uncannily well to commonly held intuitions about the deep ancestral divisions among humans (West Eurasians, East Asians, Native Americans, New Guineans, and Africans).
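  The idea of grouping people by how often they match one another can be illustrated with a toy simulation. The sketch below is an illustration only, not the procedure used in the studies described above: the two populations, their variant frequencies, the sample sizes, and the simplification that each individual carries a single variant per marker are all assumptions made for the example.

```python
import random
from itertools import combinations

def simulate_population(freqs, n_individuals, rng):
    """Draw individuals; each carries variant 1 at marker j with probability freqs[j]."""
    return [[1 if rng.random() < f else 0 for f in freqs]
            for _ in range(n_individuals)]

def match_rate(a, b):
    """Fraction of markers at which two individuals carry the same variant."""
    return sum(x == y for x, y in zip(a, b)) / len(a)

def mean(values):
    values = list(values)
    return sum(values) / len(values)

# Hypothetical setup: two populations whose variant frequencies were drawn
# independently, so they differ at most of the 200 simulated markers.
rng = random.Random(1)
n_markers = 200
freqs_a = [rng.uniform(0.1, 0.9) for _ in range(n_markers)]
freqs_b = [rng.uniform(0.1, 0.9) for _ in range(n_markers)]
pop_a = simulate_population(freqs_a, 10, rng)
pop_b = simulate_population(freqs_b, 10, rng)

# Average match rate for pairs from the same population...
within = mean(
    [match_rate(x, y) for x, y in combinations(pop_a, 2)]
    + [match_rate(x, y) for x, y in combinations(pop_b, 2)]
)
# ...and for pairs drawn from different populations.
between = mean(match_rate(x, y) for x in pop_a for y in pop_b)

print(f"within-population match rate:  {within:.3f}")
print(f"between-population match rate: {between:.3f}")
```

  In this toy setting, pairs from the same population match more often than pairs from different populations, which is the kind of signal a clustering procedure can pick up without being told any population labels.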

  Cavalli-Sforza was especially interested in interpreting the genetic clusters among present-day people in terms of population history. He and his colleagues analyzed their blood group data by using a technique that identifies combinations of biological variations that are most efficient at summarizing differences across individuals. Plotting these combinations of blood group types onto a map of West Eurasia, they found that the one summarizing the most variation across individuals reached its extreme value in the Near East, and declined along a southeast-to-northwest gradient into Europe.4 They interpreted this as a genetic footprint of the migration of farmers into Europe from the Near East, known from archaeology to have occurred after nine thousand years ago. The declining intensity suggested to them that after arriving in Europe, the first farmers mixed with local hunter-gatherers, accumulating more hunter-gatherer ancestry as they expanded, a process they called “demic diffusion.”5 Until recently, many archaeologists viewed the demic diffusion model as an exemplary merging of insights from archaeology and genetics.
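  The technique of finding the combination of variations that best summarizes differences is commonly known as principal component analysis. The following sketch shows the general idea on simulated data; it is a toy illustration under stated assumptions, not a reconstruction of the original analysis. The populations, the straight-line frequency gradient, the noise level, and the function names are all hypothetical.

```python
import random

def first_principal_component(rows, iterations=200):
    """Return (loadings, scores) for the first principal component.

    `rows` holds one list per population and one column per marker.
    Power iteration is used in place of a linear-algebra library.
    """
    n, m = len(rows), len(rows[0])
    # Center each marker on its mean across populations.
    means = [sum(r[j] for r in rows) / n for j in range(m)]
    centered = [[r[j] - means[j] for j in range(m)] for r in rows]

    # Repeatedly apply X^T X to a vector and renormalize it.
    v = [1.0 / m ** 0.5] * m
    for _ in range(iterations):
        scores = [sum(c[j] * v[j] for j in range(m)) for c in centered]
        w = [sum(centered[i][j] * scores[i] for i in range(n)) for j in range(m)]
        norm = sum(x * x for x in w) ** 0.5
        v = [x / norm for x in w]
    scores = [sum(c[j] * v[j] for j in range(m)) for c in centered]
    return v, scores

def pearson(xs, ys):
    """Pearson correlation between two equal-length sequences."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5

def simulate_gradient(n_pops=20, n_markers=30, seed=0):
    """Hypothetical populations along a line from 0 to 1, with marker
    frequencies that change linearly with position plus a little noise."""
    rng = random.Random(seed)
    positions = [i / (n_pops - 1) for i in range(n_pops)]
    base = [rng.uniform(0.3, 0.6) for _ in range(n_markers)]
    slope = [rng.uniform(-0.25, 0.25) for _ in range(n_markers)]
    freqs = [[base[j] + slope[j] * x + rng.gauss(0, 0.01)
              for j in range(n_markers)] for x in positions]
    return positions, freqs

positions, freqs = simulate_gradient()
loadings, scores = first_principal_component(freqs)
r = pearson(positions, scores)
print(f"|correlation| between position and first component: {abs(r):.3f}")
```

  When the simulated frequencies change steadily with position, the first component tracks position closely. The sign of a component is arbitrary, so only the absolute correlation is meaningful. A map of such scores shows a gradient, but the gradient alone does not reveal what historical process produced it.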

  The model that Cavalli-Sforza and colleagues proposed to describe the data was intellectually attractive, but it was wrong. Its flaws became apparent beginning in 2008, when John Novembre and colleagues demonstrated that gradients like those observed in Europe can arise without migration.6 They then showed that a Near Eastern farming expansion into Europe might counter-intuitively cause the mathematical technique that Cavalli-Sforza used to produce a gradient perpendicular to the direction of migration, not parallel to it as had been seen in the real data.7

  It took the revolution wrought by the ability to extract DNA from ancient bones—the “ancient DNA revolution”—to drive a nail into the coffin of the demic diffusion model. The ancient DNA revolution documented that the first farmers even in the most remote reaches of Europe—Britain, Scandinavia, and Iberia—had very little hunter-gatherer-related ancestry. In fact, they had less hunter-gatherer ancestry than is present in diverse European populations today. The highest proportion of early farmer ancestry in Europe is today not in Southeast Europe, the place where Cavalli-Sforza thought it was most common based on the blood group data, but instead is in the Mediterranean island of Sardinia to the west of Italy.8

  The example of Cavalli-Sforza’s maps shows why his grand bet went sour. He was correct in his assumption that the present-day genetic structure of populations echoes some of the great events in the human past. For example, the lower genetic diversity of non-Africans compared to Africans reflects the reduced diversity of the modern human population that expanded out of Africa and the Near East after around fifty thousand years ago. But the present-day structure of human populations cannot recover the fine details of ancient events. The problem is not just that people have mixed with their neighbors, blurring the genetic signatures of past events. It is actually far more difficult, in that we now know, from ancient DNA, that the people who live in a particular place today almost never exclusively descend from the people who lived in the same place far in the past.9 Under these circumstances, the power of any study that attempts to reconstruct past population movements from present-day populations is limited. In The History and Geography of Human Genes, Cavalli-Sforza wrote that he was excluding from his analysis populations known to be the product of major migrations, such as those of European and African ancestry in the Americas that owe their origin to transatlantic migrations in the last five hundred years, or European minorities such as Roma and Jews. His bet was that the past was a much simpler place than the present, and that by focusing on populations today that are not affected by major migrations in their recorded history, he might be studying direct descendants of people who lived in the same places long before. But what the study of ancient DNA has now shown is that the past was no less complicated than the present. Human populations have repeatedly turned over.

  Figure 1a. A contour plot made by Luca Cavalli-Sforza in 1993 (adapted above) suggested that the movement of farmers from the east could be reconstructed from the patterns of blood group variation among people living today, with the highest proportions of such ancestry in the southeast near Anatolia.

  Cavalli-Sforza’s transformative contribution to the field of genetic studies of human prehistory recalls the story of Moses, a visionary leader whose achievement was greater than that of anyone who followed him and who created a new template for seeing the world. The Bible says, “No prophet ever arose again in Israel like Moses,” but also tells how Moses was not allowed to reach the promised land. After leading his people for forty years through the wilderness, Moses climbed the mountain of Nebo and looked west over the Jordan River to see the land his people had been promised. But he was not allowed to enter that land. That privilege had been reserved for his successors.

  Figure 1b. Modern genome-wide data show that the primary gradient of farmer ancestry in Europe does not flow southeast-to-northwest but instead in an almost perpendicular direction, a result of a major migration of pastoralists from the east that displaced much of the ancestry of the first farmers.

  So it is with genetic studies of the past. Cavalli-Sforza saw before anyone else the full potential of genetics for revealing the human past, but his vision predated the technology needed to fulfill it. Today, however, things are very different. We have several hundred thousand times more data, and in addition we have access to the rich lode of information contained in ancient DNA, which has become a more definitive source of information about past population movements than the traditional tools of archaeology and linguistics.

  The first five ancient human genomes were published in 2010: a few archaic Neanderthal genomes,10 the archaic Denisova genome,11 and an approximately four-thousand-year-old individual from Greenland.12 The next few years saw the publication of genome-wide data from five additional humans, followed by a burst of data from thirty-eight individuals in 2014. But in 2015, whole-genome analysis of ancient DNA went into hyperdrive. Three papers added genome-wide datasets from another sixty-six,13 then one hundred,14 and then eighty-three samples.15 By August 2017, my laboratory alone had generated genome-wide data for more than three thousand ancient samples. We are now producing data so fast that the time lag between data production and publication is longer than the time it takes to double the data in the field.