A new genetic tree sequence easily combines data from multiple sources and can scale to accommodate millions of genome sequences, including those of our ancestors.
The new genealogical network of human genetic diversity reveals, in unprecedented detail, how individuals across the world are related to each other. Researchers from the University of Oxford’s Big Data Institute have taken a major step towards mapping the entirety of genetic relationships among humans. The study has been published today in Science.
The past two decades have seen extraordinary advances in human genetic research, generating genomic data for hundreds of thousands of individuals, including thousands of prehistoric people.
This raises the exciting possibility of tracing the origins of human genetic diversity to produce a complete map of how individuals across the world are related to each other.
Until now, the main challenges to this vision were working out a way to combine genome sequences from many different databases and developing algorithms to handle data of this size.
Dr. Yan Wong, an evolutionary geneticist at the Big Data Institute and one of the principal authors, explained: “We have built a huge family tree, a genealogy for all of humanity that models as exactly as we can the history that generated all the genetic variation we find in humans today. This genealogy allows us to see how every person’s genetic sequence relates to every other, along all the points of the genome.”
Since each individual genomic region is inherited from only one parent, either the mother or the father, the ancestry of each point on the genome can be thought of as a tree.
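To make the idea of "a tree at each point on the genome" concrete, here is a minimal Python sketch. It is a toy illustration, not the authors' software: the samples (A to D), the ancestor names (n1, n2, n3, root), the genome length of 100 and the breakpoint at position 50 are all hypothetical. Each parent-child link is stored once, together with the stretch of genome over which it holds, so links shared by neighbouring trees are not duplicated.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Edge:
    left: int    # start of the genomic interval (inclusive)
    right: int   # end of the genomic interval (exclusive)
    parent: str
    child: str


# Hypothetical toy data: four samples on a genome of length 100.
# Positions 0-49 have the tree ((A,B)n1,(C,D)n2)root;
# positions 50-99 have the tree (((A,B)n1,C)n3,D)root.
EDGES = [
    Edge(0, 100, "n1", "A"),    # shared by both local trees
    Edge(0, 100, "n1", "B"),    # shared by both local trees
    Edge(0, 50, "n2", "C"),
    Edge(50, 100, "n3", "C"),
    Edge(0, 50, "n2", "D"),
    Edge(50, 100, "root", "D"),
    Edge(0, 50, "root", "n1"),
    Edge(50, 100, "n3", "n1"),
    Edge(0, 50, "root", "n2"),
    Edge(50, 100, "root", "n3"),
]


def local_tree(position):
    """Return the child -> parent map of the tree covering one genome position."""
    return {e.child: e.parent for e in EDGES if e.left <= position < e.right}


def lineage(node, position):
    """List a node followed by its ancestors, up to the root, at one position."""
    tree = local_tree(position)
    path = [node]
    while path[-1] in tree:
        path.append(tree[path[-1]])
    return path


def mrca(a, b, position):
    """Most recent common ancestor of two nodes at one genome position."""
    ancestors_of_a = set(lineage(a, position))
    for node in lineage(b, position):
        if node in ancestors_of_a:
            return node
    return None


print(mrca("A", "C", 10))   # root
print(mrca("A", "C", 75))   # n3
print(mrca("A", "B", 75))   # n1
```

In this toy example the same two samples, A and C, share a different most recent common ancestor depending on which part of the genome is examined, which is why a single family tree is not enough and a sequence of trees is needed.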
The study integrated data on modern and ancient human genomes from eight different databases and included a total of 3,609 individual genome sequences from 215 populations.
After adding location data on these sample genomes, the authors used the network to estimate where the predicted common ancestors had lived. The results successfully recaptured key events in human evolutionary history, including the migration out of Africa.
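As a rough illustration of how sample locations might be propagated up a genealogy, the Python sketch below places each ancestor at the average of its children's coordinates, working from the samples towards the root. This is an assumed, simplified heuristic chosen for illustration; the article does not describe the authors' actual estimator. The tree, the node names and the coordinates are hypothetical, and averaging latitude and longitude directly ignores the curvature of the Earth.

```python
def estimate_locations(parent_of, sample_locations):
    """Estimate a (latitude, longitude) pair for every node in one tree.

    parent_of maps each child to its parent; sample_locations gives known
    coordinates for the samples. Each ancestor is placed at the plain
    average of its children's estimated coordinates (an assumed heuristic).
    """
    children = {}
    for child, parent in parent_of.items():
        children.setdefault(parent, []).append(child)

    locations = dict(sample_locations)

    def locate(node):
        # Samples already have a location; ancestors are computed on demand.
        if node not in locations:
            points = [locate(c) for c in children[node]]
            lat = sum(p[0] for p in points) / len(points)
            lon = sum(p[1] for p in points) / len(points)
            locations[node] = (lat, lon)
        return locations[node]

    for ancestor in children:
        locate(ancestor)
    return locations


# Hypothetical toy tree ((A,B)n1,C)root with made-up coordinates.
PARENT_OF = {"A": "n1", "B": "n1", "n1": "root", "C": "root"}
SAMPLES = {"A": (10.0, 20.0), "B": (30.0, 40.0), "C": (0.0, 0.0)}

ESTIMATES = estimate_locations(PARENT_OF, SAMPLES)
print(ESTIMATES["n1"])     # (20.0, 30.0)
print(ESTIMATES["root"])   # (10.0, 15.0)
```

A real analysis would also need to account for the age of each ancestor and for uneven sampling across regions, both of which this sketch leaves out.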
Although the genealogical map is already an extremely rich resource, the research team plans to make it even more comprehensive by continuing to incorporate genetic data as it becomes available.
Because tree sequences store data in a highly efficient way, the dataset could easily accommodate millions of additional genomes.
This study is laying the groundwork for the next generation of DNA sequencing. As the quality of genome sequences from modern and ancient DNA samples improves, the trees will become even more accurate and we will eventually be able to generate a single, unified map that explains the descent of all the human genetic variation we see today.
While humans are the focus of this study, the method is valid for most living things, from orangutans to bacteria. It could be particularly beneficial in medical genetics, where it could help separate true associations between genetic regions and diseases from spurious connections arising from our shared ancestral history.
Source-Medindia