UK researchers create the world's largest 'family tree' linking 27 million people

Scientists have created the world's largest family tree, linking around 27 million people - both living and dead - from around the world.

Developed at the University of Oxford in England, the expansive genealogical network - which researchers say is the largest human genealogy to date - reveals how individuals across the globe are related to one another. 

"The characterisation of modern and ancient human genome sequences has revealed previously unknown features of our evolutionary past," researchers from the University of Oxford's Big Data Institute said.

"As genome data generation continues to accelerate - through the sequencing of population-scale biobanks and ancient samples from around the world - so does the potential to generate an increasingly detailed understanding of how populations have evolved." 

The research, comprising a scientific method, a research paper and a video, traces human populations over time, showing where and when they walked the earth.

The project, published on Friday in the journal Science, is described by author and evolutionary geneticist Dr Yan Wong as "basically a huge family tree".

"We have basically built a huge family tree, a genealogy for all of humanity that models as exactly as we can, the history that generated all the genetic variation we find in humans today," Dr Wong said.

"This genealogy allows us to see how every person's genetic sequence relates to every other, along all the points of the genome.

"While humans are the focus of this study, the method is valid for most living things; from orangutans to bacteria. It could be particularly beneficial in medical genetics, in separating out true associations between genetic regions and diseases from spurious connections arising from our shared ancestral history."

There have been unprecedented advances in human genetic research over the past 20 years, with genomic data generated for hundreds of thousands of people - including from thousands of prehistoric ancestors.

"We demonstrate the power of the method to recover relationships between individuals and populations as well as to identify descendants of ancient samples," the researchers said.

"We use the foundational notion that the ancestral relationships of all humans who have ever lived can be described by a single genealogy or tree sequence… linking individuals to one another at every point in the genome. 

"This tree sequence of humanity is immensely complex, but estimates of the structure are a powerful means of integrating diverse datasets and gaining greater insights into human genetic diversity… [and] reveal features of human diversity and evolution."
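For readers curious what "linking individuals to one another at every point in the genome" can look like in practice, here is a toy sketch in Python. It is not the researchers' software: the `LocalTree` class, the helper functions, the sample names and the interval boundaries are all invented for illustration. The only idea it borrows from the description above is that a tree sequence holds a separate family tree for each stretch of the genome, so two people can share a different most recent ancestor depending on where in the genome you look.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class LocalTree:
    """One family tree that applies to a single stretch of the genome."""
    left: int                 # first genome position covered (inclusive)
    right: int                # end of the stretch (exclusive)
    parent: Dict[str, str]    # child -> parent; the root has no entry


def tree_at(trees: List[LocalTree], position: int) -> LocalTree:
    """Return the local tree whose interval contains the given position."""
    for tree in trees:
        if tree.left <= position < tree.right:
            return tree
    raise ValueError(f"no tree covers position {position}")


def lineage(tree: LocalTree, node: str) -> List[str]:
    """List a node followed by each of its ancestors, up to the root."""
    path = [node]
    while path[-1] in tree.parent:
        path.append(tree.parent[path[-1]])
    return path


def common_ancestor(trees: List[LocalTree], a: str, b: str,
                    position: int) -> Optional[str]:
    """Most recent ancestor shared by a and b at one genome position."""
    tree = tree_at(trees, position)
    ancestors_of_a = set(lineage(tree, a))
    for node in lineage(tree, b):
        if node in ancestors_of_a:
            return node
    return None


# Hypothetical data: three samples and two stretches of genome.
TREES = [
    LocalTree(0, 50, {"alice": "anc1", "bob": "anc1",
                      "carol": "anc2", "anc1": "anc2"}),
    LocalTree(50, 100, {"alice": "anc2", "bob": "anc3",
                        "carol": "anc3", "anc3": "anc2"}),
]

if __name__ == "__main__":
    for pos in (10, 75):
        print(pos, common_ancestor(TREES, "alice", "bob", pos))
```

In this made-up example, the first two samples share a recent ancestor in the first stretch of genome but only a more distant one in the second, which is the kind of position-by-position relationship the researchers describe.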

Until now, scientists have struggled to process and integrate the highly diverse and expansive datasets, with samples from wide-ranging times, geographic locations and populations processed, sequenced, and analysed using a variety of techniques. As a result, the datasets contained genuine variation, but also complex patterns of error. 

"The [large quantity] of genetic sequencing data creates challenges for integrating diverse data sources," the researchers said.

"This makes combining data challenging and hinders efforts to generate the most complete picture of human genomic variation."

However, the team were able to introduce new statistical and computational methods to easily combine data from different sources and accommodate the millions of genome sequences.

Each line represents an ancestor-descendant relationship. The width of a line represents how many times the relationship is seen and lines are coloured based on the estimated age of the ancestor. Photo credit: Wohns et al / Science

To date, the tree sequence integrates 3609 human genome sequences - 3601 modern and eight ancient - from across eight datasets and 215 populations. The ancient genomes ranged in age from thousands to more than 100,000 years.

The resulting tree is a "compact representation" of 27 million ancestral haplotype fragments (sets of genetic determinants inherited from a single parent) and 231 million ancestral lineages. An additional 3589 ancient samples compiled from more than 100 publications were also used in order to trace the relationships.

"Essentially, we are reconstructing the genomes of our ancestors and using them to form a vast network of relationships," said lead author Dr Anthony Wilder Wohns, a postdoctoral researcher at the Broad Institute of MIT and Harvard. "We can then estimate when and where these ancestors lived."

The results also successfully traced key events in human history, including the migration out of Africa and archaic introgression - the transfer of genetic information from one species to another - in Oceania. The team hopes its techniques will help to better explain the paths and timings of historic migrations.

The researchers traced the very earliest ancestors, who predate Homo sapiens, to a location in what is now Sudan, likely more than one million years ago. These ancestors were very likely Homo erectus, an extinct species of archaic human.

"These ancestors lived up to and over one million years ago – which is much older than current estimates for the age of modern humans (c 250,000 to 300,000 years ago) – so bits of our genome have been inherited from individuals that we wouldn't recognise as modern humans, but who most likely lived in northeast Africa. It's very likely that these very old ancestors were Homo erectus, but we cannot be sure of their identity or of their location without extremely ancient DNA," Dr Wong and Dr Wohns said.

"One important conclusion from our work is that the people we often label as representing 'the cradle of humanity' themselves had ancestors further back in time, whose descendants are still among us today."