New Insights Into Human Evolution
By combining long-read sequence assembly with a hybrid genome scaffolding approach, researchers resolved a majority of the gaps in human and ape genomes. Some of these gaps contained genes, which are now correctly annotated in the new genomes. To better understand gene structures, the authors sequenced more than 500,000 full-length genes from each species. This newest investigation provides the most comprehensive catalog to date of genetic variants gained or lost in different ape lineages. Some of these variants affect how genes function differently in humans and other apes.
Researchers examined the possible influence of some of the genetic variants and gene function regulators in areas such as diet, anatomy, and brain formation. The comparative analysis of human and great ape genomes also included a gorilla genome, a new assembly of an African human genome, and a human haploid hydatidiform mole (because a haploid hydatidiform mole carries only one copy of each human chromosome, studies of these rare growths help researchers tell duplicated genes apart). All the genomes were sequenced and assembled using the same processes.
Additionally, researchers studied brain organoids, laboratory-grown tissues made from stem cells of apes or humans that form a simplified version of organ parts. The organoids were examined to understand how differences in gene function during brain development in humans and chimps might account for chimps' smaller brain volume, which is about one-third of human brain volume. There are also significant differences in cortical brain structure between humans and chimps.
In the organoids, they saw that certain genes, particularly those active in radial glia, the progenitor (or ancestor) cells of neurons, show decreased expression in humans compared with chimps. Those genes are more likely to have lost, specifically in the human branch, segments of DNA that are important in regulating their function. This finding is consistent with a "less is more" theory proposed in the 1990s by Maynard Olson, PhD, Professor of Genome Sciences (now retired), UW School of Medicine, and his colleagues. The theory proposes that a loss of functional elements contributed to critical aspects of human evolution.
On the other hand, certain genes appear to show increased expression in neural progenitor cells and excitatory neurons of the human nervous system. These genes are more likely to have gained additional copies via gene duplication in humans, compared to other apes.
Another discovery came from a fossil virus similar to present-day retroviruses. It may have been present in the genome of the common ancestor of all African apes. The new high-quality sequence also identified "source PtERV1," which is common to chimpanzees and gorillas. Modern-day chimpanzees and gorillas carry hundreds of PtERV1 retroviral insertions that appear to have originated from this source but never made it into the human genome. "Source PtERV1" was overlooked in earlier genome studies because it mapped only to repeat-rich gaps in the genome.
In other aspects of this project, comparison of the gorilla and human genome assemblies identified a new gorilla sequence inversion near an important gene controlling the morphology of penile spines, which humans lack. These small bumps on the skin surface occur only in apes and a few other mammals.
Humans also lost some genes involved in the synthesis of fatty acids, which come from animal and vegetable fats and oils. Two essential fatty acids in the human diet are linoleic acid and alpha-linolenic acid, which we must get from plants and fish in order to build specialized fats called omega-3 and omega-6 fatty acids. Other genetic changes related to dietary metabolism were identified that may have played a role in the evolution of ape and great ape diets, which range from strictly vegetarian to eating almost anything.
The researchers predict there is more to come. Techniques in advanced, long-range sequencing and mapping, and even longer-read sequencing, will continue to increase specific knowledge of our evolutionary journey and that of the great apes.
"Our goal is to generate multiple ape genomes with as high a quality as the human genome. Only then will we be able to truly understand the genetic differences that make us uniquely human."
The accurate sequence and assembly of genomes is critical to our understanding of evolution and genetic variation. Despite advances in short-read sequencing technology that have decreased cost and increased throughput, whole-genome assembly of mammalian genomes remains problematic because of the presence of repetitive DNA.
The goal of this study was to sequence and assemble the genome of the western lowland gorilla by using primarily single-molecule, real-time (SMRT) sequencing technology and a novel assembly algorithm that takes advantage of long (>10 kbp) sequence reads. We specifically compare the properties of this assembly to gorilla genome assemblies that were generated by using more routine short sequence read approaches in order to determine the value and biological impact of a long-read genome assembly.
We generated 74.8-fold SMRT whole-genome shotgun sequence from peripheral blood DNA isolated from a western lowland gorilla (Gorilla gorilla gorilla) named Susie. We applied a string graph assembly algorithm, Falcon, and consensus algorithm, Quiver, to generate a 3.1-Gbp assembly with a contig N50 of 9.6 Mbp. Short-read sequence data from an additional six gorilla genomes were mapped so as to reduce indel errors and improve the accuracy of the final assembly. We estimate that 98.9% of the gorilla euchromatin has been assembled into 1854 sequence contigs. The assembly represents an improvement in contiguity: >800-fold with respect to the published gorilla genome assembly and >180-fold with respect to a more recently released upgrade of the gorilla assembly. Most of the sequence gaps are now closed, considerably increasing the yield of complete gene models. We estimate that 87% of the missing exons and 94% of the incomplete genes are recovered. We find that the sequence of most full-length common repeats is resolved, with the most significant gains occurring for the longest and most G+C–rich retrotransposons. Although complex regions such as the major histocompatibility locus are accurately sequenced and assembled, both heterochromatin and large, high-identity segmental duplications are not because read lengths are insufficiently long to traverse these repetitive structures. The long-read assembly produces a much finer map of structural variation down to 50 bp in length, facilitating the discovery of thousands of lineage-specific structural variant differences that have occurred since divergence from the human and chimpanzee lineages. This includes the disruption of specific genes and loss of predicted regulatory regions between the two species.
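The contiguity figure quoted above, the contig N50, is the length of the contig at which the running total of contig lengths, sorted from longest to shortest, first reaches half of the total assembly size. The short Python sketch below illustrates that definition only. The function name and the toy contig lengths are hypothetical and are not taken from the Susie3 assembly or from the authors' pipeline.

```python
def n50(contig_lengths):
    """Return the N50 of a collection of contig lengths (in bp).

    N50 is the length L such that contigs of length >= L together
    cover at least half of the total assembled sequence.
    """
    if not contig_lengths:
        raise ValueError("need at least one contig length")
    ordered = sorted(contig_lengths, reverse=True)  # longest first
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Toy example (made-up lengths): total = 300, half = 150.
# 80 + 70 = 150 reaches the halfway point, so N50 is 70.
toy_contigs = [80, 70, 50, 40, 30, 20, 10]
print(n50(toy_contigs))  # prints 70
```

A higher N50 means that more of the genome sits in long, uninterrupted contigs, which is why it serves as a summary of assembly contiguity.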
We show that use of the new gorilla genome assembly changes estimates of divergence and diversity, resulting in subtle but substantial effects on previous population genetic inferences, such as the timing of species bottlenecks and changes in the effective population size over the course of evolution.
The genome assembly that results from using the long-read data provides a more complete picture of gene content, structural variation, and repeat biology, improving population genetic and evolutionary inferences. Long-read sequencing technology now makes it practical for individual laboratories to generate high-quality reference genomes for complex mammalian genomes.
Authors: David Gordon, John Huddleston, Mark J. P. Chaisson, Christopher M. Hill, Zev N. Kronenberg, Katherine M. Munson, Maika Malig, Archana Raja, Ian Fiddes, LaDeana W. Hillier, Christopher Dunn, Carl Baker, Joel Armstrong, Mark Diekhans, Benedict Paten, Jay Shendure, Richard K. Wilson, David Haussler, Chen-Shan Chin, Evan E. Eichler
We are grateful to A. Scally and Z. Ning for early access to the upgraded Kamilah gorilla assembly (gorGor4) and for discussion regarding its assembly. We thank M. Duyzend, L. Harshman, and C. Lee for technical assistance and quality control in generating sequencing data and H. Li for helpful suggestions for the PSMC analysis. The authors thank M. Heget, K. Gillespie, and M. Shender from the Lincoln Park Zoo for providing gorilla peripheral blood and T. Brown for assistance in editing this manuscript. This work was supported, in part, by grants from the U.S. National Institutes of Health (NIH grant HG002385 to E.E.E. and HG007635 to R.K.W. and E.E.E.; HG003079 to R.K.W.; HG007990 to D.H. and B.P.; and HG007234 to B.P.). E.E.E., J.S., and D.H. are investigators of the Howard Hughes Medical Institute. E.E.E. is on the scientific advisory board (SAB) of DNAnexus and was an SAB member of Pacific Biosciences (2009–2013); E.E.E. is a consultant for Kunming University of Science and Technology (KUST) as part of the 1000 China Talent Program. M.J.P.C. is a former employee (2009–2012) of Pacific Biosciences and owns shares in the company. On 24 February 2011, Pacific Biosciences filed a patent entitled “Sequence assembly and consensus sequence determination” (U.S. patent no. US20120330566, issued 27 December 2012); M.J.P.C. is identified as inventor of this patent. Pacific Biosciences has filed two patents related to the Falcon assembler algorithm entitled “String graph assembly for polyploid genomes” (U.S. patent no. US2015/0169823 A1 filed 18 December 2014, and U.S. patent no. US2015/0286775 A1 filed 18 June 2015); C.C. is identified as inventor for both patents. The Susie3 assembly, PacBio and Illumina sequencing data for Susie, and clone sequences have been deposited in the European Nucleotide Archive under the project accession PRJEB10880. E.E.E., D.G., J.H., M.J.P.C., C.M.H., and Z.N.K. designed experiments; K.M.M., M.M., and C.B. prepared libraries and generated sequencing data; D.G., J.H., M.J.P.C., C.M.H., Z.N.K., L.W.H., and A.R. performed bioinformatics analyses; I.F., J.A., M.D., B.P., R.K.W., and D.H. analyzed gene accuracy. J.S. helped in the evaluation of Hi-C data. C.D. and C.-S.C. aided in Falcon assembler modifications. J.H. deposited SMRT sequencing data into SRA. E.E.E., D.G., J.H., M.J.P.C., C.M.H., and Z.N.K. wrote the manuscript.
This western lowland gorilla is one of the great apes. High-resolution comparative analysis of great ape genomes is providing new insight into our own primate evolution. Image credit: Alice C. Gray.