Mapping the Tree of Life: the ARBOR Project
Open Tree of Life met with ARBOR, a project funded by the National Science Foundation, to talk about recent developments involving the synthetic tree of life. We spoke with Dr. Luke Harmon, an associate professor in the University of Idaho’s Department of Biology. Dr. Harmon has been using comparative biology to determine what the tree of life can tell us about evolution over long time scales.
What is ARBOR working on right now?
Comparative biology is at the heart of the ARBOR project. Using the evolutionary relationships among species, we can learn something about trait evolution and the formation of new species. For example, there really is no basic ‘ladder of life’ leading from simpler organisms to more complex ones; instead, evolution varies among groups and through time in complex and interesting ways. It’s hard to do what we do with traditional tools. Instead, we have to use new tools to analyze how species have diversified to generate the tree of life.
How have phylogeny studies changed over time?
A lot of progress has been made in the last twenty years regarding our understanding of the relationships among different species. We now know a lot more about how species are related to one another and how they evolved from their common ancestors. The Open Tree of Life is the best possible example of this sort of synthesis – it’s almost like the human genome project in that it is generating a very good map that will connect all organisms on earth in a single phylogenetic tree. One problem, though, is that there is just so much information contained in large phylogenetic trees, and we don’t always know how to extract information about how organisms evolve. ARBOR is developing tools to read the stories of evolution from these phylogenies.
Free webinar: Putting all species in a graph database
Biology + Technology = OTOL
One of the developers of the Open Tree of Life will demonstrate on Thursday, during a free webinar, how graph databases are used to construct a tree of life. The lecture is organized by Neo Technology, the maker of Neo4j, an open-source graph database that is used for OTOL.
Stephen Smith, an ecology and evolutionary biology professor at the University of Michigan, is going to explain how Neo4j and other digital technologies are assisting in constructing the tree of life. Starting at 10:00 PDT (19:00 CEST), he will also discuss other aspects of the interface of biology with next generation technologies.
“Our project is building the tools with which scientists in the community can continually improve the tree of life as we gather new information. Neo4j allows us to not only store trees in their native graph form, but also allows us to map trees to the same structure, the graph. So in fact, we are facilitating the construction of the graph of life,” says Smith.
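To make the idea of mapping several trees onto one graph more concrete, here is a minimal sketch in plain Python. It does not use Neo4j, and the tree names, taxa, and the merge_into_graph helper are made up for illustration; it only shows how edges from different trees can share one structure while remembering which trees support them.

```python
from collections import defaultdict

# Two hypothetical input trees, each written as (child, parent) edges.
tree_a = [("human", "primates"), ("night monkey", "primates"), ("primates", "mammals")]
tree_b = [("night monkey", "primates"), ("elephant", "mammals"), ("primates", "mammals")]


def merge_into_graph(trees):
    """Map several trees onto one graph, recording which trees support each edge."""
    graph = defaultdict(set)  # (child, parent) -> names of the trees containing that edge
    for name, edges in trees.items():
        for child, parent in edges:
            graph[(child, parent)].add(name)
    return graph


graph = merge_into_graph({"tree_a": tree_a, "tree_b": tree_b})
for (child, parent), sources in sorted(graph.items()):
    print(f"{child} -> {parent}  supported by {sorted(sources)}")
```

Edges that appear in both trees are stored once, with both trees listed as sources. A graph database would typically keep the same kind of information as properties on nodes and relationships.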
Neo4j approached the Open Tree of Life team to present a webinar because it is a project that utilizes the Neo4j graph database to represent the interconnectedness of biological data. The company considers the project a great example of how a graph database can better model the natural world.
The online lecture is intended for a broad audience including beginner computer programmers, advanced hackers, data scientists, natural scientists, and anyone interested in the intersection of science and technology, especially data modeling. Over 150 people have already registered online.
The registration form: LINK
Update: The video from this webinar is available on Vimeo: http://vimeo.com/67870035
Interview with Open Tree of Life investigator
Crandall featured on PeerJ blog
Open Tree of Life investigator Keith Crandall is featured on the blog of PeerJ, which is a peer-reviewed, open access journal on the Internet. Crandall is an Academic Editor for PeerJ and is the director of the Computational Biology Institute at George Washington University. He was the editor for the “living fossil” manuscript that got much news media attention last week. Here’s the link to the interview.
Building an API for the Open Tree of Life database
Do you want an app for this?
The developers of the Open Tree of Life would like to know from the phylogenetic community what kind of information they want to extract from its database when the first draft is released later this year. With those preferences, it is possible to develop an API that gives scientists the opportunity to build their own websites or software packages that use the data.
An API (application programming interface) is a digital tool that allows one website or software program to “talk” to another website to dig up certain pieces of data. For instance, a lot of people use Tweetdeck to navigate the ongoing bombardment of messages in the Twittersphere. In that case, Tweetdeck is connecting to Twitter, through its API, to receive and order the messages according to the preferences of the user.
In the case of the Open Tree of Life, an API gives researchers advanced access to the data of about two million species, the phylogenies that have been created to illustrate possible relationships between them, and the underlying data and methods of synthesis. “For example, it will be possible to select smaller trees for specific species or find out how many studies there are for a particular node within the database,” says Karen Cranston, the lead investigator of the project.
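The two kinds of queries mentioned in the quote can be sketched on a small in-memory example. This is not the Open Tree of Life API, which had not been released when this was written; the data, the study identifiers, and the function names below are assumptions chosen only to show the idea.

```python
# Hypothetical data: child -> parent links, plus the studies recorded for some nodes.
parent = {
    "Homo sapiens": "Primates",
    "Aotus trivirgatus": "Primates",
    "Primates": "Mammalia",
    "Loxodonta africana": "Mammalia",
    "Mammalia": None,  # root of this toy tree
}
studies = {"Primates": ["study_1", "study_2"], "Mammalia": ["study_3"]}


def path_to_root(taxon):
    """Return the nodes visited when walking from a taxon up to the root."""
    path = []
    while taxon is not None:
        path.append(taxon)
        taxon = parent[taxon]
    return path


def subtree_for(species):
    """Collect the child -> parent edges that link the chosen species to the root."""
    edges = set()
    for name in species:
        path = path_to_root(name)
        edges.update(zip(path, path[1:]))
    return edges


def study_count(node):
    """Count the studies recorded for a node (0 if none are recorded)."""
    return len(studies.get(node, []))


print(sorted(subtree_for(["Homo sapiens", "Aotus trivirgatus"])))
print(study_count("Primates"))
```

A real service would answer the same questions over the network, but the shape of the request would likely be similar: a list of species in, a smaller tree or a count out.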
“We need a sense of ownership of phylogenetic trees”
Where are the fungi datasets?
A couple thousand fungi phylogeny studies have been published in the past twelve years. Clark University postdoc researcher Romina Gazis has gone through all of them. Now she is working on a bigger challenge: finding all the trees and datasets that were the foundation of those studies.
Ideally, all scientists who publish a phylogenetic tree would also deposit the datasets they used to create that tree in a publicly available online database. That would allow other researchers to synthesize data from different sources to advance the knowledge about relationships between certain species and their evolutionary history.
Unfortunately, most of those datasets are not publicly available. Gazis found datasets for only about a quarter of the 2,000 fungi articles she surveyed. “Around 600 studies had tree files available, but not necessarily complete,” she concluded. “Some scientists deposited one but not all the trees.”
Tree of Life: Are big changes looming on the horizon?
All species like some gadgets
While movie hero James Bond gets his spy gadgets from his loyal developer Q, almost every other species on Earth has to put a little more effort into arming itself. But that does not mean they cannot rely on some good ol’ friends to do so. In fact, the acquisition of genes from other species through lateral gene transfer can lead to innovations that at times can be painful, sometimes even deadly, to others.
One of those evolutionary novelties can be seen in certain types of jellyfish that developed the ability to sting after their ancestors acquired a gene from a bacterium and incorporated that material into their own DNA. This gene transfer helped jellyfish to create an innovative defense tool to fend off other species that could endanger them. The result is quite frightening: more humans get killed by jellyfish than by sharks.
Small portion of phylogenetic data is stored publicly
‘The glass is still pretty empty’
Sometimes you wonder whether the glass is half full or half empty.
But when it is only four percent full, and the other 96 percent is just air, there is only one conclusion: it is time for more.
At least that is what some scientists in the phylogenetic community argue, because only about four percent of all published phylogenies are stored in places such as TreeBASE or Dryad. Their message is quite simple: it is time to bring together more databases with estimations on how species are possibly related to each other.
Several journals in the evolutionary biology field recently adopted policies that encourage or require contributors to make their data publicly available online. Yet this only leads to the storage of a very small percentage of the tens of thousands of phylogenies that have been constructed in the past few decades.
Of course, there are also other ways to receive data that are not stored on the Internet, but those alternatives are commonly not the most efficient routes. For instance, it is possible to send an email to a scientist who published a phylogenetic tree and “sometimes wait for six months to maybe get a response—either with or without the data,” says Keith Crandall, one of the Open Tree of Life investigators and the founding director of the Computational Biology Institute at George Washington University.
Connecting millions of pieces
Creating the entire tree of life is like completing a jigsaw puzzle with more than two million pieces. And to make it even harder, the picture of what the solved puzzle should look like is missing.
No one knows precisely how all pieces are related.
This uncertainty shows clearly in the disagreements among evolutionary biologists about how certain species and branches are linked together. Over the years they have created a large variety of trees for specific groups of species that contradict each other. For example, one researcher maintains that species A is the closest living relative of species B, but another scientist thinks that species C is actually most closely related to B.
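The A, B, C example can be checked mechanically. The sketch below is one simple way to do it, not the project's method: each tree is written as nested tuples, every group of species that a tree places together is listed, and two groups are reported as conflicting when they overlap without one containing the other. The function names are illustrative.

```python
def clades(tree):
    """Return the set of species groups (clades) implied by a nested-tuple tree."""
    found = set()

    def leaves(node):
        if isinstance(node, str):  # a single species
            return frozenset([node])
        group = frozenset().union(*(leaves(child) for child in node))
        found.add(group)
        return group

    leaves(tree)
    return found


def conflicts(tree_1, tree_2):
    """List pairs of clades that overlap without one containing the other."""
    return [
        (a, b)
        for a in clades(tree_1)
        for b in clades(tree_2)
        if a & b and not (a <= b or b <= a)
    ]


tree_1 = (("A", "B"), "C")  # one researcher: A is the closest relative of B
tree_2 = (("B", "C"), "A")  # another researcher: C is the closest relative of B

for a, b in conflicts(tree_1, tree_2):
    print(sorted(a), "conflicts with", sorted(b))
```

Here the only conflict is between the group {A, B} from the first tree and the group {B, C} from the second, which is exactly the disagreement described above.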
Is it a plant? Or is it a monkey?
It should not be hard to recognize the differences between furry night monkeys and the bright yellow flowers of golden peas. But they have something peculiar in common that leads to some confusion once in a while: their name. Both genera are officially known as Aotus.
There are about two million known species on the planet, so it should not come as a surprise that scientists have accidentally given certain species, or groups of species, identical names. For instance, Proboscidea is the order of elephants, but it is also the name of the genus of devil’s claws. Other examples include Myrmecia pyriformis (insect and green algae), Ficus elegans (mollusc and plant), Ormosia nobilis (insect and plant), and Trigonidium grande (orchid and katydid).
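Finding such shared names in a list of taxa is a simple grouping exercise. The sketch below uses a few of the examples from the text plus one unambiguous name added for contrast; the record layout and the find_homonyms helper are assumptions made for illustration.

```python
from collections import defaultdict

# (name, kind of organism) pairs; "Homo sapiens" is added here as a name used only once.
records = [
    ("Aotus", "monkey"),
    ("Aotus", "plant"),
    ("Proboscidea", "elephant order"),
    ("Proboscidea", "plant genus"),
    ("Ficus elegans", "mollusc"),
    ("Ficus elegans", "plant"),
    ("Homo sapiens", "primate"),
]


def find_homonyms(records):
    """Return the names that are used for more than one kind of organism."""
    groups = defaultdict(set)
    for name, kind in records:
        groups[name].add(kind)
    return {name: sorted(kinds) for name, kinds in groups.items() if len(kinds) > 1}


for name, kinds in find_homonyms(records).items():
    print(name, "is used for:", ", ".join(kinds))
```

A database that stores all species has to keep such names apart, for example by recording a unique identifier or the broader group alongside each name.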
Across disciplinary boundaries
What do a fungal evolutionary biologist and a computer scientist have in common?
It is usually easier to name a long list of differences, but that does not mean that those scholars are investigating different issues all the time. They may be very much interested in the same problems, yet apply different perspectives and methods in search for answers.
Those scientists could work continuously on their individual research projects for many years. However, in some cases only an interdisciplinary collaboration leads to a solution. The investigators of the Open Tree of Life project hope this will be the case for them as well. Their goal: creating a tree of life that includes all 1.9 million known species.
Wanted: All your favorite trees
With eleven investigators, the Open Tree of Life project is already a large-scale research endeavor. But that does not mean that they can add all 1.9 million known species to a database by themselves. In fact, they are looking for help.
A lot of help.
The main goal of the project is to merge all existing phylogenetic trees into one overarching tree of life. In the past few months, the researchers have been working on software applications to make it possible to store all known species and, more important, to specify how they are all linked to each other in evolutionary terms.
Put on your quiz hats! We need some good questions!
SUBMIT YOUR QUESTIONS HERE
• Sponges fall within which major group on the tree of life? (animal, plant, bacteria)
• Which are mushrooms more closely related to? (animals, red algae, plants)
• How many origins of life were there on Earth? (1, 2, 3)
• Which organisms represent the greatest biomass on Earth? (bacteria and archaea, mammals, fish)
• How many major groups of organisms are represented in a ham sandwich? (1, 2, 3)
• Genes (i.e. portions of genomes) yield the same estimate for the ToL? (Yes, No, Sometimes)
• The top 10 infectious agents on earth appear where on the tree? (bacteria only, in both bacteria and eukaryotes, in both bacteria and archaea)
• Each gene sequenced and analyzed yields the very same answer for the ToL? (Yes, No, Sometimes)
You can submit up to three questions with this form, but feel free to submit more by starting a new one!
What data should we collect about the input trees for the tree of life?
The absence of a formal reporting standard for phylogenetic analyses is a major impediment for digital access and reuse of published gene trees and species trees. Efforts are underway to develop a standard for Minimal Information About Phylogenetic Analyses (MIAPA). An important part of this process is community input on metadata – what is important for use and evaluation, and what is reasonable to expect from producers of trees?
Results from this survey will inform two efforts: the collection of digital phylogenetic data for Open Tree of Life and the development of a minimum information standard for reporting phylogenetic analyses (MIAPA, http://www.evoio.org/wiki/MIAPA). If you have any questions, please contact Karen Cranston, National Evolutionary Synthesis Center (firstname.lastname@example.org).
Please add your opinion here
What are your favorite species?
We need your help creating a list of exemplar species from across the tree of life for our public tree!
Please click this link to vote for your 5 best exemplars.
NSF’s press release on the Open Tree of Life
Press Release 12-106 (original article)
Assembling, Visualizing and Analyzing a Tree of All Life
National Science Foundation grants will bring together what’s known about how species are related
The “Open Tree of Life” is one of three major new scientific projects funded by the NSF.
June 4, 2012
A new initiative aims to build a comprehensive tree of life that brings together everything scientists know about how all species are related, from the tiniest bacteria to the tallest tree.
Researchers are working to provide the infrastructure and computational tools to enable automatic updating of the tree of life, as well as develop the analytical and visualization tools to study it.
Scientists have been building evolutionary trees for more than 150 years, since Charles Darwin drew the first sketches in his notebook.
Darwin’s theory of evolution explained that millions of species are related and gave biologists and paleontologists the enormous challenge of discovering the branching pattern of the tree of life.
But despite significant progress in fleshing out the major branches of the tree of life, today there is still no central place where researchers can go to visualize and analyze the entire tree.
Now, thanks to grants totaling $13 million from the National Science Foundation’s (NSF) Assembling, Visualizing, and Analyzing the Tree of Life (AVAToL) program, three teams of scientists plan to make that a reality.