Assembling, Visualizing, and Analyzing the Tree of Life

We need your help!

Building an API for the Open Tree of Life database

Do you want an app for this?

The developers of the Open Tree of Life would like to know from the phylogenetic community what kind of information they want to extract from its database when the first draft is released later this year. With those preferences, it is possible to develop an API that gives scientists the opportunity to build their own websites or software packages that use the data.

An API (application programming interface) is a digital tool that allows one website or software program to “talk” to another website to dig up certain pieces of data. For instance, a lot of people use Tweetdeck to navigate the ongoing bombardment of messages in the Twittersphere. In that case, Tweetdeck is connecting to Twitter, through its API, to receive and order the messages according to the preferences of the user.

In the case of the Open Tree of Life, an API gives researchers advanced access to the data of about two million species, the phylogenies that have been created to illustrate possible relationships between them, and the underlying data and methods of synthesis. “For example, it will be possible to select smaller trees for specific species or find out how many studies there are for a particular node within the database,” says Karen Cranston, the lead investigator of the project. (more…)
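To make the two queries Cranston mentions a bit more concrete, here is a minimal Python sketch that runs them against a toy in-memory tree. Everything in it is an assumption for illustration: the `PARENT` and `STUDIES` tables, the study identifiers, and the function names are hypothetical and are not the Open Tree of Life API, which had not been designed when this post was written.

```python
# Hypothetical toy data: child -> parent edges of a small tree.
PARENT = {
    "Homo sapiens": "Hominidae",
    "Pan troglodytes": "Hominidae",
    "Hominidae": "Primates",
    "Macaca mulatta": "Primates",
    "Primates": "Mammalia",
    "Mus musculus": "Mammalia",
}

# Hypothetical study identifiers attached to internal nodes.
STUDIES = {
    "Hominidae": {"study_1", "study_2"},
    "Primates": {"study_2"},
    "Mammalia": {"study_1", "study_2", "study_3"},
}


def path_to_root(taxon):
    """List the nodes from a taxon up to the root, starting with the taxon."""
    path = [taxon]
    while path[-1] in PARENT:
        path.append(PARENT[path[-1]])
    return path


def induced_subtree(taxa):
    """Return (mrca, edges): the most recent common ancestor of the taxa
    and the child -> parent edges that connect each taxon to it."""
    paths = [path_to_root(t) for t in taxa]
    # Nodes that lie on every path are common ancestors.
    common = set(paths[0]).intersection(*paths[1:])
    # The first common node met while walking upward is the MRCA.
    mrca = next(node for node in paths[0] if node in common)
    edges = {}
    for path in paths:
        for child in path:
            if child == mrca:
                break
            edges[child] = PARENT[child]
    return mrca, edges


def study_count(node):
    """Number of studies attached to a node (0 if none are recorded)."""
    return len(STUDIES.get(node, ()))


if __name__ == "__main__":
    mrca, edges = induced_subtree(["Homo sapiens", "Macaca mulatta"])
    print(mrca, edges)
    print(study_count("Mammalia"))
```

The sketch keeps intermediate nodes such as Hominidae in the smaller tree even when only one of their children was requested; a real service might collapse them.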


Connecting millions of data points in a graph database

Creating ‘Facebook’ for species

The Open Tree of Life database is not just a list with about two million species. Information is added about their special characteristics and possible relationships with others as well. “It may become tens or hundreds of million pieces of data when we are all done.”

Stephen Smith, an evolutionary biology professor at the University of Michigan, is working together with the other researchers of the Open Tree of Life project to develop the programs and tools that will be used to construct the full tree of life. Scientists from all over the world can then synthesize all the information in the database.

“We are currently building the back-end of the Open Tree of Life. We need to create software that allows us to put all our information in a graph network, so that we can easily retrieve the information that researchers are specifically looking for.” (more…)


“We need a sense of ownership of phylogenetic trees”

Where are the fungi datasets?

A couple thousand fungi phylogeny studies have been published in the past twelve years. Clark University postdoc researcher Romina Gazis has gone through all of them. Now she is working on a bigger challenge: finding all the trees and datasets that were the foundation of those studies.

Ideally, all scientists who publish a phylogenetic tree would also deposit the datasets they used to create it in a publicly available online database. That would allow other researchers to synthesize data from different sources to advance the knowledge about relationships between certain species and their evolutionary history.

Unfortunately, most of those datasets are not publicly available. Gazis only found datasets for about a quarter of the two thousand fungi articles she surveyed. “Around 600 studies had tree files available, but not necessarily complete,” she concluded. “Some scientists deposited one but not all the trees.” (more…)


Small portion of phylogenetic data is stored publicly

‘The glass is still pretty empty’

Sometimes you wonder whether the glass is half full or half empty.

But when it is only four percent full, and the other 96 percent is just air, there is only one conclusion: it is time for more.

At least that is what some scientists in the phylogenetic community argue, because only about four percent of all published phylogenies are stored in places such as TreeBASE or Dryad. Their message is quite simple: it is time to bring together more databases with estimations on how species are possibly related to each other.

Several journals in the evolutionary biology field recently adopted policies that encourage or require contributors to make their data publicly available online. Yet this has led to the storage of only a very small percentage of the tens of thousands of phylogenies that have been constructed in the past few decades.

Of course, there are also other ways to receive data that are not stored on the Internet, but those alternatives are commonly not the most efficient routes. For instance, it is possible to send an email to a scientist who published a phylogenetic tree and “sometimes wait for six months to maybe get a response—either with or without the data,” says Keith Crandall, one of the Open Tree of Life investigators and the founding director of the Computational Biology Institute at George Washington University.

(more…)


Puzzling:

Connecting millions of pieces

Creating the entire tree of life is like completing a jigsaw puzzle with more than two million pieces. And to make it even harder, the picture of what the solved puzzle should look like is missing.

No one knows precisely how all pieces are related.

This disparity is unmistakably demonstrated by disagreements between evolutionary biologists about how certain species and branches are linked together. Throughout the years they have created a large variety of trees with specific groups of species that contradict each other. For example, one researcher maintains that species A is the closest living relative of species B, but another scientist thinks that species C is actually most closely related to B. (more…)
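The kind of disagreement described above can be checked mechanically. The following Python sketch is an illustration under stated assumptions, not the project's synthesis method: trees are written as hypothetical nested tuples, both trees are assumed to cover the same set of species, and two groupings are treated as conflicting when they overlap without one containing the other.

```python
def clades(tree):
    """Collect every clade (set of leaf names under one node) of a
    nested-tuple tree such as (("A", "B"), "C")."""
    found = set()

    def walk(node):
        if isinstance(node, str):
            return frozenset([node])
        leaves = frozenset().union(*(walk(child) for child in node))
        found.add(leaves)
        return leaves

    walk(tree)
    return found


def compatible(c1, c2):
    """Two clades fit in one tree if they are nested or share no leaves."""
    return c1 <= c2 or c2 <= c1 or not (c1 & c2)


def conflicts(tree_a, tree_b):
    """List the pairs of clades on which the two trees disagree."""
    return sorted(
        (sorted(x), sorted(y))
        for x in clades(tree_a)
        for y in clades(tree_b)
        if not compatible(x, y)
    )


if __name__ == "__main__":
    first = (("A", "B"), "C")   # one researcher: A is closest to B
    second = ("A", ("B", "C"))  # another: C is closest to B
    print(conflicts(first, second))
```

Here the first tree groups A with B while the second groups B with C, so the two groupings are reported as a conflict; comparing a tree with itself reports none.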


Help!

Wanted: All your favorite trees

With eleven investigators, the Open Tree of Life project is already a large-scale research endeavor. But that does not mean that they can add all 1.9 million known species to a database by themselves. In fact, they are looking for help.

A lot of help.

The main goal of the project is to merge all existing phylogenetic trees into one overarching tree of life. In the past few months, the researchers have been working on software applications to make it possible to store all known species and, more importantly, to specify how they are all linked to each other in evolutionary terms.

(more…)


Quiz time!

Dear Colleagues,

Put on your quiz hats! We need some good questions!

As our team works to build an Open Tree of Life for professionals, we are also working on an educational version of the tree for everyone else, meaning educators, students, and the public in general. This public site will have a FUN QUIZ to test people’s knowledge of evolution, and we need questions for it!

SUBMIT YOUR QUESTIONS HERE

SAMPLE QUESTIONS:
Easy:
• Sponges fall within which major group on the tree of life? (animal, plant, bacteria)
• Which are mushrooms more closely related to? (animals, red algae, plants)
• How many origins of life were there on Earth? (1, 2, 3)
Medium:
• Which organisms represent the greatest biomass on Earth? (bacteria and archaea, mammals, fish)
• How many major groups of organisms are represented in a ham sandwich? (1, 2, 3)
• Genes (i.e. portions of genomes) yield the same estimate for the ToL? (Yes, No, Sometimes)
Expert:
• The top 10 infectious agents on Earth appear where on the tree? (bacteria only, in both bacteria and eukaryotes, in both bacteria and archaea)
• Each gene sequenced and analyzed yields the very same answer for the ToL? (Yes, No, Sometimes)

You can submit up to three questions with this form, but feel free to submit more by starting a new one!
