Species trees

The Compara pipelines use two main species trees.

  1. The NCBI taxonomy (i.e. topology only) in:
    • the Protein-trees pipeline
    • the ncRNA-trees pipeline, where the three sub-trees Eutheria, Sauria, and Clupeocephala are flattened
    • the CAFE pipelines (Gene Gain/Loss trees), with branch lengths coming from the TimeTree database
  2. A tree with branch lengths computed in-house (available for download here) in:

We pass the unmasked whole-genome sequences to Mash to compute pairwise distances, then build the species tree by distance-based neighbour-joining guided by the taxonomy.
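
As a rough illustration of the distance-based step, the sketch below runs plain neighbour-joining on a small distance matrix and prints a Newick tree. It is not the Compara implementation: the taxon labels and distances are made-up stand-ins for Mash output, the function name neighbour_joining is hypothetical, and the taxonomy guidance mentioned above is left out entirely.

```python
from itertools import combinations


def neighbour_joining(names, dist):
    """Build a Newick string by plain neighbour-joining (no taxonomy guidance).

    names: list of leaf labels
    dist:  dict mapping frozenset({a, b}) -> pairwise distance
    """
    nodes = list(names)
    d = dict(dist)

    def get(a, b):
        return 0.0 if a == b else d[frozenset((a, b))]

    while len(nodes) > 2:
        n = len(nodes)
        # Total distance from each node to all the others
        r = {a: sum(get(a, b) for b in nodes) for a in nodes}
        # Join the pair that minimises the standard NJ criterion Q
        i, j = min(
            combinations(nodes, 2),
            key=lambda p: (n - 2) * get(p[0], p[1]) - r[p[0]] - r[p[1]],
        )
        dij = get(i, j)
        # Branch lengths from the new internal node to i and j
        li = 0.5 * dij + (r[i] - r[j]) / (2 * (n - 2))
        lj = dij - li
        new = f"({i}:{li:.4f},{j}:{lj:.4f})"
        # Distances from the new node to every remaining node
        for k in nodes:
            if k not in (i, j):
                d[frozenset((new, k))] = 0.5 * (get(i, k) + get(j, k) - dij)
        nodes = [k for k in nodes if k not in (i, j)] + [new]

    # NJ trees are unrooted; splitting the last edge in half is an arbitrary choice
    a, b = nodes
    half = get(a, b) / 2
    return f"({a}:{half:.4f},{b}:{half:.4f});"


# Hypothetical pairwise distances standing in for Mash output
names = ["A", "B", "C", "D"]
dist = {
    frozenset(("A", "B")): 3.0,
    frozenset(("A", "C")): 5.0,
    frozenset(("A", "D")): 6.0,
    frozenset(("B", "C")): 6.0,
    frozenset(("B", "D")): 7.0,
    frozenset(("C", "D")): 7.0,
}
newick = neighbour_joining(names, dist)
print(newick)
```

Real Mash distances lie between 0 and 1 and are rarely perfectly additive, so plain neighbour-joining can return slightly negative branch lengths on real data; constraining the joins with the taxonomy is a separate step that this sketch does not attempt.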