Compara database

From VectorBase Development

from June 2006

correct as of May 2008

Compara v.46

  • date: Dec-07 / Jan-08
  • who? K.Megy to Rob
  • what's new:
    • Culex gene set
  • comments:
    • DNA/DNA: 3 mosquitoes, Drosophila
    • Orthologs/paralogs: 3 mosquitoes, Drosophila, human and C. elegans
    • uses the 6-species tree
    • 'new' orthologs/paralogs pipeline

Compara v.45

  • date: Jul-07
  • who? K.Megy to EO Stinson
  • what's new:
    • new Anopheles gene set
  • comments:
    • extract five species from Compara@Ensembl
    • uses the 5-species tree
    • 'new' orthologs/paralogs pipeline

Compara v.41

  • date: 10-Nov-06
  • who? K.Megy to EO Stinson
  • what's new:
    • run Compara pipeline with our 5 species
    • uses the 5-species tree
  • comments:
    • more 'paralogs' than with the all-species tree. The Compara pipeline takes the longest transcript of each gene and clusters these transcripts across species. If a cluster is too big, which happens often when using all the Ensembl species, it is divided in 2, and the two resulting clusters are then treated independently. So two genes are identified as paralogs (within_species or between_species) if they were put in the same cluster, but do not appear to be related at all if they were put in different clusters. This artefact should be solved in the next Compara release by using NJ-tree. In the VB case, most of the time we will have small clusters, so they won't be divided and all the genes that are related will be shown as related.
    • more 'orthologs' and fewer 'apparent_orthologs'. Same reason as above, and because there are fewer species, fewer 'fake' gene losses have to be created.
    • missing Aedes homolog for some Anopheles genes (e.g. ENSANGG00000000004, a histone). The homolog was there when using the all-species tree. Explanation-1: both genes might have been put in different clusters (a hypothesis, but I'm not convinced by it!). The genes were found as paralogs and not orthologs; the tree is very flat for the histones, and in this case cDNA should be used instead of proteins. Will change when using NJ-tree (next release).
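
The cluster-splitting artefact described in the comments above can be illustrated with a toy sketch. This is not the Compara pipeline code: the halving rule, the `max_size` threshold, the function names and the gene labels are all assumptions made up for illustration. It only shows why two genes stop appearing related once their cluster is divided in 2.

```python
from itertools import combinations


def split_if_too_big(cluster, max_size):
    """Toy rule (assumption): cut a cluster in 2 when it exceeds max_size."""
    if len(cluster) <= max_size:
        return [cluster]
    mid = len(cluster) // 2
    return [cluster[:mid], cluster[mid:]]


def related_pairs(clusters):
    """Only genes that share a cluster can be reported as related."""
    pairs = set()
    for cluster in clusters:
        for a, b in combinations(cluster, 2):
            pairs.add(frozenset((a, b)))
    return pairs


# Hypothetical gene labels, one representative (longest) transcript per gene.
cluster = ["anopheles_g1", "aedes_g1", "drosophila_g1", "human_g1"]

# Small cluster relative to the threshold: kept whole, all pairs are visible.
kept_whole = related_pairs(split_if_too_big(cluster, max_size=10))

# Same genes, but the cluster counts as "too big": it is divided in 2 and
# the two halves are treated independently.
divided = related_pairs(split_if_too_big(cluster, max_size=3))

# This pair is lost because the two genes land in different halves.
lost_pair = frozenset(("anopheles_g1", "human_g1"))
print(lost_pair in kept_whole, lost_pair in divided)
```

With few species (the VB case) clusters tend to stay under the threshold, which corresponds to the first call: nothing is divided and every related pair is reported.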

Compara v.40

  • date: Aug-06
  • who? K.Megy to EO Stinson
  • what's new:
    • extract five species from Compara@Ensembl
    • uses the all-species tree
    • 'new' orthologs/paralogs pipeline

Compara v.39

  • date: Jun-06
  • who? K.Megy to EO Stinson
  • what's new:
    • copy of Compara@Ensembl
    • 'old' orthologs/paralogs pipeline