Bgee: Gene Expression Evolution

This is a previous release of Bgee. Access the latest version of Bgee.

News

2012 10th Dec

New version of the database (Bgee Release 12)

  • RNA-Seq data: this release introduces the use of RNA-Seq data in Bgee!
  • Affymetrix dataset: our filtered dataset is now available for download.
  • As usual, all data have been updated to the latest version of the data sources.

RNA-Seq data

Bgee now provides RNA-Seq based present/absent expression calls. We have integrated an experimental dataset of 33 RNA-Seq libraries in human and mouse, from the experiment GSE30352. Our approach is based on Hebenstreit et al., Mol. Syst. Biol., 2011, which uses the intergenic background transcription level to distinguish biologically relevant expression signal from experimental noise or background activity of the transcription machinery. More details are available in our documentation.
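As a rough illustration of background-based calling, the Python sketch below labels a gene "present" when its expression level exceeds a cutoff taken from the distribution of intergenic levels. It is not the Bgee pipeline: the percentile rule, the function name call_expression, and the toy values are all assumptions made for the example.

```python
def call_expression(gene_levels, intergenic_levels, percentile=95):
    """Label each gene 'present' or 'absent' against an intergenic background.

    The cutoff rule (a percentile of the intergenic levels) is an assumption
    chosen for illustration, not the procedure used by Bgee.
    """
    if not intergenic_levels:
        raise ValueError("at least one intergenic measurement is required")
    background = sorted(intergenic_levels)
    # Nearest-rank style index into the sorted background levels.
    index = min(len(background) - 1, percentile * len(background) // 100)
    threshold = background[index]
    calls = {
        gene: "present" if level > threshold else "absent"
        for gene, level in gene_levels.items()
    }
    return threshold, calls


# Toy expression levels (arbitrary units); gene names are hypothetical.
intergenic = [0.0, 0.1, 0.1, 0.2, 0.2, 0.3, 0.3, 0.4, 0.5, 1.0]
genes = {"geneA": 12.5, "geneB": 0.2, "geneC": 1.4}
threshold, calls = call_expression(genes, intergenic, percentile=80)
print(threshold, calls)
```

With these toy values the cutoff is 0.5, so geneA and geneC are called "present" and geneB "absent". A stricter percentile trades sensitivity for fewer calls driven by background transcription.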

You can now, for instance, retrieve all human genes with expression based on RNA-Seq data, or all mouse genes with expression based on RNA-Seq data, or find homologous genes in human and mouse with homologous expression in the cerebellum, based on RNA-Seq data. As usual, all our data can be directly downloaded; a description of the data structure is provided in the documentation.

This procedure is still experimental and might be subject to change in the next releases. Please note that "absent" expression calls are currently available only through direct data download.

Filtered Affymetrix dataset

In release 11 of Bgee, we introduced new quality controls and identified as much as 13.6% duplicated content in our Affymetrix dataset from GEO and ArrayExpress. We now provide the raw data of these 12,875 normal, annotated, high-quality, duplicate-free Affymetrix chips for direct download on an FTP server.

Other changes

  • Bgee is now based on Ensembl release 69.
  • All data updated to the latest versions of the data sources.
  • Update of the vHOG and HOG ontologies.
  • We moved our downloadable data to an FTP server. Also, the data for the current release will from now on always be available in the same directory, so that the links to download the latest version of our data will always be the same. Check these changes out in our download section.
  • Modification of the autocompletion tool to search for genes, on the basic expression search tool: searches are now triggered after entering two letters (three letters in previous versions); searches are now also performed on gene IDs; when selecting a gene, the corresponding species is now automatically selected; a warning message is now displayed if the selection of species is incompatible with the selection of genes.

2012 26th Jul

New version of the database (Bgee Release 11)

This release notably introduces major changes in the Affymetrix data analyses, and provides a unique dataset of duplicate-free, high-quality Affymetrix data. It also introduces a new human developmental ontology, to which human data have been reannotated.

Advanced users should read this release note carefully.

The database is now based on [Ensembl release 67] (May 2012), and is updated to the latest version of all data sources.

Major modifications of the Affymetrix analyses

Affymetrix chips are now filtered before inclusion into Bgee, based on new quality measurements, on the identification of duplicated content, and on the control of chip types for incompatibility or errors:

We have developed a new quality measurement to filter Affymetrix chips, used in combination with MAS5 percent present:

  • arIQR score: the distribution of the average rank of probesets, computed using the rank of probes, based on their signal intensity ("Inter Quartile Range of average rank").
  • percent present score: percentage of probesets identified as "present" by MAS5.

We have determined thresholds for each chip type independently, based on the distribution of the quality scores, computed from all data available in [GEO]. These measurements seem to be better estimators of quality than other methods previously used. More details are available in the documentation. The quality score thresholds for each chip type used in Bgee are available here, and from the database (see Chip types table).
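The two scores and the per-chip-type filtering can be sketched in Python as follows. This is a simplified reading of the definitions above rather than the Bgee implementation: the tie handling, the quartile method, the direction of the thresholds (chips are kept when both scores are at or above a minimum), and every name and value are assumptions.

```python
from statistics import quantiles


def ariqr(probe_signal, probe_to_probeset):
    """Interquartile range of the average probe rank per probeset."""
    # Rank probes by signal intensity (1 = weakest); ties keep input order.
    ordered = sorted(probe_signal, key=probe_signal.get)
    rank = {probe: position for position, probe in enumerate(ordered, start=1)}

    # Average the ranks of the probes belonging to each probeset.
    ranks_by_probeset = {}
    for probe, probeset in probe_to_probeset.items():
        ranks_by_probeset.setdefault(probeset, []).append(rank[probe])
    average_ranks = [sum(r) / len(r) for r in ranks_by_probeset.values()]

    # Spread of those averages: third quartile minus first quartile.
    q1, _median, q3 = quantiles(average_ranks, n=4)
    return q3 - q1


def percent_present(mas5_calls):
    """Percentage of probesets flagged "present" (calls supplied by MAS5)."""
    present = sum(1 for call in mas5_calls.values() if call == "present")
    return 100.0 * present / len(mas5_calls)


def passes_quality(chip, thresholds):
    """Keep a chip only if both scores reach its chip type's minimums."""
    limits = thresholds[chip["chip_type"]]
    return (chip["ariqr"] >= limits["min_ariqr"]
            and chip["percent_present"] >= limits["min_percent_present"])


# Toy data: eight probes grouped into four probesets (all names hypothetical).
probe_signal = {f"p{i}": 10.0 * i for i in range(1, 9)}
probe_to_probeset = {f"p{i}": "ABCD"[(i - 1) // 2] for i in range(1, 9)}
mas5_calls = {"A": "absent", "B": "present", "C": "present", "D": "present"}

chip = {
    "chip_type": "hypothetical_type",
    "ariqr": ariqr(probe_signal, probe_to_probeset),
    "percent_present": percent_present(mas5_calls),
}
thresholds = {"hypothetical_type": {"min_ariqr": 2.0, "min_percent_present": 50.0}}
keep = passes_quality(chip, thresholds)
print(chip, keep)
```

Keeping the thresholds in a table keyed by chip type mirrors the point made above: the cutoffs are set independently for each chip type, so the same score can pass on one platform and fail on another.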

We have identified duplicated content in the source databases: fully or partially duplicated experiments from independent data submissions, duplicated chips used in several experiments, duplicated chips inside experiments. We have implemented a procedure to identify and remove such duplicates, detailed in the documentation.
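One simple way to spot fully duplicated chips is to compare a fingerprint of their content, as in the Python sketch below. It only catches exact duplicates and is not the procedure described in the documentation: the hashing approach, the function names, and the chip IDs are hypothetical.

```python
import hashlib


def fingerprint(signals):
    """Hash a chip's probeset signals, independent of input ordering."""
    digest = hashlib.sha256()
    for probeset, value in sorted(signals.items()):
        digest.update(f"{probeset}\t{value!r}\n".encode("utf-8"))
    return digest.hexdigest()


def find_duplicates(chips):
    """Return (duplicate_id, first_seen_id) pairs for identical chips."""
    first_seen = {}
    duplicates = []
    for chip_id, signals in chips.items():
        key = fingerprint(signals)
        if key in first_seen:
            duplicates.append((chip_id, first_seen[key]))
        else:
            first_seen[key] = chip_id
    return duplicates


# Hypothetical chips: chip_2 repeats the content of chip_1.
chips = {
    "chip_1": {"ps1": 120.5, "ps2": 33.0},
    "chip_2": {"ps2": 33.0, "ps1": 120.5},
    "chip_3": {"ps1": 98.1, "ps2": 40.2},
}
print(find_duplicates(chips))
```

Partially duplicated experiments would need a finer comparison, for example fingerprinting each chip and then checking how many fingerprints two experiments share.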

We have identified chip types incompatible with the analyses used in Bgee, and have removed them. The list of incompatible chip types is available here, and from the database (see Chip types table). We have also identified CEL files for which the chip type provided in the experiment description was wrong. We have corrected this problem using the CDF name present in CEL files.

Some Affymetrix chip and experiment IDs used in Bgee 10 were obsolete, because they have changed in the source databases. All IDs used in Bgee have been resynchronized with the source databases. The correspondence between the IDs used previously and those used from Bgee 11 onward is available here. We now use GEO IDs by default, and ArrayExpress IDs for experiments not present in GEO.

Advanced users: because of this modification, the field affymetrixChipId is no longer unique in the Bgee database. This has led to some database schema modifications. See the affymetrix chips table description in the documentation for more details.

New human developmental ontology

We have developed a new human developmental ontology, in coordination with [neXtProt]. All human Affymetrix data have been reannotated to this more granular ontology. EST data will be reannotated for the next release. This results in some modifications in the differential expression analyses; see the documentation for more details.

Update of the Drosophila in situ data from BDGP

Bgee was still based on the [outdated release 2] of BDGP. Bgee is now based on [the release 3 of BDGP], but currently includes only a subset of the data, because of problems regarding term annotations. We will try to improve this import for future releases.

Other changes

  • For advanced users: URLs to perform API calls to BgeeMart have changed. Documentation on performing API calls should be provided in the future. Meanwhile, manual queries should first be performed to obtain correctly formatted URLs.
  • Bgee now uses eVoc 2.9 instead of eVoc 2.7 for the human adult anatomical ontology EV.
  • The vHOG and HOG ontologies were updated using the latest versions of the species-specific anatomical ontologies used, and are available from the download section.

2011 1st Dec

New version of the database (Bgee Release 10)

  • The database is now based on [Ensembl release 64] (September 2011).
  • Some incorrect mappings from Affymetrix probesets to Ensembl genes resulted in the absence of a chip type in Bgee (Affymetrix GeneChip Mouse Genome 430 2.0). This has been corrected.
  • In situ assays from MGI that include anatomical structures not present in the ontologies used by Bgee are not included anymore. They were previously partially included.
  • As usual, all data updated to the latest version of the data sources. Bgee now includes 13,560 Affymetrix chips and 3,364 EST libraries annotated by our curators, as well as 231,992 in situ evidences.

Update of the multi-species HOG and vHOG ontologies

  • The homology-strict multi-species HOG ontology has been updated to the latest versions of the linked species-specific anatomical ontologies, and includes some new terms. The ontology now includes 1,175 terms and involves 5,192 species-specific terms. Check it out in our download section.
  • The corresponding CARO-compliant, vertebrates-only, homology-strict, vHOG ontology has also been updated accordingly. The ontology now includes 1,184 terms and involves 5,129 species-specific terms. Check it out in our download section.

2011 27th May

New version of the database (Bgee Release 09)

  • The database is now based on [Ensembl release 62] (13 April 2011).
  • The namespace ID rules of developmental ontologies used in Bgee have been changed. Previously, these ontologies included a mix of ID rules, to distinguish between terms created by Bgee curators, and by source databases (for instance, "OGES:000007: Hatching" created by Bgee, or "ZFS:0000033: Hatching:Long-pec" created by ZFIN). IDs will now be consistent, and xrefs to original data sources will be provided when available. A mapping file between previous and new IDs is available here.
  • 525 Affymetrix chips from [ArrayExpress] have been manually curated and annotated to ontologies by our curators, for human (18), mouse (290), fly (199), and zebrafish (18).
  • 9,231 in situ evidences have been added, coming from [ZFIN] for zebrafish (572), and [MGI] for mouse (8,659).
  • 52 EST libraries from [UniGene] for fly have been annotated and added.
  • As usual, all data updated to the latest version of the data sources.

Update of the multi-species HOG ontology

  • The homology-strict multi-species HOG ontology has been updated to the latest versions of the linked species-specific anatomical ontologies, and 60 terms have also been added by our curators. The HOG ontology now includes 1,167 terms, involving 5,180 species-specific anatomical structures from 5 species. Check it out in our download section.
  • The corresponding CARO-compliant, vertebrates-only, homology-strict, vHOG ontology has also been updated accordingly. Check it out in our download section.

2011 20th Jan

New version of the database (Bgee Release 08)

  • The database is now based on [Ensembl release 60] (8 November 2010).
  • Data quality assignment is now more stringent, and very low quality data are now removed. For more details, see the data analyses documentation, Affymetrix data description (section "Affymetrix probesets"), and in situ data description (section "in situ spots").
  • 1,182 Affymetrix chips from [ArrayExpress] have been manually curated and annotated to ontologies by our curators, for the 5 species included in Bgee.
  • 123,739 in situ evidences have been added, coming from [ZFIN], [Xenbase], [MGI], and [BDGP].
  • 52 EST libraries from [UniGene] for Drosophila have been annotated and added.
  • Bug correction: some of the new in situ evidences come from fixing a bug in our pipeline that affected the import of data from [Xenbase].
  • Other bug corrections: miRNA gene families are now properly defined, and selecting the gene biotype "miRNAs" from the interface should return proper results.
  • As usual, all data updated to the latest version of the data sources.

New multi-species HOG ontology

  • The homology-strict multi-species HOG ontology has been updated and now includes 1,107 terms, involving 4,986 species-specific anatomical structures from 5 species. New references have been added. Check it out in our download section.
  • We have also developed a CARO-compliant, vertebrates-only, homology-strict ontology called vHOG. Check it out in our download section.

2010 27th Apr

New version of the database (Bgee Release 07)

  • The database is now based on [Ensembl release 57] (3 March 2010).
  • Bgee now includes in situ hybridizations from [Xenbase] for Xenopus tropicalis.
  • Bgee now includes in situ hybridizations from [BDGP] and Affymetrix data from [ArrayExpress] for Drosophila melanogaster.
  • Bgee now includes "over-expression" and "no expression" information. These data are not yet available from the website interface, but can be retrieved from the download section. More details can be found in the data analyses documentation and the files available for download description.
  • The multi-species HOG ontology has been updated to be more "homology-strict": only historical homology relationships are now used to build the ontology. The HOG ontology now includes 1,002 terms, with 1,411 relationships amongst them, and involving 4,459 species-specific anatomical structures from 5 species. All undefined relationships ("broader_than") amongst terms have been manually reviewed.
  • A new expression search tool has been developed, intended to make single-species and cross-species queries easier.
  • 3,687 Affymetrix chips from [ArrayExpress] have been manually annotated and added into Bgee. Bgee now includes 11,853 Affymetrix chips.
  • 14,608 in situ experiments, mainly for Drosophila and Xenopus, have been added. Bgee now includes 30,599 in situ experiments.
  • Expression data and species-specific ontologies updated.

The files available for download are up-to-date.

2009 17th Sep Major update of Bgee, regarding the data, the application, and new services proposed!

New version of the database (Bgee Release 06)

  • The database is now based on [Ensembl release 55] (14 July 2009).
  • Drosophila melanogaster has been added to the database, including anatomy and development, and EST expression data.
  • Bgee now includes expression data for miRNAs. A "data parameters" form allows you to display expression data for a specific gene type, including miRNAs.
  • 115 EST libraries from [UniGene] have been manually annotated and added into Bgee. Bgee now includes 3,364 EST libraries.
  • 33 in situ experiments from [ZFIN] have been added. Bgee now includes 13,534 in situ experiments.
  • The Homologous Organs Groups (HOGs) have been updated: Bgee currently integrates 1,286 HOGs, which involve 5,509 anatomical structures (195 more than in the previous release).

The files available for download have been updated.

New DAS webservice

Bgee now offers a DAS webservice. You can access it, for instance, from [Ensembl]:

  1. On a gene page (for instance [zebrafish dlx4a]), click on the link "Manage your data" on the left, click on the link "Attach DAS" on the popup window.
  2. Enter "bgee" in "Filter sources" and click "Next".
  3. Wait for the Bgee DAS to appear.
  4. Click "Configure folder" and check "Bgee" in the checkbox.
  5. Click "Next" and wait for a confirmation message to appear.

You will then be able to display expression data from Bgee on Ensembl, by using the menu "External data" on the left. See the documentation in Ensembl for more information: http://www.ensembl.org/info/data/ensembl_das.html

New version of the application

  • Bgee should now be faster: many algorithms have been greatly optimized.
  • Bgee now displays a hold message on pages involving intensive computations.

2009 25th Jun

New version of the database (Bgee Release 05)

  • The database is now based on Ensembl Release 54.
  • 3314 Affymetrix chips from ArrayExpress, for human and mouse, have been manually annotated and added into Bgee. Bgee now includes 8166 Affymetrix chips.
  • Bgee now includes in situ hybridization data for the adult mouse from [MGI].
  • The Homologous Organs Groups (HOGs) have been updated: Bgee currently integrates 1241 HOGs, which involve 5314 anatomical structures (1253 more than in the previous release).

The files available for download have been updated.

2009 19th Feb

New version of the database (Bgee Release 03)

  • The database is now based on Ensembl Release 52.
  • Bgee now includes in situ hybridization data for the mouse from [MGI].
  • The Homologous Organs Groups (HOGs) have been updated: Bgee currently integrates 1003 HOGs, which involve 4061 anatomical structures.

The files available for download have been updated.

New interfaces

The interfaces have been updated to make them more understandable, notably by adding tips explaining how to use the Bgee tools.

2008 30th Nov Important performance issues when using BgeeMart have been fixed.
2008 24th Nov Major update of Bgee, regarding both the data and the application!

New version of the database (Bgee Release 02)

  • 547 Homologous Organs Groups have been manually reviewed and extended. 2,124 anatomical structures are now directly involved in a homology relationship.
  • 4403 Affymetrix chips have been manually annotated, and expression data added in the database.
  • The database is now based on EnsEMBL Release 50.

Data are now fully available: new download section

  • The full content of the Bgee database is now available as TSV files.
  • The homology relationships are now provided as an OBO file with an association file.
  • The Homolonto software and source code, used to generate the homology relationships, are released.
  • The developmental ontologies and the ontology of the metastages in OBO format, and the mapping between developmental stages and metastages, are available.

New data query and export facilities: BgeeMart

BgeeMart is a search engine inspired by BioMart. It allows you to search for, and compare, gene expression across species. Data retrieved can be exported as HTML, TSV or CSV.

More detailed information about expression data

The pages displaying information about expression data have been extensively modified to provide more details and better linking to the original data sources. See for instance all the expression data retrieved for the gene ENSDARG00000016454.

New documentation

Documentation has been added. It currently describes all the data stored in Bgee, and explains the process of generating homology relationships and the statistical analysis applied to the data. The "About" page has also been updated to be more precise.

Other modifications

  • The basic search engine has been optimized.
  • When comparing the expression patterns of homologous genes, a distinction is now made between "no expression data" and "no gene expression detection". See for instance the gene family fam52v00000000374.
  • On the gene pages and gene family pages, a link is now provided to retrieve all the expression data concerning this gene / gene family. See for instance the page for the gene ENSDARG00000016454, and the gene family fam52v00000000374.
  • When retrieving the genes expressed in an anatomical structure or at a developmental stage, from an ontology browsing page, the page now uses a BgeeMart query, providing more information and export facilities. See for instance the zebrafish genes expressed at the stage embryo.
  • Bugs when using Internet Explorer have been fixed.
2008 12th Sep Help tooltips updated with more accurate descriptions.
Update of the interface of the gene family pages (e.g. fam52v00000000374).
2008 30th Jul Expression data (EST, Affymetrix, ...) can now be retrieved, with links to the source databases.
2008 25th Jun Help tooltips added
2008 9th Jun In situ hybridization data from ZFIN added.
New Affymetrix data from ArrayExpress annotated and added.
Gene expression patterns are now displayed as a developmental ontology browsing, an anatomical ontology browsing, and a table view (no longer only as an anatomical ontology browsing).
2008 11th Apr Bug fixes.
Interface modifications.
SQL queries optimization to speed up page display.
2008 21st Mar New version of Bgee.
Species Xenopus tropicalis added.
2007 27th Nov Easier search for homologous organs / Developmental stages mapping
2007 23rd Nov Gene family view added
2007 21st Nov Homology relationships between organs updated.
2007 16th Nov News module added (Enjoy!).
2007 12th Oct Mouse adult anatomy ontology added.
2007 11th Oct Homology relationships between organs updated.
2007 31st Aug Affymetrix microarray data added.