TopAnat is an anatomical entity enrichment analysis tool based on the topGO R package.
It is similar to a GO enrichment test, but rather than using Gene Ontology annotations it is based on anatomical Uberon annotations manually curated by Bgee. For example, given a set of genes that are up-regulated under certain conditions, TopAnat will find the anatomical entities in which expression of the genes in that set is over- or under-represented.
On top of the page from left to right you have
The main section is the Gene list. This is where identifiers of your genes of interest must be entered. Be careful to provide gene identifiers (e.g. ENSG00000244734) and not gene names (e.g. HBB).
The Advanced Options section is closed by default.
To open that section, click on the corresponding dark grey banner. It contains options allowing you to tune both the Bgee data used to run the enrichment analysis and the parameters of the algorithm itself. Read the Advanced options section for more details.
Below the Advanced Options section, the Email field allows you to receive an email once the analysis is complete. The Job description field allows you to give a title to your analysis.
At the bottom of the page the Submit your Job button allows you to submit an analysis. It is greyed out by default and becomes clickable once genes have been entered.
The entry point of TopAnat is a set of genes of interest from a single species. In this quick start tutorial, we will focus on a set of pigmentation genes from rabbits. TopAnat will be used to detect the anatomical entities in which the expression of those genes is over- or under-represented.
TopAnat uses gene identifiers (e.g. ENSG00000244734) and automatically detects the species of interest. You have to provide one gene identifier per line without space or delimiter, as shown in the screenshot below. The list of gene identifiers used in this example is available here.
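The one-identifier-per-line rule can be checked before pasting a list into the form. The sketch below is only an illustration: the function name `check_gene_list` and the character pattern are assumptions made for this example, not the validation that TopAnat itself applies.

```python
import re

# Assumption: an identifier is a run of letters, digits, dots, underscores
# or hyphens. This pattern is illustrative, not TopAnat's own rule.
GENE_ID_PATTERN = re.compile(r"[A-Za-z0-9._-]+")


def check_gene_list(text):
    """Split pasted text into lines and separate clean IDs from suspicious lines."""
    valid, rejected = [], []
    for line in text.splitlines():
        if line == "":
            continue  # blank lines are ignored
        if GENE_ID_PATTERN.fullmatch(line):
            valid.append(line)
        else:
            rejected.append(line)  # spaces, quotes, commas, etc.
    return valid, rejected


# Example identifiers chosen only for illustration.
pasted = (
    "ENSG00000244734\n"
    "ENSG00000141510, \n"
    '"ENSG00000139618"\n'
    "ENSG00000139618\n"
)
valid_ids, rejected_lines = check_gene_list(pasted)
print(valid_ids)
print(rejected_lines)
```

Lines that contain a trailing comma, a space, or quotes end up in `rejected_lines`, so they can be cleaned before submission.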
Once you enter your list of genes, the web interface is updated.
Above your gene list, you can now see a sentence stating the number of genes you entered and the corresponding species. You can also see a picture of the species.
Additionally, two new subsections have appeared: Background and Analysis options.
The Background subsection allows the user to select the universe of the analysis, which is described in the Properly choose your background section of this documentation. In our example, we keep the default background which corresponds to all genes from the species.
The Analysis options subsection allows the user to limit the analysis to expression data coming from a subset of the data types integrated in Bgee. In this example, we want to use as much data as possible, so we do not modify the default behavior, which is to select expression data coming from all available data types. To remove one data type from your TopAnat analysis, uncheck the corresponding checkbox.
Now add your email address to receive an email once the processing of the analysis is over, and enter the title "Pigmentation genes in rabbit" to easily find the analysis when using the Recent jobs button. This title will also be used as the subject of the email you will receive.
You are now ready to run TopAnat. Click on the Submit your job button and wait for your analysis to be processed on our server.
A TopAnat analysis can take up to 1 hour to finish processing. To leave the page without losing the results you have 2 options:
Enter your email address: you will then receive an email containing a link to the results of your analysis.
Wait for the page shown below to appear, then bookmark its permanent URL by clicking on Copy permanent link in the footer of the page.
Once the processing is complete you will automatically be redirected to the result section of the web interface.
The header of this results section consists of a blue banner containing a sentence stating that the request was successful, the number of results, and the number of analyses launched.
Then, on top of the result table, you have the title of your analysis written in red.
Below the title on the left side, there is a light red button that allows you to download an archive containing the results of your analysis, as well as all the data to reproduce them. The following files are included in the download:
Below the title in the middle, there is a Filter field which allows you to perform a case-sensitive filter on all columns of the result table. For instance, in the Pigmentation genes in rabbit results coming from the analysis of the Quick start section, filtering with the word skin will return all anatomical entities containing the word skin and will show 5 results in the table.
To the right of the Filter field, a TSV button allows you to download the results table as a tabulated file.
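The same kind of case-sensitive filter can be applied offline to the downloaded TSV file. The following is a minimal sketch, assuming a tab-separated file with one header row; the function name `filter_rows`, the column names and the rows are hypothetical and do not reproduce the real export.

```python
import csv
import io


def filter_rows(tsv_text, needle):
    """Return the header and the rows in which any cell contains `needle`.

    The comparison is case-sensitive, like the Filter field described above.
    """
    reader = csv.reader(io.StringIO(tsv_text), delimiter="\t")
    header = next(reader)
    matches = [row for row in reader if any(needle in cell for cell in row)]
    return header, matches


# Hypothetical two-column excerpt; the real file has more columns.
example_tsv = (
    "anatomical_id\tanatomical_name\n"
    "ID:1\tskin of body\n"
    "ID:2\tSkin gland\n"
    "ID:3\tretina\n"
)
header, rows = filter_rows(example_tsv, "skin")
print(header)
print(rows)
```

Because the match is case-sensitive, the hypothetical row "Skin gland" is not returned when filtering on "skin".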
In the same line, on the right side, you can change the number of lines visible in the results table. The default value is 20 but can be increased up to 1000.
The result table is composed of 8 columns:
The background, also called the universe, corresponds to the list of genes you want to consider in your analysis.
By default, the gene universe considered for the TopAnat enrichment analysis is all genes with data in Bgee for the selected species.
Let's imagine that you want to answer the question: in which anatomical entities is expression enriched for human genes that are both present and differentially expressed in testis and ovary?
In this naive example, your TopAnat Gene list will be the list of differentially expressed genes, and your background will consist of the list of all genes expressed in both testis and ovary.
It is possible to provide a custom gene universe as a list of gene IDs. To do so, click on the Custom data button.
As for your Gene list, you have to enter one gene ID per line without spaces, quotes, or any delimiter. All gene IDs present in the foreground (your Gene list) must also be present in the background.
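The requirement that every foreground gene also appears in the background can be verified before submitting a custom universe. This is a minimal sketch; the helper name `missing_from_background` and the placeholder identifiers are assumptions made for the example.

```python
def missing_from_background(foreground, background):
    """Return the foreground IDs that are absent from the background, in input order."""
    background_set = set(background)
    return [gene for gene in foreground if gene not in background_set]


# Placeholder identifiers, not real gene IDs.
foreground = ["geneA", "geneB", "geneC"]
background = ["geneA", "geneC", "geneD", "geneE"]

missing = missing_from_background(foreground, background)
print(missing)
```

An empty result means the two lists are consistent; otherwise the listed IDs should be added to the background or removed from the Gene list.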
There are 2 types of advanced options. The first is related to the filtering of expression data used to run the enrichment test and the second is to tune the parameters of the enrichment algorithm itself.
By default, all developmental and life stages are considered for the enrichment analysis.
It is possible to remove a developmental stage by clicking on Custom stages and then unchecking the stage(s) you are not interested in among the embryonic and post-embryonic stages.
For each expression call, Bgee assigns a level of confidence to the call: silver or gold. The Data quality option allows you to specify whether the analysis should be based on data of any quality level (default) or on high-quality data (Gold level) only. To limit the analysis to high-quality calls, click on the Gold confidence button.
Decorrelation algorithms take into account the topology of the anatomical ontology in order to decrease the number of false positives and of highly general terms in the results, which are caused by the inheritance problem. A precise description of these algorithms can be found in the topGO documentation. Please note that using these decorrelation methods greatly increases the analysis time. By default, a Fisher test without any decorrelation is performed.
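To give an intuition of a Fisher test without decorrelation, the sketch below computes a one-sided p-value for a single anatomical entity from four counts. It is an illustration under stated assumptions, not the topGO implementation: it covers the over-representation side only, the function name is hypothetical, and the counts are invented.

```python
from math import comb


def over_representation_p_value(universe, annotated, selected, overlap):
    """One-sided Fisher exact test: P(X >= overlap) under the hypergeometric law.

    universe  - number of genes in the background
    annotated - background genes expressed in the anatomical entity
    selected  - number of genes in the gene list
    overlap   - genes of the list expressed in the anatomical entity
    """
    total = comb(universe, selected)
    upper = min(annotated, selected)
    tail = sum(
        comb(annotated, k) * comb(universe - annotated, selected - k)
        for k in range(overlap, upper + 1)
    )
    return tail / total


# Invented counts: 20 background genes, 5 of them expressed in the entity,
# a list of 4 genes, 3 of which are expressed in the entity.
p_value = over_representation_p_value(universe=20, annotated=5, selected=4, overlap=3)
print(round(p_value, 4))  # about 0.032
```

Each anatomical entity is tested independently in this sketch, which is why general terms that inherit genes from their descendants can come out significant when no decorrelation is applied.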
This parameter prunes from the anatomical ontology the terms for which the number of genes with data is lower than this cutoff.
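The effect of this cutoff can be pictured as a simple filter on the number of genes with data per term. The sketch below is an assumption-laden illustration: the function name `prune_terms`, the term names and the gene names are placeholders, and the real pruning operates on the ontology graph rather than on a flat mapping.

```python
def prune_terms(genes_per_term, node_size):
    """Keep only the terms that have at least `node_size` genes with data."""
    return {
        term: genes
        for term, genes in genes_per_term.items()
        if len(genes) >= node_size
    }


# Placeholder terms and genes.
genes_per_term = {
    "termA": {"g1", "g2", "g3"},
    "termB": {"g1"},
    "termC": {"g2", "g4"},
}
kept = prune_terms(genes_per_term, node_size=2)
print(sorted(kept))
```

With a cutoff of 2, the placeholder `termB` is removed because only one gene has data for it.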
The number of significant nodes to be displayed in the generated graph of results. The parameter has a visualization purpose only and has no impact on the results of the analysis.
Anatomical terms with an FDR higher than this threshold will not be considered as significant.
Anatomical terms with a p-value higher than this threshold will not be considered as significant.
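The two thresholds above can be combined as in the following sketch. It assumes that the FDR is obtained with a Benjamini-Hochberg adjustment of the p-values, which is an assumption of this example rather than something stated in this documentation; the term names, p-values and threshold values are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values, in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value to the smallest to enforce monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def significant_terms(terms, p_values, fdr_threshold, p_threshold):
    """Keep terms whose FDR and p-value are both at or below their thresholds."""
    fdr = benjamini_hochberg(p_values)
    return [
        term
        for term, p, q in zip(terms, p_values, fdr)
        if q <= fdr_threshold and p <= p_threshold
    ]


# Hypothetical terms, p-values and thresholds.
terms = ["termA", "termB", "termC", "termD"]
p_values = [0.01, 0.04, 0.03, 0.20]
kept_terms = significant_terms(terms, p_values, fdr_threshold=0.1, p_threshold=0.035)
print(kept_terms)
```

Here the hypothetical `termB` passes the FDR threshold but is excluded by the p-value threshold, which shows that the two filters act independently.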