This page describes large datasets of interest in Bgee, specifically how they were annotated and how to access the data.
In addition to the continuous growth of transcriptomics datasets, some projects produce large amounts of data that are generated and made accessible in a consistent manner; the GTEx project is a notable example. GTEx aims to build a comprehensive resource for tissue-specific gene expression in humans. Here we describe how this dataset was integrated into Bgee.
We applied a stringent re-annotation process to the GTEx data to retain only healthy tissues and non-contaminated samples, using the information available under restricted access. For instance, we rejected all samples from 31% of subjects, who were deemed globally unhealthy based on the pathology report (e.g., drug abuse, diabetes, BMI > 35), as well as specific samples from another 28% of subjects who had local pathologies (e.g., brain samples from Alzheimer's patients). We also rejected samples contaminated by other tissues.
In total, only 50% of samples were kept; these represent a high-quality subset of GTEx. All retained samples were manually re-annotated to specific Uberon anatomy and aging terms.
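The rejection logic described above can be sketched on a toy table. This is only an illustration: the column names (`subject_unhealthy`, `local_pathology`, `contaminated`) and all values are hypothetical and do not reflect the actual GTEx metadata or the curation tooling used by Bgee.

```r
# Toy sample table; columns and values are hypothetical, not actual GTEx metadata.
samples <- data.frame(
  sample_id         = c("s1", "s2", "s3", "s4", "s5", "s6"),
  subject_id        = c("A", "A", "B", "B", "C", "C"),
  subject_unhealthy = c(TRUE, TRUE, FALSE, FALSE, FALSE, FALSE),   # subject rejected globally
  local_pathology   = c(FALSE, FALSE, TRUE, FALSE, FALSE, FALSE),  # only this sample is affected
  contaminated      = c(FALSE, FALSE, FALSE, FALSE, TRUE, FALSE),  # contamination from another tissue
  stringsAsFactors  = FALSE
)

# A sample is kept only if it passes every rejection criterion.
keep <- !samples$subject_unhealthy & !samples$local_pathology & !samples$contaminated
curated <- samples[keep, ]

curated$sample_id  # identifiers of the retained samples
mean(keep)         # fraction of samples retained in this toy table
```

Note that a globally unhealthy subject removes all of its samples, whereas a local pathology or a contamination removes only the affected sample.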
The GTEx annotations can be browsed on our raw data interface: Curated GTEx data in Bgee.
All corresponding RNA-seq libraries were reanalyzed with the Bgee pipeline, consistently with all other healthy RNA-seq data from human and other species. These data are being made available both through the website and through the BgeeDB R package (with sensitive information hidden).
Experiment accession: SRP012682. More information and examples can be found on the BgeeDB R package page.
# Install BgeeDB from Bioconductor (installing BiocManager first if needed), then load it
if (!requireNamespace("BiocManager", quietly = TRUE))
    install.packages("BiocManager")
BiocManager::install("BgeeDB")
library(BgeeDB)
# Retrieve the annotation of human RNA-seq experiments and libraries
bgee <- Bgee$new(species = "Homo_sapiens", dataType = "rna_seq")
myAnnotation <- getAnnotation(bgee)
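Once the annotation is retrieved, it can be restricted to the GTEx experiment. The sketch below uses a mock table so that it runs without downloading anything; it assumes that the sample annotation is a data frame with an `Experiment.ID` column, which should be checked with `names()` on the real object. The helper `filterByExperiment` is hypothetical and not part of BgeeDB.

```r
# Hypothetical helper (not part of BgeeDB): keep the annotation rows of one experiment.
# Assumes a column named "Experiment.ID"; verify this on the real annotation first.
filterByExperiment <- function(sampleAnnotation, experimentId) {
  stopifnot("Experiment.ID" %in% names(sampleAnnotation))
  sampleAnnotation[sampleAnnotation$Experiment.ID == experimentId, , drop = FALSE]
}

# Mock table standing in for the sample annotation; all values are made up.
mockAnnotation <- data.frame(
  Experiment.ID          = c("SRP012682", "SRP012682", "OTHER_EXPERIMENT"),
  Library.ID             = c("lib1", "lib2", "lib3"),
  Anatomical.entity.name = c("tissue A", "tissue B", "tissue A"),
  stringsAsFactors       = FALSE
)

gtexAnnotation <- filterByExperiment(mockAnnotation, "SRP012682")
nrow(gtexAnnotation)                           # number of GTEx libraries in the mock table
table(gtexAnnotation$Anatomical.entity.name)   # libraries per anatomical entity
```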
bgee <- Bgee$new(species = "Homo_sapiens", dataType = "rna_seq")
# This step can take a lot of time as all Bgee GTEx data have to be downloaded and uncompressed.
dataGTEx <- getData(bgee, experimentId = "SRP012682")
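A common next step is to summarize expression per gene and per anatomical entity. The following sketch does this with base R on a mock table; the column names (`Gene.ID`, `Anatomical.entity.name`, `TPM`) are assumptions about the layout of the downloaded data and should be checked against the actual `dataGTEx` object before reuse.

```r
# Mock expression table; column names are assumed and values are made up.
mockData <- data.frame(
  Gene.ID                = c("g1", "g1", "g1", "g2", "g2", "g2"),
  Anatomical.entity.name = c("tissue A", "tissue A", "tissue B",
                             "tissue A", "tissue A", "tissue B"),
  TPM                    = c(10, 20, 5, 0, 2, 8),
  stringsAsFactors       = FALSE
)

# Median TPM for each gene in each anatomical entity.
medianTpm <- aggregate(TPM ~ Gene.ID + Anatomical.entity.name,
                       data = mockData, FUN = median)
medianTpm
```

The median is used here because it is less sensitive to outlier libraries than the mean; other summaries can be substituted through the `FUN` argument.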
# Load the data needed for a TopAnat expression enrichment analysis in human
bgee <- Bgee$new(species = "Homo_sapiens")
myTopAnatData <- loadTopAnatData(bgee)
# Retrieve all genes with data in Bgee
allGenes <- unique(row.names(myTopAnatData$gene2anatomy))
# List of genes related to autism and epilepsy from Jabbari 2016
genesOfInterest <- c("ENSG00000183044", "ENSG00000085563", "ENSG00000006071", "ENSG00000153086",
"ENSG00000243989", "ENSG00000156110", "ENSG00000150594", "ENSG00000239900",
"ENSG00000141385", "ENSG00000038002", "ENSG00000142208", "ENSG00000275199",
"ENSG00000117020", "ENSG00000163631", "ENSG00000159423", "ENSG00000112294",
"ENSG00000164904", "ENSG00000033011", "ENSG00000182858", "ENSG00000101901",
"ENSG00000119523", "ENSG00000214160", "ENSG00000088035", "ENSG00000159063",
"ENSG00000086848", "ENSG00000242110", "ENSG00000137074", "ENSG00000124198",
"ENSG00000118520", "ENSG00000198844", "ENSG00000131089", "ENSG00000100299",
"ENSG00000113273", "ENSG00000004848", "ENSG00000104763", "ENSG00000108381",
"ENSG00000066279", "ENSG00000138363", "ENSG00000159363", "ENSG00000018625",
"ENSG00000174437", "ENSG00000182220", "ENSG00000185344", "ENSG00000171953",
"ENSG00000175054", "ENSG00000158321", "ENSG00000086062", "ENSG00000103507",
"ENSG00000074582", "ENSG00000176697", "ENSG00000157764", "ENSG00000106009",
"ENSG00000164061", "ENSG00000169814", "ENSG00000111678", "ENSG00000130921",
"ENSG00000131943", "ENSG00000197603", "ENSG00000141837", "ENSG00000007402",
"ENSG00000182389", "ENSG00000198668", "ENSG00000143933", "ENSG00000147044",
"ENSG00000036828", "ENSG00000110395", "ENSG00000015133", "ENSG00000108691",
"ENSG00000136861", "ENSG00000008086", "ENSG00000064309", "ENSG00000151849",
"ENSG00000103995", "ENSG00000173575", "ENSG00000100888", "ENSG00000168539",
"ENSG00000181072", "ENSG00000120903", "ENSG00000101204", "ENSG00000175344",
"ENSG00000274542", "ENSG00000160716", "ENSG00000114859", "ENSG00000073464",
"ENSG00000186510", "ENSG00000184908", "ENSG00000188603", "ENSG00000102805",
"ENSG00000128973", "ENSG00000182372", "ENSG00000278220", "ENSG00000184144",
"ENSG00000278728", "ENSG00000174469", "ENSG00000166685", "ENSG00000168434",
"ENSG00000213380", "ENSG00000142173", "ENSG00000173085", "ENSG00000088682",
"ENSG00000006695", "ENSG00000014919", "ENSG00000047457", "ENSG00000165078",
"ENSG00000157184", "ENSG00000169372", "ENSG00000147571", "ENSG00000160213",
"ENSG00000064601", "ENSG00000117984", "ENSG00000174080", "ENSG00000115827",
"ENSG00000077279", "ENSG00000100150", "ENSG00000181192", "ENSG00000091140",
"ENSG00000101152", "ENSG00000116675", "ENSG00000116641", "ENSG00000172269",
"ENSG00000000419", "ENSG00000136908", "ENSG00000179085", "ENSG00000188641",
"ENSG00000197102", "ENSG00000157540", "ENSG00000101210", "ENSG00000096093",
"ENSG00000111361", "ENSG00000119718", "ENSG00000070785", "ENSG00000115211",
"ENSG00000145191", "ENSG00000170370", "ENSG00000133216", "ENSG00000112425",
"ENSG00000178607", "ENSG00000140374", "ENSG00000105379", "ENSG00000171503",
"ENSG00000103089", "ENSG00000122591", "ENSG00000145982", "ENSG00000091483",
"ENSG00000112367", "ENSG00000196924", "ENSG00000162769", "ENSG00000119686",
"ENSG00000110195", "ENSG00000170345", "ENSG00000125740", "ENSG00000176165",
"ENSG00000160973", "ENSG00000087086", "ENSG00000179163", "ENSG00000022355",
"ENSG00000166206", "ENSG00000187730", "ENSG00000113327", "ENSG00000054983",
"ENSG00000141012", "ENSG00000130005", "ENSG00000171766", "ENSG00000105607",
"ENSG00000140905", "ENSG00000131095", "ENSG00000170266", "ENSG00000178445",
"ENSG00000074047", "ENSG00000145888", "ENSG00000109738", "ENSG00000173540",
"ENSG00000087258", "ENSG00000159921", "ENSG00000111670", "ENSG00000090581",
"ENSG00000135677", "ENSG00000108433", "ENSG00000171723", "ENSG00000233276",
"ENSG00000176884", "ENSG00000183454", "ENSG00000273079", "ENSG00000152822",
"ENSG00000169919", "ENSG00000138796", "ENSG00000170445", "ENSG00000172534",
"ENSG00000164588", "ENSG00000138622", "ENSG00000213614", "ENSG00000049860",
"ENSG00000165102", "ENSG00000153187", "ENSG00000158104", "ENSG00000174775",
"ENSG00000276536", "ENSG00000072506", "ENSG00000114378", "ENSG00000181873",
"ENSG00000010404", "ENSG00000127415", "ENSG00000134049", "ENSG00000166333",
"ENSG00000124313", "ENSG00000150995", "ENSG00000120071", "ENSG00000278458",
"ENSG00000275867", "ENSG00000111262", "ENSG00000169282", "ENSG00000069424",
"ENSG00000184408", "ENSG00000140015", "ENSG00000151704", "ENSG00000177807",
"ENSG00000187486", "ENSG00000156113", "ENSG00000281151", "ENSG00000075043",
"ENSG00000184156", "ENSG00000107147", "ENSG00000243335", "ENSG00000068796",
"ENSG00000276734", "ENSG00000168280", "ENSG00000185467", "ENSG00000118162",
"ENSG00000133703", "ENSG00000087299", "ENSG00000196569", "ENSG00000143815",
"ENSG00000108231", "ENSG00000121897", "ENSG00000138095", "ENSG00000187391",
"ENSG00000169032", "ENSG00000126934", "ENSG00000109339", "ENSG00000204406",
"ENSG00000090674", "ENSG00000147316", "ENSG00000169057", "ENSG00000081189",
"ENSG00000164073", "ENSG00000168282", "ENSG00000100427", "ENSG00000124615",
"ENSG00000164172", "ENSG00000129255", "ENSG00000178802", "ENSG00000177000",
"ENSG00000198793", "ENSG00000196091", "ENSG00000108784", "ENSG00000072864",
"ENSG00000275911", "ENSG00000125356", "ENSG00000131495", "ENSG00000023228",
"ENSG00000213619", "ENSG00000164258", "ENSG00000115286", "ENSG00000110717",
"ENSG00000167792", "ENSG00000049759", "ENSG00000223957", "ENSG00000204386",
"ENSG00000234343", "ENSG00000228691", "ENSG00000227129", "ENSG00000184494",
"ENSG00000227315", "ENSG00000234846", "ENSG00000196712", "ENSG00000151092",
"ENSG00000187566", "ENSG00000087303", "ENSG00000164190", "ENSG00000156574",
"ENSG00000074181", "ENSG00000141458", "ENSG00000119655", "ENSG00000122585",
"ENSG00000185149", "ENSG00000213281", "ENSG00000179915", "ENSG00000079482",
"ENSG00000116329", "ENSG00000112038", "ENSG00000187848", "ENSG00000135124",
"ENSG00000007168", "ENSG00000125779", "ENSG00000173599", "ENSG00000165194",
"ENSG00000160299", "ENSG00000131828", "ENSG00000148459", "ENSG00000164494",
"ENSG00000127980", "ENSG00000108733", "ENSG00000142655", "ENSG00000121680",
"ENSG00000164751", "ENSG00000215193", "ENSG00000034693", "ENSG00000139197",
"ENSG00000124587", "ENSG00000112357", "ENSG00000102144", "ENSG00000156531",
"ENSG00000092621", "ENSG00000165195", "ENSG00000108474", "ENSG00000165282",
"ENSG00000124155", "ENSG00000060642", "ENSG00000121879", "ENSG00000184381",
"ENSG00000182621", "ENSG00000123560", "ENSG00000140650", "ENSG00000039650",
"ENSG00000108439", "ENSG00000140521", "ENSG00000115138", "ENSG00000131238",
"ENSG00000102103", "ENSG00000139174", "ENSG00000163637", "ENSG00000100033",
"ENSG00000167371", "ENSG00000197746", "ENSG00000185920", "ENSG00000179295",
"ENSG00000172053", "ENSG00000151552", "ENSG00000155961", "ENSG00000132155",
"ENSG00000108557", "ENSG00000146282", "ENSG00000078328", "ENSG00000167281",
"ENSG00000189056", "ENSG00000163933", "ENSG00000155906", "ENSG00000104889",
"ENSG00000136104", "ENSG00000172922", "ENSG00000067836", "ENSG00000151835",
"ENSG00000101347", "ENSG00000138760", "ENSG00000144285", "ENSG00000105711",
"ENSG00000136531", "ENSG00000153253", "ENSG00000183873", "ENSG00000196876",
"ENSG00000169432", "ENSG00000130489", "ENSG00000073578", "ENSG00000178980",
"ENSG00000152217", "ENSG00000127990", "ENSG00000181523", "ENSG00000164690",
"ENSG00000108061", "ENSG00000138083", "ENSG00000064651", "ENSG00000124140",
"ENSG00000119899", "ENSG00000135917", "ENSG00000106688", "ENSG00000110436",
"ENSG00000079215", "ENSG00000102743", "ENSG00000125454", "ENSG00000177542",
"ENSG00000117394", "ENSG00000164414", "ENSG00000117620", "ENSG00000181830",
"ENSG00000076351", "ENSG00000144290", "ENSG00000142319", "ENSG00000276996",
"ENSG00000165970", "ENSG00000130821", "ENSG00000198689", "ENSG00000072501",
"ENSG00000108055", "ENSG00000166311", "ENSG00000102172", "ENSG00000163877",
"ENSG00000115904", "ENSG00000104450", "ENSG00000152583", "ENSG00000166068",
"ENSG00000197694", "ENSG00000102359", "ENSG00000126091", "ENSG00000115525",
"ENSG00000124356", "ENSG00000123473", "ENSG00000136854", "ENSG00000144455",
"ENSG00000139531", "ENSG00000148290", "ENSG00000008056", "ENSG00000197283",
"ENSG00000227460", "ENSG00000102003", "ENSG00000198198", "ENSG00000164458",
"ENSG00000136463", "ENSG00000143374", "ENSG00000054611", "ENSG00000162065",
"ENSG00000145979", "ENSG00000184058", "ENSG00000143178", "ENSG00000196628",
"ENSG00000177426", "ENSG00000175606", "ENSG00000061938", "ENSG00000166340",
"ENSG00000213689", "ENSG00000165699", "ENSG00000103197", "ENSG00000154743",
"ENSG00000274672", "ENSG00000275165", "ENSG00000274796", "ENSG00000274078",
"ENSG00000278712", "ENSG00000273896", "ENSG00000170892", "ENSG00000274129",
"ENSG00000278605", "ENSG00000278622", "ENSG00000182173", "ENSG00000175894",
"ENSG00000104833", "ENSG00000131462", "ENSG00000128159", "ENSG00000198431",
"ENSG00000114062", "ENSG00000104517", "ENSG00000173218", "ENSG00000137411",
"ENSG00000236178", "ENSG00000234032", "ENSG00000206476", "ENSG00000230985",
"ENSG00000223494", "ENSG00000213585", "ENSG00000165637", "ENSG00000197969",
"ENSG00000141252", "ENSG00000196998", "ENSG00000075702", "ENSG00000186153",
"ENSG00000169554", "ENSG00000043355")
# Build the gene vector for the analysis
geneList <- factor(as.integer(unique(allGenes) %in% genesOfInterest))
names(geneList) <- unique(allGenes)
# Run the test
myTopAnatObject <- topAnat(myTopAnatData, geneList)
resFis <- runTest(myTopAnatObject, algorithm = "elim", statistic = "fisher")
# Format results, keeping anatomical entities below an FDR cutoff of 0.1
tableOver <- makeTable(myTopAnatData, myTopAnatObject, resFis, 0.1)
tableOver
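The gene vector built for the analysis above is a named factor over the whole gene universe, with level 1 for genes of interest and 0 for background genes. A minimal toy version, using made-up gene identifiers, shows its shape and how to spot genes of interest that are absent from the universe:

```r
# Toy universe and gene set; identifiers are made up for illustration.
toyUniverse <- c("geneA", "geneB", "geneC", "geneD")
toyInterest <- c("geneB", "geneD", "geneZ")  # geneZ is not in the universe

# 1 = gene of interest, 0 = background; names carry the gene identifiers.
toyGeneList <- factor(as.integer(toyUniverse %in% toyInterest))
names(toyGeneList) <- toyUniverse

table(toyGeneList)                  # counts of background (0) and foreground (1) genes
setdiff(toyInterest, toyUniverse)   # genes of interest silently dropped from the test
```

Checking the dropped genes is worthwhile because identifiers missing from the universe do not raise an error; they simply do not contribute to the enrichment test.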
The adult Fly Cell Atlas (FCA) is a comprehensive single-cell transcriptomic atlas of Drosophila melanogaster, which includes 580k cells from 15 individually dissected sexed tissues, as well as from the entire head and body. It includes more than 250 distinct cell types across tissues.
In addition to using fly-specific vocabularies for annotation (i.e., the FBbt ontology), Bgee maps these data to species-neutral terms (i.e., from the Uberon and CL ontologies) to facilitate comparisons between species, while retaining precise fly-specific terms when necessary. All annotations were verified and re-curated to ensure consistency between cell types and organismal information.
All corresponding scRNA-seq data were reanalyzed in the Bgee pipeline. These data are available both through the website and through the BgeeDB R package.
Experiment accession: ERP129698.