Abstract
BioCyc.org is a web portal for more than 20,000 sequenced microbes, including 54 Brucella genomes. BioCyc couples high-quality curated data with a wide range of easy-to-use bioinformatics tools. Each of the Brucella databases in BioCyc was constructed using a similar methodology. A series of computational inferences was applied to the annotated genome from RefSeq, including prediction of metabolic reactions and metabolic pathways, transport reactions, operons, Pfam domains, and orthologs with other genomes in BioCyc. When available, additional data were imported from related databases, including protein features and Gene Ontology terms from UniProt, protein localization data from PSORTDB, and gene essentiality data from OGEE. Manual curation was performed on the databases for B. abortus 2308 and B. ovis ATCC 25840 to integrate information from the experimental literature. Mini-review summaries, literature references, and evidence codes were authored for selected proteins and pathways. The resulting B. abortus database contains 224 metabolic pathways and 3,874 protein features, and the B. ovis database contains 236 pathways and 3,745 protein features. A particular focus was curation of virulence factors, such as the lipopolysaccharide and the type IV secretion system. The BioCyc website provides extensive bioinformatics tools for searching and analyzing these databases and for leveraging them in the analysis of omics datasets. Genome-related tools include a genome browser, sequence search and alignment tools, and extraction of sequence regions. Pathway-related tools include pathway diagrams and navigation of zoomable organism-specific metabolic map diagrams. Operons, regulatory sites, and the full regulatory network can be displayed when such data are present. Comparative analysis tools enable comparisons of genome organization, of orthologs, and of pathway complements. Omics data analysis tools support enrichment analysis and painting of transcriptomics and metabolomics data onto individual pathways and the full metabolic map diagrams. The Omics Dashboard tool enables hierarchical exploration of omics datasets. A unique feature called SmartTables enables users to construct and store tables of genes, metabolites, or pathways, and to perform multiple analyses, such as converting a list of genes or metabolites into a list of all pathways in which those genes or metabolites participate.
References
P.D. Karp et al., "The BioCyc collection of microbial genomes and metabolic pathways," Briefings in Bioinformatics 20(4):1085-1093 (2019). https://doi.org/10.1093/bib/bbx085