Digital Diversity

The Leibniz Institute DSMZ-German Collection of Microorganisms and Cell Cultures is establishing an integrated suite of scientific databases of fundamental relevance for the life sciences. The databases include BRENDA, SILVA, LPSN, BacDive and TYGS, which are developed in a highly coordinated manner. A dedicated team of 22 scientists and software engineers is building a common research data infrastructure to significantly improve access to a wealth of curated and standardized biological data. The resulting data platform, DSMZ Digital Diversity, will provide an integrated resource that allows users to link and comprehensively analyse the different types of scientific data from all domains of the life sciences.

  1. Resources
  2. Open positions
  3. About DSMZ Digital Diversity

1. Short description of the resources

The BRENDA Enzyme Database is the world's most comprehensive enzyme repository. It is an electronic resource that comprises molecular and biochemical information on enzymes that have been classified by the IUBMB. BRENDA contains enzyme-specific data manually extracted from the primary scientific literature, together with additional data derived from automatic information retrieval methods such as text mining. It provides a web-based user interface that allows convenient and sophisticated access to the data.

SILVA is a comprehensive, quality-controlled web resource for up-to-date aligned ribosomal RNA (rRNA) gene sequences from the Bacteria, Archaea and Eukaryota domains, alongside supplementary online services. In addition to data products, SILVA provides various online tools such as alignment and classification, phylogenetic tree calculation and viewing, probe/primer matching, and an amplicon analysis pipeline. With every full release, a curated guide tree is provided that contains the latest taxonomy and nomenclature based on multiple references. The SILVA project published its initial release in 2007 and has since become a mature resource for rRNA sequence data. It is acknowledged as an ELIXIR Core Data Resource; these “are a set of European data resources of fundamental importance to the wider life-science community and the long-term preservation of biological data.” SILVA is also part of the German Network for Bioinformatics Infrastructure (de.NBI) and ELIXIR Germany.

Because of the rapid changes in prokaryotic nomenclature and the continued proposal of new names, expert-curated and semi-automated database systems are needed to keep track of the current nomenclature. The List of Prokaryotic names with Standing in Nomenclature (LPSN) is an influential and authoritative resource that was first introduced by Jean P. Euzéby more than two decades ago as a manually curated database. After the move of LPSN to DSMZ in 2020, a new database infrastructure, automated data import routines, a database-driven web interface and numerous content-related additions were established. LPSN is recognized by the Web of Science as a highly cited resource and is the nomenclatural basis for many popular international databases.

The Bacterial Diversity Metadatabase BacDive is the world's largest database for standardized bacterial phenotypic information. Its mission is to mobilize research data at the strain level and to make them freely accessible. The database offers systematic access to phenotypic data of 89,545 prokaryotic strains and thereby enables users to find strains by their traits using the Advanced search functions. BacDive acts as a central hub for strain data and therefore interlinks information from sources such as sequence, taxonomic, metabolic and literature databases.

The Type (Strain) Genome Server (TYGS) is a user-friendly high-throughput web server for genome-based prokaryote taxonomy, connected to a terabyte-scale, continuously growing database of genomic, taxonomic and nomenclatural information. It infers whole-genome-based phylogenies and state-of-the-art estimates for species and subspecies boundaries from either user-defined or automatically determined closest type genome sequences. Results include access to nomenclature, synonymy and associated taxonomic literature. Since its start, TYGS has conducted more than 50,000 analyses for thousands of scientists worldwide and has been recognized by the Web of Science as a highly cited resource. Recently emerging applications in metagenomics and clinical microbiology further expand the use of TYGS.

BRENDA, SILVA and BacDive are part of the German Network for Bioinformatics Infrastructure (de.NBI) and the European network ELIXIR. In 2017, SILVA and BRENDA were acknowledged as ELIXIR Core Data Resources, which “are a set of European data resources of fundamental importance to the wider life-science community and the long-term preservation of biological data.”

3. About the DSMZ Digital Diversity Infrastructure

The Leibniz Institute DSMZ is an internationally renowned biological resource center, research infrastructure, and service provider for academia and industry with one of the largest and most diverse microbial and cell culture collections worldwide. The institute is equipped with state-of-the-art instruments for microbiological, chemical, and molecular work, and runs its own next-generation sequencing, metabolomics, and bioinformatics core facilities. To meet scientists' increasing demand for digital bioresources, DSMZ has developed several key web services, including the bacterial metadatabase BacDive, the database on prokaryotic nomenclature LPSN, and the type strain genome server TYGS. These databases have quickly become important and reliable resources that are already heavily used by researchers all over the world. Building on this development, DSMZ initiated DSMZ Digital Diversity to build a comprehensive and integrated biodata infrastructure. Its mission is to significantly improve the access and analysis options for high-quality research data by gathering and interlinking scientific databases of fundamental importance for the life sciences. As a first step, DSMZ is joining forces with the two ELIXIR Core Data Resources BRENDA and SILVA to create an integrated infrastructure.

Details about our current server infrastructure.

For further questions, please contact Dr. Lorenz Reimer (Lorenz.Reimer(at)dsmz.de).