DDBJ

Background and mission

The DNA Data Bank of Japan (DDBJ) is operated by the Bioinformation and DDBJ Center at the National Institute of Genetics (NIG). As the Japanese node of the International Nucleotide Sequence Database Collaboration (INSDC), DDBJ collects, curates and distributes nucleotide sequence data and related biological metadata, issues internationally recognized accession numbers, and exchanges released data daily with ENA/EBI and NCBI so that the international sequence database remains synchronized. DDBJ's mission is to support life-science research by providing open, well-annotated sequence data together with computational infrastructure, training and policies for both unrestricted and controlled-access human data.

Core databases and data access

DDBJ operates several major archives and specialized repositories: the DDBJ Sequence Read Archive (DRA) for raw high-throughput sequencing reads and alignment information; the DDBJ Omics Archive and Genomic Expression Archive (GEA) for functional genomics experiments; the Japanese Genotype-phenotype Archive (JGA) for controlled-access individual-level human genotype and phenotype data; and MetaboBank for metabolomics datasets. The center also provides BioProject, BioSample and Assembly records that link the diverse data types belonging to a single project, plus variant repositories such as TogoVar and tools for retrieving annotated or assembled records by accession or keyword. Public (unrestricted-access) data are available for download and programmatic retrieval, while controlled-access human datasets require compliance with the NBDC human-data sharing guidelines and appropriate access approvals.

Analysis services, submission and programmatic integration

Beyond archival services, DDBJ provides a suite of analysis and submission utilities. DFAST is an automated annotation service for prokaryotic genomes; DDBJ also offers cloud- and supercomputer-backed annotation pipelines (e.g., the DDBJ Read Annotation Pipeline and the DDBJ Pipeline) for high-throughput processing. Sapporo is a workflow execution service that promotes reuse of bioinformatics workflows across workflow languages, and WABI (Web API for Biology) exposes web APIs for programmatic search, retrieval and common bioinformatics operations. The site includes interactive web wizards for nucleotide submission, a mass-submission system for large projects, and vector-search tools for contamination screening. DDBJ also operates the NIG Supercomputer system to provide computational resources for large-scale analyses and hosts the DDBJ Group Cloud for sharing pre-publication datasets.

Use cases, training and governance

Typical use cases include submitting new sequence data (from individual runs to whole-genome projects), retrieving raw reads and assembled or annotated records for comparative genomics or metagenomics, running automated prokaryotic genome annotation with DFAST, integrating WABI endpoints into custom analysis pipelines, and requesting controlled access to human datasets in JGA under NBDC guidelines.

DDBJ runs training courses (D-STEP and other workshops) and publishes updates and progress reports describing new resources and system enhancements. Content on DDBJ websites is provided under CC BY 4.0 where indicated; users are expected to credit the center, follow the terms of use and respect controlled-access rules for human data.

For integration, DDBJ's daily data exchange with INSDC partners and its Web APIs allow DDBJ records and services to be incorporated into global bioinformatics workflows and local compute environments.
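
Because released records are mirrored across the INSDC partners, a pipeline that mixes identifiers from DDBJ, ENA and NCBI often needs to tell which member issued an accession and what kind of record it names before choosing a retrieval route. The following is a minimal sketch of that routing step in Python, assuming the commonly documented INSDC prefix conventions for read-archive, BioProject and BioSample accessions. The names `AccessionInfo` and `classify_accession` are hypothetical helpers introduced for illustration and are not part of WABI or any other DDBJ API. Nucleotide, JGA and legacy prefixes are deliberately left unhandled.

```python
import re
from typing import NamedTuple, Optional


class AccessionInfo(NamedTuple):
    archive: str   # INSDC member that issued the accession
    database: str  # record family the accession belongs to
    kind: str      # object type within that family


# Read-archive accessions: first letter = issuing member, third letter = object type.
SRA_ARCHIVES = {"D": "DDBJ", "E": "ENA", "S": "NCBI"}
SRA_KINDS = {
    "A": "submission",
    "P": "study",
    "X": "experiment",
    "R": "run",
    "S": "sample",
    "Z": "analysis",
}

# BioProject and BioSample accessions encode the issuing member after a fixed prefix.
BIOPROJECT_ARCHIVES = {"DB": "DDBJ", "EB": "ENA", "NA": "NCBI"}
BIOSAMPLE_ARCHIVES = {"D": "DDBJ", "E": "ENA", "N": "NCBI"}

# Assumed patterns; older or less common prefixes are not covered by this sketch.
SRA_RE = re.compile(r"([DES])R([APXRSZ])(\d+)")
BIOPROJECT_RE = re.compile(r"PRJ(DB|EB|NA)(\d+)")
BIOSAMPLE_RE = re.compile(r"SAM([DEN])A?(\d+)")


def classify_accession(accession: str) -> Optional[AccessionInfo]:
    """Return the issuing archive and record type, or None if the prefix is unrecognized."""
    acc = accession.strip().upper()

    match = SRA_RE.fullmatch(acc)
    if match:
        return AccessionInfo(
            SRA_ARCHIVES[match.group(1)],
            "Sequence Read Archive",
            SRA_KINDS[match.group(2)],
        )

    match = BIOPROJECT_RE.fullmatch(acc)
    if match:
        return AccessionInfo(BIOPROJECT_ARCHIVES[match.group(1)], "BioProject", "project")

    match = BIOSAMPLE_RE.fullmatch(acc)
    if match:
        return AccessionInfo(BIOSAMPLE_ARCHIVES[match.group(1)], "BioSample", "sample")

    return None


if __name__ == "__main__":
    # The identifiers below only illustrate the formats; they are not references to specific datasets.
    for acc in ["DRR000001", "ERX000001", "PRJDB1234", "SAMN00000001", "AB000001"]:
        print(acc, classify_accession(acc))
```

Returning `None` for unrecognized prefixes, instead of guessing, lets the calling workflow decide whether to fall back to a keyword search or to reject the identifier.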

Links