Molecular Biology, Libraries and SDKs

DNA Chisel

DNA Chisel is an open-source Python library for systematic DNA sequence optimization developed at the Edinburgh Genome Foundry (MIT licence). It frames sequence design as a set of specifications: hard constraints that must be satisfied and optimization objectives whose scores are maximized. Users express design intent with built-in specification classes (over 15 types) or by writing custom specs in Python; problems can be created from raw sequences, annotated GenBank features, the command line, or the project's web interface (Sculpt_a_sequence).

At its core DNA Chisel builds a mutation space and hunts down local constraint breaches and suboptimal regions. The solver recreates localized subproblems around each breach and applies targeted local searches (exhaustive or guided random search depending on mutation-space size). Specifications include pattern avoidance (restriction sites), GC-content windows, EnforceTranslation (to preserve amino-acid sequence), codon-optimization methods (use_best_codon / match_codon_usage / harmonize_rca), AvoidMatches (Bowtie-backed homology avoidance), primer/annealing constraints, hairpin avoidance, repeat/homology checks, and many more. Each specification can either restrict the mutation space to speed verification or contribute a weighted objective score to multi-objective runs.

Typical workflows are flexible: script a DnaOptimizationProblem in Python to combine constraints and objectives, annotate a GenBank with features prefixed by @ (constraints) or ~ (objectives) and feed it to the CLI, or use the web app for interactive edits. Common use-cases include removing restriction-enzyme sites across both strands, codon-optimizing a CDS for E. coli (while preserving translation with EnforceTranslation), tuning local GC content windows, eliminating short homologies to a reference genome using a Bowtie index, harmonizing codon usage for heterologous expression, and designing primer regions that meet melting-temperature and specificity requirements. The library also provides convenience biotools: translation, reverse-complement, random sequence/protein generation, and utilities to compare and annotate SeqRecord differences via Biopython.

Reporting and integrations are first-class: DNA Chisel can emit multi-file optimization reports with an annotated GenBank, plots of constraint breaches, and a PDF summary (requires optional dependencies such as Matplotlib and sequenticon). AvoidMatches relies on Bowtie (Bowtie2) to detect short homologies, and there is optional support for local BLAST+ (though AvoidMatches + Bowtie is recommended). Benchling is known to use DNA Chisel in its sequence-optimization pipeline, and Genome Collector can supply Bowtie indices for AvoidMatches. The codebase interoperates with Biopython SeqRecord objects and supports common motif formats (JASPAR, MEME) for PSSM-based pattern matching.

Installation and contribution are straightforward: DNA Chisel targets Python 3 and is pip-installable (pip install dnachisel, or pip install 'dnachisel[reports]' for full report support). The project is actively maintained on GitHub, part of the EGF Codons toolset, and welcomes contributions. Because specifications are extensible and the optimizer supports both deterministic and stochastic search modes, DNA Chisel is suitable for automated design pipelines, batch optimization campaigns, and interactive sequence tuning where reproducible, constraint-aware edits are required.
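
The breach-by-breach local search described above can be illustrated with a small standard-library sketch. This is not DNA Chisel's code or API: the helper names (find_breaches, fix_breach, remove_pattern) are hypothetical, the input is assumed to be a complete in-frame CDS, and only the exhaustive search mode is shown. It locates a restriction site on either strand, then tries every synonymous-codon combination for the codons overlapping that site, so the translation is preserved by construction.

```python
from itertools import product

# Standard genetic code, built from the usual TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: aa for (a, b, c), aa in zip(product(BASES, repeat=3), AMINO)
}

# Map each amino acid (or stop) to the list of codons encoding it.
SYNONYMS = {}
for codon, aa in CODON_TABLE.items():
    SYNONYMS.setdefault(aa, []).append(codon)


def reverse_complement(seq):
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def translate(seq):
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, len(seq), 3))


def find_breaches(seq, pattern):
    """Start indices where the pattern occurs on either strand."""
    motifs = {pattern, reverse_complement(pattern)}
    size = len(pattern)
    return [i for i in range(len(seq) - size + 1) if seq[i:i + size] in motifs]


def fix_breach(seq, start, pattern):
    """Exhaustive local search over the codons overlapping one breach.

    Returns a sequence with strictly fewer breaches, or None if no
    synonymous-codon combination in this window achieves that.
    """
    first = start // 3
    last = (start + len(pattern) - 1) // 3
    window = [seq[3 * k:3 * k + 3] for k in range(first, last + 1)]
    choices = [SYNONYMS[CODON_TABLE[codon]] for codon in window]
    before = len(find_breaches(seq, pattern))
    for combo in product(*choices):
        candidate = seq[:3 * first] + "".join(combo) + seq[3 * last + 3:]
        if len(find_breaches(candidate, pattern)) < before:
            return candidate
    return None


def remove_pattern(seq, pattern):
    """Resolve breaches one at a time until none remain."""
    if len(seq) % 3 != 0:
        raise ValueError("this sketch assumes a complete in-frame CDS")
    while True:
        breaches = find_breaches(seq, pattern)
        if not breaches:
            return seq
        fixed = fix_breach(seq, breaches[0], pattern)
        if fixed is None:
            raise ValueError(f"cannot remove breach at index {breaches[0]}")
        seq = fixed


# Toy CDS with a BsaI site (GGTCTC) and its reverse complement (GAGACC).
cds = "ATGGGTCTCGCAGAGACCTAA"
optimized = remove_pattern(cds, "GGTCTC")
print(find_breaches(cds, "GGTCTC"), "->", find_breaches(optimized, "GGTCTC"))
print(translate(cds) == translate(optimized))
```

Because each accepted candidate strictly lowers the breach count, the loop always terminates. In the library itself the equivalent intent would be declared with specifications such as pattern avoidance and EnforceTranslation rather than hand-written search code.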