Synthetic enzyme designed with generative AI docking to a PET-like polymer fragment.

De Novo Protein Design: Using Generative AI to Create Synthetic Enzymes that Eat Plastic Waste

A practical developer guide to designing, validating, and iterating synthetic plastic-degrading enzymes with generative AI and computational pipelines.

Intro — why this matters to engineers

Plastic pollution is a materials problem on the scale of trillions of pieces. Biocatalysis offers a promising route: enzymes such as PETase and MHETase can together depolymerize PET into its monomers, terephthalic acid and ethylene glycol. But natural enzymes are often slow, unstable, or poorly expressed. De novo protein design (creating sequences and folds that never existed in nature), combined with generative AI, gives engineers a practical path to designing enzymes tailored to degrade specific plastics at industrial scale.

This post is a sharp, practical guide for developers building computational pipelines: which models and tools to use, how to validate designs in silico, how to prepare libraries for wet-lab testing, and what safety and ethical checks to adopt.

How de novo design pipelines are structured

At a high level the pipeline splits into four phases:

Each phase has automated components you can stitch together with workflow tools (Airflow, Nextflow, or a simple Python script). The common pattern: generate many candidates, apply increasingly expensive filters, and output a manageable library for wet-lab screening.
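The "generate many, filter with increasingly expensive checks" pattern can be sketched as a staged funnel. This is a minimal illustration, not a prescribed implementation: the `Candidate` class, the `run_funnel` function, and the two toy scorers are hypothetical names, and the scorers stand in for real tools such as structure predictors or energy functions.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Iterable, List, Tuple


@dataclass
class Candidate:
    sequence: str
    scores: Dict[str, float] = field(default_factory=dict)


# A stage is (name, scoring function, minimum score needed to pass).
Stage = Tuple[str, Callable[[Candidate], float], float]


def run_funnel(candidates: Iterable[Candidate], stages: List[Stage]) -> List[Candidate]:
    """Apply stages in order (cheapest first) so costly scorers only see survivors."""
    survivors = list(candidates)
    for name, score_fn, threshold in stages:
        kept = []
        for cand in survivors:
            cand.scores[name] = score_fn(cand)
            if cand.scores[name] >= threshold:
                kept.append(cand)
        print(f"{name}: {len(survivors)} -> {len(kept)} candidates")
        survivors = kept
    return survivors


# Toy scorers (assumptions for illustration only, not real design metrics).
def length_score(c: Candidate) -> float:
    return 1.0 if 5 <= len(c.sequence) <= 400 else 0.0


def charged_fraction(c: Candidate) -> float:
    return sum(c.sequence.count(aa) for aa in "DEKR") / len(c.sequence)
```

In a real pipeline the later stages would wrap the expensive steps (structure prediction, docking, MD), and the per-stage counts printed here are a cheap way to see where candidates are being lost.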

Core building blocks (tools)

Practical pipeline for developers

Here is a minimal, practical pipeline you can implement:

  1. Define your substrate model (a PET oligomer, a fragment representing the ester bond you want to cleave).
  2. Choose a catalytic template: use known active-site motifs (e.g., serine-histidine-aspartate triad) or extract catalytic residues from PETase-like structures.
  3. Generate backbone scaffolds by either (a) scaffolding known folds around the motif or (b) hallucinating new backbones with generative models.
  4. Design sequences for each backbone using ProteinMPNN or language-model-guided sampling.
  5. Predict structures with AlphaFold / ESMFold and filter for RMSD to scaffold and confidence metrics (pLDDT).
  6. Score remaining candidates with Rosetta energy, solubility predictors, and docking against the substrate model.
  7. Run short MD simulations for top candidates to check active-site integrity.
  8. Output a prioritized library (a dozen to a few hundred variants) for synthesis and experimental screening.
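Steps 2 and 7 both depend on the catalytic geometry surviving design and simulation. A minimal sketch of one such check follows, assuming a serine-histidine-aspartate triad in which Ser OG contacts His NE2 and His ND1 contacts an Asp carboxylate oxygen. The function names and the 3.5 Å cutoff are assumptions chosen for illustration; tune the cutoff and atom choices to your own template.

```python
import math
from typing import Dict, List, Sequence

Point = Sequence[float]


def triad_intact(ser_og: Point, his_ne2: Point, his_nd1: Point,
                 asp_oxygens: List[Point], max_hbond: float = 3.5) -> bool:
    """Return True if both triad contacts are within max_hbond (in angstroms)."""
    ser_his = math.dist(ser_og, his_ne2)
    # Either carboxylate oxygen of Asp may face the histidine.
    his_asp = min(math.dist(his_nd1, o) for o in asp_oxygens)
    return ser_his <= max_hbond and his_asp <= max_hbond


def fraction_intact(frames: List[Dict[str, object]], max_hbond: float = 3.5) -> float:
    """Fraction of frames (e.g. MD snapshots) in which the triad geometry holds."""
    if not frames:
        raise ValueError("no frames supplied")
    hits = sum(triad_intact(max_hbond=max_hbond, **frame) for frame in frames)
    return hits / len(frames)
```

Each frame is a dict keyed by `ser_og`, `his_ne2`, `his_nd1`, and `asp_oxygens`. Extracting those coordinates from a predicted structure or trajectory is left to whichever structure library you already use.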

Example pseudocode (developer-friendly)

# pseudo-Python pipeline outline (all helper functions are placeholders)
scaffold_list = generate_backbones(template_motif)

# Keep each designed sequence paired with the backbone it was designed for
sequence_pool = []
for backbone in scaffold_list:
    seqs = protein_mpnn_design(backbone, num_samples=50)
    sequence_pool.extend((seq, backbone) for seq in seqs)

# Keep designs that refold confidently onto their own intended backbone
predicted = []
for seq, backbone in sequence_pool:
    structure = alphafold_predict(seq)
    if structure.pLDDT_mean > 70 and rmsd_to_backbone(structure, backbone) < 2.5:
        predicted.append((seq, structure))

# Expensive scoring and simulation run only on the survivors
scored = score_with_rosetta(predicted)
top_candidates = select_top(scored, n=100)
run_md_on(top_candidates)
final_lib = prioritize_for_synthesis(top_candidates, criteria=["stability", "docking_score", "expressibility"])

Note: function names are illustrative. Replace with actual library calls or API clients.
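The pLDDT filter in the outline needs a mean confidence value per predicted structure. AlphaFold-format PDB files store per-residue pLDDT (0 to 100) in the B-factor column, so one way to compute it is to average that column over C-alpha atoms. The sketch below assumes that convention and standard fixed-width PDB columns; `mean_plddt` and `passes_confidence` are hypothetical helper names, and other predictors may write confidence on a different scale, so check before reusing the threshold.

```python
from typing import Iterable


def mean_plddt(pdb_lines: Iterable[str]) -> float:
    """Average the B-factor column over C-alpha ATOM records.

    Assumes the file stores pLDDT in the B-factor field (columns 61-66),
    as AlphaFold-format PDB output does.
    """
    values = []
    for line in pdb_lines:
        # Atom name occupies columns 13-16 of an ATOM record.
        if line.startswith("ATOM") and line[12:16].strip() == "CA":
            values.append(float(line[60:66]))
    if not values:
        raise ValueError("no C-alpha ATOM records found")
    return sum(values) / len(values)


def passes_confidence(pdb_lines: Iterable[str], min_plddt: float = 70.0) -> bool:
    """Apply the same pLDDT > 70 cut used in the outline above."""
    return mean_plddt(pdb_lines) > min_plddt
```

Typical use would be `passes_confidence(open(path))` on each predicted model before running the more expensive RMSD and energy checks.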

Designing for plastic-degrading activity

Plastic degradation has unique constraints compared with small-molecule substrates:

Key engineering targets:

Metrics to optimize:

In silico validation: what to run and why

Automation tips:

Wet-lab handoff and iteration

Design begins in silico but lives or dies in the lab:

Limitations, risks, and ethics

> Practical developers must treat wet-lab validation and containment as first-class concerns, not afterthoughts.

Summary and checklist

Checklist for a first experimental run:

De novo enzyme design for plastic degradation is now practical for developer teams with computational resources. The bottleneck is no longer imagination but rigorous filtering, safe experimental design, and a tight computational–experimental loop. Follow the pipeline above, instrument your workflow, and iterate based on real activity data—this is how generative AI moves from novelty to impact.
