Bioprocess Optimization via Evolutionary Algorithms

Abstract

A fed-batch bioreactor producing L-asparaginase in metabolically engineered E. coli is modeled as a stiff nine-state ODE that couples Monod growth kinetics, closed mass balances, and an energy balance under PID temperature control. Two gradient-free optimizers run on top of the same simulator and solve different problems. A particle swarm (PySwarm) tunes the PID gains that hold the broth at its 328.5 K setpoint. A genetic algorithm (DEAP) evolves the reactor design to maximize batch profit. The simulator is the fitness function; neither optimizer touches the kinetics. The model's verdict is blunt: under wild-type metabolism the process loses money, and it clears a profit only once acetate overflow is engineered out of the strain.

keywords: fed-batch · Monod kinetics · PID control · evolutionary computation · particle swarm · process economics

## The simulator as fitness function

The state is integrated over a 24-hour batch: substrate glucose S, biomass X, product P, broth volume V, dissolved oxygen O₂, acetate A, internal energy U, and temperature T. Growth follows a Monod rate limited by glucose and oxygen and inhibited by acetate and excess glucose:

μ = μ_max S / (K_s + S) · (1 − A/K_i,A) · (1 − S/K_i,S) · O₂/(K_O2 + O₂) (1)
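Equation (1) translates almost term by term into code. The sketch below is a minimal transcription, not the project's implementation: the function name and argument names are chosen here for illustration, and clamping each inhibition factor at zero is an assumption made so that μ cannot go negative (or flip sign twice) when A exceeds K_i,A or S exceeds K_i,S.

```python
def growth_rate(S, O2, A, mu_max, K_s, K_iA, K_iS, K_O2):
    """Specific growth rate mu from equation (1), in the units of mu_max."""
    glucose_limit = S / (K_s + S)          # Monod term in glucose
    oxygen_limit = O2 / (K_O2 + O2)        # Monod term in dissolved oxygen
    # Assumption: clamp the linear inhibition factors at zero so growth
    # stops, rather than reverses, once an inhibitor passes its constant.
    acetate_inhibition = max(0.0, 1.0 - A / K_iA)
    glucose_inhibition = max(0.0, 1.0 - S / K_iS)
    return (mu_max * glucose_limit * acetate_inhibition
            * glucose_inhibition * oxygen_limit)
```

At S = K_s and O₂ = K_O2 with no acetate, each Monod term is one half, so μ is close to μ_max/4 as long as S is far below K_i,S.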

Product is growth-associated, q_p = Y_px μ, and substrate feeds both biomass and product, so the carbon balance stays closed:

dX/dt = μX − (F/V)X , dS/dt = −μX/Y_xs − q_pX/Y_ps + (F/V)(S_f − S) (2)

dP/dt = q_pX − (F/V)P , dA/dt = Y_axμX − (F/V)A , dV/dt = F (3)
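Equations (2) and (3) can be written as one right-hand-side function. This is a reduced sketch covering only the five mass-balance states; the oxygen and energy balances are left out because their equations are not given here. The names `mass_balance_rhs` and `mu_fn` are hypothetical, and `mu_fn(S, A)` stands in for whatever evaluates equation (1).

```python
def mass_balance_rhs(t, y, mu_fn, F, S_f, Y_xs, Y_px, Y_ps, Y_ax):
    """Right-hand side of equations (2)-(3) for y = [X, S, P, A, V]."""
    X, S, P, A, V = y
    mu = mu_fn(S, A)              # specific growth rate (hypothetical callable)
    q_p = Y_px * mu               # growth-associated product formation
    D = F / V                     # dilution rate from the feed

    dX = mu * X - D * X
    dS = -mu * X / Y_xs - q_p * X / Y_ps + D * (S_f - S)
    dP = q_p * X - D * P
    dA = Y_ax * mu * X - D * A
    dV = F
    return [dX, dS, dP, dA, dV]
```

The `(t, y, *args)` signature is the one `scipy.integrate.solve_ivp` expects, so the function can be passed to it with `method="BDF"` and the remaining parameters supplied through `args`.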

Volume rises at the feed rate F until the vessel fills, then feeding stops. An energy balance tracks T against the heat of fermentation, agitation power, and feed enthalpy. A PID controller closes the loop by setting the heat-exchanger area each step, holding the broth at the 328.5 K setpoint. Batch profit nets product value against glucose and seed-culture cost:
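The feed cutoff and the control loop are both small pieces of logic. The sketch below shows one plausible form, under stated assumptions: the error is taken as T minus setpoint so that a hot broth asks for more exchanger area, the output is clamped to a physical range, and no anti-windup is applied. The class name, method names, and bounds are illustrative, not taken from the project.

```python
def feed_rate(V, F, V_max):
    """Feed at rate F until the vessel is full, then stop."""
    return F if V < V_max else 0.0


class AreaPID:
    """Discrete PID that maps temperature error to heat-exchanger area."""

    def __init__(self, Kp, Ki, Kd, setpoint, area_min, area_max):
        self.Kp, self.Ki, self.Kd = Kp, Ki, Kd
        self.setpoint = setpoint
        self.area_min, self.area_max = area_min, area_max
        self.integral = 0.0
        self.prev_error = None

    def step(self, T, dt):
        # Assumption: positive error means the broth is too hot,
        # and more area means more cooling.
        error = T - self.setpoint
        self.integral += error * dt
        if self.prev_error is None:
            derivative = 0.0                 # no history on the first call
        else:
            derivative = (error - self.prev_error) / dt
        self.prev_error = error
        raw = self.Kp * error + self.Ki * self.integral + self.Kd * derivative
        # Assumption: clamp to the physically available area.
        return min(max(raw, self.area_min), self.area_max)
```

The three gains passed to the constructor are exactly the vector the swarm searches over; everything else in the loop stays fixed.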

J = c_P P(t_f) V(t_f) − c_S (S₀V₀ + F S_f t_f) − c_X X₀V₀ (4)
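Equation (4) is plain arithmetic on the end-of-batch state and the design vector. A minimal transcription follows; the function and argument names are chosen here for illustration. As the equation is written, feed glucose is charged over the full batch time t_f.

```python
def batch_profit(P_f, V_f, S0, V0, X0, F, S_f, t_f, c_P, c_S, c_X):
    """Batch profit J from equation (4)."""
    revenue = c_P * P_f * V_f                      # product value at harvest
    glucose_cost = c_S * (S0 * V0 + F * S_f * t_f) # initial charge plus feed
    seed_cost = c_X * X0 * V0                      # seed-culture biomass
    return revenue - glucose_cost - seed_cost
```

With made-up numbers (P_f = 2, V_f = 5, S₀ = 10, V₀ = 4, X₀ = 0.1, F = 0.04, S_f = 500, t_f = 24, c_P = 100, c_S = 0.5, c_X = 10), revenue is 1000, glucose cost is 260, seed cost is 4, and J = 736.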

Both objectives, temperature-tracking error and profit J, are scalar readouts of a stiff integration with switching constraints: the volume ceiling, non-negativity, and feed-rate bounds. Those switches make the objectives non-smooth in the decision variables, so there is no usable gradient. That rules out gradient methods and motivates a population search where every candidate is scored by a full simulation.

## Two searches over one simulator

The system is two layers with a hard boundary between them. The optimizer layer proposes vectors and knows nothing about kinetics. The simulator layer integrates (1) to (3) and the energy balance with a BDF stiff solver, and each objective reduces the resulting trajectory to a single scalar. Two tenants share that interface and pursue different objectives. PySwarm searches for the PID gains K_p, K_i, K_d that minimize the temperature-tracking error. DEAP evolves the design vector (S₀, X₀, V₀, S_f, k_La, F) that maximizes profit J from (4).

  +-----------------------+      candidate vector       +----------------------+
  |    optimizer layer    |  ----------------------->   |   simulator          |
  |                       |                             |   9-state stiff ODE  |
  |  PySwarm : PID gains  |                             |   Monod + energy     |
  |  DEAP GA : design     |  <-----------------------   |   balance + PID      |
  +-----------------------+      scalar objective       +----------------------+
            ^                   (tracking MSE | profit J)          |
            |  selection / velocity update                        | parallel map
            +-----------------------------------------------------+   over population

Fig. 1 Optimizer and simulator are decoupled. The simulator stays a pure function; each optimizer consumes only the scalar its objective returns.

Because candidates are independent, the population fans out across a process pool while the kinetics stay fixed. The encoding differs by tenant, but the call boundary does not, so a controller-tuning run and an economic-design run reuse the identical integrator.
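The fan-out can be sketched with the standard library alone. This is an illustration under assumptions, not the project's code: `evaluate_population` and `toy_fitness` are hypothetical names, and the toy fitness stands in for a full simulation.

```python
from concurrent.futures import ProcessPoolExecutor


def toy_fitness(candidate):
    """Stand-in for one full simulation: sum of squares of the vector."""
    return sum(x * x for x in candidate)


def evaluate_population(fitness, population, workers=1):
    """Score every candidate; results keep the order of the population."""
    if workers <= 1:
        # Serial path, useful for debugging and small populations.
        return [fitness(c) for c in population]
    # Parallel path: each candidate is scored in a separate worker process.
    with ProcessPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(fitness, population))
```

One practical consequence of a process pool: the fitness function is pickled and sent to the workers, so it generally has to be a module-level function rather than a lambda or closure, and on platforms that spawn workers the call belongs under an `if __name__ == "__main__":` guard.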

# one simulator, two objectives
def simulate(design, gains):                 # 9-state stiff ODE: Monod + energy balance + PID
    return integrate(reactor_odes, design, gains,
                     t_span=(0, 24), method="BDF",
                     events=[volume_cap, nonneg])

# PySwarm holds temperature: minimize tracking error to the 328.5 K setpoint
gains  = pyswarm_run(lambda g: mse_T(simulate(design0, g)), lb, ub)

# DEAP maximizes economics: evolve the reactor design for batch profit
design = deap_evolve(lambda d: profit(simulate(d, gains)), dim=6, workers=n_cores)
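To make the optimizer side of the boundary concrete, here is a bare-bones particle swarm in pure Python. It is a stand-in written for illustration, not PySwarm's implementation: the inertia and acceleration constants are common textbook defaults, positions are simply clamped to the bounds, and `toy_tracking_error` is a hypothetical quadratic that replaces the real `mse_T(simulate(...))` call.

```python
import random


def toy_tracking_error(gains):
    """Hypothetical objective: squared distance to made-up 'ideal' gains."""
    target = (4.0, 2.0, 1.0)
    return sum((g - t) ** 2 for g, t in zip(gains, target))


def particle_swarm(objective, lb, ub, n_particles=20, n_iter=100,
                   w=0.7, c1=1.5, c2=1.5, seed=0):
    """Minimize objective over the box [lb, ub]; returns (best_x, best_f)."""
    rng = random.Random(seed)              # seeded for repeatable runs
    dim = len(lb)
    pos = [[rng.uniform(lb[d], ub[d]) for d in range(dim)]
           for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]            # each particle's best position
    pbest_f = [objective(p) for p in pos]
    g = min(range(n_particles), key=pbest_f.__getitem__)
    gbest, gbest_f = pbest[g][:], pbest_f[g]

    for _ in range(n_iter):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # Velocity update: inertia + pull to own best + pull to swarm best.
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                # Assumption: clamp positions to the bounds.
                pos[i][d] = min(max(pos[i][d] + vel[i][d], lb[d]), ub[d])
            f = objective(pos[i])
            if f < pbest_f[i]:
                pbest[i], pbest_f[i] = pos[i][:], f
                if f < gbest_f:
                    gbest, gbest_f = pos[i][:], f
    return gbest, gbest_f
```

A call such as `particle_swarm(toy_tracking_error, [0.0] * 3, [10.0] * 3)` has the same shape as pyswarm's `pso(func, lb, ub)`: an objective plus lower and upper bounds in, a best vector and its score out. Swapping the toy objective for a simulation-backed one changes nothing on the optimizer side.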

## What the model says

The particle swarm settles on gains that hold the broth flat at the setpoint across the batch. The genetic algorithm drives the design toward a high-cell-density operating point. The economics that result are not flattering, and that is the finding.

Acetate is the lever. At the wild-type yield Y_ax = 0.92 g/g the culture self-poisons near the inhibition constant and biomass caps around 1.7 g/L. Drop it to 0.05, a low-overflow strain along the lines of a pta/ackA knockout, and the culture reaches roughly 20 g/L.

Economics follow biology. The product yields and price are informed placeholders, flagged as such in the code. Under un-engineered metabolism the process loses money; it turns a profit only on the engineered strain. Production becomes economical when the strain improves, not before.

Scope is stated, not hidden. The model prices crude product and excludes downstream purification, and the stochastic optimizers vary run to run. The headline numbers move with the placeholders and need measured data before any real claim.
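The acetate lever has a simple back-of-envelope form. In the batch limit F = 0, equations (2) and (3) give dA/dt = Y_ax · dX/dt, so A − A₀ = Y_ax (X − X₀), and the acetate factor in (1) shuts growth off as A approaches K_i,A. The sketch below computes the resulting ceiling on biomass from acetate alone. It ignores feed dilution and the glucose and oxygen limits, so it bounds the simulated figures rather than reproducing them; the function name and the K_i,A value in the usage note are assumptions.

```python
def acetate_biomass_ceiling(X0, A0, K_iA, Y_ax):
    """Biomass at which acetate reaches K_iA, assuming F = 0.

    Follows from A - A0 = Y_ax * (X - X0) with growth halting at A = K_iA.
    """
    return X0 + (K_iA - A0) / Y_ax
```

Whatever K_i,A is, the headroom above X₀ scales as 1/Y_ax, so moving from Y_ax = 0.92 to 0.05 raises the acetate-only ceiling by a factor of 0.92/0.05 = 18.4. For example, with a hypothetical K_i,A = 1.0 g/L, no initial acetate, and X₀ = 0.1 g/L, the headroom goes from about 1.09 g/L to 20 g/L.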

The deliverable is the mapping, not a single number: change the strain assumptions or the price ratio c_P/c_S and re-run, and the architecture returns the new optimal design and the controller that keeps it on temperature, with no change to the kinetics or the optimizers.

Python, NumPy, SciPy (BDF stiff solver). DEAP for evolutionary operators, PySwarm for swarm search. Matplotlib and Seaborn for figures.
Liam Kozma · liam@liamkozma.com