Simulators

gpmap.simulate ships a small zoo of toy-landscape generators that subclass GenotypePhenotypeMap. Each generator enumerates the full Cartesian product of per-site alphabets and fills in phenotypes according to its model. Use them when you need a known, reproducible landscape (pass a seeded rng) for testing fits, benchmarking, or pedagogy.
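
The enumeration step can be sketched with the standard library alone. This illustrates the idea and is not gpmap's implementation: the helper name `enumerate_genotypes` is hypothetical, and only the layout of the mutations dict (site index mapped to a list of letters) is taken from the examples on this page.

```python
from itertools import product


def enumerate_genotypes(mutations):
    """Return every genotype in the Cartesian product of per-site alphabets."""
    # Order sites by index so each genotype string reads left to right.
    alphabets = [mutations[site] for site in sorted(mutations)]
    return ["".join(letters) for letters in product(*alphabets)]


mutations = {i: ["A", "T"] for i in range(4)}
genotypes = enumerate_genotypes(mutations)
print(len(genotypes))  # 2**4 = 16 genotypes
print(genotypes[:3])   # ['AAAA', 'AAAT', 'AATA']
```

The map size is the product of the per-site alphabet sizes, so full enumeration is only practical for short sequences.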

Shared interface

Every simulator takes at minimum a wildtype and a mutations dict. Optional kwargs:

  • rng: a np.random.Generator. Defaults to np.random.default_rng(). Pass a seeded generator for reproducibility.
  • site_labels: optional per-site labels.
  • include_binary: whether to build binary_packed eagerly (default True).

All simulators are GenotypePhenotypeMap subclasses, so the full container API (gpm.data, gpm.binary_packed, to_json, etc.) is available on every instance.

NKSimulation

The Kauffman NK model: each site's fitness contribution depends on its K nearest neighbors (wrap-around). Total phenotype is the average per-site contribution.

```python
import numpy as np
from gpmap.simulate import NKSimulation

sim = NKSimulation(
    wildtype="AAAA",
    mutations={i: ["A", "T"] for i in range(4)},
    K=2,
    rng=np.random.default_rng(0),
)
sim.phenotypes.shape  # (16,)
```

| Parameter | Type | Default | Meaning |
| --- | --- | --- | --- |
| K | int | 1 | Neighborhood size, in bits |
| rng | np.random.Generator \| None | None | Seed source for the per-neighborhood random table |

Approximate phenotype range: [0, 1]. Higher K means rougher landscapes with more local optima.
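
As a rough illustration of the model described above, here is a minimal standalone NK sketch. It makes several assumptions that are not confirmed by this page: the neighborhood is taken to be the site itself plus the next K sites (wrapping around), contributions are uniform draws in [0, 1), and the function name `nk_phenotypes` is hypothetical. It uses the standard library's `random.Random` only to stay dependency-free; gpmap itself expects a `np.random.Generator`.

```python
import random
from itertools import product


def nk_phenotypes(genotypes, K, rng):
    """Average per-site contribution; each site reads itself plus K neighbors."""
    n_sites = len(genotypes[0])
    table = {}  # (site, neighborhood letters) -> random contribution in [0, 1)
    phenotypes = []
    for genotype in genotypes:
        total = 0.0
        for site in range(n_sites):
            # Assumed neighborhood: this site and the next K sites, wrapping around.
            hood = "".join(genotype[(site + j) % n_sites] for j in range(K + 1))
            key = (site, hood)
            if key not in table:
                table[key] = rng.random()  # filled lazily, reused afterwards
            total += table[key]
        phenotypes.append(total / n_sites)
    return phenotypes


genotypes = ["".join(g) for g in product("AT", repeat=4)]
phenotypes = nk_phenotypes(genotypes, K=2, rng=random.Random(0))
print(len(phenotypes))  # 16
```

Because two genotypes that agree on a site's neighborhood share that table entry, smaller K means more sharing between neighbors and therefore a smoother landscape.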

MountFujiSimulation

A single-peak additive landscape with optional Gaussian or uniform roughness. The peak is at the wildtype.

```python
from gpmap.simulate import MountFujiSimulation

sim = MountFujiSimulation(
    wildtype="AAAA",
    mutations={i: ["A", "T"] for i in range(4)},
    field_strength=1.0,
    roughness_width=0.1,
    roughness_dist="normal",
)
```

| Parameter | Type | Default | Meaning |
| --- | --- | --- | --- |
| field_strength | float | 1.0 | Slope of the additive component (units: phenotype per Hamming step from WT) |
| roughness_width | float | 0.0 | Standard deviation ("normal") or half-width ("uniform") of the noise term |
| roughness_dist | "normal" \| "uniform" | "normal" | Noise distribution |

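Phenotype is -field_strength * hamming_distance + noise, where hamming_distance is measured from the wildtype. With roughness_width=0 the wildtype is the global maximum at 0; with noise, the wildtype's phenotype is 0 + noise and a nearby genotype can occasionally exceed it.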
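
The formula can be checked by hand with a small standalone sketch. The function names `hamming` and `mount_fuji` are hypothetical, and the noise handling (one independent draw per genotype) is an assumption about how the roughness term is applied rather than a description of gpmap's code.

```python
import random
from itertools import product


def hamming(a, b):
    """Number of positions at which two equal-length genotypes differ."""
    return sum(x != y for x, y in zip(a, b))


def mount_fuji(genotypes, wildtype, field_strength=1.0,
               roughness_width=0.0, roughness_dist="normal", rng=None):
    """phenotype = -field_strength * hamming_distance + noise."""
    rng = rng or random.Random()
    phenotypes = []
    for g in genotypes:
        if roughness_width == 0.0:
            noise = 0.0
        elif roughness_dist == "normal":
            noise = rng.gauss(0.0, roughness_width)  # width = standard deviation
        elif roughness_dist == "uniform":
            noise = rng.uniform(-roughness_width, roughness_width)  # width = half-width
        else:
            raise ValueError(f"unknown roughness_dist: {roughness_dist!r}")
        phenotypes.append(-field_strength * hamming(g, wildtype) + noise)
    return phenotypes


genotypes = ["".join(g) for g in product("AT", repeat=4)]
smooth = mount_fuji(genotypes, "AAAA")
rough = mount_fuji(genotypes, "AAAA", roughness_width=0.1, rng=random.Random(0))
print(smooth[genotypes.index("TTTT")])  # -4.0: four Hamming steps from the wildtype
```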

MultiPeakMountFujiSimulation

A max-of-Fujis landscape: pick peak_n peaks at random (subject to a minimum pairwise Hamming distance), build a single-peak Fuji around each, and the genotype's phenotype is the max over those.

```python
from gpmap.simulate import MultiPeakMountFujiSimulation

sim = MultiPeakMountFujiSimulation(
    wildtype="AAAA",
    mutations={i: ["A", "T"] for i in range(4)},
    peak_n=3,
    min_peak_distance=2,
)

sim.peak_genotypes  # list[str] of length 3
```

| Parameter | Type | Default | Meaning |
| --- | --- | --- | --- |
| peak_n | int | 2 | Number of peaks to plant |
| min_peak_distance | int | 1 | Minimum Hamming distance between any two peaks |
| max_peak_distance | int \| None | None (= len(wildtype)) | Maximum allowed pairwise distance |
| max_proposal_retries | int | 10_000 | Hard cap on peak-search retries before raising |

Retry cap

v1 had an unguarded while loop that could spin forever on infeasible constraints. v2 caps the search and raises RuntimeError with a clear message if it cannot place all peaks.
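
A capped peak search of this kind can be sketched as plain rejection sampling. This is one possible approach under stated assumptions, not the library's actual algorithm: `pick_peaks` and `multi_peak_fuji` are hypothetical names, only the minimum-distance constraint is enforced, and the noise term is omitted.

```python
import random
from itertools import product


def hamming(a, b):
    return sum(x != y for x, y in zip(a, b))


def pick_peaks(genotypes, peak_n, min_peak_distance, rng,
               max_proposal_retries=10_000):
    """Rejection-sample peaks; raise RuntimeError once the cap is reached."""
    peaks = []
    for _ in range(max_proposal_retries):
        if len(peaks) == peak_n:
            break
        candidate = rng.choice(genotypes)
        # Accept only candidates far enough from every peak chosen so far.
        if all(hamming(candidate, p) >= min_peak_distance for p in peaks):
            peaks.append(candidate)
    if len(peaks) < peak_n:
        raise RuntimeError(
            f"placed {len(peaks)} of {peak_n} peaks after "
            f"{max_proposal_retries} proposals; constraints may be infeasible"
        )
    return peaks


def multi_peak_fuji(genotypes, peaks, field_strength=1.0):
    """Each phenotype is the max over single-peak Fujis centered on the peaks."""
    return [max(-field_strength * hamming(g, p) for p in peaks) for g in genotypes]


genotypes = ["".join(g) for g in product("AT", repeat=4)]
peaks = pick_peaks(genotypes, peak_n=3, min_peak_distance=2, rng=random.Random(0))
phenotypes = multi_peak_fuji(genotypes, peaks)

# Three 4-site binary genotypes cannot all be 4 steps apart, so this raises.
try:
    pick_peaks(genotypes, 3, 4, random.Random(0), max_proposal_retries=200)
except RuntimeError as err:
    print("infeasible:", err)
```

Bounding the loop with `range(max_proposal_retries)` is what turns an infeasible request into a prompt error instead of a hang.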

HouseOfCardsSimulation

NK with K = n_bits - 1: every site sees the full genotype, so neighboring genotypes are uncorrelated. The roughest possible NK landscape.

```python
from gpmap.simulate import HouseOfCardsSimulation

sim = HouseOfCardsSimulation(
    wildtype="AAAA",
    mutations={i: ["A", "T"] for i in range(4)},
)
```

No tunable knobs beyond rng; K is fixed by the genotype length.
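
To see why K = n_bits - 1 removes all correlation, note that every site's lookup key becomes the whole genotype, so no two genotypes share a table entry. A minimal standalone sketch under that reading (the function name `house_of_cards` and the uniform [0, 1) contributions are assumptions):

```python
import random
from itertools import product


def house_of_cards(genotypes, rng):
    """Each genotype gets n_sites fresh draws that no other genotype shares."""
    n_sites = len(genotypes[0])
    return [
        sum(rng.random() for _ in range(n_sites)) / n_sites
        for _ in genotypes
    ]


genotypes = ["".join(g) for g in product("AT", repeat=4)]
phenotypes = house_of_cards(genotypes, random.Random(0))
print(len(phenotypes))  # 16
```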

RandomPhenotypesSimulation

Phenotypes are drawn uniformly from [low, high]. The simplest possible landscape, useful as a null model.

```python
from gpmap.simulate import RandomPhenotypesSimulation

sim = RandomPhenotypesSimulation(
    wildtype="AAAA",
    mutations={i: ["A", "T"] for i in range(4)},
    low=0.0,
    high=1.0,
)
```

Build from length

Every simulator that subclasses BaseSimulation exposes a .from_length constructor for the case where you do not care about the specific alphabet, just its size:

```python
import numpy as np
from gpmap.simulate import NKSimulation

sim = NKSimulation.from_length(
    length=5,
    alphabet_size=3,
    kind="DNA",
    K=2,
    # rng must be a np.random.Generator, as in the shared interface
    rng=np.random.default_rng(0),
)
```

| Parameter | Type | Meaning |
| --- | --- | --- |
| length | int | Number of sites |
| alphabet_size | int | Per-site alphabet size (>= 2) |
| kind | "AA" \| "DNA" \| "RNA" \| "BINARY" | Source alphabet to sample from |

The wildtype is the first letter of each site's randomly chosen alphabet.
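
The construction can be pictured as follows. This is a sketch under assumptions, not the library's code: the `ALPHABETS` table and the helper `random_site_alphabets` are hypothetical, and each site is assumed to sample its letters independently and without replacement from the source alphabet.

```python
import random

# Hypothetical source alphabets for each kind.
ALPHABETS = {
    "AA": "ACDEFGHIKLMNPQRSTVWY",
    "DNA": "ACGT",
    "RNA": "ACGU",
    "BINARY": "01",
}


def random_site_alphabets(length, alphabet_size, kind, rng):
    """Return (wildtype, mutations) with a random alphabet per site."""
    source = ALPHABETS[kind]
    if not 2 <= alphabet_size <= len(source):
        raise ValueError(f"alphabet_size must be in [2, {len(source)}] for {kind}")
    # Each site draws alphabet_size distinct letters from the source alphabet.
    mutations = {site: rng.sample(source, alphabet_size) for site in range(length)}
    # The wildtype letter at each site is the first letter drawn for that site.
    wildtype = "".join(mutations[site][0] for site in range(length))
    return wildtype, mutations


wildtype, mutations = random_site_alphabets(5, 3, "DNA", random.Random(0))
print(len(wildtype), len(mutations))  # 5 5
```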

Masking: subsample a map

gpmap.simulate.mask returns a MaskedGPM named tuple with two fields: fraction, the fraction of genotypes actually kept, and gpm, a GenotypePhenotypeMap holding that subset of the input genotypes, sampled uniformly without replacement:

```python
import numpy as np
from gpmap.simulate import mask

# sim is any simulator instance built as in the sections above
masked = mask(sim, fraction=0.5, rng=np.random.default_rng(0))
masked.fraction      # actual fraction kept (rounded)
masked.gpm           # subsampled GenotypePhenotypeMap
```

Named-tuple return

v1 returned (float, GPM). v2 returns a MaskedGPM named tuple so you can index by name (.fraction, .gpm) instead of remembering the order.
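
The pattern of returning the realized fraction together with the subsample can be sketched without gpmap. Everything here is illustrative: `MaskedSample` and `mask_sketch` are hypothetical stand-ins, and rounding the kept count with `round` is an assumption about how the requested fraction maps to a whole number of genotypes.

```python
import random
from typing import NamedTuple


class MaskedSample(NamedTuple):
    fraction: float   # fraction actually kept, which may differ from the request
    genotypes: list
    phenotypes: list


def mask_sketch(genotypes, phenotypes, fraction, rng):
    """Keep a uniform random subset of genotypes, sampled without replacement."""
    n_total = len(genotypes)
    n_keep = max(1, round(fraction * n_total))  # whole number of genotypes
    keep = sorted(rng.sample(range(n_total), n_keep))
    return MaskedSample(
        fraction=n_keep / n_total,
        genotypes=[genotypes[i] for i in keep],
        phenotypes=[phenotypes[i] for i in keep],
    )


genotypes = [format(i, "04b") for i in range(16)]
phenotypes = [float(g.count("1")) for g in genotypes]
masked = mask_sketch(genotypes, phenotypes, fraction=0.3, rng=random.Random(0))
print(masked.fraction)  # 0.3125, because 5 of 16 genotypes are kept
```

Access by field name (`masked.fraction`) and positional unpacking (`frac, genos, phenos = masked`) both work on a named tuple.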

Adding noise to a simulator

Simulators expose a .set_stdeviations(sigma) method so you can attach the same measurement uncertainty to every genotype after construction:

```python
from gpmap.simulate import NKSimulation

sim = NKSimulation(wildtype="AAAA", mutations={i: ["A", "T"] for i in range(4)})
sim.set_stdeviations(0.05)
sim.stdeviations  # array([0.05, 0.05, ...])
```

For per-genotype noise, build the simulator, copy gpm.phenotypes, perturb, and feed the result back through GenotypePhenotypeMap(...).
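
The perturbation step on its own might look like the sketch below. It works on plain lists and a hypothetical helper named `perturb`; the final step of passing the noisy values to GenotypePhenotypeMap(...) is left out because the constructor's arguments are not spelled out on this page.

```python
import random


def perturb(phenotypes, sigmas, rng):
    """Return a noisy copy; the input list is left untouched."""
    # One Gaussian draw per genotype, each with its own standard deviation.
    return [p + rng.gauss(0.0, s) for p, s in zip(phenotypes, sigmas)]


phenotypes = [0.0, -1.0, -1.0, -2.0]
sigmas = [0.01, 0.05, 0.05, 0.10]  # per-genotype measurement uncertainty
noisy = perturb(phenotypes, sigmas, random.Random(0))
print(len(noisy))  # 4
```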