Simulators
gpmap.simulate ships a small zoo of toy-landscape generators that subclass GenotypePhenotypeMap. Each generator enumerates the full Cartesian product of the per-site alphabets and fills in phenotypes according to its model. Use them when you need a reproducible landscape with a known generating model (pass a seeded rng) for testing fits, benchmarking, or pedagogy.
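To make "full Cartesian product of per-site alphabets" concrete, here is a minimal standalone sketch using only the standard library. It does not call gpmap; the site-ordered enumeration shown here is an assumption for illustration, and the library's own genotype ordering may differ.

```python
from itertools import product

# Per-site alphabets, in the same shape as the `mutations` dict used on this page.
mutations = {0: ["A", "T"], 1: ["A", "T"], 2: ["A", "C", "G"]}

# Cartesian product over sites, taken in site order (assumed ordering).
site_alphabets = [mutations[i] for i in sorted(mutations)]
genotypes = ["".join(letters) for letters in product(*site_alphabets)]

print(len(genotypes))  # 2 * 2 * 3 = 12
print(genotypes[:3])   # ['AAA', 'AAC', 'AAG']
```

The genotype count is the product of the per-site alphabet sizes, so it grows exponentially with the number of sites.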
Shared interface
Every simulator takes at minimum a wildtype and a mutations dict. Optional kwargs:
- rng: a np.random.Generator. Defaults to np.random.default_rng(). Pass a seeded generator for reproducibility.
- site_labels: optional per-site labels.
- include_binary: whether to build binary_packed eagerly (default True).
All simulators are GenotypePhenotypeMap subclasses, so the full container API (gpm.data, gpm.binary_packed, to_json, etc.) is available on every instance.
NKSimulation
The Kauffman NK model: each site's fitness contribution depends on its own state and on its K nearest neighbors (with wrap-around at the ends). The total phenotype is the average of the per-site contributions.
import numpy as np
from gpmap.simulate import NKSimulation
sim = NKSimulation(
    wildtype="AAAA",
    mutations={i: ["A", "T"] for i in range(4)},
    K=2,
    rng=np.random.default_rng(0),
)
sim.phenotypes.shape # (16,)
| Parameter | Type | Default | Meaning |
|---|---|---|---|
| K | int | 1 | Neighborhood size, in bits |
| rng | np.random.Generator \| None | None | Seed source for the per-neighborhood random table |
Approximate phenotype range: [0, 1]. Higher K gives more rugged landscapes with more local optima.
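The NK construction can be sketched in plain NumPy. This is a toy binary version written for illustration, not the NKSimulation implementation: the function name nk_phenotypes is hypothetical, and taking the "neighbors" of site i to be the K sites to its right (with wrap-around) is an assumption.

```python
import itertools
import numpy as np

def nk_phenotypes(n_sites, K, rng):
    """Toy binary NK landscape: returns (genotypes, phenotypes)."""
    # One random lookup table per site, indexed by the state of the site
    # plus its K neighbors (K + 1 bits -> 2**(K + 1) entries).
    tables = rng.random((n_sites, 2 ** (K + 1)))
    genotypes = np.array(list(itertools.product([0, 1], repeat=n_sites)))
    phenotypes = np.empty(len(genotypes))
    for g, bits in enumerate(genotypes):
        total = 0.0
        for i in range(n_sites):
            # Site i and the K sites to its right, wrapping around the end.
            window = [int(bits[(i + j) % n_sites]) for j in range(K + 1)]
            index = sum(b << k for k, b in enumerate(window))
            total += tables[i, index]
        phenotypes[g] = total / n_sites  # average per-site contribution
    return genotypes, phenotypes

genotypes, phenotypes = nk_phenotypes(n_sites=4, K=2, rng=np.random.default_rng(0))
print(phenotypes.shape)  # (16,)
```

Because every table entry is drawn from [0, 1) and the phenotype is an average, all values stay within [0, 1]. Larger K makes each table depend on more sites, so a single mutation reshuffles more contributions.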
MountFujiSimulation
A single-peak additive landscape with optional Gaussian or uniform roughness. The peak is at the wildtype.
from gpmap.simulate import MountFujiSimulation
sim = MountFujiSimulation(
    wildtype="AAAA",
    mutations={i: ["A", "T"] for i in range(4)},
    field_strength=1.0,
    roughness_width=0.1,
    roughness_dist="normal",
)
| Parameter | Type | Default | Meaning |
|---|---|---|---|
| field_strength | float | 1.0 | Slope of the additive component (units: phenotype per Hamming step from WT) |
| roughness_width | float | 0.0 | Standard deviation ("normal") or half-width ("uniform") of the noise term |
| roughness_dist | "normal" \| "uniform" | "normal" | Noise distribution |
Phenotype is -field_strength * hamming_distance + noise. Without roughness the wildtype is the global maximum at 0; with roughness its value is 0 + noise, so a noisy neighbor can occasionally exceed it.
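The formula above can be written out in a few lines of NumPy. This is a sketch under stated assumptions rather than the library's code: mount_fuji is a hypothetical helper, it uses one shared alphabet for all sites, and it assumes the noise is zero-centered.

```python
import itertools
import numpy as np

def mount_fuji(wildtype, alphabet, field_strength=1.0, roughness_width=0.0,
               roughness_dist="normal", rng=None):
    """Return (genotypes, phenotypes) for a single-peak Fuji landscape."""
    rng = np.random.default_rng() if rng is None else rng
    genotypes = ["".join(g) for g in itertools.product(alphabet, repeat=len(wildtype))]
    # Hamming distance of every genotype from the wildtype.
    hamming = np.array([sum(a != b for a, b in zip(g, wildtype)) for g in genotypes])
    if roughness_dist == "normal":
        noise = rng.normal(0.0, roughness_width, size=len(genotypes))
    elif roughness_dist == "uniform":
        noise = rng.uniform(-roughness_width, roughness_width, size=len(genotypes))
    else:
        raise ValueError(f"unknown roughness_dist: {roughness_dist!r}")
    return genotypes, -field_strength * hamming + noise

# With roughness_width=0.0 the landscape is purely additive.
genotypes, phenotypes = mount_fuji("AAAA", "AT")
print(phenotypes[genotypes.index("AAAA")])  # wildtype sits at the peak
print(phenotypes[genotypes.index("TTTT")])  # four Hamming steps away
```

In the noise-free case each Hamming step from the wildtype lowers the phenotype by exactly field_strength.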
MultiPeakMountFujiSimulation
A max-of-Fujis landscape: pick peak_n peaks at random (subject to a minimum pairwise Hamming distance), build a single-peak Fuji around each, and take each genotype's phenotype to be the maximum over those single-peak landscapes.
from gpmap.simulate import MultiPeakMountFujiSimulation
sim = MultiPeakMountFujiSimulation(
    wildtype="AAAA",
    mutations={i: ["A", "T"] for i in range(4)},
    peak_n=3,
    min_peak_distance=2,
)
sim.peak_genotypes # list[str] of length 3
| Parameter | Type | Default | Meaning |
|---|---|---|---|
| peak_n | int | 2 | Number of peaks to plant |
| min_peak_distance | int | 1 | Minimum Hamming distance between any two peaks |
| max_peak_distance | int \| None | None (= len(wildtype)) | Maximum allowed pairwise distance |
| max_proposal_retries | int | 10_000 | Hard cap on peak-search retries before raising |
Retry cap
v1 had an unguarded while loop that could spin forever on infeasible constraints. v2 caps the search and raises RuntimeError with a clear message if it cannot place all peaks.
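A capped peak search of this kind might look like the sketch below. It is an illustration only: place_peaks and max_of_fujis are hypothetical helpers, rejection sampling is an assumed search strategy, and the error message is not the library's wording. Only the parameter names and the RuntimeError come from the table and note above.

```python
import itertools
import numpy as np

def hamming(a, b):
    """Number of positions at which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))

def place_peaks(genotypes, peak_n, min_peak_distance=1, max_peak_distance=None,
                max_proposal_retries=10_000, rng=None):
    """Rejection-sample peak_n peaks; give up after max_proposal_retries proposals."""
    rng = np.random.default_rng() if rng is None else rng
    if max_peak_distance is None:
        max_peak_distance = len(genotypes[0])
    peaks = []
    attempts = 0
    while len(peaks) < peak_n:
        if attempts >= max_proposal_retries:
            raise RuntimeError(
                f"placed {len(peaks)} of {peak_n} peaks after {attempts} proposals; "
                "the distance constraints may be infeasible"
            )
        attempts += 1
        candidate = genotypes[rng.integers(len(genotypes))]
        # Accept only if the candidate respects both distance bounds
        # against every peak placed so far.
        if all(min_peak_distance <= hamming(candidate, p) <= max_peak_distance
               for p in peaks):
            peaks.append(candidate)
    return peaks

def max_of_fujis(genotypes, peaks, field_strength=1.0):
    """Noise-free phenotype: the best single-peak Fuji value over all peaks."""
    return np.array([max(-field_strength * hamming(g, p) for p in peaks)
                     for g in genotypes])

genotypes = ["".join(g) for g in itertools.product("AT", repeat=4)]
peaks = place_peaks(genotypes, peak_n=3, min_peak_distance=2,
                    rng=np.random.default_rng(0))
phenotypes = max_of_fujis(genotypes, peaks)
```

An infeasible request, such as three peaks at mutual distance 4 on a four-site binary landscape, exhausts the retry budget and raises instead of looping forever.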
HouseOfCardsSimulation
NK with K = n_bits - 1: every site sees the full genotype, so neighboring genotypes are uncorrelated. The roughest possible NK landscape.
from gpmap.simulate import HouseOfCardsSimulation
sim = HouseOfCardsSimulation(
    wildtype="AAAA",
    mutations={i: ["A", "T"] for i in range(4)},
)
No tunable knobs beyond rng; K is fixed by the geometry.
RandomPhenotypesSimulation
Phenotypes are drawn uniformly from [low, high]. The simplest possible landscape, useful as a null model.
from gpmap.simulate import RandomPhenotypesSimulation
sim = RandomPhenotypesSimulation(
    wildtype="AAAA",
    mutations={i: ["A", "T"] for i in range(4)},
    low=0.0,
    high=1.0,
)
Build from length
Every simulator that subclasses BaseSimulation exposes a .from_length constructor for the case where you do not care about the specific alphabet, just its size:
import numpy as np
from gpmap.simulate import NKSimulation
sim = NKSimulation.from_length(
    length=5,
    alphabet_size=3,
    kind="DNA",
    K=2,
    rng=np.random.default_rng(0),  # rng must be a np.random.Generator, not random.Random
)
| Parameter | Type | Meaning |
|---|---|---|
| length | int | Number of sites |
| alphabet_size | int | Per-site alphabet size (>= 2) |
| kind | "AA" \| "DNA" \| "RNA" \| "BINARY" | Source alphabet to sample from |
The wildtype is built from the first letter of each site's randomly chosen alphabet.
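The alphabet sampling described above can be sketched as follows. Everything here is illustrative: random_alphabets and SOURCE_ALPHABETS are hypothetical names, the letter sets are the conventional ones rather than anything confirmed for gpmap, and sampling each site independently without replacement is an assumption.

```python
import numpy as np

# Conventional letter sets (assumed; the library's own tables may differ).
SOURCE_ALPHABETS = {
    "DNA": list("ACGT"),
    "RNA": list("ACGU"),
    "BINARY": list("01"),
    "AA": list("ACDEFGHIKLMNPQRSTVWY"),
}

def random_alphabets(length, alphabet_size, kind, rng):
    """Return (wildtype, mutations) with a random alphabet at each site."""
    source = SOURCE_ALPHABETS[kind]
    if not 2 <= alphabet_size <= len(source):
        raise ValueError(f"alphabet_size must be between 2 and {len(source)} for {kind}")
    mutations = {
        # Draw alphabet_size distinct letters for this site.
        i: [str(x) for x in rng.choice(source, size=alphabet_size, replace=False)]
        for i in range(length)
    }
    # The wildtype takes the first letter drawn at each site.
    wildtype = "".join(mutations[i][0] for i in range(length))
    return wildtype, mutations

wildtype, mutations = random_alphabets(5, 3, "DNA", np.random.default_rng(0))
```

The resulting wildtype and mutations have the same shape as the arguments passed to the simulators elsewhere on this page.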
Masking: subsample a map
gpmap.simulate.mask returns a MaskedGPM named tuple, (fraction, gpm), holding the fraction actually kept and a GenotypePhenotypeMap containing that fraction of the input genotypes, sampled uniformly without replacement:
import numpy as np
from gpmap.simulate import mask
masked = mask(sim, fraction=0.5, rng=np.random.default_rng(0))
masked.fraction # actual fraction kept (rounded)
masked.gpm # subsampled GenotypePhenotypeMap
Named-tuple return
v1 returned (float, GPM). v2 returns a MaskedGPM named tuple so you can index by name (.fraction, .gpm) instead of remembering the order.
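The sampling and the named-tuple return can be illustrated without gpmap. In this sketch MaskedSample and mask_indices are hypothetical stand-ins that return kept row indices rather than a GenotypePhenotypeMap, and both the rounding rule and the floor of one kept genotype are assumptions.

```python
from typing import NamedTuple

import numpy as np

class MaskedSample(NamedTuple):
    fraction: float      # fraction actually kept after rounding to whole genotypes
    indices: np.ndarray  # sorted row indices of the kept genotypes

def mask_indices(n_genotypes, fraction, rng):
    """Pick round(fraction * n_genotypes) rows uniformly without replacement."""
    n_keep = max(1, round(fraction * n_genotypes))  # keep at least one row (assumed)
    keep = np.sort(rng.choice(n_genotypes, size=n_keep, replace=False))
    return MaskedSample(fraction=n_keep / n_genotypes, indices=keep)

sample = mask_indices(10, 0.33, np.random.default_rng(0))
print(sample.fraction)      # 0.3, since only whole genotypes can be kept
print(len(sample.indices))  # 3
```

The requested and actual fractions differ whenever fraction * n_genotypes is not a whole number, which is why the actual fraction is returned alongside the subsample.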
Adding noise to a simulator
Simulators expose a .set_stdeviations(sigma) method so you can attach a uniform measurement uncertainty after construction:
sim = NKSimulation(wildtype="AAAA", mutations={i: ["A", "T"] for i in range(4)})
sim.set_stdeviations(0.05)
sim.stdeviations # array([0.05, 0.05, ...])
For per-genotype noise, build the simulator, copy gpm.phenotypes, perturb, and feed the result back through GenotypePhenotypeMap(...).
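The copy-and-perturb step might look like this in NumPy. The arrays below are stand-ins for a simulator's phenotypes, the Gaussian noise model is an assumption, and the final hand-off to GenotypePhenotypeMap(...) is left out because its constructor arguments are not shown on this page.

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-ins for sim.phenotypes and a per-genotype sigma (illustrative values).
phenotypes = np.linspace(0.0, 1.0, 16)
stdeviations = np.linspace(0.01, 0.10, 16)

# Work on a copy so the original landscape stays untouched.
noisy = phenotypes.copy()
# An array-valued scale draws one Gaussian deviate per genotype,
# each with its own standard deviation.
noisy += rng.normal(0.0, stdeviations)
```

The noisy array, together with stdeviations, is what would then be passed into a new GenotypePhenotypeMap.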