gpmap.io¶
CSV, JSON, Excel, and pickle round-trip functions. See Loading and saving for a guide-style walkthrough.
read_json and to_json¶
def read_json(path: str | Path) -> GenotypePhenotypeMap
def to_json(gpm: GenotypePhenotypeMap, path: str | Path) -> None
JSON writes carry a top-level "schema_version": "1". Reads of files without that field emit a UserWarning.
read_csv and to_csv¶
def read_csv(path: str | Path, **kwargs: Any) -> GenotypePhenotypeMap
def to_csv(gpm: GenotypePhenotypeMap, path: str | Path, **kwargs: Any) -> None
to_csv writes two files: the data CSV (genotypes, phenotypes, stdeviations, n_replicates) and a sidecar <basename>.csv.meta.json (wildtype, mutations, site_labels, schema_version).
read_csv looks for the sidecar at either <basename>.csv.meta.json or the older <basename>.meta.json and raises FileNotFoundError if neither exists.
**kwargs are forwarded to pd.read_csv / pd.to_csv. index=False is set as a default for to_csv.
read_excel and to_excel¶
def read_excel(path: str | Path, **kwargs: Any) -> GenotypePhenotypeMap
def to_excel(gpm: GenotypePhenotypeMap, path: str | Path, **kwargs: Any) -> None
Writes a two-sheet workbook (data and meta). The meta sheet stores schema fields as key/value pairs.
**kwargs are forwarded to pd.read_excel(... sheet_name="data") / `pd.ExcelWriter.
Engine
Excel I/O uses the engine pandas picks by default. Install openpyxl if it is not already present in your environment.
read_pickle and to_pickle¶
def read_pickle(path: str | Path) -> GenotypePhenotypeMap
def to_pickle(gpm: GenotypePhenotypeMap, path: str | Path) -> None
Standard pickle.dump and pickle.load under the hood. read_pickle checks the loaded object's type and raises TypeError if it is not a GenotypePhenotypeMap.
Pickle preserves cached binary arrays, so a re-loaded map is immediately ready to use without re-running binary_packed construction.
Cross-version pickle
Pickle files are only guaranteed to load on the same major version. Class paths inside gpmap are stable within a major version; an internal refactor that moves a class between modules constitutes a breaking change.
File-format choice¶
| Format | Pros | Cons |
|---|---|---|
| JSON | Self-contained, human-readable, schema-versioned | Slowest large-map write (~150 ms at n=65k) |
| CSV + sidecar | Pandas-friendly, fast | Two files to track |
| Excel | Single file, collaborator-friendly | Slowest format overall |
| Pickle | Fastest round-trip | Version-fragile, opaque |
For sharing maps with anyone outside the gpmap-v2 ecosystem, prefer CSV or JSON.