SessionCache
In-process object store that exposes large values as named handles with compact snapshots. The model never sees raw data in message history; it operates on handles through the Python interpreter.
data_harness.cache.SessionCache
SessionCache(
sample_size: int = 5,
storage_dir: str | Path | None = None,
hot_limit: int | None = None,
)
Large objects (DataFrames, arrays, query results) are stored by name. The model only ever sees a compact snapshot — shape, columns, a few sample rows — and operates on the data by writing Python against the handle name. This keeps message context lean without hiding data from the model.
When hot_limit is set, least-recently-used handles are spilled to disk
automatically. DataFrames are written as Parquet, NumPy arrays as .npy,
and everything else as pickle.
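The spill behaviour described above can be pictured with a small stand-in. This is a minimal sketch, not the SessionCache implementation: the class `TinySpillStore`, its private attributes, the `.pkl` file naming, and the temporary-directory fallback are all assumptions made for illustration. It also pickles every value rather than choosing Parquet or .npy by type.

```python
from __future__ import annotations

import pickle
import tempfile
from collections import OrderedDict
from pathlib import Path
from typing import Any


class TinySpillStore:
    """Illustrative LRU store with disk spill (hypothetical, not SessionCache)."""

    def __init__(self, hot_limit: int | None = None,
                 storage_dir: str | Path | None = None) -> None:
        self.hot_limit = hot_limit
        # Assumption: use a fresh temporary directory when none is given.
        self.storage_dir = Path(storage_dir) if storage_dir is not None else Path(tempfile.mkdtemp())
        self._hot: OrderedDict[str, Any] = OrderedDict()  # least recently used first
        self._cold: dict[str, Path] = {}                  # handle -> spill file

    def put(self, name: str, value: Any) -> None:
        stale = self._cold.pop(name, None)
        if stale is not None:
            stale.unlink()  # drop the outdated spill file
        self._hot[name] = value
        self._hot.move_to_end(name)
        self._evict()

    def get(self, name: str) -> Any:
        if name in self._cold:
            # Promote a cold entry: load it from disk and make it hot again.
            path = self._cold.pop(name)
            with path.open("rb") as fh:
                self._hot[name] = pickle.load(fh)
            path.unlink()
        value = self._hot[name]  # raises KeyError for unknown handles
        self._hot.move_to_end(name)
        self._evict()
        return value

    def _evict(self) -> None:
        if self.hot_limit is None:
            return
        while len(self._hot) > self.hot_limit:
            # Spill the least recently used handle to disk.
            name, value = self._hot.popitem(last=False)
            path = self.storage_dir / f"{name}.pkl"
            with path.open("wb") as fh:
                pickle.dump(value, fh)
            self._cold[name] = path


store = TinySpillStore(hot_limit=2)
store.put("a", [1, 2, 3])
store.put("b", {"k": "v"})
store.put("c", "third")  # "a" is now the least recently used and is spilled
```

Reading `a` back through `get` would promote it to memory and, with `hot_limit=2`, spill `b` in turn.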
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `sample_size` | `int` | Number of rows/elements to include in each snapshot. | `5` |
| `storage_dir` | `str \| Path \| None` | Directory for disk-spilled handles. If … | `None` |
| `hot_limit` | `int \| None` | Maximum number of handles kept in memory at once. | `None` |
Source code in data_harness/cache.py
put
Store a value under name and return the handle actually used.
If name is already taken and overwrite is False, a numeric
suffix is appended (name_2, name_3, …) and the new handle is
returned.
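The suffix rule can be sketched as a standalone function. This is an illustration under stated assumptions, not the library's code: `resolve_handle` and its `taken` argument are hypothetical names, and the sketch covers only the name-collision behaviour described above.

```python
from __future__ import annotations


def resolve_handle(name: str, taken: set[str], overwrite: bool = False) -> str:
    """Return the handle name that would be used (hypothetical helper)."""
    if overwrite or name not in taken:
        return name
    # Try name_2, name_3, ... until a free name is found.
    suffix = 2
    while f"{name}_{suffix}" in taken:
        suffix += 1
    return f"{name}_{suffix}"


print(resolve_handle("df", {"df"}))                  # df_2
print(resolve_handle("df", {"df", "df_2"}))          # df_3
print(resolve_handle("df", {"df"}, overwrite=True))  # df
```

Because the returned name may differ from the requested one, callers should keep the return value of put rather than assume the original name was used.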
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `name` | `str` | Desired handle name. Must be a valid Python identifier. | required |
| `value` | `Any` | Any Python object. DataFrames and NumPy arrays get specialised snapshot and spill formats. | required |
| `overwrite` | `bool` | Replace the existing handle if … | `False` |
Returns:
| Type | Description |
|---|---|
| `str` | The handle name under which the value was stored. |
Raises:
| Type | Description |
|---|---|
| `ValueError` | If … |
get
Retrieve a value by handle name, promoting cold entries to hot.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `name` | `str` | A handle previously returned by … | required |
Returns:
| Type | Description |
|---|---|
| `Any` | The stored Python object. |
Raises:
| Type | Description |
|---|---|
| `KeyError` | If no handle with … |
snapshot
Return the compact snapshot string for a stored handle.
The snapshot is a JSON string describing the value's type, shape, and a few sample elements. It is what the model sees in message history instead of the raw object.
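To make the idea of a snapshot concrete, here is a minimal sketch that summarises plain Python containers as JSON. The function `make_snapshot` and the field names `type`, `shape`, `sample`, and `repr` are assumptions chosen for illustration; the actual snapshot schema is not specified in this reference, and DataFrame or array handling is not reproduced.

```python
import json
from itertools import islice
from typing import Any


def make_snapshot(value: Any, sample_size: int = 5) -> str:
    """Summarise a value as a JSON string (hypothetical field names)."""
    summary: dict[str, Any] = {"type": type(value).__name__}
    if isinstance(value, (list, tuple)):
        summary["shape"] = [len(value)]
        summary["sample"] = list(value[:sample_size])
    elif isinstance(value, dict):
        summary["shape"] = [len(value)]
        # Keys are stringified so that any hashable key can be serialised.
        summary["sample"] = {str(k): v for k, v in islice(value.items(), sample_size)}
    else:
        # Fall back to a truncated repr for values without a natural shape.
        summary["repr"] = repr(value)[:80]
    # default=str keeps non-JSON-native sample elements from raising.
    return json.dumps(summary, default=str)


print(make_snapshot(list(range(1000)), sample_size=3))
# {"type": "list", "shape": [1000], "sample": [0, 1, 2]}
```

The point of the design is that the summary stays a few hundred bytes regardless of how large the stored value is.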
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `handle` | `str` | A handle previously returned by … | required |
Returns:
| Type | Description |
|---|---|
| `str` | A JSON string summary of the stored value. |
list_handles
handle_names
has_handle
delete
Remove a handle and its associated disk artefact (if any).
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `name` | `str` | Handle to remove. | required |
Raises:
| Type | Description |
|---|---|
| `KeyError` | If no handle with … |
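Removing a handle together with its disk artefact might look like the sketch below. It is an assumption-laden illustration, not the library's code: `delete_handle` and the `hot`/`cold` mappings are hypothetical, and raising `KeyError` for an unknown name is inferred from the Raises table above.

```python
from __future__ import annotations

import tempfile
from pathlib import Path
from typing import Any


def delete_handle(name: str, hot: dict[str, Any], cold: dict[str, Path]) -> None:
    """Drop a handle from memory or disk (hypothetical helper)."""
    if name in hot:
        del hot[name]
    elif name in cold:
        # Remove the spill file along with the bookkeeping entry.
        cold.pop(name).unlink(missing_ok=True)
    else:
        raise KeyError(name)


# Usage: one in-memory handle and one spilled handle.
spill_file = Path(tempfile.mkdtemp()) / "big.pkl"
spill_file.write_bytes(b"placeholder")
hot = {"small": [1, 2, 3]}
cold = {"big": spill_file}

delete_handle("big", hot, cold)    # deletes the file on disk as well
delete_handle("small", hot, cold)  # only frees the in-memory entry
```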