How Memory Works
Memory updates happen during context compaction, not on every turn. That keeps the conversation hot path cheap and lets the curator see a meaningful slice of recent activity.
The pipeline
PreCompact hook fires
  ↓
Curator LLM → extracts memory drafts from the about-to-be-compacted turns
  ↓
Resolver LLM → merges drafts against existing memories (insert / update / promote / demote / forget)
  ↓
Salience scorer → assigns a weight per memory based on novelty + reuse signals
  ↓
SQLite + vec → memories land in ~/.ptah/ptah.db, chunks are embedded and indexed for hybrid search
Curator and resolver
Both stages are LLM calls. By default they use claude-haiku-4-20251022, which is fast and cheap; that matters because the curator runs on every compaction. Override via memory.curatorModel if you want a sharper or cheaper model.
The curator’s output is structured: each draft has a kind (fact | preference | event | entity), a body, an optional subject, and a tier hint. The resolver does the work of deciding what’s actually new versus what’s a refinement of something Ptah already knows.
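The draft shape and the resolver's five outcomes can be sketched as TypeScript types. This is a minimal illustration, not Ptah's actual schema: the names MemoryDraft, StoredMemory, Resolution, and resolveDraft are hypothetical, and the matching rule (same kind and subject) is a toy stand-in for the judgment the resolver LLM makes.

```typescript
// Hypothetical types mirroring the fields described above.
type MemoryKind = "fact" | "preference" | "event" | "entity";

interface MemoryDraft {
  kind: MemoryKind;
  body: string;
  subject?: string; // optional, per the description above
  tierHint: string; // e.g. "core" or "archival"; the full tier list is not shown here
}

// The five outcomes the resolver can choose between.
type ResolverAction = "insert" | "update" | "promote" | "demote" | "forget";

interface StoredMemory {
  id: number;
  kind: MemoryKind;
  body: string;
  subject?: string;
}

interface Resolution {
  action: ResolverAction;
  draft: MemoryDraft;
  targetId?: number; // set when the action applies to an existing memory
}

// Toy stand-in for the resolver LLM (assumption): a draft that shares
// kind and subject with an existing memory refines it; anything else is new.
function resolveDraft(draft: MemoryDraft, existing: StoredMemory[]): Resolution {
  const match = existing.find(
    (m) =>
      draft.subject !== undefined &&
      m.kind === draft.kind &&
      m.subject === draft.subject,
  );
  return match
    ? { action: "update", draft, targetId: match.id }
    : { action: "insert", draft };
}
```

The real resolver also decides promote, demote, and forget; this sketch covers only the new-versus-refinement split.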
Salience and tier movement
Each memory carries a salience score. The score increases when a memory is retrieved and used in subsequent turns, and decays exponentially when it’s not. The half-life is memory.decayHalflifeDays (default 14 days).
- High salience + frequent hits → promoted toward core
- Low salience over time → demoted toward archival, eventually pruned
Pinned memories (see Pinning & forgetting) are exempt from decay.
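Exponential decay with a half-life means the score halves for every half-life that passes without a hit. A minimal sketch of that arithmetic follows, assuming decay is computed from the days since the memory was last used. The function name decayedSalience and its parameters are hypothetical; only the 14-day default comes from memory.decayHalflifeDays.

```typescript
// Default taken from memory.decayHalflifeDays as described above.
const DEFAULT_HALF_LIFE_DAYS = 14;

// Hypothetical helper: returns the salience after `daysSinceLastUse` idle days.
// Pinned memories are exempt from decay, so their score is returned unchanged.
function decayedSalience(
  salience: number,
  daysSinceLastUse: number,
  halfLifeDays: number = DEFAULT_HALF_LIFE_DAYS,
  pinned: boolean = false,
): number {
  if (pinned) return salience;
  // salience * (1/2)^(t / halfLife)
  return salience * Math.pow(0.5, daysSinceLastUse / halfLifeDays);
}
```

With the default half-life, a memory that goes unused for 14 days keeps half its score, and one unused for 28 days keeps a quarter.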
Embeddings
Embeddings run in a worker thread using transformers.js, so inference needs no network calls and no API key. The model defaults to Xenova/bge-small-en-v1.5 (384 dims). The first run downloads the model weights to your Electron user-data cache; subsequent runs are local-only.
Where it lives
All memory state is in ~/.ptah/ptah.db:
- memories: one row per memory (kind, body, tier, salience, pinned, timestamps)
- memory_chunks: text shards used for retrieval
- memory_chunks_fts: FTS5 BM25 index
- memory_chunks_vec: sqlite-vec embedding index
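Hybrid search implies that a keyword ranking (from the FTS5 index) and a vector ranking (from the sqlite-vec index) are merged into one result list. This page does not say how Ptah combines them. The sketch below uses reciprocal rank fusion, one common technique, purely as an illustration. The function name, the chunk-id inputs, and the constant k = 60 are all assumptions.

```typescript
// Reciprocal rank fusion (illustrative; not confirmed as Ptah's method).
// Each input is a list of chunk ids ordered best-first by one index.
function reciprocalRankFusion(
  bm25Ranked: number[],
  vectorRanked: number[],
  k: number = 60,
): number[] {
  const scores = new Map<number, number>();
  for (const ranked of [bm25Ranked, vectorRanked]) {
    ranked.forEach((chunkId, index) => {
      const rank = index + 1; // ranks are 1-based
      scores.set(chunkId, (scores.get(chunkId) ?? 0) + 1 / (k + rank));
    });
  }
  // Highest fused score first.
  return Array.from(scores.entries())
    .sort((a, b) => b[1] - a[1])
    .map(([chunkId]) => chunkId);
}
```

A chunk that ranks well in both lists rises to the top, while a chunk found by only one index still appears further down.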
The code-symbol index
Alongside curated memory, Ptah keeps a separate code-symbol index for the current workspace. This is distinct from the curator pipeline above:
- Memory chunks are LLM-extracted, scored, and tiered — they capture decisions and knowledge from your sessions.
- Code symbols come straight from indexing your source tree — they capture structure (functions, classes, methods) so the agent can navigate and recall where code lives.
Indexing runs on your machine; nothing is uploaded. When the workspace changes, you can re-index from the Memory tab. Each indexed symbol records its name, kind (e.g. function, class, method), the file it lives in, and a token count.
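The four recorded fields map naturally onto a small record type. The following is a sketch only: CodeSymbol, its field names, and the findSymbols helper are hypothetical, and kind is left as a plain string because the list of kinds above is given only as examples.

```typescript
// Hypothetical record for one indexed symbol, following the fields listed above.
interface CodeSymbol {
  name: string;
  kind: string; // e.g. "function", "class", "method"
  file: string; // path of the file the symbol lives in
  tokenCount: number;
}

// Hypothetical lookup: find symbols by exact name, optionally narrowed by kind.
function findSymbols(
  index: CodeSymbol[],
  name: string,
  kind?: string,
): CodeSymbol[] {
  return index.filter(
    (symbol) =>
      symbol.name === name && (kind === undefined || symbol.kind === kind),
  );
}
```

A lookup like this answers "where does this code live" from structure alone, without going through the curated memory tiers.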