This repository has been archived on 2026-05-05. You can view files and clone it. You cannot open issues or pull requests or push a commit.

Engram

A local-first memory substrate for accumulating intelligence.

An engram is the physical trace of a memory in the brain — the actual encoded substrate, not an abstraction above it. That's what this is.


Why existing databases are wrong for this use case

Relational databases store rows and retrieve them by predicate. Key-value stores retrieve by exact key. Vector databases retrieve by geometric proximity. All of them share the same fundamental model: you write data in, then you query it back out. Storage and retrieval are separate systems.

The brain doesn't work this way.

When you remember something, you don't query your hippocampus. You activate a memory trace and the pattern propagates. Long-term potentiation — the strengthening of synaptic connections through co-activation — is simultaneously the storage mechanism and the retrieval mechanism. The structure that holds the memory is the same structure that surfaces it.

No existing database models this. Engram does.


The Spreading Activation Model

Engram retrieval works through spreading activation:

  1. Seeds — you name one or more nodes you know are relevant (e.g. the current task, recent context, a concept you're reasoning about)

  2. Query embedding — you provide a semantic vector representing the direction of your current thought

  3. Propagation — activation flows outward from seeds through weighted edges. At each hop, strength attenuates multiplicatively:

    strength = parent_strength × edge_weight × target_salience × cosine_sim(query, target)
    
  4. Pruning — paths weaker than a threshold are cut (the attention filter)

  5. Return — the top-N nodes by activation strength

This is not a query. It is a pattern completion. The system surfaces what is most associatively relevant to the current context, weighted by how strongly those things have been reinforced over time.
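
The five steps above can be sketched in a few dozen lines. This is a minimal, in-memory illustration and not Engram's actual implementation: `SketchNode`, `spread`, the `usize` node indices, the seed strength of 1.0, and the choice to keep only the strongest path to each node are all assumptions made for the example.

```rust
use std::collections::HashMap;

/// Hypothetical in-memory node for this sketch (not Engram's `Node` type).
struct SketchNode {
    salience: f32,
    embedding: Vec<f32>,
}

/// Cosine similarity; returns 0.0 if either vector has zero norm.
fn cosine_sim(a: &[f32], b: &[f32]) -> f32 {
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let na = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let nb = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    if na == 0.0 || nb == 0.0 { 0.0 } else { dot / (na * nb) }
}

/// Spread activation outward from `seeds`, each starting at strength 1.0
/// (an assumption). `edges[&i]` lists `(target, weight)` pairs leaving node `i`.
/// Returns up to `limit` `(node index, strength)` pairs, strongest first.
fn spread(
    nodes: &[SketchNode],
    edges: &HashMap<usize, Vec<(usize, f32)>>,
    seeds: &[usize],
    query: &[f32],
    max_hops: u8,
    threshold: f32,
    limit: usize,
) -> Vec<(usize, f32)> {
    let mut best: HashMap<usize, f32> = HashMap::new();
    let mut frontier: Vec<(usize, f32)> = seeds.iter().map(|&s| (s, 1.0)).collect();
    for &(s, strength) in &frontier {
        best.insert(s, strength);
    }
    for _ in 0..max_hops {
        let mut next = Vec::new();
        for &(from, parent_strength) in &frontier {
            if let Some(outgoing) = edges.get(&from) {
                for &(to, weight) in outgoing {
                    let target = &nodes[to];
                    // Multiplicative attenuation, as in the formula above.
                    let strength = parent_strength
                        * weight
                        * target.salience
                        * cosine_sim(query, &target.embedding);
                    // Pruning: paths weaker than the threshold are cut.
                    if strength < threshold {
                        continue;
                    }
                    // Assumption: keep only the strongest path to each node.
                    let entry = best.entry(to).or_insert(0.0);
                    if strength > *entry {
                        *entry = strength;
                        next.push((to, strength));
                    }
                }
            }
        }
        frontier = next;
    }
    // Return: rank by activation strength and keep the top `limit`.
    let mut ranked: Vec<(usize, f32)> = best.into_iter().collect();
    ranked.sort_by(|a, b| b.1.total_cmp(&a.1));
    ranked.truncate(limit);
    ranked
}
```

With a chain 0 → 1 → 2, edge weights 0.8 and 0.5, target saliences of 0.5, and embeddings identical to the query, node 1 receives 1.0 × 0.8 × 0.5 × 1.0 = 0.4 and node 2 receives 0.4 × 0.5 × 0.5 × 1.0 = 0.1. Raising the threshold to 0.2 prunes node 2.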


The Four Memory Tiers

| Tier       | Analogy                    | Contents                                     |
|------------|----------------------------|----------------------------------------------|
| Working    | Prefrontal working memory  | K most recently activated nodes (hot, fast)  |
| Episodic   | Hippocampus                | Time-ordered events and experiences          |
| Semantic   | Neocortex                  | Concept graph: long-term structural knowledge |
| Procedural | Cerebellum / basal ganglia | Patterns, workflows, habits                  |

Nodes migrate between tiers based on salience decay and reinforcement. A frequently activated semantic node stays semantic. A rarely-touched episodic memory decays toward procedural background.


Salience — Forgetting as Adaptation

Salience is not stored permanently. It decays:

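// `now_ms()` is assumed to return the current Unix time in milliseconds.
fn compute_salience(importance: f32, last_activated_ms: i64, activation_count: u64) -> f32 {
    // Days elapsed since the node was last activated.
    let days_since = (now_ms() - last_activated_ms) as f32 / 86_400_000.0;
    // importance × recency (hyperbolic decay) × frequency (log-compressed count).
    importance * (1.0 / (1.0 + days_since)) * (activation_count as f32 + 1.0).ln()
}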

Three signals:

  • Importance (0.0–1.0): set at creation, stable
  • Recency: decays toward zero as days pass without activation
  • Frequency: log-compressed count of activations

Forgetting in Engram is not a bug. It is adaptive pruning. Memories that are never activated again become less likely to surface during retrieval. They are not deleted — they remain in storage — but they stop competing for attention. This is exactly how biological memory works, and why it is adaptive rather than pathological.
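
To make the decay concrete, here is a small sketch of the same formula with the current time passed in as a parameter. The name `salience_at` and the explicit `now_ms` argument are assumptions made so the example is deterministic; they are not part of Engram's API.

```rust
/// Same arithmetic as `compute_salience`, but with the clock injected
/// (hypothetical helper, used only for illustration).
fn salience_at(
    importance: f32,
    last_activated_ms: i64,
    activation_count: u64,
    now_ms: i64,
) -> f32 {
    // Days elapsed since the last activation.
    let days_since = (now_ms - last_activated_ms) as f32 / 86_400_000.0;
    // Recency term: 1.0 when just activated, 0.5 after one day, 0.1 after nine.
    let recency = 1.0 / (1.0 + days_since);
    // Frequency term: natural log of (count + 1), so a count of 0 gives 0.0.
    let frequency = (activation_count as f32 + 1.0).ln();
    importance * recency * frequency
}
```

For a node with importance 0.9 and 9 activations, the value is 0.9 × ln(10), roughly 2.07, at the moment of activation, and one tenth of that nine days later. Two properties of the formula as written follow directly: the result is not bounded to the 0.0–1.0 range, and a node with an activation count of 0 has salience 0.0 regardless of its importance.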


Quick Start

use engram_core::{EngramDb, Node, Edge, NodeType, MemoryTier, RelationType};
use std::path::Path;

// Open or create a database
let db = EngramDb::open(Path::new("/var/lib/my-agent/memory"))?;

// Create a node with a semantic embedding
let node = Node::new(
    NodeType::Concept,
    vec![0.9, 0.1, 0.3, 0.7, 0.8, 0.2],   // embedding from your LLM
    b"Spreading activation surfaces relevant memories by pattern completion".to_vec(),
    MemoryTier::Semantic,
    0.9,   // importance
);
let id = db.put_node(node)?;

// Link it to related concepts
let related = db.put_node(Node::new(
    NodeType::Concept,
    vec![0.8, 0.2, 0.4, 0.6, 0.7, 0.3],
    b"Long-term potentiation: co-activation strengthens synaptic weight".to_vec(),
    MemoryTier::Semantic,
    0.85,
))?;
db.put_edge(Edge::new(id, related, RelationType::Causes, 0.9))?;

// Retrieve by spreading activation
let results = db.activate(
    &[id],                                     // seeds
    &[0.85, 0.15, 0.35, 0.65, 0.75, 0.25],   // query embedding
    3,                                         // max hops
    10,                                        // top-N results
)?;

for r in results {
    println!(
        "strength={:.4} hops={} content={}",
        r.activation_strength,
        r.hops,
        String::from_utf8_lossy(&r.node.content)
    );
}

Project Structure

engram/
  crates/
    engram-core/        # The memory engine — storage, graph, activation, salience
    engram-ffi/         # C FFI stubs for cross-language bindings
  bindings/
    kotlin/             # Android / JVM binding notes
    typescript/         # WASM / Node binding notes
    go/                 # CGo binding notes
  examples/
    basic.rs            # Full walkthrough: insert, activate, search, decay

Public API

impl EngramDb {
    fn open(path: &Path) -> EngramResult<Self>;
    fn put_node(&self, node: Node) -> EngramResult<Uuid>;
    fn get_node(&self, id: Uuid) -> EngramResult<Option<Node>>;
    fn put_edge(&self, edge: Edge) -> EngramResult<()>;
    fn get_edges_from(&self, from_id: Uuid) -> EngramResult<Vec<Edge>>;
    fn get_edges_to(&self, to_id: Uuid) -> EngramResult<Vec<Edge>>;
    fn search_embedding(&self, embedding: &[f32], limit: usize) -> EngramResult<Vec<ScoredNode>>;
    fn activate(&self, seeds: &[Uuid], query_embedding: &[f32], max_depth: u8, limit: usize) -> EngramResult<Vec<ActivatedNode>>;
    fn traverse(&self, from: Uuid, relation: Option<RelationType>, max_depth: u8) -> EngramResult<Vec<Node>>;
    fn touch(&self, id: Uuid) -> EngramResult<()>;
    fn decay(&self, factor: f32) -> EngramResult<usize>;
    fn node_count(&self) -> EngramResult<usize>;
    fn edge_count(&self) -> EngramResult<usize>;
}

Dependencies

  • sled — embedded persistent B-tree (no daemon, no network, local-first)
  • bincode — compact binary serialization
  • uuid — stable node identity
  • serde — derive support
  • thiserror / anyhow — error handling

Design Decisions

Why sled? Local-first. No daemon. Transactional. Fast enough for the node counts Engram targets (fewer than 1M nodes). If an HNSW index becomes necessary, it will layer on top of sled, not replace it.

Why flat cosine scan? Correct and simple. The graph structure itself is the primary retrieval mechanism. Vector search is a secondary signal. HNSW adds complexity and a compile dependency that isn't justified until retrieval quality at scale demands it.
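
A flat scan of this kind fits in a few lines, which is part of the argument for it. The sketch below is illustrative only: `flat_cosine_scan`, the `u32` ids, and the slice-of-tuples store are assumptions, not the internals of `search_embedding`.

```rust
/// Cosine similarity; returns 0.0 if either vector has zero norm.
fn cosine_sim(a: &[f32], b: &[f32]) -> f32 {
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let na = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let nb = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    if na == 0.0 || nb == 0.0 { 0.0 } else { dot / (na * nb) }
}

/// Brute-force search: score every stored embedding against the query,
/// sort by similarity (highest first), and keep the top `limit`.
fn flat_cosine_scan(
    store: &[(u32, Vec<f32>)],
    query: &[f32],
    limit: usize,
) -> Vec<(u32, f32)> {
    let mut scored: Vec<(u32, f32)> = store
        .iter()
        .map(|(id, embedding)| (*id, cosine_sim(query, embedding)))
        .collect();
    scored.sort_by(|a, b| b.1.total_cmp(&a.1));
    scored.truncate(limit);
    scored
}
```

The cost is one pass over every stored embedding per query, plus a sort. That is linear in the number of nodes, which is the trade-off this design accepts in exchange for having no index to build or maintain.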

Why multiplicative activation? Because memory is conjunctive. A path requires all of its links to be strong to carry signal. Addition would allow many weak associations to accumulate into false relevance. Multiplication enforces that every factor matters.
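
A short arithmetic sketch of that argument, under one reading of "addition" (summing link strengths along a chain). The helper names `chain_product` and `chain_sum` are hypothetical and exist only for this comparison.

```rust
/// Multiplicative rule: a chain is only as strong as the product of its links.
fn chain_product(links: &[f32]) -> f32 {
    links.iter().product()
}

/// Additive alternative (the one argued against): link strengths accumulate.
fn chain_sum(links: &[f32]) -> f32 {
    links.iter().sum()
}
```

Compare a single strong link, `[0.9]`, with a chain of ten weak links, `[0.2; 10]`. Under addition the weak chain scores about 2.0 and outranks the strong link at 0.9. Under multiplication the weak chain scores 0.2^10, about 0.0000001, and falls well below any reasonable pruning threshold, while the strong link keeps its 0.9.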

Why salience decay? Because not everything that was once important remains important. Adaptive forgetting is not failure — it is the mechanism that keeps attention on what's current. A memory system that never forgets is one that can never focus.
