Autoassociative cognitive memory for LLMs and AI agents.
LLMs have no real memory. Knowledge is either baked into weights, held in the ephemeral context window, or stored in files that require the agent to explicitly decide to read and write. CMM changes this — it gives LLMs automatic, cognitive-like memory that passively monitors conversations, encodes them into compressed gist representations, and surfaces relevant memories when associative cues appear. No one has to decide to "look something up."
CMM improves LLM response accuracy by +67% keyword / +78% LLM-judge on contamination-free novel-fact benchmarks with Claude Opus 4.6.
The improvement holds across model scales — both small (Mistral 7B) and frontier (Claude Opus 4.6) models benefit substantially.
CMM is the first LLM memory system to implement spreading activation, entity linking, priming, and metamemory in combination.
- Spreading activation — dual-path: FAISS embedding neighbors + spaCy entity linking. Discovers cross-domain connections that flat retrieval misses.
- Priming — recently activated memories boost related memories for subsequent turns.
- Metamemory — confidence levels (HIGH/MODERATE/LOW/NONE) and "tip of the tongue" partial-match hints.
- Grace-period temporal decay — no decay for 2 weeks, then frequency-dependent exponential decay. Frequently accessed memories lock permanently.
- Importance scoring — corrections and instructions get 2x importance; routine exchanges get 0.5x. Importance scoring turned a 0.12-similarity allergy memory into a life-saving recall.
- Emotional valence — each memory tagged with valence, arousal, and emotion label for empathetic recall.
- Episodic → semantic consolidation — clusters similar episodic memories into general knowledge over time.
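The grace-period decay curve can be sketched as follows. The 14-day window, lock threshold, and decay rate here are illustrative constants, not CMM's actual parameters:

```python
import math

GRACE_DAYS = 14      # no decay inside this window (illustrative)
LOCK_ACCESSES = 10   # frequently accessed memories stop decaying (illustrative)
BASE_RATE = 0.05     # per-day decay rate, softened by rehearsal (illustrative)

def decay(age_days: float, access_count: int) -> float:
    """Return a retention multiplier in (0, 1]."""
    if age_days <= GRACE_DAYS or access_count >= LOCK_ACCESSES:
        return 1.0  # grace period, or permanently locked by rehearsal
    # Frequency-dependent exponential decay: more accesses -> slower decay
    rate = BASE_RATE / (1 + access_count)
    return math.exp(-rate * (age_days - GRACE_DAYS))

print(decay(7, 0))    # → 1.0 (inside the grace period)
print(decay(30, 0))   # decays
print(decay(30, 12))  # → 1.0 (locked by frequent access)
```

The key property is that rehearsal flattens the curve: the same 30-day-old memory retains more the more often it has been recalled.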
In a test with 1,200 city case files and 5 hidden investigation chains, entity-linked spreading activation found 2.4x more connections than flat retrieval.
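The dual-path idea behind that result can be illustrated with a toy graph: seed memories activate neighbors through embedding similarity *and* through shared named entities, which is what lets a chain hop across domains. All data below is invented for illustration; this is not CMM's implementation:

```python
# Toy dual-path spreading activation (embedding neighbors + entity links).
memories = {
    "m1": {"text": "Warehouse 7 inspection flags chemical drums", "entities": {"Warehouse 7"}},
    "m2": {"text": "Hospital report: respiratory cases near Dockside", "entities": {"Dockside"}},
    "m3": {"text": "Shipping records route Warehouse 7 cargo via Dockside", "entities": {"Warehouse 7", "Dockside"}},
    "m4": {"text": "Environmental data shows solvent traces at Dockside", "entities": {"Dockside"}},
}
embedding_neighbors = {"m1": ["m3"], "m3": ["m1"], "m2": ["m4"], "m4": ["m2"]}

def spread(seed, hops=2):
    active, frontier = {seed}, {seed}
    for _ in range(hops):
        nxt = set()
        for m in frontier:
            nxt.update(embedding_neighbors.get(m, []))       # path 1: embedding neighbors
            for other, rec in memories.items():              # path 2: entity links
                if other != m and memories[m]["entities"] & rec["entities"]:
                    nxt.add(other)
        frontier = nxt - active
        active |= frontier
    return active

print(sorted(spread("m1")))  # → ['m1', 'm2', 'm3', 'm4']
```

Flat retrieval from the warehouse seed would stop at m3; the shared "Dockside" entity is what pulls in the hospital report and environmental data.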
Four narrative demos showcase the cognitive features with Claude Opus 4.6:
| Demo | What it shows | Run it |
|---|---|---|
| The Handoff | 3 agents disagree on an API rate limit. System surfaces contradiction with agent attribution. | python -m demos.handoff --backend anthropic |
| Tip of the Tongue | Metamemory detects a partial match, surfaces a hint, priming enables full recall on follow-up. | python -m demos.tip_of_tongue --backend anthropic |
| Six Weeks with Alex | 6-week personal assistant: allergy recall, emotional empathy, correction handling, consolidation. | python -m demos.six_weeks --backend anthropic |
| The Investigator | 1,200 case files. Spreading activation chains warehouse inspection → hospital report → shipping records → environmental data. | python -m demos.investigator --backend anthropic |
```shell
# From PyPI
pip install cognitive-memory-model

# With optional integrations (quoted so the brackets survive zsh)
pip install "cognitive-memory-model[anthropic]"  # Anthropic API support
pip install "cognitive-memory-model[openai]"     # OpenAI API support
pip install "cognitive-memory-model[mcp]"        # MCP server support
pip install "cognitive-memory-model[all]"        # Everything

# Download the spaCy model for entity linking
python -m spacy download en_core_web_sm
```

From source (for development):
```shell
git clone https://github.com/SyntheticCognitionLabs/cognitive-memory-model.git
cd cognitive-memory-model
python3 -m venv .venv
source .venv/bin/activate
pip install torch --index-url https://download.pytorch.org/whl/cu126  # CUDA 12.6
pip install -e ".[dev]"
python -m spacy download en_core_web_sm
```

```python
from cmm.pipeline.conversation import CognitiveMemoryPipeline

pipeline = CognitiveMemoryPipeline()

# Ingest conversation turns — both sides are captured
pipeline.ingest("user", "I'm allergic to peanuts. I carry an EpiPen.")
pipeline.ingest("assistant", "Noted, severe peanut allergy.")
pipeline.ingest("user", "My project deadline is April 15th.")

# Later... the memory system automatically recalls relevant information
results = pipeline.recall("I'm ordering food for the team lunch")
print(pipeline.format_recalled(results))
# → [Recalled from memory...] I'm allergic to peanuts. I carry an EpiPen.

# Save to disk — memories persist across restarts
pipeline.save("./my_memory")

# Load and continue tomorrow
pipeline = CognitiveMemoryPipeline.load("./my_memory")
```

Every integration supports true autoassociative memory except MCP tools.
| Integration | How | Autoassociative? | Language |
|---|---|---|---|
| HTTP Memory Server | REST API on localhost or network | Yes | Any |
| Claude Code Hooks | UserPromptSubmit + Stop hooks | Yes | Any |
| Python Middleware | Wraps OpenAI/Anthropic API calls | Yes | Python |
| MCP Server | memory_recall/store tools | Semi (tool-based) | Any MCP client |
| Direct Library | CognitiveMemoryPipeline API | Yes | Python |
Autoassociative means the memory system passively monitors all conversation turns and automatically surfaces relevant memories. No one decides to "look something up." Every path except MCP achieves this.
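The distinction can be sketched with a toy wrapper in which every turn is both stored and used as a recall cue, with no explicit tool call. Keyword overlap stands in for embedding similarity here; none of this is CMM's code:

```python
def _words(s):
    """Naive tokenizer: lowercase, strip basic punctuation."""
    return set(s.lower().replace(",", " ").replace(".", " ").split())

class ToyAutoassociativeMemory:
    """Passively ingests every turn; every turn also acts as a recall cue."""
    def __init__(self):
        self.store = []

    def on_turn(self, role, text):
        # Recall is automatic: no one decides to "look something up"
        recalled = [m for m in self.store if _words(text) & _words(m)]
        self.store.append(text)
        return recalled

mem = ToyAutoassociativeMemory()
mem.on_turn("user", "I'm allergic to peanuts.")
mem.on_turn("assistant", "Noted, severe peanut allergy.")
print(mem.on_turn("user", "Ordering peanuts for the team lunch"))
# → ["I'm allergic to peanuts."]
```

A tool-based (semi-autoassociative) path, like MCP, would instead require the agent to call a memory_recall tool before the lunch order for the allergy to surface.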
```python
from integrations.middleware import MemoryMiddleware

mw = MemoryMiddleware(api_type="anthropic")
response = mw.chat("I'm allergic to peanuts.")
response = mw.chat("Order food for the team lunch.")
# ^ Automatically recalls the peanut allergy
```

```shell
# Start the memory server
python -m integrations.claude-code.memory_server --data-dir ./memory

# Add hooks to Claude Code settings — see integrations/claude-code/README.md
# That's it. Memory happens automatically on every message.
```

Dozens of agents can share a single memory server. Each agent tags memories with its agent_id. Nightly consolidation merges, deduplicates, and detects contradictions.

```shell
# Central server (accessible to all agents)
python -m integrations.claude-code.memory_server \
    --host 0.0.0.0 --data-dir /shared/memory --auto-save 300
```

See integrations/README.md for full setup instructions, CRUD endpoints, and production deployment guidance.
```
Conversation stream (user ↔ agent turns, reasoning steps)
         │
         ▼
┌─────────────────┐
│  Gist Encoder   │ LLM or small model: turn → compressed summary + tags
└────────┬────────┘
         │
         ▼
┌─────────────────┐
│ Embedding Model │ gist text → 768D dense vector (all-mpnet-base-v2)
└────────┬────────┘
         │
         ▼
┌─────────────────────────────────┐
│   FAISS-backed Memory Store     │ O(1) similarity search
│   + Entity Index (spaCy NER)    │ Named entity → memory linkage
└────────┬────────────────────────┘
         │
         ▼ (on each new turn)
┌─────────────────────────────────┐
│  Cognitive Retrieval Pipeline   │
│  1. FAISS similarity search     │
│  2. Temporal decay + rehearsal  │
│  3. Importance weighting        │
│  4. Priming boost               │
│  5. Spreading activation        │
│     (embedding + entity links)  │
│  6. Working memory merge        │
│  7. Metamemory confidence       │
└────────┬────────────────────────┘
         │
         ▼
Inject recalled memories into LLM context
(clearly marked as "from memory, not user input")
```
```
final_score = similarity × decay(age, access_count) × importance × priming_boost
```
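Assuming each factor is a multiplier around 1.0 (the actual ranges and constants are not specified here), the scoring line composes as:

```python
import math

def final_score(similarity, age_days, access_count,
                importance=1.0, priming_boost=1.0,
                grace_days=14, base_rate=0.05):
    """Compose the retrieval score; constants are illustrative, not CMM's."""
    if age_days <= grace_days:
        decay = 1.0  # grace period: no decay
    else:
        # frequency-dependent exponential decay after the grace period
        decay = math.exp(-(base_rate / (1 + access_count)) * (age_days - grace_days))
    return similarity * decay * importance * priming_boost

# A weak 0.12-similarity match can still rank high with 2x importance
# and a priming boost, as in the allergy-recall example above.
print(final_score(0.12, age_days=3, access_count=1, importance=2.0, priming_boost=1.5))
```

Because the factors multiply, importance and priming rescue low-similarity matches, while decay alone can never zero out a recently rehearsed memory.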
```shell
pytest                                            # all tests
pytest --ignore=tests/test_ollama_encoder.py \
       --ignore=tests/test_phase4_integration.py  # fast (no Ollama needed)
pytest -v -k "zebra"                              # the zebra test
```

MIT
- CLAUDE.md — Architecture and development guide
- ROADMAP.md — Future directions and contribution areas
- docs/DEVELOPMENT_PLAN.md — Phased implementation roadmap (all complete)
- docs/BENCHMARK_PLAN.md — Evaluation methodology and results
- integrations/README.md — Integration guide and production deployment
- docs/FAISS-SDM.md — FAISS IVF for O(1) content-addressable memory




