Subject-based weight routing for LLMs (27 days before DeepSeek Engram)
I run LLM inference on an IBM POWER8 S824 with 576 GB of RAM – a $700 eBay server from 2014. In December 2025 I built "RAM Coffers" – banking model weights by subject domain, with hot caching and resonance routing.
On January 12, 2026, DeepSeek published "Engram" (arXiv:2601.07372), describing the same core idea: route queries to cached weight banks based on subject matter.
The concepts are similar, and I published first. YouTube video from December 17, 2025: https://youtu.be/T_o39s7r0iE
The terminal in that video shows "RAM Coffers: ON | L2/L3 Resident: ON" – 26 days before their paper.
Core shared concept: Query comes in → classify subject → route to relevant weight bank → hot cache keeps it fast
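
For concreteness, here's a minimal C sketch of that loop. Everything in it (the Bank struct, the keyword classifier, the coffer table) is hypothetical illustration, not the actual repo code:

    /* Hypothetical sketch of the routing loop: classify subject,
     * pick a weight bank ("coffer"), keep it hot. Not repo code. */
    #include <stdio.h>
    #include <string.h>

    #define N_BANKS 4

    typedef struct {
        const char *subject;  /* domain this coffer serves        */
        void       *weights;  /* RAM-resident weight bank         */
        int         hot;      /* 1 once prefetched into L2/L3     */
    } Bank;

    static Bank coffers[N_BANKS] = {
        {"code", NULL, 0}, {"math", NULL, 0},
        {"history", NULL, 0}, {"general", NULL, 0},
    };

    /* Toy classifier: keyword match stands in for the real router. */
    static Bank *route(const char *query) {
        for (int i = 0; i < N_BANKS - 1; i++)
            if (strstr(query, coffers[i].subject))
                return &coffers[i];
        return &coffers[N_BANKS - 1];   /* fall back to "general" */
    }

    int main(void) {
        Bank *b = route("walk me through this math proof");
        if (!b->hot)
            b->hot = 1;                 /* warm the bank on first use */
        printf("routed to coffer: %s\n", b->subject);
        return 0;
    }
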
What I added beyond the core:
• NUMA topology – weights placed on specific memory nodes; Engram doesn't address hardware topology (sketch after this list)
• Neuromorphic mapping – brain regions mapped to NUMA nodes
• Tetranary confidence – 4-state routing logic (sketch after this list)
• Vec_perm collapse – single-cycle attention on POWER8 (sketch after this list)
• PowerLISP – LLMs that actually remember
• L2/L3 prefetch – 147 t/s vs 17 t/s stock (8.6x; sketch after this list)
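
A sketch of the NUMA placement idea, assuming libnuma; the 64 MiB bank size and one-bank-per-node layout are made up for the demo:

    /* Pin one weight bank per memory node with libnuma (link -lnuma). */
    #include <numa.h>
    #include <stdio.h>

    int main(void) {
        if (numa_available() < 0) {
            fprintf(stderr, "no NUMA support on this host\n");
            return 1;
        }
        int nodes = numa_max_node() + 1;
        size_t bank_bytes = 64UL << 20;   /* illustrative size only */

        for (int node = 0; node < nodes; node++) {
            /* allocate the bank's pages directly on this node */
            void *bank = numa_alloc_onnode(bank_bytes, node);
            if (!bank) { perror("numa_alloc_onnode"); return 1; }
            printf("bank %d resident on NUMA node %d\n", node, node);
            numa_free(bank, bank_bytes);
        }
        return 0;
    }
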
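"Tetranary confidence" presumably means a 4-state routing decision rather than a binary hit/miss. A guess at what that logic could look like, with invented state names and thresholds:

    /* Hypothetical 4-state ("tetranary") routing confidence.
     * States and thresholds are invented for illustration. */
    typedef enum {
        ROUTE_CERTAIN,    /* one bank, no fallback           */
        ROUTE_LIKELY,     /* primary bank, keep general warm */
        ROUTE_AMBIGUOUS,  /* blend the top two banks         */
        ROUTE_UNKNOWN     /* route to the general bank       */
    } RouteState;

    static RouteState confidence_state(float score) {
        if (score > 0.90f) return ROUTE_CERTAIN;
        if (score > 0.60f) return ROUTE_LIKELY;
        if (score > 0.30f) return ROUTE_AMBIGUOUS;
        return ROUTE_UNKNOWN;
    }
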
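vec_perm is a real AltiVec intrinsic: it gathers 16 arbitrary bytes from two source vectors in a single instruction. How the author maps attention onto it is their claim; this just demonstrates the primitive (build with gcc -maltivec on a POWER host):

    /* vec_perm selects 16 bytes from a:b by index (0-15 = a, 16-31 = b). */
    #include <altivec.h>
    #include <stdio.h>

    int main(void) {
        vector unsigned char a =
            {0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15};
        vector unsigned char b =
            {16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31};
        vector unsigned char pat =   /* interleave the ends of a and b */
            {31,0,30,1,29,2,28,3,27,4,26,5,25,6,24,7};
        vector unsigned char r = vec_perm(a, b, pat);
        for (int i = 0; i < 16; i++)
            printf("%d ", r[i]);
        printf("\n");
        return 0;
    }
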
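And the prefetch idea as one helper; __builtin_prefetch lowers to dcbt on POWER. The 128-byte POWER8 cache line is real; attributing the 8.6x speedup to this alone is the author's claim, and warm_bank is a hypothetical name:

    /* Walk a bank touching one POWER8 cache line (128 B = 32 floats)
     * per step so its pages sit L2/L3-resident before decode. */
    #include <stddef.h>

    static void warm_bank(const float *w, size_t n_floats) {
        for (size_t i = 0; i < n_floats; i += 32)
            __builtin_prefetch(&w[i], 0 /* read */, 3 /* keep resident */);
    }
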
DOIs:
• RAM Coffers (Dec 16): doi.org/10.6084/m9.figshare.31093429
• Neuromorphic: doi.org/10.5281/zenodo.18321905
• PowerLISP: doi.org/10.5281/zenodo.18322052
GitHub: github.com/Scottcjn/ram-coffers
You have AI psychosis.
https://www.psychologytoday.com/us/blog/urban-survival/20250...