Storyworlds as Sparse Autoencoders
Interactive narratives encode compressed, interpretable representations of complex systems. A well-designed storyworld forces coherent structure onto agent behavior, goal dynamics, and world-state evolution — precisely the features we want to extract and understand in AI systems.
This research program tests whether models trained on richly-structured narrative data produce more interpretable internal representations than models trained on unstructured corpora, and whether SAE features extracted from such models map more cleanly to human-understandable concepts.
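To ground the SAE side of the pipeline, here is a minimal sketch of a one-layer sparse autoencoder fit to model activations with plain gradient descent. All dimensions, penalties, and the training loop are illustrative placeholders, not the project's actual configuration:

```python
import numpy as np

rng = np.random.default_rng(0)

def train_sae(acts, n_features=32, l1=1e-3, lr=1e-2, steps=400):
    """Fit a one-layer sparse autoencoder to activation vectors.

    acts: (n_samples, d_model) array of model activations.
    Minimizes mean squared reconstruction error plus an L1 penalty
    on the ReLU feature activations, via full-batch gradient descent.
    Returns (W_enc, b_enc, W_dec): features are f = relu(acts @ W_enc + b_enc),
    reconstruction is f @ W_dec.
    """
    n, d = acts.shape
    W_enc = rng.normal(0.0, 0.1, (d, n_features))
    b_enc = np.zeros(n_features)
    W_dec = rng.normal(0.0, 0.1, (n_features, d))
    for _ in range(steps):
        pre = acts @ W_enc + b_enc
        f = np.maximum(pre, 0.0)              # sparse feature activations
        recon = f @ W_dec
        err = recon - acts                    # reconstruction residual
        mask = (pre > 0).astype(acts.dtype)   # ReLU gradient gate
        g_f = (err @ W_dec.T + l1) * mask     # grad w.r.t. pre-activations
        W_dec -= lr * (f.T @ err) / n
        W_enc -= lr * (acts.T @ g_f) / n
        b_enc -= lr * g_f.mean(axis=0)
    return W_enc, b_enc, W_dec
```

Each row of `W_dec` is then a candidate feature direction; the research question is whether, for the narrative-trained model, those directions align with annotated story structures more cleanly than for the control.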
Execution Timeline
Deliverables
Trained Models
An Intellect-3 variant trained on the narrative corpus, plus a control model trained on an unstructured corpus. Open weights released for reproducibility.
SAE Feature Analysis
Trained sparse autoencoders with annotated feature dictionaries mapping features to narrative structures.
Research Paper
Peer-reviewable publication documenting methodology, results, and implications for interpretability.
Open Source Tools
Training scripts, data pipeline, and SAE extraction code for community replication.
Narrative Dataset
A curated 40M-token corpus with structural annotations, released for future research.
Interactive Demo
Web interface for exploring SAE features and their narrative correspondences.
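To make the annotated feature dictionaries concrete: one plausible scoring step is to rank SAE features by how selectively they activate on tokens carrying a given narrative annotation. The function and variable names below are hypothetical, not part of the released tooling:

```python
import numpy as np

def annotate_features(feat_acts, labels):
    """Rank SAE features by selectivity for an annotation label.

    feat_acts: (n_tokens, n_features) SAE feature activations.
    labels: (n_tokens,) boolean mask, True where the token carries a
            given narrative annotation (e.g. a goal statement).
    Returns (order, score): feature indices sorted by the difference
    between mean activation on labeled vs. unlabeled tokens.
    """
    on = feat_acts[labels].mean(axis=0)    # mean activation on annotated tokens
    off = feat_acts[~labels].mean(axis=0)  # mean activation elsewhere
    score = on - off
    order = np.argsort(score)[::-1]        # most selective features first
    return order, score
```

Running this per annotation type over the corpus would yield a first-pass feature dictionary, to be refined by human inspection.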
Budget
Funding primarily supports compute infrastructure. I'm based in Chile, which provides significant runway advantages — this budget goes further than equivalent US-based research.
| Item | Allocation |
|---|---|
| GPU cluster (4× RTX 4090 or equivalent) | $16,000 |
| Cloud compute overflow / experimentation | $2,000 |
| Infrastructure (storage, networking) | $1,500 |
| Miscellaneous (publication fees, tools) | $500 |
| Researcher stipend (12 months) | $25,000 |
| Total | $45,000 |
Note: based in Chile, where the cost of living is significantly lower than in the US or EU; the stipend enables twelve months of full-time, dedicated research.
Why Me
I bring a unique combination of deep technical experience and long-standing work at the intersection of interactive systems and AI. This isn't a pivot — it's the convergence of parallel research threads I've pursued for over a decade.
Technical Development
Founded TradeLayer (2014), a Bitcoin-native derivatives protocol. Deep experience shipping production financial systems.
Interactive Narrative
Collaboration with Chris Crawford and the game design community. Long-term research into storyworlds as computational structures.
Qubit Systems
Operating commercial quantum computing systems for semantic indexing. Provisional patents filed for QFT-based techniques.
Token Corpus
Built and maintain a large-scale token corpus with QFT-enhanced retrieval systems. The infrastructure is ready for training.
Theory of Change
Current interpretability research struggles with the gap between mathematical feature descriptions and human-understandable concepts. Narrative structures provide a natural bridge — humans already think in terms of agents, goals, obstacles, and causality.
If this works:
- A new methodology for interpretability research that produces more human-legible features.
- Evidence for or against the hypothesis that training-data structure shapes internal representation quality.
- Open tools and models that other researchers can build on.
If this doesn't work:
- Valuable negative results about the limits of data-structure-driven interpretability.
- The models, data, and tools are still released for others to learn from.
- The failure modes themselves inform future research directions.
Let's Talk
I'm happy to discuss this research program in detail, answer questions about methodology, or explore how this work might complement your foundation's priorities.
Get in Touch