Frameworks, protocols, and tools for building intelligent systems. PDF documents for reference, HTML for interactive viewing.
Three formats dominate the quantized model landscape. GGUF packs everything in one self-describing binary. AWQ stores weights as safetensors for GPU serving. EXL2 adds per-column error maps for maximum quality per bit. This article opens all three.
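A taste of what "self-describing binary" means for GGUF: the public GGUF spec starts every file with a fixed little-endian preamble (4-byte magic `GGUF`, a uint32 version, then uint64 tensor and metadata counts). A minimal sketch, using a synthetic header rather than a real model file:

```python
import struct

def parse_gguf_header(buf: bytes) -> dict:
    # Fixed GGUF preamble per the public spec: 4-byte magic "GGUF",
    # uint32 version, uint64 tensor count, uint64 metadata k/v count,
    # all little-endian.
    magic, version, n_tensors, n_kv = struct.unpack_from("<4sIQQ", buf, 0)
    if magic != b"GGUF":
        raise ValueError("not a GGUF file")
    return {"version": version, "tensors": n_tensors, "metadata_kv": n_kv}

# Synthetic 24-byte header (version 3, 291 tensors, 24 metadata pairs).
header = struct.pack("<4sIQQ", b"GGUF", 3, 291, 24)
print(parse_gguf_header(header))  # {'version': 3, 'tensors': 291, 'metadata_kv': 24}
```

Everything after this preamble (metadata key/value pairs, tensor descriptors, tensor data) is likewise length-prefixed, which is why a single file can be loaded with no sidecar config.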
An end-to-end look at production LLM inference: request handling, tokenization, prefill, KV cache management, decode, schedulers, quantization, context windows, streaming APIs, and the serving stack behind real latency.
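The prefill/decode split that article walks through can be shown in a toy single-head attention sketch: prefill computes keys and values for the whole prompt once, and each decode step appends one row to the cache instead of recomputing it, so per-token cost grows linearly with context. All names and sizes here are illustrative, not from any serving stack:

```python
import numpy as np

rng = np.random.default_rng(0)
d = 8                                          # toy head dimension
Wq, Wk, Wv = (rng.standard_normal((d, d)) for _ in range(3))

def attend(q, K, V):
    # Single-query scaled dot-product attention over cached keys/values.
    scores = q @ K.T / np.sqrt(d)
    w = np.exp(scores - scores.max())
    return (w / w.sum()) @ V

# Prefill: compute K/V for every prompt position in one pass.
prompt = rng.standard_normal((5, d))           # 5 "token" embeddings
K_cache, V_cache = prompt @ Wk, prompt @ Wv

# Decode: each step appends ONE new row to the cache, then attends
# over everything cached so far.
x = rng.standard_normal(d)                     # latest token embedding
for _ in range(3):
    K_cache = np.vstack([K_cache, x @ Wk])
    V_cache = np.vstack([V_cache, x @ Wv])
    x = attend(x @ Wq, K_cache, V_cache)       # toy "next token" state

print(K_cache.shape)  # (8, 8): 5 prompt rows + 3 decoded rows
```

Real engines add batching, paged cache allocation, and sampling on top, but the cache-append shape of the decode loop is the same.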
A systematic framework for LLM agents to tackle hard optimization and debugging problems. Covers theoretical floors, structured worklogs, agent architecture, bottleneck hierarchy, the wall protocol, optimization patterns, and SQLite-based memory systems.
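As a flavor of what a SQLite-based memory system can look like, here is a minimal sketch with a hypothetical `worklog` schema (the table and column names are assumptions for illustration, not the framework's actual schema): one table of attempts tagged by problem and outcome, so a later agent run can ask what has already been tried.

```python
import sqlite3

# Hypothetical schema: timestamped worklog entries, tagged by problem
# and outcome, queryable by later agent runs.
con = sqlite3.connect(":memory:")
con.execute("""
    CREATE TABLE worklog (
        id      INTEGER PRIMARY KEY,
        problem TEXT NOT NULL,
        attempt TEXT NOT NULL,
        outcome TEXT CHECK (outcome IN ('win', 'dead_end', 'open')),
        ts      TEXT DEFAULT CURRENT_TIMESTAMP
    )
""")
con.executemany(
    "INSERT INTO worklog (problem, attempt, outcome) VALUES (?, ?, ?)",
    [
        ("slow_join", "add covering index", "win"),
        ("slow_join", "rewrite as EXISTS", "dead_end"),
    ],
)

# A later run asks: what already failed for this problem?
dead_ends = con.execute(
    "SELECT attempt FROM worklog WHERE problem = ? AND outcome = 'dead_end'",
    ("slow_join",),
).fetchall()
print(dead_ends)  # [('rewrite as EXISTS',)]
```

The point of persisting this is that dead ends are as valuable as wins: an agent that can query past failures avoids re-walking them.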
A framework for LLM agents to systematically explore and understand complex systems, codebases, and problem spaces. Covers uncertainty mapping, exploration journals, agent roles (Surveyor, Diver, Tracer, Synthesizer, Challenger), and memory systems.
An autonomous scientific discovery engine that extracts, validates, and synthesizes claims from ML research papers. Builds a knowledge graph with regime-gated edges and mines it for testable hypotheses using anti-hype scoring.
A soccer prediction system combining Elo ratings, Bayesian calibration, and Monte Carlo simulation. Features walk-forward backtesting, multi-league support, season odds simulation, and evidence-based betting strategy analysis.
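The Elo core of such a system fits in a few lines. A minimal sketch: the standard Elo expected score with a home-advantage offset, a K-factor update, and a crude Monte Carlo loop over one fixture. The `home_adv`, `k`, and draw-probability values are illustrative assumptions, not the project's calibrated parameters:

```python
import random

def elo_expected(r_home, r_away, home_adv=60.0):
    # Standard Elo expected score; home_adv is an assumed rating offset.
    return 1.0 / (1.0 + 10 ** ((r_away - r_home - home_adv) / 400.0))

def elo_update(r_home, r_away, score_home, k=20.0):
    # score_home: 1 = home win, 0.5 = draw, 0 = home loss.
    delta = k * (score_home - elo_expected(r_home, r_away))
    return r_home + delta, r_away - delta

def simulate(r_home, r_away, n=10_000, p_draw=0.25, seed=1):
    # Toy draw model for illustration: carve a fixed draw probability
    # out of the Elo expectation, then sample the fixture n times.
    rng = random.Random(seed)
    p_home = elo_expected(r_home, r_away) * (1 - p_draw)
    wins = sum(rng.random() < p_home for _ in range(n))
    return wins / n

print(round(elo_expected(1600, 1500), 3))  # ~0.715 with the assumed home edge
print(simulate(1600, 1500))
```

A real system replaces the fixed draw split with a calibrated model (and walk-forward backtesting guards against fitting those parameters to the future), but the rating update itself is this simple.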