§ R — Our research

Pushing the frontier of agentic AI.

Building agents that actually do the work means rethinking the stack, from tokens to tools to orchestration. We publish what we learn along the way: efficiency wins, failure modes, surprises, and the quiet engineering that makes the hard part feel easy.

Focus: Efficiency · orchestration · agentic systems

Active threads: 4

Status: Open · publishing as we go

§ FR — Frontier

On the frontier.

FIG. 00 · GRIA · STRUCTURAL

shape ↔ hole · retrieval by alignment

Frontier · in development

GRIA: Graph Retrieval Inference Architecture.

A retrieval architecture where a small, structural model reaches for knowledge by the shape of its own uncertainty. Topics live as shapes in a shared representational space; reasoning produces shapes of holes that align against them. The corpus fills the hole; the model never needed to hold the answer to reach for it.

Frontier research — more to share when we can.

Class: Graph · retrieval

Reach: Shape · shape-space

Stage: Early · frontier

§ F — Featured

What's shipping first.

FIG. 01 · TOKEN DISTILLATION

Δ tokens · Δ cost · baseline → optimized

Active · in progress

Token-efficient inference for agentic workloads.

Long-running agents burn context. We're studying how to make every token earn its keep — through smarter routing, stricter scaffolds, and aggressive distillation of intermediate state. Early results: a 4.6× improvement on both token and dollar cost, with no measurable drop in task completion.

A working paper is in preparation. More to come on benchmarks, methodology, and the surprising places efficiency hides.

Token cost: 4.6× · more efficient

Dollar cost: 4.6× · cheaper per task

Quality: ≈ · no measurable loss
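To make the 4.6× figure concrete, here is a minimal arithmetic sketch of what an efficiency factor implies as a fractional saving. The function names and the 1,000,000-token baseline are illustrative assumptions, not measurements from the working paper.

```python
def savings_from_factor(factor: float) -> float:
    """Fraction of baseline cost removed by a `factor`-times efficiency gain."""
    if factor <= 0:
        raise ValueError("factor must be positive")
    return 1.0 - 1.0 / factor


def optimized_cost(baseline: float, factor: float) -> float:
    """Cost after the gain, in the same units as `baseline` (tokens or dollars)."""
    if factor <= 0:
        raise ValueError("factor must be positive")
    return baseline / factor


# Hypothetical baseline: a run that consumes 1,000,000 tokens.
baseline_tokens = 1_000_000
print(f"saving: {savings_from_factor(4.6):.1%}")                      # saving: 78.3%
print(f"optimized: {round(optimized_cost(baseline_tokens, 4.6))}")    # optimized: 217391
```

Read this way, a 4.6× gain means roughly 78% of the baseline tokens, and the same share of dollar cost, are no longer spent per task.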

§ W — What we're doing

Active threads.

The questions we're sitting with right now. Some will become papers, some will become product. We share the receipts either way.

/ 01 Active

Token & cost efficiency

Cutting the cost of long-horizon agentic runs without trading away quality. Where the savings actually live, and what they reveal about model behavior under load.

/ 02 Ongoing

Long-horizon orchestration

How an agentic OS keeps state, plans, and self-correction coherent across hundreds of steps. The boring part of agents that quietly decides whether they finish.

/ 03 Ongoing

Tool-use reliability

Browsers, terminals, files — the surfaces agents actually act on. Quantifying the failure modes and the scaffolds that turn a flaky tool call into a finished task.

/ 04 Exploratory

Cross-provider routing

When you're not married to one model, you can pick the right one per step. Studying how routing decisions compound across long runs — for cost, latency, and quality.
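One way to picture per-step routing is a toy policy that picks the cheapest model clearing a step's quality bar. Everything below is a hypothetical sketch: the model names, costs, quality scores, and the `route` and `run_cost` helpers are assumptions for illustration, not Essarion's router.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Model:
    name: str
    cost_per_step: float  # arbitrary cost units (assumed)
    quality: float        # assumed quality score in [0, 1]


def route(models, min_quality):
    """Pick the cheapest model meeting `min_quality`; fall back to the best one."""
    eligible = [m for m in models if m.quality >= min_quality]
    if not eligible:
        return max(models, key=lambda m: m.quality)
    return min(eligible, key=lambda m: m.cost_per_step)


def run_cost(step_requirements, models):
    """Total cost of a run when every step is routed independently."""
    return sum(route(models, q).cost_per_step for q in step_requirements)


# Hypothetical model pool and per-step quality requirements.
MODELS = [
    Model("small", 1.0, 0.70),
    Model("medium", 3.0, 0.85),
    Model("large", 10.0, 0.95),
]
steps = [0.6, 0.6, 0.9, 0.8, 0.6]

routed = run_cost(steps, MODELS)
always_large = len(steps) * 10.0
print(routed, always_large)  # 16.0 50.0
```

Even in this toy setting the per-step choice compounds: five steps cost 16 units when routed versus 50 when every step goes to the largest model. A real router would also have to weigh latency and the cost of a wrong pick.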

§ P — Publications

What we've written down.

Working papers, preprints, and technical notes. Newest first. We'll keep this list short and honest — only the things worth reading.

2026 · Q2 · Working paper

Predicting Grokking from Early Training Dynamics.

Essarion Research

Can we forecast eventual grokking before validation accuracy even begins to move? We demonstrate that a simple regularized model, trained on early-training dynamical signals, can predict long-horizon generalization with AUROC 0.955 as early as step 200—saving compute and providing new mechanistic insights into delayed generalization.
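For readers less familiar with the metric, AUROC is the probability that a randomly chosen positive example (here, a run that eventually groks) receives a higher predicted score than a randomly chosen negative one. The sketch below computes it by direct pairwise comparison; the labels and scores are made-up toy values, and this is not the paper's predictor or data.

```python
def auroc(labels, scores):
    """AUROC by pairwise comparison; ties between a positive and a negative count as half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy data: 1 = run eventually grokked, 0 = it did not (hypothetical values).
labels = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.3, 0.2]
print(f"{auroc(labels, scores):.3f}")  # 0.889
```

On this scale 0.5 is chance and 1.0 is a perfect ranking, so a reported 0.955 means the predictor ranks a grokking run above a non-grokking one about 95.5% of the time.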

Read Paper · GitHub Repository · Download PDF

··· In progress

More to come.

Essarion Research

Notes on orchestration, tool-use reliability, and cross-provider routing are in the pipeline. We'll publish here when each one is ready.

§ C — Collaborate

Working on something adjacent?

We're a small research group inside Essarion. If you're working on agentic systems — efficiency, orchestration, evaluation, tool-use — we'd like to hear from you.

Open to collaborations, residencies, and the occasional very specific question.

research@essarion.com

Open lab · research at essarion