§ R — Our research

Pushing the frontier of agentic AI.

Building agents that actually do the work means rethinking the stack — from tokens to tools to orchestration. We publish what we learn along the way: efficiency wins, failure modes, surprises, and the quiet engineering that makes the easy part feel easy.

Focus: Efficiency · orchestration · agentic systems
Active threads: 4
Status: Open · publishing as we go

§ W — What we're doing

Active threads.

The questions we're sitting with right now. Some will become papers, some will become product. We share the receipts either way.

/ 01 Active

Token & cost efficiency

Cutting the cost of long-horizon agentic runs without trading away quality. Where the savings actually live, and what they reveal about model behavior under load.

/ 02 Ongoing

Long-horizon orchestration

How an agentic OS keeps state, plans, and self-correction coherent across hundreds of steps. The boring part of agents that quietly decides whether they finish.

/ 03 Ongoing

Tool-use reliability

Browsers, terminals, files — the surfaces agents actually act on. Quantifying the failure modes and the scaffolds that turn a flaky tool call into a finished task.

/ 04 Exploratory

Cross-provider routing

When you're not married to one model, you can pick the right one per step. Studying how routing decisions compound across long runs — for cost, latency, and quality.

§ P — Publications

What we've written down.

Working papers, preprints, and technical notes. Newest first. We'll keep this list short and honest — only the things worth reading.

  1. 2026 · Q2 · Working paper

    Predicting Grokking from Early Training Dynamics.

    Essarion Research

    Can we forecast eventual grokking before validation accuracy even begins to move? We demonstrate that a simple regularized model, trained on early-training dynamical signals, can predict long-horizon generalization with AUROC 0.955 as early as step 200—saving compute and providing new mechanistic insights into delayed generalization.

  2. ··· In progress

    More to come.

    Essarion Research

    Notes on orchestration, tool-use reliability, and cross-provider routing are in the pipeline. We'll publish here when each one is ready.

§ C — Collaborate

Working on something adjacent?

We're a small research group inside Essarion. If you're working on agentic systems — efficiency, orchestration, evaluation, tool-use — we'd like to hear from you.

Open to collaborations, residencies, and the occasional very specific question.

research@essarion.com
Open lab: research at essarion