Posts

All the articles I've posted.

The Price of Anarchy in Disaggregated Inference

17 Jun, 2026

I split NVIDIA Dynamo's prefill and decode into three competing games and measured the Price of Anarchy on a 3-node B200 cluster. While the GPUs had headroom, no router tuning moved the needle; the moment they saturated, one parameter was the gap between a 1-second tail and a 28-second one. So I built a 270-line controller that watches for that moment and flips the switch, without touching Dynamo's core.

Mine the Way Your Model Scores: MaxSim Hard-Negative Mining for a Late-Interaction Student

Updated: 7 Jun, 2026

The standard way to mine hard negatives for a late-interaction model uses a single-vector cosine teacher, even though the model itself scores with multi-vector MaxSim. So I rebuilt my miner to score the way my model does. Matched mining clearly beat training with no mined negatives, while the cosine approach was barely doing anything at all.

Diminishing Returns and the Art of Knowing When to Stop

15 Mar, 2026

I trained three generations of ColQwen3.5, each with more sophisticated optimization than the last. The most optimized version barely beat the previous one on the primary benchmark (+0.0011 nDCG@5). Individual tasks reshuffled substantially, with per-task swings an order of magnitude larger than the aggregate gain.

Closing the AI Value Gap: Insights from Research

25 Jan, 2026

Enterprise AI adoption has reached 88%, yet only 5% of pilots deliver measurable impact. Research from MIT, BCG, and RAND reveals what separates successful implementations from the rest. It's not the technology.

Implementing Spatially-Grounded Document Retrieval via Patch-to-Region Propagation

2 Dec, 2025

A deep dive into my recent research on spatially-grounded document retrieval using ColPali models and OCR bounding boxes, enabling precise region-level retrieval at inference time without additional training.
