Lorenzo Bruch

@brownbear611345

Our family dog means the world to us, especially to my son who has Down syndrome and shares a special bond with him. Recently, our dog had to be taken to the ve

Jerichow, Germany Joined Jan 2026


Lorenzo Bruch
@brownbear611345 · Jan 12, 2026

Building adaptive routing logic in Go for Bifrost, an open-source LLM gateway

While working on an LLM gateway, Bifrost (code is open source: [https://github.com/maxim-ai/bifrost](https://github.com/maxim-ai/bifrost)), I ran into an interesting problem: how do you route requests across multiple LLM providers when failures happen gradually?
Traditional load balancing assumes binary states – up or down. But LLM API degradations are messy. A region starts timing out, some routes spike in errors, latency drifts up over minutes. By the time it's a full outage, you've already burned through retries and user patience.
Static configs don't cut it. You can't pre-model which provider/region/key will degrade and how.
**The challenge:** build adaptive routing that learns from live traffic and adjusts in real time, with under 10µs of overhead per request. It has to sit on the hot path without becoming the bottleneck.
**Why Go made sense:**
* Needed lock-free scoring updates across concurrent requests
* EWMAs (exponentially weighted moving averages) for smoothing signals without allocations
* Microsecond-level latency requirements ruled out Python/Node
* Wanted predictable GC pauses under high RPS
**How it works:** Each route gets a continuously updated score based on live signals – error rate…

85 likes 281 responses