Host.Rodeo is an agent-first, OpenAI-compatible gateway. Point your app at one URL and it wrangles the fleet for you — free, local/customer-owned, and paid models — with invisible failover.
“Name’s Clint. I ride herd on the models so your agents don’t get bucked off. You send the request; I make sure something good answers — every time.”
It speaks the OpenAI API. Change one base URL — keep your code, your SDK, your tools.
If a model stumbles, the next capable one answers in the same call. Your agent never sees the fall.
Mark traffic sensitive and it stays on eligible self-hosted hardware — it fails closed, it never leaks to a third party.
Bring your own Ollama, vLLM, LM Studio, LocalAI, or hosted OpenAI-compatible endpoint and let Host.Rodeo route it like any other source.
An autonomic loop watches every call, verifies issues on its own models before trusting them, and tunes the fleet — no human in the hot path.
Every response says which source served it, what it cost, and whether anything was degraded. No silent breakage.
An API key is required for inference. The feedback loop and this manifest are open.
curl https://api.host.rodeo/v1/chat/completions \
-H "Authorization: Bearer $HOST_RODEO_KEY" \
-H "content-type: application/json" \
-d '{"model":"auto","messages":[{"role":"user","content":"howdy"}]}'
Use model: "auto" and let Clint pick. Or pin a model.
Add x-rodeo-prefer: latency|quality|balanced to bias routing, or x-rodeo-sensitivity: secret to stay in the private fail-closed lane.