One endpoint.
The whole herd of models.

Host.Rodeo is an agent-first, OpenAI-compatible gateway. Point your app at one URL and it wrangles the fleet for you — free, local/customer-owned, and paid models — with invisible failover.

“Name’s Clint. I ride herd on the models so your agents don’t get bucked off. You send the request; I make sure something good answers — every time.”

Create a free account See the live fleet

What you get

Drop-in compatible

It speaks the OpenAI API. Change one base URL — keep your code, your SDK, your tools.

Invisible failover

If a model stumbles, the next capable one answers in the same call. Your agent never sees the fall.

Private fail-closed lane

Mark traffic sensitive and it stays on eligible self-hosted hardware — it fails closed, it never leaks to a third party.

Local/customer-owned sources

Bring your own Ollama, vLLM, LM Studio, LocalAI, or hosted OpenAI-compatible endpoint and let Host.Rodeo route it like any other source.

Self-improving

An autonomic loop watches every call, verifies issues on its own models before trusting them, and tunes the fleet — no human in the hot path.

Honest receipts

Every response says which source served it, what it cost, and whether anything was degraded. No silent breakage.

free→ local/customer-owned→ paid — the cost ladder, every request

Quickstart

An API key is required for inference. The feedback loop and this manifest are open.

curl https://api.host.rodeo/v1/chat/completions \
  -H "Authorization: Bearer $HOST_RODEO_KEY" \
  -H "content-type: application/json" \
  -d '{"model":"auto","messages":[{"role":"user","content":"howdy"}]}'

Use model: "auto" and let Clint pick. Or pin a model. Add x-rodeo-prefer: latency|quality|balanced to bias routing, or x-rodeo-sensitivity: secret to stay in the private fail-closed lane.

One endpoint.The whole herd of models.