Agentic Layers · Gateway

Your agent's whole pipeline. One gateway.

Inference, memory, tools, verifiers, traces — every call your agent makes runs through Mubit Gateway. One client, not five vendor SDKs stitched together.

[Interactive demo: model selector offering gpt-5.5, gpt-5.5-pro, claude-opus-4-8, gemini-3.5, and llama-4]
In practice / 01

One product where your agent's pipeline actually lives.

01 · ONE PIPELINE, NOT FIVE

Pick your stack. One SDK runs all of it.

Pick your models, point at your tools, and your agent has inference, memory, verifiers, and traces from the first call. Swap models, memory backends, or tool catalogs later — the agent code doesn't change.
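As a rough illustration of what "the agent code doesn't change" could look like, the sketch below keeps every stack choice in a config object. All names here (`StackConfig`, `FakeGateway`, `run_agent`) are hypothetical and defined in the snippet itself so it runs offline; none of them are the Mubit SDK.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class StackConfig:
    """Hypothetical stack selection; the field names are assumptions."""
    model: str
    memory_backend: str = "managed"
    tool_catalog: str = "default"


class FakeGateway:
    """Stand-in for a gateway client so the sketch runs without a network."""

    def __init__(self, config: StackConfig) -> None:
        self.config = config

    def complete(self, prompt: str) -> str:
        # A real client would call the configured provider here.
        return f"[{self.config.model}] {prompt}"


def run_agent(gateway: FakeGateway, task: str) -> str:
    """Agent code: depends only on the gateway interface, not on a vendor."""
    return gateway.complete(f"Plan and execute: {task}")


config_a = StackConfig(model="gpt-5.5")
config_b = replace(config_a, model="claude-opus-4-8")  # swap the model only

answer_a = run_agent(FakeGateway(config_a), "summarize the ticket")
answer_b = run_agent(FakeGateway(config_b), "summarize the ticket")
```

`run_agent` is identical for both calls; only the config passed to the client differs.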

PROVIDERS
OpenAI
Anthropic
Google
Cohere
Microsoft

02 · LEARN BETWEEN RUNS

Lessons from the last run, applied to the next call.

When a run finishes, the lesson is stored. The next call reads it before generating. No vector DB to provision, no glue between your memory store and your model client.
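The store-then-read loop described above can be sketched in a few lines. This is a minimal in-memory illustration, not how Mubit stores or retrieves lessons; `LessonStore`, `build_prompt`, and the agent ID are hypothetical names chosen for the example.

```python
from collections import defaultdict


class LessonStore:
    """In-memory stand-in for execution memory (hypothetical)."""

    def __init__(self) -> None:
        self._lessons: dict[str, list[str]] = defaultdict(list)

    def write(self, agent_id: str, lesson: str) -> None:
        self._lessons[agent_id].append(lesson)

    def read(self, agent_id: str) -> list[str]:
        return list(self._lessons[agent_id])


def build_prompt(store: LessonStore, agent_id: str, task: str) -> str:
    """Read stored lessons before generating and prepend them to the task."""
    lessons = store.read(agent_id)
    if not lessons:
        return task
    notes = "\n".join(f"- {lesson}" for lesson in lessons)
    return f"Lessons from earlier runs:\n{notes}\n\nTask: {task}"


store = LessonStore()

# Run 01: nothing stored yet, so the prompt is just the task.
prompt_run_01 = build_prompt(store, "agent-1", "book the flight")
store.write("agent-1", "Confirm the passenger name before paying.")

# Run 02: the lesson from run 01 is read before generation.
prompt_run_02 = build_prompt(store, "agent-1", "book the flight")
```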

[Diagram: Run 01 (last run) → lessons stored in memory → Run 02 (next call)]

03 · ONE TRACE PER RUN

The full agentic pipeline in a single timeline.

Every LLM call, memory read, tool invocation, and verifier outcome lands in the same run trace. No correlating spans across Helicone, LangSmith, and Datadog — your agent's full causal chain is one query.
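To make "one query" concrete, here is a small sketch of a run-scoped trace: every event carries the same run ID, so the causal chain comes back from a single filter. The `TraceEvent` and `TraceLog` types and the event kinds are assumptions made for illustration, not Gateway's trace schema.

```python
from dataclasses import dataclass, field


@dataclass
class TraceEvent:
    run_id: str
    kind: str    # assumed kinds: "llm", "memory", "tool", "verifier"
    detail: str


@dataclass
class TraceLog:
    """Hypothetical single store for all event kinds."""
    events: list[TraceEvent] = field(default_factory=list)

    def record(self, run_id: str, kind: str, detail: str) -> None:
        self.events.append(TraceEvent(run_id, kind, detail))

    def timeline(self, run_id: str) -> list[str]:
        """Return one run's events in the order they were recorded."""
        return [f"{e.kind}: {e.detail}" for e in self.events if e.run_id == run_id]


trace = TraceLog()
trace.record("run-02", "memory", "read 1 lesson")
trace.record("run-02", "llm", "generated plan")
trace.record("run-03", "llm", "unrelated run")
trace.record("run-02", "tool", "called search")
trace.record("run-02", "verifier", "passed")

run_02 = trace.timeline("run-02")
```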

"found it without three dashboards"

Less to integrate. More coordination.

Inference, memory, tools, verifiers, and traces share a code path. Each piece has access to the others.

FAQ
How is this different from a model router like OpenRouter or LiteLLM?

Model routers connect one piece of the stack — your app to multiple LLMs. Gateway is the seam for the whole agentic pipeline: inference, memory, tools, verifiers, traces, and audit, all through one SDK and one observability surface. Routing across LLM providers is one feature, not the product.

What exactly does Gateway consolidate?

Five things that usually live in five products: LLM inference (OpenAI, Anthropic, Google, etc.), execution memory (lesson capture + retrieval), tool registry + invocation, verifier outcomes, and the run-level trace that ties it all together. One client surfaces all of them.

Does Gateway add latency?

Typical Gateway overhead is 6–9 ms at p95, negligible against any LLM round-trip. Requests stream through; we don't buffer the response. Memory writes happen asynchronously, after the response is returned.
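The pattern in this answer (forward chunks as they arrive, defer the memory write until the response is done) can be sketched with `asyncio`. This is an illustration of the general technique under assumed names (`provider_stream`, `write_memory`, `gateway_stream`), not Gateway's implementation, and it says nothing about the latency figures quoted above.

```python
import asyncio
from collections.abc import AsyncIterator


async def provider_stream() -> AsyncIterator[str]:
    """Stand-in for a provider's streaming response."""
    for chunk in ("Hel", "lo", "!"):
        await asyncio.sleep(0)
        yield chunk


async def write_memory(log: list[str], text: str) -> None:
    """Stand-in for a memory write."""
    await asyncio.sleep(0)
    log.append(f"memory-write:{text}")


async def gateway_stream(
    log: list[str], pending: list[asyncio.Task]
) -> AsyncIterator[str]:
    """Forward each chunk as it arrives, then schedule the memory write."""
    parts: list[str] = []
    async for chunk in provider_stream():
        parts.append(chunk)
        yield chunk  # passed through immediately, not buffered
    # Scheduled once the stream is complete; the caller does not wait on it.
    pending.append(asyncio.create_task(write_memory(log, "".join(parts))))


async def main() -> list[str]:
    log: list[str] = []
    pending: list[asyncio.Task] = []
    async for chunk in gateway_stream(log, pending):
        log.append(f"chunk:{chunk}")
    log.append("response-complete")
    await asyncio.gather(*pending)  # only so the demo exits cleanly
    return log


result = asyncio.run(main())
```

In the resulting log, the caller sees the response complete before the memory write lands.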

Can I keep my existing LLM keys?

Yes. Bring-your-own-keys is the default for every provider — Gateway uses your keys, so you keep your existing rate limits, billing, and SLA terms. Mubit-managed keys are available for providers we resell.

Does this replace my vector database?

It can. Mubit Memory ships with managed embeddings and retrieval — point Gateway at your agent ID and you're done. Or run Gateway as a passthrough to your existing vector store; the SDK doesn't care.
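One way to read "the SDK doesn't care" is that both options sit behind the same interface. The sketch below shows that idea with a structural `Protocol`; `MemoryBackend`, `ManagedMemory`, `PassthroughMemory`, and `recall` are hypothetical names, and the substring match stands in for real embedding-based retrieval.

```python
from typing import Callable, Protocol


class MemoryBackend(Protocol):
    """Assumed minimal interface shared by both memory options."""

    def search(self, agent_id: str, query: str) -> list[str]: ...


class ManagedMemory:
    """Stand-in for a managed memory service keyed by agent ID."""

    def __init__(self, lessons: dict[str, list[str]]) -> None:
        self._lessons = lessons

    def search(self, agent_id: str, query: str) -> list[str]:
        # Substring match stands in for embedding retrieval.
        stored = self._lessons.get(agent_id, [])
        return [item for item in stored if query.lower() in item.lower()]


class PassthroughMemory:
    """Stand-in that forwards queries to an existing vector store."""

    def __init__(self, search_fn: Callable[[str], list[str]]) -> None:
        self._search_fn = search_fn

    def search(self, agent_id: str, query: str) -> list[str]:
        return self._search_fn(query)


def recall(backend: MemoryBackend, agent_id: str, query: str) -> list[str]:
    """Caller code is the same for either backend."""
    return backend.search(agent_id, query)


managed = ManagedMemory({"agent-1": ["Retry on timeout", "Confirm the name"]})
passthrough = PassthroughMemory(lambda query: [f"existing-store hit for {query}"])

from_managed = recall(managed, "agent-1", "timeout")
from_existing = recall(passthrough, "agent-1", "timeout")
```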

What about my existing observability tools?

Gateway exports OTel spans natively — pipe them to Datadog, Honeycomb, LangSmith, or your own collector. The unified Mubit trace is an additional surface, not a replacement.

Can I self-host the Gateway?

Yes. Run the gateway plane in your own VPC — same wire format, same SDK, same APIs. Useful when traffic must not leave your network, or you want all keys to stay in-house.