Bring Your Own Model: Connect Your Own LLMs to Corporate LLM

Connect your own LLM to Corporate LLM: OpenAI-compatible endpoints, Ollama, and OpenRouter, secured against SSRF with DNS pinning. Available on Free and all paid plans.

Brand visual: a central Corporate LLM hub with three additional docking paths for custom models.

Bring Your Own Model (BYOM) is available in Corporate LLM starting today: on the Free plan and all paid plans, customers can connect their own LLM endpoints directly into the model picker, secured against SSRF with DNS pinning, with no second tool required.

Three examples from the field: A law firm runs a Llama 3 model, fine-tuned on its in-house client letters, on its own GPUs and exposes it via a secured HTTPS endpoint. An industrial company is testing an experimental model from OpenRouter that is not yet available on every platform. A mid-sized company wants to connect its self-hosted Ollama model instead of paying for a cloud model for every routine task.

Until now, each of these use cases had to happen outside of Corporate LLM. Starting today, they run in the same login as GPT, Claude, and Gemini.

Overview: With Bring Your Own Model (BYOM), Corporate LLM customers on the Free plan and all paid tiers can connect their own LLM endpoints: three adapter types for OpenAI-compatible APIs, Ollama endpoints, and OpenRouter, secured with DNS pinning against SSRF and encrypted credentials. BYOM is an add-on with no separate token budget; you settle inference costs directly with your provider.

Why the standard provider list is not enough and you need to connect your own LLM

The Corporate LLM hub bundles four cloud providers (OpenAI, Anthropic, Google, Mistral) plus a curated open-source selection (see the multi-model picker). This standard list covers the vast majority of SMB use cases, but three scenarios need more:

  • Fine-tuned models. A law firm with its own corpus of client letters has trained Llama 3 on it. The model runs on the firm's own hardware under its own contract and should be usable in the same workspace as the standard models.
  • Existing contracts and capacity. A team already has contracts or reserved capacity with a provider and wants to use them within the same login, instead of paying per token in the standard pool on top.
  • Experimental models. Frontier models that go live on OpenRouter first and are not yet available from the four major providers. If you want early access, you need the OpenRouter adapter — without opening a second tool.

Until now, that meant ChatGPT, Claude, and Gemini ran inside Corporate LLM, while your own model ran alongside in a separate tool with its own auth, its own data processing agreement (DPA), and its own audit trail. The result was a fragmented multi-tool workflow, with all the adoption problems that entails.

Connect your own LLM with BYOM: three adapters, one model picker

BYOM extends the model picker with three connection types:

  • OpenAI-compatible. Any HTTPS endpoint that speaks the OpenAI API: vLLM, LiteLLM, Azure OpenAI, Mistral La Plateforme, your own cloud setups. You enter the base URL and API key, Corporate LLM tests the connection once and adds the discovered models to the picker.
  • Ollama. Self-hosted open-source models, connected via a publicly reachable HTTPS endpoint with a /v1 surface and bearer auth. A local address such as http://localhost:11434 is not sufficient; instead, you put an auth proxy plus an HTTPS tunnel in front of it (instructions below). The connection setup pulls the model list from the endpoint and makes it available in the picker.
  • OpenRouter. One account, many models from a single source: Kimi K2, DeepSeek, Qwen, early frontier models. You enter your OpenRouter key and choose which models should appear in the picker.
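
The one-time connection test for an OpenAI-compatible endpoint can be pictured roughly as follows. This is a minimal sketch, not Corporate LLM's actual implementation: it assumes the endpoint follows the OpenAI convention of listing models under GET <base URL>/models with a bearer token, and the function names and the example host are made up for illustration.

```python
import json
import urllib.request


def build_models_request(base_url: str, api_key: str) -> urllib.request.Request:
    """Build the GET request that lists models on an OpenAI-compatible endpoint."""
    if not base_url.lower().startswith("https://"):
        raise ValueError("base URL must use https://")
    url = base_url.rstrip("/") + "/models"
    return urllib.request.Request(
        url, method="GET", headers={"Authorization": f"Bearer {api_key}"}
    )


def parse_model_ids(response_body: str) -> list[str]:
    """Extract model ids from an OpenAI-style {"object": "list", "data": [...]} body."""
    payload = json.loads(response_body)
    return [entry["id"] for entry in payload.get("data", [])]


def discover_models(base_url: str, api_key: str, timeout: float = 10.0) -> list[str]:
    """Call the endpoint once and return the model ids it reports."""
    request = build_models_request(base_url, api_key)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return parse_model_ids(response.read().decode("utf-8"))
```

Calling discover_models("https://llm.example.com/v1", "<your key>") against a real endpoint would return the ids that then show up in the picker. A plain http:// address such as http://localhost:11434 is rejected before any request is sent.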

In chat, BYOM models behave like any other: model switching mid-conversation without losing context, and the same usage tracking as the standard models. For more on the standard setup without custom models, see the LLM platform for the Mittelstand.

BYOM in practice: a law firm, an industrial company, and an SMB with their own models

Law firm with a client-letter Llama. The firm runs Llama 3 on two H100 GPUs in its own data center, fine-tuned on 12 years of client letters. Through an upstream auth proxy with an OpenAI-compatible API and an HTTPS endpoint, the model is connected via BYOM. The lawyer writes in the Corporate LLM chat, and the fine-tuned model responds in the firm's house style. Inference runs on the firm's own hardware under its own contract; Corporate LLM orchestrates the request as a routing layer and keeps the credentials encrypted. This supports the protection of client confidentiality under § 203 of the German Criminal Code (StGB), because the model stays under the firm's own control.

Industrial company with an experimental reasoning model. The innovation lead is testing a freshly released reasoning model on OpenRouter that is not yet available from OpenAI or Anthropic. Through the BYOM OpenRouter connection, the model lands directly in the picker, right next to GPT-5.5 and Claude Opus 4.8. Side-by-side comparison without switching tools.

Mid-sized company with its own Ollama model for routine work. The IT lead runs Mistral Small and Llama on the company's own hardware and exposes them through an auth proxy with an HTTPS tunnel. For routine responses (email drafts, short summaries), the self-hosted model is enough, and its inference runs on the company's own contract rather than the plan usage limit. The cloud models remain in reserve for demanding tasks. BYOM makes this hybrid setup practical.

Three setups, three models, one login, one data processing agreement (DPA). Just like the standard picker, only one layer further out.

How secure is BYOM? DNS pinning against SSRF, encrypted credentials, fail-closed gate

BYOM brings external HTTPS endpoints into play that we do not control. Three security layers kick in before every connection test and before every inference call:

  1. DNS pinning against SSRF. Every base URL must be https://. Every outbound request goes through a custom dispatcher that resolves the hostname once via DNS in advance, checks the IP against a blocklist of private and internal ranges, and then connects pinned to the approved IP. Targets such as 127.0.0.1, RFC 1918, link-local, and CGNAT ranges, cloud metadata endpoints, and DNS-rebinding attempts against internal Corporate LLM infrastructure are hard-blocked.
  2. Encrypted credentials. API keys, bearer tokens, and base URLs are stored encrypted in the connection table, usable only by owners and admins, and never returned in the UI. Error reports to Sentry are limited to sanitized messages without PII, and no raw logs containing keys are sent.
  3. Fail-closed plan gate. BYOM is available on the Free plan and all paid tiers. Access is implemented as an explicit allow-list so that unknown or suspended account states do not accidentally gain access but remain fail-closed. If access is suspended or expired, existing BYOM models cannot be used until access is restored.
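
The first layer can be sketched as follows, assuming Python's standard ipaddress and socket modules. This is an illustration of the resolve-once, check, then pin idea and not the production dispatcher: the function names and the exact blocklist are assumptions, and the final step (opening the TLS connection to the pinned IP while still verifying the certificate against the original hostname) is left out.

```python
import ipaddress
import socket
from urllib.parse import urlsplit

# Ranges named in the text above, plus their IPv6 counterparts (an assumption).
BLOCKED_NETWORKS = [ipaddress.ip_network(cidr) for cidr in (
    "0.0.0.0/8",                                      # "this network"
    "127.0.0.0/8",                                    # loopback
    "10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16",  # RFC 1918
    "169.254.0.0/16",  # link-local, covers the 169.254.169.254 metadata address
    "100.64.0.0/10",   # CGNAT (RFC 6598)
    "::1/128", "fc00::/7", "fe80::/10",  # IPv6 loopback, unique local, link-local
)]


def is_blocked_ip(address: str) -> bool:
    """Return True if the address falls into a private or internal range."""
    ip = ipaddress.ip_address(address)
    # Unwrap IPv4-mapped IPv6 (::ffff:a.b.c.d) so it cannot bypass the IPv4 rules.
    if ip.version == 6 and ip.ipv4_mapped is not None:
        ip = ip.ipv4_mapped
    return any(ip in network for network in BLOCKED_NETWORKS)


def resolve_and_pin(base_url: str, resolver=socket.getaddrinfo) -> str:
    """Resolve the hostname once and return the IP that later requests must use."""
    parts = urlsplit(base_url)
    if parts.scheme != "https" or not parts.hostname:
        raise ValueError("base URL must be an https:// URL with a hostname")
    infos = resolver(parts.hostname, parts.port or 443, type=socket.SOCK_STREAM)
    addresses = [info[4][0] for info in infos]
    # Fail closed: a single blocked record is enough to reject the whole hostname.
    if not addresses or any(is_blocked_ip(addr) for addr in addresses):
        raise ValueError("hostname resolves to a blocked or unknown address")
    return addresses[0]
```

Because the connection is made to the returned IP and the hostname is not resolved a second time, a DNS record that changes between the check and the request (the rebinding case) has no effect on where the request goes.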

Is BYOM GDPR-compliant? Connecting self-hosted Ollama and your own endpoints

With BYOM, data protection responsibility shifts — it does not disappear. First, the important part: BYOM is not an air gap. Corporate LLM is a web app and calls your endpoint via a publicly reachable HTTPS URL; a local address such as http://localhost:11434 and private networks are rejected. Self-hosted Ollama therefore connects through an auth proxy plus an HTTPS tunnel. Model inference runs on your own hardware under your own contract, not in the standard model pool; Corporate LLM acts as a routing layer, keeps the credentials encrypted, and logs usage.
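
What "an auth proxy in front of Ollama" means can be illustrated with a small sketch using only the Python standard library. It is not an official setup guide: the token value, the port, and all names are placeholders, streaming responses are not handled, and the HTTPS tunnel or TLS-terminating reverse proxy that makes the proxy publicly reachable still has to be placed in front of it.

```python
import hmac
import urllib.error
import urllib.request
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer

UPSTREAM = "http://localhost:11434"  # local Ollama address named in the text
PROXY_TOKEN = "replace-with-a-long-random-secret"  # placeholder shared secret


def is_authorized(auth_header, expected_token):
    """Accept only an exact 'Bearer <token>' header (constant-time comparison)."""
    if not auth_header or not auth_header.startswith("Bearer "):
        return False
    supplied = auth_header[len("Bearer "):]
    return hmac.compare_digest(supplied.encode(), expected_token.encode())


class AuthProxy(BaseHTTPRequestHandler):
    def _forward(self):
        if not is_authorized(self.headers.get("Authorization"), PROXY_TOKEN):
            self.send_error(401, "missing or invalid bearer token")
            return
        length = int(self.headers.get("Content-Length") or 0)
        body = self.rfile.read(length) if length else None
        # The bearer token is checked here and deliberately not passed upstream.
        upstream_request = urllib.request.Request(
            UPSTREAM + self.path, data=body, method=self.command,
            headers={"Content-Type": self.headers.get("Content-Type", "application/json")},
        )
        try:
            with urllib.request.urlopen(upstream_request, timeout=120) as resp:
                payload, status = resp.read(), resp.status
                content_type = resp.headers.get("Content-Type", "application/json")
        except urllib.error.HTTPError as err:
            payload, status = err.read(), err.code
            content_type = err.headers.get("Content-Type", "application/json")
        except urllib.error.URLError:
            self.send_error(502, "upstream not reachable")
            return
        self.send_response(status)
        self.send_header("Content-Type", content_type)
        self.send_header("Content-Length", str(len(payload)))
        self.end_headers()
        self.wfile.write(payload)

    do_GET = _forward
    do_POST = _forward


def serve(port=8080):
    """Listen on localhost only; the HTTPS tunnel connects to this port."""
    ThreadingHTTPServer(("127.0.0.1", port), AuthProxy).serve_forever()
```

Running serve() starts the proxy. Requests without the correct bearer token get a 401 and never reach Ollama, which is why the base URL and token entered in Corporate LLM are those of the proxy, not of Ollama itself.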

The division of roles is set out in the data processing agreement (DPA): Corporate LLM is responsible for the routing layer and credential management; you are responsible for the BYOM endpoint, its hosting location, and its data flow. For regulated industries, this is the decisive lever. A law firm can support its duty of confidentiality under § 203 of the German Criminal Code (StGB) for client secrets, because the fine-tuned model runs under the firm's own control and the connection is orchestrated through Corporate LLM with SSRF protection.

For how Corporate LLM ensures GDPR compliance even with cloud models, read on here: Using Claude in a GDPR-compliant way.

Which plan includes BYOM? Free and all paid plans

BYOM is available starting today on the Free plan and all paid tiers: Business Starter, Business Pro, Business Max, and Enterprise. There is no separate token budget and no plan hurdle for the feature itself. You pay your BYOM provider directly for inference; Corporate LLM does not charge tokens on the BYOM connection, and the in-app budget limits do not apply there.

Access is implemented as an explicit allow-list: if team access is suspended or expired, existing BYOM models cannot be used until access is restored. All plans keep the standard picker with the four cloud providers and the open-source models as before.
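
The difference between an allow-list and a block-list is easiest to see in code. The following is a sketch under assumptions: the plan and state identifiers are invented for illustration, and only the pattern (grant access for known-good values, deny everything else) reflects the description above.

```python
from typing import Optional

# Hypothetical identifiers for the plans named above and for a healthy account state.
ALLOWED_PLANS = frozenset(
    {"free", "business_starter", "business_pro", "business_max", "enterprise"}
)
ALLOWED_STATES = frozenset({"active"})


def byom_allowed(plan: Optional[str], state: Optional[str]) -> bool:
    """Grant BYOM access only when both plan and state are explicitly listed.

    Anything unknown, missing, suspended, or expired falls through to False,
    which is what makes the gate fail-closed.
    """
    return plan in ALLOWED_PLANS and state in ALLOWED_STATES
```

A block-list that only checked for state == "suspended" would let a new or misspelled state through. With the allow-list, a state that nobody anticipated is denied by default.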

When a custom LLM is not worth it: BYOM vs. the standard picker

BYOM is a specialty feature, not the right fit for every SMB. Three cases where the standard picker is enough:

  • Standard use cases without an industry-specific edge. If you need GPT, Claude, and Gemini and have no fine-tuned use case, you will get further with the built-in models: faster, and without connection setup.
  • No internal engineering owner. BYOM means: someone maintains the endpoint connection. If the Ollama server crashes or OpenRouter keys rotate, you have to fix that yourself.
  • Pure cost optimization. If you only want to cut inference costs, the inexpensive EU models (Gemini 3 Flash-Lite, Mistral Small) will serve you better than the BYOM setup overhead.

Connect your own LLM to Corporate LLM: your setup starts today

Your own models. Your own contracts. One login. No second tool. And your data protection officer gets a clear division of roles instead of three separate contracts.

More background on the multi-model strategy in Corporate LLM:
GPT, Claude, Gemini, and Mistral in one AI platform.

To activate BYOM: in the Corporate LLM admin, go to Settings → Models → Custom Models. Requirement: you are an owner or admin of the team; available on the Free plan and all paid tiers.

Don't have a Corporate LLM account yet? Start for free on the Free plan, no payment details required. You can try BYOM directly on the Free plan and upgrade to a paid plan later.

Frequently asked questions

Which custom models can I connect to Corporate LLM?

Three adapter types: OpenAI-compatible endpoints (any provider that speaks the OpenAI API: vLLM, LiteLLM, Azure OpenAI, Mistral La Plateforme, etc.), Ollama endpoints (self-hosted open-source models such as Llama, Mistral, and Qwen, connected via a reachable HTTPS endpoint), and OpenRouter (experimental and specialty models not yet available in the standard picker).

Which plan includes BYOM?

BYOM is available on the Free plan and all paid tiers (Business Starter, Business Pro, Business Max, and Enterprise). No separate token budget is required: inference runs under your own provider contract, and in-app budget limits do not apply there. Access is implemented as an explicit allow-list so that unknown or suspended account states remain fail-closed.

Is BYOM GDPR-compliant if I connect a self-hosted model?

Model inference runs on your own hardware under your own contract, not in the standard Corporate LLM model pool. Important: BYOM is not an air gap. Corporate LLM is a web app and calls your endpoint via a publicly reachable HTTPS URL; local addresses such as http://localhost:11434 and private networks are rejected. Self-hosted Ollama therefore connects through an auth proxy plus an HTTPS tunnel. Corporate LLM acts as a routing layer, keeps the credentials encrypted, and writes audit logs; you remain responsible for the GDPR suitability of your endpoint. The data processing agreement (DPA) covers this division of roles.

How does Corporate LLM prevent a BYOM connection from being abused for SSRF attacks?

Every BYOM base URL must use https://. Outbound requests go through a dispatcher that resolves the hostname via DNS in advance, checks the resolved IP against a blocklist of private and internal ranges, and then connects pinned to the approved IP. This blocks DNS rebinding as well as connections to loopback (127.0.0.1), RFC 1918, link-local, CGNAT, and cloud metadata ranges. Credentials are stored encrypted, and error reports to Sentry are limited to sanitized messages without PII.

Can I see how many tokens my BYOM connection consumes?

Yes. Token consumption for the BYOM connection is tracked in a separate usage log and reported per connection in the stats area. Tokens are billed directly by your BYOM provider, not against the standard usage limit of your Corporate LLM plan. BYOM is an add-on, not a token bundle.
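
A per-connection report of this kind boils down to a simple aggregation. The sketch below assumes a hypothetical log format in which each entry carries a connection name plus the prompt_tokens and completion_tokens counts that OpenAI-compatible APIs return in their usage field. The actual schema of the Corporate LLM usage log is not documented here.

```python
from collections import defaultdict


def summarize_usage(usage_log):
    """Sum prompt and completion tokens per BYOM connection."""
    totals = defaultdict(lambda: {"prompt_tokens": 0, "completion_tokens": 0})
    for entry in usage_log:
        bucket = totals[entry["connection"]]
        bucket["prompt_tokens"] += entry["prompt_tokens"]
        bucket["completion_tokens"] += entry["completion_tokens"]
    return {
        name: {
            **counts,
            "total_tokens": counts["prompt_tokens"] + counts["completion_tokens"],
        }
        for name, counts in totals.items()
    }
```

The totals are informational only: they show what each connection consumed, while the bill for those tokens comes from your BYOM provider.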
