Inference Control Plane

Decide where every AI request
should run

Software that runs in your cloud. Routes AI workloads across your self-hosted and hosted models. Your data never leaves your environment.

Try the routing simulator See how it works

Runs in your VPCAWS / Azure / GCP MarketplaceZero data exfiltrationYour keys, your compliance

What Makes This Different

Not a gateway. A decision engine.

Four capabilities that commodity routing tools don't have.

Normalized Cost Engine

Other tools compare per-token sticker prices.

We compute cost-to-successful-completion across your self-hosted GPUs and your hosted API accounts. That means idle overhead, queue delays, retry probability, power costs, and amortization — all normalized into one comparable number. Your backends, our math.

Policy-Constrained Routing

Other tools treat policy as a scoring weight.

We enforce it as a hard gate. If a request violates a data residency, sensitivity, or compliance rule, that backend is excluded before scoring even begins. Cost never overrides policy.

Hybrid Staged Execution

Other tools route the whole request to one backend.

We split it across your backends. Run PII scanning or document retrieval on your local models, then send only sanitized snippets to your hosted API for synthesis. Raw data never leaves your environment. No other routing tool does this.

Ranked Fallback with Cost-to-Complete

Other tools retry the next backend in a static list.

We rank fallbacks dynamically using SLA fitness, real-time health, and the remaining cost to deliver a successful response. If the primary fails at 80% completion, the fallback accounts for that sunk cost.

Available on

AWS Marketplace

Install directly into your EKS cluster. Counts toward your AWS EDP commitment.

View on AWS Marketplace

Interactive Demo

See the decision engine in action

Choose a scenario and watch ICP evaluate backends, apply policies, estimate costs, and explain its routing decision.

Select a scenario

Decision trace

Select a scenario and click “Route this request”

For Startups

Ship AI features fast without burning through your runway

Cut Inference Costs Immediately

Route simple queries to open-source models on your existing GPU. Complex queries auto-escalate to your hosted API accounts. VIP tiers get priority routing. You control every backend and every dollar.

One Endpoint, All Providers

OpenAI-compatible proxy works with LangChain, your custom code, and any framework. No vendor lock-in.

Budget Guardrails

Set daily spend caps. When limits are hit, traffic shifts to local models automatically. No surprise invoices.

For Enterprise

Deploy with confidence, protect your data, and build without limits

Your Cloud. Your Data. Always.

ICP runs entirely inside your VPC. Your prompts never leave your environment. Your provider credentials stay in your K8s secrets. Zero data exfiltration — verifiable by network policy.

SOC 2 & GDPR Ready

Access logging, audit trails, data export/purge APIs, and payload redaction built in. Every routing decision is fully explainable and auditable. Compliance controls, not afterthoughts.

Predictable Pricing

Free tier for evaluation. Usage-based after that — no per-seat fees, no hidden costs. Scale teams without scaling bills.

Pricing

Free to start. Pay as you scale.

Software license only — you pay your cloud providers directly for inference. No per-seat fees. Available on all three cloud marketplaces.

Free

$0/ forever

Full routing engine. Connect your own backends. No credit card.

Install free

1,000 routed requests / day
All backend adapters (Bedrock, Azure, Vertex, self-hosted)
Policy engine & cost normalization
Hybrid staged execution
Decision traces & explainability
OpenAI-compatible proxy
Runs entirely in your cloud

Pro

$99/ month

Uncapped routing + deploy models to your cluster. Usage-based after included requests.

Start routing

10,000 requests included / month
$1.00 per 1,000 requests after that
Everything in Free, uncapped
ICP Agent — deploy models to your K8s
Model catalog (Llama, Mistral, Qwen, Phi)
Autoscaling & health monitoring
Budget controls & spend caps
Priority support

Enterprise

Custom/ annual contract

Same capabilities as Pro with volume pricing, dedicated support, and custom SLAs. Counts toward your cloud EDP commitment.

Contact sales

Everything in Pro
Volume discounts on usage
Dedicated support engineer
Custom SLAs
Deployment approval workflows
SOC 2 & GDPR compliance features
SSO / Entra ID integration
Marketplace private offers

Frequently Asked Questions

Answers to common questions about ICP and launching on your preferred cloud.

Start routing smarter today

Install in your cloud in minutes. Connect your backends. See cost savings on your first request. Your data never leaves your environment.

Try the routing simulator View pricing

Decide where every AI request should run