Introduction#
What is OptScale AI#
OptScale AI is a unified AI operations platform—AI Gateway, FinOps, profiling, and observability in one place—for teams that manage LLM access, monitor token spend, enforce governance, and run model training workflows. Administrators configure providers, policies, MCP servers, and virtual keys in the Admin UI; practitioners use Chat, APIs, and external tools such as OpenCode and Cursor for day-to-day conversations and automation. IDE and agent clients use the same OpenAI-compatible LLM proxy and organization virtual keys as Chat, so traffic stays under the same policies, guardrails, and observability. Dashboards, traces, and usage views cover both inference traffic and profiled training runs.
For platform structure, navigation modules, and the governed request path through policies and providers, see Architecture Overview. To configure your first organization and provider, see First Steps; for Admin UI and Model Training screen detail, see Core Services and Model Training. To connect OpenAI-compatible clients, copy connection values from Chat and follow External Tools.
Key capabilities#
-
AI Gateway — Connect external AI vendors, expose a unified provider catalog, and route traffic with load balancing, fallbacks, and routing strategies.
-
FinOps and usage analytics — Track token consumption, provider costs, and provider usage from dashboards; compare spend across providers and teams.
-
Governance — Apply organization-wide policies (timeouts, sampling, request filtering) and reusable guardrails for safety, compliance, and prompt restrictions.
-
Virtual keys — Issue scoped API keys for applications and teams, with usage limits, isolation, and rotation.
-
MCP integrations — Register Model Context Protocol servers, manage transport and authentication, and assign access by team.
-
Observability — Inspect request traces, analyze errors, and debug multi-step AI workflows.
-
Model Training — Register models and datasets, track tasks and artifacts, book shared environments, and connect cloud accounts and CI integrations—see Model Training.
-
Chat — Interactive workspace for multi-provider conversations, attachments, voice, web access, and chat history.
-
External tools — Connect OpenCode, Cursor, and other OpenAI-compatible clients to the governed LLM proxy with virtual keys from AI Access; copy Base URL, API key, and Model name from Chat so IDE and agent traffic uses the same providers, policies, and Traces as the rest of the organization—see External Tools.
Core concepts#
| Concept | Description |
|---|---|
| Organization | Top-level tenant that owns providers, policies, teams, usage data, and billing context. |
| Team | Group within an organization used for access control and resource isolation. |
| Provider | A registered AI endpoint (vendor integration plus callable identity such as openai/gpt-4o), including credentials, health status, limits, and use in routing and Chat. |
| Router | Configuration that directs requests to providers using routing strategies, load balancing, and fallback rules. |
| Virtual key | API credential scoped to an organization or team, used by apps and integrations instead of raw provider keys. |
| Policy | Operational rule applied to AI requests (filtering, timeouts, sampling, enforcement stage). |
| Guardrail | Reusable safety or compliance control linked to policies. See Guardrail types and Guardrail thresholds for available engines and tuning guidance. |
| MCP server | External tool or data source exposed through the Model Context Protocol. |
| Trace | Record of a single AI request lifecycle for debugging and observability. |
Supported AI providers#
OptScale AI integrates with multiple commercial and self-hosted vendor families. Supported integrations include:
- OpenAI — GPT family endpoints
- Anthropic — Claude endpoints
- Google — Gemini and related Google AI endpoints
- Cohere
- Mistral
- Ollama — Self-hosted open-source endpoints
The exact providers available in your deployment depend on which vendor connections you register. Administrators add and enable providers on the Providers page; end users then select from the enabled catalog in Chat or via API.
Main use cases#
Centralize AI access for the organization
Replace scattered vendor API keys with virtual keys, unified routing, and a single provider catalog so teams share governed access instead of ad hoc credentials.
Control cost and usage
Monitor token volume and spend by team and provider; use dashboards and usage views to find inefficiencies and enforce budgets.
Enforce governance and compliance
Apply policies and guardrails so requests respect timeouts, content rules, and safety requirements before they reach external providers.
Operate multi-provider AI workloads
Route traffic across providers with fallbacks and load balancing; compare provider performance and cost without rewriting client applications.
Extend assistants with approved tools
Connect MCP servers so assistants can use organization-approved tools and external context.
Debug and improve AI workflows
Use traces and request lifecycle views to troubleshoot failures, latency, and unexpected provider behavior.
Collaborate through Chat
Give practitioners a shared chat workspace with history, provider switching, attachments, and organization-scoped access.
Connect OpenCode and Cursor to OptScale AI
Route IDE and agent clients through the governed LLM proxy with virtual keys—see Connect OpenCode and Connect Cursor.
Migrate existing AI conversations
Export history from ChatGPT, Claude, or Gemini and import it into OptScale AI Chat so teams keep context under organization governance—see Use Cases — Migrate chat history.
Track ML experiments and artifacts
Version datasets, models, tasks, and experiment outputs, and manage shared environments and cloud connections—see Model Training.