Architecture Overview#

This page describes how OptScale AI is structured—Admin UI and Chat UI surfaces, sidebar modules, the request path through policies and providers, and the organization access model. For product capabilities, glossary terms, supported vendors, and common use cases, see Introduction.

OptScale AI is built as two primary experiences that share the same organization, provider connections, and governance settings:

Table 1: OptScale AI platform surfaces — Admin UI and Chat UI
Surface	Who uses it	Primary purpose
Admin UI	Platform admins, organization managers, team leads	Configure providers, keys, policies, and MCP; monitor cost and traces; manage users
Chat UI	Developers, analysts, business users	Run interactive AI conversations with provider selection, history, and multimodal input

Both surfaces operate in the context of an organization (for example a tenant such as Sunflower corporation in the Chat header). Administrative changes in the Admin UI affect which providers, policies, and tools are available in Chat.

Platform architecture#

Admin UI#

The Admin UI is a management console organized around a left sidebar and a main work area. The default landing experience is Home → Dashboards, which surfaces FinOps and usage analytics.

Main navigation (top-level modules):

Table 2: OptScale AI Admin UI — main navigation modules
Module	Role in the platform
Home / Dashboards	Cost, token usage, and provider-usage charts over time; personal (MY) vs shared (DEFAULT) dashboard tabs; actions such as REFRESH, PIN DASHBOARD, + CREATE DASHBOARD, and EDIT
AI Access	Organization-level access management, including user invitations, role assignments, virtual keys, team visibility, and allowed provider configuration
Providers	Providers and Routers tabs—connect vendors, register endpoints, and define intelligent routing
Policies & Guardrails	Request rules (timeouts, sampling, conditions) and reusable safety controls (for example PII handling)
MCP Servers	Model Context Protocol integrations for tools and external context
Optimizations	Analyze AI cost savings by comparing projected and actual spend, with savings broken down by optimization technique (cache read, prompt compression, memory retrieval), model, and user.

Analytics section (FinOps and observability):

Table 3: OptScale AI Admin UI — Analytics section modules
Module	Role in the platform
Usage and Cost	Detailed consumption and spend views beyond dashboard summaries
Traces	Per-request lifecycle logging for debugging and observability

See Core Services — Usage and Cost and Traces for screen-level detail.

Model Training section (AIOps and environment):

Table 4: OptScale AI Admin UI — Model Training section
Module	Role in the platform
Tasks	Profiled training and pipeline runs grouped for comparison and access control
Models	Model registry with versions and aliases
Datasets	Dataset catalog, versions, and lineage
Artifacts	Run outputs stored with paths and run links
Shared Environments	Bookable ML dev/test environments
Cloud Connections	Linked cloud accounts (AWS, Azure, GCP, and Alibaba)
Power Schedules	Scheduled instance power on/off
Integrations	Google Calendar, GitLab, GitHub, and Jenkins

See Model Training, Experiment Tracking, and Environments & Operations for screen-level detail.

System section (administration):

Table 5: OptScale AI Admin UI — System section modules
Module	Role in the platform
User Management	Invite users, assign roles (Member vs Organization manager), view activity
Settings	Global or organization-level configuration options

Chat#

Chat is the primary end-user workspace for interacting with AI models. It provides a unified interface for conversations, task execution, file and media handling, and access to approved tools and data sources. Chat operates within the governance framework defined in the Admin UI, including provider access, routing rules, policies, guardrails, and usage controls.

The Chat interface provides navigation between chat areas, access to projects and settings, the ability to connect external tools, and access to conversation history. Select an Organization, Provider, and Model to scope available resources and start interacting with AI models. Use NEW CHAT to start a new conversation, then enter prompts using the message composer, which also provides controls for attachments, web access, voice input, and message submission.

Chat consumes the provider catalog, access controls, routing configuration, and governance policies defined in the Admin UI. It serves as the user-facing interface for these services and does not replace administrative configuration or organization governance.

For related Admin UI and Chat documentation, see Core Services, Model Training, and Interface Overview.

AI request flow#

At a high level, traffic moves through shared platform services:

Ingress — A prompt or API payload enters with organization context and an authenticated principal.
Policy and guardrail evaluation — Organization rules and attached guardrails run at the configured stage.
Extensions — Optional MCP servers supply approved tools when the workflow requires them.
Context compression (when enabled) — The payload is optimized before the provider call; see Compression workflow.
Provider call — The request is sent to the vendor API base with stored credentials.
Observability — Token usage, cost, and trace records are persisted for dashboards and troubleshooting.

Failures at any step surface in Traces and in provider or router health indicators under Providers.

External clients such as OpenCode and Cursor use the same OpenAI-compatible LLM proxy and employee virtual keys as Chat and API traffic. See Connect OpenCode and Connect Cursor. For migrating existing SDK or proxy integrations, see Switch to OptScale AI Gateway.

AI Access principles#

Organization roles#

OptScale AI provides predefined organization roles to manage user permissions and responsibilities.

Table 6: OptScale AI organization roles and access levels
Role	Access
Organization manager	Full administrative access across the organization. Can manage resources, invite users, and change configuration.
Member	Access to Chat and its available features only. Cannot access the Admin UI or manage organizational resources.

Notes and restrictions#

Only an Organization Manager can send invites.
Users cannot send invitations to themselves or assign multiple roles to the same user within one organization.
Existing members can be invited again to the same organization; this adds an additional role.
Invited users receive an email:
- If already registered, they get a notification.
- If not yet registered, they receive a signup link and obtain the assigned role after completing registration.

Providers and routers#

Providers — Concrete endpoints (for example openai/gpt-4o) with health checks, limits, and tags.
Routers — Named policies that map user intent (utterances, thresholds) to one of several providers, with a default provider and embedding provider for semantic matching.
Providers — AI model endpoints (for example, openai/gpt-4o) configured in OptScale AI. Providers include connection settings, health checks, limits, and tags.
Routers — Routing configurations that select a provider based on user requests, matching rules, or thresholds. Each router includes a provider used by routing rule and an fallback provider used when a target cannot process a request.

See Core Services — Providers.

MCP in the platform#

MCP (Model Context Protocol) servers register external tools and data sources. Administrators configure transport, authentication, health, and team access so Chat or API workloads can invoke approved tools during a session.

Policies and guardrails#

Policies define when and how traffic is evaluated; guardrails are reusable safety controls you link to a policy so they run when that policy matches.

Policies — Conditional rules (often with CEL previews) controlling when and how requests are evaluated: stages, sampling, timeouts, request types.
Guardrails — Reusable controls linked from policies at the configured evaluation stage.

Configuration lives under Policies & Guardrails in the Admin UI; enforcement applies to Chat and API traffic according to scope settings. For list and detail page layout, see Core Services — Policies & Guardrails. For common use cases, limitations, and an end-to-end example, see the referenced sections.

Guardrail types#

When you add a guardrail, the Type dropdown lists the available engines. Pick the control that matches the risk you want to evaluate at the configured Stage (Input, Output).

Table 7: OptScale AI guardrail types
Type	Purpose	How threshold is used
PII detection and redaction	Detects personally identifiable information (PII) and applies the configured Policy action (for example, Redact), optionally scoped to selected PII fields and Custom patterns.	May use confidence scores for entity detection before applying actions like redact or block.
Secrets	Detects secrets and credentials (for example, API keys, tokens, or passwords) that should not appear in prompts or model output.	May use confidence scores for entity detection before applying actions like redact or block.
Ban topics	Flags or blocks content that matches administrator-defined prohibited topics.	Triggers when content is classified as matching a prohibited topic above the threshold.
Jailbreak	Detects attempts to bypass model safety instructions, role constraints, or usage boundaries.	Triggers when the request appears likely to bypass safety or system instructions.
Prompt injection	Detects instructions embedded in user or external content that attempt to override system or developer prompts.	Triggers when suspicious instruction-overriding patterns exceed the threshold.
Toxicity	Detects harmful, abusive, hateful, or otherwise offensive language in evaluated content.	Triggers when toxicity confidence exceeds the threshold.
Invisible text	Detects hidden or non-visible characters and Unicode obfuscation techniques used to bypass filters.	Uses detection confidence where the engine produces a match score.
Token limit	Enforces limits on the number of tokens in the evaluated request or response payload.	Not score-based; enforces the configured token limit directly.
Code injection	Detects executable or malicious code patterns in text (for example, script or command injection in prompts or responses).	Uses detection confidence where the engine produces a match score.
Gibberish	Detects nonsensical or low-quality text that may indicate abuse, probing attempts, or corrupted input.	Triggers when text quality or confidence falls below or above configured limits, depending on implementation.
Sentiment	Evaluates emotional tone or sentiment in content against configured thresholds.	Uses threshold ranges to classify emotional tone intensity.

Type-specific fields (for example PII fields, topic lists, or token limits) appear in the guardrail form after you select a Type. See Guardrail thresholds for how to choose Threshold values. Link the guardrail to a policy to put it into effect—see First Steps — Add policies.

Guardrail thresholds#

Thresholds define how confident or how strong a match must be before a guardrail triggers its configured action.

The exact behavior depends on the guardrail type, but the general principle is the same:

Higher threshold = stricter confidence requirement
Lower threshold = more sensitive detection

AI guardrails (except Sentiment) produce a score between 0 and 1:

0 = no match detected
1 = very strong match

The threshold defines the minimum score required to trigger the guardrail.

Example:

Threshold = 0.75
Detected score = 0.82
→ Guardrail triggers

Threshold = 0.75
Detected score = 0.60
→ Guardrail does not trigger

For the Sentiment guardrail, the threshold represents a sentiment score on a scale from negative to positive.

Unlike toxicity or jailbreak detection, which typically use 0 → 1 confidence scores, sentiment analysis operates on a negative ↔ positive scale:

negative values = negative sentiment
positive values = positive sentiment
values near 0 = neutral sentiment

Typical sentiment range:

-1.0 = very negative
 0.0 = neutral
+1.0 = very positive

The guardrail evaluates the emotional tone of the content and assigns a sentiment score. Lower threshold values (closer to -1) make detection more strict and selective.

How each Type uses Threshold is listed in the Guardrail types table (How threshold is used column).

Threshold configuration recommendations

Lower thresholds

Example: 0.3 – 0.5
- More aggressive detection
- More sensitive
- Higher chance of false positives
Useful when:
- strict security is required
- missing unsafe content is unacceptable
Higher thresholds

Example: 0.8 – 0.95
- More conservative detection
- Fewer false positives
- Higher chance of missing borderline cases
Useful when:
- user experience is important
- overly aggressive blocking should be avoided

Recommended approach

Start with moderate default values (for example 0.7 – 0.8) and adjust thresholds based on:

observed false positives,
missed detections,
latency and operational impact,
organizational security requirements.

It is recommended to test thresholds in monitoring or logging mode before enforcing blocking actions in production.

Tracing and observability#

Traces capture the request lifecycle for debugging. Usage and Cost plus Dashboards aggregate tokens, spend, and per-provider activity so teams can optimize performance and budget. Optimizations measures savings from cache read, prompt compression, and memory retrieval—see Use Cases — Optimizations and Core Services — Optimizations. For dashboard scenarios, see Use Cases — Dashboards. For screen-level detail on summary cards, filters, and the request trace table, see Core Services — Traces. For profiled training runs and experiment outputs, see Experiment Tracking — Tasks and Artifacts.

Administrators use these views after setup to validate that Chat and API clients behave as expected; see First Steps for the initial configuration path. For AIOps workflows, see Model Training.