Architecture Overview#
This page describes how OptScale AI is structured—Admin UI and Chat UI surfaces, sidebar modules, the request path through policies and providers, and the organization access model. For product capabilities, glossary terms, supported vendors, and common use cases, see Introduction.
OptScale AI is built as two primary experiences that share the same organization, provider connections, and governance settings:
| Surface | Who uses it | Primary purpose |
|---|---|---|
| Admin UI | Platform admins, organization managers, team leads | Configure providers, keys, policies, MCP, vector stores; monitor cost and traces; manage users |
| Chat UI | Developers, analysts, business users | Run interactive AI conversations with provider selection, history, and multimodal input |
Both surfaces operate in the context of an organization (for example a tenant such as Sunflower corporation in the Chat header). Administrative changes in the Admin UI affect which providers, policies, and tools are available in Chat.
Platform architecture#
Admin UI#
The Admin UI is a management console organized around a left sidebar and a main work area. The default landing experience is Home → Dashboards, which surfaces FinOps and usage analytics.
Main navigation (top-level modules):
| Module | Role in the platform |
|---|---|
| Home / Dashboards | Cost, token usage, and provider-usage charts over time; personal (MY) vs shared (DEFAULT) dashboard tabs; actions such as REFRESH, PIN DASHBOARD, + CREATE DASHBOARD, and EDIT |
| AI Access | Organization-level access management, including user invitations, role assignments, virtual keys, team visibility, and allowed provider configuration |
| Providers | Providers and Routers tabs—connect vendors, register endpoints, and define intelligent routing |
| Policies + Guardrails | Request rules (timeouts, sampling, conditions) and reusable safety controls (for example PII handling) |
| MCP Servers | Model Context Protocol integrations for tools and external context |
| Vector Stores | Embeddings and retrieval backing RAG-style workflows |
| Usage and Cost | Detailed consumption and spend views beyond dashboard summaries |
| Traces | Per-request lifecycle logging for debugging and observability |
Model Training section (AIOps and environment):
| Module | Role in the platform |
|---|---|
| Tasks | Profiled training and pipeline runs grouped for comparison and access control |
| Models | Model registry with versions and aliases |
| Datasets | Dataset catalog, versions, and lineage |
| Artifacts | Run outputs stored with paths and run links |
| Shared Environments | Bookable ML dev/test environments |
| Cloud Connections | Linked cloud accounts (AWS, Azure, GCP, and Alibaba) |
| Power Schedules | Scheduled instance power on/off |
| Integrations | Google Calendar, GitLab, GitHub, and Jenkins |
See Model Training, Experiment Tracking, and Environments & Operations for screen-level detail.
System section (administration):
| Module | Role in the platform |
|---|---|
| User Management | Invite users, assign roles (Member vs Organization manager), view activity |
| Settings | Global or organization-level configuration options |
AI Chat#
The AI Chat is the primary end-user workspace. It is designed for everyday interaction with AI models and provides a streamlined interface for conversations, task execution, and access to integrated tools and data sources.
Layout:
- Left sidebar — Icon rail for switching areas, NEW CHAT, and a Chats history list grouped by time (for example Last month, March, February). Each conversation can be reopened or managed from a context menu.
- Top bar — Provider selector (for example
gpt-3-chat-latest), Organization selector, and user profile. - Main area — Welcome state (“Welcome to OptScale AI Chat!”) and guidance to start a conversation, generate images, or use audio.
- Message composer — Text input scoped to the selected provider, plus attachments, web access, voice input, and send.
Chat consumes the catalog and policies configured in the Admin UI; it does not replace provider setup or organization governance.
For screen-level detail, see Core Services, Model Training, and Interface Overview.
AI request flow#
At a high level, traffic moves through shared platform services:
- Ingress — A prompt or API payload enters with organization context and an authenticated principal.
- Policy and guardrail evaluation — Organization rules and attached guardrails run at the configured stage (for example input inspection before the provider call).
- Extensions — Optional MCP servers supply tools; vector stores supply retrieval context when the workflow requires it.
- Provider call — The request is sent to the vendor API base with stored credentials.
- Observability — Token usage, cost, and trace records are persisted for dashboards and troubleshooting.
Failures at any step surface in Traces and in provider or router health indicators under Providers.
External clients such as OpenCode and Cursor use the same OpenAI-compatible LLM proxy and employee virtual keys as Chat and API traffic. See Connect OpenCode and Connect Cursor.
AI Access principles#
Organization roles#
OptScale AI provides predefined organization roles to manage user permissions and responsibilities.
| Role | Access |
|---|---|
| Organization manager | Full administrative access across the organization. Can manage resources, invite users, and change configuration. |
| Member | Read-only access. Can view information but cannot manage resources. |
Notes and restrictions#
-
Only an Organization Manager can send invites.
-
Users cannot send invitations to themselves or assign multiple roles to the same user within one organization.
-
Existing members can be invited again to the same organization; this adds an additional role.
-
Invited users receive an email:
-
If already registered, they get a notification.
-
If not yet registered, they receive a signup link and obtain the assigned role after completing registration.
-
Providers and routers#
- Providers — Concrete endpoints (for example
openai/gpt-4o) with health checks, limits, and tags. - Routers — Named policies that map user intent (utterances, thresholds) to one of several providers, with a default provider and embedding provider for semantic matching.
See Core Services — Providers.
MCP in the platform#
MCP (Model Context Protocol) servers register external tools and data sources. Administrators configure transport, authentication, health, and team access so Chat or API workloads can invoke approved tools during a session.
Policies and guardrails#
Policies define when and how traffic is evaluated; guardrails are reusable safety controls you link to a policy so they run when that policy matches.
- Policies — Conditional rules (often with CEL previews) controlling when and how requests are evaluated: stages, sampling, timeouts, request types.
- Guardrails — Reusable controls linked from policies at the configured evaluation stage.
Configuration lives under Policies + Guardrails in the Admin UI; enforcement applies to Chat and API traffic according to scope settings. For list and detail page layout, see Core Services — Policies + Guardrails. For common use cases, limitations, and an end-to-end example, see the referenced sections.
Guardrail types#
When you add a guardrail, the Type dropdown lists the available engines. Pick the control that matches the risk you want to evaluate at the configured Stage (Input, Output).
| Type | Purpose | How threshold is used |
|---|---|---|
| PII detection and redaction | Detects personally identifiable information (PII) and applies the configured Policy action (for example, Redact), optionally scoped to selected PII fields and Custom patterns. | May use confidence scores for entity detection before applying actions like redact or block. |
| Secrets | Detects secrets and credentials (for example, API keys, tokens, or passwords) that should not appear in prompts or model output. | May use confidence scores for entity detection before applying actions like redact or block. |
| Ban topics | Flags or blocks content that matches administrator-defined prohibited topics. | Triggers when content is classified as matching a prohibited topic above the threshold. |
| Jailbreak | Detects attempts to bypass model safety instructions, role constraints, or usage boundaries. | Triggers when the request appears likely to bypass safety or system instructions. |
| Prompt injection | Detects instructions embedded in user or external content that attempt to override system or developer prompts. | Triggers when suspicious instruction-overriding patterns exceed the threshold. |
| Toxicity | Detects harmful, abusive, hateful, or otherwise offensive language in evaluated content. | Triggers when toxicity confidence exceeds the threshold. |
| Invisible text | Detects hidden or non-visible characters and Unicode obfuscation techniques used to bypass filters. | Uses detection confidence where the engine produces a match score. |
| Token limit | Enforces limits on the number of tokens in the evaluated request or response payload. | Not score-based; enforces the configured token limit directly. |
| Code injection | Detects executable or malicious code patterns in text (for example, script or command injection in prompts or responses). | Uses detection confidence where the engine produces a match score. |
| Gibberish | Detects nonsensical or low-quality text that may indicate abuse, probing attempts, or corrupted input. | Triggers when text quality or confidence falls below or above configured limits, depending on implementation. |
| Sentiment | Evaluates emotional tone or sentiment in content against configured thresholds. | Uses threshold ranges to classify emotional tone intensity. |
Type-specific fields (for example PII fields, topic lists, or token limits) appear in the guardrail form after you select a Type. See Guardrail thresholds for how to choose Threshold values. Link the guardrail to a policy to put it into effect—see First Steps — Add policies.
Guardrail thresholds#
Thresholds define how confident or how strong a match must be before a guardrail triggers its configured action.
The exact behavior depends on the guardrail type, but the general principle is the same:
Higher threshold = stricter confidence requirement
Lower threshold = more sensitive detection
AI guardrails (except Sentiment) produce a score between 0 and 1:
0= no match detected1= very strong match
The threshold defines the minimum score required to trigger the guardrail.
Example:
Threshold = 0.75
Detected score = 0.82
→ Guardrail triggers
Threshold = 0.75
Detected score = 0.60
→ Guardrail does not trigger
For the Sentiment guardrail, the threshold represents a sentiment score on a scale from negative to positive.
Unlike toxicity or jailbreak detection, which typically use 0 → 1 confidence scores, sentiment analysis operates on a negative ↔ positive scale:
- negative values = negative sentiment
- positive values = positive sentiment
- values near
0= neutral sentiment
Typical sentiment range:
-1.0 = very negative
0.0 = neutral
+1.0 = very positive
The guardrail evaluates the emotional tone of the content and assigns a sentiment score. Lower threshold values (closer to -1) make detection more strict and selective.
How each Type uses Threshold is listed in the Guardrail types table (How threshold is used column).
Threshold configuration recommendations
-
Lower thresholds
Example:
0.3 – 0.5- More aggressive detection
- More sensitive
- Higher chance of false positives
Useful when:
- strict security is required
- missing unsafe content is unacceptable
-
Higher thresholds
Example:
0.8 – 0.95- More conservative detection
- Fewer false positives
- Higher chance of missing borderline cases
Useful when:
- user experience is important
- overly aggressive blocking should be avoided
Recommended approach
Start with moderate default values (for example 0.7 – 0.8) and adjust thresholds based on:
- observed false positives,
- missed detections,
- latency and operational impact,
- organizational security requirements.
It is recommended to test thresholds in monitoring or logging mode before enforcing blocking actions in production.
Data and retrieval#
Vector stores hold embeddings and search configuration for retrieval-augmented generation. They complement provider calls when assistants need organization knowledge beyond what the upstream LLM was trained on.
Tracing and observability#
Traces capture the request lifecycle for debugging. Usage and Cost plus Dashboards aggregate tokens, spend, and per-provider activity so teams can optimize performance and budget. For screen-level detail on summary cards, filters, and the request trace table, see Core Services — Traces. For profiled training runs and experiment outputs, see Experiment Tracking — Tasks and Artifacts.
Administrators use these views after setup to validate that Chat and API clients behave as expected; see First Steps for the initial configuration path. For AIOps workflows, see Model Training.
Related documentation#
| Topic | Where to read more |
|---|---|
| Admin UI screen detail | Core Services, Model Training, Experiment Tracking, Environments & Operations |
| Chat UI | Interface Overview |
| External tools (OpenCode, Cursor) | External Tools, Connect OpenCode, Connect Cursor |
| Initial setup | First Steps |
| Product introduction | Introduction |