Skip to content

Architecture Overview#

This page describes how OptScale AI is structured—Admin UI and Chat UI surfaces, sidebar modules, the request path through policies and providers, and the organization access model. For product capabilities, glossary terms, supported vendors, and common use cases, see Introduction.

OptScale AI is built as two primary experiences that share the same organization, provider connections, and governance settings:

Table 1: OptScale AI platform surfaces — Admin UI and Chat UI
Surface Who uses it Primary purpose
Admin UI Platform admins, organization managers, team leads Configure providers, keys, policies, MCP, vector stores; monitor cost and traces; manage users
Chat UI Developers, analysts, business users Run interactive AI conversations with provider selection, history, and multimodal input

Both surfaces operate in the context of an organization (for example a tenant such as Sunflower corporation in the Chat header). Administrative changes in the Admin UI affect which providers, policies, and tools are available in Chat.

Platform architecture#

Admin UI#

The Admin UI is a management console organized around a left sidebar and a main work area. The default landing experience is Home → Dashboards, which surfaces FinOps and usage analytics.

Main navigation (top-level modules):

Table 2: OptScale AI Admin UI — main navigation modules
Module Role in the platform
Home / Dashboards Cost, token usage, and provider-usage charts over time; personal (MY) vs shared (DEFAULT) dashboard tabs; actions such as REFRESH, PIN DASHBOARD, + CREATE DASHBOARD, and EDIT
AI Access Organization-level access management, including user invitations, role assignments, virtual keys, team visibility, and allowed provider configuration
Providers Providers and Routers tabs—connect vendors, register endpoints, and define intelligent routing
Policies + Guardrails Request rules (timeouts, sampling, conditions) and reusable safety controls (for example PII handling)
MCP Servers Model Context Protocol integrations for tools and external context
Vector Stores Embeddings and retrieval backing RAG-style workflows
Usage and Cost Detailed consumption and spend views beyond dashboard summaries
Traces Per-request lifecycle logging for debugging and observability

Model Training section (AIOps and environment):

Table 3: OptScale AI Admin UI — Model Training section
Module Role in the platform
Tasks Profiled training and pipeline runs grouped for comparison and access control
Models Model registry with versions and aliases
Datasets Dataset catalog, versions, and lineage
Artifacts Run outputs stored with paths and run links
Shared Environments Bookable ML dev/test environments
Cloud Connections Linked cloud accounts (AWS, Azure, GCP, and Alibaba)
Power Schedules Scheduled instance power on/off
Integrations Google Calendar, GitLab, GitHub, and Jenkins

See Model Training, Experiment Tracking, and Environments & Operations for screen-level detail.

System section (administration):

Table 4: OptScale AI Admin UI — System section modules
Module Role in the platform
User Management Invite users, assign roles (Member vs Organization manager), view activity
Settings Global or organization-level configuration options

AI Chat#

The AI Chat is the primary end-user workspace. It is designed for everyday interaction with AI models and provides a streamlined interface for conversations, task execution, and access to integrated tools and data sources.

Layout:

  • Left sidebar — Icon rail for switching areas, NEW CHAT, and a Chats history list grouped by time (for example Last month, March, February). Each conversation can be reopened or managed from a context menu.
  • Top barProvider selector (for example gpt-3-chat-latest), Organization selector, and user profile.
  • Main area — Welcome state (“Welcome to OptScale AI Chat!”) and guidance to start a conversation, generate images, or use audio.
  • Message composer — Text input scoped to the selected provider, plus attachments, web access, voice input, and send.

Chat consumes the catalog and policies configured in the Admin UI; it does not replace provider setup or organization governance.

For screen-level detail, see Core Services, Model Training, and Interface Overview.

AI request flow#

At a high level, traffic moves through shared platform services:

  1. Ingress — A prompt or API payload enters with organization context and an authenticated principal.
  2. Policy and guardrail evaluation — Organization rules and attached guardrails run at the configured stage (for example input inspection before the provider call).
  3. Extensions — Optional MCP servers supply tools; vector stores supply retrieval context when the workflow requires it.
  4. Provider call — The request is sent to the vendor API base with stored credentials.
  5. Observability — Token usage, cost, and trace records are persisted for dashboards and troubleshooting.

Failures at any step surface in Traces and in provider or router health indicators under Providers.

External clients such as OpenCode and Cursor use the same OpenAI-compatible LLM proxy and employee virtual keys as Chat and API traffic. See Connect OpenCode and Connect Cursor.

AI Access principles#

Organization roles#

OptScale AI provides predefined organization roles to manage user permissions and responsibilities.

Table 5: OptScale AI organization roles and access levels
Role Access
Organization manager Full administrative access across the organization. Can manage resources, invite users, and change configuration.
Member Read-only access. Can view information but cannot manage resources.

Notes and restrictions#

  • Only an Organization Manager can send invites.

  • Users cannot send invitations to themselves or assign multiple roles to the same user within one organization.

  • Existing members can be invited again to the same organization; this adds an additional role.

  • Invited users receive an email:

    • If already registered, they get a notification.

    • If not yet registered, they receive a signup link and obtain the assigned role after completing registration.

Providers and routers#

  • Providers — Concrete endpoints (for example openai/gpt-4o) with health checks, limits, and tags.
  • Routers — Named policies that map user intent (utterances, thresholds) to one of several providers, with a default provider and embedding provider for semantic matching.

See Core Services — Providers.

MCP in the platform#

MCP (Model Context Protocol) servers register external tools and data sources. Administrators configure transport, authentication, health, and team access so Chat or API workloads can invoke approved tools during a session.

Policies and guardrails#

Policies define when and how traffic is evaluated; guardrails are reusable safety controls you link to a policy so they run when that policy matches.

  • Policies — Conditional rules (often with CEL previews) controlling when and how requests are evaluated: stages, sampling, timeouts, request types.
  • Guardrails — Reusable controls linked from policies at the configured evaluation stage.

Configuration lives under Policies + Guardrails in the Admin UI; enforcement applies to Chat and API traffic according to scope settings. For list and detail page layout, see Core Services — Policies + Guardrails. For common use cases, limitations, and an end-to-end example, see the referenced sections.

Guardrail types#

When you add a guardrail, the Type dropdown lists the available engines. Pick the control that matches the risk you want to evaluate at the configured Stage (Input, Output).

Table 6: OptScale AI guardrail types
Type Purpose How threshold is used
PII detection and redaction Detects personally identifiable information (PII) and applies the configured Policy action (for example, Redact), optionally scoped to selected PII fields and Custom patterns. May use confidence scores for entity detection before applying actions like redact or block.
Secrets Detects secrets and credentials (for example, API keys, tokens, or passwords) that should not appear in prompts or model output. May use confidence scores for entity detection before applying actions like redact or block.
Ban topics Flags or blocks content that matches administrator-defined prohibited topics. Triggers when content is classified as matching a prohibited topic above the threshold.
Jailbreak Detects attempts to bypass model safety instructions, role constraints, or usage boundaries. Triggers when the request appears likely to bypass safety or system instructions.
Prompt injection Detects instructions embedded in user or external content that attempt to override system or developer prompts. Triggers when suspicious instruction-overriding patterns exceed the threshold.
Toxicity Detects harmful, abusive, hateful, or otherwise offensive language in evaluated content. Triggers when toxicity confidence exceeds the threshold.
Invisible text Detects hidden or non-visible characters and Unicode obfuscation techniques used to bypass filters. Uses detection confidence where the engine produces a match score.
Token limit Enforces limits on the number of tokens in the evaluated request or response payload. Not score-based; enforces the configured token limit directly.
Code injection Detects executable or malicious code patterns in text (for example, script or command injection in prompts or responses). Uses detection confidence where the engine produces a match score.
Gibberish Detects nonsensical or low-quality text that may indicate abuse, probing attempts, or corrupted input. Triggers when text quality or confidence falls below or above configured limits, depending on implementation.
Sentiment Evaluates emotional tone or sentiment in content against configured thresholds. Uses threshold ranges to classify emotional tone intensity.

Type-specific fields (for example PII fields, topic lists, or token limits) appear in the guardrail form after you select a Type. See Guardrail thresholds for how to choose Threshold values. Link the guardrail to a policy to put it into effect—see First Steps — Add policies.

Guardrail thresholds#

Thresholds define how confident or how strong a match must be before a guardrail triggers its configured action.

The exact behavior depends on the guardrail type, but the general principle is the same:

Higher threshold = stricter confidence requirement
Lower threshold = more sensitive detection

AI guardrails (except Sentiment) produce a score between 0 and 1:

  • 0 = no match detected
  • 1 = very strong match

The threshold defines the minimum score required to trigger the guardrail.

Example:

Threshold = 0.75
Detected score = 0.82
→ Guardrail triggers
Threshold = 0.75
Detected score = 0.60
→ Guardrail does not trigger

For the Sentiment guardrail, the threshold represents a sentiment score on a scale from negative to positive.

Unlike toxicity or jailbreak detection, which typically use 0 → 1 confidence scores, sentiment analysis operates on a negative ↔ positive scale:

  • negative values = negative sentiment
  • positive values = positive sentiment
  • values near 0 = neutral sentiment

Typical sentiment range:

-1.0 = very negative
 0.0 = neutral
+1.0 = very positive

The guardrail evaluates the emotional tone of the content and assigns a sentiment score. Lower threshold values (closer to -1) make detection more strict and selective.

How each Type uses Threshold is listed in the Guardrail types table (How threshold is used column).

Threshold configuration recommendations

  • Lower thresholds

    Example: 0.3 – 0.5

    • More aggressive detection
    • More sensitive
    • Higher chance of false positives

    Useful when:

    • strict security is required
    • missing unsafe content is unacceptable
  • Higher thresholds

    Example: 0.8 – 0.95

    • More conservative detection
    • Fewer false positives
    • Higher chance of missing borderline cases

    Useful when:

    • user experience is important
    • overly aggressive blocking should be avoided

Recommended approach

Start with moderate default values (for example 0.7 – 0.8) and adjust thresholds based on:

  • observed false positives,
  • missed detections,
  • latency and operational impact,
  • organizational security requirements.

It is recommended to test thresholds in monitoring or logging mode before enforcing blocking actions in production.

Data and retrieval#

Vector stores hold embeddings and search configuration for retrieval-augmented generation. They complement provider calls when assistants need organization knowledge beyond what the upstream LLM was trained on.

Tracing and observability#

Traces capture the request lifecycle for debugging. Usage and Cost plus Dashboards aggregate tokens, spend, and per-provider activity so teams can optimize performance and budget. For screen-level detail on summary cards, filters, and the request trace table, see Core Services — Traces. For profiled training runs and experiment outputs, see Experiment Tracking — Tasks and Artifacts.

Administrators use these views after setup to validate that Chat and API clients behave as expected; see First Steps for the initial configuration path. For AIOps workflows, see Model Training.

Topic Where to read more
Admin UI screen detail Core Services, Model Training, Experiment Tracking, Environments & Operations
Chat UI Interface Overview
External tools (OpenCode, Cursor) External Tools, Connect OpenCode, Connect Cursor
Initial setup First Steps
Product introduction Introduction