# AegisFlow

**Enterprise AI Compliance Monitoring · System Whitepaper**

*Version 1.0 · February 2026*

---

## Table of Contents

1. [Executive Summary](#1-executive-summary)
2. [The Problem](#2-the-problem)
3. [What AegisFlow Is](#3-what-aegisflow-is)
4. [Users & Roles](#4-users--roles)
5. [Product Architecture](#5-product-architecture)
6. [Feature Walkthrough](#6-feature-walkthrough)
7. [Compliance Engine (Deep Dive)](#7-compliance-engine-deep-dive)
8. [Security Model](#8-security-model)
9. [Technology Stack](#9-technology-stack)
10. [Data Model](#10-data-model)
11. [API Surface](#11-api-surface)
12. [Operational Readiness](#12-operational-readiness)
13. [Deployment Options](#13-deployment-options)
14. [Roadmap](#14-roadmap)
15. [Appendix · Glossary](#15-appendix--glossary)

---

## 1. Executive Summary

AegisFlow is a **vendor-agnostic governance control plane for enterprise AI usage**. Companies that have rolled out OpenAI, Anthropic, Google Gemini, xAI and other large-language-model platforms across their workforce face a sudden, urgent question: *Who is using what model, for what purpose, on what data, at what cost — and is any of it leaking PII or violating policy?*

AegisFlow answers that question in a single console. It covers 12 major AI vendors through their **Admin APIs** (four have live adapters today; the rest run through a mock fallback, see §5.1), normalizes their usage / cost / safety telemetry into a unified event stream, runs every event through an inline **PII / PHI / prompt-injection compliance engine**, keeps vendor credentials in a **multi-key Fernet vault with KEK wrapping**, and surfaces the results through dashboards, live WebSocket monitoring, per-vendor analytics, an immutable audit log, and SIEM forwarders.

The product is production-hardened — circuit breakers on every outbound call, Prometheus metrics, Kubernetes liveness/readiness probes, brute-force lockout, rate limits, and centralized exception envelopes — and ships with **51/51 automated tests passing**.

---

## 2. The Problem

### 2.1 The new attack surface

Generative AI tools have become the default interface for everything from sales follow-ups to legal-draft review. In the typical enterprise:

- Employees paste **customer PII** (emails, SSNs, credit cards, MRNs) into ChatGPT and Claude.
- Marketers expose **unannounced product details** to Gemini and Perplexity.
- Engineers leak **proprietary source** into Cursor and code assistants.
- Anyone can spin up a personal API key on OpenRouter / DeepSeek with zero corporate oversight.
- **Prompt-injection** payloads in third-party content can hijack agents and exfiltrate data.

There is no analog of CASB or DLP for this surface. The vendors give you a billing dashboard. They do not give you cross-vendor visibility, redaction, or audit-grade evidence for SOC 2 / ISO 27001 / HIPAA reviews.

### 2.2 The compliance gap

Security and compliance teams are now being asked, by their boards and their auditors:

> *"Show me, across every AI vendor in the company, who sent prompts containing PII or PHI in the last 30 days — and what controls fired."*

Without a control plane, the answer is "we cannot tell." That answer is increasingly unacceptable in regulated industries. AegisFlow is the answer.

---

## 3. What AegisFlow Is

AegisFlow is a **full-stack SaaS web application** with three deployment modes:

1. **Hosted preview / live demo** — a public URL the team can poke at instantly.
2. **Customer-managed Kubernetes** — full Helm + raw manifests included; runs in any CNCF-compliant cluster.
3. **On-premises / air-gapped** — Docker Compose + an egress allow-list of just the vendor Admin APIs.

Functionally, it is composed of seven planes:

| Plane | Responsibility |
| --- | --- |
| **Sources** | 12 LLM vendors via their Admin APIs |
| **Collector** | Pulls telemetry on demand or schedule; wraps every call in a circuit breaker |
| **Pipeline** | Normalizes raw vendor responses into a canonical `event` shape |
| **Compliance Engine** | Inline PII / PHI / prompt-injection scanning during ingest |
| **Storage** | MongoDB for events + KEK-wrapped Fernet vault for vendor credentials |
| **Surface** | REST API, WebSocket stream, immutable audit log |
| **Consumers** | Dashboards, Threat Intel, SIEM forwarders, exports |

---

## 4. Users & Roles

### 4.1 Day-to-day users

- **Compliance analysts** — review violations, export evidence binders.
- **Security engineers** — investigate prompt-injection or PII exfiltration.
- **Platform owners** — manage vendor keys, rotation, RBAC.
- **CFO / Procurement** — see cost roll-ups across vendors.

### 4.2 RBAC matrix

| Role | View KPIs | View events | Manage policies | Vault keys | Team / SSO | Encryption |
| --- | --- | --- | --- | --- | --- | --- |
| **Admin** | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
| **Analyst** | ✅ | ✅ | read-only | masked | — | — |
| **Viewer** | ✅ | ✅ | — | — | — | — |

The matrix is enforced server-side via `require_role()` on every protected route; hiding controls in the frontend is a UX nicety only.
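As a framework-free illustration of the matrix above, the sketch below encodes each cell as an access level and looks it up per role. The `Access` enum, the `PERMISSIONS` table, the resource names, and both helpers are hypothetical and are not the project's `require_role()` implementation.

```python
from enum import Enum

class Access(str, Enum):
    NONE = "none"
    MASKED = "masked"        # value visible only as a masked tail
    READ_ONLY = "read-only"
    ALLOWED = "allowed"      # the check-mark cells in the matrix

# Hypothetical encoding of the RBAC matrix; missing entries mean no access.
PERMISSIONS = {
    "admin": {
        "kpis": Access.ALLOWED, "events": Access.ALLOWED, "policies": Access.ALLOWED,
        "vault_keys": Access.ALLOWED, "team_sso": Access.ALLOWED, "encryption": Access.ALLOWED,
    },
    "analyst": {
        "kpis": Access.ALLOWED, "events": Access.ALLOWED,
        "policies": Access.READ_ONLY, "vault_keys": Access.MASKED,
    },
    "viewer": {"kpis": Access.ALLOWED, "events": Access.ALLOWED},
}

def access_for(role: str, resource: str) -> Access:
    """Return the access level for a role; unknown roles get nothing."""
    return PERMISSIONS.get(role.lower(), {}).get(resource, Access.NONE)

def is_allowed(role: str, resource: str) -> bool:
    """True only for full access, so masked and read-only cells are rejected."""
    return access_for(role, resource) is Access.ALLOWED
```

Defaulting to `Access.NONE` keeps the check fail-closed: a new role or resource grants nothing until it is added to the table.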

---

## 5. Product Architecture

```
SOURCES               COLLECTOR        PIPELINE              STORAGE              SURFACE               CONSUMERS
─────────             ─────────        ────────              ───────              ───────               ─────────
12 LLM vendors  ───►  Adapters  ───►  Ingest        ───►   MongoDB events  ───►  REST /api/*    ───►   Dashboards
(OpenAI/Claude/…)     (real            Compliance           Fernet+KEK vault     WebSocket             Threat Intel
                       Admin APIs)     scan inline                                /ws/events            SIEM egress
                                       Policy engine                              Audit log
```

### 5.1 Vendor adapters

Every supported vendor implements a uniform Python `Adapter` contract:

```python
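from __future__ import annotations  # RawEvent / CostPoint stay forward references
from datetime import datetime

class Adapter:
    """Uniform contract implemented by every vendor adapter."""
    async def validate_credential(self, key: str) -> bool: ...
    async def fetch_usage(self, key: str, since: datetime) -> list[RawEvent]: ...
    async def fetch_cost(self, key: str, since: datetime) -> list[CostPoint]: ...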
```

`integrations/registry.py` dispatches `(vendor_id, key_id) → Adapter`. Dispatch failures fall back to a deterministic mock so the dataset is never blank. **Every outbound call is wrapped in a circuit breaker** that opens for 30 seconds after 5 failures, then runs a half-open trial, and exports Prometheus metrics on state transitions.

Concrete adapters:

- `openai_admin.py` — `sk-admin-…` keys against the five Usage endpoints + the Costs endpoint, 90-day look-back window.
- `anthropic_admin.py` — `sk-ant-admin-…` keys against `/v1/organizations/usage_report/messages` + `/cost_report`.
- `xai_admin.py` — `/v1/models` for key validation; usage shipped as mock until xAI publishes Admin endpoints.
- `gemini_admin.py` — validates against the generative API; BigQuery cost path stubbed pending SA-JSON upload.
- `custom_adapter.py` — POSTs the decrypted key to a customer-supplied `collector_url` and normalizes the response.

The other 8 vendors (Azure OpenAI, Mistral, Cohere, Hugging Face, OpenRouter, Perplexity, DeepSeek, Ollama) run through the mock fallback today; real key-validation paths are on the roadmap.
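The dispatch-with-fallback behavior can be sketched as follows. This is a minimal illustration, not the contents of `integrations/registry.py`: the `_REGISTRY` dict, the `register` decorator, `get_adapter`, and the shape of the mock events are all assumptions.

```python
import asyncio
from datetime import datetime, timezone

class MockAdapter:
    """Deterministic stand-in used when no real adapter is registered."""
    def __init__(self, vendor_id: str) -> None:
        self.vendor_id = vendor_id

    async def validate_credential(self, key: str) -> bool:
        return True

    async def fetch_usage(self, key: str, since: datetime) -> list:
        # No randomness: the same inputs always yield the same events.
        return [
            {"vendor_id": self.vendor_id, "model": "mock-model",
             "tokens_in": 100 + i, "tokens_out": 50 + i, "mock": True}
            for i in range(3)
        ]

_REGISTRY = {}  # hypothetical map: vendor_id -> adapter class

def register(vendor_id: str):
    """Class decorator that wires a real adapter into the registry."""
    def wrap(cls):
        _REGISTRY[vendor_id] = cls
        return cls
    return wrap

def get_adapter(vendor_id: str, key_id: str):
    """Resolve an adapter; unknown vendors fall back to the mock."""
    cls = _REGISTRY.get(vendor_id, MockAdapter)
    return cls(vendor_id)

adapter = get_adapter("mistral", "key-123")
events = asyncio.run(adapter.fetch_usage("unused", datetime(2026, 1, 1, tzinfo=timezone.utc)))
```

Because the mock is deterministic, dashboards built on it stay stable between syncs, which makes the fallback easy to tell apart from real vendor data.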

### 5.2 Pipeline

Raw vendor responses land in `raw_events`. The pipeline step normalizes them into the canonical event:

```jsonc
{
    "id": "uuid",
    "user_id": "uuid",            // tenant
    "vendor_id": "openai",
    "model": "gpt-4o",
    "actor_email": "user@acme.io",
    "tokens_in": 932,
    "tokens_out": 412,
    "cost_usd": 0.0184,
    "violation_type": "pii_credit_card",   // optional
    "violation_label": "Credit Card Number",
    "severity": "critical",                // low|medium|high|critical
    "prompt_excerpt": "Charge card 4111 ...",
    "compliance_hits": ["pii_credit_card","pii_email"],
    "created_at": "2026-02-17T14:31:09Z"
}
```

During normalization, every event passes through `core/compliance.evaluate_event()`. Detector hits write `violation_type`, `severity`, and an array of `compliance_hits` directly onto the event — no async queue, no downstream consumer needed.

### 5.3 Storage

- **MongoDB**: events, raw_events, sync_runs, users, audit_log, login_attempts.
- **Vault**: `vault_keys` collection — vendor API keys encrypted at rest with `MultiFernet`, wrapped by a KEK derived from `FERNET_KEY`. Rotation is zero-downtime: add the new key to the head of `FERNET_KEYS_HISTORY`, restart, and a background self-test re-encrypts and proves the chain. **A backup leak alone is not sufficient to recover the keys.**
- **Redis** (optional): `/dashboard/overview` cache with a 30-second TTL. Falls back to an in-memory TTL dict so a missing Redis URL is non-fatal.
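The in-memory fallback for the dashboard cache could look like the sketch below. The `TTLCache` class and its method names are hypothetical; only the 30-second TTL comes from the description above. The injectable `clock` is there so expiry can be exercised without sleeping.

```python
import time
from typing import Any, Callable

class TTLCache:
    """Tiny per-process cache whose entries expire after ttl_seconds."""
    def __init__(self, ttl_seconds: float = 30.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._ttl = ttl_seconds
        self._clock = clock
        self._store = {}  # key -> (expires_at, value)

    def get(self, key: str) -> Any:
        entry = self._store.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if self._clock() >= expires_at:
            del self._store[key]      # lazily evict the stale entry
            return None
        return value

    def set(self, key: str, value: Any) -> None:
        self._store[key] = (self._clock() + self._ttl, value)

    def invalidate(self, key: str) -> None:
        """Drop an entry early, e.g. after a sync run."""
        self._store.pop(key, None)
```

A per-process dict is not shared across replicas, so each backend pod would warm its own copy; that is the trade-off for running without Redis.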

### 5.4 Surface

- **REST** at `/api/*` serves JSON and authenticates requests with httpOnly cookies. CORS is allow-listed via `FRONTEND_URL` plus a regex for `*.preview.emergentagent.com` and `*.emergent.host`.
- **WebSocket** at `/api/ws/events` is a cookie-authenticated stream that emits `hello → event* → heartbeat` frames. Unauthenticated upgrades are closed with code 1008 (policy violation).
- **Immutable audit log** — every privileged action records actor, action, IP (XFF-aware), user-agent, and a millisecond timestamp.

### 5.5 Consumers

- **Frontend dashboards** — 22 React pages organized into a 6-section sidebar (Monitoring · Providers · Compliance · Security · Platform · Admin).
- **SIEM egress** — Splunk HEC, Datadog Logs, Sumo Logic HTTP source, or a generic JSON webhook. Test endpoint included.
- **Exports** — CSV / JSON pulls for any event filter, suitable as SOC 2 evidence binders.

---

## 6. Feature Walkthrough

### 6.1 Global Dashboard

Headline KPIs: 7-day events, total cost, violation count, average cost per event. A multi-series timeline chart breaks events down by vendor; the recent feed lists the last 10 high-severity violations. The page is **Redis-cached for 30 seconds** and invalidated on every sync.

### 6.2 AI Activity & Violations

A single filterable audit table (`/api/events`). The sidebar offers two entry points pointing at the same route — *AI Activity* (everything) and *Violations* (`?only_violations=1`). The URL is bidirectionally bound to state, so deep-linking and bookmarking work.

### 6.3 PII / PHI Detection

The compliance-engine UI: four headline KPIs (scanned, flagged, detectors active, critical hits), an 8-card grid showing 7d/30d counts per detector, and a recent-detections feed showing the actual prompt excerpt that fired the detector. A **Re-scan last 14 days** button calls `POST /compliance/backfill` and refreshes counts in place.

### 6.4 Threat Intelligence

Posture tiles (active vendors, circuit-breaker state, audit volume, anomaly count) above a circuit-breaker panel showing live state transitions, a recent-anomalies panel, and an audit timeline. This is where a security analyst lives during an incident.

### 6.5 Live Monitoring

A WebSocket-fed event stream. The connection indicator pulses green when live, and events scroll in as staggered framer-motion cards. The pause and clear controls let an operator freeze the feed during triage.

### 6.6 Vendor Analytics

Four cross-vendor KPIs (total events 7d, total spend, average P50 latency, P95 hot-vendor) above a per-vendor table: events 7d/30d, tokens 7d, spend 7d, P50 / P95 latency, and last-sync status. Useful for procurement reviews and SLA enforcement.

### 6.7 API Key Vault

Add / rotate / remove vendor credentials. The key never leaves the server in plaintext after the initial POST: it is encrypted with the head of the active Fernet chain before insertion, and the UI shows only the masked tail. For Gemini, a separate Service Account JSON can be uploaded against the same key record.

### 6.8 Policy Engine

A list of keyword and regex rules with adjustable severity and hit counters. Rules apply during the compliance scan; a match sets or enriches `violation_type` when the engine's built-in detectors miss something domain-specific.
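A rule of this kind might be modelled as below. This is a sketch under assumptions: the `PolicyRule` fields mirror the `policies` collection in §10, but `apply_policies`, the `policy_<name>` naming, and the choice to label only events that have no `violation_type` yet are illustrative.

```python
import re
from dataclasses import dataclass

@dataclass
class PolicyRule:
    name: str
    pattern: str              # keyword or regular expression
    severity: str = "medium"
    enabled: bool = True
    hits: int = 0

    def matches(self, text: str) -> bool:
        """Case-insensitive search; bumps the hit counter on a match."""
        if self.enabled and re.search(self.pattern, text, re.IGNORECASE):
            self.hits += 1
            return True
        return False

def apply_policies(event: dict, rules: list) -> dict:
    """Label the event from the first matching rule if nothing flagged it yet."""
    text = event.get("prompt_excerpt", "")
    for rule in rules:
        if rule.matches(text) and not event.get("violation_type"):
            event["violation_type"] = f"policy_{rule.name}"  # hypothetical code format
            event["severity"] = rule.severity
    return event

rule = PolicyRule(name="codename", pattern=r"project\s+falcon", severity="high")
flagged = apply_policies({"prompt_excerpt": "Notes on Project Falcon launch"}, [rule])
```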

### 6.9 Architecture

An animated SVG end-to-end pipeline diagram (Sources → Collector → Pipeline + Compliance → Storage → Surface → Consumers) with framer-motion flowing-dash edges. Doubles as living onboarding documentation.

---

## 7. Compliance Engine (Deep Dive)

The compliance engine lives in `core/compliance.py` and consists of **eight pure-function detectors** plus a single `evaluate_event(event)` mutator.

### 7.1 Detector catalog

| Code | Label | Category | Severity | Mechanism |
| --- | --- | --- | --- | --- |
| `pii_email` | Email Address | pii | low | RFC-ish regex |
| `pii_ssn` | Social Security Number | pii | critical | US SSN regex with area / group / serial validity |
| `pii_credit_card` | Credit Card Number | pii | critical | Regex + **Luhn check** |
| `pii_phone` | Phone Number | pii | medium | US + intl regex |
| `pii_ip` | IP Address | pii | low | IPv4 octet validity |
| `phi_mrn` | Medical Record Number | phi | high | `MRN: 8821547` / `Patient ID 12345` |
| `phi_dob` | Date of Birth | phi | medium | `DOB: 1984-06-12` / `date of birth 6/12/84` |
| `prompt_injection` | Prompt Injection | security | high | 13 jailbreak signature patterns (DAN, "ignore previous instructions", "developer mode", role-tag injection, etc.) |

Credit-card detection is **Luhn-validated** to eliminate false positives on long digit runs (order numbers, hashes, etc.).
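To make the Luhn step concrete, here is a minimal sketch of a regex candidate match followed by checksum validation. The `CARD_CANDIDATE` pattern and the function names are assumptions; the production regex in `core/compliance.py` may differ.

```python
import re

# Candidate: 13 to 19 digits, optionally separated by single spaces or hyphens.
CARD_CANDIDATE = re.compile(r"\b(?:\d[ -]?){12,18}\d\b")

def luhn_valid(digits: str) -> bool:
    """Standard Luhn checksum: double every second digit from the right."""
    total = 0
    for i, ch in enumerate(reversed(digits)):
        d = int(ch)
        if i % 2 == 1:
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0

def find_credit_cards(text: str) -> list:
    """Return digit strings that look like card numbers and pass Luhn."""
    hits = []
    for match in CARD_CANDIDATE.finditer(text):
        digits = re.sub(r"[ -]", "", match.group())
        if luhn_valid(digits):
            hits.append(digits)
    return hits

cards = find_credit_cards("Charge card 4111 1111 1111 1111 today")
orders = find_credit_cards("Order 1234567890123456 shipped")
```

The second call shows the point of the checksum: a 16-digit order number matches the regex but fails Luhn, so it is not reported.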

### 7.2 Where it runs

The engine is invoked synchronously in four places:

1. **Seed data generation** (`_gen_raw_event`), which guarantees the dataset has a realistic ~14% flag rate.
2. **Real adapter sync** (`routes/collector.py::sync_key`) — every event returned by a vendor Admin API.
3. **Mock fallback sync** — every synthetic event when an adapter is not wired.
4. **Pipeline ingest** — events promoted from `raw_events` to `events` get scanned a second time so a detector update + backfill cleans up history.

### 7.3 Severity precedence

The engine never *downgrades* a violation already set by upstream. If a vendor adapter or policy rule already labeled an event `critical`, the engine attaches its hits to `compliance_hits` but leaves the primary `violation_type` alone. This way the worst signal always wins and you keep an audit trail of every concurrent hit.
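The precedence rule can be sketched as a merge function. `SEVERITY_RANK` and `merge_hits` are hypothetical names, and the sketch assumes the engine replaces the primary label only when its own worst hit is strictly more severe than what upstream set.

```python
SEVERITY_RANK = {"low": 0, "medium": 1, "high": 2, "critical": 3}

def merge_hits(event: dict, hits: list) -> dict:
    """Attach detector hits, given as (code, severity) pairs, without downgrading."""
    codes = event.setdefault("compliance_hits", [])
    for code, _severity in hits:
        if code not in codes:
            codes.append(code)          # every concurrent hit is kept
    if not hits:
        return event
    worst_code, worst_severity = max(hits, key=lambda h: SEVERITY_RANK[h[1]])
    current = event.get("severity")
    if current is None or SEVERITY_RANK[worst_severity] > SEVERITY_RANK[current]:
        event["violation_type"] = worst_code
        event["severity"] = worst_severity
    return event

upstream = merge_hits(
    {"violation_type": "policy_codename", "severity": "critical"},
    [("pii_email", "low")],
)
fresh = merge_hits({}, [("pii_email", "low"), ("pii_ssn", "critical")])
```

In the first call the upstream `critical` label survives and `pii_email` is only recorded in `compliance_hits`; in the second, the engine's worst hit becomes the primary violation.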

### 7.4 Backfill API

`POST /api/compliance/backfill?days=14` re-scans the last N days of events and updates anything the latest detectors catch. Useful when you ship a new regex.

---

## 8. Security Model

### 8.1 Identity

- **JWT** issued at login, signed with `JWT_SECRET` (HS256). Set in **two httpOnly cookies**: `access_token` (short-lived) and `refresh_token` (longer-lived), both `SameSite=None; Secure`.
- **bcrypt** for password hashing. The cost factor is set high enough that hashing dominates the cost of each login attempt, which slows password guessing.
- **Brute-force lockout** — 5 failed attempts per email in 15 minutes triggers a temporary lock with a database-backed counter (`login_attempts` collection), preventing trivial credential stuffing.
- **Refresh token rotation** — every refresh issues a new pair and invalidates the previous.
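The lockout rule (5 failures per email within 15 minutes) can be sketched in memory as follows. The real counter is database-backed in `login_attempts`; the `LoginAttempts` class is hypothetical, and the 15-minute `LOCK_DURATION` is an assumption because the text only says the lock is temporary.

```python
from datetime import datetime, timedelta, timezone

MAX_FAILURES = 5
WINDOW = timedelta(minutes=15)
LOCK_DURATION = timedelta(minutes=15)  # assumption: not specified in the text

class LoginAttempts:
    """In-memory stand-in for the database-backed failure counter."""
    def __init__(self) -> None:
        self._failures = {}      # email -> list of failure timestamps
        self._locked_until = {}  # email -> datetime

    def is_locked(self, email: str, now: datetime) -> bool:
        until = self._locked_until.get(email.lower())
        return until is not None and now < until

    def record_failure(self, email: str, now: datetime) -> None:
        key = email.lower()
        # Keep only failures inside the sliding window, then add this one.
        recent = [t for t in self._failures.get(key, []) if now - t < WINDOW]
        recent.append(now)
        self._failures[key] = recent
        if len(recent) >= MAX_FAILURES:
            self._locked_until[key] = now + LOCK_DURATION

    def record_success(self, email: str) -> None:
        key = email.lower()
        self._failures.pop(key, None)
        self._locked_until.pop(key, None)

t0 = datetime(2026, 2, 17, 12, 0, tzinfo=timezone.utc)
attempts = LoginAttempts()
for i in range(4):
    attempts.record_failure("user@acme.io", t0 + timedelta(seconds=i))
locked_after_four = attempts.is_locked("user@acme.io", t0 + timedelta(seconds=4))
attempts.record_failure("user@acme.io", t0 + timedelta(seconds=4))
locked_after_five = attempts.is_locked("user@acme.io", t0 + timedelta(seconds=5))
```

Keying on the lower-cased email means case variations of the same address share one counter.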

### 8.2 Authorization

- **RBAC** with `Admin / Analyst / Viewer`. Enforced via `Depends(require_role("admin"))` on protected routes. The frontend hides UI but is not the security boundary.
- **Tenant isolation** — every Mongo query filters by `user_id`. There is no shared collection; multi-tenancy is enforced at the data layer.

### 8.3 Secrets at rest

- **MultiFernet chain** holds an ordered list of keys: the first encrypts new data, and the remaining keys are decrypt-only so rotation is graceful.
- **KEK wrapping** — the encrypted chain is *itself* persisted in MongoDB wrapped by a KEK derived from `FERNET_KEY`. A leaked DB backup, without the KEK, decrypts to noise.
- **Rotation flow**: prepend the new key to `FERNET_KEYS_HISTORY` and restart; the startup self-test then re-encrypts the chain and emits a rotation event to the audit log. Operators can verify decryption depth via `/api/ready`.
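The rotation semantics can be demonstrated with the `cryptography` package directly. This sketch covers only the MultiFernet chain; KEK wrapping and the startup self-test are not shown, and the key variables and sample secret are illustrative.

```python
from cryptography.fernet import Fernet, MultiFernet

old_key = Fernet.generate_key()
new_key = Fernet.generate_key()
secret = b"sk-admin-example"   # illustrative value, not a real credential

# Before rotation: the credential was encrypted under the old key only.
token = Fernet(old_key).encrypt(secret)

# After rotation: the new key is first (encrypts), the old key is decrypt-only.
chain = MultiFernet([Fernet(new_key), Fernet(old_key)])
recovered = chain.decrypt(token)

# rotate() re-encrypts an existing token under the first key in the chain.
rotated = chain.rotate(token)
recovered_with_new_key = Fernet(new_key).decrypt(rotated)
```

Once every stored token has been rotated, the old key can be dropped from the chain without losing access to any credential.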

### 8.4 Transport

- **HSTS preload** (`max-age=63072000; includeSubDomains; preload`)
- **Content-Security-Policy** with `default-src 'self'`. The policy still allows `'unsafe-inline'`. (Open backlog: drop `'unsafe-inline'` for nonce-based CSP once Shadcn supports the `nonce` hook.)
- **COOP / CORP** prevent cross-window leakage.
- **X-Frame-Options: DENY**, **Referrer-Policy: no-referrer**, **Permissions-Policy** locked down.

### 8.5 Application-layer

- **slowapi rate limits**, **XFF-aware** — `/auth/register` at 10/hr, `/leads` at 60/min.
- **Centralized exception handlers** — uniform `{detail, error:{code,message,request_id}}` envelope. **No stack traces in 5xx responses.**
- **Circuit breakers** on every outbound LLM call — protects you from a misbehaving vendor cascading into a backend outage.
- **Immutable audit log** captures actor + IP + user-agent + action + diff on every privileged operation.

### 8.6 What the engine does NOT do

We are deliberately transparent about coverage gaps:

- We do **not** offer at-rest encryption beyond the credential vault. The events themselves are stored in cleartext in MongoDB, on the assumption that you encrypt the disk / volume at the infrastructure layer (e.g., LUKS or EBS encryption).
- We do **not** implement DLP-style outbound blocking. Detection happens after the fact; inline gating belongs in a forward proxy, not a usage-governance plane.
- We do **not** ship a customer-data-exfil canary detector — that is a roadmap item.

---

## 9. Technology Stack

### 9.1 Frontend

- **React 19** + **React Router 7** — the latest stable.
- **Tailwind CSS** with custom design tokens; **Shadcn UI** primitives.
- **framer-motion** for entrance animations, sidebar transitions, and the animated architecture diagram.
- **Recharts** for time-series charts on the dashboard.
- **Sonner** for non-blocking toasts; **axios** with `withCredentials: true` for the API.
- Typography: **Cabinet Grotesk** (display) + **IBM Plex Sans / Mono**.

### 9.2 Backend

- **FastAPI** + **Motor** (async MongoDB) + **Pydantic v2** + **WebSockets** + **BackgroundTasks**.
- Async I/O end-to-end with **httpx** for outbound calls and **tuned connection pools**.
- **PyJWT** + **bcrypt**, **slowapi**, **cryptography (MultiFernet)**, **prometheus_client**, **structured JSON logging** with **request-id propagation** through middleware.

### 9.3 Infrastructure

- **MongoDB 5.0+** — replica-set optional, sharding not required at the typical scale.
- **Redis 7+** — optional, the app falls back to in-memory TTL caches.
- **Supervisor** in the preview environment; **Kubernetes** manifests + a sample **Helm chart values file** for production.

### 9.4 Test & CI

- **pytest** suite — **51 tests passing**, covering auth, brute-force lockout, RBAC, vendor adapter dispatch with mocked HTTP, dashboard cache hit / miss, encryption rotation, KEK wrap, SA upload, rate-limit headers, and sync-run growth.
- **Playwright** e2e for landing / sale page / UTM tracking / security headers.
- **GitHub Actions CI** — `ruff` lint, `pytest`, `pip-audit`, `yarn build`, `npm audit`, Playwright on every PR.

---

## 10. Data Model

Collections (MongoDB):

| Collection | Purpose | Key fields |
| --- | --- | --- |
| `users` | Tenants + auth | `id, email, password_hash, role, org_name, created_at` |
| `events` | Canonical event stream | as in §5.2 above |
| `raw_events` | Pre-normalization payloads | `id, user_id, vendor_id, run_id, ingested, raw, created_at` |
| `sync_runs` | Sync history | `id, user_id, vendor_id, status, pulled, errors, started_at, finished_at` |
| `vendors` | System + custom catalog | `id, name, product, color, models, user_id, collector_url` |
| `vault_keys` | Encrypted credentials | `key_id, vendor_id, api_key_encrypted, sa_json_encrypted, name, user_id, created_at` |
| `policies` | Customer policy rules | `id, user_id, name, pattern, severity, hits, enabled` |
| `audit_log` | Immutable trail | `id, user_id, action, target, actor_ip, actor_ua, diff, ts` |
| `login_attempts` | Brute-force tracker | `email, ip, failures, last_attempt, locked_until` |

Indexes are created on `(user_id, created_at)` for events and raw_events, and on `email` for users / login_attempts.

---

## 11. API Surface

All routes prefixed with `/api`. ~60 endpoints total; the high-value ones:

### 11.1 Auth & Identity

```
POST   /auth/register           rate-limited 10/hr
POST   /auth/login              brute-force protected
POST   /auth/logout             clears cookies
POST   /auth/refresh            rotates pair
GET    /auth/me                 current user
```

### 11.2 Vendors & Vault

```
GET    /vendors                 system + custom
POST   /vendors                 add custom vendor
DELETE /vendors/{vendor_id}
GET    /keys                    masked list
POST   /keys                    Fernet-encrypted insert
DELETE /keys/{key_id}
POST   /keys/{key_id}/service-account
DELETE /keys/{key_id}/service-account
```

### 11.3 Collector & Pipeline

```
POST   /collector/sync/{key_id}     real adapter or mock
GET    /collector/runs              sync history
GET    /pipeline/status             raw vs structured counts
POST   /pipeline/ingest             promote raw → events (compliance-scanned)
```

### 11.4 Analytics

```
GET    /dashboard/overview          Redis-cached
GET    /events                      filter + paginate
GET    /vendors/analytics           per-vendor P50/P95/spend
```

### 11.5 Compliance Engine

```
GET    /compliance/detectors        catalog
GET    /compliance/summary          7d/30d counts + recent[]
POST   /compliance/scan/{id}        ad-hoc single-event re-scan
POST   /compliance/backfill?days=N  re-scan recent
```

### 11.6 Real-time & Ops

```
WS     /ws/events                   cookie or ?token= auth
GET    /health                      K8s liveness
GET    /ready                       Mongo + Redis + breaker state
GET    /metrics                     Prometheus scrape
```

### 11.7 Enterprise

```
GET / PUT  /team                    RBAC management (admin only)
GET / POST /policies                policy engine
GET        /audit/export?fmt=csv|json
GET / POST /integrations/{splunk|datadog|sumo|webhook}
POST       /integrations/{provider}/test
GET / POST /sso/saml                SSO/SAML config
GET / POST /sso/scim
GET        /deployment/{helm|manifests|onprem}
GET / POST /encryption/rotate
GET        /encryption/audit
```

---

## 12. Operational Readiness

### 12.1 Observability

- **Structured JSON logs** on stdout with `request_id` propagated across handlers, adapters, and outbound vendor calls. Requests are traceable through the entire pipeline.
- **Prometheus metrics** at `/api/metrics`:
  - `aegisflow_http_requests_total{route,method,status}`
  - `aegisflow_request_duration_seconds_bucket{route}` (histogram)
  - `aegisflow_adapter_calls_total{vendor,outcome}`
  - `aegisflow_circuit_breaker_state{vendor}`
  - `aegisflow_db_pool_active`, `aegisflow_db_pool_idle`
  - `aegisflow_websocket_clients`
- **Kubernetes probes** — `/api/health` (liveness) and `/api/ready` (Mongo + Redis + circuit-breaker state).

### 12.2 Resilience

- **Circuit breaker** on every outbound vendor call: 5 failures within 60 seconds opens the breaker for 30 seconds, then a half-open trial decides whether to close.
- **Tuned async pools** — Motor `maxPoolSize=100, minPoolSize=10`; httpx `max_connections=50, max_keepalive=20`.
- **Centralized exception envelope** — unhandled errors return `{error:{code,message,request_id}}` with no stack trace. Stack traces are logged with the same `request_id` for offline correlation.
- **In-memory fallback cache** for the dashboard so a Redis outage degrades performance but never breaks the page.
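The breaker parameters above (5 failures within 60 seconds, open for 30 seconds, then a half-open trial) can be sketched as a small state machine. This is a synchronous illustration with hypothetical names; the production breaker wraps async calls and also emits Prometheus metrics, which are omitted here.

```python
import time
from collections import deque

class CircuitOpenError(RuntimeError):
    """Raised when a call is rejected because the breaker is open."""

class CircuitBreaker:
    def __init__(self, max_failures=5, window=60.0, open_for=30.0, clock=time.monotonic):
        self.max_failures = max_failures
        self.window = window        # seconds in which failures are counted
        self.open_for = open_for    # seconds the breaker stays open
        self._clock = clock
        self._failures = deque()    # timestamps of recent failures
        self._opened_at = None

    @property
    def state(self) -> str:
        if self._opened_at is None:
            return "closed"
        if self._clock() - self._opened_at >= self.open_for:
            return "half_open"
        return "open"

    def call(self, fn, *args, **kwargs):
        state = self.state
        if state == "open":
            raise CircuitOpenError("breaker open; call rejected")
        try:
            result = fn(*args, **kwargs)
        except Exception:
            self._record_failure(trial=(state == "half_open"))
            raise
        if state == "half_open":    # trial succeeded: close and reset
            self._opened_at = None
            self._failures.clear()
        return result

    def _record_failure(self, trial: bool) -> None:
        now = self._clock()
        self._failures.append(now)
        while self._failures and now - self._failures[0] > self.window:
            self._failures.popleft()
        if trial or len(self._failures) >= self.max_failures:
            self._opened_at = now   # open, or re-open after a failed trial
```

A failed half-open trial re-opens the breaker immediately, so a vendor that is still down costs one probe call per 30 seconds rather than a burst of five.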

### 12.3 Capacity expectations

The product is designed for **mid-market enterprise volumes** — tens of thousands of events per day across a handful of vendors. The MongoDB indexes and pagination model comfortably handle a 90-day window of ~3M events. For a Fortune-500-scale rollout (1M events / day per vendor), a horizontal-shard pattern on `(user_id, created_at)` is on the roadmap.
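As a rough check of these figures, the arithmetic below shows how a daily rate translates into a 90-day window. The 33,000 events/day figure is an assumed point within "tens of thousands", and the Fortune-500 case assumes all 12 vendors run at 1M events/day.

```python
WINDOW_DAYS = 90

def events_in_window(events_per_day: int, days: int = WINDOW_DAYS) -> int:
    """Total events retained for a given daily rate."""
    return events_per_day * days

mid_market = events_in_window(33_000)             # 2,970,000, i.e. ~3M
fortune_500 = events_in_window(1_000_000 * 12)    # 1,080,000,000
growth_factor = fortune_500 / mid_market
```

Under these assumptions the Fortune-500 window is a few hundred times larger than the mid-market one, which is the gap the sharding roadmap item is meant to close.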

---

## 13. Deployment Options

### 13.1 Hosted (preview)

Supervisor-managed FastAPI + React on the Emergent platform. Two-click deploy to production at `*.emergent.host`.

### 13.2 Customer Kubernetes

Manifests are served live from `/api/deployment/manifests`, and a Helm values stub from `/api/deployment/helm`. A deployment requires:

- One backend Deployment (FastAPI behind uvicorn).
- One frontend Deployment (NGINX serving the React build).
- One MongoDB StatefulSet (or a managed Atlas connection string).
- Optional Redis Deployment.
- ConfigMap with non-secret env; Secret with `JWT_SECRET`, `FERNET_KEY`, vendor creds.

### 13.3 On-prem / air-gapped

The `docker-compose.yml` template includes the backend, frontend, MongoDB, Redis, and an NGINX reverse proxy. Required egress allow-list:

```
api.openai.com
api.anthropic.com
generativelanguage.googleapis.com
api.x.ai
*.azure.com                     (Azure OpenAI resource endpoints)
*.api.cognitive.microsoft.com   (regional, e.g., eastus.api.cognitive.microsoft.com)
api.mistral.ai
api.cohere.com
api-inference.huggingface.co
openrouter.ai
api.perplexity.ai
api.deepseek.com
+ customer-supplied SIEM endpoints
```

For deeply air-gapped deployments, every outbound vendor adapter can be disabled and customer-managed mirrors substituted via the `EXTRA_ORIGINS` and `*_BASE_URL` overrides.

### 13.4 Environment variables

```
# Required
MONGO_URL           "mongodb://..."
DB_NAME             "aegisflow_db"
JWT_SECRET          <64-char hex>
FERNET_KEY          <base64 Fernet key>
ADMIN_EMAIL, ADMIN_PASSWORD
DEMO_EMAIL, DEMO_PASSWORD

# Optional
FERNET_KEYS_HISTORY ""              # rotation chain (comma-separated)
REDIS_URL           ""              # falls back to in-memory
FRONTEND_URL        "https://..."   # primary CORS origin
EXTRA_ORIGINS       ""              # additional comma-separated
```
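One way to load and validate these variables at startup is sketched below. The `Settings` dataclass, `load_settings`, and `split_csv` are hypothetical names; the variable names and the comma-separated formats come from the listing above.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional

REQUIRED = (
    "MONGO_URL", "DB_NAME", "JWT_SECRET", "FERNET_KEY",
    "ADMIN_EMAIL", "ADMIN_PASSWORD", "DEMO_EMAIL", "DEMO_PASSWORD",
)

def split_csv(value: str) -> list:
    """Split a comma-separated variable, dropping blanks and whitespace."""
    return [part.strip() for part in value.split(",") if part.strip()]

@dataclass(frozen=True)
class Settings:
    mongo_url: str
    db_name: str
    fernet_key: str
    fernet_keys_history: list
    redis_url: Optional[str]     # None means: use the in-memory cache
    cors_origins: list

def load_settings(env: Optional[Mapping[str, str]] = None) -> Settings:
    env = os.environ if env is None else env
    missing = [name for name in REQUIRED if not env.get(name)]
    if missing:
        raise RuntimeError("missing required settings: " + ", ".join(missing))
    origins = split_csv(env.get("FRONTEND_URL", "")) + split_csv(env.get("EXTRA_ORIGINS", ""))
    return Settings(
        mongo_url=env["MONGO_URL"],
        db_name=env["DB_NAME"],
        fernet_key=env["FERNET_KEY"],
        fernet_keys_history=split_csv(env.get("FERNET_KEYS_HISTORY", "")),
        redis_url=env.get("REDIS_URL") or None,
        cors_origins=origins,
    )
```

Failing fast on a missing required variable surfaces configuration mistakes at boot instead of at the first request that needs the value.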

---

## 14. Roadmap

### 14.1 Shipped

- Real Admin-API adapters for OpenAI, Anthropic, xAI, Gemini
- Multi-tenant org + RBAC (Admin · Analyst · Viewer)
- SSO / SCIM configuration (Okta, Entra ID, Google, SAML 2.0 / OIDC)
- 8-detector PII / PHI / prompt-injection engine
- Live monitoring (WebSocket stream)
- Per-vendor analytics, Threat Intelligence, animated architecture viz
- Multi-Fernet rotation with KEK-wrapped chain in MongoDB
- Circuit breakers on every outbound vendor call
- Prometheus metrics + Kubernetes probes
- CI pipeline (ruff · pytest · pip-audit · yarn build · npm audit · Playwright)

### 14.2 Backlog (P1)

- Gemini Service-Account JSON → BigQuery query path (storage is done; query path is the next step).
- Frontend Vitest unit tests for components.
- Real key-validation paths for Azure OpenAI, Mistral, Cohere, Hugging Face, OpenRouter, Perplexity, DeepSeek, Ollama.

### 14.3 Backlog (P2)

- Scheduled (cron) sync runs with exponential backoff.
- Slack and PagerDuty alert channels for critical violations.
- BYO Snowflake / BigQuery destination for downstream BI tooling.
- PDF compliance report export per organization.
- Light-mode adaptive toggle.
- Tighten CSP to nonce-based strategy (drop `'unsafe-inline'`).
- API versioning prefix (`/api/v1`).
- Customer-data-exfil canary detector.
- Horizontal-shard pattern on `events.(user_id, created_at)` for F500 scale.

---

## 15. Appendix · Glossary

| Term | Definition |
| --- | --- |
| **Admin API** | A vendor's tenancy-wide API (separate from user-facing chat APIs) that exposes usage, cost, and audit telemetry. Examples: `sk-admin-…` for OpenAI, `sk-ant-admin-…` for Anthropic. |
| **Compliance Engine** | The inline scanner in `core/compliance.py` that runs on every event during ingest. |
| **Detector** | One pure-function pattern in the compliance engine (e.g., `pii_ssn`). |
| **Event** | A normalized record of one LLM call: who, what model, how many tokens, what cost, what violations. |
| **Fernet** | The symmetric authenticated-encryption scheme from `cryptography`. **MultiFernet** chains multiple keys for rotation. |
| **KEK** | Key-Encryption Key — wraps the Fernet chain in storage so a backup leak alone is insufficient. |
| **Pipeline** | The step that normalizes `raw_events` into `events` and runs the compliance engine inline. |
| **PII / PHI** | Personally Identifiable Information / Protected Health Information. |
| **Prompt injection** | An adversarial payload embedded in a prompt or in third-party content that aims to hijack the model's instructions. |
| **RBAC** | Role-Based Access Control. AegisFlow ships three roles: Admin, Analyst, Viewer. |
| **SIEM** | Security Information & Event Management. Examples: Splunk, Datadog, Sumo Logic, Elastic. |
| **Vault** | The encrypted credential store (`vault_keys` collection) holding vendor API keys. |

---

*End of document.*
