StemFun Project Architecture and Workflow
StemFun started from a simple teaching problem: STEM subjects often become difficult not because the formulas are unavailable, but because learners cannot see why those formulas emerge. Many online learning platforms provide exercises, worked answers, and recorded explanation videos, yet the path from a real-world situation to a model, equation, diagram, or algorithm is still hard to follow. This gap is especially visible in math, physics, chemistry, and programming topics where learners need motion, structure, and step-by-step visual reasoning rather than static text.
The research direction behind StemFun was to combine generative AI with deterministic scientific animation. General AI video tools can quickly create marketing-style clips or avatar-led lessons, but they struggle with precise graphs, geometric relationships, symbolic transformations, simulations, and code execution traces. Instead of relying on diffusion-style video generation, StemFun uses large language models to plan an explanation and generate Manim code, then renders the output through a controllable Python animation pipeline. This keeps the creative flexibility of AI while preserving the precision needed for educational content.
The core idea is to let a learner or teacher describe a STEM concept in natural language, then have the system produce an animated explanation that shows the reasoning process. For example, a physics problem can be decomposed into forces, variables, equations, and visual transitions; a calculus idea can be shown as a changing curve; and a programming tutorial can visualize control flow and data changes over time. The system is designed not only to produce a final video, but also to support inspection, repair, and iterative editing of the generated code.
Technically, StemFun is a full-stack AI rendering platform. It uses a React frontend, an Express/TypeScript backend, a Redis-backed Bull queue for asynchronous jobs, AI generation and repair services, and a Python/Manim runtime for video or image rendering. The project has two main product modes:
- Workflow Mode: one-shot generation through an async job pipeline.
- Agent Mode: long-lived Studio sessions where an AI agent can read files, write code, request permissions, run checks, render outputs, and spawn reviewer or designer subagents.
System Overview
StemFun is a full-stack application with a React frontend, Express/TypeScript backend, Redis-backed Bull queue, Python/Manim render runtime, and optional Supabase persistence.
graph TB
subgraph Frontend["Frontend: React 19 + Vite"]
Classic["Classic Generator UI"]
Studio["Studio Workspace"]
Pages["Workspace, metrics, settings"]
end
subgraph Backend["Backend: Express + TypeScript"]
Routes["API Routes"]
Services["Services Layer"]
Queue["Bull Queue: video-generation"]
Agent["Studio Agent Runtime"]
end
subgraph Runtime["Render Runtime"]
Python["Python"]
Manim["ManimCE"]
Latex["LaTeX"]
FFmpeg["ffmpeg"]
end
subgraph Storage["Storage"]
Redis[("Redis")]
Files["public/videos, public/images, media"]
Supabase[("Supabase optional")]
end
subgraph External["External Services"]
AI["OpenAI-compatible AI Providers"]
end
Classic -->|HTTP| Routes
Studio -->|HTTP + SSE| Routes
Routes --> Services
Routes --> Agent
Services --> Queue
Queue --> Redis
Services --> AI
Agent --> AI
Queue --> Python
Python --> Manim
Manim --> Latex
Manim --> FFmpeg
Services --> Files
Services --> Supabase
Agent --> Supabase
Main backend entry points:
| Area | Files | Purpose |
|---|---|---|
| Server boot | src/server.ts, src/server/bootstrap.ts | Express app, middleware, static files, route registration |
| Routes | src/routes/*.route.ts | Generation, modification, status, cancellation, Studio API, history, metrics |
| Queue | src/config/bull.ts, src/queues/processors/video.processor.ts | Async generation and render jobs |
| AI generation | src/services/concept-designer.ts, src/services/concept-designer/ | Scene design and Manim code generation |
| Static guard | src/services/static-guard/ | Python syntax/type checks and AI patch loop |
| Render retry | src/services/code-retry/ | AI-assisted repair after Manim render failure |
| Studio runtime | src/studio-agent/ | Agent sessions, tools, permissions, runs, tasks, work results |
Product Modes
flowchart LR
User["User"] --> Choice{"Mode"}
Choice --> Classic["Workflow Mode"]
Choice --> Studio["Agent Mode"]
Classic --> W1["Submit concept"]
W1 --> W2["Queue job"]
W2 --> W3["Generate code"]
W3 --> W4["Static guard"]
W4 --> W5["Render"]
W5 --> W6["Return video/image"]
Studio --> A1["Create session"]
A1 --> A2["Start run"]
A2 --> A3["Agent tool loop"]
A3 --> A4["Read/write/check/render/review"]
A4 --> A5["Stream events to UI"]
A5 --> A6["Continue in same workspace"]
| Dimension | Workflow Mode | Agent Mode |
|---|---|---|
| Interaction | Single request, async job result | Multi-turn session |
| State | Job-based, temporary | Session-based, workspace-backed |
| Frontend updates | Polling | Server-Sent Events |
| AI behavior | Pipeline stages | Tool-using agent loop |
| Review | Static guard and render retry | Static check, AI review, reviewer subagent |
| Workspace | Generated temp code/media | Persistent session directory |
| Best for | Direct video/image generation | Iterative creation, code editing, review, longer work |
Workflow Mode
Workflow Mode starts from POST /api/generate. The route validates the request, stores initial job state, enqueues a Bull job, and returns a jobId. The frontend then polls job status until completion.
sequenceDiagram
participant User
participant FE as Classic UI
participant API as generate.route.ts
participant Store as job-store
participant Queue as Bull queue
participant Worker as video.processor.ts
participant AI as AI provider
participant Guard as static-guard
participant Render as Manim executor
User->>FE: Enter concept and settings
FE->>API: POST /api/generate
API->>API: Validate request and sanitize concept
API->>Store: storeJobStage(jobId, analyzing)
API->>Queue: videoQueue.add(jobData)
API-->>FE: 202 Accepted with jobId
loop Poll status
FE->>API: GET /api/jobs/:jobId
API->>Store: Read tracking/result state
API-->>FE: status, stage, revision
end
Queue->>Worker: Process job
Worker->>AI: Scene design and code generation
AI-->>Worker: Manim Python code
Worker->>Guard: py_compile + mypy + patch loop
Guard-->>Worker: Refined code
Worker->>Render: Render video or image
Render-->>Worker: Output paths or error
Worker->>Store: Store completed/failed result
Request Shape
{
concept: string
problemPlan?: ProblemFramingPlan
outputMode: 'video' | 'image'
quality: 'low' | 'medium' | 'high'
code?: string
customApiConfig?: { apiUrl: string; apiKey: string; model: string }
promptOverrides?: PromptOverrides
videoConfig?: VideoConfig
referenceImages?: ReferenceImage[]
renderCacheKey?: string
}
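As a minimal sketch, the validation step in the route could look like the following. It covers only `concept`, `outputMode`, `quality`, and `code`; the function name `parseGenerateRequest`, the result type, and the rule that `concept` must be non-empty after trimming are assumptions for illustration, not StemFun's actual implementation.

```typescript
// Hypothetical validator for a subset of the request shape above.
type OutputMode = "video" | "image";
type Quality = "low" | "medium" | "high";

interface GenerateRequestCore {
  concept: string;
  outputMode: OutputMode;
  quality: Quality;
  code?: string;
}

type ParseResult =
  | { ok: true; value: GenerateRequestCore }
  | { ok: false; errors: string[] };

const OUTPUT_MODES: readonly string[] = ["video", "image"];
const QUALITIES: readonly string[] = ["low", "medium", "high"];

function parseGenerateRequest(body: unknown): ParseResult {
  if (typeof body !== "object" || body === null) {
    return { ok: false, errors: ["body must be an object"] };
  }
  const { concept, outputMode, quality, code } = body as Record<string, unknown>;
  const errors: string[] = [];

  // Assumption: a concept that is empty after trimming is rejected.
  const cleaned = typeof concept === "string" ? concept.trim() : "";
  if (cleaned === "") errors.push("concept must be a non-empty string");

  if (typeof outputMode !== "string" || OUTPUT_MODES.indexOf(outputMode) === -1) {
    errors.push("outputMode must be 'video' or 'image'");
  }
  if (typeof quality !== "string" || QUALITIES.indexOf(quality) === -1) {
    errors.push("quality must be 'low', 'medium', or 'high'");
  }
  if (code !== undefined && typeof code !== "string") {
    errors.push("code must be a string when provided");
  }
  if (errors.length > 0) return { ok: false, errors };

  const value: GenerateRequestCore = {
    concept: cleaned,
    outputMode: outputMode as OutputMode,
    quality: quality as Quality,
  };
  if (typeof code === "string") value.code = code;
  return { ok: true, value };
}
```

Collecting all errors before returning lets the route report every invalid field in a single response rather than one per round trip.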
Processor Flow Types
The queue processor dispatches into three flows:
| Flow | Trigger | Behavior |
|---|---|---|
| Generation flow | Default | Design scene, generate code, static guard, render |
| Pre-generated flow | code provided | Skip AI generation, check/render supplied code |
| Edit flow | Existing code plus edit instructions | AI edits code, static guard, render |
flowchart TD
Job["Bull job"] --> Kind{"Job kind"}
Kind -->|preGeneratedCode| Pre["Pre-generated flow"]
Kind -->|editCode + editInstructions| Edit["Edit flow"]
Kind -->|default| Gen["Generation flow"]
Pre --> Guard["Static guard"]
Edit --> EditAI["AI code edit"]
EditAI --> Guard
Gen --> Design["Scene design stage"]
Design --> Code["Code generation stage"]
Code --> Guard
Guard --> Render["Render step"]
Render --> Result{"Success?"}
Result -->|yes| Store["Store result"]
Result -->|no| Retry["Code retry"]
Retry --> Guard
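The dispatch in the diagram can be sketched as a small pure function. The field names `preGeneratedCode`, `editCode`, and `editInstructions` come from the diagram labels; the `VideoJobData` type, the stage names, and the precedence order (pre-generated before edit) are assumptions made for this example.

```typescript
// Hypothetical job payload; only the three trigger fields are taken from the diagram.
interface VideoJobData {
  concept: string;
  preGeneratedCode?: string;
  editCode?: string;
  editInstructions?: string;
}

type JobKind = "pre-generated" | "edit" | "generation";

function hasText(value: string | undefined): boolean {
  return typeof value === "string" && value.trim() !== "";
}

// Assumption: supplied code wins over edit fields when both are present.
function resolveJobKind(job: VideoJobData): JobKind {
  if (hasText(job.preGeneratedCode)) return "pre-generated";
  if (hasText(job.editCode) && hasText(job.editInstructions)) return "edit";
  return "generation";
}

// Stages each flow passes through before the result is stored (illustrative names).
const FLOW_STAGES: Record<JobKind, readonly string[]> = {
  "pre-generated": ["static-guard", "render"],
  edit: ["ai-edit", "static-guard", "render"],
  generation: ["scene-design", "code-generation", "static-guard", "render"],
};
```

All three flows converge on the same static guard and render steps, which is why only the leading stages differ.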
Problem Framing
Problem framing is an optional pre-generation step exposed by POST /api/problem-frame. It turns a raw concept into a structured teaching plan that can be reviewed by the user before generation.
sequenceDiagram
participant User
participant FE as Frontend
participant API as problem-frame.route.ts
participant AI as Problem framing model
User->>FE: Enter concept
FE->>API: concept, locale, feedback, reference images
API->>AI: Dedicated problemFraming prompt
AI-->>API: ProblemFramingPlan
API-->>FE: Plan
User->>FE: Approve or give feedback
FE->>API: Generate with problemPlan
ProblemFramingPlan fields:
| Field | Meaning |
|---|---|
| `mode` | `clarify` for a specific prompt, `invent` when the prompt needs structure |
| `headline` | Short title for the generation plan |
| `summary` | Brief teaching objective |
| `steps` | Up to six planned explanation steps |
| `visualMotif` | Repeated visual metaphor or style |
| `designerHint` | Downstream guidance for the scene design stage |
When passed to generation, the plan is merged into the concept text as problem framing context.
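One way the merge could work is sketched below. The field names match the table above, but the field types (plain strings, `steps` as a string array), the function name `mergePlanIntoConcept`, and the exact text layout are assumptions; the real prompt format is not shown in this document.

```typescript
// Assumed shape: the document lists the fields but not their types.
interface ProblemFramingPlan {
  mode: "clarify" | "invent";
  headline: string;
  summary: string;
  steps: string[];
  visualMotif: string;
  designerHint: string;
}

// Append the approved plan to the concept text as plain-text context.
function mergePlanIntoConcept(concept: string, plan: ProblemFramingPlan): string {
  // The plan allows up to six steps, so anything beyond that is dropped.
  const steps = plan.steps.slice(0, 6).map((step, i) => `${i + 1}. ${step}`);
  const header = [
    concept.trim(),
    "",
    "Problem framing context:",
    `Headline: ${plan.headline}`,
    `Summary: ${plan.summary}`,
    `Visual motif: ${plan.visualMotif}`,
    `Designer hint: ${plan.designerHint}`,
    "Steps:",
  ];
  return header.concat(steps).join("\n");
}
```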
Two-Stage AI Generation
StemFun separates creative planning from precise code generation.
flowchart TD
Prompt["Concept + settings + optional problem plan/images"] --> Design["Stage 1: Scene design"]
Design --> Plan["Structured visual plan"]
Plan --> CodeGen["Stage 2: Code generation"]
CodeGen --> Code["Manim Python code"]
Code --> Guard["Static guard"]
Stage 1: Scene Design
Implemented in src/services/concept-designer/scene-design-stage.ts.
- Uses the `conceptDesigner` system prompt.
- Receives concept, output mode, seed, and optional reference images.
- Produces a scene design plan describing objects, motion, pacing, visual structure, and teaching story.
- Supports vision input by attaching uploaded reference images as `image_url` message parts.
- Falls back to text-only input if a provider rejects images.
- Defaults: `DESIGNER_TEMPERATURE=0.8`, `DESIGNER_MAX_TOKENS=12000`, `DESIGNER_THINKING_TOKENS=20000`.
Stage 2: Code Generation
Implemented in src/services/concept-designer/code-from-design-stage.ts.
- Uses the `codeGeneration` system prompt.
- Receives the concept, seed, scene design, and output mode.
- Produces Manim Python code.
- Video mode extracts code from `### START ###`/`### END ###` anchors or markdown code fences.
- Image mode strips thinking tags and preserves raw output.
- Defaults: `AI_TEMPERATURE=0.7`, `AI_MAX_TOKENS=12000`.
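The video-mode extraction step might be implemented along these lines. The anchor strings are taken from the list above; the function name `extractManimCode`, the anchors-before-fences priority, and the fallback to the trimmed raw text are assumptions.

```typescript
const START_ANCHOR = "### START ###";
const END_ANCHOR = "### END ###";
// Three backticks, built by concatenation so the literal fence never appears in this example.
const FENCE = "`" + "`" + "`";

function extractManimCode(raw: string): string {
  // 1. Prefer the explicit anchors when both are present.
  const start = raw.indexOf(START_ANCHOR);
  const end = start === -1 ? -1 : raw.indexOf(END_ANCHOR, start + START_ANCHOR.length);
  if (start !== -1 && end !== -1) {
    return raw.slice(start + START_ANCHOR.length, end).trim();
  }

  // 2. Otherwise take the body of the first markdown code fence.
  const open = raw.indexOf(FENCE);
  if (open !== -1) {
    const bodyStart = raw.indexOf("\n", open); // skips the optional language tag
    const close = bodyStart === -1 ? -1 : raw.indexOf(FENCE, bodyStart);
    if (close !== -1) return raw.slice(bodyStart + 1, close).trim();
  }

  // 3. Assumed fallback: treat the whole response as code.
  return raw.trim();
}
```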
Static Guard
The static guard validates generated or edited code before rendering. It reduces wasted render attempts and gives the AI a constrained patch loop.
flowchart TD
Code["Generated code"] --> Py["py_compile"]
Py --> Syntax{"Syntax/import issue?"}
Syntax -->|yes| Patch["AI patch prompt"]
Syntax -->|no| Mypy["mypy"]
Mypy --> Type{"Type/common issue?"}
Type -->|yes| Hardcoded["Known hardcoded fixes"]
Hardcoded --> Remaining{"Issues remain?"}
Remaining -->|yes| Patch
Remaining -->|no| Pass["Guard passed"]
Type -->|no| Pass
Patch --> Apply["Apply [[PATCH]] blocks"]
Apply --> Py
The guard runs:
- Python compile validation.
- `mypy` validation with line/column diagnostics.
- Known hardcoded repairs, such as Manim list-to-tuple parameter fixes.
- AI patch repair using `[[PATCH]]`, `[[SEARCH]]`, `[[REPLACE]]`, `[[END]]` blocks.
- Re-validation up to `STATIC_GUARD_MAX_PASSES`, default `3`.
For image mode with multiple YON_IMAGE blocks, each scene block is checked separately and diagnostics are mapped back to original line offsets.
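To make the patch step concrete, here is a sketch of parsing and applying search/replace blocks. The four markers are from the list above, but the exact block layout (each marker on its own line, search text before replacement text) and the names `parsePatches` and `applyPatches` are assumptions.

```typescript
interface Patch {
  search: string;
  replace: string;
}

// Assumed layout: [[PATCH]] / [[SEARCH]] / old text / [[REPLACE]] / new text / [[END]]
function parsePatches(response: string): Patch[] {
  const pattern =
    /\[\[PATCH\]\]\s*\[\[SEARCH\]\]\n([\s\S]*?)\n\[\[REPLACE\]\]\n([\s\S]*?)\n\[\[END\]\]/g;
  const patches: Patch[] = [];
  let match: RegExpExecArray | null;
  while ((match = pattern.exec(response)) !== null) {
    patches.push({ search: match[1], replace: match[2] });
  }
  return patches;
}

// Apply each patch to its first occurrence; report patches whose search text is missing.
function applyPatches(code: string, patches: Patch[]): { code: string; failed: Patch[] } {
  let current = code;
  const failed: Patch[] = [];
  for (const patch of patches) {
    const at = current.indexOf(patch.search);
    if (at === -1) {
      failed.push(patch);
      continue;
    }
    // Slicing instead of String.replace avoids "$" being treated as a replacement pattern.
    current = current.slice(0, at) + patch.replace + current.slice(at + patch.search.length);
  }
  return { code: current, failed };
}
```

Returning the unmatched patches separately gives the caller a signal it can use to decide whether another guard pass is worthwhile.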
Render Pipeline
The render step converts validated Manim code into final media. It supports video and image output paths.
flowchart TD
Ready["Guarded code"] --> Mode{"Output mode"}
Mode -->|video| Video["renderVideo"]
Mode -->|image| Image["renderImages"]
Video --> Exec["Manim executor"]
Image --> Exec
Exec --> OK{"Render OK?"}
OK -->|yes video| BGM["Optional BGM mix"]
OK -->|yes image| Done["Store image result"]
BGM --> DoneVideo["Store video result"]
OK -->|no| Retry["Code retry manager"]
Retry --> RetryMode{"Repair mode"}
RetryMode -->|patch| Guard["Static guard again"]
RetryMode -->|full_code| Exec
Guard --> Exec
The Manim executor:
- Spawns a Manim subprocess through `src/utils/manim-executor.ts`.
- Supports low, medium, and high resolution settings.
- Applies request-level and default timeout settings.
- Registers active processes so cancellation can terminate them.
- Monitors peak memory and logs render progress.
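As an illustration of the first three points, the sketch below builds the argument list for a Manim subprocess and picks a timeout. The `-ql`, `-qm`, `-qh`, `-s`, and `--media_dir` flags are standard ManimCE CLI options; how StemFun maps its own settings onto them, and the names `buildManimArgs` and `resolveTimeoutMs`, are assumptions.

```typescript
type RenderQuality = "low" | "medium" | "high";

interface RenderOptions {
  quality: RenderQuality;
  outputMode: "video" | "image";
  scriptPath: string;
  sceneName: string;
  mediaDir: string;
}

// ManimCE quality flags for low, medium, and high quality renders.
const QUALITY_FLAG: Record<RenderQuality, string> = {
  low: "-ql",
  medium: "-qm",
  high: "-qh",
};

// Arguments for `manim <args>`; spawning the process itself is left to the caller.
function buildManimArgs(opts: RenderOptions): string[] {
  const args = [QUALITY_FLAG[opts.quality]];
  // Assumption: image mode saves only the last frame of the scene (-s).
  if (opts.outputMode === "image") args.push("-s");
  args.push("--media_dir", opts.mediaDir, opts.scriptPath, opts.sceneName);
  return args;
}

// Assumption: a positive request-level timeout overrides the configured default.
function resolveTimeoutMs(requestMs: number | undefined, defaultMs: number): number {
  if (typeof requestMs === "number" && isFinite(requestMs) && requestMs > 0) {
    return requestMs;
  }
  return defaultMs;
}
```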
Video output may be mixed with background music using src/audio/bgm-mixer.ts. If mixing fails, the original video remains valid.
Render Repair and Retry
Static validation cannot catch every Manim runtime problem. The code retry manager repairs render failures using error context from Manim stderr.
stateDiagram-v2
[*] --> RenderAttempt
RenderAttempt --> Success: Manim exits cleanly
RenderAttempt --> ExtractError: Manim fails
ExtractError --> PatchMode: early attempt and patchable error
ExtractError --> FullCodeMode: repeated error, syntax error, or late attempt
PatchMode --> StaticGuard: apply targeted patch
StaticGuard --> RenderAttempt
FullCodeMode --> RenderAttempt: regenerate full file
RenderAttempt --> Failed: retry budget exhausted
Success --> [*]
Failed --> [*]
Retry behavior:
| Mechanism | Detail |
|---|---|
| Max retries | CODE_RETRY_MAX_RETRIES, default 4 |
| Patch mode | AI returns targeted patch blocks around the failing code |
| Full-code mode | AI regenerates the whole file |
| Escalation triggers | Attempt >= 3, SyntaxError, IndentationError, failed patch application, repeated normalized error signature |
| Temperature | CODE_RETRY_TEMPERATURE, default 0.1 |
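The retry manager normalizes error signatures by removing unstable line numbers, paths, and string literals. Two failures that differ only in those details then compare as the same error, which triggers escalation to full-code mode instead of repeating the same failed repair.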
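A rough sketch of signature normalization and the escalation rule follows. The escalation triggers mirror the table above; the regular expressions, the 1-based attempt counter, and the names `normalizeErrorSignature`, `RetryContext`, and `chooseRepairMode` are illustrative assumptions.

```typescript
// Replace the unstable parts of a Manim/Python error so equal errors compare equal.
function normalizeErrorSignature(stderr: string): string {
  return stderr
    .replace(/(["'])(?:\\.|(?!\1).)*\1/g, "<str>") // quoted string literals
    .replace(/(?:[A-Za-z]:)?[\\/][^\s:,"']+/g, "<path>") // unquoted file paths
    .replace(/\bline \d+/gi, "line <n>") // traceback line numbers
    .replace(/\s+/g, " ")
    .trim();
}

interface RetryContext {
  attempt: number; // assumed to be 1-based
  errorType: string; // e.g. "NameError"
  signature: string; // normalized signature of the current failure
  seenSignatures: string[]; // signatures from earlier attempts
  previousPatchFailed: boolean; // the last patch could not be applied
}

type RepairMode = "patch" | "full_code";

function chooseRepairMode(ctx: RetryContext): RepairMode {
  if (ctx.attempt >= 3) return "full_code";
  if (ctx.errorType === "SyntaxError" || ctx.errorType === "IndentationError") return "full_code";
  if (ctx.previousPatchFailed) return "full_code";
  if (ctx.seenSignatures.indexOf(ctx.signature) !== -1) return "full_code";
  return "patch";
}
```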
Job Management
All Workflow Mode render work is asynchronous. Redis stores queue state, job stages, access metadata, cancellation flags, and final results.
stateDiagram-v2
[*] --> Queued: POST /api/generate
Queued --> Processing: worker active
Processing --> Completed: render succeeds
Processing --> Failed: retry exhausted
Processing --> Cancelled: cancel active job
Queued --> Cancelled: cancel queued job
Completed --> [*]
Failed --> [*]
Cancelled --> [*]
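The state diagram above translates directly into a transition table. The lowercase status strings and the helper names are assumptions; the allowed transitions are exactly those drawn in the diagram.

```typescript
type JobStatus = "queued" | "processing" | "completed" | "failed" | "cancelled";

// Allowed next states, as drawn in the diagram.
const JOB_TRANSITIONS: Record<JobStatus, readonly JobStatus[]> = {
  queued: ["processing", "cancelled"],
  processing: ["completed", "failed", "cancelled"],
  completed: [],
  failed: [],
  cancelled: [],
};

function canTransition(from: JobStatus, to: JobStatus): boolean {
  return JOB_TRANSITIONS[from].indexOf(to) !== -1;
}

// A status with no outgoing transitions is terminal, so polling can stop.
function isTerminal(status: JobStatus): boolean {
  return JOB_TRANSITIONS[status].length === 0;
}
```

A polling client can stop as soon as `isTerminal` returns true for the reported status.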
Redis key patterns:
| Key | Purpose |
|---|---|
| `job:result:<jobId>` | Final completed/failed result |
| `job:result:stage:<jobId>` | Current processing stage |
| `job:result:tracking:<jobId>` | Revision, status, attempt, timestamps |
| `job:cancel:<jobId>` | Cancellation flag and reason |
| `concept:cache:<hash>` | Cached generation data |
Processing stages:
flowchart LR
A["analyzing"] --> B["generating"]
B --> C["refining"]
C --> D["rendering"]
D --> E{"terminal"}
E --> F["completed"]
E --> G["failed"]
Important behavior:
- `GET /api/jobs/:jobId` checks Bull state first, then Redis result state.
- Job access is bound to API key hash and browser client ID.
- `POST /api/jobs/:jobId/cancel` marks cancellation, removes queued jobs, or kills active Manim processes.
- Job result retention defaults to 24 hours.
- Media cleanup removes old generated files from `public/images/` and `public/videos/`.
Agent Mode
Agent Mode is powered by the Studio Agent runtime in src/studio-agent/. It is designed for iterative work where an AI agent can operate inside a session workspace.
graph TB
subgraph Frontend
Shell["StudioShell"]
Command["StudioCommandPanel"]
Assets["StudioAssetsPanel"]
Pipeline["StudioPipelinePanel"]
end
subgraph API
Route["studio-agent.route.ts"]
SSE["SSE endpoint"]
end
subgraph Runtime
Runner["StudioSessionRunner"]
Builder["StudioBuilderRuntime"]
Processor["StudioRunProcessor"]
Loop["OpenAI tool loop"]
Planner["TurnPlanResolver"]
end
subgraph Domain
Sessions["Session store"]
Runs["Run store"]
Tasks["Task store"]
Works["Work store"]
Results["WorkResult store"]
end
subgraph Tools
Registry["Tool registry"]
FileOps["read, glob, grep, ls, write, edit, apply-patch"]
AgentOps["task, skill, question"]
RenderOps["render, static-check, ai-review"]
end
subgraph Permissions
Permission["Permission service"]
Policy["Permission policy"]
end
Shell --> Route
Command --> Route
Route --> Runner
Route --> SSE
Runner --> Builder
Runner --> Planner
Runner --> Loop
Loop --> Registry
Registry --> FileOps
Registry --> AgentOps
Registry --> RenderOps
Runner --> Processor
Processor --> Sessions
Processor --> Runs
Processor --> Tasks
Processor --> Works
Processor --> Results
Loop --> Permission
Permission --> Policy
SSE --> Shell
SSE --> Assets
SSE --> Pipeline
Session Hierarchy
erDiagram
Session ||--o{ Run : has
Session ||--o{ Message : has
Session ||--o{ Work : has
Session ||--o{ Task : has
Session ||--o{ SessionEvent : emits
Run ||--o{ Task : triggers
Run ||--o{ Work : produces
Work ||--o{ WorkResult : has
Task }o--|| Work : belongs_to
| Entity | Meaning |
|---|---|
| Session | Long-lived workspace context with studio kind, agent type, permissions, messages, runs, tasks, and works |
| Run | One user interaction cycle; only one active run is allowed per session |
| Message | User, assistant, system, or tool message part |
| Task | A concrete step such as tool execution, render, static check, review, or subagent run |
| Work | A user-visible work item such as video, plot, design, review, edit, or render-fix |
| WorkResult | Output artifact such as render output, review report, design plan, edit result, or failure report |
Run Lifecycle
stateDiagram-v2
[*] --> Session: create session
Session --> PendingRun: user sends message
PendingRun --> Running
Running --> TurnPlanning
TurnPlanning --> ToolExecution: planned/direct tool call
TurnPlanning --> AgentLoop: autonomous model turn
AgentLoop --> ToolExecution: model requests tool
ToolExecution --> TurnPlanning: continue
ToolExecution --> PermissionWait: permission required
PermissionWait --> ToolExecution: approved
PermissionWait --> Failed: rejected
Running --> Completed: finished
Running --> Failed: unrecoverable error
Running --> Cancelled: user cancels
Agent Roles
| Role | Purpose | Tool access |
|---|---|---|
| `builder` | Main implementation agent; writes code, edits files, renders, manages tasks | All tools |
| `reviewer` | Reviews code and output quality | Read-only tools and review tools |
| `designer` | Produces visual concepts, scene plans, and storyboards | Read-only tools |
The builder can spawn reviewer or designer subagents through the task tool. Child sessions inherit parent permission rules but cannot recursively spawn more tasks.
Studio Turn Planning
Before using the autonomous AI tool loop, the runtime parses user intent and may choose a direct plan.
flowchart TD
Input["User input"] --> Parse["TurnPlanIntent: parse"]
Parse --> Slash{"Slash command?"}
Parse --> Files{"File references?"}
Parse --> Skill{"Skill requested?"}
Parse --> Review{"Review/design intent?"}
Slash --> Policy["TurnPlanPolicy"]
Files --> Policy
Skill --> Policy
Review --> Policy
Policy --> Decision{"Decision"}
Decision -->|continue-current-work| Continue["Continue active work"]
Decision -->|task-intent| Task["Create subagent task"]
Decision -->|direct-tool| Tool["Run direct tool"]
Decision -->|none| Loop["Autonomous AI tool loop"]
This makes simple commands predictable. For example, /read file.ts can become a direct read tool call instead of a full autonomous model turn.
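The `/read file.ts` case could be handled by a resolver like the sketch below. The decision kinds `direct-tool`, `task-intent`, and `none` come from the diagram; the set of tools that get a slash shortcut and the `/review` and `/design` commands are hypothetical, and the real `TurnPlanPolicy` also weighs file references, skills, and active work.

```typescript
type TurnDecision =
  | { kind: "direct-tool"; tool: string; args: string[] }
  | { kind: "task-intent"; agent: "reviewer" | "designer" }
  | { kind: "none" };

// Assumption: only read-only file tools are exposed as slash shortcuts.
const DIRECT_TOOLS: readonly string[] = ["read", "glob", "grep", "ls"];

function planTurn(input: string): TurnDecision {
  const slash = /^\/([a-z-]+)(?:\s+(.*))?$/.exec(input.trim());
  if (slash !== null) {
    const name = slash[1];
    const args = slash[2] ? slash[2].split(/\s+/) : [];
    if (DIRECT_TOOLS.indexOf(name) !== -1) {
      return { kind: "direct-tool", tool: name, args };
    }
    // Hypothetical commands that hand work to a subagent.
    if (name === "review") return { kind: "task-intent", agent: "reviewer" };
    if (name === "design") return { kind: "task-intent", agent: "designer" };
  }
  // Anything else falls through to the autonomous tool loop.
  return { kind: "none" };
}
```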
Studio Tools and Permissions
Tools are registered through src/studio-agent/tools/registry.ts. Each tool declares name, category, permission, allowed agents, allowed studio kinds, and an execute function.
flowchart LR
Agent["Agent"] --> Request["Tool request"]
Request --> Registry["Tool registry"]
Registry --> Allowed{"Allowed for role and studio kind?"}
Allowed -->|no| Reject["Reject"]
Allowed -->|yes| Permission{"Permission allowed?"}
Permission -->|auto| Execute["Execute tool"]
Permission -->|ask| Ask["Emit permission.asked"]
Ask --> User["User decision"]
User -->|approve| Execute
User -->|reject| Reject
Execute --> Result["Tool result event"]
Tool groups:
| Group | Tools |
|---|---|
| File operations | read, glob, grep, ls, write, edit, apply-patch |
| Agent operations | task, skill, question |
| Render operations | render, static-check, ai-review |
Permission levels:
| Level | Default behavior |
|---|---|
| `L0` | No auto-allowed tools |
| `L1` | No auto-allowed tools |
| `L2` | Auto-allow read-only file tools |
| `L3` | L2 plus more operational access |
| `L4` | Auto-allow all tools |
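The role check and permission levels combine roughly as in this sketch. The `readOnly` flag, the `ToolDef` shape, and the function name `authorize` are assumptions, and because the extra access granted at `L3` is not specified here, the sketch treats `L3` the same as `L2`.

```typescript
type AgentRole = "builder" | "reviewer" | "designer";
type PermissionLevel = 0 | 1 | 2 | 3 | 4;
type Verdict = "execute" | "ask" | "reject";

// Hypothetical subset of a tool declaration.
interface ToolDef {
  name: string;
  category: "file" | "agent" | "render";
  readOnly: boolean;
  allowedAgents: readonly AgentRole[];
}

function authorize(tool: ToolDef, role: AgentRole, level: PermissionLevel): Verdict {
  // Role scoping comes first: a tool outside the role's set is never offered for approval.
  if (tool.allowedAgents.indexOf(role) === -1) return "reject";
  if (level >= 4) return "execute"; // L4 auto-allows everything
  // L2 and above auto-allow read-only file tools (L3 extras are not modeled here).
  if (level >= 2 && tool.category === "file" && tool.readOnly) return "execute";
  return "ask"; // emit permission.asked and wait for the user
}
```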
All workspace file operations resolve paths through workspace guards so tool calls cannot escape the session directory.
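A workspace guard can be approximated with plain string handling, as sketched here for POSIX-style relative paths. The function name `resolveInWorkspace` is hypothetical, and a production guard would more likely use Node's `path` module and also account for symlinks and Windows separators.

```typescript
// Resolve a tool-supplied relative path inside the session directory, or throw.
function resolveInWorkspace(root: string, requested: string): string {
  if (requested.charAt(0) === "/") {
    throw new Error("absolute paths are not allowed");
  }
  const parts: string[] = [];
  for (const segment of requested.split("/")) {
    if (segment === "" || segment === ".") continue;
    if (segment === "..") {
      // Stepping above the workspace root is the escape this guard prevents.
      if (parts.length === 0) throw new Error("path escapes the workspace");
      parts.pop();
    } else {
      parts.push(segment);
    }
  }
  const base = root.replace(/\/+$/, ""); // drop trailing slashes on the root
  return [base].concat(parts).join("/");
}
```

For example, `scenes/../main.py` resolves to a file directly under the session root, while `../etc/passwd` is rejected.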
Studio Event Flow
Agent Mode uses Server-Sent Events to keep the UI synchronized while runs execute in the background.
sequenceDiagram
participant FE as Studio UI
participant API as Studio API
participant Runtime as Runtime
participant Tools as Tools
participant Bus as Event bus
FE->>API: POST /api/studio-agent/sessions
API-->>FE: session
FE->>API: GET /api/studio-agent/events
API-->>FE: studio.connected
FE->>API: POST /api/studio-agent/runs
API->>Runtime: startBackgroundRun
Runtime->>Bus: run_updated
Runtime->>Tools: execute tool call
Tools->>Bus: tool_input_start / tool_call
Tools-->>Runtime: result
Runtime->>Bus: tool_result / task_updated / work_updated
Bus-->>FE: SSE events
Runtime->>Bus: run_updated completed
Common events:
| Event | Meaning |
|---|---|
| `studio.connected` | SSE stream connected |
| `studio.heartbeat` | Keep-alive event |
| `run_updated` | Run status changed |
| `assistant_text` | Assistant text streamed |
| `tool_input_start` | Tool input started |
| `tool_call` | Tool call received |
| `tool_result` | Tool output or error |
| `task_updated` | Task lifecycle changed |
| `work_updated` | Work lifecycle changed |
| `permission.asked` | User approval required |
| `permission.replied` | User approved or rejected |
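For readers unfamiliar with the wire format, the sketch below parses one Server-Sent Events frame into an event name and payload. The `event:` and `data:` fields are part of the SSE format itself; whether StemFun carries the event name in the `event:` field or inside the JSON payload is an assumption, as is the name `parseSseFrame`. In a browser, `EventSource` performs this parsing automatically.

```typescript
interface StudioEvent {
  event: string;
  data: unknown;
}

// Parse a single frame (the text between two blank lines of the stream).
function parseSseFrame(frame: string): StudioEvent | null {
  let event = "message"; // SSE default when no event field is present
  const dataLines: string[] = [];
  for (const line of frame.split(/\r?\n/)) {
    if (line === "" || line.charAt(0) === ":") continue; // blank line or comment
    const colon = line.indexOf(":");
    const field = colon === -1 ? line : line.slice(0, colon);
    let value = colon === -1 ? "" : line.slice(colon + 1);
    if (value.charAt(0) === " ") value = value.slice(1); // one optional leading space
    if (field === "event") event = value;
    else if (field === "data") dataLines.push(value);
  }
  if (dataLines.length === 0) return null; // comment-only frames carry no data
  const text = dataLines.join("\n");
  try {
    return { event, data: JSON.parse(text) };
  } catch (err) {
    return { event, data: text }; // keep non-JSON payloads as plain text
  }
}
```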
AI Routing
StemFun routes AI calls to OpenAI-compatible upstream providers. A request can provide customApiConfig; otherwise the server resolves routing by configured StemFun route keys.
flowchart TD
Request["Incoming request"] --> Custom{"customApiConfig present?"}
Custom -->|yes| UseCustom["Use request API URL/key/model"]
Custom -->|no| Bearer["Read bearer key"]
Bearer --> Route{"Key mapped in STEMFUN_ROUTE_*?"}
Route -->|yes| UseRoute["Use mapped upstream config"]
Route -->|no| Error["Reject: no upstream configured"]
UseCustom --> Client["OpenAI-compatible client"]
UseRoute --> Client
Client --> Provider["Provider completion API"]
Routing environment variables:
| Variable | Purpose |
|---|---|
| `STEMFUN_ROUTE_KEYS` | Public keys accepted by StemFun |
| `STEMFUN_ROUTE_API_URLS` | Upstream API base URLs |
| `STEMFUN_ROUTE_API_KEYS` | Upstream provider keys |
| `STEMFUN_ROUTE_MODELS` | Model names for each route |
This lets different users or deployments route to different providers without changing code.
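The resolution order in the flowchart can be sketched as follows. The four variable names are from the table above, but the encoding (comma-separated lists aligned by index) is an assumption, as are the `UpstreamConfig` type and the function name `resolveUpstream`.

```typescript
interface UpstreamConfig {
  apiUrl: string;
  apiKey: string;
  model: string;
}

type Env = { [name: string]: string | undefined };

// Assumption: each variable holds a comma-separated list, aligned by position.
function splitList(value: string | undefined): string[] {
  return (value || "")
    .split(",")
    .map((item) => item.trim())
    .filter((item) => item !== "");
}

function resolveUpstream(env: Env, bearerKey: string | undefined, custom?: UpstreamConfig): UpstreamConfig {
  // A request-level customApiConfig always wins.
  if (custom !== undefined) return custom;

  const keys = splitList(env["STEMFUN_ROUTE_KEYS"]);
  const urls = splitList(env["STEMFUN_ROUTE_API_URLS"]);
  const apiKeys = splitList(env["STEMFUN_ROUTE_API_KEYS"]);
  const models = splitList(env["STEMFUN_ROUTE_MODELS"]);

  const index = bearerKey === undefined ? -1 : keys.indexOf(bearerKey);
  if (index === -1 || index >= urls.length || index >= apiKeys.length || index >= models.length) {
    throw new Error("no upstream configured");
  }
  return { apiUrl: urls[index], apiKey: apiKeys[index], model: models[index] };
}
```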
Frontend Workflow
The frontend is a React SPA with classic generation screens and Studio workspaces.
flowchart TB
App["frontend/src/App.tsx"] --> Classic["Classic generator"]
App --> ManimStudio["Manim Studio"]
App --> PlotStudio["Plot Studio"]
App --> Game["Wait-state game"]
Classic --> UseGeneration["useGeneration"]
Classic --> ProblemFrame["useProblemFraming"]
Classic --> Upload["Reference image upload"]
UseGeneration --> GenerateAPI["POST /api/generate"]
UseGeneration --> Poll["GET /api/jobs/:jobId"]
ManimStudio --> SessionHook["useStudioSession"]
ManimStudio --> RunHook["useStudioRun"]
ManimStudio --> EventHook["useStudioEvents"]
EventHook --> Store["studio-session-store"]
Store --> Panels["Command, Assets, Pipeline panels"]
Classic UI:
- Creates generation jobs.
- Displays problem framing plans.
- Uploads reference images.
- Polls job status and result revisions.
- Recovers active job IDs from browser session storage.
Studio UI:
- Creates or loads Studio sessions.
- Starts background runs.
- Subscribes to SSE updates.
- Shows tasks, works, permissions, review findings, and render outputs.
Long Animation and Segmented Rendering
The render system supports segmented output for longer or more complex generations. AI-generated code can include YON_IMAGE anchors that split the output into independently renderable segments.
flowchart LR
Code["Generated code"] --> Detect["Detect YON_IMAGE anchors"]
Detect --> Split["Split into segments"]
Split --> Render1["Render segment 1"]
Split --> Render2["Render segment 2"]
Split --> RenderN["Render segment N"]
Render1 --> Combine["Combine or return ordered outputs"]
Render2 --> Combine
RenderN --> Combine
Combine --> Final["Final media result"]
Benefits:
- Reduces timeout risk for long scenes.
- Allows partial isolation of render failures.
- Supports image-mode multi-scene checks with line-offset diagnostics.
Operational Workflow
For local development:
npm install
cd frontend && npm install
cd ..
npm run dev
Expected runtime dependencies:
| Dependency | Purpose |
|---|---|
| Redis 7+ | Bull queue, job state, cancellation state |
| Python 3.11+ | Manim execution |
| ManimCE 0.19.x | Mathematical animation rendering |
| LaTeX | Equation rendering |
| ffmpeg | Video output and audio mixing |
| AI provider key | Scene design, code generation, repair, review |
Development workflow:
flowchart TD
Change["Code/doc change"] --> Read["Read README and relevant docs"]
Read --> Implement["Implement scoped update"]
Implement --> Check["Run targeted checks"]
Check --> Result{"Pass?"}
Result -->|yes| Report["Summarize changed files and verification"]
Result -->|no| Fix["Fix syntax/runtime issue"]
Fix --> Check
Recommended checks depend on scope:
| Scope | Check |
|---|---|
| Backend TypeScript | npm run build or targeted TypeScript check |
| Frontend React | cd frontend && npm run build |
| Studio behavior | Relevant unit tests in frontend/src/studio or src/studio-agent |
| Manim render behavior | Run a small render job or static guard path |
| Docs only | Verify Mermaid syntax and links by reading generated Markdown |
Key Design Decisions
- Async queue instead of synchronous rendering: Manim renders can take minutes, so the API returns a job ID immediately and lets the frontend poll.
- Two-stage AI generation: creative scene planning and precise code generation are separate model calls.
- Static guard before render: syntax and type checks catch many AI code mistakes before expensive rendering.
- Render retry after runtime failure: Manim stderr becomes structured context for AI repair.
- OpenAI-compatible routing: deployments can use different model providers without code changes.
- Studio session hierarchy: Session, Run, Task, Work, and WorkResult make long-running agent work inspectable.
- Tool registry: Agent capabilities are declarative, role-scoped, and permission-gated.
- SSE for Agent Mode: users see live tool calls, permissions, task progress, and work updates.
- Workspace sandboxing: file tools operate inside the session workspace only.
- Cancellation and cleanup: active Manim processes can be terminated, and old generated media is removed on a schedule.