LLM Integration
The LLM module wraps the language model API (OpenAI-compatible) and exposes structured generation and analysis endpoints to the rest of the backend. It handles prompt assembly, context injection, rate limiting, and response parsing.
Prompt assembly pipeline
- Preprocessing: raw user text passes through POST /llm/prompt-preprocess, which cleans, expands abbreviations, and structures the input
- Context injection: the LLM service fetches the project's world rules, characters, and locations and prepends them as system context
- Generation: the assembled prompt is sent to the LLM API
- Logging: the full request/response is stored in llm_logs for admin review (the full flow is sketched below)
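A minimal sketch of this pipeline, assuming an OpenAI-compatible /chat/completions endpoint; the helper names (preprocess, logExchange) and payload shapes are illustrative assumptions, not the actual service internals:

```typescript
// Sketch of the prompt assembly pipeline inside the LLM service.
// preprocess() and logExchange() are hypothetical stand-ins for the logic
// behind /llm/prompt-preprocess and the llm_logs writer.
interface ProjectContext {
  worldRules: string[];
  characters: string[];
  locations: string[];
}

async function preprocess(rawText: string): Promise<string> {
  return rawText.trim(); // real implementation cleans, expands abbreviations, structures
}

async function logExchange(request: unknown, response: unknown): Promise<void> {
  console.log(JSON.stringify({ request, response })); // real implementation writes to llm_logs
}

async function generate(rawText: string, ctx: ProjectContext): Promise<string> {
  // 1. Preprocessing: clean and structure the raw user input.
  const structuredPrompt = await preprocess(rawText);

  // 2. Context injection: prepend world rules, characters, locations as system context.
  const systemContext = [
    "World rules:", ...ctx.worldRules,
    "Characters:", ...ctx.characters,
    "Locations:", ...ctx.locations,
  ].join("\n");

  // 3. Generation: send the assembled prompt to the LLM API (OpenAI-compatible).
  const response = await fetch(`${process.env.LLM_BASE_URL}/chat/completions`, {
    method: "POST",
    headers: {
      Authorization: `Bearer ${process.env.LLM_API_KEY}`,
      "Content-Type": "application/json",
    },
    body: JSON.stringify({
      model: process.env.LLM_MODEL,
      messages: [
        { role: "system", content: systemContext },
        { role: "user", content: structuredPrompt },
      ],
    }),
  });
  const completion = await response.json();

  // 4. Logging: store the full request/response for admin review.
  await logExchange({ systemContext, structuredPrompt }, completion);

  return completion.choices[0].message.content;
}
```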
Rate limiting
LLM endpoints are split across three rate-limit buckets:
| Bucket | Applied to |
|---|---|
| Generation | /ideas, /alternatives, /prompt-preprocess, /generate-scenes |
| Analysis | /canon-check (POST), /continuity-check (POST), /safety-check (POST) |
| Approve | /final-approve |
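One way the bucket mapping could be expressed, sketched with an in-memory fixed-window limiter; the per-bucket limits and the routeBucket map below are illustrative assumptions, not the actual configuration:

```typescript
// Illustrative route-to-bucket mapping; limits are assumptions, not documented values.
type Bucket = "generation" | "analysis" | "approve";

const routeBucket: Record<string, Bucket> = {
  "/ideas": "generation",
  "/alternatives": "generation",
  "/prompt-preprocess": "generation",
  "/generate-scenes": "generation",
  "/canon-check": "analysis",
  "/continuity-check": "analysis",
  "/safety-check": "analysis",
  "/final-approve": "approve",
};

// Hypothetical requests-per-minute limits, for illustration only.
const bucketLimits: Record<Bucket, number> = {
  generation: 30,
  analysis: 60,
  approve: 10,
};

const counters = new Map<Bucket, { count: number; windowStart: number }>();

function allowRequest(path: string, now = Date.now()): boolean {
  const bucket = routeBucket[path];
  if (!bucket) return true; // route not rate-limited by the LLM module

  const windowMs = 60_000;
  const entry = counters.get(bucket) ?? { count: 0, windowStart: now };
  if (now - entry.windowStart >= windowMs) {
    entry.count = 0;
    entry.windowStart = now;
  }
  entry.count += 1;
  counters.set(bucket, entry);
  return entry.count <= bucketLimits[bucket];
}
```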
Quality check pattern
Each of the three quality checks (canon, continuity, safety) follows the same pattern:
POST /<check-type> ← triggers async analysis, returns jobId or immediate result
GET /<check-type>/:episodeId ← returns the cached result
Results are stored per-episode and overwritten each time a new check runs.
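A minimal client-side sketch of this pattern; the response field names (jobId, status, result), the polling interval, and the path prefix are assumptions:

```typescript
// Trigger a quality check and read back the per-episode cached result.
// Field names and the path prefix are illustrative assumptions.
type CheckType = "canon-check" | "continuity-check" | "safety-check";

async function runCheck(checkType: CheckType, episodeId: string): Promise<unknown> {
  // POST triggers the async analysis; it may return a jobId or an immediate result.
  const started = await fetch(`/${checkType}`, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ episodeId }),
  }).then((r) => r.json());

  if (started.result) return started.result; // immediate result, no polling needed

  // Otherwise poll the GET endpoint until the cached result is available.
  for (let attempt = 0; attempt < 10; attempt++) {
    await new Promise((resolve) => setTimeout(resolve, 2000));
    const cached = await fetch(`/${checkType}/${episodeId}`).then((r) => r.json());
    if (cached.status !== "pending") return cached;
  }
  throw new Error(`${checkType} did not finish in time`);
}
```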
Canon check
Uses the project's world rules as constraints. For each scene block, the model evaluates whether the content violates any rule and returns a structured list of violations with severity and suggested fixes.
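The exact response shape isn't specified beyond "violations with severity and suggested fixes"; a plausible TypeScript shape, purely as an assumption:

```typescript
// Hypothetical canon-check result shape; all field names are assumptions.
interface CanonViolation {
  sceneBlockId: string;                    // scene block where the violation was found
  rule: string;                            // the world rule that was violated
  severity: "low" | "medium" | "high";
  excerpt: string;                         // offending passage
  suggestedFix: string;                    // model-proposed rewrite
}

interface CanonCheckResult {
  episodeId: string;
  passed: boolean;
  violations: CanonViolation[];
}
```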
Continuity check
Tracks character state, timeline, and location consistency across scenes. Flags cases where a character is in two places at once, or refers to knowledge they shouldn't have yet.
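A continuity flag might carry the character, the type of conflict, and the scenes involved; this shape is an assumption for illustration:

```typescript
// Hypothetical continuity-check flag; field names are assumptions.
interface ContinuityFlag {
  character: string;
  issue: "location_conflict" | "timeline_conflict" | "unknown_knowledge";
  scenes: string[];                        // scene IDs involved in the conflict
  description: string;                     // human-readable explanation
}
```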
Safety check
Evaluates content against configurable content policy rules (managed via Admin → Analytics Settings). Returns an age-rating recommendation and any flagged passages.
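A possible result shape combining the age-rating recommendation with flagged passages; again an assumption, not the documented schema:

```typescript
// Hypothetical safety-check result; field names and rating format are assumptions.
interface SafetyCheckResult {
  episodeId: string;
  ageRating: string;                       // recommended rating, e.g. "12+"
  flaggedPassages: {
    sceneBlockId: string;
    excerpt: string;
    policyRule: string;                    // the content policy rule that matched
  }[];
}
```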
Final approval
POST /llm/final-approve records that all three checks passed and transitions the episode to an approved state. Requires LLM_APPROVE.
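A sketch of the approval call from a client holding LLM_APPROVE; the request body shape and auth header are assumptions:

```typescript
// Record final approval once canon, continuity, and safety checks have passed.
// The body shape ({ episodeId }) and bearer-token auth are assumptions.
async function finalApprove(episodeId: string, authToken: string): Promise<unknown> {
  const res = await fetch("/llm/final-approve", {
    method: "POST",
    headers: {
      Authorization: `Bearer ${authToken}`,
      "Content-Type": "application/json",
    },
    body: JSON.stringify({ episodeId }),
  });
  if (!res.ok) throw new Error(`final-approve failed: ${res.status}`);
  return res.json(); // episode transitions to the approved state
}
```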
Admin visibility
All LLM request/response logs are viewable at Admin → Logs → LLM Dump. This is useful for debugging unexpected model output or auditing generation costs.
Environment configuration
The LLM service reads its configuration from environment variables:
| Variable | Purpose |
|---|---|
| LLM_API_KEY | API key for the language model provider |
| LLM_MODEL | Model name (e.g. gpt-4o) |
| LLM_BASE_URL | Base URL (for OpenAI-compatible providers) |
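A minimal startup check for these variables, assuming a Node-style process.env; the defaults below are illustrative assumptions, not documented fallbacks:

```typescript
// Read and validate the LLM configuration at startup.
// Variable names come from the table above; the defaults are assumptions.
interface LlmConfig {
  apiKey: string;
  model: string;
  baseUrl: string;
}

function loadLlmConfig(env = process.env): LlmConfig {
  const apiKey = env.LLM_API_KEY;
  if (!apiKey) throw new Error("LLM_API_KEY is required");
  return {
    apiKey,
    model: env.LLM_MODEL ?? "gpt-4o",                         // default assumed for illustration
    baseUrl: env.LLM_BASE_URL ?? "https://api.openai.com/v1", // default assumed for illustration
  };
}
```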