LLM Integration

The LLM module wraps the language model API (OpenAI-compatible) and exposes structured generation and analysis endpoints to the rest of the backend. It handles prompt assembly, context injection, rate limiting, and response parsing.

Prompt assembly pipeline

  1. Preprocessing — raw user text passes through POST /llm/prompt-preprocess, which cleans the text, expands abbreviations, and structures the input
  2. Context injection — the LLM service fetches project world rules, characters, and locations and prepends them as system context
  3. Generation — the assembled prompt is sent to the LLM API
  4. Logging — the full request/response is stored in llm_logs for admin review
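
This flow might look like the sketch below. The helper names (preprocess, fetchProjectContext, callLlm, saveLog) are illustrative stand-ins, not the module's real internals.

```typescript
// Hypothetical sketch of the four-step pipeline; helper names are illustrative.
interface PromptPipelineDeps {
  preprocess(raw: string): Promise<string>;                 // step 1
  fetchProjectContext(projectId: string): Promise<string>;  // step 2
  callLlm(system: string, user: string): Promise<string>;   // step 3
  saveLog(entry: { system: string; user: string; response: string }): Promise<void>; // step 4
}

async function runPromptPipeline(
  deps: PromptPipelineDeps,
  projectId: string,
  rawUserText: string,
): Promise<string> {
  // 1. Preprocessing: clean the text, expand abbreviations, structure the input.
  const structured = await deps.preprocess(rawUserText);
  // 2. Context injection: world rules, characters, and locations become system context.
  const systemContext = await deps.fetchProjectContext(projectId);
  // 3. Generation: send the assembled prompt to the LLM API.
  const response = await deps.callLlm(systemContext, structured);
  // 4. Logging: store the full request/response in llm_logs for admin review.
  await deps.saveLog({ system: systemContext, user: structured, response });
  return response;
}
```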

Rate limiting

LLM endpoints use two separate rate-limit buckets:

Bucket       Applied to
Generation   /ideas, /alternatives, /prompt-preprocess, /generate-scenes
Analysis     /canon-check (POST), /continuity-check (POST), /safety-check (POST)
Approve      /final-approve
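
The route-to-bucket mapping could be expressed as in the sketch below. The per-minute limits and the /llm/ prefix on the shorter routes are assumptions, not the service's actual configuration.

```typescript
// Illustrative route-to-bucket mapping; limits and prefixes are assumptions.
type Bucket = "generation" | "analysis" | "approve";

const BUCKET_BY_ROUTE: Record<string, Bucket> = {
  "/llm/ideas": "generation",
  "/llm/alternatives": "generation",
  "/llm/prompt-preprocess": "generation",
  "/llm/generate-scenes": "generation",
  "/llm/canon-check": "analysis",
  "/llm/continuity-check": "analysis",
  "/llm/safety-check": "analysis",
  "/llm/final-approve": "approve",
};

// Example per-bucket limits in requests per minute (hypothetical values).
const LIMITS: Record<Bucket, number> = {
  generation: 10,
  analysis: 30,
  approve: 5,
};
```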

Quality check pattern

Each of the three quality checks (canon, continuity, safety) follows the same pattern:

POST /<check-type> ← triggers async analysis, returns jobId or immediate result
GET /<check-type>/:episodeId ← returns the cached result

Results are stored per-episode and overwritten each time a new check runs.
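
A minimal client-side sketch of this pattern, assuming the checks live under /llm/ and that a pending job returns { jobId } while a finished one returns the result inline:

```typescript
// Sketch of the POST-then-GET quality-check pattern; paths and payload
// shapes are assumptions based on the description above.
type CheckType = "canon-check" | "continuity-check" | "safety-check";

async function runCheck(baseUrl: string, check: CheckType, episodeId: string) {
  // POST /<check-type>: trigger the async analysis.
  const trigger = await fetch(`${baseUrl}/llm/${check}`, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ episodeId }),
  });
  const started = await trigger.json();
  if (!("jobId" in started)) return started; // the check returned an immediate result

  // GET /<check-type>/:episodeId: read back the cached per-episode result
  // (a real client would poll until the job has finished).
  const res = await fetch(`${baseUrl}/llm/${check}/${episodeId}`);
  return res.json();
}
```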

Canon check

Uses the project's world rules as constraints. For each scene block, the model evaluates whether the content violates any rule and returns a structured list of violations with severity and suggested fixes.
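
The exact response schema is not documented here; a hypothetical shape for the violation list might be:

```typescript
// Hypothetical canon-check result; field names are illustrative only.
interface CanonViolation {
  sceneBlockId: string;
  rule: string;                         // the world rule that was violated
  severity: "low" | "medium" | "high";
  suggestedFix: string;
}

interface CanonCheckResult {
  episodeId: string;
  violations: CanonViolation[];
  passed: boolean;                      // true when violations is empty
}
```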

Continuity check

Tracks character state, timeline, and location consistency across scenes. Flags cases where a character is in two places at once, or refers to knowledge they shouldn't have yet.
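
A hypothetical flag shape, for illustration only:

```typescript
// Illustrative continuity flag; the real schema may differ.
interface ContinuityFlag {
  type: "character-state" | "timeline" | "location" | "knowledge";
  sceneIds: string[];   // scenes involved in the inconsistency
  description: string;  // e.g. "character appears in two locations in the same beat"
}
```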

Safety check

Evaluates content against configurable content policy rules (managed via Admin → Analytics Settings). Returns an age-rating recommendation and any flagged passages.
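
As with the other checks, a hypothetical result shape might look like this (the age-rating labels are examples, not the configured policy values):

```typescript
// Hypothetical safety-check result; rating labels and fields are illustrative.
interface SafetyCheckResult {
  episodeId: string;
  ageRating: "all" | "13+" | "16+" | "18+";  // recommended rating
  flaggedPassages: Array<{
    sceneBlockId: string;
    policyRule: string;  // the content policy rule that matched
    excerpt: string;
  }>;
}
```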

Final approval

POST /llm/final-approve records that all three checks passed and transitions the episode to an approved state. Requires LLM_APPROVE.
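
A sketch of the call, assuming a bearer token that carries the LLM_APPROVE permission and a request body of { episodeId }:

```typescript
// Sketch of the approval request; auth scheme and body shape are assumptions.
async function finalApprove(baseUrl: string, token: string, episodeId: string) {
  const res = await fetch(`${baseUrl}/llm/final-approve`, {
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      Authorization: `Bearer ${token}`,
    },
    body: JSON.stringify({ episodeId }),
  });
  if (!res.ok) throw new Error(`final-approve failed: ${res.status}`);
  return res.json(); // episode is now in the approved state
}
```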

Admin visibility

All LLM request/response logs are viewable at Admin → Logs → LLM Dump. This is useful for debugging unexpected model output or auditing generation costs.

Environment configuration

The LLM service reads its configuration from environment variables:

Variable       Purpose
LLM_API_KEY    API key for the language model provider
LLM_MODEL      Model name (e.g. gpt-4o)
LLM_BASE_URL   Base URL (for OpenAI-compatible providers)
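
A minimal sketch of reading these variables at startup; the fallback values and the fail-fast check are assumptions, not the service's actual behaviour:

```typescript
// Read LLM configuration from the environment (Node.js runtime assumed).
const llmConfig = {
  apiKey: process.env.LLM_API_KEY ?? "",
  model: process.env.LLM_MODEL ?? "gpt-4o",
  baseUrl: process.env.LLM_BASE_URL ?? "https://api.openai.com/v1",
};

// Fail fast if the key is missing (illustrative check).
if (!llmConfig.apiKey) {
  throw new Error("LLM_API_KEY is required");
}
```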