Product Spec

PromptCMS

A conversational content management system for community organisations. Members update their website by messaging a bot. No logins, no dashboards, no training required.

📄 v0.2-concept 📅 Updated 2026-03-09 🟢 Live prototype (2 orgs) → Production SaaS roadmap

Product Vision
Core Thesis
Current State (v0.1)
Target Architecture
System Components
Data Model
Persona & Style Guide System
Messaging Layer
Security & Multi-tenancy
Phase Roadmap
Open Decisions
Guiding Principles

1. Product Vision

PromptCMS is a website CMS delivered entirely through messaging — WhatsApp, Telegram, Slack — without any web interface, login, or training for end users.

A sea scout leader wants to update the club's meeting time. They open WhatsApp, message the group bot ("Skipper, change Tuesday meeting to 7pm"), and the site is updated, previewed, and deployed live within seconds. No browser tabs. No passwords. No asking the volunteer webmaster who set the site up three years ago.

The product targets community organisations — scout groups, nature associations, sports clubs, school committees, neighbourhood groups — who have websites they rarely update because the tools are too cumbersome for non-technical volunteers.

The long-term vision is a platform: an operator signs up, provides their organisation's details, and within minutes has a live website with a conversational bot their members can talk to. The bot knows the organisation's visual identity, speaks in the org's voice, and can handle everything from updating event dates to redesigning sections — escalating to a human preview for anything significant.

2. Core Thesis

The problem with existing CMSes is not feature gaps — it's friction. WordPress, Squarespace, and Wix all work, but they require logging in, navigating dashboards, and understanding layout concepts. For a volunteer running a monthly meeting, that friction means the site goes stale.

The PromptCMS thesis: the messaging app is already open. Community groups already coordinate via WhatsApp. If the website bot lives there too, the update cost drops to near zero — and websites actually stay current.

The differentiator: the "persona" approach. The bot isn't a generic assistant — it's Skipper for sea scouts, Ranger for a nature association. That identity makes the bot feel native to the organisation, not like a SaaS tool bolted on. Members engage with it like they would a helpful committee member.

The key technical insight: the hard work (understanding intent, generating valid code, deploying correctly) happens invisibly in the cloud. The user experience is just: type a message, get a preview link, send thumbs-up.

3. Current State (v0.1)

What exists

Two deployments. Brighton is live and tested end-to-end; YWNA is fully configured, but its bot has not yet been added to the WhatsApp group:

Site	Persona	Stack	Status
Brighton 1st/14th Sea Scouts	Skipper ⚓	Plain HTML/CSS (no build)	✅ Live, tested
Yalukit Willam Nature Association	Ranger 🌿	React/Vite/TypeScript	⚠️ Configured — bot not yet in WA group

How it works today

WhatsApp message
    → Twilio webhook
    → OpenClaw gateway (Mac Mini, Mimisbrunnr)
    → Mimir AI (Claude Sonnet, adopts persona via SITE.json)
    → Makes file edits directly on disk
    → Deploys preview via wrangler CLI
    → Replies in WhatsApp with preview URL
    → Admin says "deploy" → merges to main → production deploy

Configuration model

Each site is defined by a SITE.json file:

{
  "slug": "brighton-scouts",
  "title": "Brighton 1st/14th Sea Scouts",
  "cfProject": "brighton-sea-scouts",
  "workDir": "/Users/mimisbrunnr/workspace/brighton-scouts",
  "buildCmd": null,
  "distDir": ".",
  "gitRepo": "https://github.com/andrew-julian/brighton-sea-scouts.git",
  "siteUrl": "https://brighton-sea-scouts.pages.dev",
  "channels": [
    { "type": "whatsapp", "id": "120363425375840006@g.us", "admins": ["+61412345678"] }
  ],
  "persona": {
    "name": "Skipper ⚓",
    "greeting": "Hi! I'm Skipper...",
    "style": "friendly, concise, community-spirited."
  }
}

Routing: promptcms/index.json maps channel IDs → site slugs. When a message arrives, Mimir checks this index to know which persona to adopt.
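
The lookup can be sketched as below. This assumes index.json is a flat object from channel ID to site slug (the actual file shape is not shown in this spec); `ChannelIndex` and `resolveSlug` are hypothetical names used only for illustration.

```typescript
// Assumed shape of promptcms/index.json: { "<channel id>": "<site slug>" }.
type ChannelIndex = Record<string, string>;

// Returns the site slug for a channel, or null when the channel is unknown,
// so the caller can ignore the message instead of guessing a persona.
function resolveSlug(index: ChannelIndex, channelId: string): string | null {
  // hasOwnProperty guards against inherited keys such as "toString".
  if (!Object.prototype.hasOwnProperty.call(index, channelId)) return null;
  const slug = index[channelId];
  return typeof slug === "string" ? slug : null;
}

const exampleIndex: ChannelIndex = {
  "120363425375840006@g.us": "brighton-scouts",
};
```

Returning null for unknown channels, rather than falling back to a default site, is a design assumption: a message from an unmapped group should never be answered in another org's persona.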

Current limitations

Single point of failure: Mac Mini offline = all bots offline
Not scalable: every message routed through one machine and one AI instance
Not sellable: tightly coupled to Andrew's personal OpenClaw, API keys, and file system
No multi-tenancy: sites live as directories on one machine; adding an org means SSH access
YWNA complexity: React/Vite build step means changes require npm run build before deploy, which is slower and more failure-prone than the Brighton pure-HTML setup
No config validation: array-vs-object bugs in SITE.json have caused silent failures
Shared phone number: both sites share Andrew's personal Twilio WhatsApp number during testing
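
The "no config validation" limitation could be addressed with a load-time check along these lines. It is a minimal sketch covering only the SITE.json fields shown in the example above; `validateSiteConfig` is a hypothetical helper, not part of the current system.

```typescript
// Hypothetical validator for the SITE.json fields shown earlier.
// Returns a list of human-readable problems; an empty list means valid.
function validateSiteConfig(config: unknown): string[] {
  if (typeof config !== "object" || config === null || Array.isArray(config)) {
    return ["config must be a JSON object"];
  }
  const c = config as Record<string, unknown>;
  const errors: string[] = [];

  for (const key of ["slug", "title", "cfProject", "workDir", "distDir"]) {
    if (typeof c[key] !== "string" || c[key] === "") {
      errors.push(`${key} must be a non-empty string`);
    }
  }
  const buildCmd = c["buildCmd"];
  if (buildCmd !== null && typeof buildCmd !== "string") {
    errors.push("buildCmd must be a string or null");
  }
  // The array-vs-object class of bug: channels must be an array of objects.
  const channels = c["channels"];
  if (!Array.isArray(channels)) {
    errors.push("channels must be an array");
  } else {
    channels.forEach((ch: unknown, i: number) => {
      if (typeof ch !== "object" || ch === null || Array.isArray(ch)) {
        errors.push(`channels[${i}] must be an object`);
      }
    });
  }
  // persona must be a single object, not an array.
  const persona = c["persona"];
  if (typeof persona !== "object" || persona === null || Array.isArray(persona)) {
    errors.push("persona must be an object");
  }
  return errors;
}
```

Collecting every problem before failing, instead of throwing on the first one, lets the operator fix a malformed file in a single pass.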

4. Target Architecture

The production system has three tiers that handle different levels of complexity:

User message (WhatsApp / Telegram / Slack)
         │
         ▼
┌─────────────────────────────────────────────┐
│  TIER 1 — Fast-Path Worker                  │
│  Cloudflare Worker (edge, ~100ms)           │
│  • Classify request (simple vs complex)     │
│  • Simple path: extract data → update D1   │
│    → trigger Pages rebuild → reply "Done"  │
│  • Complex path: queue Tier 2 job          │
│    → reply "Working on it, few mins..."    │
└──────────────┬──────────────────────────────┘
               │ (complex requests only)
               ▼
┌─────────────────────────────────────────────┐
│  TIER 2 — Async Agentic Engine              │
│  Cloud-hosted coding agent                 │
│  • Reads site repo via GitHub API          │
│  • Makes targeted HTML/CSS/JS changes      │
│  • Creates PR → preview deploy             │
│  • Notifies user with preview URL          │
│  • User approves → Worker merges → live    │
└──────────────┬──────────────────────────────┘
               │ (site setup, persona design, schema)
               ▼
┌─────────────────────────────────────────────┐
│  TIER 3 — Operator / Admin (Mimir)          │
│  Setup, configuration, style guide gen     │
│  • Onboard new orgs                        │
│  • Design/update templates                 │
│  • Handle edge cases + escalations         │
│  Not in any real-time user-facing path     │
└─────────────────────────────────────────────┘

What each tier handles

Tier 1 (instant, ~seconds)

Structured content updates: event dates, text copy, contact info, opening hours
Announcement additions/removals
Anything that maps cleanly to a content schema field

Tier 2 (async, 2–5 minutes)

Layout changes: new sections, reordering existing sections
Design updates: colour adjustments, typography, component style changes
New pages or major content restructures
Photo gallery additions, image updates
Anything requiring file reading, planning, and multi-file edits

Tier 3 (human-in-the-loop, no SLA)

Initial site creation and template design
Style Guide generation at onboarding
Persona definition
Structural changes outside the agent's safe scope
Debug and recovery from agent failures

5. System Components

5.1 Cloudflare Worker (Tier 1)

The primary runtime for all user-facing interactions. Runs at the edge globally.

Responsibilities:

Receive and verify Twilio webhook signatures
Route inbound messages to the correct org (by phone number → org lookup in D1)
Classify requests: simple (content data update) vs complex (layout/code change)
Execute simple requests: call AI to extract structured data, write to D1, trigger Pages rebuild
Queue complex requests to Tier 2 job queue (Cloudflare Queues)
Handle approval webhooks (thumbs-up reaction → merge PR)
Send reply messages via Twilio API

Request classification (within Worker):

Simple triggers:
  - "change the meeting time to..."
  - "update the contact email to..."
  - "add an event: [title, date, description]"
  - "remove the [event name] event"
  - "update the announcement to say..."

Complex triggers:
  - "add a photo gallery section"
  - "make the hero more welcoming"
  - "redesign the events section layout"
  - "add a new page for..."
  - "change the colour scheme to..."
  - Any request mentioning layout, design, sections, pages

Classification is a lightweight AI call (cheap model, structured output) at the start of every request.
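
A sketch of how the Worker might consume the classifier's structured output. The JSON shape `{"class": "simple" | "complex"}` and the name `parseClassification` are assumptions; the spec does not define the classifier's output format.

```typescript
type RequestClass = "simple" | "complex";

// Parses the classifier model's raw text output, assumed to be JSON like
// {"class": "simple"}. Anything malformed or unexpected falls back to
// "complex", so an uncertain request goes through preview and approval
// instead of being written straight to the content store.
function parseClassification(raw: string): RequestClass {
  try {
    const parsed: unknown = JSON.parse(raw);
    if (typeof parsed === "object" && parsed !== null) {
      const value = (parsed as Record<string, unknown>)["class"];
      if (value === "simple" || value === "complex") return value;
    }
  } catch {
    // Not valid JSON: fall through to the conservative default.
  }
  return "complex";
}
```

Defaulting to "complex" trades some unnecessary Tier 2 escalations for safety; the tier routing accuracy metric is the natural place to measure that cost.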

Worker hardening (Phase 3):

Rate limiting per org: one org cannot starve others — enforce per-org request quotas at the Worker level
Cost caps per org: token usage tracked in D1; hard cap prevents runaway spend from a bug or bad actor
Graceful degradation: if Tier 2 queue is backed up, Tier 1 simple-path continues to function normally
Circuit breakers: if an org's GitHub token is revoked or their repo is unreachable, fail fast with a clear user message — do not retry indefinitely
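
The cost cap could be enforced with a pre-flight check before each AI call. This is a sketch under assumptions: `checkTokenBudget` is a hypothetical helper, and the usage figure is assumed to come from a per-org counter in D1.

```typescript
type UsageCheck = { allowed: true } | { allowed: false; reason: string };

// Hypothetical pre-flight check run before any AI call.
// usedTokens: tokens consumed by this org so far this month (from D1).
// estimatedTokens: rough upper bound for the call about to be made.
// cap: the org's own monthly cap, or undefined to use the operator default.
function checkTokenBudget(
  usedTokens: number,
  estimatedTokens: number,
  cap: number | undefined,
  operatorDefaultCap: number,
): UsageCheck {
  const limit = cap ?? operatorDefaultCap;
  if (usedTokens + estimatedTokens > limit) {
    return { allowed: false, reason: `monthly token cap of ${limit} reached` };
  }
  return { allowed: true };
}
```

Checking against an estimate before the call, rather than only recording usage afterwards, is what makes the cap hard rather than advisory.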

5.2 Agentic Engine (Tier 2)

An async coding agent triggered by the Worker's job queue. Runs in cloud (Cloudflare Durable Object or dedicated serverless function).

Tool registry (6 tools, no more):

Tool	Purpose
`read_file(path)`	Read any file in the site repo via GitHub API
`write_file(path, content)`	Create or overwrite a file; auto-commits to working branch
`list_files(path?)`	List directory contents; called at job start to map site structure
`search_content(query)`	Find sections by text or `data-section` attribute
`create_pr(title, body)`	Open PR from working branch; triggers Cloudflare Pages preview deploy
`notify(channel, message)`	Send WhatsApp/Telegram message to requesting user

The agent cannot: browse the web, run shell commands, access other orgs' repos, modify its own system prompt, or call any API beyond this list.

Execution isolation:

Each Tier 2 job runs in its own dedicated Durable Object instance
Hard timeout of 10 minutes per job; automatic cleanup on expiry
No shared in-memory state between concurrent jobs — isolation is structural, not policy

Job serialisation per org:

Maximum of one active Tier 2 job per org at any time
If a second request arrives while a job is running or awaiting approval, it is queued with a notification: "I've got another change lined up — I'll start on it once you approve or discard the current preview"
This prevents concurrent PR conflicts and gives members clarity about what's in flight
Tier 1 (D1 updates) is unaffected — D1 is ACID-compliant and handles concurrency natively
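
The one-active-job rule can be illustrated with a small in-memory gate. In the target design this state would presumably live in D1 or a Durable Object; the `OrgJobGate` class below is hypothetical and uses Maps only to show the rule. Sending the "another change lined up" notification is left to the caller.

```typescript
// In-memory sketch of per-org serialisation of Tier 2 jobs.
class OrgJobGate {
  private active = new Map<string, string>();    // orgId -> active job id
  private waiting = new Map<string, string[]>(); // orgId -> queued job ids (FIFO)

  // Returns "started" if the job may run now, "queued" if one is already in flight.
  submit(orgId: string, jobId: string): "started" | "queued" {
    if (!this.active.has(orgId)) {
      this.active.set(orgId, jobId);
      return "started";
    }
    const queue = this.waiting.get(orgId) ?? [];
    queue.push(jobId);
    this.waiting.set(orgId, queue);
    return "queued";
  }

  // Called when the active job is approved, discarded, failed, or expired.
  // Promotes and returns the next waiting job id, or null if none is waiting.
  release(orgId: string): string | null {
    const next = this.waiting.get(orgId)?.shift() ?? null;
    if (next === null) {
      this.active.delete(orgId);
    } else {
      this.active.set(orgId, next);
    }
    return next;
  }
}
```

Jobs for different orgs never block each other, because the gate is keyed by org ID.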

Execution flow:

Receive job from queue (org ID, repo, raw request, style guide, notify target)
list_files() → map site structure
read_file() on relevant files → understand current state
Plan the change internally
write_file() to implement (may loop back to verify)
read_file() on modified files → verify output
create_pr() → preview deploy auto-triggers
notify() user with preview URL and approval prompt
Wait for approval webhook (Worker receives thumbs-up → merges PR → live)

Preview lifecycle:

Preview URLs expire after 48 hours if not actioned
A nudge notification is sent at 24 hours: "Just checking — did you want to approve or discard that preview?"
On expiry: branch is deleted, job marked discarded, user notified; they can simply ask again
Prevents unbounded GitHub PR and branch accumulation at scale
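
The 24-hour nudge and 48-hour expiry can be expressed as a single decision function that a periodic sweep would call for each job awaiting approval. The function name and the `nudgeSent` flag are assumptions for illustration.

```typescript
type PreviewAction = "none" | "nudge" | "expire";

const HOUR_MS = 60 * 60 * 1000;

// Decides what a periodic sweep should do with a job awaiting approval.
// createdAt and now are epoch milliseconds; nudgeSent records whether the
// 24-hour reminder has already gone out, so it is sent at most once.
function previewAction(createdAt: number, now: number, nudgeSent: boolean): PreviewAction {
  const age = now - createdAt;
  if (age >= 48 * HOUR_MS) return "expire";
  if (age >= 24 * HOUR_MS && !nudgeSent) return "nudge";
  return "none";
}
```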

5.3 Content Layer (Cloudflare D1 + Pages)

D1 database stores all multi-tenant state:

Organisation config (replaces SITE.json files)
Content schemas per org (events, announcements, team, etc.)
Content data (the actual structured content, not in HTML)
Style Guide per org (fed into Tier 2 system prompt as Layer 2)
Job queue state and history
Approval tokens
Conversation history per org (see §6)
Observability metrics per org (token usage, job counts, latency)

Cloudflare Pages serves all sites:

Each org has a Pages project
Preview deploys auto-created on PR open (for Tier 2 approvals)
Production deploy triggered on PR merge (approval) or direct (Tier 1 simple updates)
Build hooks for programmatic triggers

5.4 GitHub (Source of Truth)

All site files live in GitHub repos:

One repo per org
All Tier 2 changes go through PRs with full history
Preview branches auto-deleted on merge or discard
Audit trail: every AI-generated change has a commit with description
Rollback: the last successfully merged commit is tagged last-known-good on each merge — enables instant revert via git revert if a deployed change causes problems

5.5 Messaging Layer (Twilio)

One dedicated WhatsApp number per org (Twilio Business API)
Inbound webhooks routed to Cloudflare Worker
Twilio handles message delivery reliability
Supports WhatsApp, SMS fallback
Future: Telegram Business API (similar model, different provider)

5.6 Observability Layer

Instrumented from Phase 3 onwards. All metrics stored in D1 and surfaced in the operator dashboard (Phase 5).

Per-org metrics tracked:

Token usage: input + output tokens per job and per Tier 1 request — basis for billing and anomaly detection
Tier routing accuracy: what % of requests are escalated to Tier 2 unnecessarily? (calibrates the classifier)
Approval rate: how often do admins reject AI-generated PRs? (quality signal — declining approval rate = degraded output)
Time-to-publish: end-to-end latency from inbound message → live site
Error patterns: which orgs, request types, or site stacks fail most often?
Preview abandonment rate: how often do previews expire without action?

These metrics enable "profile first, then optimise" — routing decisions, model selection, and cost caps should be tuned based on real data, not assumptions.

5.7 Operator Dashboard (future Phase 5)

A web UI for Andrew (or future operators) to:

View all orgs, their status, and last activity
Manage Style Guides and personas
See job history (Tier 2 runs, successes, failures)
Provision new organisations
View billing and usage per org
Monitor observability metrics

6. Data Model

Organisation

type Organisation = {
  id: string;                     // "brighton-scouts"
  name: string;                   // "Brighton 1st/14th Sea Scouts"
  status: "active" | "inactive" | "onboarding";
  created_at: string;
  
  // Messaging
  whatsapp_number: string;        // Twilio provisioned number
  telegram_bot_token?: string;
  notify_channels: NotifyChannel[];
  
  // Site
  site_repo: string;              // GitHub repo URL
  cf_project: string;             // Cloudflare Pages project name
  site_url: string;               // Production URL
  preview_url_template: string;   // "{branch}.cf-project.pages.dev"
  build_cmd: string | null;       // null = no build step
  dist_dir: string;               // "." or "dist/public"
  
  // Style
  style_guide: StyleGuide;        // See §7
  persona: Persona;               // See §7
  
  // Admins
  admin_contacts: string[];       // Phone numbers who can approve deploys
  
  // Limits
  monthly_token_cap?: number;     // Hard spend cap; omitted = operator default
  tier2_rate_limit?: number;      // Max Tier 2 jobs per day
};

Conversation History (per org)

Design principle: Intra-org shared context is a feature, not a bug. Multiple committee members should be able to interact with the same bot and have it understand what's already been done. Skipper should know that Person A just changed the meeting time, so when Person B asks "what events do we have?", the answer is current.

Conversation context is scoped by org_id only — the whole organisation shares one bot context.

type ConversationMessage = {
  id: string;
  org_id: string;                 // Isolation boundary: between orgs, not between users
  sender_phone: string;           // Attributed for audit, not for isolation
  role: "user" | "assistant";
  content: string;
  created_at: string;
};

Context window policy:

Load the most recent N messages (suggested: 50) or messages within the last 7 days — whichever is smaller
Older history is archived, not deleted — available for audit but not injected into prompts
Explicit context reset is available as an admin command if needed ("Skipper, start fresh")
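
A sketch of the "whichever is smaller" rule. Because both limits select a run of the most recent messages, applying the age filter first and then the count limit yields the smaller of the two sets. `StoredMessage` and `selectContext` are hypothetical names.

```typescript
type StoredMessage = { content: string; created_at: string }; // ISO 8601 timestamp

const DAY_MS = 24 * 60 * 60 * 1000;

// Keeps messages from the last maxAgeDays, then at most the newest maxCount
// of those. The result is ordered oldest first, ready to append to a prompt.
function selectContext(
  messages: StoredMessage[],
  now: Date,
  maxCount = 50,
  maxAgeDays = 7,
): StoredMessage[] {
  const cutoff = now.getTime() - maxAgeDays * DAY_MS;
  const recent = messages
    .filter((m) => Date.parse(m.created_at) >= cutoff)
    .sort((a, b) => Date.parse(a.created_at) - Date.parse(b.created_at));
  // slice(-0) would return everything, so handle a zero limit explicitly.
  return maxCount > 0 ? recent.slice(-maxCount) : [];
}
```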

Content Schema (per org)

Rather than the AI editing HTML directly, simple content is stored structurally:

type ContentSchema = {
  org_id: string;
  sections: {
    [sectionName: string]: SectionSchema;
  };
};

type SectionSchema = {
  type: "list" | "single" | "richtext" | "contacts";
  label: string;
  items?: ContentItem[];  // for type=list
  value?: string;         // for type=single or richtext
};

// Example for Brighton Sea Scouts:
// sections.events = { type: "list", label: "Events", items: [...] }
// sections.announcement = { type: "single", label: "Notice board", value: "..." }
// sections.contact_email = { type: "single", label: "Contact", value: "..." }

For Tier 1 simple updates, the Worker updates D1 content directly and triggers a template rebuild — the HTML is generated from the template + content data, not edited directly.
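
The template-plus-content build step might look like the sketch below. The `{{section_name}}` placeholder syntax and both function names are assumptions; the spec does not fix a template format.

```typescript
// Escapes text so that content values cannot inject markup into the page.
function escapeHtml(text: string): string {
  return text
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&#39;");
}

// Replaces {{section_name}} placeholders with "single" values from the content
// store. Unknown placeholders are left in place so that a missing field is
// visible in the output instead of silently disappearing.
function renderTemplate(template: string, values: Record<string, string>): string {
  return template.replace(/\{\{(\w+)\}\}/g, (match: string, key: string) =>
    Object.prototype.hasOwnProperty.call(values, key)
      ? escapeHtml(values[key] ?? "")
      : match,
  );
}
```

Escaping at render time matters here because the values originate from chat messages, which should be treated as untrusted input.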

Job (Tier 2)

type AgentJob = {
  id: string;
  org_id: string;
  status: "queued" | "running" | "awaiting_approval" | "approved" | "discarded" | "failed" | "expired";
  
  request: string;                // Raw user message
  requested_by: string;           // Phone number
  queued_at: string;
  started_at?: string;
  completed_at?: string;
  expires_at?: string;            // Set when status=awaiting_approval; 48h from PR creation
  
  pr_url?: string;
  preview_url?: string;
  pr_number?: number;
  
  change_summary?: string;        // Agent's description of what it did
  token_usage?: { input: number; output: number; };  // For billing/observability
  error?: string;                 // If status=failed
  
  approval_token: string;         // Used to verify thumbs-up came from right user
};
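
One way to keep job state consistent is an explicit transition table. The transitions below are an assumed reading of the lifecycle described in §5.2, not something this spec defines; `NEXT_STATES` and `canTransition` are hypothetical.

```typescript
type JobStatus =
  | "queued" | "running" | "awaiting_approval"
  | "approved" | "discarded" | "failed" | "expired";

// Assumed transition table; terminal states have no outgoing transitions.
const NEXT_STATES: Record<JobStatus, readonly JobStatus[]> = {
  queued: ["running", "discarded"],
  running: ["awaiting_approval", "failed"],
  awaiting_approval: ["approved", "discarded", "expired"],
  approved: [],
  discarded: [],
  failed: [],
  expired: [],
};

// True when a job may move from one status to another.
function canTransition(from: JobStatus, to: JobStatus): boolean {
  return NEXT_STATES[from].indexOf(to) !== -1;
}
```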

7. Persona & Style Guide System

This is PromptCMS's key differentiator. The bot isn't a generic assistant — it has a specific identity that matches the organisation.

Persona

type Persona = {
  name: string;           // "Skipper ⚓" / "Ranger 🌿"
  greeting: string;       // First message sent when bot joins group
  style: string;          // Writing style guidance for the AI
  theme: string;          // "nautical" / "nature / conservation"
  avoid: string[];        // ["corporate", "jargon", "markdown headers"]
  emoji_use: "sparse" | "moderate" | "none";
};

The persona is injected as Layer 1 of the system prompt in all AI interactions. The bot never breaks character, never mentions OpenClaw, and never identifies as an AI unless directly asked.

Style Guide

Generated once at site onboarding. Stored per-org in D1. Injected as Layer 2 of the Tier 2 agent's system prompt — it's the agent's visual bible for that site.

Style Guide safety: before a Style Guide is stored in D1 or injected into a system prompt, it is:

Validated against the StyleGuide schema (structured JSON, not freeform text)
Sanitised to remove any instruction-like patterns (e.g. strings containing "ignore", "system prompt", "instructions")
Tested in a sandbox prompt before being activated for live requests

This prevents both accidental ambiguity and deliberate prompt injection via admin-uploaded style data.
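
A sketch of the pattern check. It flags suspicious fields for review instead of silently rewriting them, which is a design assumption; the pattern list is illustrative only and `findSuspiciousFields` is a hypothetical helper.

```typescript
// Illustrative patterns only; a real deny-list would need tuning to limit
// false positives (for example, a legitimate "instructions" page title).
const SUSPICIOUS: readonly RegExp[] = [/ignore/i, /system prompt/i, /instructions?/i];

// Walks any JSON-like value and returns the paths of string fields that
// match an instruction-like pattern, e.g. "brand_voice.avoid[1]".
function findSuspiciousFields(value: unknown, path = ""): string[] {
  if (typeof value === "string") {
    return SUSPICIOUS.some((re) => re.test(value)) ? [path] : [];
  }
  const found: string[] = [];
  if (Array.isArray(value)) {
    value.forEach((item: unknown, i: number) => {
      found.push(...findSuspiciousFields(item, `${path}[${i}]`));
    });
  } else if (typeof value === "object" && value !== null) {
    const record = value as Record<string, unknown>;
    for (const key of Object.keys(record)) {
      found.push(...findSuspiciousFields(record[key], path ? `${path}.${key}` : key));
    }
  }
  return found;
}
```

Pattern matching is a coarse filter, so it works best alongside the schema validation and sandbox test described above, not as a replacement for them.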

type StyleGuide = {
  colours: {
    primary: string;       // "#1B3A6B"
    secondary: string;     // "#F0A500"
    background: string;
    text: string;
    accent: string;
  };
  typography: {
    heading_font: string;  // "Montserrat"
    body_font: string;     // "Open Sans"
    base_size: string;     // "16px"
    scale_ratio: string;   // "1.25"
  };
  layout: {
    feel: string;          // "clean, outdoorsy, family-friendly"
    whitespace: string;    // "generous"
    imagery: string;       // "nature, community, action"
    mobile_first: boolean;
  };
  brand_voice: {
    tone: string;          // "warm, active"
    avoid: string[];       // ["corporate", "jargon"]
    cta_style: string;     // "imperative"
  };
  technical: {
    stack: string;         // "plain HTML/CSS/JS"
    build_step: boolean;
    css_vars: boolean;
    image_path: string;    // "/images/"
    protected_files: string[];
  };
  site_structure: {
    sections: string[];    // ["hero", "about", "events", "gallery", "contact"]
    nav_style: string;     // "sticky-top"
    footer: boolean;
    section_id_attr: string;  // "data-section"
  };
};

The 5-Layer System Prompt (Tier 2)

Clear separation between what the agent can do (L1, L4, L5 — static, same for all orgs) and how it presents (L2, L3 — dynamic, per org). This consistency is essential for predictable debugging and testing across orgs.

Layer	Content	Static/Dynamic
L1	Core identity: who the agent is, what its mandate is	Static — identical across all orgs
L2	Site context: Style Guide (palette, typography, voice, structure)	Dynamic — per org from D1
L3	Technical constraints: stack, protected files, conventions	Semi-static — per org
L4	Execution approach: read before writing, minimal changes, verify	Static — identical across all orgs
L5	Safety rules: always branch, always preview, never delete without explicit instruction	Static — identical across all orgs
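
Prompt assembly from the five layers could be sketched as follows. The field names, header format, and `buildSystemPrompt` are assumptions made for illustration.

```typescript
type PromptLayers = {
  coreIdentity: string;         // L1, static
  styleGuideJson: string;       // L2, per org, already validated and sanitised
  technicalConstraints: string; // L3, per org
  executionApproach: string;    // L4, static
  safetyRules: string;          // L5, static
};

// Joins the layers in a fixed order under labelled headers, so the prompt for
// any org has the same skeleton and only the L2 and L3 bodies differ.
function buildSystemPrompt(layers: PromptLayers): string {
  const sections: [string, string][] = [
    ["L1 Core identity", layers.coreIdentity],
    ["L2 Site context", layers.styleGuideJson],
    ["L3 Technical constraints", layers.technicalConstraints],
    ["L4 Execution approach", layers.executionApproach],
    ["L5 Safety rules", layers.safetyRules],
  ];
  return sections
    .map(([title, body]) => `## ${title}\n${body.trim()}`)
    .join("\n\n");
}
```

A fixed skeleton makes it straightforward to diff the prompts of two orgs when debugging, since any difference outside L2 and L3 indicates a bug.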

Persona Library (future)

A curated library of persona templates organisations can choose from:

Persona	Theme	Suitable for
Skipper	Nautical / sea scouts	Scout groups, sailing clubs
Ranger	Nature / conservation	Environmental groups, nature associations
Coach	Active / sporting	Sports clubs, fitness groups
Bloom	Garden / community	Garden clubs, community gardens
Spark	Educational / youth	School committees, youth groups
Anchor	Civic / community	Neighbourhood groups, ratepayer associations

Orgs can use a template as-is, customise the name, or (premium) have a fully custom persona designed.

8. Messaging Layer

Supported Channels (current + planned)

Channel	Status	Notes
WhatsApp (Twilio)	✅ Live	Requires Twilio compliance bundle per number
Slack	✅ Live	Via OpenClaw — migrates to Worker in Phase 3
Telegram	Planned Phase 4	Simpler compliance, no per-number bundle
SMS	Future	Fallback for orgs without smartphones

WhatsApp Compliance

WhatsApp Business API via Twilio requires:

A dedicated phone number per organisation (or per namespace)
A Twilio compliance bundle (business registration)
Approval for the specific use case

Current status: Andrew's personal number +61 485 033 211 is provisioned. Compliance bundle BUead167ea... submitted. Once approved, orgs get dedicated numbers provisioned programmatically via Twilio API.

At scale, dedicated numbers are provisioned automatically on org signup via Twilio's REST API:

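// Sketch: provision a number on org signup (Twilio Node SDK).
// areaCode only applies to US/Canada numbers, and 61 is Australia's country
// code, so search the AU inventory first and then buy a specific number.
// Regulatory requirements for AU numbers (address, bundle) are not shown.
const [candidate] = await twilio.availablePhoneNumbers('AU').mobile.list({ limit: 1 });
if (!candidate) throw new Error('No AU mobile numbers available');
const number = await twilio.incomingPhoneNumbers.create({
  phoneNumber: candidate.phoneNumber,
  smsApplicationSid: APP_SID,
  voiceApplicationSid: APP_SID,
});
await db.organisations.update(orgId, { whatsapp_number: number.phoneNumber });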

Message Routing

Inbound WhatsApp message
    → Twilio webhook → POST /webhook/whatsapp
    → Worker validates Twilio signature
    → Lookup org: SELECT * FROM organisations WHERE whatsapp_number = ?
    → Load org config, style guide, persona
    → Load conversation history: last 50 messages for this org_id
    → Process request (Tier 1 or queue for Tier 2)
    → Reply via Twilio REST API

Admin vs Member Actions

Any group member can: request content changes, ask what the site looks like, ask for help.

Only admins can: approve deploys, discard previews, make admin-level config changes.

Admin detection: phone numbers in org.admin_contacts. Future: any WhatsApp group admin auto-detected via Twilio Group Management API.

9. Security & Multi-tenancy

Isolation Model

PromptCMS achieves multi-tenant isolation through data architecture, not container-per-customer infrastructure. This scales to thousands of orgs without proportional operational overhead.

Between-org isolation (strict):

Each org has a scoped GitHub personal access token with write access to their repo only — the Tier 2 agent is structurally blinded to other orgs' code
The Tier 2 agent's job payload contains only that org's token — no shared credential
All D1 queries are filtered by org_id, so cross-tenant data leakage is prevented as long as every query goes through the org-scoped access path
Cloudflare Pages projects are org-specific
Each Tier 2 job runs in an isolated Durable Object instance with no shared state

Within-org context sharing (by design):

Conversation history is scoped to org_id only — all members of an org share one bot context
This is intentional and correct: Person A updates the meeting time; Person B asks "what's on this week?" — Skipper knows
No per-user memory isolation within an org; this is a CMS for a shared website, not a personal assistant

Approval Security

The approval flow (thumbs-up → live deploy) requires:

Message sender is in org.admin_contacts
approval_token matches the pending job (prevents replay attacks)
Job status is awaiting_approval (not already approved/discarded)

Audit Trail

Every AI-generated change:

Produces a git commit with a description (Tier 2)
Is stored in the jobs table with the full request, response, change summary, and token usage
Gets a preview URL before anything goes live
Requires human approval for production deploy
Updates the last-known-good tag on successful merge, which enables instant rollback

Rollback

Any admin can request rollback conversationally: "Skipper, undo the last change"

Implementation: git revert on the last merge commit → new PR → preview → approval → live. The same preview-before-publish flow applies to rollbacks — no changes bypass human review.

Secrets Management

Twilio credentials: Cloudflare Worker environment secrets (per org, or shared for operator account)
GitHub tokens: Cloudflare Worker secrets, one per org
Anthropic API key: Worker environment secret (operator-level)
Cloudflare API token: Worker environment secret

10. Phase Roadmap

Phase 0 — Foundation ✅ COMPLETE

What was built:

Brighton Sea Scouts live on Cloudflare Pages
WhatsApp → OpenClaw → Mimir → wrangler deploy pipeline working
SITE.json config model
Preview → admin approval → production deploy flow
Slack mirror relay (every exchange relayed to #prompt-cms and #brighton-sea-scouts)
YWNA site created and configured (Ranger persona)

Proved: the core thesis works. A non-technical user can update a live website by typing a message in WhatsApp.

Phase 1 — YWNA Activation

Goal: Both sites fully operational. Validate that the second deployment proves the config model is repeatable, not a one-off.

Tasks:

Add YWNA bot number to the YWNA WhatsApp group
Run full E2E test: YWNA member makes a change request → preview → admin approves → live
Decision: keep "Ranger" or rename persona (Willy Wagtail / "Willy" proposed)
Finalise YWNA admin list (who has approval rights beyond Andrew)
Merge or close stale preview-mem3tier branch in YWNA repo
Document the current operational model (runbook) so a second operator could follow it

Exit criteria: A YWNA team member (not Andrew) successfully deploys a change end-to-end.

Phase 2 — Robustness

Goal: The current architecture is reliable enough to onboard a third org without Andrew babysitting it.

Tasks:

Config validation: schema-validate SITE.json on load; hard error with clear message on malformed config (the array-vs-object bug has hit twice)
Deploy reliability: replace ad-hoc wrangler CLI calls with a proper deploy script with retry logic and error reporting
GitHub-triggered deploys: connect Brighton and YWNA repos to Cloudflare Pages git integration, so a push to main deploys automatically; remove manual wrangler calls where possible
Error handling: if the AI makes an edit that breaks the build (YWNA especially), surface the error clearly to the user and rollback gracefully
Onboarding playbook: document exactly how to add a new site (SITE.json, index.json, git setup, Cloudflare Pages project, Twilio channel)
Pending state persistence: pending.json is currently in-memory/file; move to a proper store so it survives Mimir restarts
Thumbs-up reaction deploy: implement emoji reaction (👍) on preview message as alternative to typing "deploy" — better UX for mobile users

Exit criteria: Can add a third org by following the playbook, zero OpenClaw debugging required.

Phase 3 — Architecture Migration (Decouple from Mac Mini)

Goal: The Mac Mini and Mimir are removed from the real-time message handling path. All live user interactions run in cloud infrastructure.

This is the most significant engineering phase — it's a full re-architecture, not an extension of the current system.

Part A — Cloudflare Worker (Tier 1)

Build the fast-path Worker:

Twilio webhook handler with signature verification
Org lookup: D1 database (SELECT * FROM organisations WHERE whatsapp_number = ?)
Request classifier: cheap AI call → "simple" or "complex"
Simple path: AI extracts structured data (event date, text, etc.) → writes to D1 content store → triggers Pages rebuild → replies "Done ✅"
Complex path: enqueues Tier 2 job → replies "Working on it, usually a few minutes"
Approval handler: receives thumbs-up → verifies → merges PR via GitHub API → fires production deploy
Worker hardening: rate limiting, cost caps, circuit breakers, and graceful Tier 2 degradation (see §5.1)

Part B — D1 Multi-tenant Database

Schema:

organisations — replaces all SITE.json files
content — structured content per org (events list, announcements, etc.)
style_guides — Style Guide JSON per org
jobs — Tier 2 job history and state
pending_approvals — outstanding preview approvals
conversation_history — per-org message context with TTL (see §6)
metrics — per-org observability data (token usage, latency, routing accuracy)

Data migration:

Migrate Brighton and YWNA configs from SITE.json to D1
Generate initial content schemas from existing site HTML
Seed initial last-known-good tags on existing repos

Part C — Site Template Architecture

For Tier 1 to work (updating content data without touching HTML), sites need a clean content/template separation:

Template: template.html — fixed layout, never touched by AI in normal operation
Content: stored in D1, injected at build time
Builder: a small Pages Function or Worker that combines template + D1 content on each deploy

Note: This requires rebuilding (or adapting) the Brighton and YWNA sites. Brighton (pure HTML) is straightforward — extract content into JSON, generate the HTML via template. YWNA (React/Vite) is more complex — and the strong recommendation is to rebuild it as plain HTML (see §11 Open Decisions).

Part D — Cut Over

Point Twilio webhooks at the new Worker (replacing OpenClaw)
Verify end-to-end flow for both sites
Mac Mini / OpenClaw deprecated from critical path
Mimir retained for admin operations only (onboarding, debugging, config updates)

Exit criteria: Both sites receiving and processing messages with zero Mac Mini involvement. Mimir not required for any user-facing operation.

Phase 4 — Tier 2 Agentic Engine

Goal: Complex layout and design requests (new sections, visual redesign, substantial page changes) handled by an async cloud agent, not Mimir.

What gets built:

Job queue: Cloudflare Queues or Durable Object for async job management, with per-org serialisation (max 1 active Tier 2 job per org)
Agentic runtime: Cloudflare Durable Objects (one DO instance per job, isolated)
GitHub integration: read/write via GitHub API with per-org scoped tokens
Agent system prompt: 5-layer architecture (see §7)
Style Guide injection: L2 of system prompt loaded from D1 per org, sanitised before injection
Preview flow: PR creation → Cloudflare Pages preview → notify user
Approval webhook: Worker receives thumbs-up → verifies approval token → merges PR → tags last-known-good
Preview expiry: 48h TTL with 24h nudge notification; expired jobs are auto-discarded and branch cleaned up

Style Guide generation at onboarding:

When a new org is set up, Andrew (or eventually a self-service wizard) runs a Style Guide generation step:

Provide the site's existing URL or design brief
AI analyses the site (or brief) and generates a structured Style Guide JSON
Review and adjust
Sanitise and validate
Store in D1 — this is L2 for all future Tier 2 jobs for that org
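The "sanitise and validate" step above might look something like the following. The Style Guide schema is defined elsewhere in this spec, so the three fields here (`primaryColour`, `fontFamily`, `tone`) are placeholders chosen for illustration only, as are the length cap and the character rules. Stripping characters like this reduces the room for injected instructions but should not be read as a complete defence.

```typescript
// Hypothetical, deliberately tiny Style Guide shape (not the spec's real schema).
interface StyleGuide {
  primaryColour: string;
  fontFamily: string;
  tone: string;
}

const MAX_TEXT_LENGTH = 200; // assumed cap per field

// Remove control characters, angle brackets and backticks,
// collapse whitespace, and cap the length.
function sanitiseText(value: string): string {
  return value
    .replace(/[\u0000-\u001f\u007f]/g, " ")
    .replace(/[<>`]/g, "")
    .replace(/\s+/g, " ")
    .trim()
    .slice(0, MAX_TEXT_LENGTH);
}

// Validate untrusted JSON (e.g. AI-generated) into a StyleGuide or throw.
function validateStyleGuide(raw: unknown): StyleGuide {
  if (typeof raw !== "object" || raw === null) {
    throw new Error("Style Guide must be an object");
  }
  const input = raw as Record<string, unknown>;
  const readString = (key: string): string => {
    const value = input[key];
    if (typeof value !== "string" || value.trim() === "") {
      throw new Error(`Missing or invalid field: ${key}`);
    }
    return sanitiseText(value);
  };
  const primaryColour = readString("primaryColour");
  if (!/^#[0-9a-fA-F]{6}$/.test(primaryColour)) {
    throw new Error("primaryColour must be a 6-digit hex colour");
  }
  return {
    primaryColour,
    fontFamily: readString("fontFamily"),
    tone: readString("tone"),
  };
}
```

Only the validated object would be stored in D1, so every later Tier 2 job reads a Style Guide that has already passed the same checks.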

Exit criteria: A complex request ("add a photo gallery above the events section, make it feel more outdoorsy") is handled end-to-end without Mimir involvement. User receives a preview URL within 5 minutes.

Phase 5 — Scale & Polish

Goal: The system handles multiple orgs reliably. Onboarding a new org is a defined, documented process that doesn't require Andrew's time per site.

Features:

Third org: use the playbook to onboard a third organisation (ideally unrelated to sea scouts or YWNA — proves generality)
Self-service onboarding wizard: web UI where an org provides their details (name, WhatsApp number, site preferences, persona preference) and the system provisions everything automatically
- Twilio number provisioning via REST API
- Cloudflare Pages project creation via API
- GitHub repo creation from template
- D1 record creation
- Style Guide generation and sanitisation
- Persona assignment
Dedicated numbers: each org gets their own WhatsApp number (removes shared-number limitation)
Persona library: 6+ pre-built personas (Skipper, Ranger, Coach, Bloom, Spark, Anchor) for orgs to choose from at signup
Admin auto-detection: any WhatsApp group admin automatically gets deploy approval rights (no manual admin list maintenance)
Operator dashboard: web UI showing all orgs, job history, uptime, usage, errors, observability metrics
Weekly digest: automated summary sent to each org's group — "your site had 3 updates this week, here's what changed"
Rollback: admin can request rollback to any previous version ("undo the last change") — git revert → preview → approval → live

Exit criteria: A new org can be fully onboarded without Andrew writing any code or config.

Phase 6 — Product

Goal: PromptCMS is a sellable SaaS. External operators (orgs, agencies) pay for the service.

Features:

Stripe billing: subscription per org (target: $20–30/month per org)
- Free tier: 1 org, limited number of complex (Tier 2) requests per month
- Standard: full access, dedicated number, persona customisation
- Agency tier: multiple orgs under one account, white-label option
Custom domains: org maps their own domain (scouts.example.org.au) to the Pages site
White-label: agency resellers can remove PromptCMS branding
Custom persona design: premium tier — Andrew (or operator) designs a bespoke persona for the org
Google Photos integration: orgs can connect a Google Photos album; bot can pull latest photos into gallery section
Analytics: orgs can ask the bot "how many people visited the site this week?" — basic Pages analytics surfaced conversationally
Public launch: landing page, pricing, signup flow

Strategic option — vertical play:

Rather than broad horizontal SaaS, consider approaching a national body (Scouts Australia, Surf Life Saving Australia, Tennis Australia) to endorse PromptCMS to their member clubs. One partnership → hundreds of orgs, with minimal marketing spend. This is likely the fastest path to meaningful scale and a defensible position.

Exit criteria: 10+ paying orgs. Positive unit economics. System runs without Andrew's operational involvement.

11. Open Decisions

Decision	Options	Recommended	Blocker for
YWNA persona name	Keep "Ranger" / "Willy Wagtail" / other	Decide with YWNA team	Phase 1
Site template architecture	Separate content/template vs keep editing HTML	Content/template separation	Phase 3
Tier 2 runtime	Cloudflare Durable Objects vs separate VPS	Durable Objects (simpler ops, better isolation)	Phase 4
Distribution model	Horizontal SaaS vs vertical (national body deal)	Vertical first	Phase 6
Telegram support	Twilio vs direct Telegram Bot API	Direct Telegram API (simpler)	Phase 4
Agency tier pricing	Flat per-org vs per-message	Flat per-org	Phase 6
Style Guide generation	Manual (Andrew) vs AI-assisted wizard	AI-assisted via Mimir	Phase 4
YWNA React complexity	Keep React build / rebuild as plain HTML	Plain HTML (simpler, faster, more reliable, uniform stack)	Phase 3
Conversation history window	50 messages vs 7 days vs both	Both (whichever gives the smaller window)	Phase 3
Preview expiry duration	24h / 48h / 72h	48h with 24h nudge	Phase 4
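One reading of the recommended "both" option for the conversation history window is: drop anything older than 7 days, then keep at most the 50 most recent of what remains. The sketch below assumes that reading; the `HistoryMessage` shape and function name are hypothetical.

```typescript
// Hypothetical minimal message record.
interface HistoryMessage {
  sentAtMs: number;
  text: string;
}

const MAX_MESSAGES = 50;
const MAX_AGE_MS = 7 * 24 * 60 * 60 * 1000;

// Apply both limits: the age cut-off first, then the message cap.
// The result is in chronological order and the input array is not mutated.
function trimHistory(messages: HistoryMessage[], nowMs: number): HistoryMessage[] {
  const cutoffMs = nowMs - MAX_AGE_MS;
  const recent = messages
    .filter((m) => m.sentAtMs >= cutoffMs)
    .sort((a, b) => a.sentAtMs - b.sentAtMs);
  return recent.slice(-MAX_MESSAGES);
}
```

A busy group hits the 50-message cap and a quiet one hits the 7-day cap, so the context passed to the model stays bounded either way.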

Note on YWNA: The React/Vite/TypeScript stack was appropriate for a full-stack app but adds significant complexity for a primarily-static community website. Strong recommendation to rebuild YWNA as plain HTML/CSS (like Brighton) during Phase 3 — the Tier 1 content/template model is much simpler without a build step, and keeping a uniform stack means the Tier 2 agent behaves predictably across all orgs with no special-casing.

12. Guiding Principles

Reliability over features. A bot that sometimes fails destroys trust with community volunteers who have little patience for technology issues. Every phase must leave the system more reliable than it found it.

The user experience is the conversation. Every design decision should be evaluated against the experience of a 60-year-old committee member opening WhatsApp and typing a message. If it makes that interaction more confusing, it's wrong regardless of how elegant it is technically.

Preview before publish, always. No AI change should ever go live without a human seeing it first. This is non-negotiable at every tier and every phase. The preview-then-approve flow is the product's trust mechanism. This applies equally to rollbacks.

Persona is identity, not decoration. The fact that the bot is Skipper (not "PromptCMS Bot") is what makes members engage with it naturally. Every product decision that dilutes the persona weakens the product. Protect it.

Mac Mini is the prototype, not the product. The current architecture works and taught us everything we need to know. It is not a foundation to build on — it's a proof of concept to replace. Phase 3 exists to make this clean.

Content and code are different jobs. The AI's job in Tier 1 is to update data. The AI's job in Tier 2 is to edit code. These require fundamentally different architectures. Never conflate them — keep the separation clean.

Isolation through architecture, not infrastructure. Multi-tenancy is achieved via scoped tokens, org_id-filtered queries, and per-job Durable Object instances — not container-per-customer. This scales to thousands of orgs without proportional operational overhead.
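As an illustration of org_id-filtered queries, one option is to route every read through a helper that cannot produce SQL without an org filter. This is a sketch only: the table names, the `ScopedQuery` shape, and the helper name are assumptions, not the project's data layer.

```typescript
// Hypothetical list of tables that carry an org_id column.
const ORG_SCOPED_TABLES: readonly string[] = ["content", "conversation_history", "jobs"];

interface ScopedQuery {
  sql: string;
  params: string[];
}

// Build a parameterised SELECT that is always filtered by org_id.
// The table name is checked against an allow-list because it is
// interpolated into the SQL text; the org id is passed as a bound parameter.
function selectForOrg(table: string, orgId: string): ScopedQuery {
  if (ORG_SCOPED_TABLES.indexOf(table) === -1) {
    throw new Error(`Table is not org-scoped: ${table}`);
  }
  if (orgId.trim() === "") {
    throw new Error("orgId is required");
  }
  return {
    sql: `SELECT * FROM ${table} WHERE org_id = ?`,
    params: [orgId],
  };
}
```

In a Worker, the result would presumably be handed to D1 along the lines of `env.DB.prepare(q.sql).bind(...q.params).all()`, where `env.DB` is a hypothetical binding name. Because the helper is the only way to build a query, a missing org filter becomes a thrown error rather than a cross-org leak.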

Within-org context is shared by design. The bot is a shared resource for a shared website. Intra-org conversation history enables collaboration (Person A changes the meeting time; Person B asks what's on — the bot knows). Isolation boundaries are between orgs, not between members.

Measure before optimising. Instrument observability from Phase 3. Routing decisions, model selection, and cost caps should be tuned on real data — not assumptions. The Tier 1 classifier is the biggest cost lever at scale; you need to see it before you can tune it.

Mimir moves up the stack. In the target architecture, Mimir (and Andrew) do the high-value, high-judgment work: onboarding, persona design, Style Guide generation, edge case handling. Not the repetitive, latency-sensitive work of answering WhatsApp messages. That's what the cloud infrastructure is for.

This document is a living spec. Update it as architectural decisions are made, phases complete, and the product evolves. The goal is that at any point, a developer could read this document and understand exactly what PromptCMS is, where it stands, and what needs to be built next.
