PromptCMS
A conversational content management system for community organisations. Members update their website by messaging a bot. No logins, no dashboards, no training required.
PromptCMS: Product Specification
Version: 0.2-concept
Last updated: 2026-03-09
Status: Live prototype (2 orgs) → Production SaaS roadmap
Table of Contents
- Product Vision
- Core Thesis
- Current State (v0.1)
- Target Architecture
- System Components
- Data Model
- Persona & Style Guide System
- Messaging Layer
- Security & Multi-tenancy
- Phase Roadmap
- Open Decisions
- Guiding Principles
1. Product Vision
PromptCMS is a website CMS delivered entirely through messaging (WhatsApp, Telegram, Slack), without any web interface, login, or training for end users.
A sea scout leader wants to update the club's meeting time. They open WhatsApp, message the group bot ("Skipper, change Tuesday meeting to 7pm"), and the site is updated, previewed, and deployed live within seconds. No browser tabs. No passwords. No asking the volunteer webmaster who set the site up three years ago.
The product targets community organisations (scout groups, nature associations, sports clubs, school committees, neighbourhood groups) who have websites they rarely update because the tools are too cumbersome for non-technical volunteers.
The long-term vision is a platform: an operator signs up, provides their organisation's details, and within minutes has a live website with a conversational bot their members can talk to. The bot knows the organisation's visual identity, speaks in the org's voice, and can handle everything from updating event dates to redesigning sections, escalating to a human preview for anything significant.
2. Core Thesis
The problem with existing CMSes is not feature gaps; it's friction. WordPress, Squarespace, and Wix all work, but they require logging in, navigating dashboards, and understanding layout concepts. For a volunteer running a monthly meeting, that friction means the site goes stale.
The PromptCMS thesis: the messaging app is already open. Community groups already coordinate via WhatsApp. If the website bot lives there too, the update cost drops to near zero, and websites actually stay current.
The differentiator: the "persona" approach. The bot isn't a generic assistant; it's Skipper for sea scouts, Ranger for a nature association. That identity makes the bot feel native to the organisation, not like a SaaS tool bolted on. Members engage with it like they would a helpful committee member.
The key technical insight: the hard work (understanding intent, generating valid code, deploying correctly) happens invisibly in the cloud. The user experience is just: type a message, get a preview link, send thumbs-up.
3. Current State (v0.1)
What exists
Two live deployments, both fully operational end-to-end:
| Site | Persona | Stack | Status |
|---|---|---|---|
| Brighton 1st/14th Sea Scouts | Skipper | Plain HTML/CSS (no build) | Live, tested |
| Yalukit Willam Nature Association | Ranger 🌿 | React/Vite/TypeScript | ⚠️ Configured; bot not yet in WA group |
How it works today
WhatsApp message
  → Twilio webhook
  → OpenClaw gateway (Mac Mini, Mimisbrunnr)
  → Mimir AI (Claude Sonnet, adopts persona via SITE.json)
  → Makes file edits directly on disk
  → Deploys preview via wrangler CLI
  → Replies in WhatsApp with preview URL
  → Admin says "deploy" → merges to main → production deploy
Configuration model
Each site is defined by a SITE.json file:
{
  "slug": "brighton-scouts",
  "title": "Brighton 1st/14th Sea Scouts",
  "cfProject": "brighton-sea-scouts",
  "workDir": "/Users/mimisbrunnr/workspace/brighton-scouts",
  "buildCmd": null,
  "distDir": ".",
  "gitRepo": "https://github.com/andrew-julian/brighton-sea-scouts.git",
  "siteUrl": "https://brighton-sea-scouts.pages.dev",
  "channels": [
    { "type": "whatsapp", "id": "120363425375840006@g.us", "admins": ["+61412345678"] }
  ],
  "persona": {
    "name": "Skipper",
    "greeting": "Hi! I'm Skipper...",
    "style": "friendly, concise, community-spirited."
  }
}
Routing: promptcms/index.json maps channel IDs → site slugs. When a message arrives, Mimir checks this index to know which persona to adopt.
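A minimal sketch of that lookup, assuming index.json is a flat object keyed by channel ID. The file's real shape is not shown in this spec, and `resolveSiteSlug` is a hypothetical helper name:

```typescript
// Assumed shape of promptcms/index.json: channel ID -> site slug.
type ChannelIndex = Record<string, string>;

// Hypothetical helper: returns the slug for a channel, or null if unmapped.
function resolveSiteSlug(index: ChannelIndex, channelId: string): string | null {
  // Own-property check so inherited keys such as "toString" never match.
  const slug = Object.prototype.hasOwnProperty.call(index, channelId)
    ? index[channelId]
    : undefined;
  return slug ?? null;
}

// Example entry using the channel ID from the SITE.json above.
const exampleIndex: ChannelIndex = {
  "120363425375840006@g.us": "brighton-scouts",
};
```

Returning null for an unknown channel lets the caller ignore the message instead of guessing a persona.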
Current limitations
- Single point of failure: Mac Mini offline = all bots offline
- Not scalable: every message routed through one machine and one AI instance
- Not sellable: tightly coupled to Andrew's personal OpenClaw, API keys, and file system
- No multi-tenancy: sites live as directories on one machine; adding an org means SSH access
- YWNA complexity: React/Vite build step means changes require `npm run build` before deploy, which is slower and more failure-prone than the Brighton pure-HTML setup
- No config validation: array-vs-object bugs in SITE.json have caused silent failures
- Shared phone number: both sites share Andrew's personal Twilio WhatsApp number during testing
4. Target Architecture
The production system has three tiers that handle different levels of complexity:
User message (WhatsApp / Telegram / Slack)
               │
               ▼
┌─────────────────────────────────────────────┐
│ TIER 1: Fast-Path Worker                    │
│ Cloudflare Worker (edge, ~100ms)            │
│ • Classify request (simple vs complex)      │
│ • Simple path: extract data → update D1     │
│   → trigger Pages rebuild → reply "Done"    │
│ • Complex path: queue Tier 2 job            │
│   → reply "Working on it, few mins..."      │
└──────────────┬──────────────────────────────┘
               │ (complex requests only)
               ▼
┌─────────────────────────────────────────────┐
│ TIER 2: Async Agentic Engine                │
│ Cloud-hosted coding agent                   │
│ • Reads site repo via GitHub API            │
│ • Makes targeted HTML/CSS/JS changes        │
│ • Creates PR → preview deploy               │
│ • Notifies user with preview URL            │
│ • User approves → Worker merges → live      │
└──────────────┬──────────────────────────────┘
               │ (site setup, persona design, schema)
               ▼
┌─────────────────────────────────────────────┐
│ TIER 3: Operator / Admin (Mimir)            │
│ Setup, configuration, style guide gen       │
│ • Onboard new orgs                          │
│ • Design/update templates                   │
│ • Handle edge cases + escalations           │
│ Not in any real-time user-facing path       │
└─────────────────────────────────────────────┘
What each tier handles
Tier 1 (instant, ~seconds)
- Structured content updates: event dates, text copy, contact info, opening hours
- Announcement additions/removals
- Anything that maps cleanly to a content schema field
Tier 2 (async, 2–5 minutes)
- Layout changes: new sections, reordering existing sections
- Design updates: colour adjustments, typography, component style changes
- New pages or major content restructures
- Photo gallery additions, image updates
- Anything requiring file reading, planning, and multi-file edits
Tier 3 (human-in-the-loop, no SLA)
- Initial site creation and template design
- Style Guide generation at onboarding
- Persona definition
- Structural changes outside the agent's safe scope
- Debug and recovery from agent failures
5. System Components
5.1 Cloudflare Worker (Tier 1)
The primary runtime for all user-facing interactions. Runs at the edge globally.
Responsibilities:
- Receive and verify Twilio webhook signatures
- Route inbound messages to the correct org (by phone number → org lookup in D1)
- Classify requests: simple (content data update) vs complex (layout/code change)
- Execute simple requests: call AI to extract structured data, write to D1, trigger Pages rebuild
- Queue complex requests to Tier 2 job queue (Cloudflare Queues)
- Handle approval webhooks (thumbs-up reaction → merge PR)
- Send reply messages via Twilio API
Request classification (within Worker):
Simple triggers:
- "change the meeting time to..."
- "update the contact email to..."
- "add an event: [title, date, description]"
- "remove the [event name] event"
- "update the announcement to say..."
Complex triggers:
- "add a photo gallery section"
- "make the hero more welcoming"
- "redesign the events section layout"
- "add a new page for..."
- "change the colour scheme to..."
- Any request mentioning layout, design, sections, pages
Classification is a lightweight AI call (cheap model, structured output) at the start of every request.
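One way to consume the classifier's structured output, sketched under assumptions: the `{ tier, reason }` reply shape and the fall-back-to-complex rule are illustrative choices, not part of this spec.

```typescript
type Tier = "simple" | "complex";

// Assumed structured-output contract for the classifier call.
interface Classification {
  tier: Tier;
  reason: string;
}

// Parse the model's raw JSON reply. Anything malformed falls back to
// "complex", so an uncertain request gets a preview instead of a direct write.
function parseClassification(raw: string): Classification {
  try {
    const parsed: unknown = JSON.parse(raw);
    if (typeof parsed === "object" && parsed !== null) {
      const { tier, reason } = parsed as Record<string, unknown>;
      if ((tier === "simple" || tier === "complex") && typeof reason === "string") {
        return { tier, reason };
      }
    }
  } catch {
    // Invalid JSON: fall through to the conservative default below.
  }
  return { tier: "complex", reason: "unparseable classifier output" };
}
```

Defaulting to the complex path costs a few minutes of latency on a bad parse, but keeps unverified output away from the direct D1 write.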
Worker hardening (Phase 3):
- Rate limiting per org: one org cannot starve others; enforce per-org request quotas at the Worker level
- Cost caps per org: token usage tracked in D1; hard cap prevents runaway spend from a bug or bad actor
- Graceful degradation: if Tier 2 queue is backed up, Tier 1 simple-path continues to function normally
- Circuit breakers: if an org's GitHub token is revoked or their repo is unreachable, fail fast with a clear user message; do not retry indefinitely
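The cost cap could be enforced with a pre-flight check along these lines. This is a sketch: `OrgUsage`, `checkTokenCap`, and the user-facing message are hypothetical, and the token estimate is assumed to come from the caller.

```typescript
interface OrgUsage {
  tokensUsedThisMonth: number; // running total, tracked in D1
  monthlyTokenCap: number;     // hard cap for this org
}

type CapDecision =
  | { allowed: true; remaining: number }
  | { allowed: false; message: string };

// Hypothetical check run before any AI call is made on behalf of an org.
function checkTokenCap(usage: OrgUsage, estimatedTokens: number): CapDecision {
  const remaining = usage.monthlyTokenCap - usage.tokensUsedThisMonth;
  if (estimatedTokens > remaining) {
    return { allowed: false, message: "This month's usage limit has been reached." };
  }
  return { allowed: true, remaining: remaining - estimatedTokens };
}
```

Checking before the call, not after, is what makes the cap hard: a request that would exceed the budget never reaches the model.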
5.2 Agentic Engine (Tier 2)
An async coding agent triggered by the Worker's job queue. Runs in cloud (Cloudflare Durable Object or dedicated serverless function).
Tool registry (6 tools, no more):
| Tool | Purpose |
|---|---|
| `read_file(path)` | Read any file in the site repo via GitHub API |
| `write_file(path, content)` | Create or overwrite a file; auto-commits to working branch |
| `list_files(path?)` | List directory contents; called at job start to map site structure |
| `search_content(query)` | Find sections by text or `data-section` attribute |
| `create_pr(title, body)` | Open PR from working branch; triggers Cloudflare Pages preview deploy |
| `notify(channel, message)` | Send WhatsApp/Telegram message to requesting user |
The agent cannot: browse the web, run shell commands, access other orgs' repos, modify its own system prompt, or call any API beyond this list.
Execution isolation:
- Each Tier 2 job runs in its own dedicated Durable Object instance
- Hard timeout of 10 minutes per job; automatic cleanup on expiry
- No shared in-memory state between concurrent jobs; isolation is structural, not policy
Job serialisation per org:
- Maximum of one active Tier 2 job per org at any time
- If a second request arrives while a job is running or awaiting approval, it is queued with a notification: "I've got another change lined up. I'll start on it once you approve or discard the current preview"
- This prevents concurrent PR conflicts and gives members clarity about what's in flight
- Tier 1 (D1 updates) is unaffected: D1 is ACID-compliant and handles concurrency natively
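The one-active-job rule reduces to a small admission check. In this sketch the status names follow the AgentJob type in §6, while `JobRef` and `admitJob` are hypothetical:

```typescript
type JobStatus =
  | "queued" | "running" | "awaiting_approval"
  | "approved" | "discarded" | "failed" | "expired";

interface JobRef {
  id: string;
  org_id: string;
  status: JobStatus;
}

// A job blocks its org while it is running or waiting on an admin decision.
const BLOCKING_STATUSES: ReadonlySet<JobStatus> = new Set<JobStatus>([
  "running",
  "awaiting_approval",
]);

// Hypothetical admission check: start the new job now, or hold it in the queue.
function admitJob(existing: JobRef[], orgId: string): "start" | "queue" {
  const busy = existing.some(
    (job) => job.org_id === orgId && BLOCKING_STATUSES.has(job.status),
  );
  return busy ? "queue" : "start";
}
```

Jobs belonging to other orgs never affect the decision, so one org's pending preview cannot delay another's work.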
Execution flow:
- Receive job from queue (org ID, repo, raw request, style guide, notify target)
- `list_files()`: map site structure
- `read_file()` on relevant files: understand current state
- Plan the change internally
- `write_file()` to implement (may loop back to verify)
- `read_file()` on modified files: verify output
- `create_pr()`: preview deploy auto-triggers
- `notify()` user with preview URL and approval prompt
- Wait for approval webhook (Worker receives thumbs-up → merges PR → live)
Preview lifecycle:
- Preview URLs expire after 48 hours if not actioned
- A nudge notification is sent at 24 hours: "Just checking: did you want to approve or discard that preview?"
- On expiry: branch is deleted, job marked `discarded`, user notified; they can simply ask again
- Prevents unbounded GitHub PR and branch accumulation at scale
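The 24-hour nudge and 48-hour expiry can be expressed as a pure decision function that a periodic sweep calls for each pending preview. This is a sketch; `previewAction` and the `nudged` flag are assumed names, not part of the spec.

```typescript
const HOUR_MS = 60 * 60 * 1000;

type PreviewAction = "none" | "nudge" | "expire";

// Decide what to do with one pending preview.
// createdAtMs: when the PR was opened. nudged: whether the 24h reminder was already sent.
function previewAction(createdAtMs: number, nowMs: number, nudged: boolean): PreviewAction {
  const ageMs = nowMs - createdAtMs;
  if (ageMs >= 48 * HOUR_MS) return "expire";
  if (ageMs >= 24 * HOUR_MS && !nudged) return "nudge";
  return "none";
}
```

Keeping the time arithmetic free of I/O makes the thresholds easy to test without waiting two days.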
5.3 Content Layer (Cloudflare D1 + Pages)
D1 database stores all multi-tenant state:
- Organisation config (replaces SITE.json files)
- Content schemas per org (events, announcements, team, etc.)
- Content data (the actual structured content, not in HTML)
- Style Guide per org (fed into Tier 2 system prompt as Layer 2)
- Job queue state and history
- Approval tokens
- Conversation history per org (see §6)
- Observability metrics per org (token usage, job counts, latency)
Cloudflare Pages serves all sites:
- Each org has a Pages project
- Preview deploys auto-created on PR open (for Tier 2 approvals)
- Production deploy triggered on PR merge (approval) or direct (Tier 1 simple updates)
- Build hooks for programmatic triggers
5.4 GitHub (Source of Truth)
All site files live in GitHub repos:
- One repo per org
- All Tier 2 changes go through PRs with full history
- Preview branches auto-deleted on merge or discard
- Audit trail: every AI-generated change has a commit with description
- Rollback: the last successfully merged commit is tagged `last-known-good` on each merge, enabling instant revert via `git revert` if a deployed change causes problems
5.5 Messaging Layer (Twilio)
- One dedicated WhatsApp number per org (Twilio Business API)
- Inbound webhooks routed to Cloudflare Worker
- Twilio handles message delivery reliability
- Supports WhatsApp, SMS fallback
- Future: Telegram Bot API (similar model, different provider)
5.6 Observability Layer
Instrumented from Phase 3 onwards. All metrics stored in D1 and surfaced in the operator dashboard (Phase 5).
Per-org metrics tracked:
- Token usage: input + output tokens per job and per Tier 1 request; basis for billing and anomaly detection
- Tier routing accuracy: what % of requests are escalated to Tier 2 unnecessarily? (calibrates the classifier)
- Approval rate: how often do admins reject AI-generated PRs? (quality signal: declining approval rate = degraded output)
- Time-to-publish: end-to-end latency from inbound message → live site
- Error patterns: which orgs, request types, or site stacks fail most often?
- Preview abandonment rate: how often do previews expire without action?
These metrics enable "profile first, then optimise": routing decisions, model selection, and cost caps should be tuned based on real data, not assumptions.
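Two of these metrics fall straight out of job history. The sketch below assumes expired previews carry the `expired` status from the AgentJob type (so they can be told apart from explicit discards) and that failed jobs never reached an admin; both function names are hypothetical.

```typescript
interface JobOutcome {
  status: "approved" | "discarded" | "expired" | "failed";
}

// Share of explicitly decided previews that admins approved.
// Returns null when no preview has been decided yet.
function approvalRate(jobs: JobOutcome[]): number | null {
  const approved = jobs.filter((j) => j.status === "approved").length;
  const rejected = jobs.filter((j) => j.status === "discarded").length;
  const decided = approved + rejected;
  return decided === 0 ? null : approved / decided;
}

// Share of previews that expired without any admin action.
// Failed jobs are excluded because they never produced a preview.
function abandonmentRate(jobs: JobOutcome[]): number | null {
  const previewed = jobs.filter((j) => j.status !== "failed");
  if (previewed.length === 0) return null;
  const expired = previewed.filter((j) => j.status === "expired").length;
  return expired / previewed.length;
}
```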
5.7 Operator Dashboard (future Phase 5)
A web UI for Andrew (or future operators) to:
- View all orgs, their status, and last activity
- Manage Style Guides and personas
- See job history (Tier 2 runs, successes, failures)
- Provision new organisations
- View billing and usage per org
- Monitor observability metrics
6. Data Model
Organisation
type Organisation = {
id: string; // "brighton-scouts"
name: string; // "Brighton 1st/14th Sea Scouts"
status: "active" | "inactive" | "onboarding";
created_at: string;
// Messaging
whatsapp_number: string; // Twilio provisioned number
telegram_bot_token?: string;
notify_channels: NotifyChannel[];
// Site
site_repo: string; // GitHub repo URL
cf_project: string; // Cloudflare Pages project name
site_url: string; // Production URL
preview_url_template: string; // "{branch}.cf-project.pages.dev"
build_cmd: string | null; // null = no build step
dist_dir: string; // "." or "dist/public"
// Style
style_guide: StyleGuide; // See §7
persona: Persona; // See §7
// Admins
admin_contacts: string[]; // Phone numbers who can approve deploys
// Limits
monthly_token_cap?: number; // Hard spend cap; unset = operator default
tier2_rate_limit?: number; // Max Tier 2 jobs per day
};
Conversation History (per org)
Design principle: Intra-org shared context is a feature, not a bug. Multiple committee members should be able to interact with the same bot and have it understand what's already been done. Skipper should know that Person A just changed the meeting time, so when Person B asks "what events do we have?", the answer is current.
Conversation context is scoped by org_id only; the whole organisation shares one bot context.
type ConversationMessage = {
id: string;
org_id: string; // Isolation boundary: between orgs, not between users
sender_phone: string; // Attributed for audit, not for isolation
role: "user" | "assistant";
content: string;
created_at: string;
};
Context window policy:
- Load the most recent N messages (suggested: 50) or messages within the last 7 days, whichever is smaller
- Older history is archived, not deleted; available for audit but not injected into prompts
- Explicit context reset is available as an admin command if needed ("Skipper, start fresh")
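A sketch of the window selection, reading "whichever is smaller" as applying both limits at once (at most N messages, none older than 7 days). The function name and defaults are assumptions:

```typescript
interface ConversationMessage {
  id: string;
  org_id: string;
  sender_phone: string;
  role: "user" | "assistant";
  content: string;
  created_at: string; // ISO 8601 timestamp
}

const DAY_MS = 24 * 60 * 60 * 1000;

// Returns the messages to inject into the prompt, oldest first.
function selectContext(
  messages: ConversationMessage[],
  now: Date,
  maxMessages = 50,
  maxAgeDays = 7,
): ConversationMessage[] {
  if (maxMessages <= 0) return [];
  const cutoff = now.getTime() - maxAgeDays * DAY_MS;
  return messages
    .filter((m) => Date.parse(m.created_at) >= cutoff) // drop anything too old
    .sort((a, b) => Date.parse(a.created_at) - Date.parse(b.created_at))
    .slice(-maxMessages); // keep only the newest N
}
```

The input array is not mutated, because `filter` returns a copy before sorting.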
Content Schema (per org)
Rather than the AI editing HTML directly, simple content is stored structurally:
type ContentSchema = {
org_id: string;
sections: {
[sectionName: string]: SectionSchema;
};
};
type SectionSchema = {
type: "list" | "single" | "richtext" | "contacts";
label: string;
items?: ContentItem[]; // for type=list
value?: string; // for type=single or richtext
};
// Example for Brighton Sea Scouts:
// sections.events = { type: "list", label: "Events", items: [...] }
// sections.announcement = { type: "single", label: "Notice board", value: "..." }
// sections.contact_email = { type: "single", label: "Contact", value: "..." }
For Tier 1 simple updates, the Worker updates D1 content directly and triggers a template rebuild: the HTML is generated from the template + content data, not edited directly.
Job (Tier 2)
type AgentJob = {
id: string;
org_id: string;
status: "queued" | "running" | "awaiting_approval" | "approved" | "discarded" | "failed" | "expired";
request: string; // Raw user message
requested_by: string; // Phone number
queued_at: string;
started_at?: string;
completed_at?: string;
expires_at?: string; // Set when status=awaiting_approval; 48h from PR creation
pr_url?: string;
preview_url?: string;
pr_number?: number;
change_summary?: string; // Agent's description of what it did
token_usage?: { input: number; output: number; }; // For billing/observability
error?: string; // If status=failed
approval_token: string; // Used to verify thumbs-up came from right user
};
7. Persona & Style Guide System
This is PromptCMS's key differentiator. The bot isn't a generic assistant; it has a specific identity that matches the organisation.
Persona
type Persona = {
name: string; // "Skipper" / "Ranger 🌿"
greeting: string; // First message sent when bot joins group
style: string; // Writing style guidance for the AI
theme: string; // "nautical" / "nature / conservation"
avoid: string[]; // ["corporate", "jargon", "markdown headers"]
emoji_use: "sparse" | "moderate" | "none";
};
The persona is injected as Layer 1 of the system prompt in all AI interactions. The bot never breaks character, never mentions OpenClaw, and never identifies as an AI unless directly asked.
Style Guide
Generated once at site onboarding. Stored per-org in D1. Injected as Layer 2 of the Tier 2 agent's system prompt; it's the agent's visual bible for that site.
Style Guide safety: before a Style Guide is stored in D1 or injected into a system prompt, it is:
- Validated against the `StyleGuide` schema (structured JSON, not freeform text)
- Sanitised to remove any instruction-like patterns (e.g. strings containing "ignore", "system prompt", "instructions")
- Tested in a sandbox prompt before being activated for live requests
This prevents both accidental ambiguity and deliberate prompt injection via admin-uploaded style data.
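The sanitisation step could start from a scan like the one below. It is a sketch that only flags suspect values for review and does not rewrite them; the pattern list simply mirrors the examples above and would need tuning, since a word like "ignore" can appear in legitimate copy.

```typescript
// Illustrative patterns only, taken from the examples in this section.
const SUSPICIOUS_PATTERNS: RegExp[] = [/ignore/i, /system prompt/i, /instructions/i];

// Walk a parsed Style Guide and collect the JSON paths of string values
// that look like embedded instructions. An empty result means nothing was flagged.
function findSuspiciousStrings(value: unknown, path = "$", hits: string[] = []): string[] {
  if (typeof value === "string") {
    if (SUSPICIOUS_PATTERNS.some((re) => re.test(value))) hits.push(path);
  } else if (Array.isArray(value)) {
    value.forEach((item, i) => findSuspiciousStrings(item, `${path}[${i}]`, hits));
  } else if (typeof value === "object" && value !== null) {
    for (const [key, child] of Object.entries(value)) {
      findSuspiciousStrings(child, `${path}.${key}`, hits);
    }
  }
  return hits;
}
```

Reporting paths, not just a pass/fail flag, lets the operator see exactly which field needs attention during the review step.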
type StyleGuide = {
colours: {
primary: string; // "#1B3A6B"
secondary: string; // "#F0A500"
background: string;
text: string;
accent: string;
};
typography: {
heading_font: string; // "Montserrat"
body_font: string; // "Open Sans"
base_size: string; // "16px"
scale_ratio: string; // "1.25"
};
layout: {
feel: string; // "clean, outdoorsy, family-friendly"
whitespace: string; // "generous"
imagery: string; // "nature, community, action"
mobile_first: boolean;
};
brand_voice: {
tone: string; // "warm, active"
avoid: string[]; // ["corporate", "jargon"]
cta_style: string; // "imperative"
};
technical: {
stack: string; // "plain HTML/CSS/JS"
build_step: boolean;
css_vars: boolean;
image_path: string; // "/images/"
protected_files: string[];
};
site_structure: {
sections: string[]; // ["hero", "about", "events", "gallery", "contact"]
nav_style: string; // "sticky-top"
footer: boolean;
section_id_attr: string; // "data-section"
};
};
The 5-Layer System Prompt (Tier 2)
Clear separation between what the agent can do (L1, L4, L5: static, same for all orgs) and how it presents (L2, L3: dynamic, per org). This consistency is essential for predictable debugging and testing across orgs.
| Layer | Content | Static/Dynamic |
|---|---|---|
| L1 | Core identity: who the agent is, what its mandate is | Static: identical across all orgs |
| L2 | Site context: Style Guide (palette, typography, voice, structure) | Dynamic: per org from D1 |
| L3 | Technical constraints: stack, protected files, conventions | Semi-static: per org |
| L4 | Execution approach: read before writing, minimal changes, verify | Static: identical across all orgs |
| L5 | Safety rules: always branch, always preview, never delete without explicit instruction | Static: identical across all orgs |
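Assembling the five layers might look like the sketch below. The heading format and the `PromptLayers` field names are assumptions; only the layer order comes from the table.

```typescript
interface PromptLayers {
  coreIdentity: string;         // L1: static
  styleGuideJson: string;       // L2: per org, validated and sanitised upstream
  technicalConstraints: string; // L3: per org
  executionApproach: string;    // L4: static
  safetyRules: string;          // L5: static
}

// Join the layers in a fixed order under labelled headings.
function buildSystemPrompt(layers: PromptLayers): string {
  const ordered: Array<[string, string]> = [
    ["L1 Core identity", layers.coreIdentity],
    ["L2 Site context", layers.styleGuideJson],
    ["L3 Technical constraints", layers.technicalConstraints],
    ["L4 Execution approach", layers.executionApproach],
    ["L5 Safety rules", layers.safetyRules],
  ];
  return ordered
    .map(([title, body]) => `## ${title}\n${body.trim()}`)
    .join("\n\n");
}
```

With a fixed order and labels, two orgs' prompts can be diffed directly, and only the L2 and L3 bodies should differ.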
Persona Library (future)
A curated library of persona templates organisations can choose from:
| Persona | Theme | Suitable for |
|---|---|---|
| Skipper | Nautical / sea scouts | Scout groups, sailing clubs |
| Ranger | Nature / conservation | Environmental groups, nature associations |
| Coach | Active / sporting | Sports clubs, fitness groups |
| Bloom | Garden / community | Garden clubs, community gardens |
| Spark | Educational / youth | School committees, youth groups |
| Anchor | Civic / community | Neighbourhood groups, ratepayer associations |
Orgs can use a template as-is, customise the name, or (premium) have a fully custom persona designed.
8. Messaging Layer
Supported Channels (current + planned)
| Channel | Status | Notes |
|---|---|---|
| WhatsApp (Twilio) | Live | Requires Twilio compliance bundle per number |
| Slack | Live | Via OpenClaw; migrates to Worker in Phase 3 |
| Telegram | Planned Phase 4 | Simpler compliance, no per-number bundle |
| SMS | Future | Fallback for orgs without smartphones |
WhatsApp Compliance
WhatsApp Business API via Twilio requires:
- A dedicated phone number per organisation (or per namespace)
- A Twilio compliance bundle (business registration)
- Approval for the specific use case
Current status: Andrew's personal number +61 485 033 211 is provisioned. Compliance bundle BUead167ea... submitted. Once approved, orgs get dedicated numbers provisioned programmatically via Twilio API.
At scale, dedicated numbers are provisioned automatically on org signup via Twilio's REST API:
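// Sketch: provision a number on org signup (Twilio Node helper library).
// areaCode only applies to US/Canada numbers, so search Australian inventory instead.
const [candidate] = await twilio.availablePhoneNumbers('AU').mobile.list({ limit: 1 });
if (!candidate) throw new Error('No AU mobile numbers available');
const number = await twilio.incomingPhoneNumbers.create({
  phoneNumber: candidate.phoneNumber,
  smsApplicationSid: APP_SID,
  voiceApplicationSid: APP_SID,
});
// db.organisations.update stands in for the D1 write.
await db.organisations.update(orgId, { whatsapp_number: number.phoneNumber });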
Message Routing
Inbound WhatsApp message
  → Twilio webhook → POST /webhook/whatsapp
  → Worker validates Twilio signature
  → Lookup org: SELECT * FROM organisations WHERE whatsapp_number = ?
  → Load org config, style guide, persona
  → Load conversation history: last 50 messages for this org_id
  → Process request (Tier 1 or queue for Tier 2)
  → Reply via Twilio REST API
Admin vs Member Actions
Any group member can: request content changes, ask what the site looks like, ask for help.
Only admins can: approve deploys, discard previews, make admin-level config changes.
Admin detection: phone numbers in org.admin_contacts. Future: any WhatsApp group admin auto-detected via Twilio Group Management API.
9. Security & Multi-tenancy
Isolation Model
PromptCMS achieves multi-tenant isolation through data architecture, not container-per-customer infrastructure. This scales to thousands of orgs without proportional operational overhead.
Between-org isolation (strict):
- Each org has a scoped GitHub personal access token with write access to their repo only; the Tier 2 agent is structurally blinded to other orgs' code
- The Tier 2 agent's job payload contains only that org's token; no shared credential
- All D1 queries are filtered by `org_id`; cross-tenant data leakage is architecturally impossible
- Cloudflare Pages projects are org-specific
- Each Tier 2 job runs in an isolated Durable Object instance with no shared state
Within-org context sharing (by design):
- Conversation history is scoped to `org_id` only; all members of an org share one bot context
- This is intentional and correct: Person A updates the meeting time; Person B asks "what's on this week?" → Skipper knows
- No per-user memory isolation within an org; this is a CMS for a shared website, not a personal assistant
Approval Security
The approval flow (thumbs-up → live deploy) requires:
- Message sender is in `org.admin_contacts`
- `approval_token` matches the pending job (prevents replay attacks)
- Job status is `awaiting_approval` (not already approved/discarded)
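The three checks translate into a short guard. In this sketch the field names follow the Organisation and AgentJob types, while `verifyApproval` and its result shape are hypothetical:

```typescript
interface PendingJob {
  status: string;
  approval_token: string;
}

type ApprovalResult = { ok: true } | { ok: false; reason: string };

// Applies the approval checks in the order listed above and reports the first failure.
function verifyApproval(
  adminContacts: string[],
  job: PendingJob,
  senderPhone: string,
  token: string,
): ApprovalResult {
  if (!adminContacts.includes(senderPhone)) {
    return { ok: false, reason: "sender is not an admin" };
  }
  if (job.approval_token !== token) {
    return { ok: false, reason: "token does not match pending job" };
  }
  if (job.status !== "awaiting_approval") {
    return { ok: false, reason: `job is ${job.status}` };
  }
  return { ok: true };
}
```

A production version would likely compare tokens in constant time and log the failure reason for the audit trail.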
Audit Trail
Every AI-generated change:
- Produces a git commit with description (Tier 2)
- Stored in `jobs` table with full request, response, change summary, and token usage
- Preview URL generated before anything goes live
- Human approval required for production deploy
- `last-known-good` tag on each successful merge enables instant rollback
Rollback
Any admin can request rollback conversationally: "Skipper, undo the last change"
Implementation: git revert on the last merge commit → new PR → preview → approval → live. The same preview-before-publish flow applies to rollbacks; no changes bypass human review.
Secrets Management
- Twilio credentials: Cloudflare Worker environment secrets (per org, or shared for operator account)
- GitHub tokens: Cloudflare Worker secrets, one per org
- Anthropic API key: Worker environment secret (operator-level)
- Cloudflare API token: Worker environment secret
10. Phase Roadmap
Phase 0: Foundation (COMPLETE)
What was built:
- Brighton Sea Scouts live on Cloudflare Pages
- WhatsApp → OpenClaw → Mimir → wrangler deploy pipeline working
- SITE.json config model
- Preview → admin approval → production deploy flow
- Slack mirror relay (every exchange relayed to #prompt-cms and #brighton-sea-scouts)
- YWNA site created and configured (Ranger persona)
Proved: the core thesis works. A non-technical user can update a live website by typing a message in WhatsApp.
Phase 1: YWNA Activation
Goal: Both sites fully operational. Validate that the second deployment proves the config model is reusable, not a one-off.
Tasks:
- Add YWNA bot number to the YWNA WhatsApp group
- Run full E2E test: YWNA member makes a change request → preview → admin approves → live
- Decision: keep "Ranger" or rename persona (Willy Wagtail / "Willy" proposed)
- Finalise YWNA admin list (who has approval rights beyond Andrew)
- Merge or close stale `preview-mem3tier` branch in YWNA repo
- Document the current operational model (runbook) so a second operator could follow it
Exit criteria: A YWNA team member (not Andrew) successfully deploys a change end-to-end.
Phase 2: Robustness
Goal: The current architecture is reliable enough to onboard a third org without Andrew babysitting it.
Tasks:
- Config validation: schema-validate SITE.json on load; hard error with clear message on malformed config (the array-vs-object bug has hit twice)
- Deploy reliability: replace ad-hoc wrangler CLI calls with a proper deploy script with retry logic and error reporting
- GitHub-triggered deploys: connect Brighton and YWNA repos to Cloudflare Pages git integration, so `git push origin main` deploys automatically; remove manual wrangler calls where possible
- Error handling: if the AI makes an edit that breaks the build (YWNA especially), surface the error clearly to the user and roll back gracefully
- Onboarding playbook: document exactly how to add a new site (SITE.json, index.json, git setup, Cloudflare Pages project, Twilio channel)
- Pending state persistence: pending.json is currently in-memory/file; move to a proper store so it survives Mimir restarts
- Thumbs-up reaction deploy: implement emoji reaction (👍) on preview message as alternative to typing "deploy"; better UX for mobile users
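For the config validation task, a hand-rolled check is enough to turn silent failures into hard errors. The sketch below assumes the field set from the SITE.json example in §3 and guesses that the array-vs-object bug concerns `channels` and `persona`; the spec does not say which fields were affected.

```typescript
// Returns human-readable problems; an empty list means the config passed.
function validateSiteConfig(raw: unknown): string[] {
  if (typeof raw !== "object" || raw === null || Array.isArray(raw)) {
    return ["SITE.json must be a JSON object"];
  }
  const cfg = raw as Record<string, unknown>;
  const errors: string[] = [];
  for (const key of ["slug", "title", "cfProject", "distDir", "siteUrl"]) {
    const value = cfg[key];
    if (typeof value !== "string" || value === "") {
      errors.push(`"${key}" must be a non-empty string`);
    }
  }
  const buildCmd = cfg["buildCmd"];
  if (buildCmd !== null && typeof buildCmd !== "string") {
    errors.push('"buildCmd" must be a string or null');
  }
  // Assumed trouble spots: channels is a list, persona is a single object.
  if (!Array.isArray(cfg["channels"])) errors.push('"channels" must be an array');
  const persona = cfg["persona"];
  if (typeof persona !== "object" || persona === null || Array.isArray(persona)) {
    errors.push('"persona" must be an object');
  }
  return errors;
}
```

On load, a non-empty result would be thrown as a single error listing every problem, so the operator can fix the file in one pass.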
Exit criteria: Can add a third org by following the playbook, zero OpenClaw debugging required.
Phase 3: Architecture Migration (Decouple from Mac Mini)
Goal: The Mac Mini and Mimir are removed from the real-time message handling path. All live user interactions run in cloud infrastructure.
This is the most significant engineering phase: it's a full re-architecture, not an extension of the current system.
Part A: Cloudflare Worker (Tier 1)
Build the fast-path Worker:
- Twilio webhook handler with signature verification
- Org lookup: D1 database (`SELECT * FROM organisations WHERE whatsapp_number = ?`)
- Request classifier: cheap AI call → "simple" or "complex"
- Simple path: AI extracts structured data (event date, text, etc.) → writes to D1 content store → triggers Pages rebuild → replies "Done"
- Complex path: enqueues Tier 2 job → replies "Working on it, usually a few minutes"
- Approval handler: receives thumbs-up → verifies → merges PR via GitHub API → fires production deploy
- Worker hardening: rate limiting, cost caps, circuit breakers, and graceful Tier 2 degradation (see §5.1)
Part B: D1 Multi-tenant Database
Schema:
- `organisations`: replaces all SITE.json files
- `content`: structured content per org (events list, announcements, etc.)
- `style_guides`: Style Guide JSON per org
- `jobs`: Tier 2 job history and state
- `pending_approvals`: outstanding preview approvals
- `conversation_history`: per-org message context with TTL (see §6)
- `metrics`: per-org observability data (token usage, latency, routing accuracy)
Data migration:
- Migrate Brighton and YWNA configs from SITE.json to D1
- Generate initial content schemas from existing site HTML
- Seed initial `last-known-good` tags on existing repos
Part C: Site Template Architecture
For Tier 1 to work (updating content data without touching HTML), sites need a clean content/template separation:
- Template: `template.html`, a fixed layout never touched by AI in normal operation
- Content: stored in D1, injected at build time
- Builder: a small Pages Function or Worker that combines template + D1 content on each deploy
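The builder's core could be as small as a placeholder substitution. This sketch assumes a `{{key}}` placeholder syntax and flat string content, neither of which is fixed by this spec:

```typescript
// Escape text so content values cannot inject markup into the page.
function escapeHtml(text: string): string {
  return text
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// Replace {{key}} placeholders with escaped content values.
// Unknown keys are left in place so a missing field is visible in the preview.
function renderTemplate(template: string, content: Record<string, string>): string {
  return template.replace(/\{\{\s*([\w.]+)\s*\}\}/g, (match: string, key: string) =>
    Object.prototype.hasOwnProperty.call(content, key)
      ? escapeHtml(content[key] ?? "")
      : match,
  );
}
```

Escaping at render time matters here because the content originates from chat messages, which are untrusted input.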
Note: This requires rebuilding (or adapting) the Brighton and YWNA sites. Brighton (pure HTML) is straightforward: extract content into JSON, generate the HTML via template. YWNA (React/Vite) is more complex, and the strong recommendation is to rebuild it as plain HTML (see §11 Open Decisions).
Part D: Cut Over
- Point Twilio webhooks at the new Worker (replacing OpenClaw)
- Verify end-to-end flow for both sites
- Mac Mini / OpenClaw deprecated from critical path
- Mimir retained for admin operations only (onboarding, debugging, config updates)
Exit criteria: Both sites receiving and processing messages with zero Mac Mini involvement. Mimir not required for any user-facing operation.
Phase 4: Tier 2 Agentic Engine
Goal: Complex layout and design requests (new sections, visual redesign, substantial page changes) handled by an async cloud agent, not Mimir.
What gets built:
- Job queue: Cloudflare Queues or Durable Object for async job management, with per-org serialisation (max 1 active Tier 2 job per org)
- Agentic runtime: Cloudflare Durable Objects (one DO instance per job, isolated)
- GitHub integration: read/write via GitHub API with per-org scoped tokens
- Agent system prompt: 5-layer architecture (see ยง7)
- Style Guide injection: L2 of system prompt loaded from D1 per org, sanitised before injection
- Preview flow: PR creation → Cloudflare Pages preview → notify user
- Approval webhook: Worker receives thumbs-up → verifies approval token → merges PR → tags `last-known-good`
- Preview expiry: 48h TTL with 24h nudge notification; expired jobs are auto-discarded and branch cleaned up
Style Guide generation at onboarding:
When a new org is set up, Andrew (or eventually a self-service wizard) runs a Style Guide generation step:
- Provide the site's existing URL or design brief
- AI analyses the site (or brief) and generates a structured Style Guide JSON
- Review and adjust
- Sanitise and validate
- Store in D1; this is L2 for all future Tier 2 jobs for that org
Exit criteria: A complex request ("add a photo gallery above the events section, make it feel more outdoorsy") is handled end-to-end without Mimir involvement. User receives a preview URL within 5 minutes.
Phase 5: Scale & Polish
Goal: The system handles multiple orgs reliably. Onboarding a new org is a defined, documented process that doesn't require Andrew's time per site.
Features:
- Third org: use the playbook to onboard a third organisation (ideally unrelated to sea scouts or YWNA, to prove generality)
- Self-service onboarding wizard: web UI where an org provides their details (name, WhatsApp number, site preferences, persona preference) and the system provisions everything automatically
  - Twilio number provisioning via REST API
  - Cloudflare Pages project creation via API
  - GitHub repo creation from template
  - D1 record creation
  - Style Guide generation and sanitisation
  - Persona assignment
- Dedicated numbers: each org gets their own WhatsApp number (removes shared-number limitation)
- Persona library: 6+ pre-built personas (Skipper, Ranger, Coach, Bloom, Spark, Anchor) for orgs to choose from at signup
- Admin auto-detection: any WhatsApp group admin automatically gets deploy approval rights (no manual admin list maintenance)
- Operator dashboard: web UI showing all orgs, job history, uptime, usage, errors, observability metrics
- Weekly digest: automated summary sent to each org's group: "your site had 3 updates this week, here's what changed"
- Rollback: admin can request rollback to any previous version ("undo the last change"): git revert → preview → approval → live
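The rollback flow in the list above (git revert, then preview, then approval, then live) follows the same path as any other change, which can be sketched as a small state machine. The state and event names here are hypothetical; the spec only fixes the order of the steps.

```typescript
// Hypothetical state and event names; the spec only fixes the order:
// change (or git revert) -> preview -> approval -> live.
type ChangeState = "requested" | "preview" | "approved" | "live" | "discarded";
type ChangeEvent = "previewReady" | "adminApproved" | "deployed" | "rejected" | "expired";

// Allowed transitions. There is deliberately no edge from "requested" to "live".
const TRANSITIONS: Record<ChangeState, Partial<Record<ChangeEvent, ChangeState>>> = {
  requested: { previewReady: "preview" },
  preview: { adminApproved: "approved", rejected: "discarded", expired: "discarded" },
  approved: { deployed: "live" },
  live: {},
  discarded: {},
};

// Return the next state, or throw if the event is not valid in the current state.
function advance(state: ChangeState, event: ChangeEvent): ChangeState {
  const next = TRANSITIONS[state][event];
  if (next === undefined) {
    throw new Error('event "' + event + '" is not allowed in state "' + state + '"');
  }
  return next;
}
```

A rollback would enter this machine in the `requested` state like any other change, so it cannot reach `live` without passing through a preview and an admin approval.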
Exit criteria: A new org can be fully onboarded without Andrew writing any code or config.
Phase 6: Product
Goal: PromptCMS is a sellable SaaS. External operators (orgs, agencies) pay for the service.
Features:
- Stripe billing: subscription per org (target: $20-30/month per org)
  - Free tier: 1 org, limited complex requests/month
  - Standard: full access, dedicated number, persona customisation
  - Agency tier: multiple orgs under one account, white-label option
- Custom domains: org maps their own domain (scouts.example.org.au) to the Pages site
- White-label: agency resellers can remove PromptCMS branding
- Custom persona design: premium tier in which Andrew (or an operator) designs a bespoke persona for the org
- Google Photos integration: orgs can connect a Google Photos album; bot can pull latest photos into gallery section
- Analytics: orgs can ask the bot "how many people visited the site this week?", with basic Pages analytics surfaced conversationally
- Public launch: landing page, pricing, signup flow
Strategic option (vertical play):
Rather than broad horizontal SaaS, consider approaching a national body (Scouts Australia, Surf Life Saving Australia, Tennis Australia) to endorse PromptCMS to their member clubs. One partnership → hundreds of orgs, zero marketing spend. This is the fastest path to meaningful scale and a defensible position.
Exit criteria: 10+ paying orgs. Positive unit economics. System runs without Andrew's operational involvement.
11. Open Decisions
| Decision | Options | Recommended | Blocker for |
|---|---|---|---|
| YWNA persona name | Keep "Ranger" / "Willy Wagtail" / other | Decide with YWNA team | Phase 1 |
| Site template architecture | Separate content/template vs keep editing HTML | Content/template separation | Phase 3 |
| Tier 2 runtime | Cloudflare Durable Objects vs separate VPS | Durable Objects (simpler ops, better isolation) | Phase 4 |
| Distribution model | Horizontal SaaS vs vertical (national body deal) | Vertical first | Phase 6 |
| Telegram support | Twilio vs direct Telegram Bot API | Direct Telegram API (simpler) | Phase 4 |
| Agency tier pricing | Flat per-org vs per-message | Flat per-org | Phase 6 |
| Style Guide generation | Manual (Andrew) vs AI-assisted wizard | AI-assisted via Mimir | Phase 4 |
| YWNA React complexity | Keep React build / rebuild as plain HTML | Plain HTML (simpler, faster, more reliable, uniform stack) | Phase 3 |
| Conversation history window | 50 messages vs 7 days vs both | Both (whichever is smaller) | Phase 3 |
| Preview expiry duration | 24h / 48h / 72h | 48h with 24h nudge | Phase 4 |
Note on YWNA: The React/Vite/TypeScript stack was appropriate for a full-stack app but adds significant complexity for a primarily static community website. Strong recommendation: rebuild YWNA as plain HTML/CSS (like Brighton) during Phase 3. The Tier 1 content/template model is much simpler without a build step, and keeping a uniform stack means the Tier 2 agent behaves predictably across all orgs with no special-casing.
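For the conversation history window decision in the table above, the recommended "both (whichever is smaller)" option amounts to applying the 7-day cutoff and the 50-message cap together. A minimal sketch, assuming messages arrive sorted oldest first and using the hypothetical names `HistoryMessage` and `historyWindow`:

```typescript
// Hypothetical message shape; the 50-message and 7-day limits come from the table.
interface HistoryMessage {
  sentAtMs: number;
  text: string;
}

const MAX_HISTORY_MESSAGES = 50;
const MAX_HISTORY_AGE_MS = 7 * 24 * 60 * 60 * 1000;

// Assumes `messages` is sorted oldest first.
// Applies both limits, so the result is the smaller of the two windows.
function historyWindow(messages: HistoryMessage[], nowMs: number): HistoryMessage[] {
  const recent = messages.filter((m) => nowMs - m.sentAtMs <= MAX_HISTORY_AGE_MS);
  return recent.slice(-MAX_HISTORY_MESSAGES); // keep the newest 50 at most
}
```

A busy group is bounded by the message cap, while a quiet group is bounded by the age cutoff, so stale context does not linger for weeks.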
12. Guiding Principles
Reliability over features. A bot that sometimes fails destroys trust with community volunteers who have little patience for technology issues. Every phase must leave the system more reliable than it found it.
The user experience is the conversation. Every design decision should be evaluated against the experience of a 60-year-old committee member opening WhatsApp and typing a message. If it makes that interaction more confusing, it's wrong regardless of how elegant it is technically.
Preview before publish, always. No AI change should ever go live without a human seeing it first. This is non-negotiable at every tier and every phase. The preview-then-approve flow is the product's trust mechanism. This applies equally to rollbacks.
Persona is identity, not decoration. The fact that the bot is Skipper (not "PromptCMS Bot") is what makes members engage with it naturally. Every product decision that dilutes the persona weakens the product. Protect it.
Mac Mini is the prototype, not the product. The current architecture works and taught us everything we need to know. It is not a foundation to build on; it's a proof of concept to replace. Phase 3 exists to make this clean.
Content and code are different jobs. The AI's job in Tier 1 is to update data. The AI's job in Tier 2 is to edit code. These require fundamentally different architectures. Never conflate them; keep the separation clean.
Isolation through architecture, not infrastructure. Multi-tenancy is achieved via scoped tokens, org_id-filtered queries, and per-job Durable Object instances, not container-per-customer. This scales to thousands of orgs without proportional operational overhead.
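One way to make org_id filtering hard to forget is to route every read through a helper that always adds the filter. The sketch below is an assumption-laden illustration: the table names and the `selectForOrg` helper are hypothetical, and it only builds the SQL and parameters rather than talking to D1.

```typescript
// Hypothetical table names; the real D1 schema is defined elsewhere in the spec.
const ORG_SCOPED_TABLES = ["jobs", "messages", "style_guides"];

interface ScopedQuery {
  sql: string;
  params: string[];
}

// Build a SELECT that is always filtered by org_id.
// The table name is checked against an allow-list because it cannot be bound as a parameter.
function selectForOrg(table: string, orgId: string): ScopedQuery {
  if (ORG_SCOPED_TABLES.indexOf(table) === -1) {
    throw new Error("unknown table: " + table);
  }
  if (orgId.length === 0) {
    throw new Error("orgId is required");
  }
  return { sql: "SELECT * FROM " + table + " WHERE org_id = ?", params: [orgId] };
}
```

In a Worker with a D1 binding (assumed here to be named `DB`), the result could be passed to `env.DB.prepare(q.sql).bind(...q.params).all()`, so no call site writes an unfiltered query by hand.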
Within-org context is shared by design. The bot is a shared resource for a shared website. Intra-org conversation history enables collaboration (Person A changes the meeting time; Person B asks what's on, and the bot knows). Isolation boundaries are between orgs, not between members.
Measure before optimising. Instrument observability from Phase 3. Routing decisions, model selection, and cost caps should be tuned on real data, not assumptions. The Tier 1 classifier is the biggest cost lever at scale; you need to see it before you can tune it.
Mimir moves up the stack. In the target architecture, Mimir (and Andrew) do the high-value, high-judgment work: onboarding, persona design, Style Guide generation, edge case handling. Not the repetitive, latency-sensitive work of answering WhatsApp messages. That's what the cloud infrastructure is for.
This document is a living spec. Update it as architectural decisions are made, phases complete, and the product evolves. The goal is that at any point, a developer could read this document and understand exactly what PromptCMS is, where it stands, and what needs to be built next.