eu-central-1)

Original phased intent (still useful for scope storytelling) · Core platform shipped to production · see LOCAL_DEV_SKIPS.md for any remaining ops notes
Historical sprint outcomes · release gate today is root README.md checklist (typecheck, lint, unit tests; Playwright when validating UI)
| Day | Status | Phase | Demoable Outcome | Acceptance Gate | Story Set |
|---|---|---|---|---|---|
| DAY 1 · Fri May 8 | ✓ Built | Foundation | Super Admin login, instance branding, projects/categories, question library with PRD question types, audit trail; matches shipped admin foundation. | `pnpm exec prisma generate` · `pnpm -r run typecheck && pnpm -r run lint && pnpm -r run test` green · optional `cd app && pnpm exec playwright test` | FND-001..003 · INF-001 · OPS-001 · AUTH-001..006 · BRD-001..003 · PRJ-001 · CAT-001 · QL-001..002 |
| DAY 2 · Mon May 11 | ✓ Built | Builder + Respondent | Multi-stage builder, QR/slug distribution, chat-style respondent flow, PIN + forgot-PIN (6-digit), magic-link resume, DB-backed lockout; implemented in hxiq-platform with Playwright coverage (forgot-PIN, lockout, magic link specs). | Same root gate as Day 1 · E2E specs 07–16 etc. per app/tests/e2e/README.md | BLD-001..004 · BRN-001..003 · LIF-001..003 · RSP-001..009 · PIN-001..005 |
| DAY 3 · Tue May 12 | ✓ Built | Distribution + Identity | CSV import, guest codes, custom fields, SES-capable dispatch + worker queue, reminder cron (rules A/B/C/E), scheduled dispatch, distribution tags; prod uses SES from app.hxiq.ae; integration Mailpit spec optional. | Root test gate · prod: SES verified in eu-central-1 · webhook hardening still listed in LOCAL_DEV_SKIPS.md | DST-001..005 · CF-001..005 · GC-001..004 · IMP-001..004 · EML-001..009 · DLO-001 · REM-001..006 |
| DAY 4 · Wed May 13 | ✓ Built | Reporting + AI | Questionnaire dashboard (filters, charts, NPS, transcripts, PDF/CSV), AI insight slots, ⌘K reporting assistant, theme extraction, anomalies, correlations; OpenAI-backed in prod with local stub mode. | Root gate · AI requires OPENAI_API_KEY in SSM for prod · circuit breaker / usage stats in app | RPT-001..012 · EXP-001..004 · AI-001..012 · CAS-001..005 |
| DAY 5 · Thu May 14 | ✓ Built · live | Hardening + Launch 🚀 | app.hxiq.ae live with TLS (Caddy/LE), Docker Compose deploy from GitHub Actions, Slack deploy notifications, Sentry; multi-tenant platform ready for additional instances via DNS/branding when needed. | See DEPLOY-RUNBOOK.md · no shell script named run-phase-validation.sh in repo; use README regression gate + manual QA CSV | HRD-001..004 · INF-002..006 · OPS-002..010 · RET-001..004 · LCH-001..006 · TST-005 |
Third-party services and accounts · Status column reflects what is wired in production
| Phase | Status | Service / Provider | What's Needed | Owner | Cost Model | Lead Time |
|---|---|---|---|---|---|---|
| DAY 0 (pre-flight) | ✓ Done | AWS Account · eu-central-1 | Deployed footprint uses Frankfurt (eu-central-1): EC2, ECR, SES, S3, KMS, SSM. OIDC from GitHub Actions for deploy, so no long-lived keys on runners. | Apptology CTO | Pay-per-use | Same day · 5 min |
| DAY 0 (pre-flight) | ✓ Done | AWS SES Production Access | Submit production-access form on SES console with justification ("Per-instance transactional email for HXIQ survey platform"). Sandbox limits to verified-only addresses; production access lifts the cap. Sending domain (live: app.hxiq.ae) verified with DKIM + SPF + DMARC via verify-ses-domain.sh. | Agent (script-driven; AWS approves) | Free signup | ~24h AWS approval; submit Day 0 morning |
| DAY 0 (pre-flight) | ✓ Done | Cloudflare · DNS + Turnstile | Cloudflare account with the production domain added as a Zone. API token scoped to Zone:DNS:Edit + Account:Turnstile:Edit. Turnstile site key + secret key for the respondent landing's bot prevention. | Apptology CTO | Free tier covers all v1 needs | Same day · 10 min |
| DAY 0 (pre-flight) | ✓ Done | OpenAI API · Responses + embeddings | Production uses OpenAI (OPENAI_API_KEY in SSM). Defaults: gpt-5.4-mini for insights/assistant, text-embedding-3-large for theme extraction. Anthropic/Bedrock paths retired; see DEPLOY-RUNBOOK.md. | Operator / client key | Usage-based | Key in SSM before enabling AI panels in prod |
| DAY 0 (pre-flight) | ✓ Done | Sentry · Error tracking | Managed Sentry (Apptology org) + org auth token for CI source maps; single project hxiq with instance tagging (see LOCAL_DEV_SKIPS.md). | Apptology ops | Org plan | Done; token in SSM |
| DAY 0 (pre-flight) | ✓ Done | GitHub · Apptologyteam/hxiq | Repo at https://github.com/Apptologyteam/hxiq with default branch main. GitHub CLI (gh) authenticated on the agent's machine. CI runs on GitHub Actions free tier (2,000 min/month). | Apptology CTO | Free tier | Same day · 5 min |
| DAY 0 (pre-flight) | ✓ Live | Production Domain | Live: app.hxiq.ae → EC2 (EIP). Additional customer domains: same TLS + SES verification pattern when onboarding new instances. | Apptology CTO | Per domain | Same day if delegated CNAME · up to 48h for full DNS |
| DAY 1 | Partial | Brand Assets · HXIQ default | HXIQ default brand assets in `brand/` folder (logo SVG/PNG, mark, app icon, favicon — ~10 files per HXIQ Brand Guidelines §2.3). Agent falls back to a placeholder gradient if missing. Per-project branding is configurable in admin. | Apptology design | No external cost | Ongoing · PNG/app icon optional polish |
| DAY 3 | ✓ Done | SES Sender Domain Verified | Production sender uses verified domain (e.g. app.hxiq.ae) in eu-central-1 with DKIM/SPF/DMARC on Cloudflare. Additional customer domains follow the same verify-ses-domain.sh pattern. | AWS SES | ~$0.10 per 1,000 emails | Day 3 morning hard cutoff |
| DAY 4 | ✓ Done | OpenAI quota / budgets | Set spending limits in OpenAI dashboard; app tracks usage (AiTokenUsage, admin AI usage page). Circuit breaker degrades AI panels when provider fails; charts and exports remain. | Operator | Usage-based | Monitor via OpenAI + in-app stats |
| DAY 5 | ✓ Done | DNS A Record · Cloudflare | app.hxiq.ae A → EC2 EIP (documented in runbook). Additional instance hostnames follow the same pattern. | Operator / DNS (runbook) | Cloudflare free tier | 5 min · TTL 300s · global propagation typically < 30 min |
| DAY 5 | ✓ Done | Operator Admin Email | One email address that becomes the Super Admin user on the live instance. Credentials sent via secure channel (Signal / 1Password / SSE) — never email. Agent generates a temp password; Super Admin forced to change on first login. | Apptology operator | N/A | Day 5 evening (during hand-off) |
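
The "OpenAI quota / budgets" row says a circuit breaker degrades AI panels when the provider fails while charts and exports remain. The following is a minimal sketch of that pattern, not the shipped implementation: the class name `CircuitBreaker`, the helper `withBreaker`, the failure threshold of 3, and the 30-second cooldown are all assumptions made for illustration.

```typescript
// Hypothetical sketch: names and thresholds are illustrative, not taken from the HXIQ codebase.
class CircuitBreaker {
  private failures = 0;
  private openedAt: number | null = null; // null means the breaker is closed
  private readonly failureThreshold: number;
  private readonly cooldownMs: number;

  constructor(failureThreshold = 3, cooldownMs = 30_000) {
    this.failureThreshold = failureThreshold; // assumed value
    this.cooldownMs = cooldownMs; // assumed value
  }

  // True when a provider call may be attempted at time nowMs.
  canRequest(nowMs: number): boolean {
    if (this.openedAt === null) return true; // closed: calls flow normally
    return nowMs - this.openedAt >= this.cooldownMs; // open until cooldown ends, then allow a trial
  }

  // A successful call closes the breaker and clears the failure count.
  recordSuccess(): void {
    this.failures = 0;
    this.openedAt = null;
  }

  // Reaching the threshold opens the breaker; a failed trial call re-opens it.
  recordFailure(nowMs: number): void {
    this.failures += 1;
    if (this.failures >= this.failureThreshold) this.openedAt = nowMs;
  }
}

// Wraps a provider call and returns `fallback` (for example an
// "insights unavailable" panel state) instead of throwing.
async function withBreaker<T>(
  breaker: CircuitBreaker,
  call: () => Promise<T>,
  fallback: T,
): Promise<T> {
  if (!breaker.canRequest(Date.now())) return fallback;
  try {
    const result = await call();
    breaker.recordSuccess();
    return result;
  } catch {
    breaker.recordFailure(Date.now());
    return fallback;
  }
}
```

Because the wrapper returns a fallback value rather than rethrowing, only the AI panel degrades; non-AI code paths never see the provider error.
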
20 production risks documented · all with concrete mitigations
| Risk | Phase | Severity | Mitigation |
|---|---|---|---|
| AWS SES production-access not approved by Day 3 | DAY 3 | Critical | Submit Day 0 morning. AWS approves within ~24h. Hard cutoff: Day 3 morning. If not approved, fall back to Mailpit-only Day 3 demo and escalate to AWS support. |
| OpenAI API outage (full provider down) | DAY 4 | High | Circuit breaker and graceful degrade on AI surfaces; dashboards and non-AI exports continue to work; local dev uses stub mode without a key. |
| Cloudflare DNS propagation slow | DAY 5 | High | TTL 300s. Multi-geo `dig` test from cloud shells. If > 2h propagation, escalate to Cloudflare support. Ansible has "wait for DNS" gate before issuing certbot. |
| Let's Encrypt rate limits during testing | DAY 5 | Medium | Use `--staging` flag during all pre-launch testing. Production cert issued exactly once, with 5x retry + 60s back-off if first issuance fails. |
| SES bounce / spam complaint storm | DAY 3 | High | CloudWatch alarms at 0.05% complaint rate (warn) and 2% bounce rate (warn). Worker reads `email-paused` SSM flag to halt sends within 60s of alarm. |
| Disk full on EC2 t3.large (single host) | DAY 5 | High | Docker log driver capped at 50MB × 5 files per container. Beszel alerts at 75% disk. Export TTL deletes old artifacts. Runbook: `docker system prune -af --volumes`. |
| Worker container OOM during AI / PDF generation | DAY 4 | Medium | Worker `mem_limit: 1.5g` + `mem_reservation: 768m`. OOM-kill auto-restarts; BullMQ retries the job. Beszel alarm if > 3 worker restarts / 24h. |
| PII leak via AI prompts | DAY 4 | Critical | NER + regex redaction before prompts reach OpenAI. Identity fields (name, email) not sent in bulk insight paths; respondent identifiers minimized per implementation. Review worker + assistant routes when changing AI prompts. |
| PIN brute-force | DAY 2 | Medium | Token-bucket rate limit (6 attempts / 15 min) + DB-persisted PinLockout (30 min) + 6-digit forgot-PIN code recovery (15-min TTL, 5-attempt cap, 3 codes / email / hour). |
| Backup restore drift after Day 5 | DAY 5 | Medium | Weekly cron `0 5 * * 0` runs `restore-drill.ts` — spins up temp Postgres, restores latest pg_dump, asserts row count > 0, tears down. Failures raise Sentry warning. |
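
The PIN brute-force mitigation above combines a token bucket (6 attempts / 15 min) with a 30-minute lockout. A minimal sketch of how those two numbers can interact is shown below. It rests on assumptions not stated in the table: the function name `tryPinAttempt` is hypothetical, state lives in an in-memory `Map` as a stand-in for the DB-persisted `PinLockout` record, every attempt (not only failed ones) consumes a token, and exhausting the bucket is what triggers the lockout.

```typescript
// Hypothetical sketch: in-memory stand-in for the DB-persisted lockout described in the table.
const CAPACITY = 6; // 6 attempts ...
const WINDOW_MS = 15 * 60 * 1000; // ... per 15 minutes
const LOCKOUT_MS = 30 * 60 * 1000; // lockout duration (30 min)
const REFILL_PER_MS = CAPACITY / WINDOW_MS; // continuous refill rate

interface Bucket {
  tokens: number; // attempts currently available
  updatedAt: number; // last time tokens were recomputed (ms)
  lockedUntil: number; // 0 when not locked out
}

type AttemptResult = "allowed" | "locked_out";

const buckets = new Map<string, Bucket>();

// key is whatever identifies the respondent (assumption: e.g. an email or slug + IP).
function tryPinAttempt(key: string, nowMs: number): AttemptResult {
  let bucket = buckets.get(key);
  if (bucket === undefined) {
    bucket = { tokens: CAPACITY, updatedAt: nowMs, lockedUntil: 0 };
    buckets.set(key, bucket);
  }

  if (nowMs < bucket.lockedUntil) return "locked_out";

  // Refill in proportion to elapsed time, never above capacity.
  const elapsed = nowMs - bucket.updatedAt;
  bucket.tokens = Math.min(CAPACITY, bucket.tokens + elapsed * REFILL_PER_MS);
  bucket.updatedAt = nowMs;

  if (bucket.tokens < 1) {
    // Assumption: an exhausted bucket starts the 30-minute lockout.
    bucket.lockedUntil = nowMs + LOCKOUT_MS;
    return "locked_out";
  }

  bucket.tokens -= 1;
  return "allowed";
}
```

Passing `nowMs` explicitly keeps the function deterministic, so the 15-minute and 30-minute boundaries can be exercised in unit tests without waiting on a real clock.
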