ADR-0006 — Test-phase infrastructure (local + Cloudflare Tunnel)
Status: Proposed
Date: 2026-05-18
Owner: tech-architect
Supersedes: none
Amends scope (for phase 8b only): ADR-0001, ADR-0002, ADR-0003, ADR-0004, ADR-0005
Related canon: D-007, D-008, D-009 (production targets, UNCHANGED)
Context
Phase 8a delivered the production-target spec layer: ADRs 0001-0005, four API contracts, and the Prisma schema. Phase 8b was queued as “implement the spec” — provisioning a GCP project in europe-central2, wiring Cloud SQL, Firebase Auth (Path B), Firebase App Check, and the reconciliation worker.
The cost projection in ADR-0001 was sound for the production target (Band A ~$143/month at 100 DAU). It is, however, disproportionate for the closed self-test we are actually about to run. The realistic test cohort is ~20 self-testers — CEO plus a handful of trusted recruits — driving the first build to validate that the loop holds together end-to-end. Provisioning a full EU-resident GCP estate to host 20 testers is operationally and financially wasteful. It would also front-load decisions (Firebase Auth provider mix, App Check thresholds, anti-cheat tuning) that benefit from real walker data we do not yet have.
CEO ratified a phase split: test-phase implementation today, production migration when we have data to design against.
Decision
Phase 8b ships against a local-only test infrastructure, not the production GCP target.
| Component | Test phase (this ADR) | Production target (ADR-0001) |
|---|---|---|
| Compute | Local Docker Compose on CEO laptop | Cloud Run + Cloud Run realtime + walkrpg-jobs |
| Database | Local Postgres 16 container | Cloud SQL Postgres 16 europe-central2-warsaw |
| Auth | Mock JWT (NestJS-issued HS256/RS256) | Firebase Auth Path B (Google + Apple federation) |
| Attestation | None (no App Check) | Firebase App Check (DeviceCheck + Play Integrity) |
| External reach | Cloudflare Tunnel (free tier) | Cloud Load Balancer + Cloud Armor + EU region pin |
| Reconciliation worker | None — synchronous accept on ingest | Cloud Tasks reconcile-steps with 60s coalescing |
| Anti-cheat | Schema validation + impossible-burst-rate only | Three-layer defense in depth (ADR-0004) |
| Multiplayer / chat / co-op | Not in scope | walkrpg-realtime WebSocket gateway |
| Op cost | $0/mo (laptop electricity + CEO time) | Band A ~$143/mo, scaling per ADR-0001 |
The Prisma schema is identical between test and production. Only the deployment, auth, and reconciliation layers differ. This protects the migration path: production migration adds layers, it does not rewrite the data model.
Test scope — what plays does this enable?
The infrastructure must let a tester:
- Register. Hit `POST /auth/callback` with `{ email, displayName }` and receive a mock-signed session JWT. No Firebase project required.
- Ingest steps. Submit a per-day step bucket via `POST /step/ingest` and see the streak update synchronously. Provisional state is always `false` in mock mode (no reconciliation worker exists to flip it).
- Take Quest 001 and allocate the keystone. Walk through the starter tree loop: the quest grants points, the walker spends 4 points on the Plenny starting circle, allocates the `Krok Niezachwiany` keystone, and sees it appear in tree state.
- Read profile + tree state. `GET /walker/profile` and `GET /tree/state` return the cold-start payload and the tree topology + allocations.
Explicitly NOT in scope for phase 8b: multiplayer, guild chat, co-op walking sessions, regional events, faction tier rewards, GDPR export endpoint, crafting, watch-native, real-time presence. Those land at their respective production-target phases.
Out of scope — deferred to production migration
Everything below is in the production target (D-007/8/9 and ADRs 0001-0005) and stays there. Phase 8b does not implement any of it.
- Firebase App Check (DeviceCheck + Play Integrity)
- Firebase Auth (Google / Apple federation, Path B shim)
- GCP project provisioning (any region, any service)
- EU residency provisioning (`europe-central2-warsaw` Cloud SQL, KMS, Secret Manager, Storage)
- Reconciliation worker (`reconcile-steps` Cloud Tasks pipeline)
- Layer 3 behavioural anti-cheat (sliding-window pattern detection)
- 7-day offline cap enforcement
- Manual review queue (Cloud SQL view + admin UI)
- 30-day delete-on-demand pipeline (`gdpr-delete-soft` / `gdpr-delete-hard` Cloud Tasks jobs)
- 12-month inactivity anonymization sweep
- GDPR export bundle generation
- Cloud Logging anti-cheat sink (90-day forensic retention)
- WebSocket gateway, presence server, chat persistence
- ChatMessage 30-day retention
- Three-strikes ban policy + appeal flow
Mock auth detail
POST /auth/callback accepts { email, displayName } in mock mode and returns a NestJS-issued session JWT.
```jsonc
// Request body (mock mode, AUTH_MODE=mock)
{
  "email": "<email>",
  "displayName": "Wanderer of Plenny"
}
```

Server flow (mock):

- Look up `User` by `email`. If missing, create User + Walker + StreakState + FactionRep rows exactly as the production callback does (sections 5.a-5.d of `auth-callback.mdx`).
- Mint a session JWT signed by the NestJS-local signing key:
  - Algorithm: HS256 (symmetric secret loaded from `.env`) or RS256 (asymmetric key pair under `backend/keys/`). The tech-architect picks one at phase 8b implementation; both are acceptable.
  - Claims: `sub` = internal User UUID, `walkerId` = internal Walker UUID, `iss = walkrpg-api-local`, `aud = walkrpg-mobile-test`, `exp = now + 7 days` (extended from production’s 24h to reduce test-session friction).
- Return the same response envelope production returns: `{ session, walker, isFirstLogin, forcedUpgradeRequired }`. The mobile (or curl) client sees no shape difference.
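The mint step in the server flow can be sketched with Node's built-in `crypto` module. This is a minimal, dependency-free illustration of the HS256 option only, not the project's implementation. The function name `mintMockSessionJwt`, its parameters, and the decision to hand-roll the encoding are assumptions made for illustration; only the claim names and values come from this ADR.

```typescript
import { createHmac } from "node:crypto";

// Claim names and fixed values come from this ADR; the interface name is illustrative.
interface MockSessionClaims {
  sub: string; // internal User UUID
  walkerId: string; // internal Walker UUID
  iss: "walkrpg-api-local";
  aud: "walkrpg-mobile-test";
  exp: number; // expiry, in seconds since the Unix epoch
}

const SEVEN_DAYS_SECONDS = 7 * 24 * 60 * 60;

// JWT segments are base64url-encoded without padding.
const base64url = (text: string): string =>
  Buffer.from(text, "utf8").toString("base64url");

// Hypothetical helper: signs an HS256 session JWT with the symmetric secret from .env.
export function mintMockSessionJwt(
  userId: string,
  walkerId: string,
  secret: string,
  nowSeconds: number = Math.floor(Date.now() / 1000),
): string {
  const header = { alg: "HS256", typ: "JWT" };
  const claims: MockSessionClaims = {
    sub: userId,
    walkerId,
    iss: "walkrpg-api-local",
    aud: "walkrpg-mobile-test",
    exp: nowSeconds + SEVEN_DAYS_SECONDS,
  };
  const signingInput =
    `${base64url(JSON.stringify(header))}.${base64url(JSON.stringify(claims))}`;
  // HS256 = HMAC-SHA256 over "<header>.<payload>", itself base64url-encoded.
  const signature = createHmac("sha256", secret)
    .update(signingInput)
    .digest("base64url");
  return `${signingInput}.${signature}`;
}
```

A real NestJS service would more likely delegate signing and verification to a JWT library; the hand-rolled version is used here only to keep the sketch self-contained.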
ENV flag AUTH_MODE switches behaviour:
| `AUTH_MODE` | Active path |
|---|---|
| `mock` (default in `.env.example`) | Trust the body, no token verification, mint local JWT |
| `firebase` | Verify `firebaseIdToken` + App Check token via Firebase Admin SDK (production path) |
Production swap is a phase-X concern (post cost redesign, post VPS upgrade — see “Migration plan” below). The branch point is a single if (process.env.AUTH_MODE === 'firebase') at the controller boundary.
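One way the single branch point could look is sketched below, assuming the controller hands the parsed mode and request body to a small resolver. The names `parseAuthMode`, `resolveIdentity`, `AuthCallbackBody`, `appCheckToken`, and the `FirebaseVerifier` signature are hypothetical; the production verifier would be backed by the Firebase Admin SDK and is injected here so the sketch stays self-contained.

```typescript
type AuthMode = "mock" | "firebase";

// Hypothetical body shape: mock mode reads email/displayName, firebase mode reads tokens.
interface AuthCallbackBody {
  email?: string;
  displayName?: string;
  firebaseIdToken?: string;
  appCheckToken?: string;
}

interface Identity {
  email: string;
  displayName: string;
}

// Hypothetical verifier signature standing in for the Firebase Admin SDK calls.
type FirebaseVerifier = (idToken: string, appCheckToken: string) => Promise<Identity>;

// Assumption: unset or unknown values are rejected rather than silently treated as "mock",
// so a misconfigured deployment cannot fall into the trust-the-body path by accident.
export function parseAuthMode(raw: string | undefined): AuthMode {
  if (raw === "mock" || raw === "firebase") return raw;
  throw new Error(`Unsupported AUTH_MODE: ${String(raw)}`);
}

export async function resolveIdentity(
  mode: AuthMode,
  body: AuthCallbackBody,
  verifyFirebase: FirebaseVerifier,
): Promise<Identity> {
  if (mode === "firebase") {
    // Production path: both tokens must be present and are verified remotely.
    if (!body.firebaseIdToken || !body.appCheckToken) {
      throw new Error("firebaseIdToken and App Check token are required");
    }
    return verifyFirebase(body.firebaseIdToken, body.appCheckToken);
  }
  // Mock path: trust the body, no token verification.
  if (!body.email || !body.displayName) {
    throw new Error("email and displayName are required in mock mode");
  }
  return { email: body.email, displayName: body.displayName };
}
```

A caller would pass `parseAuthMode(process.env.AUTH_MODE)` once at startup, which keeps everything downstream of the resolver identical in both modes.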
Light anti-cheat — what stays, what’s deferred
Phase 8b implements only:
- Schema validation. Zod-parse the `/step/ingest` body. Reject 422 on shape violation.
- Impossible-burst-rate guard. `count / sampleSpan_seconds <= 12 steps/sec`. Reject 422 on violation.
- Day-in-future guard. `day` may not exceed tomorrow in the claimed tz. Reject 422.
- Hard cap. `count <= 50_000`. Reject 422.
That’s it. Specifically deferred:
- No GPS check (was deferred to phase 11 anyway per ADR-0004).
- No TZ-jump guard.
- No App Check / device attestation.
- No cross-source double-counting check (no reconciliation worker exists).
- No Layer 3 behavioural patterns.
- No 7-day offline cap (the cap is enforced by the production reconciliation worker; in test phase, late submissions are silently accepted).
Accepted risk: mock-step apps (e.g. iOS simulators, Android emulators, manual step injectors) WILL pass these checks. This is an explicit tradeoff for the test phase — testers are trusted, and the data we’re collecting is loop-validation data, not step-economy-balance data. Step-economy balance data requires the production attestation path and is gathered at a later phase.
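The four phase 8b guards can be sketched as a single pure function. This is an illustration under stated assumptions rather than the contract itself: the body shape (`day`, `tz`, `count`, `sampleSpanSeconds`), the function name `checkStepIngest`, and the result type are hypothetical, and the hand-written shape check stands in for the Zod parse so the example needs no dependencies.

```typescript
// Hypothetical body shape; the real fields are defined by the /step/ingest contract.
interface StepIngestBody {
  day: string; // calendar day, YYYY-MM-DD, in the claimed time zone
  tz: string; // IANA zone name such as "Europe/Warsaw"
  count: number; // steps in the bucket
  sampleSpanSeconds: number; // seconds covered by the bucket's samples
}

type GuardResult = { ok: true } | { ok: false; status: 422; reason: string };

const MAX_STEPS_PER_SECOND = 12;
const MAX_STEPS_PER_BUCKET = 50_000;

const reject = (reason: string): GuardResult => ({ ok: false, status: 422, reason });

// Calendar day (YYYY-MM-DD) that `now` falls on in the given IANA zone.
// Intl.DateTimeFormat throws RangeError for an unknown zone name.
function localDay(now: Date, tz: string): string {
  const parts = new Intl.DateTimeFormat("en-US", {
    timeZone: tz,
    year: "numeric",
    month: "2-digit",
    day: "2-digit",
  }).formatToParts(now);
  const get = (type: string): string => parts.find((p) => p.type === type)?.value ?? "";
  return `${get("year")}-${get("month")}-${get("day")}`;
}

// The calendar day after `day`, computed in UTC so DST transitions cannot shift it.
function nextDay(day: string): string {
  const [y, m, d] = day.split("-").map(Number);
  return new Date(Date.UTC(y, m - 1, d + 1)).toISOString().slice(0, 10);
}

// Tomorrow's date in the claimed zone, or null if the zone name is invalid.
function tomorrowIn(now: Date, tz: string): string | null {
  try {
    return nextDay(localDay(now, tz));
  } catch {
    return null;
  }
}

export function checkStepIngest(body: StepIngestBody, now: Date = new Date()): GuardResult {
  // 1. Shape check (stand-in for the Zod parse).
  if (!/^\d{4}-\d{2}-\d{2}$/.test(body.day)) return reject("day must be YYYY-MM-DD");
  if (!Number.isInteger(body.count) || body.count < 0) return reject("count must be a non-negative integer");
  if (!Number.isFinite(body.sampleSpanSeconds) || body.sampleSpanSeconds <= 0) {
    return reject("sampleSpanSeconds must be positive");
  }
  // 2. Impossible-burst-rate guard.
  if (body.count / body.sampleSpanSeconds > MAX_STEPS_PER_SECOND) {
    return reject("burst rate exceeds 12 steps/sec");
  }
  // 3. Day-in-future guard: zero-padded ISO dates compare correctly as strings.
  const tomorrow = tomorrowIn(now, body.tz);
  if (tomorrow === null) return reject("unknown time zone");
  if (body.day > tomorrow) return reject("day is later than tomorrow in the claimed tz");
  // 4. Hard cap.
  if (body.count > MAX_STEPS_PER_BUCKET) return reject("count exceeds 50,000");
  return { ok: true };
}
```

Taking `now` as a parameter keeps the day-in-future guard deterministic in tests. The regex only checks the date's format, so a stricter implementation would also confirm that `day` is a real calendar date.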
Cloudflare Tunnel setup
Cloudflare Tunnel exposes the local NestJS service to testers without requiring CEO to manage public ports, dynamic DNS, or a VPS. Free tier covers our volume.
High-level steps (CEO runs once, then cloudflared tunnel run per test session):
- Install cloudflared. `brew install cloudflared` on macOS / package manager on Linux. Reference: Cloudflare Tunnel docs.
- Authenticate. `cloudflared tunnel login` opens a browser; CEO selects the Cloudflare account + hostname (e.g. `walkrpg-test.example.com`).
- Create tunnel. `cloudflared tunnel create walkrpg-test`. Outputs a tunnel UUID and a credentials JSON.
- Configure DNS. `cloudflared tunnel route dns walkrpg-test walkrpg-test.example.com`. Cloudflare creates the DNS record automatically.
- Configure routing. Create `~/.cloudflared/config.yml`:

  ```yaml
  tunnel: <uuid>
  credentials-file: /home/<user>/.cloudflared/<uuid>.json
  ingress:
    - hostname: walkrpg-test.example.com
      service: http://localhost:3000
    - service: http_status:404
  ```

- Run during test sessions. `cloudflared tunnel run walkrpg-test`. Test traffic terminates at Cloudflare’s edge and tunnels over QUIC/HTTP2 to the local NestJS service.
Operational notes:
- Tunnel is only up while CEO laptop is on and `cloudflared` is running. Asynchronous testing across time zones is unreliable: testers in non-overlapping hours will see connection failures. This is an accepted limitation for the test phase and one of the triggers for the VPS migration.
- Cloudflare provides TLS termination and basic DDoS protection at the edge for free. The connection from `cloudflared` to Cloudflare’s edge is outbound-only, TLS-encrypted, and authenticated with the tunnel credentials.
- No data is stored at Cloudflare’s edge; it acts as a proxy. EU residency is therefore not violated for ingress/egress traffic, though the production posture (Path B + GCP `europe-central2-warsaw`) still applies as the target.
Migration plan
This ADR closes phase 8b. The full architecture path is:
- Now: phase 8b. Local + Cloudflare Tunnel + mock auth. Build the four endpoints. Run the closed self-test.
- VPS upgrade point. Triggered by ONE of: (a) tester feedback asks for asynchronous testing across time zones, (b) test cohort grows past 30 walkers, (c) laptop-uptime requirement becomes a blocker for the CEO. At this point, a new ADR (working name ADR-0007) specifies the Hetzner CX22 (€5/month) migration: same Docker Compose stack, same mock auth, but always-on. No Cloudflare Tunnel needed (VPS has a public IP). Roadmap entry “VPS migration” tracks this as a deferred undertaking, not a numbered phase.
- Production migration: currently paused indefinitely. Triggered only after CEO greenlights the cost redesign post-VPS-upgrade. At that point ADRs 0001-0005 unfreeze and Path B Firebase residency engages. The production migration is itself decomposed into sub-ADRs (GCP project provisioning, Firebase Auth wiring, App Check rollout, reconciliation worker bring-up, GDPR pipeline activation, EU residency verification, anti-cheat Layer 3, multiplayer infrastructure). Each sub-ADR is authored when its prerequisite signal lands.
Until the production migration is greenlit, the entries in ADRs 0001-0005 remain authoritative for the production target but are not implemented. Phase 8b implementation reads this ADR (ADR-0006) for current scope.
Consequences
- $0/month operating cost for the test phase. CEO time is the only cost line.
- Same Prisma schema between test and production preserves the migration path with zero data-model rework.
- Trust-based step economy in test phase. Mock-step submissions are not detected. Acceptable because testers are trusted and step economy balance is not what we’re testing at phase 8b.
- Laptop uptime is the SLA ceiling. Asynchronous testing is fragile. This is the explicit trigger for the VPS migration when it bites.
- No App Check, no Firebase, no GCP cost until production migration is greenlit. Cost redesign happens with real walker-density data in hand.
- D-007/8/9 production targets are preserved. This ADR amends scope for phase 8b only. The production targets stay in canon, unchanged, deferred not cancelled.
- Three phase-8a open questions are PARKED. Email/Password provider mix, GCP timeline, manual review queue staffing — all three were unanswered at the close of phase 8a. The re-scope makes them moot for phase 8b. They re-open at production-migration design time.