ADR-0006 — Test-phase infrastructure (local + Cloudflare Tunnel)

Status: Proposed
Date: 2026-05-18
Owner: tech-architect
Supersedes: none
Amends scope (for phase 8b only): ADR-0001, ADR-0002, ADR-0003, ADR-0004, ADR-0005
Related canon: D-007, D-008, D-009 (production targets, UNCHANGED)

Context

Phase 8a delivered the production-target spec layer: ADRs 0001-0005, four API contracts, and the Prisma schema. Phase 8b was queued as “implement the spec” — provisioning a GCP project in europe-central2, wiring Cloud SQL, Firebase Auth (Path B), Firebase App Check, and the reconciliation worker.

The cost projection in ADR-0001 was sound for the production target (Band A ~$143/month at 100 DAU). It is, however, disproportionate for the closed self-test we are actually about to run. The realistic test cohort is ~20 self-testers — CEO plus a handful of trusted recruits — driving the first build to validate that the loop holds together end-to-end. Provisioning a full EU-resident GCP estate to host 20 testers is operationally and financially wasteful. It would also front-load decisions (Firebase Auth provider mix, App Check thresholds, anti-cheat tuning) that benefit from real walker data we do not yet have.

CEO ratified a phase split: test-phase implementation today, production migration when we have data to design against.

Decision

Phase 8b ships against a local-only test infrastructure, not the production GCP target.

Component	Test phase (this ADR)	Production target (ADR-0001)
Compute	Local Docker Compose on CEO laptop	Cloud Run + Cloud Run realtime + walkrpg-jobs
Database	Local Postgres 16 container	Cloud SQL Postgres 16 in `europe-central2` (Warsaw)
Auth	Mock JWT (NestJS-issued HS256/RS256)	Firebase Auth Path B (Google + Apple federation)
Attestation	None (no App Check)	Firebase App Check (DeviceCheck + Play Integrity)
External reach	Cloudflare Tunnel (free tier)	Cloud Load Balancer + Cloud Armor + EU region pin
Reconciliation worker	None — synchronous accept on ingest	Cloud Tasks `reconcile-steps` with 60s coalescing
Anti-cheat	Schema validation + impossible-burst-rate only	Three-layer defense in depth (ADR-0004)
Multiplayer / chat / co-op	Not in scope	walkrpg-realtime WebSocket gateway
Op cost	$0/mo (laptop electricity + CEO time)	Band A ~$143/mo, scaling per ADR-0001

The Prisma schema is identical between test and production. Only the deployment, auth, and reconciliation layers differ. This protects the migration path: production migration adds layers, it does not rewrite the data model.

Test scope — what plays does this enable?

The infrastructure must let a tester:

Register. Hit POST /auth/callback with { email, displayName } and receive a mock-signed session JWT. No Firebase project required.
Ingest steps. Submit a per-day step bucket via POST /step/ingest and see streak update synchronously. Provisional state is always false in mock mode (no reconciliation worker exists to flip it).
Take Quest 001 and allocate the keystone. Walk through the starter tree loop — quest grants points, walker spends 4 points on the Plenny starting circle, allocates the Krok Niezachwiany keystone, and sees it appear in tree state.
Read profile + tree state. GET /walker/profile and GET /tree/state return the cold-start payload and the tree topology + allocations.

Explicitly NOT in scope for phase 8b: multiplayer, guild chat, co-op walking sessions, regional events, faction tier rewards, GDPR export endpoint, crafting, watch-native, real-time presence. Those land at their respective production-target phases.

Out of scope — deferred to production migration

Everything below is in the production target (D-007/8/9 and ADRs 0001-0005) and stays there. Phase 8b does not implement any of it.

Firebase App Check (DeviceCheck + Play Integrity)
Firebase Auth (Google / Apple federation, Path B shim)
GCP project provisioning (any region, any service)
EU residency provisioning (Cloud SQL, KMS, Secret Manager, Storage in europe-central2, Warsaw)
Reconciliation worker (reconcile-steps Cloud Tasks pipeline)
Layer 3 behavioural anti-cheat (sliding-window pattern detection)
7-day offline cap enforcement
Manual review queue (Cloud SQL view + admin UI)
30-day delete-on-demand pipeline (gdpr-delete-soft / gdpr-delete-hard Cloud Tasks jobs)
12-month inactivity anonymization sweep
GDPR export bundle generation
Cloud Logging anti-cheat sink (90-day forensic retention)
WebSocket gateway, presence server, chat persistence
ChatMessage 30-day retention
Three-strikes ban policy + appeal flow

Mock auth detail

POST /auth/callback accepts { email, displayName } in mock mode and returns a NestJS-issued session JWT.

// Request body (mock mode — AUTH_MODE=mock)
{
  "email": "[email protected]",
  "displayName": "Wanderer of Plenny"
}

Server flow (mock):

Look up User by email. If missing, create User + Walker + StreakState + FactionRep rows exactly as the production callback does (sections 5.a-5.d of auth-callback.mdx).
Mint session JWT signed by the NestJS-local signing key:
- Algorithm: HS256 (symmetric secret loaded from .env) or RS256 (asymmetric key pair under backend/keys/ — tech-architect picks one at phase 8b implementation, both are acceptable).
- Claims: sub = internal User UUID, walkerId = internal Walker UUID, iss = walkrpg-api-local, aud = walkrpg-mobile-test, exp = now + 7 days (extended from production’s 24h to reduce test-session friction).
Return the same response envelope production returns — { session, walker, isFirstLogin, forcedUpgradeRequired }. The mobile (or curl) client sees no shape difference.
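The minting step above can be sketched with Node's built-in crypto module. This is a minimal illustration of the HS256 option only, not the phase 8b implementation, which may well use a JWT library instead. The function names mintMockSessionJwt and verifyMockSessionJwt, the iat claim, and the injectable clock are assumptions made for the example; the claim values follow the list above.

```typescript
import { createHmac, timingSafeEqual } from "node:crypto";

// Claims as listed in this ADR; iat is an addition for this sketch.
export interface SessionClaims {
  sub: string;      // internal User UUID
  walkerId: string; // internal Walker UUID
  iss: string;
  aud: string;
  iat: number;
  exp: number;
}

const SEVEN_DAYS_SECONDS = 7 * 24 * 60 * 60;

const b64url = (text: string): string =>
  Buffer.from(text, "utf8").toString("base64url");

const hmac = (signingInput: string, secret: string): string =>
  createHmac("sha256", secret).update(signingInput).digest("base64url");

// Mint an HS256 session token with the 7-day test-phase lifetime.
export function mintMockSessionJwt(
  userId: string,
  walkerId: string,
  secret: string,
  nowSeconds: number = Math.floor(Date.now() / 1000),
): string {
  const claims: SessionClaims = {
    sub: userId,
    walkerId,
    iss: "walkrpg-api-local",
    aud: "walkrpg-mobile-test",
    iat: nowSeconds,
    exp: nowSeconds + SEVEN_DAYS_SECONDS,
  };
  const header = b64url(JSON.stringify({ alg: "HS256", typ: "JWT" }));
  const signingInput = `${header}.${b64url(JSON.stringify(claims))}`;
  return `${signingInput}.${hmac(signingInput, secret)}`;
}

// Return the claims if the signature, algorithm, and expiry check out, else null.
export function verifyMockSessionJwt(
  token: string,
  secret: string,
  nowSeconds: number = Math.floor(Date.now() / 1000),
): SessionClaims | null {
  const parts = token.split(".");
  if (parts.length !== 3) return null;
  const [header, payload, signature] = parts as [string, string, string];
  const expected = Buffer.from(hmac(`${header}.${payload}`, secret));
  const actual = Buffer.from(signature);
  if (actual.length !== expected.length || !timingSafeEqual(actual, expected)) {
    return null;
  }
  try {
    const decode = (part: string): unknown =>
      JSON.parse(Buffer.from(part, "base64url").toString("utf8"));
    if ((decode(header) as { alg?: string }).alg !== "HS256") return null;
    const claims = decode(payload) as SessionClaims;
    return claims.exp > nowSeconds ? claims : null;
  } catch {
    return null; // malformed JSON in header or payload
  }
}
```

In this sketch the secret would come from .env, as described for the HS256 option. The RS256 option would replace the HMAC with an asymmetric signature and is not shown.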

ENV flag AUTH_MODE switches behaviour:

`AUTH_MODE`	Active path
`mock` (default in `.env.example`)	Trust the body, no token verification, mint local JWT
`firebase`	Verify `firebaseIdToken` + App Check token via Firebase Admin SDK (production path)

Production swap is a phase-X concern (post cost redesign, post VPS upgrade — see “Migration plan” below). The branch point is a single if (process.env.AUTH_MODE === 'firebase') at the controller boundary.
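As an illustration of that branch point, the sketch below parses AUTH_MODE once and dispatches to one of two handlers. The names resolveAuthMode, dispatchAuthCallback, and CallbackHandlers are hypothetical. Rejecting an unset or unrecognized value, instead of treating everything other than firebase as mock, is an assumption of this example and not something the ADR specifies.

```typescript
export type AuthMode = "mock" | "firebase";

// Parse AUTH_MODE from an environment-like map (e.g. process.env).
// Assumption: unknown or missing values fail loudly at startup.
export function resolveAuthMode(
  env: Record<string, string | undefined>,
): AuthMode {
  const raw = env["AUTH_MODE"];
  if (raw === "mock" || raw === "firebase") return raw;
  throw new Error(`Unsupported AUTH_MODE: ${String(raw)}`);
}

// The two code paths behind POST /auth/callback, injected as functions
// so the controller boundary stays a single branch.
export interface CallbackHandlers<T> {
  mock: () => T;     // trust the body, mint a local JWT
  firebase: () => T; // verify firebaseIdToken + App Check token
}

export function dispatchAuthCallback<T>(
  mode: AuthMode,
  handlers: CallbackHandlers<T>,
): T {
  return mode === "firebase" ? handlers.firebase() : handlers.mock();
}
```

A controller following this sketch would call resolveAuthMode(process.env) once at startup and pass the result to dispatchAuthCallback on each request.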

Light anti-cheat — what stays, what’s deferred

Phase 8b implements only:

Schema validation. Zod-parse the /step/ingest body. Reject 422 on shape violation.
Impossible-burst-rate guard. Require count / sampleSpan_seconds <= 12 steps/sec. Reject 422 on violation.
Day-in-future guard. day may not exceed tomorrow in the claimed tz. Reject 422.
Hard cap. count <= 50_000. Reject 422.

That’s it. Specifically deferred:

No GPS check (was deferred to phase 11 anyway per ADR-0004).
No TZ-jump guard.
No App Check / device attestation.
No cross-source double-counting check (no reconciliation worker exists).
No Layer 3 behavioural patterns.
No 7-day offline cap (the cap is enforced by the production reconciliation worker; in test phase, late submissions are silently accepted).

Accepted risk: mock-step apps (e.g. iOS simulators, Android emulators, manual step injectors) WILL pass these checks. This is an explicit tradeoff for the test phase — testers are trusted, and the data we’re collecting is loop-validation data, not step-economy-balance data. Step-economy balance data requires the production attestation path and is gathered at a later phase.
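The four phase 8b guards can be sketched as a single pure function. To stay dependency-free, this example uses hand-written type checks in place of Zod, so it illustrates the rules and not the planned Zod schema. The body field names (day, tz, count, sampleSpanSeconds), the GuardResult type, and the reason strings are assumptions; the thresholds come from the list above.

```typescript
export interface StepIngestBody {
  day: string;               // YYYY-MM-DD, the claimed local day
  tz: string;                // IANA time zone, e.g. "Europe/Warsaw"
  count: number;             // steps in the bucket
  sampleSpanSeconds: number; // seconds the bucket covers
}

export type GuardResult =
  | { ok: true }
  | { ok: false; status: 422; reason: string };

const MAX_STEPS_PER_SECOND = 12;
const MAX_STEPS_PER_BUCKET = 50_000;

// Calendar day of `instant` in `tz`; throws RangeError for an unknown zone.
function localDay(instant: Date, tz: string): string {
  const parts = new Intl.DateTimeFormat("en-US", {
    timeZone: tz, year: "numeric", month: "2-digit", day: "2-digit",
  }).formatToParts(instant);
  const get = (type: string): string =>
    parts.find((p) => p.type === type)?.value ?? "";
  return `${get("year")}-${get("month")}-${get("day")}`;
}

// Next calendar day, computed in UTC so DST shifts cannot skew it.
function nextDay(isoDay: string): string {
  const d = new Date(`${isoDay}T00:00:00Z`);
  d.setUTCDate(d.getUTCDate() + 1);
  return d.toISOString().slice(0, 10);
}

export function checkStepIngest(body: unknown, now: Date = new Date()): GuardResult {
  const reject = (reason: string): GuardResult => ({ ok: false, status: 422, reason });
  if (typeof body !== "object" || body === null) return reject("schema");
  const { day, tz, count, sampleSpanSeconds } =
    body as Partial<Record<keyof StepIngestBody, unknown>>;
  // 1. Schema validation (stand-in for the Zod parse).
  if (
    typeof day !== "string" || !/^\d{4}-\d{2}-\d{2}$/.test(day) ||
    typeof tz !== "string" ||
    typeof count !== "number" || !Number.isInteger(count) || count < 0 ||
    typeof sampleSpanSeconds !== "number" ||
    !Number.isFinite(sampleSpanSeconds) || sampleSpanSeconds <= 0
  ) {
    return reject("schema");
  }
  // 2. Hard cap.
  if (count > MAX_STEPS_PER_BUCKET) return reject("hard-cap");
  // 3. Impossible-burst-rate guard.
  if (count / sampleSpanSeconds > MAX_STEPS_PER_SECOND) return reject("burst-rate");
  // 4. Day-in-future guard: day may not exceed tomorrow in the claimed tz.
  let today: string;
  try {
    today = localDay(now, tz);
  } catch {
    return reject("schema"); // unknown time zone
  }
  if (day > nextDay(today)) return reject("day-in-future");
  return { ok: true };
}
```

Because the day strings are zero-padded ISO dates, the plain string comparison in the last guard orders them correctly. A controller built on this sketch would map any non-ok result to an HTTP 422 response.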

Cloudflare Tunnel setup

Cloudflare Tunnel exposes the local NestJS service to testers without requiring CEO to manage public ports, dynamic DNS, or a VPS. Free tier covers our volume.

High-level steps (CEO runs once, then cloudflared tunnel run per test session):

Install cloudflared. brew install cloudflared on macOS / package manager on Linux. Reference: Cloudflare Tunnel docs.
Authenticate. cloudflared tunnel login opens a browser; CEO selects the Cloudflare account + hostname (e.g. walkrpg-test.example.com).
Create tunnel. cloudflared tunnel create walkrpg-test. Outputs a tunnel UUID and a credentials JSON.
Configure DNS. cloudflared tunnel route dns walkrpg-test walkrpg-test.example.com — Cloudflare creates the DNS record automatically.

Configure routing. Create ~/.cloudflared/config.yml:

tunnel: <uuid>
credentials-file: /home/<user>/.cloudflared/<uuid>.json
ingress:
  - hostname: walkrpg-test.example.com
    service: http://localhost:3000
  - service: http_status:404

Run during test sessions. cloudflared tunnel run walkrpg-test. Test traffic terminates TLS at Cloudflare’s edge and is then tunnelled over QUIC or HTTP/2 to the local NestJS service.

Operational notes:

Tunnel is only up while CEO laptop is on and cloudflared is running. Asynchronous testing across time zones is unreliable — testers in non-overlapping hours will see connection failures. This is an accepted limitation for the test phase and one of the triggers for the VPS migration.
Cloudflare provides TLS termination and basic DDoS protection at the edge for free. The tunnel handshake is mTLS between cloudflared and Cloudflare’s edge.
No data is stored at Cloudflare’s edge; it acts only as a proxy. EU residency is therefore not violated for ingress/egress traffic, though the production posture (Path B + GCP europe-central2, Warsaw) still applies as the target.

Migration plan

This ADR closes phase 8b. The full architecture path is:

Now — phase 8b. Local + Cloudflare Tunnel + mock auth. Build the four endpoints. Run the closed self-test.
VPS upgrade point. Triggered by ONE of: (a) tester feedback asks for asynchronous testing across time zones, (b) test cohort grows past ~30 walkers, (c) laptop-uptime requirement becomes a blocker for the CEO. At this point, a new ADR (working name ADR-0007) authors the Hetzner CX22 (~€5/month) migration: same Docker Compose stack, same mock auth, but always-on. No Cloudflare Tunnel needed (VPS has a public IP). Roadmap entry “VPS migration” tracks this as a deferred undertaking, not a numbered phase.
Production migration — currently paused indefinitely. Triggered only after CEO greenlights the cost redesign post-VPS-upgrade. At that point ADRs 0001-0005 unfreeze and Path B Firebase residency engages. The production migration is itself decomposed into sub-ADRs (GCP project provisioning, Firebase Auth wiring, App Check rollout, reconciliation worker bring-up, GDPR pipeline activation, EU residency verification, anti-cheat Layer 3, multiplayer infrastructure). Each sub-ADR is authored when its prerequisite signal lands.

Until the production migration is greenlit, the entries in ADRs 0001-0005 remain authoritative for the production target but are not implemented. Phase 8b implementation reads this ADR (ADR-0006) for current scope.

Consequences

$0/month operating cost for the test phase. CEO time is the only cost line.
Same Prisma schema between test and production preserves the migration path with zero data-model rework.
Trust-based step economy in test phase. Mock-step submissions are not detected. Acceptable because testers are trusted and step economy balance is not what we’re testing at phase 8b.
Laptop uptime is the SLA ceiling. Asynchronous testing is fragile. This is the explicit trigger for the VPS migration when it bites.
No App Check, no Firebase, no GCP cost until production migration is greenlit. Cost redesign happens with real walker-density data in hand.
D-007/8/9 production targets are preserved. This ADR amends scope for phase 8b only. The production targets stay in canon, unchanged, deferred not cancelled.
Three phase-8a open questions are PARKED. Email/Password provider mix, GCP timeline, manual review queue staffing — all three were unanswered at the close of phase 8a. The re-scope makes them moot for phase 8b. They re-open at production-migration design time.