ADR-0001 — GCP-native architecture topology

Status: draft
Date: 2026-05-18
Owner: tech-architect
Supersedes: none
Related canon: D-008 (stack), D-009 (residency)

Context

D-008 ratified GCP as the backend platform. D-009 requires all data to be resident in europe-central2 (Warsaw) under a strict GDPR posture. The product profile (D-007) is hardcore and fully social, with real-time co-op and chat and restrictive anti-cheat. This ADR locks the component shape, region pinning, and three scaling bands so that backend-engineer (phase 8b) and ops planning have a deterministic target.

Decision — component topology

Mobile (iOS Swift / Android Kotlin)
|
| HTTPS + WebSocket (mTLS optional)
v
Cloud Load Balancer (global anycast, region pinning policy = EU)
|
+----> Cloud Armor (DDoS + WAF + bot rules)
|
v
Cloud Run service: walkrpg-api (NestJS 11)                europe-central2
| \
| \--> Cloud Run service: walkrpg-realtime (WS gateway)   europe-central2
|      (separate service so REST scales independently of WS connections)
|
v
Cloud SQL for Postgres 16 (db-custom, HA optional)        europe-central2 (Warsaw)
|
v
Memorystore Redis (streak cache, presence, rate limit)    europe-central2
Cloud KMS (session JWT signing key)                       europe-central2
Secret Manager (Firebase Admin creds, DB password)        europe-central2
Cloud Storage (asset bucket, GDPR export bucket)          europe-central2 (CMEK with KMS key)
Cloud Logging + Cloud Monitoring                          log sink configured to EU bucket
Cloud Scheduler (cron triggers)                           europe-central2
Cloud Tasks (deferred reconciliation, GDPR delete)        europe-central2
Firebase (federation broker; see ADR-0005)                pass-through, no PII at rest in US
Firebase App Check (Play Integrity / DeviceCheck)         pass-through verification

All compute and storage that touches user PII or game state lives in europe-central2 (Warsaw). The single exception is the Firebase Auth federation broker (US-based, pass-through, holds only opaque sub claims), which is covered in ADR-0005.

Service breakdown

| Service | Purpose | Why separate |
| --- | --- | --- |
| walkrpg-api | REST + Swagger + auth callbacks + step ingest + tree state + GDPR export | Stateless, autoscales on CPU. Should not be tied to long-lived WS connections. |
| walkrpg-realtime | WebSocket gateway: guild chat, co-op session presence, regional event broadcasts | Long-lived TCP. Different scaling rules (per-connection, not per-request). Targets Cloud Run min-instances=1 for low ping. |
| walkrpg-jobs (lightweight) | Cron + Cloud Tasks consumer: reconciliation passes, streak decay sweep, GDPR delete pipeline, anonymization | Isolated so ad-hoc long jobs don’t block API workers. |

Phase 8b ships walkrpg-api only. walkrpg-realtime and walkrpg-jobs land in phase 11 (closed beta vertical slice).

Region pinning policy

  • Cloud SQL: europe-central2 (Warsaw), multi-zone HA disabled for closed beta, enabled at 1k DAU.
  • Cloud Run services: europe-central2, min-instances tuned per band.
  • Redis (Memorystore): europe-central2, single-region. Replication enabled at 10k DAU band.
  • KMS, Secret Manager, Storage, Logging: all pinned to europe-central2 / eu bucket location.
  • Load Balancer: global anycast IP, but the routing policy restricts forwarding to EU backends, so traffic that enters at a non-EU edge node is still served by the Warsaw service. The resulting 100-300 ms RTT for non-EU hardcore players is accepted (acknowledged in D-009 R3 synthesis tension #2).
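
The pinning rules above could be checked mechanically against a resource inventory before each deploy. The sketch below is illustrative only: the `PinnedResource` shape, the `ALLOWED_LOCATIONS` list, the `findUnpinned` helper, and the sample inventory are all assumptions made for this example and are not part of the ADR or of any GCP API.

```typescript
// Hypothetical inventory entry for this sketch; not a GCP API type.
interface PinnedResource {
  resourceName: string;
  kind: "cloud-sql" | "cloud-run" | "redis" | "kms" | "secret" | "storage" | "logging";
  resourceLocation: string;
}

// Locations accepted by the policy above (region plus the "eu" bucket location).
const ALLOWED_LOCATIONS: readonly string[] = ["europe-central2", "eu"];

// Returns every resource whose location falls outside the allowed list.
function findUnpinned(resources: readonly PinnedResource[]): PinnedResource[] {
  return resources.filter(
    (r) => ALLOWED_LOCATIONS.indexOf(r.resourceLocation.toLowerCase()) === -1
  );
}

// Sample data, invented for illustration.
const inventory: PinnedResource[] = [
  { resourceName: "walkrpg-api", kind: "cloud-run", resourceLocation: "europe-central2" },
  { resourceName: "walkrpg-db", kind: "cloud-sql", resourceLocation: "europe-central2" },
  { resourceName: "walkrpg-logs", kind: "logging", resourceLocation: "eu" },
  { resourceName: "stray-bucket", kind: "storage", resourceLocation: "us-central1" },
];

const violations = findUnpinned(inventory);
console.log(violations.map((r) => r.resourceName + " -> " + r.resourceLocation).join(", "));
// stray-bucket -> us-central1
```

A check like this would most naturally run in CI and fail the build when `violations` is non-empty. Where the inventory comes from (Terraform state, an asset export, or a hand-kept list) is left open here.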

Three-band cost estimate

Estimates are monthly USD, rounded, list price (no committed-use discount). All include Cloud SQL backups and 7-day log retention. Excludes mobile app store fees.

Amendment (2026-05-18) — Band Zero added per ADR-0006. Band Zero is the test-phase only band (~20 self-testers on CEO laptop + Cloudflare Tunnel). It is NOT a production target — none of the GCP components described elsewhere in this ADR are provisioned at Band Zero. Bands A / B / C below remain the production-target scaling horizon and unfreeze when the production migration is greenlit.

Band Zero — test phase (~20 self-testers, local + Cloudflare Tunnel)

| Component | Spec | Cost |
| --- | --- | --- |
| Compute | Local Docker Compose on CEO laptop (NestJS + Postgres) | $0 |
| Database | Local Postgres 16 container, no managed backup | $0 |
| External reach | Cloudflare Tunnel free tier | $0 |
| Auth | Mock JWT signed by NestJS-local key (no Firebase) | $0 |
| Attestation | Disabled (no App Check) | $0 |
| Logging / monitoring | stdout + local file rotation | $0 |
| Band Zero total | | $0/month + CEO time + electricity |

Test scope: register, ingest steps, complete Quest 001, allocate 4 points, unlock keystone, read profile + tree state. Multiplayer / chat / co-op / reconciliation / GDPR pipeline / Layer 3 anti-cheat — all out of scope at Band Zero. See ADR-0006 for the full delta and migration plan to Band A.

Band A — launch (100 DAU)

| Component | Spec | Cost |
| --- | --- | --- |
| Cloud SQL Postgres | db-custom-1-3840 (1 vCPU / 3.75 GiB), 20 GB SSD, daily backup | $55 |
| Cloud Run (walkrpg-api) | min-instances=0, ~5M requests/mo, 1 vCPU / 512 MiB | $15 |
| Memorystore Redis | BASIC 1 GB | $40 |
| Cloud KMS | 1 key, ~50k operations | $1 |
| Secret Manager | 3 secrets, low-frequency access | $1 |
| Cloud Storage | 5 GB assets + GDPR exports | $1 |
| Cloud Logging | 5 GB retention | $3 |
| Cloud Armor | per-policy + per-request | $7 |
| Cloud Load Balancer | global, low traffic | $20 |
| Band A total | | ~$143/month |

Band B — validation (1k DAU)

| Component | Spec | Cost |
| --- | --- | --- |
| Cloud SQL Postgres | db-custom-2-7680 HA enabled, 50 GB SSD | $230 |
| Cloud Run (walkrpg-api) | min-instances=1, ~50M req/mo | $80 |
| Cloud Run (walkrpg-realtime) | min-instances=1, ~1k concurrent WS | $40 |
| Memorystore Redis | STANDARD_HA 2 GB | $120 |
| Cloud KMS | ~500k operations | $3 |
| Secret Manager | unchanged | $1 |
| Cloud Storage | 50 GB | $3 |
| Cloud Logging + Monitoring | 30 GB retention | $15 |
| Cloud Armor | scaled rules | $20 |
| Cloud Load Balancer | scaled | $50 |
| Firebase App Check | included free up to 10M/mo | $0 |
| Band B total | | ~$562/month |

Band C — early growth (10k DAU)

| Component | Spec | Cost |
| --- | --- | --- |
| Cloud SQL Postgres | db-custom-4-15360 HA + read replica, 200 GB SSD | $850 |
| Cloud Run (walkrpg-api) | min-instances=3, ~500M req/mo | $400 |
| Cloud Run (walkrpg-realtime) | min-instances=3, ~10k concurrent WS | $300 |
| Cloud Run (walkrpg-jobs) | min-instances=1 | $30 |
| Memorystore Redis | STANDARD_HA 5 GB | $260 |
| Cloud KMS | ~5M operations | $25 |
| Secret Manager | unchanged | $1 |
| Cloud Storage | 500 GB | $20 |
| Cloud Logging + Monitoring | 100 GB retention | $50 |
| Cloud Armor | hardened rules | $70 |
| Cloud Load Balancer | scaled | $120 |
| Cloud CDN (asset bucket) | 1 TB egress | $80 |
| Anti-cheat ops (Play Integrity / DeviceCheck verification quota) | per D-007 $10k/year band | $830 |
| Band C total | | ~$3,036/month |

Band C aligns with the D-007 anti-cheat ops note ($10k/year, about $830/month) and gives headroom for early closed beta scale. Crossing 10k DAU triggers an evaluation of regional Cloud Spanner as a replacement for db-custom-4, but Spanner is out of scope until post-launch.
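
The band totals can be re-derived from the line items, which is a cheap sanity check whenever a row changes. The sketch below copies the monthly USD figures from the Band A / B / C tables in the same row order. The names `bandA`, `bandB`, `bandC`, and `monthlyTotal` are hypothetical and exist only for this example.

```typescript
// Monthly USD line items, in the same row order as the band tables.
const bandA: readonly number[] = [55, 15, 40, 1, 1, 1, 3, 7, 20];
const bandB: readonly number[] = [230, 80, 40, 120, 3, 1, 3, 15, 20, 50, 0];
const bandC: readonly number[] = [850, 400, 300, 30, 260, 25, 1, 20, 50, 70, 120, 80, 830];

// Sums a band's line items into its monthly total.
function monthlyTotal(items: readonly number[]): number {
  return items.reduce((acc, usd) => acc + usd, 0);
}

// Anti-cheat ops: $10k/year spread over 12 months, before rounding down to $830.
const antiCheatMonthly = 10000 / 12;

console.log(monthlyTotal(bandA)); // 143
console.log(monthlyTotal(bandB)); // 562
console.log(monthlyTotal(bandC)); // 3036
console.log(antiCheatMonthly.toFixed(2)); // 833.33
```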

Scaling triggers + autoscaling defaults

  • Cloud Run autoscaling: target CPU 60%, max-instances 50 (Band A) / 200 (B) / 1000 (C).
  • Cloud SQL: monitor database_cpu_utilization >70% sustained → upgrade tier.
  • Redis: monitor memory_usage >75% → upgrade tier.
  • Postgres connection pool: PgBouncer sidecar deployed at Band B, target pool size 100.
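
The defaults and triggers above can be captured as plain data plus one decision function. This is a minimal sketch under stated assumptions. The ADR does not define "sustained", so the code assumes it means every sample in the observation window exceeds the threshold. The Redis trigger is assumed to fire on the most recent sample alone. The `MetricSample` shape and the `upgradeActions` helper are hypothetical and do not correspond to a Cloud Monitoring API.

```typescript
type Band = "A" | "B" | "C";

// Cloud Run autoscaling defaults from the list above.
const TARGET_CPU_UTILIZATION = 0.6;
const MAX_INSTANCES: Record<Band, number> = { A: 50, B: 200, C: 1000 };

// Upgrade thresholds from the list above, expressed as fractions.
const SQL_CPU_UPGRADE_THRESHOLD = 0.7;
const REDIS_MEMORY_UPGRADE_THRESHOLD = 0.75;

// Hypothetical metric sample; both fields are fractions in the range 0..1.
interface MetricSample {
  sqlCpuUtilization: number;
  redisMemoryUsage: number;
}

// Assumption: "sustained" means every sample in the window is above the threshold.
// Assumption: the Redis trigger looks only at the most recent sample.
function upgradeActions(samples: readonly MetricSample[]): string[] {
  const actions: string[] = [];
  if (samples.length === 0) {
    return actions;
  }
  if (samples.every((s) => s.sqlCpuUtilization > SQL_CPU_UPGRADE_THRESHOLD)) {
    actions.push("upgrade Cloud SQL tier");
  }
  if (samples.slice(-1).some((s) => s.redisMemoryUsage > REDIS_MEMORY_UPGRADE_THRESHOLD)) {
    actions.push("upgrade Redis tier");
  }
  return actions;
}

const recentSamples: MetricSample[] = [
  { sqlCpuUtilization: 0.72, redisMemoryUsage: 0.5 },
  { sqlCpuUtilization: 0.81, redisMemoryUsage: 0.6 },
];

console.log(upgradeActions(recentSamples).join(", ")); // upgrade Cloud SQL tier
console.log(MAX_INSTANCES.B, TARGET_CPU_UTILIZATION); // 200 0.6
```

Keeping the thresholds as named constants means a later ADR amendment can change one value in one place. The window length and sampling interval are deliberately left out because the ADR does not specify them.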

Backup + DR

  • Cloud SQL daily snapshots, 7d retention at Band A / 30d at Band B+.
  • Point-in-time recovery (PITR) enabled at all bands.
  • KMS key import/export disabled. Keys are versioned but not exportable, so a regional outage of europe-central2 means downtime; this is acceptable for the closed-beta SLA. Multi-region KMS is to be evaluated at Band C+.
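
The per-band backup settings above reduce to a small lookup. The sketch below is one possible encoding. The `BackupPolicy` shape and the `backupPolicyFor` helper are hypothetical names for this example and do not mirror a Cloud SQL API object.

```typescript
type ScalingBand = "A" | "B" | "C";

// Hypothetical policy record for this sketch.
interface BackupPolicy {
  snapshotRetentionDays: number;
  pitrEnabled: boolean;
}

// Daily snapshots are kept 7 days at Band A and 30 days at Band B and above.
// PITR is enabled at all bands.
function backupPolicyFor(band: ScalingBand): BackupPolicy {
  return {
    snapshotRetentionDays: band === "A" ? 7 : 30,
    pitrEnabled: true,
  };
}

console.log(backupPolicyFor("A").snapshotRetentionDays); // 7
console.log(backupPolicyFor("C").snapshotRetentionDays); // 30
```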

Open follow-ups

  • WebSocket service: Cloud Run vs GKE Autopilot benchmark needed at Band B before live co-op rolls out. Tracked in phase 11 brief.
  • CDN: enabled only at Band C. Closed beta serves assets directly from walkrpg-api.

Consequences

  • Cost discipline at launch holds the estimated bill under $200/month (Band A: ~$143/month).
  • Pinning all compute to Warsaw means non-EU players route traffic through transit ISPs to the EU. The latency tension is acknowledged.
  • Two-Cloud-Run-service split (api + realtime) is a Path B-friendly shape: scaling REST workers does not churn long-lived WS connections.