ADR-0004 — Anti-cheat layer

Status: draft
Date: 2026-05-18
Owner: tech-architect + qa-engineer (anti-cheat metrics)
Related canon: D-007 (restrictive anti-cheat, $10k/year ops), D-009 (Firebase App Check)

Status note (2026-05-18): This ADR describes the production target per D-007/D-009. Phase 8b implementation uses the simplified test-phase scope per ADR-0006 — schema validation + impossible-burst-rate guard (>12 steps/sec) + day-in-future + hard-cap-50_000 only. No App Check, no TZ-jump check, no Layer 3 behavioural patterns, no manual review queue, no three-strikes ban. Mock-step apps may pass — accepted risk for the closed self-test. The full three-layer defense in depth described below is deferred to the production-migration phase, currently paused.

Context

D-007 calls for restrictive anti-cheat: mock-step rejection, GPS sanity, TZ-jump rejection, signed device attestation (Play Integrity + Apple DeviceCheck). The system-reviewer flagged that aggressive anti-cheat can also misfire on legitimate walkers (transit users with weak GPS, time-zone-crossing travelers, accessibility cases). This ADR defines the layer with a quarantine band between accept and deny so the system never silently denies a walker their earned steps.

Decision — defense in depth

Three layers. Layers 1 and 2 are evaluated in order on every /step/ingest call; Layer 3 runs as a nightly per-walker job. Each layer can REJECT, QUARANTINE, or PASS.

Layer 1 — platform attestation (binary)

Firebase App Check token attached to every request. Verified server-side via Firebase Admin SDK.

iOS: DeviceCheck via App Attest API. Apple-signed assertion that the device is genuine and the app binary is unmodified.
Android: Play Integrity API at the MEETS_DEVICE_INTEGRITY level. Google-signed assertion that the device is genuine, Play Store-installed app, no debugger attached.

Verdict:

Token invalid / expired → REJECT (HTTP 401). AttestationLog REJECTED.
Token shows UNEVALUATED integrity (e.g. dev build, sideloaded) → REJECT in production environment; PASS in staging.
Token shows MEETS_DEVICE_INTEGRITY but NOT MEETS_STRONG_INTEGRITY (rooted Android, jailbroken iOS) → QUARANTINE (provisional state held, manual review queue).
Token shows full integrity → PASS to Layer 2.
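
The verdict list above can be sketched as a single decision function. This is a minimal illustration, not the implementation: the function name, the `integrity` labels ("UNEVALUATED", "DEVICE", "STRONG") and the `environment` strings are hypothetical, and the sketch assumes the token has already been verified and decoded upstream. Failing closed for unknown labels and for environments other than staging is also an assumption.

```python
from enum import Enum


class Verdict(Enum):
    PASS = "PASS"
    QUARANTINE = "QUARANTINE"
    REJECT = "REJECT"


def layer1_verdict(token_valid: bool, integrity: str, environment: str) -> Verdict:
    """Map an already-verified attestation result to a Layer 1 verdict.

    `integrity` uses hypothetical labels:
      "UNEVALUATED" - dev build or sideloaded app
      "DEVICE"      - device integrity met, strong integrity not met
      "STRONG"      - full integrity
    """
    if not token_valid:
        # Invalid or expired token: REJECT (HTTP 401 at the API layer).
        return Verdict.REJECT
    if integrity == "UNEVALUATED":
        # PASS in staging only; every other environment is treated like
        # production here (assumption: fail closed).
        return Verdict.PASS if environment == "staging" else Verdict.REJECT
    if integrity == "DEVICE":
        # Provisional state held, manual review queue.
        return Verdict.QUARANTINE
    if integrity == "STRONG":
        return Verdict.PASS
    # Unknown label (assumption: fail closed).
    return Verdict.REJECT
```

Keeping the mapping in one pure function makes the staging exception easy to audit and test in isolation.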

Layer 2 — physiological heuristics (per-request)

Run on the submitted day’s reported count.

Check	Threshold	Verdict on violation
`count <= MAX_STEPS_PER_DAY`	50_000	REJECT
Burst rate: `count / sampleSpan_seconds <= MAX_RATE × 3`	`MAX_RATE = 4 steps/sec` → 12 steps/sec ceiling	REJECT
Gyro absence flag from device	mobile reports `gyroSamplesObserved=true` for ingest	QUARANTINE if false
Timestamp clustering: over 80% of `count` falls in under 5% of `sampleSpan`	density ratio	QUARANTINE
TZ jump >12h within 24h since prior submission	compare to last StepLog.tz	REJECT
day later than tomorrow, or day more than 7 days in the past	day vs server clock	REJECT
Cross-source double-count: another source bundle reported the same time window with overlapping count	reconciliation step	QUARANTINE the duplicate, accept the first-by-bundleId

Burst-rate ceiling rationale: a world-record marathon runner sustains ~3.5 steps/sec for 2 hours, which MAX_RATE rounds up to 4 steps/sec. The 3x tolerance on MAX_RATE (12 steps/sec) gives generous headroom for sprint intervals and accounts for HealthKit/Health Connect bucketing artifacts that can briefly inflate measured rates.
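
A minimal sketch of the stateless Layer 2 checks (daily cap, burst rate, day window, gyro flag). The `Submission` type, the reason codes, and the handling of a zero-length sample span are assumptions made for illustration. Timestamp clustering, TZ jump, and cross-source reconciliation need prior state and are left out.

```python
from dataclasses import dataclass
from datetime import date, timedelta

MAX_STEPS_PER_DAY = 50_000
MAX_RATE = 4                    # steps/sec
BURST_CEILING = MAX_RATE * 3    # 12 steps/sec


@dataclass
class Submission:
    day: date
    count: int
    sample_span_seconds: float
    gyro_samples_observed: bool


def layer2_verdict(sub: Submission, today: date) -> tuple[str, list[str]]:
    """Return (verdict, reason codes). Reason codes are hypothetical."""
    reject: list[str] = []
    quarantine: list[str] = []

    if sub.count > MAX_STEPS_PER_DAY:
        reject.append("DAILY_CAP")

    # Assumption: steps reported over a zero or negative span count as a
    # burst violation, since no finite rate can be computed.
    if sub.count > 0 and (
        sub.sample_span_seconds <= 0
        or sub.count / sub.sample_span_seconds > BURST_CEILING
    ):
        reject.append("BURST_RATE")

    # Accept tomorrow at the latest and 7 days back at the earliest.
    if sub.day > today + timedelta(days=1) or sub.day < today - timedelta(days=7):
        reject.append("DAY_WINDOW")

    if not sub.gyro_samples_observed:
        quarantine.append("NO_GYRO")

    # REJECT takes precedence over QUARANTINE.
    if reject:
        return "REJECT", reject
    if quarantine:
        return "QUARANTINE", quarantine
    return "PASS", []
```

Collecting every reason code instead of returning on the first violation keeps the log entry complete when several checks fail at once.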

Layer 3 — behavioural patterns (per-walker, sliding window)

Run as a Cloud Tasks job triggered nightly per walker.

Pattern	Trigger	Action
5+ Layer 2 REJECTs in 24h	counter on StepLog	flag walker for manual review
Streak holder with sudden 10× daily increase	T2_30D walker submits day 31 with 10× prior 30-day average	QUARANTINE day, flag for review
Same source bundle, same hour, multiple submissions with non-monotonic counts	reconciliation worker	QUARANTINE duplicates
App Check `MEETS_DEVICE_INTEGRITY` ratio drops under 90% over 7 days	per-walker rolling stat	flag for review
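
Two of the patterns above reduce to simple arithmetic over per-walker history. The sketch below is illustrative only: the function names are hypothetical, and treating "10×" as "at least 10 times" and "under 90%" as a strict inequality are assumptions.

```python
from statistics import mean


def is_sudden_spike(prior_daily_counts: list[int], new_count: int,
                    factor: float = 10.0) -> bool:
    """True when new_count is at least `factor` times the prior 30-day average.

    Walkers with fewer than 30 days of history are never flagged
    (assumption: the pattern targets 30-day streak holders only).
    """
    if len(prior_daily_counts) < 30:
        return False
    average = mean(prior_daily_counts[-30:])
    return average > 0 and new_count >= factor * average


def integrity_ratio_low(passing_requests: int, total_requests: int,
                        floor: float = 0.90) -> bool:
    """True when the 7-day share of requests meeting device integrity is under `floor`."""
    if total_requests == 0:
        return False
    return passing_requests / total_requests < floor
```

A walker averaging 5_000 steps/day would trip the spike check at 50_000, which is also the Layer 2 daily cap, so in practice this pattern mostly catches lower-volume walkers.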

GPS sanity — the privacy tradeoff (deferred to phase 11)

D-007 named GPS sanity. After analyzing the privacy posture (D-009 — no profiling, marketable as “privacy-first walking RPG”), GPS sanity is NOT in the closed-beta build.

Reasoning:

GPS would let us correlate claimed step count with actual displacement, which detects “phone strapped to dog/pendulum” mock setups.
But GPS is continuous location data, which is exactly what the privacy-first marketing promises NOT to collect.
The platform attestation (Layer 1) already kills mock-step apps. The remaining mock vector is mechanical (pendulum, dog collar), which is real but low-volume — qa-engineer estimates it would account for under 0.5% of cheat attempts based on PoE-community surveys.

Phase-11 revisit: if telemetry shows mechanical mock-step abuse >2% of attestation passes, mechanics-designer + tech-architect + game-director re-open the question. Possible mitigation: opt-in GPS verification for tournament/leaderboard contexts only, with explicit per-event consent. Never default-on.

Bands — accept / quarantine / deny

Critical UX principle: never silently deny earned steps. Three bands:

Band	Server state	Walker UX
PASS	StepLog ACCEPTED, Energy credited, streak advanced	Normal play
QUARANTINE	StepLog QUARANTINED, provisional Energy granted (badge shown), allocation slots held but flagged “pending validation”	Walker sees “we’re validating your last walk — this may take up to 24h” toast. Allocations made under quarantine survive if review accepts; revert + refund + apology push if review rejects.
REJECT	StepLog REJECTED, no Energy granted, AttestationLog reason logged	Walker sees an explicit error message naming the cause. Example for a failed attestation: “We could not verify your device attestation. Please update WalkRPG or check your device security settings.” The message NEVER says “you cheated”.

Quarantine is the safety net for false positives on transit users, jailbroken hobbyists who aren’t actually cheating, and timezone-crossing travelers.
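
The band table can be held as a small lookup so that server state and walker notification never drift apart. A sketch under stated assumptions: the `BandEffect` type, its field values, and `needs_walker_notice` are hypothetical names, not part of the schema.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BandEffect:
    step_log_status: str   # StepLog state written by the server
    energy: str            # "credited", "provisional", or "none"
    allocations: str       # "normal", "pending_validation", or "none"


BAND_EFFECTS = {
    "PASS": BandEffect("ACCEPTED", "credited", "normal"),
    "QUARANTINE": BandEffect("QUARANTINED", "provisional", "pending_validation"),
    "REJECT": BandEffect("REJECTED", "none", "none"),
}


def needs_walker_notice(band: str) -> bool:
    """Every non-PASS band must surface something to the walker.

    This encodes the 'never silently deny' principle: QUARANTINE shows the
    validation toast, REJECT shows an error naming the cause.
    """
    if band not in BAND_EFFECTS:
        raise ValueError(f"unknown band: {band}")
    return band != "PASS"
```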

Manual review queue

Quarantined walkers and pattern-flagged walkers land in a Cloud Tasks-backed review queue. Phase 11 builds the admin UI; closed-beta-1 uses a Cloud SQL view + manual SQL triage.

Review verdicts:

CLEAR → flip StepLog to ACCEPTED, run reconciliation, restore Energy and streak.
WARN → keep StepLog QUARANTINED, walker stays in heightened-scrutiny pool for 30 days.
STRIKE → flip StepLog to REJECTED, refund allocations, walker stays in scrutiny pool for 90 days.
BAN → terminal action. Walker account disabled, GDPR delete pipeline can still be invoked.

Three STRIKEs within 180 days → automatic BAN.
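
The three-strikes rule can be checked each time a new STRIKE is recorded. A minimal sketch, assuming strike dates are stored per walker and that "within 180 days" means a trailing window ending on the evaluation date; the function and constant names are hypothetical.

```python
from datetime import date, timedelta

STRIKE_WINDOW = timedelta(days=180)
STRIKES_FOR_BAN = 3


def should_auto_ban(strike_dates: list[date], today: date) -> bool:
    """True when at least three STRIKEs fall inside the trailing 180-day window."""
    recent = [
        d for d in strike_dates
        if timedelta(0) <= today - d <= STRIKE_WINDOW
    ]
    return len(recent) >= STRIKES_FOR_BAN
```

Evaluating on the date of the newest strike makes the trailing window equivalent to "any three strikes no more than 180 days apart".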

Operational cost (per D-007 $10k/year band)

Item	Annual cost band
Play Integrity API quota (Band C — 500M req/yr)	~$3_500
Apple DeviceCheck (free tier covers our volume)	$0
Cloud Logging retention (forensic trail, 90 days)	~$1_200
Cloud Armor managed rules (anti-DDoS + bot)	~$1_500
Cloud Tasks (review queue)	~$200
Manual review human-hours (~0.5 FTE-equivalent contractor, phase 11)	~$3_500
Anti-cheat ops total	~$9_900

This aligns with the D-007 forecast. Band B (1k DAU) costs roughly 1/10 of this; at Band A (100 DAU) the technology line is ~$200/year and review is handled by the CEO.
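
The total can be re-derived from the line items, which is useful when a single estimate changes. The figures are copied from the table above; the dictionary keys and variable names are illustrative.

```python
# Annual cost estimates in USD at Band C, copied from the table above.
ANNUAL_COST_USD = {
    "Play Integrity API quota": 3_500,
    "Apple DeviceCheck": 0,
    "Cloud Logging retention": 1_200,
    "Cloud Armor managed rules": 1_500,
    "Cloud Tasks": 200,
    "Manual review human-hours": 3_500,
}

D007_BUDGET_USD = 10_000

total = sum(ANNUAL_COST_USD.values())
headroom = D007_BUDGET_USD - total

print(total, headroom)  # 9900 100
```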

Cloud Logging integration

Every layer writes a structured log entry to a dedicated anti-cheat log sink:

{
  "walkerId": "<uuid>",
  "day": "2026-05-18",
  "layer": 1 | 2 | 3,
  "verdict": "PASS" | "QUARANTINE" | "REJECT",
  "reasons": ["<reason-code>", ...],
  "attestationTokenId": "<short-hash>",
  "ipCountry": "<iso2>",
  "sourceBundleId": "...",
  "ts": "..."
}

Logs are retained for 90 days (forensic trail), then auto-deleted (GDPR posture). Logs DO NOT contain step counts or precise location: only the verdict, the reason codes, and the identifying fields shown above, with ipCountry limited to country level.
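
A sketch of how an entry matching the schema above might be built and aged out. The helper names are hypothetical, and deriving `attestationTokenId` from the first 12 hex characters of a SHA-256 digest is an assumption; the ADR only specifies a "short-hash". The builder takes no step count or coordinates, so those cannot leak into the sink by accident.

```python
import hashlib
import json
from datetime import datetime, timedelta, timezone

RETENTION = timedelta(days=90)
VERDICTS = {"PASS", "QUARANTINE", "REJECT"}


def build_log_entry(walker_id: str, day: str, layer: int, verdict: str,
                    reasons: list[str], attestation_token: str,
                    ip_country: str, source_bundle_id: str,
                    now: datetime) -> dict:
    """Build one structured anti-cheat log entry."""
    if layer not in (1, 2, 3):
        raise ValueError(f"invalid layer: {layer}")
    if verdict not in VERDICTS:
        raise ValueError(f"invalid verdict: {verdict}")
    # Assumption: the short hash is a truncated SHA-256 of the raw token,
    # so the token itself is never written to the log.
    token_id = hashlib.sha256(attestation_token.encode("utf-8")).hexdigest()[:12]
    return {
        "walkerId": walker_id,
        "day": day,
        "layer": layer,
        "verdict": verdict,
        "reasons": list(reasons),
        "attestationTokenId": token_id,
        "ipCountry": ip_country,
        "sourceBundleId": source_bundle_id,
        "ts": now.isoformat(),
    }


def is_past_retention(entry: dict, now: datetime) -> bool:
    """True once the entry is older than the 90-day retention window."""
    return now - datetime.fromisoformat(entry["ts"]) > RETENTION


if __name__ == "__main__":
    written_at = datetime(2026, 5, 18, 12, 0, tzinfo=timezone.utc)
    entry = build_log_entry("walker-1", "2026-05-18", 2, "REJECT",
                            ["BURST_RATE"], "raw-token", "DE", "bundle-1",
                            written_at)
    print(json.dumps(entry, indent=2))
```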

Consequences

The system is restrictive but not punitive. Quarantine gives a 24h safety window for false positives.
GPS sanity is explicitly deferred to preserve the privacy-first marketing line. Phase 11 revisit triggered by metrics.
Manual review queue requires a 0.5 FTE-equivalent at Band C — closed beta uses CEO-as-reviewer, scale up at launch.
Three-strikes ban is enforceable; appeal flow is a phase-11 UX item.
The anti-cheat layer is separable from gameplay: turning it off (e.g. for a hackathon mode) is a config flag flip, not a schema change.