whatsapp-inbox-platform/production-readiness-checklist.md
Wira Basalamah adde003fba
chore: initial project import
2026-04-21 09:29:29 +07:00

# Production Readiness Checklist - WhatsApp Inbox
## Pre-Deployment
- [ ] All migrations have been applied (`npm run db:deploy`).
- [ ] Baseline seed data is ready (`npm run db:seed`), if the first environment actually needs it.
- [ ] Environment variables are complete:
  - `DATABASE_URL`
  - `AUTH_SECRET`
  - `APP_URL` (or `OPS_BASE_URL`)
  - `WHATSAPP_WEBHOOK_VERIFY_TOKEN` + `WHATSAPP_WEBHOOK_SECRET`
  - `CAMPAIGN_RETRY_JOB_TOKEN`
  - optional production hardening:
    - `HEALTHCHECK_TOKEN`
    - `WEBHOOK_FAILURE_RATE_THRESHOLD_PER_HOUR`
    - `RETRY_WORKER_STALE_MINUTES`
    - `WHATSAPP_WEBHOOK_RATE_LIMIT_GET`
    - `WHATSAPP_WEBHOOK_RATE_LIMIT_POST`
    - `WHATSAPP_WEBHOOK_RATE_LIMIT_WINDOW_MS`
    - `AUTH_TOKEN_CONSUMED_RETENTION_HOURS`
    - `CAMPAIGN_RETRY_STALE_LOCK_MINUTES`
    - `WEBHOOK_EVENT_RETENTION_DAYS`
    - `AUDIT_LOG_RETENTION_DAYS`
- [ ] Run `npm run typecheck`.
- [ ] Run `npm run build`.
- [ ] Run `npm run ops:smoke` before deploying and again after restarting the service.
- [ ] Review `.env` and confirm it contains no placeholders (for example `change-me`, `your-*`).
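The environment and placeholder items above can be checked mechanically before a deploy. The following is a minimal sketch, not the project's actual tooling: the variable names come from the checklist, while `checkEnv`, `REQUIRED`, and the placeholder pattern are illustrative assumptions.

```typescript
// Sketch of a pre-deploy environment check. Variable names come from the
// checklist; function names and the placeholder pattern are assumptions.
type Env = Record<string, string | undefined>;

// Each entry is a group of alternatives: at least one name must be set.
const REQUIRED: string[][] = [
  ["DATABASE_URL"],
  ["AUTH_SECRET"],
  ["APP_URL", "OPS_BASE_URL"],
  ["WHATSAPP_WEBHOOK_VERIFY_TOKEN"],
  ["WHATSAPP_WEBHOOK_SECRET"],
  ["CAMPAIGN_RETRY_JOB_TOKEN"],
];

// Assumed placeholder shapes: the literal "change-me" or anything starting with "your-".
const PLACEHOLDER = /^(change-me|your-.*)$/i;

function isSet(env: Env, name: string): boolean {
  const value = env[name];
  return value !== undefined && value.trim() !== "";
}

// Returns human-readable problems; an empty array means the check passed.
function checkEnv(env: Env): string[] {
  const problems: string[] = [];
  for (const group of REQUIRED) {
    if (!group.some((name) => isSet(env, name))) {
      problems.push("missing: " + group.join(" or "));
    }
  }
  for (const name of Object.keys(env)) {
    const value = env[name];
    if (value !== undefined && PLACEHOLDER.test(value.trim())) {
      problems.push("placeholder value: " + name);
    }
  }
  return problems;
}
```

In a Node script this could be called as `checkEnv(process.env)`, exiting non-zero when the returned array is not empty.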
## Runtime Readiness (Post-Deploy)
- [ ] Start the app (`npm start` or the platform runner).
- [ ] Start the campaign retry worker:
  - run it as a one-shot cron job or as a daemon (`npm run job:campaign-retry:daemon`)
  - confirm the job token and URL are correct
  - run `npm run ops:maintenance` on a broader schedule (daily/weekly) for cleanup
- [ ] Run ops readiness: `npm run ops:readiness`.
- [ ] Check endpoint health: `GET /api/health`.
- [ ] Verify webhook endpoint reachable: `POST /api/webhooks/whatsapp`.
- [ ] Verify campaign retry endpoint state:
  - `GET /api/jobs/campaign-retry?token=<CAMPAIGN_RETRY_JOB_TOKEN>`
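The three endpoint checks above share one base URL, so a smoke script can derive them in one place. This is a sketch only: the paths and the `token` query parameter come from the checklist, while `readinessTargets`, `joinUrl`, and `CheckTarget` are hypothetical names.

```typescript
// Sketch: derive the post-deploy check targets from one base URL.
// Paths and the `token` parameter come from the checklist; names are illustrative.
interface CheckTarget {
  method: "GET" | "POST";
  url: string;
}

function joinUrl(baseUrl: string, path: string): string {
  // Strip trailing slashes from the base so there is exactly one separator.
  return baseUrl.replace(/\/+$/, "") + path;
}

function readinessTargets(baseUrl: string, retryJobToken: string): CheckTarget[] {
  // encodeURIComponent keeps tokens with characters such as "&" or spaces intact.
  const retryQuery = "?token=" + encodeURIComponent(retryJobToken);
  return [
    { method: "GET", url: joinUrl(baseUrl, "/api/health") },
    { method: "POST", url: joinUrl(baseUrl, "/api/webhooks/whatsapp") },
    { method: "GET", url: joinUrl(baseUrl, "/api/jobs/campaign-retry") + retryQuery },
  ];
}
```

The base URL would typically be the value of `APP_URL` (or `OPS_BASE_URL`), and the token the value of `CAMPAIGN_RETRY_JOB_TOKEN`.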
## Observability
- [ ] Make sure the alert webhook at `CAMPAIGN_RETRY_ALERT_WEBHOOK_URL` is active.
- [ ] Monitor the dashboards:
  - `super-admin/alerts` for channel disconnects and webhook retry warnings.
  - `super-admin/webhook-logs` for failed events.
- [ ] Schedule recurring checks:
  - `/api/health` every 1 minute.
  - one campaign retry run at a regular interval (cron/daemon), in line with the SLA.
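A one-minute health probe needs a rule for turning responses into alerts. The sketch below assumes the endpoint returns JSON with a string `status` field whose values include `ok` and `down`; that response shape, the consecutive-failure rule, and the names `classifyHealth` and `shouldAlert` are assumptions rather than documented behavior.

```typescript
type HealthState = "ok" | "down" | "unknown";

// Assumption: /api/health answers with JSON such as {"status":"ok"}.
function classifyHealth(httpStatus: number, bodyText: string): HealthState {
  let status: unknown;
  try {
    const parsed: unknown = JSON.parse(bodyText);
    if (typeof parsed === "object" && parsed !== null) {
      status = (parsed as { status?: unknown }).status;
    }
  } catch {
    return "unknown"; // body was not valid JSON
  }
  if (status === "ok" && httpStatus >= 200 && httpStatus < 300) return "ok";
  if (status === "down") return "down";
  return "unknown";
}

// Alert only after `threshold` consecutive non-ok samples, so that a single
// failed probe does not page anyone.
function shouldAlert(samples: HealthState[], threshold: number): boolean {
  if (threshold <= 0 || samples.length < threshold) return false;
  return samples.slice(-threshold).every((s) => s !== "ok");
}
```

With a one-minute probe interval, a threshold of 3 would mean an alert fires after roughly three minutes of continuous failure; the right value depends on the SLA.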
## Incident & Recovery
- [ ] Fallback SOP if `/api/health` reports status `down`:
  - check `prisma.backgroundJobState` (`campaign-retry-worker`)
  - check `prisma.webhookEvent` for records with status `failed`
  - check channel status in `super-admin/channels`
  - check the connection to the WhatsApp provider
- [ ] Rollback SOP:
  - `git checkout` (or release) the previous target
  - restart the application
  - verify `/api/health` returns to `ok`
  - run `npm run ops:readiness`
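When inspecting the `campaign-retry-worker` record during the fallback SOP, the usual question is whether the worker has run recently enough. The sketch below shows one way to decide that. It assumes a hypothetical `lastRunAt` timestamp on the record and assumes `RETRY_WORKER_STALE_MINUTES` is the allowed age in minutes; the checklist does not show the actual fields of `prisma.backgroundJobState`.

```typescript
// Sketch: decide whether the retry worker's last run is stale.
// `lastRunAt` is a hypothetical field name, not confirmed by the checklist.
function isWorkerStale(lastRunAt: Date | null, now: Date, staleMinutes: number): boolean {
  if (lastRunAt === null) return true; // never ran: treat as stale
  const ageMs = now.getTime() - lastRunAt.getTime();
  return ageMs > staleMinutes * 60 * 1000;
}
```

A stale result points at the cron job or daemon itself (stopped, wrong token, wrong URL) rather than at the WhatsApp provider.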