Deployment Runbook
Overview
Agentix deploys as two services:

| Service | Platform | Trigger | Build |
|---|---|---|---|
| Web (Next.js frontend) | Vercel | Push to main | next build |
| API (Express + BullMQ workers) | Railway | Push to main | Docker build from Dockerfile |

Both services deploy on push to `main`. This runbook covers the full deployment lifecycle: pre-deploy checks, deployment process, database migrations, rollback procedures, secret rotation, and post-deploy verification.
Related runbooks:
- Staging Environment Setup — set up and verify staging before production deploy
- Database Backup & Restore — backup procedures before migrations
- Redis Persistence — verify Redis state after deploys
1. Pre-Deploy Checklist
Before deploying to production, verify each item:

- CI passes on `main`: Check GitHub Actions. Lint, type-check, test, and build must all pass.
- If database schema changed: Migration has been tested on staging first (see Staging Runbook, Section 4).
- If new environment variables added: Variables are set in both Vercel (for web) and Railway (for API) dashboards before deploy.
- Staging verification passed: The change has been deployed and tested on staging (see Staging Runbook, Section 5).
- For breaking API changes: Coordinate deploy order. Deploy the API first in both cases: when the API change breaks the current web build, and when the web depends on a new API endpoint.
- Database backup taken (if migration involved): Run a manual backup before applying schema changes (see Database Backup Runbook, Section 2).
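
The CI item can also be checked from a terminal. The sketch below assumes the GitHub CLI (`gh`) is installed and authenticated for this repository, and that the most recent workflow run on `main` is the CI run of interest. The function names are illustrative and not part of the project.

```shell
# Print the conclusion of the most recent GitHub Actions run on main.
# Assumes `gh` is installed and authenticated for this repository.
latest_ci_conclusion() {
  gh run list --branch main --limit 1 --json conclusion --jq '.[0].conclusion'
}

# Succeed only when the given conclusion is exactly "success".
ci_is_green() {
  [ "$1" = "success" ]
}

# Example gate (commented out so sourcing this file has no side effects):
# ci_is_green "$(latest_ci_conclusion)" || { echo "CI is not green on main" >&2; exit 1; }
```

If the repository runs several workflows, the latest run may not be the CI workflow, so the GitHub Actions page remains the authoritative check.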
2. Vercel Deployment (Frontend)
Auto-Deploy (Default)
Pushing to `main` triggers a Vercel build and deployment automatically.

- Build process: Vercel handles this automatically based on the project's root directory configuration.
- Production URL: https://app.agentix.app (or the configured custom domain).
- Build logs: Vercel Dashboard > Project > Deployments tab > click the latest deployment to view build output.
- Build failure behavior: If the build fails, Vercel keeps the previous deployment active. There is no downtime. Fix the build error and push again.
Manual Deploy
If auto-deploy is disabled or you need to deploy without pushing, deploy from the Vercel CLI. The CLI must be authenticated first (`vercel login`).
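
A minimal sketch of a manual production deploy. It assumes the Vercel CLI is installed, authenticated, and linked to the project (`vercel link`). The wrapper function name is hypothetical.

```shell
# Deploy the current checkout to production with the Vercel CLI.
# Assumes the CLI is authenticated (`vercel login`) and the working
# directory is linked to the Vercel project (`vercel link`).
deploy_web_production() {
  vercel --prod
}
```

Run `deploy_web_production` from the linked checkout. The Vercel project settings (root directory `apps/web`) still determine what gets built.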
Build Configuration
The web app builds using the Next.js build pipeline:

| Setting | Value |
|---|---|
| Framework | Next.js |
| Root directory | apps/web |
| Build command | cd ../.. && npm run build (Turborepo) |
| Output directory | .next |
| Node.js version | 22 |
3. Railway Deployment (Backend)
Auto-Deploy (Default)
Pushing to `main` triggers a Railway build from the Dockerfile at the repository root.

- Build process: Railway builds the image from the Dockerfile (see Build Configuration below).
- Health check: Railway pings `/health` after deployment (configured in `railway.toml`). The deployment is only promoted to active if the health check passes. If it fails, Railway keeps the previous deployment running.
- Build logs: Railway Dashboard > Project > API service > Deployments tab.
- Build failure behavior: Railway does not promote failed deployments. The previous healthy deployment continues serving traffic.
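
Railway performs the health check itself, but the same probe is useful from a terminal while watching a deploy. The sketch below polls a health URL until it answers with a 2xx response. The function name, retry count, and delay are illustrative choices, and the API base URL is not stated in this runbook, so it must be supplied by the caller.

```shell
# Poll a health endpoint until it returns a 2xx response or attempts run out.
# Usage: wait_for_health URL [ATTEMPTS] [DELAY_SECONDS]
wait_for_health() {
  url=$1
  attempts=${2:-10}
  delay=${3:-5}
  i=1
  while [ "$i" -le "$attempts" ]; do
    # -f makes curl exit non-zero on HTTP errors (4xx/5xx).
    if curl -fsS --max-time 5 -o /dev/null "$url"; then
      echo "healthy after $i attempt(s)"
      return 0
    fi
    i=$((i + 1))
    # Sleep only if another attempt will follow.
    [ "$i" -le "$attempts" ] && sleep "$delay"
  done
  echo "health check failed after $attempts attempt(s)" >&2
  return 1
}
```

Example: `wait_for_health "$API_URL/health" 12 5`, where `API_URL` is a placeholder for the production API base URL.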
Manual Deploy
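
A minimal sketch of a manual deploy from the Railway CLI. It assumes the CLI is installed, authenticated (`railway login`), and linked to the project (`railway link`). The service name is a placeholder to be replaced with the API service's name in the Railway project.

```shell
# Deploy the current checkout to Railway from the CLI.
# SERVICE is the name of the API service as shown in the Railway project.
deploy_api() {
  service=${1:?usage: deploy_api SERVICE}
  railway up --service "$service"
}
```

Railway still applies the health check before promoting the new deployment.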
Build Configuration
| Setting | Value |
|---|---|
| Builder | Dockerfile |
| Dockerfile path | Dockerfile (repo root) |
| Health check | GET /health |
| Restart policy | On failure, max 3 retries |
| Exposed port | 3001 |
4. Database Migrations
When to Run
Run migrations when any PR adds or modifies files in `apps/api/prisma/migrations/`. To check, inspect the PR's changed files for paths under that directory.
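
One way to run this check from a terminal is to filter the list of changed files. A small sketch follows; the function name is illustrative, and the two Git refs to compare depend on how releases are tracked.

```shell
# Succeed if any path read from stdin is under the Prisma migrations directory.
touches_migrations() {
  grep -q '^apps/api/prisma/migrations/'
}

# Example (commented out; choose the refs that bracket the deploy):
# git diff --name-only origin/main~1 origin/main | touches_migrations \
#   && echo "Migration required"
```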
Migration Procedure
Step 1: Take a backup. Before any migration, create a manual backup (see Database Backup Runbook, Section 2). After the migration has been applied, Prisma's status output should read: `Database schema is up to date!`
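
The overall sequence can be sketched as a single function. This is an illustration under stated assumptions, not the project's script: it assumes a PostgreSQL database, `pg_dump` on the PATH, `DATABASE_URL` pointing at production, and the Prisma schema living under `apps/api`. The Database Backup Runbook remains the authoritative backup procedure.

```shell
# Sketch of a production migration run: backup, inspect, apply, confirm.
run_migration() {
  backup_file="pre-migration-$(date +%Y%m%d-%H%M%S).dump"

  # 1. Take a backup first; stop if it fails.
  #    Note: pg_dump may reject Prisma-specific URL parameters such as ?schema=.
  pg_dump --format=custom --file="$backup_file" "$DATABASE_URL" || return 1

  # 2. Show pending migrations (may exit non-zero when some are pending).
  (cd apps/api && npx prisma migrate status) || true

  # 3. Apply pending migrations only.
  (cd apps/api && npx prisma migrate deploy) || return 1

  # 4. Confirm nothing is left pending.
  (cd apps/api && npx prisma migrate status)
}
```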
Important Notes
- `prisma migrate deploy` is safe: it only applies pending migrations and never resets data.
- Migrations are applied in order based on the migration directory timestamps.
- If a migration fails midway, check the `_prisma_migrations` table for a failed entry. Fix the issue, then re-run `prisma migrate deploy`.

WARNING: Never use `prisma migrate reset` or `prisma db push` in production. These commands can drop tables and delete data. Use only `prisma migrate deploy` for production databases.
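
The failed-entry check can be sketched as a query against `_prisma_migrations`. The sketch assumes PostgreSQL and `psql` on the PATH; the function names are illustrative. Prisma generally refuses to apply further migrations while a failed entry is recorded, so the entry usually has to be marked as rolled back with `prisma migrate resolve` before re-running; confirm this against the Prisma documentation for the version in use.

```shell
# List migrations that started but never finished and were not rolled back.
failed_migrations() {
  psql "$DATABASE_URL" --no-align --tuples-only --command \
    "SELECT migration_name, started_at FROM _prisma_migrations
     WHERE finished_at IS NULL AND rolled_back_at IS NULL
     ORDER BY started_at;"
}

# Mark a failed migration as rolled back so `migrate deploy` can retry it.
# Assumes the Prisma schema lives under apps/api.
mark_rolled_back() {
  name=${1:?usage: mark_rolled_back MIGRATION_NAME}
  (cd apps/api && npx prisma migrate resolve --rolled-back "$name")
}
```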
5. Rollback Procedures
Vercel Rollback (Frontend)
Option A: Dashboard (Instant)

- Open Vercel Dashboard > Project > Deployments.
- Find the previous successful deployment.
- Click the “…” menu > “Promote to Production”.
- The rollback takes effect immediately (no rebuild required).
Railway Rollback (Backend)
Option A: Dashboard (Redeploy Previous)

- Open Railway Dashboard > Project > API service > Deployments.
- Find the previous successful deployment.
- Click the deployment > “Redeploy”.
- Railway rebuilds from the previous commit’s Docker image.
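
Because both services deploy on push to `main`, reverting the offending commit is a code-level alternative to the dashboard rollbacks. It is slower, since both platforms rebuild. A minimal sketch with an illustrative function name:

```shell
# Revert one commit on main and push, which triggers a fresh deploy on
# both Vercel and Railway. For a merge commit, `git revert` also needs
# a parent selection (for example `-m 1`), which this sketch omits.
revert_and_redeploy() {
  bad_commit=${1:?usage: revert_and_redeploy COMMIT_SHA}
  git checkout main &&
    git pull --ff-only origin main &&
    git revert --no-edit "$bad_commit" &&
    git push origin main
}
```

Prefer the dashboard rollback during an outage and use the revert afterwards so that `main` matches what is running.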
Database Rollback
Database rollbacks are the most complex because Prisma does not support down migrations.

Scenario 1: Additive migration (new columns, new tables)

Rollback is usually not needed. Old application code ignores new columns and tables. Simply roll back the application code; the unused schema remains in place harmlessly.

Scenario 2: Destructive migration (dropped columns, renamed tables)

Restore from backup (see Database Backup Runbook):

- Do not roll back the application code yet; it may depend on the new schema.
- Restore the database from the pre-migration backup.
- Update `DATABASE_URL` to point to the restored instance.
- Roll back the application code.
- Redeploy.
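
The restore step can be sketched as follows, assuming PostgreSQL and a custom-format `pg_dump` archive. `restore_backup` and its arguments are illustrative; the Database Backup Runbook is the authoritative procedure and may use a different backup format.

```shell
# Restore a custom-format pg_dump archive into a fresh, empty database.
# RESTORE_URL is the connection string of the new instance (a placeholder,
# not a variable the project defines).
restore_backup() {
  backup_file=${1:?usage: restore_backup BACKUP_FILE RESTORE_URL}
  restore_url=${2:?usage: restore_backup BACKUP_FILE RESTORE_URL}
  [ -f "$backup_file" ] || { echo "backup not found: $backup_file" >&2; return 1; }
  pg_restore --no-owner --exit-on-error --dbname="$restore_url" "$backup_file"
}
```

Restoring into a new instance rather than over the live one keeps the damaged database available for inspection until the rollback is verified.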
6. Secret Rotation
Rotation Procedures by Secret
| Secret | Location | Rotation Steps |
|---|---|---|
| `BETTER_AUTH_SECRET` | Railway | 1. Generate new secret: `openssl rand -hex 32`. 2. Update in Railway env vars. 3. Redeploy API. 4. Impact: All existing sessions are invalidated; users must re-login. |
| `OPENAI_API_KEY` | Railway | 1. Create new key in OpenAI Dashboard. 2. Update in Railway env vars. 3. Redeploy API. 4. Revoke old key in OpenAI Dashboard. |
| `CREDENTIAL_ENCRYPTION_KEY` | Railway | WARNING: Rotating this key makes all existing encrypted tool credentials unreadable. 1. Decrypt all credentials with old key. 2. Update env var in Railway. 3. Re-encrypt all credentials with new key. 4. Redeploy API. |
| `RESEND_API_KEY` | Railway | 1. Create new key in Resend Dashboard. 2. Update in Railway env vars. 3. Redeploy API. 4. Revoke old key in Resend. |
| `SENTRY_DSN` | Railway + Vercel | DSN rarely changes. If needed: 1. Update in both Railway and Vercel env vars. 2. Redeploy both services. |
| `SENTRY_AUTH_TOKEN` | Vercel | 1. Create new token in Sentry. 2. Update in Vercel env vars. 3. Redeploy web. 4. Revoke old token. |
| `POSTHOG_API_KEY` | Railway + Vercel | 1. Rotate in PostHog Project Settings. 2. Update in both Railway and Vercel env vars. 3. Redeploy both services. |
General Rotation Procedure
For any secret:

- Generate a new credential in the service's dashboard.
- Update the env var in the deployment platform (Railway and/or Vercel).
- Redeploy the affected service(s) to pick up the new value.
- Verify the service works with the new credential (health check, test request).
- Revoke the old credential in the service’s dashboard.
Important: Always generate the new credential before revoking the old one. There will be a brief window where both credentials are valid — this is expected and prevents downtime.
Key Generation Commands
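
A sketch of commands for locally generated secrets. The hex form matches the `openssl rand -hex 32` command given for `BETTER_AUTH_SECRET`. The base64 variant is included only as an alternative encoding; this runbook does not state which format `CREDENTIAL_ENCRYPTION_KEY` expects, so check the application code before choosing one.

```shell
# 32 random bytes as 64 hex characters (the form used for BETTER_AUTH_SECRET).
generate_hex_secret() {
  openssl rand -hex 32
}

# 32 random bytes, base64-encoded (44 characters). Use only if the
# consuming code expects base64.
generate_base64_secret() {
  openssl rand -base64 32
}
```

Paste generated values straight into the Railway or Vercel dashboard; avoid committing them or leaving them in shell history files.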
7. Post-Deploy Verification
After every production deployment, verify:

- API health check passes: `GET /health` on the API returns a successful response.
- Web loads correctly: Visit https://app.agentix.app and confirm the login page renders without errors.
- Sentry: Check the Sentry Dashboard for new errors in the first 15 minutes post-deploy. A spike in errors indicates a regression.
- BetterStack: Confirm the uptime monitor shows green for both web and API endpoints.
- If migration was run: Spot-check affected data via API requests or Prisma Studio.
- If webhook changes: Send a test WhatsApp message and verify it is processed correctly. Check Railway API logs for the webhook event and worker processing.
- BullMQ workers: Check Railway API logs for worker startup messages confirming all 3 workers are running (message-processing, broadcast-sending, audit-processing).
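
Two of these checks lend themselves to small helpers. The sketch below prints the HTTP status for a URL and reports which of the three worker names never appear in a saved log excerpt. It assumes each worker's startup log line contains its name; the actual log format is not specified in this runbook, and the function names are illustrative.

```shell
# Print the HTTP status code for a URL ("000" if the request fails).
http_status() {
  curl -s -o /dev/null -w '%{http_code}' --max-time 10 "$1"
}

# Read log text on stdin and print each expected worker name that never appears.
missing_workers() {
  logs=$(cat)
  for worker in message-processing broadcast-sending audit-processing; do
    printf '%s\n' "$logs" | grep -q -- "$worker" || echo "$worker"
  done
}
```

Example: `http_status https://app.agentix.app` should print a 2xx code, and `missing_workers < api-logs.txt` should print nothing, where `api-logs.txt` is a log excerpt copied from the Railway dashboard.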
8. Emergency Procedures
Site Down After Deploy
- Immediately roll back using the fastest method:
  - Vercel: Dashboard > Promote previous deployment (instant).
  - Railway: Dashboard > Redeploy previous deployment.
- Verify the rollback resolved the issue (health check, site load).
- Investigate the root cause on the rolled-back commit.
- Fix, test on staging, then re-deploy.
Database Migration Failed
- Do NOT roll back application code yet; the code may depend on the new schema.
- Check the `_prisma_migrations` table for the failed entry:
  - If migration partially applied (started but not finished): Restore from the pre-migration backup (see Database Backup Runbook).
  - If migration failed cleanly (error before any changes): Fix the migration SQL and re-run `prisma migrate deploy`.
- After fixing: redeploy and verify.
Secret Leaked
- Immediately rotate the leaked secret (see Section 6).
- Check audit logs and access logs for unauthorized access during the exposure window.
- If the secret was `BETTER_AUTH_SECRET`: all sessions are invalidated on rotation. Users must log in again.
- If the secret was `OPENAI_API_KEY`: check the OpenAI usage dashboard for unexpected charges.
- If the secret was `CREDENTIAL_ENCRYPTION_KEY`: all tool credentials need re-encryption after rotation.
- Notify affected users if customer data may have been exposed.
- Document the incident: what leaked, exposure window, impact, remediation.
High Error Rate After Deploy (No Outage)
- Check Sentry for the new error pattern.
- If errors are isolated to a specific feature: consider a targeted fix instead of full rollback.
- If errors affect core functionality (auth, webhooks, workflows): roll back immediately.
- If errors are transient (connection timeouts, cold starts): monitor for 5 minutes. Railway and Vercel may need time to stabilize after deploy.