# Rotor — Full Documentation > Complete docs for Rotor (rotor.sh). Durable job queues for GTM engineers. > Source: https://rotor.sh/llms-full.txt | Summary: https://rotor.sh/llms.txt --- ## Get Started ### Quickstart URL: https://rotor.sh/docs/quickstart This quickstart is a strict, opinionated path. Every command works on macOS and Linux. Windows users: run from Git Bash or WSL. **Total wall-clock:** ~20 minutes if you're new; under 10 minutes once you're warmed up. ## Prerequisites You need three things before starting: - **Node 18+** on your `PATH`. Verify with `node --version`. - **A Rotor workspace API key** starting with `rt_ws_`. Sign up at [rotor.sh/signup](https://rotor.sh/signup) — your key is shown once after checkout. Treat it like a database password. - **A server you can run on a public URL** for step 4 (callback handler). Localhost works during development if you tunnel via ngrok / cloudflared / Tailscale Funnel. ## Install the CLI ```bash npm i -g @rotorsh/cli rotor --version # confirm 0.1.x prints ``` ## Login ```bash rotor login # Paste your rt_ws_* key when prompted. Stored at ~/.rotor/config.json (mode 0600). ``` You can also export the key as `ROTOR_API_KEY` in your shell — `rotor` commands prefer the env var over the saved config when both are present. ## Create a queue ```bash rotor queues create welcome-emails \ --concurrency 5 \ --retry-attempts 3 ``` The queue is now live and accepting jobs. Confirm with `rotor queues list`. ## Create a callback handler This is where most of the time goes. Rotor delivers each job as an HMAC-signed `POST` to the URL you configure on the queue. Your handler verifies the signature, runs your job logic, and returns a 2xx on success. The simplest production-quality handler — Hono on Node, using `verifyRotorSignature` from the SDK: ```bash npm i hono @rotorsh/sdk ``` ```typescript // server.ts import { Hono } from "hono"; import { serve } from "@hono/node-server"; import { verifyRotorSignature } from "@rotorsh/sdk"; const app = new Hono(); app.post("/rotor-callback", async (c) => { const body = await c.req.text(); const ok = verifyRotorSignature( body, c.req.header("x-rotor-signature"), process.env.ROTOR_CALLBACK_SECRET!, { id: c.req.header("x-rotor-job-id")!, timestamp: Number(c.req.header("x-rotor-timestamp")), maxAgeSeconds: 300, }, ); if (!ok) return c.text("invalid signature", 401); // Your job logic — keep it idempotent (BullMQ delivers at-least-once) const { contactId, campaignId } = JSON.parse(body); console.log(`Sending welcome email: contact=${contactId} campaign=${campaignId}`); return c.json({ status: "ok" }); }); serve({ fetch: app.fetch, port: 3000 }); console.log("listening on :3000"); ``` ```bash # Run it ROTOR_CALLBACK_SECRET=__placeholder__ npx tsx server.ts ``` You'll set the real secret in the next step. For local development, expose `:3000` to the public internet via your tunnel of choice — `cloudflared tunnel --url http://localhost:3000` is one option. **Idempotent handlers.** BullMQ delivers jobs at-least-once. Use the `x-rotor-job-id` header as a dedup key (e.g., upsert on `(workspace, jobId)` in your DB) so a redelivery after a worker crash doesn't double-send. ## Wire callback URL + secret into the queue ```bash rotor queues update welcome-emails \ --callback-url https://your-tunnel.example.com/rotor-callback \ --rotate-callback-secret # Output: New callback secret: # Copy this — it's shown once. ``` Set the printed secret as `ROTOR_CALLBACK_SECRET` in your server's environment and restart your handler. ## Create a cron schedule ```bash rotor schedules create \ --cron "*/2 * * * *" \ --timezone "UTC" \ --queue welcome-emails # Fires every 2 minutes. Adjust to "0 9 * * 1-5" for 9am weekdays. ``` ## Watch the first cron tick land ```bash # Wait up to 2 minutes for the first cron tick rotor schedules list # Look for `last_fired_at` advancing within the last 2 minutes. # Or watch live: rotor status --watch ``` Within 2 minutes you should see your `welcome-emails` queue's depth advance and your callback handler's logs print the contact/campaign payload. ## Where next Add Rotor tools to Claude Code, Cursor, or Claude Desktop. Worker class, idempotency helpers, error hierarchy. Side-by-side comparison + migration playbook. From cron-as-a-service to Rotor schedules. --- ## Guides ### Node.js Guide URL: https://rotor.sh/docs/guides/node-quickstart ## Prerequisites - Node.js 18+ - Redis 7.4+ (Railway or self-hosted; must be `noeviction` policy) - A Rotor workspace key (`rt_ws_...`) ## Install ```bash pnpm add @rotorsh/sdk bullmq ioredis ``` Rotor pins a specific BullMQ major version. Check your `package.json` after install — if `bullmq` is a different major than `@rotorsh/sdk` expects, you'll get a startup warning. Pin to the version listed in `@rotorsh/sdk`'s peer dependencies. ## The 5 Resources ### 1. Queues Create and inspect queues: ```typescript import { Rotor } from "@rotorsh/sdk"; const rotor = new Rotor({ apiKey: process.env.ROTOR_API_KEY! }); // Create a queue const queue = await rotor.queues.create({ name: "outreach", retry_attempts: 3, }); // List queues const queues = await rotor.queues.list(); console.log(queues.data.map((q) => q.name)); // Get queue stats const stats = await rotor.queues.stats("outreach"); console.log(stats.waiting, stats.active, stats.completed, stats.failed); ``` ### 2. Jobs Enqueue individual or batch jobs: ```typescript // Single job const job = await rotor.jobs.enqueue("outreach", { payload: { leadId: "lead_123" }, }); // Batch (quota checked atomically — all-or-nothing) const jobs = await rotor.jobs.addBatch("outreach", [ { payload: { leadId: "lead_001" } }, { payload: { leadId: "lead_002" } }, { payload: { leadId: "lead_003" } }, ]); // Get job status const status = await rotor.jobs.get("outreach", job.id); console.log(status.state); // waiting | active | completed | failed | delayed ``` ### 3. Schedules Recurring jobs with cron syntax: ```typescript // Create a recurring enrichment job const schedule = await rotor.schedules.create({ queue: "enrichment", name: "nightly-enrichment", cron: "0 2 * * *", // 2am UTC daily payload: { source: "clearbit" }, timezone: "UTC", }); // Pause / resume via update await rotor.schedules.update(schedule.id, { enabled: false }); await rotor.schedules.update(schedule.id, { enabled: true }); ``` ### 4. Status Check workspace health and quota usage in a single call: ```typescript const status = await rotor.status.get(); // { // redis: 'ok', // api: 'ok', // version: '1.0.0', // quota: { used: 12450, limit: 100000, plan: 'pro', resetAt: '2025-05-01T00:00:00Z' } // } ``` ### 5. Usage Billing-period usage breakdown: ```typescript const usage = await rotor.usage.current(); // { period: { from, to }, executions: { total, byQueue: {...} } } ``` ## Idempotent Handler Pattern Prevent duplicate outreach when jobs retry: ```typescript import { RotorWorker } from "@rotorsh/sdk"; const worker = new RotorWorker({ workspaceId: process.env.WORKSPACE_ID!, queueName: "outreach", connection: process.env.REDIS_URL!, concurrency: 5, processor: async (job) => { const { leadId } = job.data.payload; // Use job.id as idempotency key for downstream APIs const alreadySent = await db.outreach.findUnique({ where: { rotorJobId: job.id }, }); if (alreadySent) { return { skipped: true, reason: "already-sent" }; } await sendEmail(leadId); await db.outreach.create({ data: { rotorJobId: job.id, leadId, sentAt: new Date() }, }); return { sent: true }; }, }); await worker.ready(); ``` ## SIGTERM Drain Graceful shutdown — in-flight jobs complete before the process exits: ```typescript const worker = new RotorWorker({ workspaceId: process.env.WORKSPACE_ID!, queueName: "outreach", connection: process.env.REDIS_URL!, concurrency: 5, drainTimeout: 30_000, // wait up to 30s for in-flight jobs processor: handler, }); // RotorWorker registers SIGTERM/SIGINT handlers automatically. // On Railway/Fly.io/Heroku: deploy sends SIGTERM → worker finishes // current jobs → process exits cleanly. No jobs are lost. await worker.ready(); ``` ## Railway Deploy Notes 1. Set `ROTOR_API_KEY` in Railway environment variables 2. Set `REDIS_URL` to your Railway Redis private URL (use `redis://...`) 3. Ensure Redis Memory Policy is `noeviction`: ``` railway run redis-cli CONFIG SET maxmemory-policy noeviction ``` 4. Worker `Procfile` or start command: `node dist/worker.js` 5. Deploy: Railway will send SIGTERM on deploys — SIGTERM drain ensures zero job loss ## Next Steps - [CLI Reference](/docs/guides/cli-reference) — all `rotor` commands - [Job Tags](/docs/guides/job-tags) — label and filter jobs by campaign or contact - [Concurrency Keys](/docs/guides/concurrency-keys) — prevent duplicate concurrent runs - [Failure Alerts](/docs/guides/failure-alerts) — get notified when jobs terminally fail - [Idempotency](/docs/guides/idempotency) — two-layer dedup strategy - [Webhooks](/docs/guides/webhooks) — receive HTTP callbacks on job lifecycle events - [API Reference](/docs/api-reference/introduction) — all REST endpoints ### CLI quickstart URL: https://rotor.sh/docs/guides/cli-quickstart The CLI quickstart has been consolidated into the main [Quickstart](/quickstart). That page walks through `npm i -g @rotorsh/cli`, `rotor login`, queue + schedule creation, and writing a HMAC-verifying callback handler — start to finish in under 30 minutes. ### CLI Reference URL: https://rotor.sh/docs/guides/cli-reference Install the CLI globally: ```bash npm i -g @rotorsh/cli rotor --version ``` Authenticate once, then all commands use your saved key: ```bash rotor login # Paste your rt_ws_* key. Saved to ~/.rotor/config.json (mode 0600). ``` You can also pass the key inline via `ROTOR_API_KEY` — this takes precedence over the saved config. --- ## rotor queues ### `rotor queues create` ```bash rotor queues create [options] Options: --concurrency Max jobs processed in parallel (default: 1) --retry-attempts Attempts per job before moving to DLQ (default: 3) --callback-url HTTPS URL Rotor POSTs each job to --rotate-callback-secret Generate a new HMAC signing secret (printed once) ``` ```bash rotor queues create enrichment \ --concurrency 10 \ --retry-attempts 5 \ --callback-url https://your-app.example.com/rotor/enrichment ``` ### `rotor queues list` ```bash rotor queues list # NAME CONCURRENCY JOBS (waiting/active/failed) # enrichment 10 142 / 3 / 1 # outreach 5 0 / 0 / 0 ``` ### `rotor queues get` ```bash rotor queues get ``` Prints full queue config: callback URL, retry settings, concurrency, metrics. ### `rotor queues update` ```bash rotor queues update [options] Options: --concurrency --retry-attempts --callback-url --rotate-callback-secret Rotate the signing secret; new value printed once ``` ```bash # Update callback URL and rotate the secret in one command rotor queues update enrichment \ --callback-url https://new-host.example.com/rotor/enrichment \ --rotate-callback-secret ``` ### `rotor queues delete` ```bash rotor queues delete # Prompts for confirmation. Pass --force to skip. ``` --- ## rotor jobs ### `rotor jobs enqueue` ```bash rotor jobs enqueue --payload [options] Options: --payload Job payload as a JSON string (required) --delay Delay before job becomes eligible --tags Comma-separated list of tags --concurrency-key Mutual exclusion key (one job per key at a time) --idempotency-key Deduplicate — same key = same job, not a new one ``` ```bash rotor jobs enqueue outreach \ --payload '{"contactId":"cid_123","campaignId":"camp_456"}' \ --tags "campaign:q2-outbound,contact:cid_123" \ --concurrency-key "contact:cid_123" ``` ### `rotor jobs get` ```bash rotor jobs get ``` Prints job state, payload, tags, retry count, timestamps, and failure reason if applicable. ### `rotor jobs list` ```bash rotor jobs list [options] Options: --status Filter by state: waiting|active|completed|failed|delayed --tag Filter by tag --limit Results per page (default: 50) ``` ```bash rotor jobs list enrichment --status failed rotor jobs list outreach --tag campaign:q2-outbound ``` ### `rotor jobs retry` ```bash rotor jobs retry # Moves a failed job back to waiting. Resets attempt count. ``` ### `rotor jobs cancel` ```bash rotor jobs cancel # Cancels a waiting or delayed job. Cannot cancel an active job. ``` --- ## rotor schedules ### `rotor schedules create` ```bash rotor schedules create [options] Options: --queue Queue to enqueue jobs into (required) --cron Cron expression (required) --timezone IANA timezone (required) — e.g. "America/New_York" --name Human-readable name for the schedule --payload Payload for each generated job ``` ```bash rotor schedules create \ --queue enrichment \ --cron "0 9 * * 1-5" \ --timezone "America/New_York" \ --name "weekday-morning-enrichment" \ --payload '{"source":"clearbit"}' ``` ### `rotor schedules list` ```bash rotor schedules list # NAME CRON NEXT FIRE # weekday-morning-enrichment 0 9 * * 1-5 2026-05-21T09:00:00-04:00 ``` ### `rotor schedules pause` ```bash rotor schedules pause # No new jobs fire until resumed. In-flight jobs are not affected. ``` ### `rotor schedules resume` ```bash rotor schedules resume ``` ### `rotor schedules delete` ```bash rotor schedules delete [--force] ``` --- ## rotor secrets Secrets are workspace-scoped encrypted values. Reference them in callback URLs and job payloads as `${{ secrets.SECRET_NAME }}` — Rotor injects the plaintext at dispatch time. The value never appears in your code or logs. ### `rotor secrets create` ```bash rotor secrets create --value # Or pipe the value to avoid shell history: echo -n "sk_live_abc123" | rotor secrets create STRIPE_KEY --stdin ``` Secret names must be `UPPER_SNAKE_CASE`. Values are encrypted at rest (AES-256). ### `rotor secrets list` ```bash rotor secrets list # NAME HINT CREATED # STRIPE_KEY sk_li… 2026-05-01 # RESEND_KEY re_… 2026-05-10 ``` Only the masked hint is shown. Plaintext values are never returned after creation. ### `rotor secrets delete` ```bash rotor secrets delete [--force] ``` --- ## rotor status ```bash rotor status # API: ok # Redis: ok # Plan: pro # Quota: 12,450 / 100,000 jobs used this period # Resets: 2026-06-01 ``` Add `--watch` to poll every 5 seconds: ```bash rotor status --watch ``` --- ## rotor template ```bash rotor template add ``` Scaffolds a starter workflow into your current directory. Available templates: | Template | Description | |----------|-------------| | `outbound-with-approvals` | Multi-step outbound sequence with human approval gate | | `multi-step-outbound` | Email sequence with `step.waitForEvent` reply detection | | `event-driven-enrichment` | Parallel contact enrichment on `campaign.started` | | `scheduled-with-approval` | Weekly cron report requiring approval before delivery | ```bash rotor template add outbound-with-approvals ``` --- ## Global flags All commands accept: ``` --api-key Override saved key for this command only --json Output raw JSON instead of formatted table --quiet Suppress all output except errors --help Show help for any command ``` ### MCP quickstart URL: https://rotor.sh/docs/guides/mcp-quickstart The MCP quickstart has been consolidated into the [MCP install page](/mcp/install). That page has copy-pasteable `.mcp.json` snippets for Claude Code, Claude Desktop, and Cursor, plus a "first prompt" demo using `rotor_create_queue`. ### Schedules URL: https://rotor.sh/docs/guides/schedules A schedule is a named cron rule attached to a queue. When a tick fires, Rotor enqueues a job to that queue with the payload you specified — your callback handler receives it exactly like any other job. You don't run a cron process; Rotor does. Use schedules when you need to: - Kick off a nightly enrichment or sync at a specific time in a specific timezone - Send a weekly digest every Monday morning in each customer's local time - Trigger a daily report at 9 AM New York time regardless of daylight saving ## Creating a schedule ### CLI ```bash rotor schedules create \ --queue enrichment \ --name nightly-enrichment \ --cron "0 2 * * *" \ --timezone "America/New_York" \ --job-data '{"source":"clearbit"}' ``` ### SDK ```typescript import { Rotor } from "@rotorsh/sdk"; const rotor = new Rotor({ apiKey: process.env.ROTOR_API_KEY! }); const schedule = await rotor.schedules.create({ queue_name: "enrichment", name: "nightly-enrichment", cron: "0 2 * * *", timezone: "America/New_York", job_data: { source: "clearbit" }, enabled: true, }); console.log(schedule.external_id); // sched_<32hex> — save this ``` The `external_id` (`sched_<32hex>`) is your handle for all subsequent operations. Save it. ## The timezone requirement `timezone` is required. Rotor rejects schedule creation without it. The reason matters: cron expressions like `0 9 * * 1-5` (9 AM weekdays) are meaningless without a timezone. If you use UTC when your team is in New York, the job fires at 9 AM UTC — which is 5 AM ET in winter and 4 AM ET when daylight saving ends. With a timezone, Rotor recalculates the next-fire timestamp whenever clocks change so the job always fires at the wall-clock time you intended. Pass a valid IANA timezone string (e.g., `"America/New_York"`, `"Europe/London"`, `"Asia/Tokyo"`). Offsets like `"+05:30"` are not accepted — they don't account for DST transitions. ## `external_id` vs internal `id` Every schedule has two identifiers: | Field | Format | Use | |-------|--------|-----| | `external_id` | `sched_<32hex>` | Use in all SDK and CLI calls | | `id` | `ws_abc:name` | BullMQ internal key — ignore this | Always use `external_id`. The internal `id` is an implementation detail that may change. ## Updating a schedule You can update the cron expression, timezone, payload, or enabled state independently — any combination of fields. ```typescript // Change the cron — now fires at 3 AM instead of 2 AM await rotor.schedules.update("sched_abc123...", { cron: "0 3 * * *", }); // Change the payload await rotor.schedules.update("sched_abc123...", { job_data: { source: "apollo", limit: 500 }, }); // Change timezone — e.g., follow a customer moving regions await rotor.schedules.update("sched_abc123...", { timezone: "Europe/London", }); ``` ## Pausing and resuming Set `enabled: false` to pause a schedule. No ticks fire while paused. Set `enabled: true` to resume — the next tick fires at the next cron-calculated time after resume. ```typescript // Pause await rotor.schedules.update("sched_abc123...", { enabled: false }); // Resume await rotor.schedules.update("sched_abc123...", { enabled: true }); ``` Pausing a schedule does not cancel jobs that are already enqueued and waiting in the queue. It only prevents future ticks from creating new jobs. ## Deleting a schedule ```typescript await rotor.schedules.delete("sched_abc123..."); ``` CLI: ```bash rotor schedules delete sched_abc123... ``` Deletion is permanent. The schedule stops firing immediately. Run history is retained for your plan's retention window. ## Fire now `fireNow` enqueues a job immediately — the same job the schedule would enqueue at its next tick. Use it to test a new schedule without waiting for the first cron tick, or to manually backfill a missed run. ```typescript const { jobId } = await rotor.schedules.fireNow("sched_abc123..."); console.log(jobId); // job_ — track this in rotor.jobs.get() ``` CLI: ```bash rotor schedules fire-now sched_abc123... ``` Manual `fireNow` calls appear in the run history alongside cron-driven ticks — both are ScheduleRun records. ## Run history Rotor records every tick (cron-driven and manual) as a `ScheduleRun`. ```typescript const { runs } = await rotor.schedules.runs("sched_abc123...").list({ limit: 20 }); for (const run of runs) { console.log(run.id); // sr_<32hex> console.log(run.scheduled_for); // ISO timestamp — when the tick was due console.log(run.fired_at); // ISO timestamp — when Rotor actually enqueued console.log(run.job_id); // job_ — the resulting job console.log(run.callback_status); // pending | 2xx | 4xx | 5xx | timeout | null console.log(run.error); // error message if status is 4xx/5xx/timeout } ``` ### `callback_status` values | Value | Meaning | |-------|---------| | `pending` | Job enqueued, callback not yet called | | `2xx` | Your handler returned a success response | | `4xx` | Your handler returned a client error (check `error` field) | | `5xx` | Your handler returned a server error — will retry per queue config | | `timeout` | Your handler did not respond within the deadline | | `null` | Status not yet recorded | CLI: ```bash rotor schedules get sched_abc123... ``` ## Listing schedules ```bash rotor schedules list ``` ```typescript const schedules = await rotor.schedules.list(); ``` ## Common cron patterns | Expression | Meaning | Typical timezone | |------------|---------|-----------------| | `0 9 * * 1-5` | 9 AM weekdays | `America/New_York` | | `0 2 * * *` | 2 AM daily | `UTC` | | `0 8 * * 1` | 8 AM every Monday | `America/Chicago` | | `0 0 1 * *` | Midnight on the 1st of each month | `UTC` | | `*/15 * * * *` | Every 15 minutes | `UTC` | | `0 18 * * 5` | 6 PM every Friday | `Europe/London` | Use [crontab.guru](https://crontab.guru) to verify expressions before creating a schedule. Paste the expression and confirm the human-readable description matches your intent. ## Next steps Full walkthrough of all Rotor resources including schedules. All `rotor schedules` subcommands and flags. Get notified when a scheduled job terminally fails. Gate jobs behind human approval before they run. ### Approvals URL: https://rotor.sh/docs/guides/approvals Approvals are human-in-the-loop gates on jobs. When a queue has approvals enabled, every job enqueued to it pauses and waits for a human to approve or reject before the callback handler is called. The job payload is visible during review so you can make an informed decision. Use approvals when: - An AI agent is generating outbound emails and a human should review before send - A financial transaction above a threshold requires sign-off - Sensitive data access (e.g., exporting a contact list) needs audit-trail approval - An automated process touches production systems and you want a manual gate ## How the flow works Set `approval_required: true` when creating or updating the queue. Your code enqueues a job normally. The job enters a `pending_approval` state instead of `waiting`. An `approval_record` row is created with `status: "pending"` and an `apv_<32hex>` ID. If you've configured a webhook, Rotor sends an `approval.pending` event to your endpoint with the approval ID, job ID, queue name, and payload. Via CLI, SDK, REST API, or the MCP `approve_job` tool in Claude Code. On approval, the job moves to `waiting` and your callback handler runs. On rejection, the job is marked `failed` with the rejection reason. ## Enable approvals on a queue ### Create a new queue with approvals ```typescript import { Rotor } from "@rotorsh/sdk"; const rotor = new Rotor({ apiKey: process.env.ROTOR_API_KEY! }); await rotor.queues.create({ name: "ai-outreach", callback_url: "https://your-app.example.com/rotor/ai-outreach", approval_required: true, defaultJobOptions: { attempts: 3 }, }); ``` ### Enable approvals on an existing queue ```typescript await rotor.queues.update("ai-outreach", { approval_required: true, }); ``` ## Listing pending approvals ### CLI ```bash rotor approvals list --status pending ``` ### SDK ```typescript const { approvals } = await rotor.approvals.list({ status: "pending", limit: 50, }); for (const apv of approvals) { console.log(apv.id); // apv_<32hex> console.log(apv.jobId); // the job waiting for approval console.log(apv.queueName); // which queue console.log(apv.payload); // the job payload — inspect before deciding console.log(apv.createdAt); // when the approval record was created } ``` ### Filter by queue ```typescript const { approvals } = await rotor.approvals.list({ status: "pending", queue: "ai-outreach", }); ``` ### Get a single approval ```typescript const apv = await rotor.approvals.get("apv_abc123..."); console.log(apv.payload); // inspect the job payload ``` ## Approving and rejecting ### CLI ```bash # Approve rotor approvals approve apv_abc123... # Reject rotor approvals reject apv_abc123... ``` ### SDK ```typescript // Approve — job proceeds to your callback handler await rotor.approvals.approve("apv_abc123..."); // Reject — job is marked failed await rotor.approvals.reject("apv_abc123..."); ``` ### REST API ```bash # Approve curl -X POST "https://api.rotor.sh/v1/approvals/apv_abc123.../approve" \ -H "Authorization: Bearer $ROTOR_API_KEY" # Reject curl -X POST "https://api.rotor.sh/v1/approvals/apv_abc123.../reject" \ -H "Authorization: Bearer $ROTOR_API_KEY" ``` ## The `approval.pending` webhook If you configure a webhook endpoint, Rotor sends a POST to it when an approval record is created. The payload type is `ApprovalPendingData`. ```typescript // Webhook payload shape type ApprovalPendingData = { approvalId: string; // apv_<32hex> jobId: string; // the waiting job queueName: string; // which queue payload: unknown; // the job payload — safe to log and display workspaceId: string; // your workspace }; ``` Example payload: ```json { "event": "approval.pending", "data": { "approvalId": "apv_a1b2c3d4...", "jobId": "job_e5f6g7h8...", "queueName": "ai-outreach", "payload": { "contactId": "cid_123", "subject": "Quick question about your stack", "body": "Hi Sarah, ..." }, "workspaceId": "ws_abc123" } } ``` Use this webhook to post a Slack message to your team, create a Notion entry, or trigger any other notification flow. Your webhook handler can embed a direct link to the approval using the `approvalId`. You can approve and reject directly from a Slack message by posting the `approvalId` back to the Rotor API via a Slack shortcut or button. The MCP `approve_job` tool makes this even easier from Claude Code. ## Using the MCP `approve_job` tool If you use the [Rotor MCP server](/docs/mcp/install), Claude Code gains an `approve_job` tool. From your terminal you can list pending approvals and approve or reject them without leaving your editor. ``` # In Claude Code > List pending approvals for the ai-outreach queue > Approve apv_abc123... > Reject apv_xyz789... — the subject line is too aggressive ``` This is the fastest review path for Claude Code-using teams: see the payload, make a call, continue building. See [MCP quickstart](/docs/guides/mcp-quickstart) to connect the MCP server. ## What happens after rejection A rejected job is marked `failed` with `reason: "rejected"`. It does not retry — rejection is terminal. The job appears in your failed jobs list with the rejection timestamp. If you need to re-attempt a rejected job, enqueue a new job with the corrected payload. Approvals pause the job indefinitely — there is no automatic timeout. If an approval sits pending for a long time, the job will not run until someone acts. Build a notification flow (webhook → Slack, email) so approvals don't silently stall. ## Filtering by status The `status` filter accepts: `pending`, `approved`, `rejected`. ```typescript // All approved approvals for audit purposes const { approvals } = await rotor.approvals.list({ status: "approved", queue: "ai-outreach", limit: 100, }); ``` Use `cursor` for pagination: ```typescript const page1 = await rotor.approvals.list({ status: "pending", limit: 20 }); const page2 = await rotor.approvals.list({ status: "pending", limit: 20, cursor: page1.nextCursor, }); ``` ## Next steps Run automated content checks before jobs ever reach the approval queue. Connect Claude Code to Rotor and approve jobs from your terminal. Configure webhook endpoints for `approval.pending` events. Combine approvals with recurring scheduled jobs. ### Guardrails URL: https://rotor.sh/docs/guides/guardrails Guardrails are a pre-dispatch content policy layer. Before Rotor calls your callback URL, it runs each job payload through a configurable pipeline of checks. A job that fails a check is blocked — it never reaches your handler. This protects you from: - Accidentally emailing opted-out contacts - Leaking PII in job payloads that flow into downstream systems - Referencing competitor names in AI-generated outreach - Sending off-brand copy that passed no human review Guardrails apply to **all queues in your workspace**. There is no per-queue override. ## The pipeline Four rule types run in order, cheapest first. Each stage short-circuits on block — if a job is blocked by the DNC check, the remaining stages don't run. | Stage | Rule type | Blocks or modifies | |-------|-----------|-------------------| | 1 | **DNC** — do-not-contact list | Blocks matching email or domain | | 2 | **Competitor block** — domain deny-list | Blocks jobs referencing competitor domains | | 3 | **PII** — personally identifiable information | Redacts or blocks based on `piiMode` | | 4 | **Brand tone** — LLM judge (Claude Haiku) | Blocks if score falls below rubric threshold | Running cheaper checks first (DNC and competitor are simple string lookups) means most blocked jobs exit in stage 1 or 2 without incurring LLM costs. ## Viewing your current config ```typescript import { Rotor } from "@rotorsh/sdk"; const rotor = new Rotor({ apiKey: process.env.ROTOR_API_KEY! }); const config = await rotor.guardrails.get(); console.log(config); // GuardrailConfig ``` The `GuardrailConfig` object has a `grc_` prefixed ID and contains the current state of all four rule types. ## Enabling each rule type ### DNC — do-not-contact list Block jobs where the payload contains an email address or domain on your DNC list. ```typescript await rotor.guardrails.update({ dncEmails: ["alice@example.com", "bob@example.com"], dncDomains: ["noemail.example.org"], }); ``` Add entries incrementally without re-posting the full list: ```typescript // Append a single unsubscribe await rotor.guardrails.appendDnc({ emails: ["newunsub@example.com"], }); // Append a domain block await rotor.guardrails.appendDnc({ domains: ["competitor-customer.io"], }); ``` Remove a single email: ```typescript await rotor.guardrails.removeDncEmail("alice@example.com"); ``` Use `appendDnc` for real-time unsubscribes — call it from your unsubscribe webhook handler so the contact is protected immediately without rebuilding the full list. ### Managing a large DNC list For bulk imports (thousands of emails), build the full list and pass it to `update`: ```typescript // Build from your CRM unsubscribe export const emails = await fetchUnsubscribedEmails(); // string[] await rotor.guardrails.update({ dncEmails: emails, }); ``` `update` replaces the entire DNC list for the fields you pass. If you pass `dncEmails`, the previous `dncEmails` value is replaced. If you only want to add entries, use `appendDnc` instead. ### Competitor block Block jobs that reference any of your competitor's domains anywhere in the payload. ```typescript await rotor.guardrails.update({ competitorDomains: ["rivalcrm.com", "othertool.io", "competitorapp.co"], }); ``` Rotor scans the serialized job payload for any occurrence of these domain strings. A job mentioning `rivalcrm.com` in a message body, subject line, or URL field is blocked. ### PII detection PII detection scans the payload for patterns like email addresses, phone numbers, SSNs, and credit card numbers. You choose the response mode. #### Redact mode Detected PII is replaced with a placeholder before the job is dispatched. The job runs with the redacted payload. ```typescript await rotor.guardrails.update({ piiRedactionEnabled: true, piiMode: "redact", }); ``` A phone number like `555-867-5309` becomes `[REDACTED-PHONE]` in the payload your handler receives. #### Block mode The job is blocked entirely if PII is detected. Use this when your handler should never receive raw PII. ```typescript await rotor.guardrails.update({ piiRedactionEnabled: true, piiMode: "block", }); ``` | Mode | Behavior | Use when | |------|---------|---------| | `redact` | Strips PII, job runs with sanitized payload | Handler is safe to receive data but PII shouldn't flow through | | `block` | Stops the job entirely | Handler must never see raw PII | ### Brand tone judge The brand tone judge uses Claude Haiku to score the job payload against a rubric you define. If the score falls below the threshold, the job is blocked. ```typescript await rotor.guardrails.update({ brandToneJudgeEnabled: true, brandToneRubric: "Professional and helpful. Do not use urgency language, scarcity claims, or manipulative CTAs. " + "Avoid phrases like 'Act now', 'Limited time', 'Don't miss out'. " + "Tone should feel like a knowledgeable peer, not a sales rep.", }); ``` Brand tone is stage 4 — it runs last. It only fires if the job passed DNC, competitor, and PII checks. This means you're only paying for LLM inference on payloads that are otherwise clean. #### Writing an effective rubric The rubric is a plain-text description of acceptable tone. Be specific: - Describe what good looks like ("conversational, concise, direct") - Explicitly call out what to avoid ("no urgency language", "no superlatives like 'best' or 'revolutionary'") - Include examples of acceptable and unacceptable phrases if needed Vague rubrics produce inconsistent results. Specific rubrics produce reliable blocks. ## When a job is blocked When any guardrail stage blocks a job, Rotor: 1. Sets the job status to `guardrail.blocked` 2. Fires a `guardrail.blocked` webhook event (if configured) 3. Records which stage blocked it and why The job does not reach your callback handler. It does not retry. ### The `guardrail.blocked` webhook payload ```json { "event": "guardrail.blocked", "data": { "jobId": "job_abc123...", "queueName": "ai-outreach", "blockedBy": "dnc", "reason": "Email address matched DNC list", "payload": { "to": "alice@example.com", "subject": "..." } } } ``` `blockedBy` is one of: `dnc`, `competitor`, `pii`, `brand_tone`. ## Use case: AI outbound agent A common Rotor pattern for GTM teams: an AI agent generates personalized outreach emails, enqueues them to an `ai-outreach` queue, and a callback handler sends via your ESP. Guardrails protect every job before it leaves the system. ```typescript // Configure guardrails once await rotor.guardrails.update({ // Block opted-out contacts dncEmails: await loadUnsubscribes(), dncDomains: await loadBouncedDomains(), // Never mention competitors by name competitorDomains: ["rivalcrm.com", "otherplatform.io"], // Redact any PII the LLM accidentally included piiRedactionEnabled: true, piiMode: "redact", // Score brand tone before send brandToneJudgeEnabled: true, brandToneRubric: "Helpful and direct. No urgency language. No superlatives. " + "Reads like a message from a thoughtful colleague, not marketing copy.", }); // Enqueue AI-generated emails — guardrails run automatically await rotor.jobs.enqueueBatch("ai-outreach", emails.map((email) => ({ payload: email })) ); ``` Any job that references an opted-out contact, a competitor, slips in a phone number, or fails the tone check is silently dropped — and you get a `guardrail.blocked` webhook so you can review and correct the source. Hook `guardrail.blocked` events into your Slack channel. If the brand tone judge starts blocking frequently, it usually means your prompt has drifted — review the rubric and your system prompt together. ## Full config reference | Field | Type | Description | |-------|------|-------------| | `dncEmails` | `string[]` | Email addresses to block | | `dncDomains` | `string[]` | Domains to block | | `competitorDomains` | `string[]` | Domains that trigger a competitor block | | `piiRedactionEnabled` | `boolean` | Whether PII detection is active | | `piiMode` | `"redact" \| "block"` | What to do when PII is detected | | `brandToneJudgeEnabled` | `boolean` | Whether brand tone scoring is active | | `brandToneRubric` | `string` | Plain-text description of acceptable tone | ## Next steps Add a human gate after guardrails pass — for final review before send. Get notified when a job terminally fails after all retries. Label blocked jobs with tags to track guardrail hit rates by campaign. Full REST reference for guardrail endpoints. ### Webhooks URL: https://rotor.sh/docs/guides/webhooks Rotor POSTs a signed JSON payload to your endpoint whenever a lifecycle event occurs. You register one or more webhook endpoints and choose which event types each one receives. ## Event types | Event | When it fires | |---|---| | `job.completed` | A job finishes successfully | | `job.failed` | A job exhausts all retry attempts and enters terminal failure | | `job.stalled` | A job's lock expires while it was active (usually a crashed worker) | | `approval.pending` | A job step is waiting for human approval | | `approval.approved` | An approval request was approved | | `approval.rejected` | An approval request was rejected | | `guardrail.blocked` | A guardrail rule blocked a job from executing | | `queue.drained` | A queue's waiting count dropped to zero | | `queue.error` | A queue-level error occurred (e.g. Redis unreachable) | | `webhook.test` | Fired by `rotor.webhooks.test()` to verify your endpoint | ## Create a webhook endpoint ```typescript import { Rotor } from "@rotorsh/sdk"; const rotor = new Rotor({ apiKey: process.env.ROTOR_API_KEY! }); const webhook = await rotor.webhooks.create({ url: "https://yourapp.com/webhooks/rotor", eventTypes: ["job.completed", "job.failed", "approval.pending"], }); console.log(webhook.id); // wh_<32hex> console.log(webhook.secret); // shown ONCE — store it in your secrets manager immediately ``` The webhook secret is returned only at creation time. If you lose it, delete the webhook and create a new one. Store the secret in an environment variable (`ROTOR_WEBHOOK_SECRET`) before the creation response goes out of scope. ## Verify signatures Every request Rotor sends includes three headers: | Header | Contents | |---|---| | `Rotor-Webhook-Id` | Unique delivery ID | | `Rotor-Webhook-Timestamp` | Unix timestamp of the delivery | | `Rotor-Webhook-Signature` | `v1,{base64(HMAC-SHA256(secret, "{id}.{timestamp}.{body}"))}` | Rotor uses a Svix-compatible HMAC-SHA256 scheme. Verify with the SDK helper: ```typescript import { verifyRotorSignature } from "@rotorsh/sdk"; import type { IncomingMessage, ServerResponse } from "http"; export async function POST(req: Request): Promise { const body = await req.text(); const id = req.headers.get("rotor-webhook-id")!; const timestamp = req.headers.get("rotor-webhook-timestamp")!; const signature = req.headers.get("rotor-webhook-signature")!; let event: unknown; try { event = verifyRotorSignature(body, signature, process.env.ROTOR_WEBHOOK_SECRET!, { id, timestamp, maxAgeSeconds: 300, // reject deliveries older than 5 minutes }); } catch (err) { return new Response("Invalid signature", { status: 401 }); } // event is now typed and verified — handle it await handleRotorEvent(event as RotorWebhookEvent); return new Response("OK", { status: 200 }); } ``` Always return `200` quickly — do your processing asynchronously or in a background task. If your endpoint takes longer than 10 seconds to respond, Rotor treats the delivery as a timeout and schedules a retry. ## Handle specific events ### job.completed ```typescript import type { RotorWebhookEvent } from "@rotorsh/sdk"; async function handleRotorEvent(event: RotorWebhookEvent) { switch (event.type) { case "job.completed": { const { jobId, queueName, result, completedAt } = event.data; console.log(`Job ${jobId} in ${queueName} finished:`, result); // update your DB, trigger downstream steps, etc. break; } case "job.failed": { const { jobId, queueName, failReason, attempts } = event.data; console.error(`Job ${jobId} failed after ${attempts} attempts: ${failReason}`); // send a Slack alert, open a support ticket, etc. break; } case "approval.pending": { const { jobId, approvalId, queueName, payload } = event.data; // notify a reviewer — e.g. post to Slack await notifyReviewer({ approvalId, payload }); break; } } } ``` ### Pipe job events to Slack ```typescript case "job.failed": { const { jobId, queueName, failReason } = event.data; await fetch(process.env.SLACK_WEBHOOK_URL!, { method: "POST", headers: { "Content-Type": "application/json" }, body: JSON.stringify({ text: `❌ Job \`${jobId}\` failed in \`${queueName}\`\n>${failReason}`, }), }); break; } ``` ### Route approval.pending to a bot ```typescript case "approval.pending": { const { approvalId, queueName, payload } = event.data; await fetch("https://yourapp.com/internal/approval-bot", { method: "POST", headers: { "Content-Type": "application/json" }, body: JSON.stringify({ approvalId, queueName, payload }), }); break; } ``` ## Test a webhook endpoint Send a `webhook.test` event to confirm your endpoint is reachable and signature verification works: ```typescript await rotor.webhooks.test("wh_abc123def456..."); ``` Your endpoint will receive a payload like: ```json { "type": "webhook.test", "webhookId": "wh_abc123def456...", "data": { "message": "This is a test event from Rotor.", "timestamp": "2026-05-20T10:00:00.000Z" } } ``` Verify your endpoint returns `200` and that signature verification passes before relying on it in production. ## Retry behavior If your endpoint returns a non-2xx status code, times out, or is unreachable, Rotor retries with exponential backoff — up to **5 attempts over approximately 3 hours**. After 5 failed attempts, the delivery is abandoned and marked as permanently failed. | Attempt | Approximate delay | |---|---| | 1 | Immediate | | 2 | ~30 seconds | | 3 | ~5 minutes | | 4 | ~30 minutes | | 5 | ~2 hours | Delivery failures do not affect your job execution. Your jobs continue running — only the webhook notification is retried. ## Manage endpoints ```typescript // List all webhook endpoints const webhooks = await rotor.webhooks.list(); // Get a specific endpoint const webhook = await rotor.webhooks.get("wh_abc123def456..."); console.log(webhook.url, webhook.eventTypes, webhook.createdAt); // Delete an endpoint await rotor.webhooks.delete("wh_abc123def456..."); ``` Deleting a webhook stops all future deliveries immediately. In-flight deliveries that have already started may still complete. ## callback_status on schedule runs When a schedule fires a job and that job has a callback configured, the `callback_status` field on the run record reflects the webhook delivery outcome: | Value | Meaning | |---|---| | `pending` | Delivery has not been attempted yet | | `2xx` | Endpoint responded with a success status | | `4xx` | Endpoint responded with a client error | | `5xx` | Endpoint responded with a server error | | `timeout` | Endpoint did not respond within the timeout window | | `null` | No callback configured for this run | ### Idempotency URL: https://rotor.sh/docs/guides/idempotency BullMQ delivers jobs **at least once**. If a worker crashes mid-execution, loses its Redis lock, or restarts, the job is redelivered and your handler runs again. Without idempotency guards, this causes duplicate emails, double charges, and phantom records. Rotor gives you two independent layers of protection: 1. **Enqueue dedup** — the same `jobId` is never enqueued twice. BullMQ silently discards the second submission. 2. **Handler dedup** — your handler checks a DB record before doing work, preventing double execution even when the same job is delivered more than once. Use both layers. Enqueue dedup prevents the common case; handler dedup is the safety net for redeliveries that slip through. ## Layer 1 — Explicit jobId at enqueue Pass a deterministic `jobId` when enqueueing. BullMQ deduplicates at the queue level: submitting the same `jobId` again is a no-op. ```typescript import { Rotor } from "@rotorsh/sdk"; const rotor = new Rotor({ apiKey: process.env.ROTOR_API_KEY! }); const job = await rotor.jobs.enqueue("outreach", { payload: { contactId: "cid_123", campaignId: "cmp_456" }, jobId: "outreach:cmp_456:cid_123", // deterministic key }); ``` If `outreach:cmp_456:cid_123` is already in the queue, BullMQ returns without creating a new job. No error is thrown; the call is silent. Build your `jobId` from the combination of entities that must be unique: e.g. `{queue}:{campaignId}:{contactId}`. This makes duplicate submissions safe regardless of how they occur — retried HTTP requests, double-clicks, or re-triggered workflows. ## Batch with idempotency keys For batch enqueue, pass an `idempotencyKeys` array alongside your jobs. The array length must match the jobs array length, and keys must be unique within the batch. ```typescript const contacts = [ { id: "cid_001", email: "alice@example.com" }, { id: "cid_002", email: "bob@example.com" }, { id: "cid_003", email: "carol@example.com" }, ]; await rotor.jobs.addBatch( "outreach", contacts.map((c) => ({ payload: { contactId: c.id, email: c.email }, })), { idempotencyKeys: contacts.map((c) => `outreach:cmp_456:${c.id}`), } ); ``` Rules for `idempotencyKeys`: - Length must equal the number of jobs in the batch - No duplicates within a single batch call - Cannot conflict with explicit `jobId` fields on individual jobs in the same call - Re-submitting the same key is a silent no-op — BullMQ deduplicates ## Layer 2 — Handler-level dedup Enqueue dedup prevents most duplicates. Handler dedup catches redeliveries: cases where the job was enqueued once but executed more than once (e.g. worker crash after partial execution). The pattern: check your DB for `rotorJobId` before doing work; write it after. ```typescript import { RotorWorker } from "@rotorsh/sdk"; const worker = new RotorWorker({ workspaceId: process.env.WORKSPACE_ID!, queueName: "outreach", connection: process.env.REDIS_URL!, concurrency: 5, processor: async (job) => { const { contactId } = job.data.payload; // Check if this exact job execution already completed const existing = await db.outreachSent.findUnique({ where: { rotorJobId: job.id }, }); if (existing) { return { skipped: true, reason: "already-executed" }; } // Do the work await sendWelcomeEmail(contactId); // Record completion — use upsert in case of a tight race await db.outreachSent.upsert({ where: { rotorJobId: job.id }, create: { rotorJobId: job.id, contactId, sentAt: new Date() }, update: {}, // already exists — ignore }); return { sent: true }; }, }); await worker.ready(); ``` ### Schema for the dedup table ```sql CREATE TABLE outreach_sent ( id SERIAL PRIMARY KEY, rotor_job_id TEXT NOT NULL, contact_id TEXT NOT NULL, sent_at TIMESTAMPTZ NOT NULL DEFAULT now(), UNIQUE (rotor_job_id) -- enforces dedup at DB level ); ``` Or with a compound unique constraint to scope dedup per workspace: ```sql UNIQUE (workspace_id, rotor_job_id) ``` ## The Idempotency-Key header When making HTTP API calls to Rotor directly (not via the SDK), include an `Idempotency-Key` header. The SDK parses this with `parseIdempotencyKey`: ```typescript import { parseIdempotencyKey } from "@rotorsh/sdk"; // In an API route handler const idempotencyKey = parseIdempotencyKey(req.headers); if (idempotencyKey) { await rotor.jobs.enqueue("outreach", { payload: { contactId }, jobId: idempotencyKey, // use the caller's key as the jobId }); } ``` This lets upstream callers (e.g. webhook consumers, frontend clients) pass their own idempotency keys through to Rotor. ## RotorIdempotencyConflict In strict mode, submitting a duplicate idempotency key throws `RotorIdempotencyConflict` (HTTP 409) instead of silently deduplicating. Handle it explicitly when you need to distinguish a genuine duplicate from a new submission: ```typescript import { Rotor, RotorIdempotencyConflict } from "@rotorsh/sdk"; const rotor = new Rotor({ apiKey: process.env.ROTOR_API_KEY! }); try { await rotor.jobs.enqueue("outreach", { payload: { contactId: "cid_123" }, jobId: "outreach:cmp_456:cid_123", }); } catch (err) { if (err instanceof RotorIdempotencyConflict) { // A job with this key already exists // Safe to treat as success — the job is already queued console.log("Duplicate submission — job already enqueued"); return; } throw err; } ``` ## The 24-hour replay window Rotor does not maintain an application-state dedup table for you. Handler dedup — the DB record you write after executing — is your responsibility. A robust pattern is a 24-hour replay table: any `rotorJobId` processed in the last 24 hours is recorded and checked at handler entry. Clean up records older than 24 hours with a scheduled job or DB TTL. ```typescript // Check and record in one atomic upsert const { created } = await db.jobReplay.upsert({ where: { rotorJobId: job.id }, create: { rotorJobId: job.id, processedAt: new Date() }, update: {}, // already exists — do nothing select: { created: true }, // Prisma: true if INSERT, false if no-op }); if (!created) { return { skipped: true, reason: "replay-deduplicated" }; } // Proceed with actual work ``` The replay table prevents double-execution when a job is redelivered within the retention window. Records older than your window can be pruned — Rotor guarantees at-least-once, not unbounded redelivery. ## Practical example — welcome emails without double-sending A new user signs up. Your app enqueues a welcome email job. If your worker crashes after calling the email API but before the job is marked complete, BullMQ redelivers and your handler runs again. Without idempotency, the user gets two welcome emails. With two-layer defense: ```typescript import { Rotor, RotorWorker } from "@rotorsh/sdk"; const rotor = new Rotor({ apiKey: process.env.ROTOR_API_KEY! }); // At signup — deterministic jobId prevents double-enqueue await rotor.jobs.enqueue("welcome-email", { payload: { userId: user.id, email: user.email }, jobId: `welcome:${user.id}`, // one welcome email per userId, ever }); // Worker — handler dedup prevents double-execution on redelivery const worker = new RotorWorker({ workspaceId: process.env.WORKSPACE_ID!, queueName: "welcome-email", connection: process.env.REDIS_URL!, concurrency: 10, processor: async (job) => { const { userId, email } = job.data.payload; const alreadySent = await db.welcomeEmailLog.findUnique({ where: { rotorJobId: job.id }, }); if (alreadySent) { return { skipped: true }; } await emailClient.send({ to: email, template: "welcome", data: { userId }, }); await db.welcomeEmailLog.create({ data: { rotorJobId: job.id, userId, sentAt: new Date() }, }); return { sent: true, userId }; }, }); await worker.ready(); ``` Layer 1 (`jobId: "welcome:{userId}"`) ensures a second signup event or retry of the enqueue call never creates a second job. Layer 2 (`welcomeEmailLog` check) ensures a redelivered job never sends a second email. ### Job Tags URL: https://rotor.sh/docs/guides/job-tags Tags are short string labels you attach to a job at enqueue time. They pass through to the dashboard, the REST API, and the archived job history — letting you filter and group runs without embedding metadata in your job payload. ## Add tags when enqueueing ```typescript import { Rotor } from "@rotorsh/sdk"; const rotor = new Rotor({ apiKey: process.env.ROTOR_API_KEY! }); const job = await rotor.jobs.enqueue("outreach", { payload: { contactId: "cid_123" }, tags: ["campaign:q2-outbound", "contact:cid_123", "env:production"], }); ``` Tags are returned in the response: ```json { "id": "job_abc", "state": "waiting", "tags": ["campaign:q2-outbound", "contact:cid_123", "env:production"] } ``` **Limits:** up to 10 tags per job, each tag max 64 characters. ## Batch enqueue with tags Each job in a batch can carry its own tags: ```typescript await rotor.jobs.enqueueBatch("outreach", [ { payload: { contactId: "cid_001" }, tags: ["campaign:q2-outbound", "contact:cid_001"] }, { payload: { contactId: "cid_002" }, tags: ["campaign:q2-outbound", "contact:cid_002"] }, ]); ``` ## Retrieve tags on a job ```typescript const job = await rotor.jobs.get("outreach", "job_abc"); console.log(job.tags); // ["campaign:q2-outbound", "contact:cid_123", "env:production"] ``` ## Filter the job list by tag ```typescript const jobs = await rotor.jobs.list("outreach", { tag: "campaign:q2-outbound" }); ``` Or via REST: ```bash curl "https://api.rotor.sh/v1/queues/outreach/jobs?tag=campaign%3Aq2-outbound" \ -H "Authorization: Bearer $ROTOR_API_KEY" ``` ## Tags in the dashboard The **Runs** page shows tags as grey pill badges on each row. You can scan which jobs belong to a campaign at a glance — no payload inspection required. ## Tags are archived When a job completes or fails, its tags are written to the `job_history` table alongside the result. Historical tag data is retained for your plan's full retention window (7d free, 30d pro, 90d team). ## Naming conventions There is no enforced format. Common patterns: | Pattern | Example | Use case | |---------|---------|----------| | `key:value` | `campaign:q2-outbound` | Group by named campaign | | `entity:id` | `contact:cid_123` | Link a job to a specific record | | `env:name` | `env:production` | Distinguish prod vs staging runs | | `source:name` | `source:hubspot` | Track the system that triggered the job | Tags are for filtering and grouping — not for routing or conditional logic. If you need jobs to behave differently based on a value, put that value in the job payload, not in tags. ### Concurrency Keys URL: https://rotor.sh/docs/guides/concurrency-keys A concurrency key is a string you attach to a job that guarantees at most one job with that key is processing at a time. If a second job with the same key is dispatched while the first is still running, it waits — it gets delayed 10 seconds and retried automatically. This is the n8n equivalent of "don't run this workflow if it's already running for this contact." ## Typical use case You're running contact enrichment. You fan out 500 jobs — one per contact. Without concurrency keys, if the same contact appears in two lists, two enrichment jobs can run simultaneously, causing duplicate API calls and conflicting writes. ```typescript import { Rotor } from "@rotorsh/sdk"; const rotor = new Rotor({ apiKey: process.env.ROTOR_API_KEY! }); await rotor.jobs.enqueue("enrichment", { payload: { contactId: "cid_123" }, concurrencyKey: "contact:cid_123", }); ``` Any subsequent job with `concurrencyKey: "contact:cid_123"` will wait until the first completes or fails. ## How it works 1. When a job is dispatched to your callback URL, Rotor attempts to acquire a Redis lock: `SET NX EX 300 {workspace}:ckey:{concurrencyKey} {jobId}`. 2. If the lock is **free**, the job proceeds normally. The lock is released when the job completes or fails. 3. If the lock is **held** (another job with the same key is in flight), the current job is delayed by 10 seconds and Rotor retries it automatically. This repeats until the lock is free. 4. A 300-second safety-net TTL ensures the lock is always released, even if the worker crashes mid-job. Concurrency keys are scoped to your workspace — there is no cross-workspace collision. ## Batch enqueue ```typescript const contacts = ["cid_001", "cid_002", "cid_003"]; await rotor.jobs.enqueueBatch("enrichment", contacts.map((id) => ({ payload: { contactId: id }, concurrencyKey: `contact:${id}`, })) ); ``` Each contact gets its own lock. Two contacts can enrich in parallel; the same contact cannot. ## REST API ```bash curl -X POST "https://api.rotor.sh/v1/queues/enrichment/jobs" \ -H "Authorization: Bearer $ROTOR_API_KEY" \ -H "Content-Type: application/json" \ -d '{ "payload": { "contactId": "cid_123" }, "concurrencyKey": "contact:cid_123" }' ``` ## Key naming Any string up to 256 characters. Common patterns: | Pattern | Example | |---------|---------| | Entity ID | `contact:cid_123` | | Account + action | `account:acc_456:sync` | | User + resource | `user:usr_789:report` | Concurrency keys only apply to **callback-mode queues** (queues with a `callback_url`). They have no effect on durable workflow jobs. ## Combining with tags You can use both on the same job: ```typescript await rotor.jobs.enqueue("enrichment", { payload: { contactId: "cid_123" }, tags: ["campaign:q2-outbound", "contact:cid_123"], concurrencyKey: "contact:cid_123", }); ``` Tags are for filtering and visibility; the concurrency key is for mutual exclusion. They serve different purposes. ### Failure Alerts URL: https://rotor.sh/docs/guides/failure-alerts Rotor can notify you when a job fails after all retry attempts are exhausted. Alerts fire per workspace, not per queue — one configuration covers all your queues. ## Configure alerts Go to **Dashboard → Settings → Notifications**. You can configure either or both: - **Email** — enter any email address. Alerts are sent from `alerts@rotor.sh`. - **Slack incoming webhook** — paste a webhook URL from your Slack app. Messages arrive as a formatted Block Kit card. Save. That's it. ## What triggers an alert Alerts fire on **terminal failure** only — when a job exhausts all retry attempts and moves to `failed` state. They do not fire on intermediate retry attempts. If a queue has `attempts: 3` and a job fails on attempt 1 and 2 (and retries), no alert is sent. The alert fires when attempt 3 fails. ## Slack message format ``` ❌ Job failed in `enrichment` Reason: Connection timeout after 30s Job ID: job_abc123 Queue: enrichment [View run →] ``` The "View run" link opens the job directly in your Rotor dashboard. ## Create a Slack incoming webhook 1. Go to **api.slack.com/apps** and create or open an app. 2. Enable **Incoming Webhooks** under Features. 3. Click **Add New Webhook to Workspace** and choose a channel. 4. Copy the `https://hooks.slack.com/services/...` URL. 5. Paste it into **Settings → Notifications → Slack Incoming Webhook**. The Slack webhook URL must start with `https://hooks.slack.com/`. Rotor validates this on save and rejects anything else. ## Email alerts Email alerts are sent from `alerts@rotor.sh` with the subject: ``` Job failed: {jobName} in queue {queueName} ``` The body includes the queue name, job ID, failure reason, and a direct link to the run in your dashboard. Add `alerts@rotor.sh` to your contacts to prevent spam filtering. ## Testing alerts The easiest way to test: enqueue a job to a queue whose callback URL returns a 5xx response, with `attempts: 1`. ```typescript await rotor.jobs.enqueue("test-queue", { payload: { test: true }, }); ``` With `attempts: 1`, the first failure is terminal and triggers the alert immediately. Failure alerts are workspace-level. If you want different channels for different queues, that's a future feature — today it's one config per workspace. ### Callback mode vs Durable Workflows URL: https://rotor.sh/docs/guides/callback-vs-workflows Rotor has two ways to run your code. Knowing which to use will save you time. ## Quick answer | If you want to… | Use | |-----------------|-----| | Call an existing HTTP endpoint you already have | **Callback mode** | | Build a new multi-step workflow with retries, sleeps, and waits | **Durable Workflows** | | Replace n8n / Make / Zapier flows | **Callback mode** | | Replace Inngest / Temporal / Trigger.dev workflows | **Durable Workflows** | | Keep your code on your own server | **Callback mode** | | Write workflow logic once without managing retry state | **Durable Workflows** | --- ## Callback mode You own an HTTP endpoint. Rotor calls it with the job payload. Your endpoint runs your logic and returns 2xx. That's it. ``` [Job enqueued] → Rotor → POST your-app.example.com/handler → [Your code runs] ``` Rotor handles: delivery, retries on non-2xx, dead-letter queue, HMAC signature verification, concurrency limits, cron scheduling. You handle: running a server, verifying the signature, writing idempotent handlers. **When to use it:** - You already have an HTTP server (Next.js API route, Hono, Express, anything) - You're replacing scheduled Lambda functions, Vercel Cron handlers, or n8n/Make nodes - You want Rotor to act as the queue and scheduler while your app stays stateless - The job is a single unit of work (enrich this contact, send this email, sync this record) ```typescript // Your handler — receives a signed POST for each job app.post("/rotor/enrichment", async (c) => { const body = await c.req.text(); if (!verifyRotorSignature(body, c.req.header("x-rotor-signature"), secret)) { return c.text("unauthorized", 401); } const { contactId } = JSON.parse(body); await enrichContact(contactId); return c.json({ ok: true }); }); ``` See the [Quickstart](/docs/quickstart) for a full walkthrough. --- ## Durable Workflows You write a TypeScript function with `step.*` calls. Rotor checkpoints each step to Postgres. If the function retries, already-completed steps return their cached results — no double-execution. ``` [Event published] → Rotor → [Step 1 runs] → [Checkpoint] → [Step 2 runs] → [Checkpoint] → ... ``` Rotor handles: checkpointing, replay on retry, sleeping without occupying a worker, waiting for external events. You handle: writing the workflow function, deploying a small server that `serveWorkflow()` runs on. **When to use it:** - The job has multiple steps with independent retry budgets - You need `step.sleep("24h")` — pause for a day without a running process - You need `step.waitForEvent` — pause until something happens externally (email opened, payment received) - You need `step.waitForSignal` — pause until a human approves - You need `step.invoke` — fan out to sub-workflows and collect results ```typescript export const outreachSequence = createFunction( { id: "outreach-sequence", trigger: { event: "contact.added" } }, async ({ event, step }) => { const profile = await step.run("fetch-profile", () => fetchContact(event.data.contactId) ); await step.sleep("warm-up", "24h"); // no worker occupied during this wait await step.run("send-email", () => sendEmail(profile.email, "intro") ); const opened = await step.waitForEvent<{ at: string }>("wait-open", { event: "email.opened", match: `data.contactId == "${event.data.contactId}"`, timeout: "7d", }); if (opened) { await step.run("send-followup", () => sendEmail(profile.email, "followup")); } } ); ``` See the [Workflows Quickstart](/docs/workflows/quickstart) for setup. --- ## Can you use both? Yes. They're independent. A common pattern: - Use **callback mode** for high-volume single-step jobs (contact enrichment, HubSpot sync) - Use **durable workflows** for multi-step sequences that need to sleep, wait, or branch A durable workflow step (`step.run`) can enqueue a callback-mode job via the SDK if needed — for example, delegating the heavy lifting of an enrichment step to a callback queue that runs with higher concurrency. --- ## Decision checklist Pick callback mode if **all** of these are true: - [ ] The job is a single unit of work (one action, one response) - [ ] You already have or want a simple HTTP server - [ ] You don't need to pause mid-job for more than a retry delay Pick durable workflows if **any** of these are true: - [ ] The job has 2+ sequential steps that should retry independently - [ ] You need to wait hours or days between steps - [ ] You need to pause until an external event (user action, approval, webhook) arrives - [ ] You're building something that would require a state machine if you wrote it yourself --- ## MCP ### Install the Rotor MCP server URL: https://rotor.sh/docs/mcp/install # Install the Rotor MCP server Add **27 workspace-scoped tools** to Claude Code, Cursor, or Claude Desktop. Once installed, you can prompt Claude to list queues, create schedules, replay DLQ jobs, and more — all scoped to your Rotor workspace API key. Example prompts after install: - _"List my Rotor queues"_ - _"Create a schedule named nightly-export that runs at 02:00 UTC"_ - _"Replay DLQ job abc-123 from the outbound queue"_ **Tools are workspace-scoped.** Your API key (`rt_ws_*`) grants access only to the workspace it was created in. These tools do NOT have access to other workspaces or operator-level admin resources. ## Prerequisites - **Node 18+** on PATH (required for `npx`) - **A workspace API key** — prefix `rt_ws_*`. Generate one: ```bash rotor api-keys create --label "claude-code" ``` (CLI install: `npm i -g @rotorsh/cli`) - **One of:** Claude Code 1.x, Claude Desktop, or Cursor --- ## Install ### Claude Code Paste the following into `~/.claude/.mcp.json` (create the file if it does not exist). Replace `rt_ws_` with your actual workspace API key. Save and restart Claude Code. ```json { "mcpServers": { "rotor": { "command": "npx", "args": ["-y", "@rotorsh/mcp"], "env": { "ROTOR_API_KEY": "rt_ws_" } } } } ``` ### Claude Desktop Open your Claude Desktop config file: - **macOS:** `~/Library/Application Support/Claude/claude_desktop_config.json` - **Windows:** `%APPDATA%\Claude\claude_desktop_config.json` - **Linux:** `~/.config/Claude/claude_desktop_config.json` Add or merge the following into the `mcpServers` key, then fully quit and reopen Claude Desktop. ```json { "mcpServers": { "rotor": { "command": "npx", "args": ["-y", "@rotorsh/mcp"], "env": { "ROTOR_API_KEY": "rt_ws_" } } } } ``` ### Cursor Open `~/.cursor/mcp.json` (create if it does not exist), add the following, then restart Cursor. ```json { "servers": { "rotor": { "command": "npx", "args": ["-y", "@rotorsh/mcp"], "env": { "ROTOR_API_KEY": "rt_ws_" } } } } ``` --- ## Verify the connection Fully quit and reopen Claude Code, Claude Desktop, or Cursor after saving the config file. Type: **"List my Rotor queues"** Claude should invoke `rotor_list_queues` and return your workspace queues. If you have no queues yet, the response is `{ queues: [] }` — that is expected and confirms the connection is working. --- ## Your first prompt: create a queue from Claude Once your client picks up the MCP config, type a prompt like this in any new conversation: > Create a Rotor queue called `welcome-emails` with concurrency 5 and 3 retry attempts. Claude should invoke the `rotor_create_queue` tool with arguments matching: ```json { "name": "welcome-emails", "concurrency": 5, "retry_attempts": 3 } ``` The tool returns the created queue object — confirm in your terminal with: ```bash rotor queues list # welcome-emails should appear with concurrency 5 ``` This proves three things at once: - The MCP server is connected (tool resolution). - Your `ROTOR_API_KEY` is valid (server-side auth succeeded). - The CLI and MCP share the same backend (queue created via MCP appears in CLI list). If Claude picks a different tool (`rotor_list_queues` instead of `rotor_create_queue`), reword the prompt with stronger imperatives: "Create a new Rotor queue named welcome-emails..." Tool selection improves with explicit verbs. --- ## Available tools 27 workspace-scoped tools organized by category: **Queues** — `rotor_list_queues`, `rotor_get_queue`, `rotor_create_queue`, `rotor_update_queue`, `rotor_set_queue_state` (pause/resume/active), `rotor_drain_queue`, `rotor_delete_queue` **Schedules** — `rotor_list_schedules`, `rotor_get_schedule`, `rotor_create_schedule`, `rotor_update_schedule`, `rotor_delete_schedule`, `rotor_force_fire_schedule`, `rotor_inspect_run_history` **Jobs** — `rotor_list_jobs`, `rotor_get_job`, `rotor_add_job`, `rotor_retry_job`, `rotor_delete_job` **DLQ** — `rotor_get_dlq` (list failed jobs), `rotor_replay_dlq` (idempotent replay) **Status** — `rotor_get_status` (workspace health) **API Keys** — `rotor_list_api_keys`, `rotor_create_api_key`, `rotor_revoke_api_key` **Workspaces** — `rotor_list_workspaces`, `rotor_get_workspace` For full input schema on any tool, call `tools/list` from your MCP client. --- ## Troubleshooting | Symptom | Resolution | |---------|------------| | "Tool not found" after restart | Verify Claude Code can spawn `npx` — run `npx --version` in a terminal. Check the config file syntax is valid JSON. | | "401 Unauthorized" | Rotate your API key: `rotor api-keys create --label "claude-code-new"`. Ensure the key starts with `rt_ws_` (workspace-scoped), not `rt_q_` (queue-scoped). | | Cold-start latency (>10s first call) | Install globally to skip `npx` download: `npm i -g @rotorsh/mcp`, then replace `"command": "npx"` with the absolute path to the `rotor-mcp` binary (find it via `which rotor-mcp`). | | stdout corruption / JSON parse errors | This is a known guard (Pitfall 6 — all logs go to stderr). If you see parse errors, please file an issue — it indicates an unexpected `console.log` in the server. | --- ## Next steps - [Quickstart guide](/docs/quickstart) — build your first queue-backed workflow - [Connection guide](/mcp/connection-guide) — the hosted Streamable-HTTP MCP for workflow tools (`trigger_workflow`, `approve_run`, etc.) ### MCP Connection Guide URL: https://rotor.sh/docs/mcp/connection-guide # MCP Connection Guide **MCP v2 ships workflow-led tools.** The previous job-shaped tools were removed. If your agent or client referenced them, see [the v2 migration guide](/mcp/v2-migration). The rotor.sh MCP server exposes **12 tools** that let AI agents trigger workflows, inspect runs, manage queues, and approve pending operations. Workflows are first-class. Runs are how you observe them. ## Prerequisites 1. A rotor.sh account — [sign up free](https://rotor.sh) 2. An API key — generate one at [rotor.sh/dashboard/api-keys](https://rotor.sh/dashboard/api-keys) Your key looks like `rt_ws__`. --- ## Claude Code (recommended) Claude Code supports MCP servers natively via the `claude mcp add` command. ```bash claude mcp add rotor https://rotor.sh/mcp \ --header "Authorization: Bearer $ROTOR_API_KEY" ``` Replace `$ROTOR_API_KEY` with your actual key, or set the environment variable first: ```bash export ROTOR_API_KEY=rt_ws_your_key_here claude mcp add rotor https://rotor.sh/mcp \ --header "Authorization: Bearer $ROTOR_API_KEY" ``` Verify the connection: ```bash claude mcp list # Output should include: # rotor: https://rotor.sh/mcp (12 tools) ``` ### Available tools | Tool | Description | |------|-------------| | `list_workflows` | List the workspace's registered workflow functions | | `trigger_workflow` | Fire-and-forget trigger of a workflow. Returns `run_id`. | | `invoke_workflow` | Synchronous wait. Polls until the run completes or times out. | | `get_run` | Get full run detail including step graph | | `list_runs` | List runs with filters by workflow, status, and time | | `cancel_run` | Cancel a running workflow. Idempotent on terminal runs. | | `upsert_queue` | Create or update a BullMQ-backed queue | | `set_queue_state` | Pause, resume, or drain a queue | | `drain_queue` | Destructively drain a queue (requires `confirm: true`) | | `queue_stats` | Get job counts and throughput for a queue | | `upsert_schedule` | Create or update a cron schedule | | `approve_run` | Approve or reject a pending workflow approval | --- ## Claude Desktop Claude Desktop reads MCP server configuration from a JSON file. - **macOS:** `~/Library/Application Support/Claude/claude_desktop_config.json` - **Windows:** `%APPDATA%\Claude\claude_desktop_config.json` - **Linux:** `~/.config/Claude/claude_desktop_config.json` Paste the following into `mcpServers` (create the key if the file does not have it): ```json { "mcpServers": { "rotor": { "url": "https://rotor.sh/mcp", "headers": { "Authorization": "Bearer rt_ws_your_key_here" } } } } ``` Replace `rt_ws_your_key_here` with your actual API key. Fully quit and reopen Claude Desktop. You should see the rotor tools available in the tools panel. --- ## Cursor Cursor supports MCP via its settings panel or a config file. ### Via Cursor Settings UI 1. Open **Cursor Settings** (`Cmd/Ctrl + ,`) 2. Navigate to **Features → Model Context Protocol** 3. Click **Add MCP Server** 4. Enter: - **Name:** `rotor` - **URL:** `https://rotor.sh/mcp` - **Auth Header:** `Authorization: Bearer rt_ws_your_key_here` ### Via Config File Add to your Cursor MCP config (usually `~/.cursor/mcp.json`): ```json { "servers": { "rotor": { "url": "https://rotor.sh/mcp", "headers": { "Authorization": "Bearer rt_ws_your_key_here" } } } } ``` Restart Cursor after saving. --- ## Verify the connection Regardless of which client you use, verify the MCP server is responding by asking the AI assistant: > "List my rotor workflows" or > "Trigger the lead-enrichment workflow with email lead@acme.com" or > "What was the result of run abc-123?" The assistant should call the appropriate tool and return data from your workspace. --- ## Troubleshooting **Tools not appearing after connecting:** - Ensure your API key starts with `rt_ws_` (workspace-scoped key required) - Check the key is not expired — generate a fresh one from [dashboard/api-keys](https://rotor.sh/dashboard/api-keys) - Restart the AI client after changing the config **401 Unauthorized errors:** - Verify the `Authorization: Bearer ...` header is correctly formatted - Ensure no extra whitespace around the key value **Connection timeout:** - The MCP endpoint is `https://rotor.sh/mcp` (not the API base URL `https://api.rotor.sh`) - Check your network is not blocking SSE connections (some corporate proxies do) --- ## What changed in v2 The MCP surface was redesigned to put workflows at the center. If you used the previous job-shaped tools, the [v2 migration guide](/mcp/v2-migration) maps every removed tool to its replacement. **Need help?** Open an issue on [GitHub](https://github.com/shyftai/rotor) or reach out in the [Anthropic Discord](https://discord.gg/anthropic) `#rotor` channel. --- ## Claude Code Plugin ### Claude Code Plugin Install URL: https://rotor.sh/docs/skill/install # Claude Code Plugin Install The rotor.sh Claude Code plugin adds the `/rotor:new-workflow` slash command and agent rules to Claude Code, letting you scaffold GTM workflows without leaving your editor. ## Quick Install (Marketplace) ```bash /plugin marketplace add rotor-sh/claude-plugin /plugin install rotor ``` That's it. Restart Claude Code and run `/rotor:new-workflow outbound-with-approvals` to scaffold your first workflow. --- ## Manual Install (Fallback) If the marketplace is unavailable or you prefer local control: ```bash git clone https://github.com/rotor-sh/claude-plugin ~/.claude/plugins/rotor ``` Then in Claude Code: ```bash /plugin install ~/.claude/plugins/rotor ``` --- ## What Gets Installed After installing, Claude Code gains: | Feature | Description | |---------|-------------| | `/rotor:new-workflow` | Skill that scaffolds starter templates via `rotor template apply` | | Agent rules | Copies `@rotorsh/sdk/CLAUDE.md` to your project root when you scaffold | | MCP suggestion | Guides you through `claude mcp add rotor ...` if not already wired | --- ## Using `/rotor:new-workflow` ``` /rotor:new-workflow ``` Available template names: | Template | Use case | |----------|----------| | `outbound-with-approvals` | Outbound email/message sequences requiring human approval | | `enrichment-dag` | Parallel data enrichment from multiple providers | | `attribution-rollup` | Daily cron attribution rollup | | `reply-classifier` | LLM-based reply intent classification | ### Example ``` /rotor:new-workflow outbound-with-approvals ``` Claude Code will: 1. Check `rotor` CLI is installed (install if missing) 2. Add `@rotorsh/sdk` to your project if absent 3. Run `rotor template apply outbound-with-approvals` 4. Show customization options from the generated README 5. Update `CLAUDE.md` with rotor agent rules 6. Remind you to set `ROTOR_API_KEY` 7. Suggest wiring the MCP server --- ## Verify Installation After installing, check the skill is available: ``` /rotor:new-workflow --help ``` Or ask Claude Code directly: > "What rotor templates are available?" Claude should list the four starter templates. --- ## Update the Plugin To update to the latest version: ### Marketplace install ```bash /plugin update rotor ``` ### Manual install ```bash cd ~/.claude/plugins/rotor && git pull /plugin install ~/.claude/plugins/rotor ``` --- ## Troubleshooting **`/rotor:new-workflow` not recognized after install:** 1. Restart Claude Code completely (quit and reopen) 2. Verify the skill file exists: `ls ~/.claude/plugins/rotor/skills/new-workflow/SKILL.md` 3. Re-run `/plugin install rotor` **`rotor template apply` command not found:** The `rotor` CLI is separate from the Claude Code plugin. Install it globally: ```bash npm install -g rotor-cli rotor --version ``` **Plugin install fails with "repository not found":** Use the manual install method above. If `rotor-sh/claude-plugin` is not yet on GitHub, clone from the main repo: ```bash git clone https://github.com/shyftai/rotor ~/.claude/plugins/rotor-src cp -r ~/.claude/plugins/rotor-src/packages/rotor-plugin ~/.claude/plugins/rotor /plugin install ~/.claude/plugins/rotor ``` **Need help?** Open an issue on [GitHub](https://github.com/shyftai/rotor) or ask in the [Anthropic Discord](https://discord.gg/anthropic) `#rotor` channel. --- ## API Reference ### API Reference URL: https://rotor.sh/docs/api-reference/introduction ## Base URL ``` https://api.rotor.sh ``` All API endpoints are versioned under `/v1/`. ## Authentication All `/v1/*` routes require a workspace API key passed as a Bearer token: ```http Authorization: Bearer rt_ws_your_key_here ``` Workspace keys are prefixed `rt_ws_` and scoped to a single workspace. Generate them from your workspace settings at [rotor.sh](https://rotor.sh). ## Rate Limits Requests are rate-limited per workspace based on your plan: | Plan | Requests/minute | | ---------- | --------------- | | Free | 60 | | Pro | 300 | | Team | 1,000 | | Enterprise | Custom | ## Error Format All errors return a consistent JSON envelope: ```json { "error": { "code": "QUOTA_EXCEEDED", "message": "Monthly job quota exceeded for this workspace", "details": { "used": 10000, "limit": 10000, "plan": "free" } } } ``` ## Tag Groups The OpenAPI spec below groups endpoints by resource: - **Queues** — create, list, get, delete queues; per-queue metrics - **Jobs** — enqueue single + batch; get status; retry; DLQ management - **Schedules** — create, list, pause, resume, delete recurring jobs - **Status** — workspace health check + quota usage - **Usage** — billing-period execution breakdown - **Metrics** — Prometheus-compatible counters (internal; gated at infra level) ## OpenAPI Spec Download the full machine-readable spec: ```bash curl https://api.rotor.sh/doc -o openapi.json ``` The spec covers all `/v1` endpoints with full request/response schemas. You can import it into Insomnia, Postman, or any OpenAPI-compatible client. ## Endpoints at a glance | Method | Path | Description | |--------|------|-------------| | GET | `/v1/queues` | List all queues | | POST | `/v1/queues` | Create a queue | | GET | `/v1/queues/:name` | Get queue config + metrics | | PATCH | `/v1/queues/:name` | Update queue config | | DELETE | `/v1/queues/:name` | Delete a queue | | POST | `/v1/queues/:name/jobs` | Enqueue a job | | POST | `/v1/queues/:name/jobs/batch` | Enqueue jobs in bulk | | GET | `/v1/queues/:name/jobs` | List jobs (with `?state=`, `?tag=`) | | GET | `/v1/queues/:name/jobs/:id` | Get job status | | DELETE | `/v1/queues/:name/jobs/:id` | Cancel / delete a job | | POST | `/v1/queues/:name/jobs/:id/retry` | Retry a failed job | | GET | `/v1/schedules` | List schedules | | POST | `/v1/schedules` | Create a schedule | | PATCH | `/v1/schedules/:name/pause` | Pause a schedule | | PATCH | `/v1/schedules/:name/resume` | Resume a schedule | | DELETE | `/v1/schedules/:name` | Delete a schedule | | GET | `/v1/status` | Workspace health + quota | | GET | `/v1/usage` | Billing-period execution breakdown | --- ## Workflows ### Durable Workflows Quickstart URL: https://rotor.sh/docs/workflows/quickstart # Durable Workflows Quickstart Rotor durable workflows are multi-step TypeScript functions that execute reliably across retries, sleeps, and external event waits — without occupying a long-lived worker process. ## How it works Each workflow step is durably checkpointed to Postgres. If a retry occurs, the function body re-executes from the top, but already-completed steps return their cached results immediately — no double-execution of side effects. ## Installation ```bash npm install @rotorsh/sdk ``` ## Define a function ```typescript import { createFunction } from '@rotorsh/sdk'; export const welcomeSequence = createFunction( { id: 'welcome-sequence', trigger: { event: 'user.created' }, retries: 3, }, async ({ event, step }) => { // step.run() executes exactly once per run — safe to put side effects here const profile = await step.run('fetch-profile', async () => { const res = await fetch(`/api/users/${event.data.userId}`); return res.json(); }); // Pause for 24 hours without occupying a worker slot await step.sleep('initial-wait', '24h'); // Send a personalized welcome email await step.run('send-welcome', async () => { await sendEmail({ to: profile.email, subject: `Welcome, ${profile.name}!`, body: `Thanks for signing up...`, }); }); return { sent: true }; }, ); ``` ## Register with serve() ```typescript import { Hono } from 'hono'; import { serveWorkflow } from '@rotorsh/sdk'; import { welcomeSequence } from './workflows/welcome-sequence'; const app = new Hono(); serveWorkflow(app, { functions: [welcomeSequence], signingKey: process.env.ROTOR_SIGNING_KEY!, // set in Rotor dashboard }); export default app; ``` ## Register the function with Rotor ```bash curl -X POST https://api.rotor.sh/v1/functions \ -H "Authorization: Bearer $ROTOR_API_KEY" \ -H "Content-Type: application/json" \ -d '{ "id": "welcome-sequence", "serve_url": "https://your-app.example.com/api/rotor", "trigger_type": "event", "trigger_event": "user.created" }' ``` ## Trigger the function ```typescript import { Rotor } from '@rotorsh/sdk'; const rotor = new Rotor({ apiKey: process.env.ROTOR_API_KEY! }); // Publish an event — Rotor fans out to all matching functions await rotor.send('user.created', { userId: '123', email: 'alice@example.com' }); ``` ## Step primitives | Primitive | Description | |-----------|-------------| | `step.run(id, fn)` | Executes `fn` exactly once per run; cached on retry | | `step.sleep(id, duration)` | Suspends for a duration — `'30m'`, `'2d'`, `'1h'` | | `step.sleepUntil(id, timestamp)` | Suspends until a specific timestamp | | `step.waitForEvent(id, opts)` | Suspends until a matching event arrives via `rotor.send()` | ## Phase 10 primitives Three new step primitives extend the base set with sub-workflow composition, event fan-out, and human-in-the-loop capabilities: **[step.invoke](/docs/workflows/step-invoke)** — Spawn a child workflow and wait for its result. `timeout` is required (TypeScript build fails without it). Returns `{ result, runId }`. Throws `WorkflowFailedError`, `WorkflowTimeoutError`, or `WorkflowCancelledError` on child failure. Memoized on parent retry — child is not re-spawned. **[step.sendEvent](/docs/workflows/step-send-event)** — Fire-and-forget event emission from inside a step. Parent step completes immediately. Optional `ts` parameter defers emission via BullMQ delay queue. Memoized on retry — no double-emit. **[step.waitForSignal](/docs/workflows/step-wait-for-signal)** — Named pause/resume. Suspends the run until `POST /v1/signals/:signal_id/complete` is called. The signal ID is workspace-unique — use `approval-${runId}` to avoid collisions. Throws `SignalTimeoutError` if `timeout` elapses before the POST arrives. ### Full-primitive workflow example This example uses all six step primitives in one workflow — mirrors the backward-compatibility test suite Case E: ```typescript import { createFunction } from '@rotorsh/sdk'; import { WorkflowFailedError, WorkflowTimeoutError, SignalTimeoutError, } from '@rotor/sdk/errors'; export const fullPrimitivesWorkflow = createFunction( { id: 'full-primitives-workflow', trigger: { event: 'campaign.requested' }, retries: 2, }, async ({ event, step, runId }) => { // 1. step.run — fetch contact profile (exactly once, cached on retry) const profile = await step.run('fetch-profile', async () => { const res = await fetch(`/api/contacts/${event.data.contactId}`); return res.json() as Promise<{ email: string; name: string }>; }); // 2. step.sleep — wait 1 hour before enriching (no worker slot occupied) await step.sleep('initial-delay', '1h'); // 3. step.invoke — delegate to enrichment specialist workflow let enriched: { score: number; company: string }; try { const { result } = await step.invoke('enrich-contact', { workflow: { id: 'contact-enricher' }, data: { contactId: event.data.contactId }, timeout: '5m', // REQUIRED — TypeScript build fails without this }); enriched = result; } catch (err) { if (err instanceof WorkflowFailedError || err instanceof WorkflowTimeoutError) { enriched = { score: 0, company: 'Unknown' }; } else { throw err; } } // 4. step.sendEvent — fan-out to analytics pipeline (fire and forget) await step.sendEvent('notify-analytics', { name: 'analytics/contact.scored', data: { contactId: event.data.contactId, score: enriched.score }, }); // 5. step.waitForSignal — require human approval before sending let approved = false; try { const approval = await step.waitForSignal<{ approved: boolean }>( 'approval-gate', { signal: `campaign-approval-${runId}`, timeout: '48h', } ); approved = approval.approved; } catch (err) { if (err instanceof SignalTimeoutError) { // Auto-reject after 48h return { status: 'auto-rejected', contactId: event.data.contactId }; } throw err; } if (!approved) { return { status: 'rejected', contactId: event.data.contactId }; } // 6. step.waitForEvent — wait for the contact to open the email (max 7 days) const opened = await step.waitForEvent<{ openedAt: string }>('wait-for-open', { event: 'email.opened', match: `data.contactId == "${event.data.contactId}"`, timeout: '7d', }); return { status: 'complete', contactId: event.data.contactId, score: enriched.score, emailOpened: opened !== null, }; } ); ``` ## Fan-out with Promise.all ```typescript const [resultA, resultB] = await Promise.all([ step.run('enrich-email', () => enrichEmail(contact.email)), step.run('enrich-phone', () => enrichPhone(contact.phone)), ]); ``` Both steps run in parallel. The workflow resumes when both complete. ## Cancellation ```typescript // Cancel a run via event await rotor.send('rotor/cancel.run', { runId: 'run-abc-123' }); ``` In-flight steps complete; subsequent steps are skipped. Run status moves to `cancelled`. ## View step graph Open `/dashboard/runs/:id` to see a real-time step DAG with status colors and output previews. ## Starter templates | Template | Description | |----------|-------------| | `multi-step-outbound` | Outbound email sequence with waitForEvent reply detection | | `event-driven-enrichment` | Parallel contact enrichment on campaign.started | | `scheduled-with-approval` | Weekly cron report requiring Slack/terminal approval | ```bash npx rotor template add multi-step-outbound ``` ### step.invoke — Sync Sub-Workflow Call URL: https://rotor.sh/docs/workflows/step-invoke ## Overview `step.invoke` spawns a child workflow run and suspends the parent step until the child completes. When the child finishes, the parent resumes with the child's return value. The parent's step result is durably memoized — on retry, the cached child result is returned without re-spawning. Unlike enqueueing a job and polling, `step.invoke` gives you a type-safe return value and propagates errors — the parent step throws if the child fails, times out, or is cancelled. ## API shape ```typescript const { result, runId } = await step.invoke<{ score: number }>("enrich-contact", { workflow: { id: "contact-enrichment" }, // target workflow id (must be event-triggered) data: { contactId: "c_123" }, // trigger data sent as the event payload timeout: "10m", // REQUIRED — no infinite default. Max 24h. }); ``` ```typescript // Full type signature: invoke( id: string, opts: { workflow: { id: string }; data: unknown; timeout: string; // REQUIRED. e.g. "5m", "1h", "24h". TypeScript build fails without it. } ): Promise<{ result: R; runId: string }> ``` ## Return value `step.invoke` returns `{ result, runId }`: | Field | Type | Description | |-------|------|-------------| | `result` | `R` | The child workflow's return value, typed via the `` generic | | `runId` | `string` | The child run's UUID — useful for debugging, linking logs, and dashboard navigation | The `runId` is included because observability matters: when a parent invokes dozens of children, you need to link parent and child spans without scraping logs. ## Error model `step.invoke` throws one of three errors when the child doesn't complete successfully: | Error class | When it fires | Import | |-------------|--------------|--------| | `WorkflowFailedError` | Child workflow threw or reached terminal failure | `@rotor/sdk/errors` | | `WorkflowTimeoutError` | Parent-side timeout elapsed (child continues running) | `@rotor/sdk/errors` | | `WorkflowCancelledError` | Child was cancelled mid-flight (user cancel, future `cancelOn` match) | `@rotor/sdk/errors` | ```typescript import { WorkflowFailedError, WorkflowTimeoutError, WorkflowCancelledError, } from "@rotor/sdk/errors"; const { result } = await step.invoke("run-analysis", { workflow: { id: "analysis-workflow" }, data: { reportId }, timeout: "30m", }).catch((err) => { if (err.name === "WorkflowTimeoutError") { // Child is still running independently — record the runId for follow-up return { result: null, runId: "unknown" }; } if (err.name === "WorkflowCancelledError") { // Child was explicitly cancelled — compensate await step.run("record-cancellation", () => recordCancellation(reportId)); throw err; } // WorkflowFailedError or unexpected — rethrow throw err; }); ``` ## Cancellation matrix | Direction | What happens | Error surfaced | |-----------|--------------|----------------| | Parent cancelled → child | `cancelRun(parentId)` recursively cancels all in-flight children via `parent_run_id` scan | N/A — parent itself was cancelled | | Child cancelled → parent step | Child reaches `cancelled` terminal status; parent's pending `step.invoke` throws | `WorkflowCancelledError` | | Parent-side timeout fires | `workflow:invoke-timeout` marks parent step failed; **child keeps running independently** | `WorkflowTimeoutError` | **`WorkflowTimeoutError` does NOT cancel the child.** The child run continues executing in the background. The timeout means "I'm done waiting for you" — it is a parent concern, not a child concern. This is intentional and matches the "wait for at most N minutes, but don't waste the work already done" semantics. Record the `runId` so you can track or cancel the child separately if needed. ## Why timeout is required **Inngest's blocks-forever default is a billing bug pretending to be a UX feature.** Forgetting to set a timeout on a blocking sub-workflow call means your run — and your billing meter — runs until the child completes, which could be days. Rotor surfaces this as a compile error: `timeout` is required at the TypeScript type level. The maximum is 24 hours. Recommended default: `"5m"` — matches the step.run cap and is long enough for most enrichment/analysis sub-agents. Increase only when you have a concrete SLA reason. ## Memoization On parent retry, the cached child result is returned without re-spawning: 1. The serve adapter hashes `"stepId:counter"` to produce a stable step key. 2. If `completedSteps[hash]` is already populated, `step.invoke` returns the cached `{ result, runId }` immediately. 3. No duplicate child runs are created — the child workflow is spawned exactly once per unique step ID. If the parent timed out on the previous attempt, the cached `WorkflowTimeoutError` is re-thrown without spawning a new child. Catch `WorkflowTimeoutError` inside your workflow function if you want to break out of this retry loop. ## Common patterns ### Sub-agent delegation Fan out to a specialist workflow and collect structured results: ```typescript export const analysisOrchestrator = createFunction( { id: "analysis-orchestrator", trigger: { event: "analysis.requested" }, retries: 2 }, async ({ event, step }) => { // Delegate to a contact enrichment specialist const { result: enriched } = await step.invoke("enrich", { workflow: { id: "contact-enricher" }, data: { contactId: event.data.contactId }, timeout: "5m", }); // Delegate to a scoring model const { result: score } = await step.invoke("score", { workflow: { id: "lead-scorer" }, data: { enriched }, timeout: "2m", }); return { contactId: event.data.contactId, score: score.value }; } ); ``` ### Pipeline composition Chain workflows where each stage's output feeds the next: ```typescript const { result: stage1 } = await step.invoke("stage-1", { workflow: { id: "data-cleaner" }, data: rawPayload, timeout: "10m", }); const { result: stage2 } = await step.invoke("stage-2", { workflow: { id: "data-transformer" }, data: stage1, timeout: "10m", }); const { result: final } = await step.invoke("stage-3", { workflow: { id: "data-loader" }, data: stage2, timeout: "5m", }); ``` ### Fan-out with Promise.all Invoke multiple children in parallel and merge results: ```typescript const [{ result: a }, { result: b }, { result: c }] = await Promise.all([ step.invoke("scorer-a", { workflow: { id: "scorer-a" }, data: contact, timeout: "3m" }), step.invoke("scorer-b", { workflow: { id: "scorer-b" }, data: contact, timeout: "3m" }), step.invoke("scorer-c", { workflow: { id: "scorer-c" }, data: contact, timeout: "3m" }), ]); return { consensus: average(a.score, b.score, c.score) }; ``` ## Limits | Constraint | Value | |------------|-------| | Maximum `timeout` | 24 hours | | Target workflow trigger type | Must be `event`-triggered (not `cron`) | | Recursion depth | No platform limit — user's responsibility to avoid infinite loops | | Parallel invocations | No platform limit — use `Promise.all` freely | **Avoid invoking cron workflows.** Cron workflows are not designed to receive event data — they read from schedules, not event payloads. `step.invoke` will fail immediately if the target workflow has `trigger_type = 'cron'`. ### step.sendEvent — Fire-and-Forget Event Emission URL: https://rotor.sh/docs/workflows/step-send-event ## Overview `step.sendEvent` publishes an event to your workspace event bus from within a running workflow. The parent step completes immediately — it does not wait for any downstream workflow triggered by the event. This is the async counterpart to `step.invoke`: use `step.sendEvent` when you want to fan out work without blocking the current workflow. The call is memoized: on parent retry, the event is not emitted a second time. ## API shape ```typescript await step.sendEvent("spawn-enricher", { name: "agent/contact.enrich.requested", // event name — dot-notation convention data: { contactId: "c_123", priority: "high" }, ts: undefined, // optional: Unix timestamp in ms — omit for immediate emission }); ``` ```typescript // Full type signature: sendEvent( id: string, opts: { name: string; data: unknown; ts?: number; // optional Unix timestamp (ms). Future ts defers emission via BullMQ delay queue. } ): Promise ``` ## Return value `step.sendEvent` returns `Promise`. There is no result to await — the operation is fire-and-forget from the parent workflow's perspective. ## Future scheduling When `ts` is set to a future Unix timestamp (milliseconds), the event emission is deferred via BullMQ's delay queue. The parent step completes immediately upon scheduling — it does not block until the future time. ```typescript const tomorrow = Date.now() + 24 * 60 * 60 * 1000; await step.sendEvent("schedule-followup", { name: "outreach/followup.scheduled", data: { contactId, templateId: "follow-up-v2" }, ts: tomorrow, // event fires ~24h from now }); // Parent workflow continues immediately after this line await step.run("record-scheduled", () => db.outreach.markScheduled(contactId)); ``` BullMQ stores the deferred job and emits the event at `ts`. If the worker restarts before `ts`, BullMQ's durable queue ensures the event fires on recovery — no manual retry logic needed. ## Memoization On parent retry, `step.sendEvent` is a no-op for an already-emitted step: 1. The serve adapter hashes `"stepId:counter"` to a stable key. 2. If `completedSteps[hash]` is already populated, `step.sendEvent` returns immediately without re-inserting to the events table. 3. No duplicate events are emitted. This gives the same memoization guarantee as `step.run`: side effects happen exactly once per workflow execution. ## Common patterns ### Async fan-out to parallel agents Spawn N sub-agents without waiting for any of them: ```typescript for (const contact of contacts) { await step.sendEvent(`spawn-enricher-${contact.id}`, { name: "agent/enrich.requested", data: { contactId: contact.id, workflowRunId: runId }, }); } // All N enrichment workflows start in parallel — parent does not wait ``` ### Scheduled sub-agent Schedule follow-up work at a specific time without running a separate cron: ```typescript const sendAt = new Date("2026-05-01T09:00:00Z").getTime(); await step.sendEvent("schedule-campaign-blast", { name: "campaign/blast.scheduled", data: { campaignId, segmentId }, ts: sendAt, }); ``` ### Chain without coupling Trigger a downstream workflow by name, allowing you to evolve each workflow independently: ```typescript const { data: enriched } = await step.run("enrich", () => enrichContact(contactId)); // Hand off to a separate "scoring" workflow — no direct import required await step.sendEvent("trigger-scoring", { name: "scoring/contact.ready", data: enriched, }); ``` ## Limits | Constraint | Value | |------------|-------| | Workspace scope | Events are workspace-scoped — no cross-workspace emit | | Deferred sendEvent durability | BullMQ delay queue — durable across worker restarts | | Deferred sendEvent idempotency | NOT idempotent across worker crashes before the delay job is persisted (Phase 11 follow-up) | | `ts` precision | Millisecond Unix timestamp — use `Date.now()` + offset | **Deferred `sendEvent` is not yet idempotent across worker crashes.** If the worker crashes between the serve adapter call and the BullMQ `add(delay:)` call, the event may not be scheduled. Phase 11 (idempotency keys) closes this window. For critical scheduled events, use a dedicated `step.sleep` + `step.sendEvent` sequence instead — the sleep memoizes durably and the sendEvent fires after resume. ### step.waitForSignal — Named Pause / Resume URL: https://rotor.sh/docs/workflows/step-wait-for-signal ## Overview `step.waitForSignal` suspends a workflow run until a named signal arrives via `POST /v1/signals/:signal_id/complete`. This is the idiomatic Rotor pattern for human-in-the-loop workflows: approval gates, review queues, external integration callbacks, and any case where a workflow needs to wait for an action that originates outside the system. Unlike `step.waitForEvent` — which matches against the global event stream — `step.waitForSignal` targets exactly one run by a signal ID you control. No fan-out, no broadcast: one signal resumes one run. ## API shape ```typescript const approval = await step.waitForSignal<{ approved: boolean; reason?: string }>( "approval-gate", { signal: `approval-${runId}`, // unique per run — include runId to avoid collisions timeout: "48h", // REQUIRED — throws SignalTimeoutError if no POST arrives } ); if (!approval.approved) { throw new Error(`Rejected: ${approval.reason}`); } ``` ```typescript // Full type signature: waitForSignal( id: string, opts: { signal: string; // workspace-unique signal ID timeout: string; // e.g. "24h", "7d" — required } ): Promise ``` ## Return value `step.waitForSignal` returns the `data` field from the resume POST body, typed via the `` generic. If the POST body is `{ "data": { "approved": true } }`, the step returns `{ approved: true }`. ## Resume mechanism Resume a waiting run by POSTing to the signal's completion endpoint: ``` POST /v1/signals/:signal_id/complete Authorization: Bearer Content-Type: application/json { "data": { "approved": true, "reviewer": "alice@example.com" } } ``` ### Authentication The API key must belong to the **same workspace** that owns the signal. Cross-workspace resume attempts receive a `404` (not a `401`) to prevent signal enumeration across tenants — see the 404/401 note below. ### Response codes | Status | Meaning | |--------|---------| | `200` | Signal received; run queued for resume | | `404` | Signal not found (unknown ID or cross-workspace) | | `409` | Collision — signal_id already in use by another active waiter in this workspace | | `410` | Signal already resolved (run already resumed or timed out) | | `422` | Missing or invalid `data` field in body | ### 404 on cross-workspace — not 401 Rotor returns `404` for both unknown signal IDs and cross-workspace resume attempts. Returning `401` for cross-workspace would leak signal existence (an attacker could enumerate live signal IDs by observing `401` vs `404`). The `404` response keeps signal existence opaque across workspace boundaries. ## Error model | Error class | When it fires | |-------------|--------------| | `SignalTimeoutError` | `timeout` elapsed before a `POST` was received | ```typescript import { SignalTimeoutError } from "@rotor/sdk/errors"; try { const result = await step.waitForSignal("human-review", { signal: `review-${runId}`, timeout: "72h", }); return { status: "approved", data: result }; } catch (err) { if (err.name === "SignalTimeoutError") { await step.run("auto-reject", () => recordAutoRejection(runId)); return { status: "auto-rejected" }; } throw err; } ``` ## signal_id uniqueness scope Signal IDs are **unique per workspace** — not per run and not globally. Two simultaneous runs in the same workspace cannot share the same signal ID. Recommended pattern: ```typescript // Use runId (unique per workflow execution) as part of the signal ID const signal = `approval-${runId}`; // If you need multiple signals per run, add a qualifier: const signal1 = `budget-approval-${runId}`; const signal2 = `legal-approval-${runId}`; ``` If a signal ID collision occurs (e.g., you accidentally reuse an ID that's already waiting), you get a `409` response from the POST endpoint, which surfaces as a run failure in the orchestrator. ## Collision behavior A `409` from the `workflow_run_waiters` UNIQUE constraint (on `workspace_id + signal_id`) surfaces as a run failure with a distinct error message. Your consumer code (the entity POSTing to the signal endpoint) should treat `409` as a bug in signal ID generation — not a retry-able transient error. ## Memoization On parent retry, `step.waitForSignal` does not create a duplicate waiter: 1. If a `workflow_run_waiters` row already exists for `(run_id, step_id)`, the serve adapter returns the cached result. 2. If the signal was already completed before the retry, the cached resolved value is returned immediately. 3. If the signal is still pending (awaiting POST), the run re-suspends on the existing waiter. ## Common patterns ### Approval gate ```typescript export const outreachWithApproval = createFunction( { id: "outreach-with-approval", trigger: { event: "outreach.requested" }, retries: 2 }, async ({ event, step, runId }) => { const draft = await step.run("draft-message", async () => generateDraft(event.data.contactId) ); // Post to Slack with a link to your approval UI await step.run("notify-approver", async () => slack.send({ channel: "#outreach-approvals", text: `New outreach draft for ${event.data.contactId}`, actions: [ { text: "Approve", url: `https://app.example.com/approve?signal=approval-${runId}` }, { text: "Reject", url: `https://app.example.com/reject?signal=approval-${runId}` }, ], }) ); // Pause until the approver clicks Approve or Reject const { approved, reason } = await step.waitForSignal<{ approved: boolean; reason?: string }>( "wait-for-approval", { signal: `approval-${runId}`, timeout: "48h" } ); if (!approved) { return { status: "rejected", reason }; } await step.run("send-message", () => sendOutreach(draft, event.data.contactId)); return { status: "sent" }; } ); ``` Your approval UI POSTs to complete the signal: ```typescript // In your approval UI API handler: await fetch(`https://api.rotor.sh/v1/signals/approval-${runId}/complete`, { method: "POST", headers: { Authorization: `Bearer ${process.env.ROTOR_API_KEY}`, "Content-Type": "application/json", }, body: JSON.stringify({ data: { approved: true, reviewer: "alice@example.com" } }), }); ``` ### Human review with fallback ```typescript let reviewResult: { action: "approve" | "escalate" } | null = null; try { reviewResult = await step.waitForSignal("human-review", { signal: `review-${runId}`, timeout: "24h", }); } catch (err) { if (err.name === "SignalTimeoutError") { // Auto-escalate after 24h of no response await step.run("escalate", () => escalateToManager(runId)); return { status: "escalated" }; } throw err; } ``` ### External integration callback Use `step.waitForSignal` when an external service calls back asynchronously — e.g., a KYC provider, a document signing service, or a payment processor: ```typescript // Register your signal ID with the external service before suspending await step.run("register-kyc-webhook", () => kycProvider.startVerification({ userId: event.data.userId, callbackSignalId: `kyc-${runId}`, }) ); const { verified, failureReason } = await step.waitForSignal( "wait-for-kyc", { signal: `kyc-${runId}`, timeout: "7d" } ); ``` Your KYC webhook handler then calls `POST /v1/signals/kyc-${runId}/complete` with the result. ## Cancellation Cancelling a workflow run that is paused on `step.waitForSignal` automatically deletes the waiter row — the signal ID is released and the run moves to `cancelled`. Any subsequent `POST` to that signal ID returns `404`. ## Limits | Constraint | Value | |------------|-------| | Uniqueness scope | Per workspace (not per run) — use `runId` in signal ID | | Collision | `409` at resume endpoint; surfaces as run failure | | Timeout maximum | No hard platform max — use `"7d"` for week-long human-review flows | | Resume body size | Limited by API gateway max body (512 KB) | | Cross-workspace resume | Returns `404` (not `401`) — workspace boundary enforced silently | ### step.waitForEvent URL: https://rotor.sh/docs/workflows/step-wait-for-event `step.waitForEvent` suspends the current workflow run until a specific event is published to your workspace via `rotor.send()`. If no matching event arrives before the timeout, the step returns `null`. ## Signature ```typescript const result = await step.waitForEvent( id: string, opts: { event: string; // event name to wait for match?: string; // filter expression (see below) timeout: string; // required — e.g. "1h", "7d", "30m" } ): Promise ``` Returns the `data` field of the matching event, or `null` if the timeout elapses first. ## Basic example ```typescript const opened = await step.waitForEvent<{ openedAt: string }>("wait-for-open", { event: "email.opened", timeout: "7d", }); if (opened === null) { // No open event within 7 days return { status: "no-open" }; } console.log("Email opened at:", opened.openedAt); ``` ## Filtering with `match` The `match` field is a filter expression that lets you wait for a specific event among many. It uses a CEL-like syntax operating on the incoming event's `data` field. ```typescript const opened = await step.waitForEvent<{ openedAt: string }>("wait-for-open", { event: "email.opened", match: `data.contactId == "${event.data.contactId}"`, timeout: "7d", }); ``` The expression is evaluated against each incoming event of the given name. The step resumes with the first event where the expression evaluates to `true`. **Supported operators:** | Operator | Example | |----------|---------| | Equality | `data.userId == "u_123"` | | Inequality | `data.status != "bounced"` | | Logical AND | `data.campaignId == "c1" && data.step == "opened"` | | Logical OR | `data.source == "web" \|\| data.source == "mobile"` | | Numeric comparison | `data.score >= 50` | Always use `match` when you have multiple workflow runs that could each be waiting for the same event name. Without a filter, any workflow run waiting for `email.opened` will resume on the first open event for any contact. ## Triggering the event From anywhere in your application — a webhook handler, another workflow, or the SDK: ```typescript const rotor = new Rotor({ apiKey: process.env.ROTOR_API_KEY! }); await rotor.send("email.opened", { contactId: "cid_123", openedAt: new Date().toISOString(), campaignId: "camp_456", }); ``` `rotor.send()` fans out to all workflow runs currently suspended at a matching `step.waitForEvent`. ## Timeout behavior When the timeout elapses before a matching event arrives, `step.waitForEvent` returns `null` — it does **not** throw. Check the return value: ```typescript const reply = await step.waitForEvent<{ approved: boolean }>("wait-for-reply", { event: "campaign.reply", match: `data.contactId == "${contactId}"`, timeout: "3d", }); if (reply === null) { // Follow up after 3 days of silence await step.run("send-followup", () => sendFollowUp(contactId)); return { status: "followed-up" }; } ``` ## Difference from step.waitForSignal | | `step.waitForEvent` | `step.waitForSignal` | |--|---------------------|----------------------| | Triggered by | `rotor.send(eventName, data)` | `POST /v1/signals/:id/complete` | | ID collision | Multiple runs can match the same event with filters | Signal ID must be globally unique per workspace | | On timeout | Returns `null` | Throws `SignalTimeoutError` | | Best for | Waiting for user behaviour (opens, clicks, replies) | Approval gates, human-in-the-loop steps | Use `step.waitForEvent` when you're waiting for an event that happens naturally in your system. Use `step.waitForSignal` when you need an explicit approval or external trigger with an unambiguous ID. ## Memoization `step.waitForEvent` is memoized. If the workflow retries after the event was already received and cached, the step returns the cached event data immediately without re-suspending. --- ## Migrate ### Migrate from Inngest URL: https://rotor.sh/docs/migrate/from-inngest ## TL;DR Inngest serves HTTP events; Rotor's **HTTP callback mode** (new in Phase 3) maps 1:1 to your existing handler code. Migration is a handler URL swap and a config copy — your Inngest handlers are already Rotor-compatible HTTP routes. No SDK rewrite, no worker container, no runtime change. At **1M runs/mo**, Rotor Team costs **$99 flat**; Inngest requires Enterprise pricing. Rotor is the right fit if you like Inngest's HTTP-function ergonomics but want flat per-workspace pricing instead of per-execution usage metering. It is **not** a replacement for multi-step Durable Execution (step.waitForEvent, step.invoke, cross-step observability). See [What Rotor does NOT replace](#what-rotor-does-not-replace). ## Pricing at a glance | Plan | Price | Executions / mo | Retention | Notes | |------|-------|-----------------|-----------|-------| | Inngest Free | $0 | 50k runs | — | Low limits | | Inngest Cloud | $100/mo | 500k runs | — | Per-execution meter | | Inngest Enterprise | Custom | 1M+ | — | Required above Cloud limits | | **Rotor Free** | **$0** | **10k** | 7d | Unlimited queues + schedules | | **Rotor Pro** | **$19/mo** | **100k** | 30d | 14-day trial, no card | | **Rotor Team** | **$99/mo** | **1M** | 90d | Flat — no overages | | Rotor Enterprise | Custom | Custom | 365d | Managed workers + SOC 2 + SSO | **100k runs/mo:** Inngest Cloud $100 vs Rotor Pro $19 → **5.3x cheaper**. **1M runs/mo:** Inngest requires Enterprise (unknown pricing); Rotor Team is **$99 flat**. Competitor pricing was verified against inngest.com/pricing on the publish date of this guide — re-check before making a purchasing decision. ## Architecture mapping | Inngest concept | Rotor equivalent | Migration notes | |-----------------|------------------|-----------------| | `inngest.createFunction({id, event}, handler)` | Queue + `callback_url` | `POST /v1/queues` then `PATCH` the queue with your handler URL | | `inngest.send({name, data})` | `POST /v1/queues/:name/jobs` | 1:1 — same enqueue-from-API pattern | | `step.run("name", async () => ...)` | Idempotent handler + BullMQ `attempts` + backoff | Rotor has one handler per queue; split steps into separate queues if you need independent retry budgets | | `step.sleep("1h")` | Delayed enqueue (`delay: 3600` on the job) | Same outcome — job becomes eligible after delay | | `step.waitForEvent(...)` | Approval queue (Phase 2 — [see docs](/enterprise)) | Partial match: approval queues pause until approved/rejected; not a general "wait for any event" primitive | | `createFunction({cron})` | `POST /v1/schedules` | Cron syntax is the same; Rotor **requires** a `timezone` field | | Dashboard event history | Job history archive (Phase 3) | 7–365 days retention by plan | ## Before / After ```typescript Inngest (before) // inngest/functions.ts import { inngest } from "./client"; export const welcomeEmail = inngest.createFunction( { id: "welcome-email" }, { event: "user/created" }, async ({ event, step }) => { await step.run("send-email", async () => { return sendEmail({ to: event.data.email, template: "welcome", }); }); } ); ``` ```typescript Rotor (after) // app/api/rotor/welcome-email/route.ts (your existing handler) import { createHmac, timingSafeEqual } from "node:crypto"; export async function POST(req: Request) { const body = await req.text(); if (!verifyRotorSignature(req.headers, body)) { return new Response("unauthorized", { status: 401 }); } const job = JSON.parse(body); const jobId = req.headers.get("x-rotor-job-id")!; // Idempotent send keyed on the Rotor job id await sendEmailOnce(jobId, { to: job.payload.email, template: "welcome", }); return new Response("ok"); } ``` ```bash One-time setup (create the queue) curl -X POST https://api.rotor.sh/v1/queues \ -H "Authorization: Bearer $ROTOR_API_KEY" \ -H "Content-Type: application/json" \ -d '{"name":"welcome-email"}' # Attach your handler URL (paid plans only) curl -X PATCH https://api.rotor.sh/v1/queues/welcome-email \ -H "Authorization: Bearer $ROTOR_API_KEY" \ -H "Content-Type: application/json" \ -d '{ "callback_url": "https://your-app.example.com/api/rotor/welcome-email", "rotate_callback_secret": true }' # Response includes callback_secret once — save it to your env var. ``` ```bash Replacing inngest.send # Was: inngest.send({ name: "user/created", data: { email, id } }) curl -X POST https://api.rotor.sh/v1/queues/welcome-email/jobs \ -H "Authorization: Bearer $ROTOR_API_KEY" \ -H "Content-Type: application/json" \ -d '{"payload": {"email":"alice@example.com","id":"u_1"}}' ``` ## Signature verification Every callback from Rotor includes three headers you must verify: | Header | Purpose | |--------|---------| | `X-Rotor-Job-Id` | Unique job id — use as idempotency / dedup key | | `X-Rotor-Attempt` | Retry attempt counter (`1` on first delivery) | | `X-Rotor-Signature` | Svix-compatible HMAC-SHA256 signature of the body | The signature format is identical to the webhook signing format — see [Webhook Signature Verification](/webhooks/signing) for Node, Python, Go, and Ruby examples. The only difference is the header names (`x-rotor-*` instead of `rotor-webhook-*`). ```typescript Node.js import { createHmac, timingSafeEqual } from "node:crypto"; function verifyRotorSignature( headers: Headers, body: string, ): boolean { const id = headers.get("x-rotor-job-id") ?? ""; const ts = headers.get("x-rotor-timestamp") ?? ""; const sig = headers.get("x-rotor-signature") ?? ""; if (!id || !ts || !sig) return false; if (Math.abs(Date.now() / 1000 - Number(ts)) > 300) return false; const expected = `v1,${createHmac("sha256", process.env.ROTOR_CALLBACK_SECRET!) .update(`${id}.${ts}.${body}`) .digest("base64")}`; const a = Buffer.from(expected); const b = Buffer.from(sig); return a.length === b.length && timingSafeEqual(a, b); } ``` ```python Python import hmac, hashlib, base64, time def verify_rotor_signature(headers, body, secret): jid = headers.get("x-rotor-job-id", "") ts = headers.get("x-rotor-timestamp", "") sig = headers.get("x-rotor-signature", "") if not (jid and ts and sig): return False if abs(time.time() - int(ts)) > 300: return False digest = hmac.new(secret.encode(), f"{jid}.{ts}.{body}".encode(), hashlib.sha256).digest() expected = f"v1,{base64.b64encode(digest).decode()}" return hmac.compare_digest(expected, sig) ``` ```go Go import ( "crypto/hmac"; "crypto/sha256"; "encoding/base64" "fmt"; "math"; "strconv"; "time" ) func VerifyRotorSignature(headers map[string]string, body, secret string) bool { jid, ts, sig := headers["x-rotor-job-id"], headers["x-rotor-timestamp"], headers["x-rotor-signature"] if jid == "" || ts == "" || sig == "" { return false } tsInt, err := strconv.ParseInt(ts, 10, 64) if err != nil || math.Abs(float64(time.Now().Unix()-tsInt)) > 300 { return false } mac := hmac.New(sha256.New, []byte(secret)) mac.Write([]byte(fmt.Sprintf("%s.%s.%s", jid, ts, body))) expected := fmt.Sprintf("v1,%s", base64.StdEncoding.EncodeToString(mac.Sum(nil))) return hmac.Equal([]byte(expected), []byte(sig)) } ``` ## Retry semantics Rotor uses BullMQ-native retry counts + exponential backoff. Where Inngest retries individual steps, Rotor retries the entire handler invocation. | Behavior | Inngest | Rotor | |----------|---------|-------| | Default attempts | 4 per step | 3 per job (configurable per queue) | | Backoff | Exponential (step-level) | Exponential (job-level) | | On exhaustion | Marked failed; visible in dashboard | Moved to workspace **DLQ** — retry via `POST /v1/queues/:name/retry-all` | | Partial progress | Each step independently retried | Handler re-runs from the top — **must be idempotent** | | Delivery semantics | At-least-once | At-least-once | Configure per-queue retry policy at queue creation time: ```json { "name": "welcome-email", "defaultJobOptions": { "attempts": 5, "backoff": { "type": "exponential", "delay": 2000 } } } ``` ## Idempotency — read this **At-least-once delivery means your handler MUST be idempotent.** Use `X-Rotor-Job-Id` as a dedup key on every side-effect. The header is stable across all retry attempts of the same job. ```typescript // Wrong — will send multiple emails on retry await sendEmail(event.data); // Right — idempotent via unique DB constraint on job_id await db.outreach.upsert({ where: { rotorJobId: jobId }, create: { rotorJobId: jobId, sentAt: new Date(), ... }, update: {}, // no-op if already sent }); if (!alreadySent) await sendEmail(event.data); ``` This is the same rule Inngest imposes — it's just that Rotor makes it explicit. ## Automated import We ship a small script that statically analyzes your Inngest function definitions and emits the equivalent Rotor configuration as JSON-lines, one object per queue/schedule. Pipe the output to `jq` or save it and replay with `curl` / `rotor queue create`. ```bash # Download the script (no install needed) npx tsx https://rotor.sh/docs/scripts/import-from-inngest.ts ./inngest/functions.ts # or run locally: npx tsx docs/scripts/import-from-inngest.ts ./inngest/functions.ts # Example output: # {"type":"queue","config":{"name":"welcome-email","notes":"event: user/created"}} # {"type":"schedule","config":{"queue":"nightly-digest","cron":"0 2 * * *","timezone":"UTC"}} ``` The script: - Walks the directory you point it at looking for `.ts` / `.js` / `.tsx` files - Greps for `inngest.createFunction({ id: ..., event: ... })` and `cron: ...` shapes - Emits `{ type: "queue" | "schedule", config: {...} }` per line - Prints warnings (to stderr) for shapes it couldn't parse — you handle those manually The output is **diagnostic** — we never modify your code. Paste the queue JSONs into a bootstrap script or run them through `curl` to provision Rotor. ## What Rotor does NOT replace Be honest with yourself about which Inngest features you actually use: - **Multi-step observability** (step-level timing, step-level retries, branching). Rotor treats each handler as one atomic invocation — if you need step-level visibility, Rotor is not the right fit. - **`step.waitForEvent(...)`** — general "wait for any event" coordination. Rotor's [approval queues](/enterprise) solve the "human-in-the-loop pause" subset, but not arbitrary event coordination. - **`step.invoke(otherFunction)`** — cross-function fan-out with result composition. In Rotor you do this by enqueueing jobs to downstream queues and correlating on your own. - **Durable, pausable function state across days/weeks** — that's Temporal's domain. If you are using Inngest primarily as "a queue with good ergonomics + webhook handlers + cron" → Rotor is a perfect fit and you'll save money. If you've built multi-step durable workflows with event-based coordination → evaluate [Temporal](https://temporal.io) or stay on Inngest. ## Next steps 1. **[Start a Pro trial](https://rotor.sh/signup)** — 14 days, no credit card. 2. **Install the SDK:** `pnpm add @rotorsh/sdk` — see the [Node.js quickstart](/guides/node-quickstart). 3. **Join the [Anthropic Discord](https://www.anthropic.com/discord)** — `#rotor` channel for migration help. --- ## Migrating step.invoke / step.sendEvent / step.waitForSignal Rotor ships native equivalents for all three Inngest durable-execution primitives. The APIs are intentionally similar — most migrations are a rename and a `timeout` addition. ### step.invoke ```typescript inngest import { inngest } from './inngest'; export const orchestrator = inngest.createFunction( { id: 'orchestrator' }, { event: 'campaign/requested' }, async ({ event, step }) => { // Inngest: timeout is optional — omitting it blocks forever (billing bug) const result = await step.invoke('enrich-contact', { function: contactEnricher, data: { contactId: event.data.contactId }, }); return result; } ); ``` ```typescript rotor import { createFunction } from '@rotorsh/sdk'; import { WorkflowFailedError, WorkflowTimeoutError } from '@rotor/sdk/errors'; export const orchestrator = createFunction( { id: 'orchestrator', trigger: { event: 'campaign.requested' }, retries: 2 }, async ({ event, step }) => { // Rotor: timeout is REQUIRED — TypeScript build fails without it (max 24h) const { result } = await step.invoke<{ score: number }>('enrich-contact', { workflow: { id: 'contact-enricher' }, // target by workflow id string, not function ref data: { contactId: event.data.contactId }, timeout: '5m', // REQUIRED }); return result; } ); ``` ### step.sendEvent ```typescript inngest // Inngest: step.sendEvent uses 'at' (ISO string) for future scheduling await step.sendEvent('spawn-sub-agent', { name: 'agent/sub-agent.spawn', data: { contactId: '123' }, ts: new Date('2026-05-01T09:00:00Z'), // Date object or ISO string }); ``` ```typescript rotor // Rotor: ts is Unix timestamp in milliseconds (not ISO string or Date object) await step.sendEvent('spawn-sub-agent', { name: 'agent/sub-agent.spawn', data: { contactId: '123' }, ts: new Date('2026-05-01T09:00:00Z').getTime(), // convert Date to Unix ms }); ``` ### step.waitForSignal Inngest does not have a direct `step.waitForSignal` equivalent. The closest Inngest pattern uses `cancelOn` matchers to respond to specific events — a fundamentally different model (event-bus broadcast vs. single-run named signal). ```typescript inngest // Inngest: no step.waitForSignal — closest pattern is cancelOn event matcher // This CANCELS the run (not a pause/resume pattern) export const withCancelOn = inngest.createFunction( { id: 'approval-workflow', cancelOn: [{ event: 'approval/rejected', match: 'data.runId' }], }, { event: 'approval/requested' }, async ({ event, step }) => { // If 'approval/rejected' fires with matching runId, the whole run is cancelled // No way to "wait for approval and branch on it" without external state await step.sleep('wait', '48h'); } ); ``` ```typescript rotor // Rotor: step.waitForSignal — pause/resume with typed return value + timeout import { SignalTimeoutError } from '@rotor/sdk/errors'; export const approvalWorkflow = createFunction( { id: 'approval-workflow', trigger: { event: 'approval.requested' }, retries: 2 }, async ({ event, step, runId }) => { const { approved, reason } = await step.waitForSignal<{ approved: boolean; reason?: string }>( 'wait-for-approval', { signal: `approval-${runId}`, // workspace-unique; POST to /v1/signals/:id/complete timeout: '48h', } ).catch((err) => { if (err.name === 'SignalTimeoutError') return { approved: false, reason: 'Timed out' }; throw err; }); return approved ? { status: 'approved' } : { status: 'rejected', reason }; } ); ``` ### API differences table | Aspect | Inngest | Rotor Phase 10+ | |--------|---------|----------------| | `step.invoke` — `timeout` param | Optional; omitting blocks forever (no cap) | **REQUIRED**; TypeScript build error if missing; max 24h | | `step.invoke` — target reference | Function reference (`function: myFn`) | Workflow ID string (`workflow: { id: 'my-workflow' }`) | | `step.invoke` — return shape | Bare result `R` | `{ result: R; runId: string }` — `runId` included for observability | | Parent-side timeout cancels child | N/A (no timeout concept) | **NO** — child runs independently; parent step throws `WorkflowTimeoutError` | | Child cancellation surfaces to parent | YES — `WorkflowCancelledError` | YES — `WorkflowCancelledError` (same behavior) | | `step.sendEvent` — future scheduling | `ts: Date` or ISO string | `ts: number` — Unix timestamp in **milliseconds** | | `step.waitForSignal` | **Does not exist** | YES — `signal` is workspace-scoped unique ID; resume via `POST /v1/signals/:id/complete` | | Signal resume auth | N/A | `rt_ws_` API key scoped to the workspace owning the signal | | Cross-workspace signal resume | N/A | Returns `404` (not `401`) — existence is opaque across workspace boundaries | | Concurrency keys | Shipped (`concurrency: { key, limit }`) | Shipped — set `concurrencyKey: string` on enqueue; see [Concurrency Keys](/docs/guides/concurrency-keys) | | `cancelOn` config | Shipped | **E2 STUB DELETED in Phase 10** — Wave 2 reintroduces with real enforcement | | Idempotent triggers | ✓ (`idempotency` field on `inngest.send`) | ✓ (`idempotencyKey` on `rotor.workflow.trigger`) | | `onFailure` hook | ✓ | ✓ | | `onSuccess` hook | ✓ | ✓ | | `onCancel` hook | partial | ✓ | | `onStartAttempt` / `onWait` / `onResume` / `catchError` | ✓ | — (deferred) | ### Key migration callouts **1. Required timeout on step.invoke** Add `timeout` to every `step.invoke` call. Omitting it is a TypeScript compile error — the build will fail and point at the missing field. Start with `"5m"` and increase only when you have a concrete SLA. **2. Parent-side timeout does not cancel the child** When `WorkflowTimeoutError` fires on the parent, the child workflow keeps running independently. This matches "I'm done waiting, but don't waste the work" semantics. If you want to cancel the child on timeout, you must call the Rotor cancel API manually using the `runId` captured before the timeout. **3. step.waitForSignal is a new mental model — not cancelOn** Inngest's `cancelOn` terminates the run on a matching event. Rotor's `step.waitForSignal` pauses the run and resumes it with the signal payload as the step result. You can branch on the result, catch `SignalTimeoutError` for fallback logic, or rethrow — the run stays in your control. For more detail: [step.invoke deep dive](/docs/workflows/step-invoke) · [step.sendEvent deep dive](/docs/workflows/step-send-event) · [step.waitForSignal deep dive](/docs/workflows/step-wait-for-signal) --- ## Idempotency keys Inngest's `inngest.send({ name, data, idempotency })` returns the same event ID for duplicate calls within a configured window. Rotor's equivalent: `rotor.workflow.trigger(name, data, { idempotencyKey })`. Same call site, same intent, same dedup behavior — the key is scoped per (workspace, function) so the same Stripe webhook ID won't dedup against itself across two unrelated workflows. ```typescript Before (Inngest) await inngest.send({ name: "stripe/charge.succeeded", data: { chargeId: charge.id }, idempotency: `stripe-${event.id}`, }); ``` ```typescript After (Rotor) await rotor.workflow.trigger( "stripe/charge.succeeded", { chargeId: charge.id }, { idempotencyKey: `stripe-${event.id}` }, ); ``` **What you get:** - Same `idempotencyKey` twice → second call returns `{ duplicate: true, runIds: [...] }` with the same run ID. - Different `idempotencyKey` (or none) → fresh run. - Workspace at quota → 402 still wins. A duplicate webhook from a workspace at the Hobby cap gets the 402 response, not a free duplicate-hit pass. (Per [pricing.md](/docs/pricing).) **Differences from Inngest:** - Rotor's idempotency window is "forever" — the unique constraint is on a Postgres column, not a 24h Redis cache. Storage is bounded only by row retention (purged at the same rates as workflow_runs). - The dedup happens at the workflow level, not the global event-bus level. Same key under two different workflows fires two distinct runs (intended — they have different side effects). --- ## Lifecycle hooks Inngest exposes per-function `onFailure` (and a half-dozen others). Rotor ships three: `onSuccess`, `onFailure`, `onCancel`. ```typescript Before (Inngest) inngest.createFunction( { id: "send-receipt", name: "Send Receipt" }, { event: "order/placed" }, async ({ event, step }) => { /* ... */ }, ); inngest.createFunction( { id: "send-receipt-failure-handler" }, { event: "inngest/function.failed", if: "event.data.function_id == 'send-receipt'" }, async ({ event }) => { /* send Slack alert */ }, ); ``` ```typescript After (Rotor) const sendReceipt = workflow({ id: "send-receipt", trigger: { event: "order/placed" }, onFailure: async ({ ctx, error }) => { // Same Slack alert. ctx.runId, ctx.functionId, ctx.attempt available. await slack.send({ channel: "#alerts", text: `Receipt failed: ${error.message} (run ${ctx.runId})`, }); }, steps: async ({ step }) => { /* ... */ }, }); ``` **Locked semantics (matching Inngest where applicable):** - `onSuccess` fires only on FINAL completion (after retries succeed). A workflow that fails twice then succeeds → `onSuccess` fires once. - `onFailure` fires only after retries are exhausted. Per-attempt failures do NOT fire `onFailure`. - `onCancel` fires when the run is cancelled mid-execution (`running | sleeping | waiting` → `cancelled`). It does NOT fire if the run was queued (`pending`) and never started — same as Inngest. - Hooks are fire-and-forget. Errors thrown inside a hook are logged to your audit_event channel (kind=`workflow.hook.errored`) and visible at `GET /v1/runs/:id` under `hookErrors[]` — but do NOT change the run's terminal status. A workflow that succeeded with a failing `onSuccess` still shows `status: completed`. - Hooks have a 30s soft timeout. Hooks running longer trigger an audit event (`workflow.hook.slow`) but are NOT killed — handle long work via `step.sendEvent` from inside `steps` instead. **Hooks Rotor does NOT ship (yet):** `onStartAttempt`, `onWait`, `onResume`, `onComplete`, `catchError`. Trigger.dev has these. Rotor doesn't, by design — three hooks cover 90% of buyer needs and we won't bloat the surface to mimic competitor feature counts. If you have a specific need, [open an issue](https://github.com/rotor-sh/rotor/issues). --- ## Side-by-side: Inngest vs Rotor createFunction ```typescript inngest import { inngest } from './inngest'; export const welcomeEmail = inngest.createFunction( { id: 'welcome-email' }, { event: 'user/created' }, async ({ event, step }) => { const profile = await step.run('fetch-profile', async () => { return fetchProfile(event.data.userId); }); await step.sleep('wait-1-day', '1d'); await step.run('send-email', async () => { await sendEmail({ to: profile.email }); }); } ); ``` ```typescript rotor import { createFunction } from '@rotorsh/sdk'; export const welcomeEmail = createFunction( { id: 'welcome-email', trigger: { event: 'user.created' }, // dot-notation, not slash retries: 3, // explicit retry count }, async ({ event, step }) => { const profile = await step.run('fetch-profile', async () => { return fetchProfile(event.data.userId); }); await step.sleep('wait-1-day', '1d'); // identical await step.run('send-email', async () => { await sendEmail({ to: profile.email }); }); } ); ``` ### Key differences | Feature | Inngest | Rotor | |---------|---------|-------| | Event name format | `user/created` (slash) | `user.created` (dot) | | Serve adapter | `serve()` (conflicts with Node.js) | `serveWorkflow()` (no naming conflict) | | Pricing | $100/mo for 500k steps | $19/mo Pro for 100k jobs (5.3x cheaper) | | State storage | Inngest cloud | Your Postgres (you own the data) | | Open source | Server is closed | `rotor-core` is MIT | | Step graph | Inngest Cloud UI | `/dashboard/runs/:id` on your domain | ### Migrating `serve()` ```typescript // Inngest import { serve } from 'inngest/hono'; app.use('/api/inngest', serve({ client: inngest, functions: [myFn] })); // Rotor — same pattern, different import import { serveWorkflow } from '@rotorsh/sdk'; serveWorkflow(app, { functions: [myFn], signingKey: process.env.ROTOR_SIGNING_KEY! }); ``` ### Migrating event publishing ```typescript // Inngest await inngest.send({ name: 'user/created', data: { userId: '123' } }); // Rotor const rotor = new Rotor({ apiKey: process.env.ROTOR_API_KEY! }); await rotor.send('user.created', { userId: '123' }); ``` ### Migrate from Vercel Cron URL: https://rotor.sh/docs/migrate/from-vercel-cron ## TL;DR Vercel Cron triggers HTTP GETs (or POSTs) on a schedule. **Rotor schedules + HTTP callback mode** replace it — with no **1-cron-per-day Hobby cap** and no per-project Pro ceiling. Your existing `/api/cron/*` handlers work as-is; Rotor becomes the scheduler that calls them. At **Rotor Free** ($0) you get **unlimited schedules** against a 10k/mo execution budget — vs. Vercel Hobby's hard **1 cron per day** limit. ## Pricing at a glance | Plan | Price | Schedule limits | Notes | |------|-------|-----------------|-------| | Vercel Hobby | $0 | **1 cron per day** | Daily granularity only | | Vercel Pro | $20/mo | Every minute | Per-project Vercel plan cost | | **Rotor Free** | **$0** | **Unlimited schedules** | 10k executions/mo total | | **Rotor Pro** | **$19/mo** | **Unlimited schedules** | 100k executions/mo (covers most cron use cases) | | **Rotor Team** | **$99/mo** | **Unlimited schedules** | 1M executions/mo | **Key differentiator:** Rotor Free lets you run 10,000 scheduled executions per month across **as many schedules as you want**. Vercel Hobby caps you at 1 cron per day regardless of execution volume. If you're hitting the Hobby cron cap, Rotor Free is a drop-in replacement at the same price ($0). Competitor pricing was verified against vercel.com/pricing on the publish date of this guide. Vercel frequently adjusts their cron allowances — re-read their pricing page before making a purchasing decision. ## Architecture mapping | Vercel Cron concept | Rotor equivalent | Migration notes | |---------------------|------------------|-----------------| | `vercel.json` `crons[]` entry | `POST /v1/schedules` | One schedule per entry | | Handler route (e.g. `/api/cron/cleanup`) | Queue with `callback_url` | Same handler — just attach its URL to a Rotor queue | | `CRON_SECRET` bearer header | `X-Rotor-Signature` HMAC verification | Swap the auth check; handler body stays the same | | Default UTC | **Explicit `timezone` required** | Rotor schedules MUST declare a timezone — see below | | Per-project isolation | Per-workspace isolation | Works across any deploy target (Vercel, Railway, Fly, self-hosted) | **No handler-side changes required:** Vercel Cron already hits an HTTP route on your app. Rotor does the same thing — you just change what authenticates the incoming request (HMAC instead of bearer). ## Before / After ```json vercel.json (before) { "crons": [ { "path": "/api/cron/cleanup", "schedule": "0 */6 * * *" }, { "path": "/api/cron/daily-digest", "schedule": "0 12 * * *" } ] } ``` ```bash Rotor (after) # 1. Create one queue per cron, pointing at your existing handler URL curl -X POST https://api.rotor.sh/v1/queues \ -H "Authorization: Bearer $ROTOR_API_KEY" \ -H "Content-Type: application/json" \ -d '{"name":"cron-cleanup"}' curl -X PATCH https://api.rotor.sh/v1/queues/cron-cleanup \ -H "Authorization: Bearer $ROTOR_API_KEY" \ -H "Content-Type: application/json" \ -d '{ "callback_url": "https://your-app.vercel.app/api/cron/cleanup", "rotate_callback_secret": true }' # Response returns callback_secret ONCE — save to env var ROTOR_CALLBACK_SECRET. # 2. Attach a schedule curl -X POST https://api.rotor.sh/v1/schedules \ -H "Authorization: Bearer $ROTOR_API_KEY" \ -H "Content-Type: application/json" \ -d '{ "queue": "cron-cleanup", "cron": "0 */6 * * *", "timezone": "UTC" }' ``` Delete the `crons` array from `vercel.json` once the Rotor schedule is live. Your handler route stays exactly where it is. ## Timezone — read this first **Rotor schedules REQUIRE an explicit `timezone` parameter** (API-12). Vercel Cron defaults to UTC silently — when you migrate, **decide** what timezone you want. - If your existing cron was "run at 12:00 every day" and you intended that in UTC, set `"timezone": "UTC"`. - If you intended local business hours (e.g. "noon eastern"), set `"timezone": "America/New_York"` — Rotor will handle DST. Pick wrong and your reports will fire an hour off twice a year. Supported timezone values: any IANA zone name (e.g. `UTC`, `America/New_York`, `Europe/London`, `Asia/Tokyo`). See the full list at [IANA tz database](https://www.iana.org/time-zones). ## Signature verification Vercel Cron authenticates callbacks via the `CRON_SECRET` env var + bearer header: ```typescript // Before: Vercel Cron handler export async function GET(req: Request) { const auth = req.headers.get("authorization"); if (auth !== `Bearer ${process.env.CRON_SECRET}`) { return new Response("unauthorized", { status: 401 }); } // ...do cron work } ``` Rotor authenticates via **HMAC signature** on the request body — stronger because it binds the signature to the exact payload, preventing replay with altered bodies. See the full [Webhook Signature Verification](/webhooks/signing) reference for Node, Python, Go, and Ruby. ```typescript // After: Rotor callback handler import { createHmac, timingSafeEqual } from "node:crypto"; export async function POST(req: Request) { const body = await req.text(); if (!verifyRotorSignature(req.headers, body)) { return new Response("unauthorized", { status: 401 }); } const jobId = req.headers.get("x-rotor-job-id"); // ...do cron work (same as before); use jobId for idempotency if needed return new Response("ok"); } function verifyRotorSignature(headers: Headers, body: string): boolean { const id = headers.get("x-rotor-job-id") ?? ""; const ts = headers.get("x-rotor-timestamp") ?? ""; const sig = headers.get("x-rotor-signature") ?? ""; if (!id || !ts || !sig) return false; if (Math.abs(Date.now() / 1000 - Number(ts)) > 300) return false; const expected = `v1,${createHmac("sha256", process.env.ROTOR_CALLBACK_SECRET!) .update(`${id}.${ts}.${body}`) .digest("base64")}`; const a = Buffer.from(expected); const b = Buffer.from(sig); return a.length === b.length && timingSafeEqual(a, b); } ``` **Request method changes:** Vercel Cron sends `GET`. Rotor sends `POST` with a JSON body (empty `{}` for schedules with no payload). Update your route handler from `GET` to `POST` — or accept both if you want to leave Vercel Cron running in parallel during migration. ## Auth migration cheatsheet | Check | Vercel Cron | Rotor callback | |-------|-------------|----------------| | Header read | `authorization: Bearer $CRON_SECRET` | `x-rotor-signature: v1,` | | Secret source | `process.env.CRON_SECRET` | `process.env.ROTOR_CALLBACK_SECRET` (minted once via `PATCH /v1/queues/:name`) | | Replay window | None | ±300s (checked against `x-rotor-timestamp`) | | Idempotency | — | `x-rotor-job-id` — stable across retries | | Request method | `GET` | `POST` | ## Automated import We ship a small script that reads your `vercel.json`, extracts the `crons[]` array, and emits the Rotor queue + schedule configuration as JSON-lines. ```bash # Run it npx tsx docs/scripts/import-from-vercel-cron.ts ./vercel.json ``` Example input and output: ```json // vercel.json { "crons": [ { "path": "/api/cron/cleanup", "schedule": "0 */6 * * *" }, { "path": "/api/cron/daily-digest", "schedule": "0 12 * * *" } ] } ``` ``` // stdout — one object per line {"type":"queue","config":{"name":"cron-cleanup","callback_url":"https://YOUR_DOMAIN/api/cron/cleanup"}} {"type":"schedule","config":{"queue":"cron-cleanup","cron":"0 */6 * * *","timezone":"UTC"}} {"type":"queue","config":{"name":"cron-daily-digest","callback_url":"https://YOUR_DOMAIN/api/cron/daily-digest"}} {"type":"schedule","config":{"queue":"cron-daily-digest","cron":"0 12 * * *","timezone":"UTC"}} ``` After running the script: 1. Search-and-replace `YOUR_DOMAIN` with your actual production host (e.g. `your-app.vercel.app`). 2. Replay the `type:queue` lines as `POST /v1/queues` + `PATCH .../callback_url`. 3. Replay the `type:schedule` lines as `POST /v1/schedules`. 4. Verify one scheduled execution fires and hits your handler, **then** remove the `crons` array from `vercel.json`. ## Next steps 1. **[Start a Pro trial](https://rotor.sh/signup)** — 14 days, no credit card. 2. **Install the SDK:** `pnpm add @rotorsh/sdk` — see the [Node.js quickstart](/guides/node-quickstart). 3. **Join the [Anthropic Discord](https://www.anthropic.com/discord)** — `#rotor` channel for migration help. ### Move your Vercel env vars to Rotor URL: https://rotor.sh/docs/migrate/from-vercel-env-vars ## TL;DR If your Rotor jobs reference `process.env.STRIPE_KEY` or `process.env.SLACK_TOKEN` in callback URLs, headers, or payloads, you can move those into Rotor's secrets vault and reference them as `${{ secrets.STRIPE_KEY }}` instead — gaining a centralized audit trail, rotation API, and team-shared access without redeploying your app. This guide covers secrets **used by Rotor jobs** at dispatch time (callback URLs, callback headers, job payloads). It does NOT cover secrets your app reads at boot. See [What stays in Vercel/Railway](#what-stays-in-vercelrailway) for the full scope boundary. ## Why this matters - **Audit trail** — every secret access is recorded in `audit_event` with actor and timestamp. Know who changed what and when, without digging through Vercel's team activity log. - **Rotation without redeploy** — `PATCH /v1/secrets/:name` re-encrypts the value. Queued jobs pick up the new plaintext at next dispatch. No redeploy, no downtime. - **Team-shared** — secrets are workspace-scoped. Teammates with workspace access read the same values without sharing your Railway or Vercel login credentials. ## What stays in Vercel/Railway Rotor secrets are resolved at **job dispatch time**. Anything your application reads **outside** of a Rotor job should stay where it is: - Database connection strings (your app needs them at boot) - Supabase service role key and anon key (read by your backend at startup) - Auth provider secrets (NextAuth, Clerk, etc.) - Anything consumed outside a Rotor callback handler If you're not sure, ask: "Does Rotor need this value to dispatch a job?" YES → vault it. NO → leave it in Vercel/Railway. ## Architecture mapping | Concept | Vercel / Railway env vars | Rotor secrets vault | |---------|--------------------------|---------------------| | Storage | Platform UI or `.env` file | Encrypted at rest, AES-256-GCM | | Reference | `process.env.MY_KEY` | `${{ secrets.MY_KEY }}` | | Rotation | Edit in dashboard, redeploy | `PATCH /v1/secrets/:name`, no redeploy | | Audit | None (or Vercel team log) | `audit_event` rows on every create / rotate / delete / access | | Scope | Per project | Per workspace (team-shared) | | Resolution | At app boot / request time | At job dispatch time (plaintext never in BullMQ or Postgres) | ## Before / After The common pattern: your callback handler reads `process.env.NOTIFY_TOKEN` to authenticate outbound requests. After migration, Rotor injects the token directly into the callback URL or header — your handler no longer needs the env var. **`ROTOR_CALLBACK_SECRET` stays in your env.** That secret verifies inbound requests *from* Rotor to your handler. Rotor can't inject it into itself. Only secrets that Rotor *sends outbound* (in callback URLs, headers, or payloads) belong in the vault. ```typescript Before — env var in callback URL // your-app/api/rotor-callback.ts export async function POST(req: Request) { const auth = req.headers.get("authorization"); // ROTOR_CALLBACK_SECRET stays in your env — it verifies inbound Rotor requests if (auth !== `Bearer ${process.env.ROTOR_CALLBACK_SECRET}`) { return new Response("unauthorized", { status: 401 }); } const job = await req.json(); // NOTIFY_TOKEN used to authenticate outbound request — this moves to Rotor vault await fetch( `https://api.example.com/notify?token=${process.env.NOTIFY_TOKEN}`, { method: "POST", body: JSON.stringify(job) } ); return new Response("ok"); } ``` ```typescript After — secret vaulted, URL interpolation // 1. Store the secret once: // POST /v1/secrets {"name":"NOTIFY_TOKEN","value":"abc123..."} // 2. Configure your queue's callback URL to interpolate the secret: // PATCH /v1/queues/notifications { // "callback_url": "https://api.example.com/notify?token=${{ secrets.NOTIFY_TOKEN }}" // } // Rotor resolves ${{ secrets.NOTIFY_TOKEN }} at dispatch — plaintext never // persists in Redis or Postgres. // 3. Your callback handler no longer needs NOTIFY_TOKEN: export async function POST(req: Request) { const auth = req.headers.get("authorization"); if (auth !== `Bearer ${process.env.ROTOR_CALLBACK_SECRET}`) { return new Response("unauthorized", { status: 401 }); } // The token is already in the URL Rotor called — handler doesn't need it. return new Response("ok"); } ``` ### Using secrets in callback headers If your downstream API expects the token as a header rather than a query parameter: ```bash curl -X PATCH https://api.rotor.sh/v1/queues/notifications \ -H "Authorization: Bearer $ROTOR_API_KEY" \ -H "Content-Type: application/json" \ -d '{ "callback_headers": { "X-API-Key": "${{ secrets.NOTIFY_TOKEN }}" } }' ``` Rotor resolves the template at dispatch and forwards the plaintext header value to your URL. The template string `${{ secrets.NOTIFY_TOKEN }}` is what gets stored in Postgres — never the plaintext. ## Migration steps 1. **Create the secret** — use the dashboard at `/dashboard/settings/secrets` or the API: ```bash curl -X POST https://api.rotor.sh/v1/secrets \ -H "Authorization: Bearer $ROTOR_API_KEY" \ -H "Content-Type: application/json" \ -d '{"name":"NOTIFY_TOKEN","value":"your-token-here"}' # Response: {"name":"NOTIFY_TOKEN","value_hint":"your-t...","created_at":"..."} ``` 2. **Update your queue's callback URL or `callback_headers`** to reference `${{ secrets.YOUR_KEY }}`: ```bash curl -X PATCH https://api.rotor.sh/v1/queues/notifications \ -H "Authorization: Bearer $ROTOR_API_KEY" \ -H "Content-Type: application/json" \ -d '{"callback_url":"https://api.example.com/notify?token=${{ secrets.NOTIFY_TOKEN }}"}' ``` 3. **Verify with a test job** — enqueue a job and check that your downstream endpoint received the resolved token (not the template string) in the URL or header. 4. **Remove the env var from Vercel/Railway** — once you've confirmed delivery works end-to-end. 5. **Redeploy if needed** — only if the env var was previously read at app boot. For most callback-token use cases (env var only needed in outbound calls), no redeploy is required. **Run a 24-hour observation window before removing the env var.** Confirm zero callback failures with the new vault-sourced value before deleting the original env var from Railway or Vercel. If Rotor's resolution fails, you can roll back instantly by re-adding the env var. ## Bulk migration script The script below reads env var names from your environment (or from `.env`), filters the ones you want to migrate, and POSTs each to `/v1/secrets`. Review the list before running — it only creates secrets you explicitly include. ```typescript Node.js (npx tsx) #!/usr/bin/env node // scripts/migrate-to-rotor-vault.ts // Usage: ROTOR_API_KEY=xxx npx tsx scripts/migrate-to-rotor-vault.ts // Add the env var names you want to migrate to KEYS_TO_MIGRATE below. const ROTOR_API_URL = process.env.ROTOR_API_URL ?? "https://api.rotor.sh"; const ROTOR_API_KEY = process.env.ROTOR_API_KEY ?? ""; const KEYS_TO_MIGRATE = [ "NOTIFY_TOKEN", "SLACK_BOT_TOKEN", "STRIPE_WEBHOOK_SECRET", // add more... ]; if (!ROTOR_API_KEY) { console.error("ROTOR_API_KEY is required"); process.exit(1); } for (const name of KEYS_TO_MIGRATE) { const value = process.env[name]; if (!value) { console.warn(`[skip] ${name} — not set in current env`); continue; } const res = await fetch(`${ROTOR_API_URL}/v1/secrets`, { method: "POST", headers: { "Authorization": `Bearer ${ROTOR_API_KEY}`, "Content-Type": "application/json", }, body: JSON.stringify({ name, value }), }); if (res.status === 201) { const data = (await res.json()) as { value_hint: string }; console.log(`[ok] ${name} → hint: ${data.value_hint}`); } else if (res.status === 409) { console.log(`[skip] ${name} — already exists in vault`); } else { const text = await res.text(); console.error(`[fail] ${name} — HTTP ${res.status}: ${text}`); process.exit(1); } } console.log("\nNext step: update your queue callback_url / callback_headers"); console.log(' PATCH /v1/queues/ {"callback_headers":{"X-Token":"${{ secrets.YOUR_KEY }}"}}'); ``` The script never logs plaintext values — it only prints the `value_hint` returned by the API (first 8 characters + `...`). ## Reference values | Field | Details | |-------|---------| | Secret name format | `[A-Z][A-Z0-9_]*` (uppercase, underscores, digits — no lowercase) | | Surfaces supported | Callback URL, callback headers, job payload fields, workflow step inputs | | Resolution timing | Dispatch-time — Rotor substitutes plaintext before sending; template string stored in Postgres/Redis | | Audit trail | Every create / rotate / delete / access logged to `audit_event` with `resource_type = 'secret'` | | Rotation | `PATCH /v1/secrets/:name` with `{"value":"new-value"}` — zero-downtime, no redeploy | | Listing | `GET /v1/secrets` returns names + hints, never plaintext | ## Audit verification After migrating, confirm the audit trail is clean: ```sql -- No plaintext values should appear in input/output blobs SELECT event_type, occurred_at, actor_id FROM audit_event WHERE resource_type = 'secret' ORDER BY occurred_at DESC LIMIT 10; ``` Expected: `secret.created` rows for each migrated key. Confirm the `input` and `output` columns do not contain the raw secret values. ## Next steps 1. **[Dashboard secrets page](https://rotor.sh/dashboard/settings/secrets)** — create, rotate, and delete secrets in the UI. 2. **[/v1/secrets API reference](/api-reference/introduction)** — full CRUD: create, list, rotate (`PATCH`), delete. 3. **[Audit log API](/api-reference/introduction)** — query `audit_event` for secret access history. 4. **[Start a Pro trial](https://rotor.sh/signup)** — 14 days, no credit card. --- ## Enterprise ### Enterprise URL: https://rotor.sh/docs/enterprise/index ## Audit Export Export your workspace's full audit trail to your data warehouse for compliance, forensic analysis, and long-term retention. - [Audit Export to Snowflake](/enterprise/audit-export-snowflake) — pipe `audit_event` rows to Snowflake via S3 staging with a daily `COPY INTO` task. - [Audit Export to S3](/enterprise/audit-export-s3) — write parquet files to your S3 bucket for Databricks, BigQuery, Redshift, or Athena ingestion. The `GET /v1/audit?format=csv` endpoint is available **now** on all plans for ad-hoc exports. Automated S3/Snowflake push is Enterprise-tier and ships in Phase 4. ## SSO / SAML **Coming in Phase 4.** Enterprise customers will be able to configure SAML 2.0 single sign-on with Okta, Azure AD, and Google Workspace. Team members log in via your IdP; Rotor maps group memberships to workspace roles. Contact [enterprise@rotor.sh](mailto:enterprise@rotor.sh) to be notified when SSO reaches general availability. ## SOC 2 Type II **Coming in Phase 4.** Rotor is pursuing SOC 2 Type II certification covering Security, Availability, and Confidentiality trust service criteria. The audit period begins once the managed-worker and dedicated-Redis infrastructure (Phase 4) is stable in production for 90 days. Expected report availability: Q4 2026. Contact [enterprise@rotor.sh](mailto:enterprise@rotor.sh) to receive a draft controls summary under NDA. ## Contact For Enterprise pricing, custom contracts, or data-residency requirements: [enterprise@rotor.sh](mailto:enterprise@rotor.sh) ### Snowflake Audit Export URL: https://rotor.sh/docs/enterprise/audit-export-snowflake Audit-event data is available **now** via `GET /v1/audit?format=csv`. Automated push to S3/Snowflake is an Enterprise-tier add-on shipping in Phase 4 — until then, customers can run the documented `pg_dump` + `COPY INTO` recipe themselves on a CSV export. ## Why Snowflake The `audit_event` table is append-only and partitioned by month — a perfect fit for Snowflake's billing model. Rotor generates structured JSON in the `metadata` column (JSONB in Postgres, VARIANT in Snowflake), which Snowflake queries natively with `metadata:key` dot-notation. Key advantages of routing audit data through Snowflake: - **Cost-efficient long-term storage.** Monthly parquet files are ~10× smaller than row-based exports. A 10M-row workspace accumulates ~200 MB parquet/month vs 2 GB raw CSV. - **Clustered tables.** Snowflake's automatic clustering on `occurred_at` keeps query costs predictable for time-range scans regardless of table size. - **VARIANT for metadata.** No schema migration is needed as Rotor adds fields to the metadata payload — Snowflake's semi-structured storage handles schema evolution transparently. ## Pipeline Overview ``` Postgres (audit_event, partitioned by month) │ │ pg_dump --format=custom → parquet conversion (or direct CSV → S3 → Snowflake) ▼ S3 bucket (customer-managed or Rotor-managed) s3://bucket/rotor-audit/workspace=/year=YYYY/month=MM/ │ │ Snowflake COPY INTO (scheduled daily task, X-SMALL warehouse) ▼ Snowflake table: rotor_audit_events ``` **Data residency note:** Enterprise customers may require that Rotor writes parquet files to a customer-owned S3 bucket, and that no data transits Rotor-managed S3. See the [Customer-Managed Bucket Variant](#customer-managed-s3-bucket-variant) section below. ## Schema The `audit_event` Postgres table maps to Snowflake as follows: | Postgres column | Postgres type | Snowflake type | Notes | |------------------|------------------|------------------|--------------------------------------------| | `id` | UUID | VARCHAR(36) | Hyphen-separated UUID string | | `workspace_id` | TEXT | VARCHAR(64) | Foreign key; use for row-level security | | `event_type` | TEXT | VARCHAR(128) | e.g. `webhook.created`, `job.completed` | | `actor_id` | TEXT | VARCHAR(64) | API key ID or system actor | | `resource_id` | TEXT | VARCHAR(256) | Affected resource ID (nullable) | | `metadata` | JSONB | VARIANT | Event-specific payload; query with `:` | | `occurred_at` | TIMESTAMPTZ | TIMESTAMP_TZ(9) | Cluster key; used for COPY INTO partition | | `ip_address` | TEXT | VARCHAR(45) | IPv4 or IPv6 (nullable) | ## S3 Stage Setup Create a Snowflake external stage pointing to your S3 bucket. Replace ``, ``, and `` with your values: ```sql -- Create storage integration (run once as ACCOUNTADMIN) CREATE OR REPLACE STORAGE INTEGRATION rotor_audit_s3 TYPE = EXTERNAL_STAGE STORAGE_PROVIDER = 'S3' ENABLED = TRUE STORAGE_AWS_ROLE_ARN = '' STORAGE_ALLOWED_LOCATIONS = ('s3:///rotor-audit/'); -- Retrieve the Snowflake AWS account ID and external ID for IAM trust policy DESC INTEGRATION rotor_audit_s3; -- Create the stage CREATE OR REPLACE STAGE rotor_audit STORAGE_INTEGRATION = rotor_audit_s3 URL = 's3:///rotor-audit/' FILE_FORMAT = (TYPE = 'PARQUET'); ``` ### IAM Role (trust policy) Attach this trust policy to the IAM role you pass as `STORAGE_AWS_ROLE_ARN`. Replace `` and `` with the values from `DESC INTEGRATION`: ```json { "Version": "2012-10-17", "Statement": [ { "Effect": "Allow", "Principal": { "AWS": "arn:aws:iam:::root" }, "Action": "sts:AssumeRole", "Condition": { "StringEquals": { "sts:ExternalId": "" } } } ] } ``` ### IAM permissions policy The role needs minimal S3 access on the audit prefix: ```json { "Version": "2012-10-17", "Statement": [ { "Effect": "Allow", "Action": ["s3:GetObject", "s3:ListBucket"], "Resource": [ "arn:aws:s3:::", "arn:aws:s3:::/rotor-audit/*" ] } ] } ``` ## COPY INTO Command Run this after each daily parquet drop to load the previous day's partition: ```sql COPY INTO rotor_audit_events FROM @rotor_audit/workspace=/year=2026/month=04/ FILE_FORMAT = (TYPE = 'PARQUET') MATCH_BY_COLUMN_NAME = CASE_INSENSITIVE ON_ERROR = 'CONTINUE'; ``` For a full historical backfill, omit the date-prefix to copy all partitions: ```sql COPY INTO rotor_audit_events FROM @rotor_audit/workspace=/ FILE_FORMAT = (TYPE = 'PARQUET') MATCH_BY_COLUMN_NAME = CASE_INSENSITIVE ON_ERROR = 'CONTINUE'; ``` ## Scheduling (Daily Snowflake Task) Schedule the COPY INTO to run automatically each night. This keeps warehouse size at X-SMALL since parquet files for a typical 10M-row workspace are ~20 MB/day: ```sql CREATE OR REPLACE TASK load_rotor_audit_daily WAREHOUSE = 'COMPUTE_WH' SCHEDULE = 'USING CRON 0 3 * * * UTC' -- 03:00 UTC daily AS COPY INTO rotor_audit_events FROM @rotor_audit/workspace=/ FILE_FORMAT = (TYPE = 'PARQUET') MATCH_BY_COLUMN_NAME = CASE_INSENSITIVE ON_ERROR = 'CONTINUE'; ALTER TASK load_rotor_audit_daily RESUME; ``` ### Cost estimate | Workspace size | Parquet/day | Warehouse time | Approx cost/month | |--------------------|-------------|----------------|-------------------| | 1M events/month | ~2 MB | ~10s X-SMALL | under $1 | | 10M events/month | ~20 MB | ~30s X-SMALL | ~$2 | | 100M events/month | ~200 MB | ~3 min X-SMALL | ~$15 | Prices based on Snowflake on-demand at $3.00/credit; X-SMALL = 1 credit/hour. ## Customer-Managed S3 Bucket Variant Enterprise customers with data-residency requirements can configure Rotor to write parquet files directly to a customer-owned S3 bucket. In this mode: 1. Rotor assumes a customer-provided IAM role via `sts:AssumeRole` (cross-account). 2. The customer S3 bucket receives parquet files at the standard Hive-style prefix: `s3://customer-bucket/rotor-audit/workspace=/year=YYYY/month=MM/day=DD/`. 3. Rotor never reads the data back — write-only access via `s3:PutObject`. 4. The customer runs their own Snowflake COPY INTO on their own schedule. To enable, contact [enterprise@rotor.sh](mailto:enterprise@rotor.sh) with your AWS account ID and the ARN of the role you want Rotor to assume. ## Verification After each COPY INTO, verify the row count matches the Postgres source: ```sql -- Snowflake: count for a specific day SELECT COUNT(*) FROM rotor_audit_events WHERE occurred_at::DATE = '2026-04-13' AND workspace_id = ''; ``` ```sql -- Postgres: equivalent count (run against your audit replica) SELECT COUNT(*) FROM audit_event WHERE occurred_at::DATE = '2026-04-13' AND workspace_id = ''; ``` Row counts should match. A discrepancy of ≤0.01% is acceptable (COPY INTO `ON_ERROR=CONTINUE` skips malformed rows; these are logged in `COPY_HISTORY`). ```sql -- Check for skipped rows SELECT * FROM TABLE(INFORMATION_SCHEMA.COPY_HISTORY( TABLE_NAME => 'ROTOR_AUDIT_EVENTS', START_TIME => DATEADD('hour', -25, CURRENT_TIMESTAMP()) )) WHERE STATUS = 'PARTIALLY_LOADED'; ``` --- *Reference: Snowflake audit log to S3 pipeline pattern — see [Snowflake docs: Loading from S3](https://docs.snowflake.com/en/user-guide/data-load-s3).* ### S3 Audit Export URL: https://rotor.sh/docs/enterprise/audit-export-s3 Audit-event data is available **now** via `GET /v1/audit?format=csv`. Automated push to S3 is an Enterprise-tier add-on shipping in Phase 4 — until then, customers can download CSV exports from `GET /v1/audit?format=csv` and upload them manually to their S3 bucket using the schema and prefix layout documented below. ## Output Schema Parquet files produced by Rotor follow this column schema, derived from the `audit_event` Postgres table (shipped in plan 02-01): | Column | Parquet type | Nullable | Notes | |----------------|---------------------------|----------|--------------------------------------------| | `id` | STRING | No | UUID (hyphen-separated) | | `workspace_id` | STRING | No | Workspace key (`ws_...`) | | `event_type` | STRING | No | e.g. `job.completed`, `guardrail.blocked` | | `actor_id` | STRING | No | API key ID or `system` | | `resource_id` | STRING | Yes | Affected resource ID (job ID, webhook ID) | | `metadata` | STRING (JSON-encoded) | Yes | Event-specific payload as JSON string | | `occurred_at` | INT96 (TIMESTAMP_MICROS) | No | UTC; partition key | | `ip_address` | STRING | Yes | IPv4 or IPv6 | `metadata` is written as a JSON string (not a nested struct) to maximise compatibility across Databricks, BigQuery, and Redshift, all of which handle JSON strings with native functions (`JSON_VALUE`, `from_json`, `JSON_EXTRACT_PATH_TEXT`). ## Bucket Layout Files are written in **Hive-style partitioning** for compatibility with Athena, BigQuery wildcard queries, and Databricks Auto Loader: ``` s3://customer-bucket/rotor-audit/ workspace=ws_abc123/ year=2026/ month=04/ day=13/ part-00000.parquet part-00001.parquet _checksum.sha256 ← row count + sha256 of all parts ``` Each `part-NNNNN.parquet` file is at most 128 MB uncompressed (snappy-compressed in practice). A `_checksum.sha256` file is written atomically after all parts for a day complete, enabling idempotent detection of complete vs. in-progress drops. ## Schedule By default, Rotor exports the previous day's events at **02:00 UTC** each day. The schedule is configurable per workspace (Enterprise tier). Contact [enterprise@rotor.sh](mailto:enterprise@rotor.sh) to change frequency or timezone. ## IAM Setup Rotor assumes a customer-provided IAM role to write to your S3 bucket. The role needs only `s3:PutObject` on the audit prefix — Rotor never reads from your bucket: ### Step 1: Create the IAM role Create a new IAM role with a trust policy that allows Rotor's AWS account to assume it: ```json { "Version": "2012-10-17", "Statement": [ { "Effect": "Allow", "Principal": { "AWS": "arn:aws:iam::123456789012:root" }, "Action": "sts:AssumeRole", "Condition": { "StringEquals": { "sts:ExternalId": "" } } } ] } ``` Replace `123456789012` with Rotor's AWS account ID (provided during Enterprise onboarding). The `ExternalId` is unique per workspace and is provided by Rotor to prevent confused-deputy attacks. ### Step 2: Attach a minimal permissions policy ```json { "Version": "2012-10-17", "Statement": [ { "Effect": "Allow", "Action": [ "s3:PutObject", "s3:AbortMultipartUpload", "s3:ListMultipartUploadParts" ], "Resource": "arn:aws:s3:::customer-bucket/rotor-audit/*" } ] } ``` ### Step 3: Provide Rotor with the role ARN Contact [enterprise@rotor.sh](mailto:enterprise@rotor.sh) with your: - AWS account ID - IAM role ARN - S3 bucket name and region - Preferred export schedule (default: daily at 02:00 UTC) ## Athena Example Query Once files are in S3, register a Glue table pointing to the Hive partition layout: ```sql CREATE EXTERNAL TABLE rotor_audit ( id STRING, workspace_id STRING, event_type STRING, actor_id STRING, resource_id STRING, metadata STRING, occurred_at TIMESTAMP, ip_address STRING ) PARTITIONED BY (year STRING, month STRING, day STRING) STORED AS PARQUET LOCATION 's3://customer-bucket/rotor-audit/workspace=ws_abc123/' TBLPROPERTIES ('parquet.compression'='SNAPPY'); -- Load partitions MSCK REPAIR TABLE rotor_audit; ``` Run queries filtered to a specific event type: ```sql -- Find all guardrail blocks for a workspace in April SELECT occurred_at, actor_id, JSON_EXTRACT_SCALAR(metadata, '$.reason') AS reason FROM rotor_audit WHERE workspace = 'ws_abc123' AND event_type = 'guardrail.blocked' AND year = '2026' AND month = '04' ORDER BY occurred_at DESC LIMIT 100; ``` Cost estimate: Athena charges $5/TB scanned. A 10M-row workspace generates ~20 MB parquet/day; scanning one month of data costs approximately $0.003. ## BigQuery Transfer Use [BigQuery Data Transfer Service](https://cloud.google.com/bigquery/docs/s3-transfer) to ingest S3 parquet files into BigQuery automatically: 1. In BigQuery, go to **Data Transfers** → **Create Transfer**. 2. Select **Amazon S3** as the source. 3. Set the source URI pattern: `s3://customer-bucket/rotor-audit/workspace=/year=*/month=*/day=*/*.parquet` 4. Configure the service account with cross-account S3 read access. 5. Set schedule to **Daily** starting at 04:00 UTC (after Rotor's 02:00 UTC export completes). BigQuery automatically infers the Parquet schema. The `metadata` column is ingested as `STRING` and can be queried with `JSON_VALUE(metadata, '$.reason')`. ## Databricks Auto Loader Databricks Auto Loader natively supports S3 + Hive partitioning: ```python df = ( spark.readStream .format("cloudFiles") .option("cloudFiles.format", "parquet") .option("cloudFiles.schemaLocation", "/mnt/checkpoints/rotor-audit-schema") .load("s3://customer-bucket/rotor-audit/workspace=ws_abc123/") ) df.writeStream \ .format("delta") \ .option("checkpointLocation", "/mnt/checkpoints/rotor-audit") \ .table("rotor_audit_events") ``` Auto Loader processes only new files on each trigger, making it efficient for daily batch drops as well as near-real-time streaming if Rotor is configured for hourly exports. ## Redshift Spectrum For Redshift customers, use Redshift Spectrum to query directly from S3 without ingestion: ```sql -- Create external schema pointing to your S3 audit data CREATE EXTERNAL SCHEMA rotor_audit FROM DATA CATALOG DATABASE 'rotor_audit_db' IAM_ROLE 'arn:aws:iam:::role/RedshiftSpectrumRole'; -- Query directly SELECT event_type, COUNT(*) as cnt FROM rotor_audit.audit_events WHERE workspace_id = 'ws_abc123' AND occurred_at > GETDATE() - INTERVAL '30 days' GROUP BY 1 ORDER BY cnt DESC; ``` ## Verification After each daily export, verify completeness using the `_checksum.sha256` file: ```bash # Download checksum file aws s3 cp \ s3://customer-bucket/rotor-audit/workspace=ws_abc123/year=2026/month=04/day=13/_checksum.sha256 \ ./checksum.sha256 # File format: " " cat checksum.sha256 # Example: 142857 a3f9... # Compare with Postgres source count psql $DATABASE_URL -c " SELECT COUNT(*) FROM audit_event WHERE workspace_id = 'ws_abc123' AND occurred_at::date = '2026-04-13' " ``` Row counts should match. Contact [enterprise@rotor.sh](mailto:enterprise@rotor.sh) if you observe consistent discrepancies greater than 0.01%. --- ## Operators ### Admin MCP Server URL: https://rotor.sh/docs/operators/admin-mcp The **Rotor admin MCP server** lets a Claude Code session on your laptop inspect and triage a Rotor deployment in natural language. No SSH, no `redis-cli`. This is separate from the customer-facing Rotor MCP at `rotor.sh/mcp` (which uses `rt_ws_` / `rt_team_` keys). The admin MCP is operator-only and gated by a single `ADMIN_TOKEN` shared with the API. ## Install From the rotor monorepo root: ```bash pnpm --filter=@rotor/mcp-server build ``` ```bash claude mcp add rotor-admin -- node $(pwd)/packages/mcp-server/dist/index.js ``` Add to your shell profile or your Claude Code MCP config: ```bash export ROTOR_ADMIN_TOKEN="" export ROTOR_API_URL="https://api.rotor.sh" # optional override ``` Open a fresh Claude Code session and ask "What workspaces are running on Rotor?" — Claude should call `rotor_list_workspaces` and render the result. ## Available tools | Tool | Destructive | Description | | ------------------------------ | ----------- | ---------------------------------------------------------- | | `rotor_list_workspaces` | — | List all workspaces with plan and creation date | | `rotor_get_workspace_queues` | — | Queue depth per state for every queue in a workspace | | `rotor_get_workspace_schedules`| — | Schedules with BullMQ state and `drift` flag | | `rotor_inspect_job` | — | Full job detail (data, return value, failure reason) | | `rotor_get_dlq` | — | DLQ depth and recent failed jobs | | `rotor_replay_dlq` | yes | Idempotently re-enqueue DLQ jobs (deterministic IDs) | | `rotor_fire_schedule_now` | yes | Out-of-band schedule execution (does not reset cron) | Destructive tools require `confirm: true` in the call arguments. Claude Code will ask you to confirm before invoking them. ## Idempotency guarantees `rotor_replay_dlq` uses BullMQ's job-id uniqueness to dedup replays. Calling it twice with the same parameters produces the same `replay-${originalId}` IDs and BullMQ silently no-ops the second `addBulk`. `rotor_fire_schedule_now` enqueues a one-shot job — it does **not** call `upsertJobScheduler`. The cron's normal `next_run_at` is unchanged. ## Troubleshooting See the package README at `packages/mcp-server/README.md` for env var and 403 troubleshooting. --- ## Trust ### Security URL: https://rotor.sh/docs/security/index # Security SOC 2 Type II observation window started 2026-04-20; target report ~2026-10-20. ## Posture - **SOC 2 Type II** in observation (start: 2026-04-20; target report: ~2026-10-20) - All production traffic **TLS 1.3+** - Data at rest: **AES-256** (Supabase Postgres, AWS S3) - Secrets: Railway vault + AWS KMS; rotated on incident + annually - Penetration test: annual (next: 2026-10) - Vanta-monitored: Supabase, Fly.io, AWS, GitHub, Slack ## Sub-processors | Sub-processor | Purpose | Data | Region | | --- | --- | --- | --- | | Supabase | Auth, Postgres | PII, audit\_event, team\_member | us-east-1 | | Railway | Redis, application hosting | transient queue state | us-west2 / us-east-1 | | Fly.io | Managed workers (Enterprise) | customer compute | multi-region | | Stripe | Billing | customer\_id, invoice | US | | Anthropic | Brand-tone LLM judge | outbound payload samples | US | | Slack | Approval callbacks | approval metadata | US | | AWS | S3 staging (audit export) | Parquet exports | us-east-1 | | Sentry | Error tracking | stack traces, user\_id | US | ## Data Handling - **Encryption at rest**: AES-256-GCM for customer secrets (webhook, callback, mTLS CA) and audit-export credentials - **Encryption in transit**: TLS 1.3+ - **Retention**: plan-based — Free 7d · Pro 30d · Team 90d · Enterprise 365d+ - **PII redaction**: applied before worker receives payload (see Guardrail Engine) - **Audit logs**: partitioned by month, immutable, 1-year retention minimum ## Incident Response We commit to disclosing confirmed breaches affecting customer data **within 24 hours** of confirmation. To report a vulnerability or security incident: - **Email**: security@rotor.sh - We acknowledge reports within 24 hours - Critical issues are remediated within 7 days ## Policies The following policies govern our security program: - [Acceptable Use Policy](/compliance/acceptable-use-policy) - [Access Control Policy](/compliance/access-control-policy) - [Change Management Policy](/compliance/change-management-policy) - [Incident Response Policy](/compliance/incident-response-policy) - [Data Retention Policy](/compliance/data-retention-policy) - [Vendor Management Policy](/compliance/vendor-management-policy) - [Risk Assessment Policy](/compliance/risk-assessment-policy) ### compliance/acceptable-use-policy URL: https://rotor.sh/docs/compliance/acceptable-use-policy # Acceptable Use Policy Status: ACTIVE Owner: Daan (daan@shyft.ai) Effective: 2026-04-20 Last reviewed: 2026-04-20 Next review: 2026-10-20 ## Purpose Define expected behavior for all users of rotor.sh services — employees, contractors, and customers — to protect the platform's integrity, security, and other users' quality of service. ## Scope All Rotor employees, contractors, and end customers (including free-tier, Pro, Team, and Enterprise plans). ## Policy ### 1. Authorized Use Only Users may only interact with rotor.sh APIs, SDKs, MCP tools, and CLI using credentials they have legitimately obtained. Sharing API keys outside one's organization is prohibited. ### 2. Tenant Isolation Respect Users must not attempt to circumvent BullMQ tenant isolation (the `{ws_}` prefix boundary). Probing, enumerating, or accessing another workspace's queues, jobs, or audit events is prohibited regardless of technical feasibility. ### 3. PII and Sensitive Data Handling Payloads must not contain unencrypted PII outside the supported PII-redaction path (Guardrail Engine). Users who need to process PII must enable PII redaction in their workspace's guardrail config before enqueuing jobs containing personal data. ### 4. Rate and Quota Compliance Users must not attempt to circumvent per-plan job-execution quotas (Free: 10k/mo, Pro: 100k/mo, Team: 1M/mo). Artificially spreading load across multiple free-tier workspaces to exceed limits is prohibited and will result in account termination. ### 5. No Unauthorized Scraping or Reverse Engineering Users must not attempt to reverse-engineer rotor.sh internal APIs, Redis key structures, or BullMQ Lua scripts beyond what is documented in the public API reference. ### 6. No Abuse of the Approval Flow The approvals system is intended for genuine human-in-the-loop oversight. Automated scripts that auto-approve all jobs to bypass guardrail review violate the spirit of the approval system and may trigger account review. ### 7. No Malicious Payloads Job payloads must not contain instructions intended to exploit Rotor's infrastructure, other customers' handlers, or downstream callback recipients. This includes code injection, prompt injection against the brand-tone LLM judge, and SSRF payloads targeting callback URLs. ### 8. Compliance with Laws Users are responsible for ensuring their use of rotor.sh complies with applicable laws and regulations, including data protection laws (GDPR, CCPA) and export control regulations. ## Enforcement Violations may result in: - Immediate API key revocation - Workspace suspension (BIL-06 kill-switch) - Account termination without refund - Legal action where warranted Suspected violations are logged as `compliance.aup_violation` audit events and reviewed by the security team. ## Review Cadence This policy is reviewed annually. Next review: 2026-10-20. ### compliance/access-control-policy URL: https://rotor.sh/docs/compliance/access-control-policy # Access Control Policy Status: ACTIVE Owner: Daan (daan@shyft.ai) Effective: 2026-04-20 Last reviewed: 2026-04-20 Next review: 2026-10-20 ## Purpose Define how access to rotor.sh systems, data, and administrative functions is granted, maintained, and revoked — covering both internal team access and customer-facing RBAC. ## Scope All Rotor employees, contractors, and enterprise customers using SSO/SAML + RBAC features. ## Policy ### 1. Customer-Facing RBAC (ENT-01) rotor.sh implements a fixed four-role model: - **admin**: Full workspace control; manages members, API keys, SSO config, billing. - **developer**: Manages queues, jobs, schedules, guardrails, webhooks; cannot manage members. - **approver**: Approves or rejects pending jobs; cannot manage queues or members. - **viewer**: Read-only access to runs, audit events, and metrics. Role assignments follow least-privilege: new invites default to `viewer` unless explicitly set by an admin. The `approver` capability is additive — a developer's team member record may include `approver` in their capabilities array. ### 2. API Key Hierarchy API keys follow a three-tier hierarchy (AUTH-03): - **Team keys** (`rtr_t_*`): Cross-workspace access; issued only to admins. - **Workspace keys** (`rtr_w_*`): Single-workspace access; issued by admins/developers. - **Queue keys** (`rtr_q_*`): Single-queue access; issued by admins/developers. Keys expire after 90 days unless extended by an admin. Rotated keys (via `POST /v1/api-keys/:id/rotate`) inherit the parent key's expiry. ### 3. SSO / SAML (ENT-01) Enterprise customers may configure SAML 2.0 SSO via Supabase Auth. SAML role assertions (`raw_app_meta_data.role`) override invite-time roles where the asserted role is a valid four-role value. The Supabase SAML provider toggle must be enabled per-project by the Rotor operator. ### 4. Internal Team Access Rotor employees access production infrastructure via: - Supabase Dashboard: email + TOTP 2FA required. - Railway Dashboard: SSO via GitHub; 2FA required on GitHub. - Fly.io Dashboard: email + TOTP 2FA required. - AWS Console: IAM roles with MFA enforcement; no root account usage. New employee access is provisioned on their first day; deprovisioned within 24 hours of offboarding. ### 5. Secret Rotation - Webhook signing secrets: rotated via `POST /v1/webhooks/:id/rotate-secret`. - Callback signing secrets: rotated via `PATCH /v1/queues/:id` (`rotate_callback_secret: true`). - WEBHOOK_SECRET_ENCRYPTION_KEY: rotated annually and on any credential-exposure incident. - All secrets stored in Railway vault; never in `.env` files committed to version control. ### 6. Orphan / Legacy Keys Orphan API key rows (null or broken `team_member_id`) default to `role=viewer` (least-privilege). `scripts/audit-orphan-api-keys.sh` runs as a pre-deploy CI gate before production cutover. ## Review Cadence Reviewed annually. Next review: 2026-10-20. ### compliance/change-management-policy URL: https://rotor.sh/docs/compliance/change-management-policy # Change Management Policy Status: ACTIVE Owner: Daan (daan@shyft.ai) Effective: 2026-04-20 Last reviewed: 2026-04-20 Next review: 2026-10-20 ## Purpose Ensure that changes to rotor.sh production systems are reviewed, tested, and deployed in a controlled manner to minimize risk and maintain availability. ## Scope All changes to production infrastructure: code deployments (apps/api, apps/worker, apps/www), database migrations, Railway/Fly configuration changes, and infrastructure-as-code changes. ## Policy ### 1. Pull Request Review All code changes require at least one approving review from a Rotor team member before merge. Self-approval is prohibited except for hotfixes under the emergency procedure below. ### 2. CI Gates (must pass before merge) The following CI checks must pass on every PR: - `pnpm typecheck` — TypeScript compilation across all packages - `pnpm test` — full unit test suite (≥95% pass rate; flaky tests are fixed within 24h) - `bash scripts/audit-prefix.sh` — BullMQ prefix isolation invariant - `bash scripts/audit-compliance-docs.sh` — no PLACEHOLDER compliance docs - `bash scripts/audit-openapi.sh` — all /v1/ routes documented ### 3. Deploy Cadence - **Staging**: automatic deploy on merge to `main` via Railway CI - **Production**: manual promotion from staging after smoke tests pass (minimum 30 min soak) - **Database migrations**: applied via `pnpm db:migrate` before worker/API restart; rollback script required in PR description for all DDL changes ### 4. Feature Flags New features that affect billing, quotas, or enterprise behavior must be gated behind a feature flag in the `feature_flags` table. This allows instant rollback without a code deployment. ### 5. Emergency Hotfix Procedure For critical production incidents (data loss, auth bypass, billing error): 1. Create a hotfix branch directly from the last production tag. 2. Implement the minimal fix; add a regression test. 3. Get async review from at least one team member (Slack DM acceptable for P0). 4. Deploy directly to production; backfill the PR review within 2 business days. 5. File an incident report within 24 hours. ### 6. Secrets and Environment Variables Environment variable changes (Railway vault) must be: - Documented in the PR description with the variable name (not value). - Applied in staging first, then production. - Never committed to version control. ## Review Cadence Reviewed annually. Next review: 2026-10-20. ### compliance/incident-response-policy URL: https://rotor.sh/docs/compliance/incident-response-policy # Incident Response Policy Status: ACTIVE Owner: Daan (daan@shyft.ai) Effective: 2026-04-20 Last reviewed: 2026-04-20 Next review: 2026-10-20 ## Purpose Define the process for detecting, responding to, and communicating security incidents and service disruptions affecting rotor.sh customers. ## Scope All security incidents (data breaches, credential exposure, unauthorized access) and significant service disruptions (>1h downtime, data loss, billing errors). ## Definitions - **P0 — Critical**: Data breach, auth bypass, billing fraud, or complete service outage. Response starts within 1 hour. - **P1 — High**: Partial service outage, significant performance degradation (>2x latency), or credential exposure. Response starts within 4 hours. - **P2 — Medium**: Non-critical bug affecting a subset of customers. Response within 24 hours. - **P3 — Low**: Cosmetic issues, documentation errors. Response within 72 hours. ## Incident Response Process ### 1. Detection Incidents may be detected via: - Sentry error alerts (apps/api, apps/worker) - Railway service health monitors - Customer reports to support@rotor.sh or security@rotor.sh - Vanta compliance monitoring alerts - Internal monitoring (BullMQ job failure spikes, Postgres connection errors) ### 2. Triage (within 1h for P0, 4h for P1) 1. Confirm the incident is real (not a false positive). 2. Classify severity (P0–P3). 3. Identify affected scope: which workspaces, customers, data types. 4. Open an incident Slack channel: `#incident-YYYY-MM-DD-brief-description`. 5. Assign an incident commander (IC). ### 3. Containment - For credential exposure: rotate affected secrets immediately (WEBHOOK_SECRET_ENCRYPTION_KEY, Supabase service key, Railway tokens). - For unauthorized access: revoke affected API keys / sessions via Supabase admin. - For service disruption: engage Railway/Fly support and activate the kill-switch (BIL-06) for affected workspaces if data integrity is at risk. ### 4. Eradication and Recovery - Identify root cause; implement the minimum fix. - Apply the fix via the emergency hotfix procedure (see Change Management Policy). - Validate fix in staging before applying to production. - Restore service; verify affected customers can access their data. ### 5. Customer Communication (P0 / P1 — 24h disclosure commitment) We commit to disclosing confirmed breaches affecting customer data **within 24 hours** of confirmation. Communication channels: - Email to affected workspace admins (sourced from Supabase team_member table). - Status page update at status.rotor.sh. - For Enterprise: direct phone/Slack contact if account has a CSM. Disclosure must include: - What happened and when (UTC timestamps). - What data was affected. - What we have done to contain it. - What customers should do (e.g. rotate API keys). ### 6. Post-Incident Review Within 5 business days of P0/P1 resolution: - Write a blameless post-mortem (5 Whys format). - Document timeline, contributing factors, and action items. - Share with affected customers on request. - Add action items to engineering backlog with due dates. ## Escalation Contacts | Role | Contact | |------|---------| | Incident Commander | daan@shyft.ai | | Supabase support | support.supabase.com | | Railway support | railway.app/help | | Fly.io support | fly.io/docs/support | | Sentry on-call | via Sentry alerting rules | ## Review Cadence Reviewed annually. Next review: 2026-10-20. ### compliance/data-retention-policy URL: https://rotor.sh/docs/compliance/data-retention-policy # Data Retention Policy Status: ACTIVE Owner: Daan (daan@shyft.ai) Effective: 2026-04-20 Last reviewed: 2026-04-20 Next review: 2026-10-20 ## Purpose Define how long rotor.sh retains customer data and system logs, and how data is deleted when it is no longer needed. ## Scope All customer data stored by rotor.sh: job history (Redis + Postgres), audit events, webhook delivery logs, billing records, and team/user PII. ## Retention Schedule | Data Type | Free | Pro | Team | Enterprise | |-----------|------|-----|------|------------| | Active jobs (Redis) | 7 days after completion | 30 days | 90 days | 365 days | | job_history (Postgres archive) | 7 days | 30 days | 90 days | 365 days | | audit_event | 1 year | 1 year | 1 year | 1 year (+ custom on request) | | Webhook delivery logs | Same as job history | Same | Same | Same | | Billing records (Stripe) | Indefinite (legal) | Indefinite | Indefinite | Indefinite | | Team / user PII | Until account deletion + 30 days | Same | Same | Same | | COGS daily metrics | 2 years | N/A | N/A | 2 years | ### Notes - **Retention enforcement**: The nightly history archiver (`0 2 * * *` UTC) moves completed/failed jobs from Redis to the `job_history` Postgres table according to the workspace's plan retention. After the retention period, jobs are permanently deleted from the archive. - **Partition cleanup**: Monthly Postgres partitions for `audit_event` and `job_history` older than the applicable retention period are dropped by the monthly partition preflight job. - **Enterprise custom retention**: Enterprise customers with contractual requirements beyond 365 days should contact enterprise@rotor.sh to arrange a custom export pipeline. ## PII Handling - PII is redacted at the Guardrail Engine layer before job payloads reach workers (when `pii_redaction_enabled = true` in the workspace guardrail config). - Auth PII (email, user_id) stored in Supabase `auth.users` is deleted within 30 days of account deletion via Supabase auth admin API. - Stripe billing PII is retained per Stripe's own retention policy (Stripe is the controller for payment card data; Rotor is a processor). ## Data Deletion Requests Customers may request deletion of their workspace data by emailing privacy@rotor.sh. Rotor will acknowledge within 5 business days and complete deletion within 30 days (or within the timeframe required by applicable law). Deletion covers: - All jobs, job history, and audit events for the workspace - Webhook endpoints and delivery history - COGS daily metrics (if Enterprise) - Team members and API keys Deletion does NOT cover: - Billing records required for legal/tax purposes (retained 7 years) - Anonymized aggregate metrics that cannot identify the workspace ## Review Cadence Reviewed annually. Next review: 2026-10-20. ### compliance/vendor-management-policy URL: https://rotor.sh/docs/compliance/vendor-management-policy # Vendor Management Policy Status: ACTIVE Owner: Daan (daan@shyft.ai) Effective: 2026-04-20 Last reviewed: 2026-04-20 Next review: 2026-10-20 ## Purpose Define how rotor.sh evaluates, onboards, monitors, and offboards third-party vendors (sub-processors) who handle customer data or provide critical infrastructure. ## Scope All third-party services that (a) store or process customer data, or (b) are in the critical path of rotor.sh production availability. ## Current Sub-processors | Vendor | Purpose | Data Processed | Region | SOC 2 / Compliance | |--------|---------|----------------|--------|-------------------| | Supabase | Auth, Postgres | PII, audit_event, team_member | us-east-1 | SOC 2 Type II | | Railway | Redis, app hosting | transient queue state | us-west2 / us-east-1 | SOC 2 (in progress) | | Fly.io | Managed workers (Enterprise) | customer compute | multi-region | SOC 2 Type II (Compliance package) | | Stripe | Billing | customer_id, invoice | US | PCI DSS Level 1, SOC 2 | | Anthropic | Brand-tone LLM judge | outbound payload samples | US | Trust & Safety policy | | Slack | Approval callbacks | approval metadata | US | SOC 2 Type II | | AWS | S3 staging (audit export) | Parquet exports | us-east-1 | SOC 2 Type II, ISO 27001 | | Sentry | Error tracking | stack traces, user_id | US | SOC 2 Type II | ## Vendor Evaluation Criteria New sub-processors must be evaluated against: 1. **Security posture**: SOC 2 Type II report (or equivalent ISO 27001 / HIPAA BAA if applicable) 2. **Data residency**: Does the vendor support US-only data residency where required by Enterprise customers? 3. **Incident response SLA**: Does the vendor commit to notification within 72 hours of a data breach? 4. **Business continuity**: What is the vendor's documented uptime SLA and DR capability? 5. **Data deletion**: Does the vendor support deletion within 30 days of contract termination? ## Vendor Monitoring ### Vanta-Integrated Vendors (automated evidence collection) The following vendors have active Vanta integrations and are continuously monitored: - Supabase (OAuth integration) - Fly.io (Compliance package + Vanta OAuth) - AWS (native Vanta integration) - GitHub (native Vanta integration) - Slack (native Vanta integration) ### Manual Evidence Cadence (Railway) Railway does not have a Vanta integration. Manual evidence is collected monthly per `docs/compliance/vanta-manual-evidence.md`: - Railway Team Members list screenshot - Railway Project/Environment configuration screenshot - Railway Billing invoice screenshot ### Annual Review All sub-processors are reviewed annually for: - Updated SOC 2 / compliance reports - Changes to data processing terms (DPA updates) - Service changes that affect data handling ## Sub-processor Change Process Adding or removing a sub-processor requires: 1. Evaluation against the criteria above. 2. PR updating this document and `docs/security/index.mdx`. 3. If the vendor processes customer PII: update the customer-facing Data Processing Agreement (DPA) and notify Enterprise customers with 30 days notice. ## Vendor Offboarding On contract termination: 1. Revoke all credentials and API tokens within 24 hours. 2. Request data deletion confirmation within 30 days. 3. Update `docs/compliance/vendor-management-policy.md` and `docs/security/index.mdx`. ## Review Cadence Reviewed annually. Next review: 2026-10-20. ### compliance/risk-assessment-policy URL: https://rotor.sh/docs/compliance/risk-assessment-policy # Risk Assessment Policy Status: ACTIVE Owner: Daan (daan@shyft.ai) Effective: 2026-04-20 Last reviewed: 2026-04-20 Next review: 2026-10-20 ## Purpose Define how rotor.sh identifies, assesses, prioritizes, and mitigates risks to the confidentiality, integrity, and availability of customer data and platform services. ## Scope All risks affecting rotor.sh production systems, customer data, and business operations. ## Risk Assessment Cadence | Activity | Frequency | Owner | |----------|-----------|-------| | Full risk review | Quarterly | Daan | | New feature risk assessment | Per feature (in PR description) | Feature author | | Vendor risk review | Annual | Daan | | Penetration test | Annual | External firm | ## Risk Categories ### 1. Infrastructure Risks | Risk | Likelihood | Impact | Mitigation | |------|-----------|--------|-----------| | Redis maxmemory eviction causing queue corruption | Low (guarded by assertNoEviction) | High | FND-07 startup guard; Railway noeviction confirmed (Phase 0) | | Railway service outage | Medium | High | Railway 99.9% SLA; BullMQ persists jobs; worker auto-restarts | | Supabase outage | Low | High | Supabase 99.9% SLA; read-through cache on resolver; jobs still queue in Redis | | Fly.io managed worker failure | Medium | Medium (Enterprise only) | Autoscaler retries provisioning; BullMQ job retries with backoff | ### 2. Security Risks | Risk | Likelihood | Impact | Mitigation | |------|-----------|--------|-----------| | Credential leak (API key, Railway secret) | Low | Critical | Railway vault; key rotation on detection; audit log; Sentry alerts | | DDoS on public API | Medium | High | Railway load balancer; rate limiting (manual Redis INCR+PEXPIRE); quota middleware | | SSRF via callback URLs | Low | High | validateCallbackUrl DNS check; SSRF guard in delivery worker | | Tenant cross-contamination (Redis prefix escape) | Very Low | Critical | BullMQ prefix audit (audit-prefix.sh); single source of truth in prefix.ts | | PII leakage in job payloads | Medium | High | Guardrail Engine PII redaction; configurable per-workspace | | Compromised LLM judge output | Low | Medium | Brand-tone circuit breaker fails open (safe default); humans approve | | Billing abuse / quota circumvention | Medium | Medium | Per-plan hard caps at API write time; Stripe reconciler detects drift | ### 3. Compliance Risks | Risk | Likelihood | Impact | Mitigation | |------|-----------|--------|-----------| | SOC 2 observation window gap (Railway manual evidence) | Medium | Medium | vanta-manual-evidence.md monthly cadence | | GDPR data subject request beyond 30-day SLA | Low | Medium | Manual process documented in data-retention-policy.md | | API misuse / AUP violation | Medium | Low-Medium | Audit logs; kill-switch (BIL-06); AUP enforcement process | ### 4. Operational Risks | Risk | Likelihood | Impact | Mitigation | |------|-----------|--------|-----------| | Stalled-job duplication (at-least-once semantics) | Medium | Medium | SDK SIGTERM drain; idempotency keys; history archiver dedup | | BullMQ Lua script version drift | Low | Medium | peerDep pin `>=5.73 <6`; version check at startup | | Claude Code skill distribution drift (Anthropic registry change) | Medium | Low | Fall back to local skill install; monitor Claude Code releases | ## Risk Acceptance Risks rated **Low × Low** may be accepted without further mitigation. All other risks require a documented mitigation plan and owner. Accepted risks must be documented in the Vanta risk register with: - Risk description - Current mitigation - Residual risk level - Acceptance date and owner ## Review Cadence Reviewed quarterly. Next full review: 2026-07-20.