# Rotor — Full Documentation
> Complete docs for Rotor (rotor.sh). Durable job queues for GTM engineers.
> Source: https://rotor.sh/llms-full.txt | Summary: https://rotor.sh/llms.txt

---

## Get Started

### Quickstart

URL: https://rotor.sh/docs/quickstart

This quickstart is a strict, opinionated path. Every command works on macOS and Linux. Windows users: run from Git Bash or WSL.

**Total wall-clock:** ~20 minutes if you're new; under 10 minutes once you're warmed up.

## Prerequisites

You need three things before starting:

- **Node 18+** on your `PATH`. Verify with `node --version`.
- **A Rotor workspace API key** starting with `rt_ws_`. Sign up at [rotor.sh/signup](https://rotor.sh/signup) — your key is shown once after checkout. Treat it like a database password.
- **A server you can run on a public URL** for step 4 (callback handler). Localhost works during development if you tunnel via ngrok / cloudflared / Tailscale Funnel.

## Install the CLI

```bash
npm i -g @rotorsh/cli
rotor --version    # confirm 0.1.x prints
```

## Login

```bash
rotor login
# Paste your rt_ws_* key when prompted. Stored at ~/.rotor/config.json (mode 0600).
```

You can also export the key as `ROTOR_API_KEY` in your shell — `rotor` commands prefer the env var over the saved config when both are present.

## Create a queue

```bash
rotor queues create welcome-emails \
  --concurrency 5 \
  --retry-attempts 3
```

The queue is now live and accepting jobs. Confirm with `rotor queues list`.

## Create a callback handler

This is where most of the time goes. Rotor delivers each job as an HMAC-signed `POST` to the URL you configure on the queue. Your handler verifies the signature, runs your job logic, and returns a 2xx on success.

The simplest production-quality handler — Hono on Node, using `verifyRotorSignature` from the SDK:

```bash
npm i hono @rotorsh/sdk
```

```typescript
// server.ts
import { Hono } from "hono";
import { serve } from "@hono/node-server";
import { verifyRotorSignature } from "@rotorsh/sdk";

const app = new Hono();

app.post("/rotor-callback", async (c) => {
  const body = await c.req.text();
  const ok = verifyRotorSignature(
    body,
    c.req.header("x-rotor-signature"),
    process.env.ROTOR_CALLBACK_SECRET!,
    {
      id: c.req.header("x-rotor-job-id")!,
      timestamp: Number(c.req.header("x-rotor-timestamp")),
      maxAgeSeconds: 300,
    },
  );
  if (!ok) return c.text("invalid signature", 401);

  // Your job logic — keep it idempotent (BullMQ delivers at-least-once)
  const { contactId, campaignId } = JSON.parse(body);
  console.log(`Sending welcome email: contact=${contactId} campaign=${campaignId}`);

  return c.json({ status: "ok" });
});

serve({ fetch: app.fetch, port: 3000 });
console.log("listening on :3000");
```

```bash
# Run it
ROTOR_CALLBACK_SECRET=__placeholder__ npx tsx server.ts
```

You'll set the real secret in the next step. For local development, expose `:3000` to the public internet via your tunnel of choice — `cloudflared tunnel --url http://localhost:3000` is one option.

<Tip>
**Idempotent handlers.** BullMQ delivers jobs at-least-once. Use the `x-rotor-job-id` header as a dedup key (e.g., upsert on `(workspace, jobId)` in your DB) so a redelivery after a worker crash doesn't double-send.
</Tip>

## Wire callback URL + secret into the queue

```bash
rotor queues update welcome-emails \
  --callback-url https://your-tunnel.example.com/rotor-callback \
  --rotate-callback-secret
# Output: New callback secret: <a long string starting with rt_cb_>
# Copy this — it's shown once.
```

Set the printed secret as `ROTOR_CALLBACK_SECRET` in your server's environment and restart your handler.

## Create a cron schedule

```bash
rotor schedules create \
  --cron "*/2 * * * *" \
  --timezone "UTC" \
  --queue welcome-emails
# Fires every 2 minutes. Adjust to "0 9 * * 1-5" for 9am weekdays.
```

## Watch the first cron tick land

```bash
# Wait up to 2 minutes for the first cron tick
rotor schedules list
# Look for `last_fired_at` advancing within the last 2 minutes.

# Or watch live:
rotor status --watch
```

Within 2 minutes you should see your `welcome-emails` queue's depth advance and your callback handler's logs print the contact/campaign payload.

## Where next

<CardGroup cols={2}>
  <Card title="MCP install for Claude Code" href="/mcp/install">
    Add Rotor tools to Claude Code, Cursor, or Claude Desktop.
  </Card>
  <Card title="Node SDK reference" href="/guides/node-quickstart">
    Worker class, idempotency helpers, error hierarchy.
  </Card>
  <Card title="Migrate from Inngest" href="/migrate/from-inngest">
    Side-by-side comparison + migration playbook.
  </Card>
  <Card title="Migrate from Vercel Cron" href="/migrate/from-vercel-cron">
    From cron-as-a-service to Rotor schedules.
  </Card>
</CardGroup>

---

## Guides

### Node.js Guide

URL: https://rotor.sh/docs/guides/node-quickstart

## Prerequisites

- Node.js 18+
- Redis 7.4+ (Railway or self-hosted; must be `noeviction` policy)
- A Rotor workspace key (`rt_ws_...`)

## Install

```bash
pnpm add @rotorsh/sdk bullmq ioredis
```

<Warning>
  Rotor pins a specific BullMQ major version. Check your `package.json` after
  install — if `bullmq` is a different major than `@rotorsh/sdk` expects, you'll
  get a startup warning. Pin to the version listed in `@rotorsh/sdk`'s peer
  dependencies.
</Warning>

## The 5 Resources

### 1. Queues

Create and inspect queues:

```typescript
import { Rotor } from "@rotorsh/sdk";

const rotor = new Rotor({ apiKey: process.env.ROTOR_API_KEY! });

// Create a queue
const queue = await rotor.queues.create({
  name: "outreach",
  retry_attempts: 3,
});

// List queues
const queues = await rotor.queues.list();
console.log(queues.data.map((q) => q.name));

// Get queue stats
const stats = await rotor.queues.stats("outreach");
console.log(stats.waiting, stats.active, stats.completed, stats.failed);
```

### 2. Jobs

Enqueue individual or batch jobs:

```typescript
// Single job
const job = await rotor.jobs.enqueue("outreach", {
  payload: { leadId: "lead_123" },
});

// Batch (quota checked atomically — all-or-nothing)
const jobs = await rotor.jobs.addBatch("outreach", [
  { payload: { leadId: "lead_001" } },
  { payload: { leadId: "lead_002" } },
  { payload: { leadId: "lead_003" } },
]);

// Get job status
const status = await rotor.jobs.get("outreach", job.id);
console.log(status.state); // waiting | active | completed | failed | delayed
```

### 3. Schedules

Recurring jobs with cron syntax:

```typescript
// Create a recurring enrichment job
const schedule = await rotor.schedules.create({
  queue: "enrichment",
  name: "nightly-enrichment",
  cron: "0 2 * * *", // 2am UTC daily
  payload: { source: "clearbit" },
  timezone: "UTC",
});

// Pause / resume via update
await rotor.schedules.update(schedule.id, { enabled: false });
await rotor.schedules.update(schedule.id, { enabled: true });
```

### 4. Status

Check workspace health and quota usage in a single call:

```typescript
const status = await rotor.status.get();
// {
//   redis: 'ok',
//   api: 'ok',
//   version: '1.0.0',
//   quota: { used: 12450, limit: 100000, plan: 'pro', resetAt: '2025-05-01T00:00:00Z' }
// }
```

### 5. Usage

Billing-period usage breakdown:

```typescript
const usage = await rotor.usage.current();
// { period: { from, to }, executions: { total, byQueue: {...} } }
```

## Idempotent Handler Pattern

Prevent duplicate outreach when jobs retry:

```typescript
import { RotorWorker } from "@rotorsh/sdk";

const worker = new RotorWorker({
  workspaceId: process.env.WORKSPACE_ID!,
  queueName: "outreach",
  connection: process.env.REDIS_URL!,
  concurrency: 5,
  processor: async (job) => {
    const { leadId } = job.data.payload;

    // Use job.id as idempotency key for downstream APIs
    const alreadySent = await db.outreach.findUnique({
      where: { rotorJobId: job.id },
    });

    if (alreadySent) {
      return { skipped: true, reason: "already-sent" };
    }

    await sendEmail(leadId);

    await db.outreach.create({
      data: { rotorJobId: job.id, leadId, sentAt: new Date() },
    });

    return { sent: true };
  },
});

await worker.ready();
```

## SIGTERM Drain

Graceful shutdown — in-flight jobs complete before the process exits:

```typescript
const worker = new RotorWorker({
  workspaceId: process.env.WORKSPACE_ID!,
  queueName: "outreach",
  connection: process.env.REDIS_URL!,
  concurrency: 5,
  drainTimeout: 30_000, // wait up to 30s for in-flight jobs
  processor: handler,
});

// RotorWorker registers SIGTERM/SIGINT handlers automatically.
// On Railway/Fly.io/Heroku: deploy sends SIGTERM → worker finishes
// current jobs → process exits cleanly. No jobs are lost.
await worker.ready();
```

## Railway Deploy Notes

1. Set `ROTOR_API_KEY` in Railway environment variables
2. Set `REDIS_URL` to your Railway Redis private URL (use `redis://...`)
3. Ensure Redis Memory Policy is `noeviction`:
   ```
   railway run redis-cli CONFIG SET maxmemory-policy noeviction
   ```
4. Worker `Procfile` or start command: `node dist/worker.js`
5. Deploy: Railway will send SIGTERM on deploys — SIGTERM drain ensures zero job loss

## Next Steps

- [CLI Reference](/docs/guides/cli-reference) — all `rotor` commands
- [Job Tags](/docs/guides/job-tags) — label and filter jobs by campaign or contact
- [Concurrency Keys](/docs/guides/concurrency-keys) — prevent duplicate concurrent runs
- [Failure Alerts](/docs/guides/failure-alerts) — get notified when jobs terminally fail
- [Idempotency](/docs/guides/idempotency) — two-layer dedup strategy
- [Webhooks](/docs/guides/webhooks) — receive HTTP callbacks on job lifecycle events
- [API Reference](/docs/api-reference/introduction) — all REST endpoints

### CLI quickstart

URL: https://rotor.sh/docs/guides/cli-quickstart

The CLI quickstart has been consolidated into the main [Quickstart](/quickstart). That page walks through `npm i -g @rotorsh/cli`, `rotor login`, queue + schedule creation, and writing a HMAC-verifying callback handler — start to finish in under 30 minutes.

<Card title="Go to the full Quickstart" href="/quickstart" icon="arrow-right" />

### CLI Reference

URL: https://rotor.sh/docs/guides/cli-reference

Install the CLI globally:

```bash
npm i -g @rotorsh/cli
rotor --version
```

Authenticate once, then all commands use your saved key:

```bash
rotor login
# Paste your rt_ws_* key. Saved to ~/.rotor/config.json (mode 0600).
```

You can also pass the key inline via `ROTOR_API_KEY` — this takes precedence over the saved config.

---

## rotor queues

### `rotor queues create`

```bash
rotor queues create <name> [options]

Options:
  --concurrency <n>           Max jobs processed in parallel (default: 1)
  --retry-attempts <n>        Attempts per job before moving to DLQ (default: 3)
  --callback-url <url>        HTTPS URL Rotor POSTs each job to
  --rotate-callback-secret    Generate a new HMAC signing secret (printed once)
```

```bash
rotor queues create enrichment \
  --concurrency 10 \
  --retry-attempts 5 \
  --callback-url https://your-app.example.com/rotor/enrichment
```

### `rotor queues list`

```bash
rotor queues list
# NAME         CONCURRENCY  JOBS (waiting/active/failed)
# enrichment   10           142 / 3 / 1
# outreach     5            0 / 0 / 0
```

### `rotor queues get`

```bash
rotor queues get <name>
```

Prints full queue config: callback URL, retry settings, concurrency, metrics.

### `rotor queues update`

```bash
rotor queues update <name> [options]

Options:
  --concurrency <n>
  --retry-attempts <n>
  --callback-url <url>
  --rotate-callback-secret    Rotate the signing secret; new value printed once
```

```bash
# Update callback URL and rotate the secret in one command
rotor queues update enrichment \
  --callback-url https://new-host.example.com/rotor/enrichment \
  --rotate-callback-secret
```

### `rotor queues delete`

```bash
rotor queues delete <name>
# Prompts for confirmation. Pass --force to skip.
```

---

## rotor jobs

### `rotor jobs enqueue`

```bash
rotor jobs enqueue <queue> --payload <json> [options]

Options:
  --payload <json>            Job payload as a JSON string (required)
  --delay <seconds>           Delay before job becomes eligible
  --tags <tag,...>            Comma-separated list of tags
  --concurrency-key <key>     Mutual exclusion key (one job per key at a time)
  --idempotency-key <key>     Deduplicate — same key = same job, not a new one
```

```bash
rotor jobs enqueue outreach \
  --payload '{"contactId":"cid_123","campaignId":"camp_456"}' \
  --tags "campaign:q2-outbound,contact:cid_123" \
  --concurrency-key "contact:cid_123"
```

### `rotor jobs get`

```bash
rotor jobs get <queue> <jobId>
```

Prints job state, payload, tags, retry count, timestamps, and failure reason if applicable.

### `rotor jobs list`

```bash
rotor jobs list <queue> [options]

Options:
  --status <state>    Filter by state: waiting|active|completed|failed|delayed
  --tag <tag>         Filter by tag
  --limit <n>         Results per page (default: 50)
```

```bash
rotor jobs list enrichment --status failed
rotor jobs list outreach --tag campaign:q2-outbound
```

### `rotor jobs retry`

```bash
rotor jobs retry <queue> <jobId>
# Moves a failed job back to waiting. Resets attempt count.
```

### `rotor jobs cancel`

```bash
rotor jobs cancel <queue> <jobId>
# Cancels a waiting or delayed job. Cannot cancel an active job.
```

---

## rotor schedules

### `rotor schedules create`

```bash
rotor schedules create [options]

Options:
  --queue <name>           Queue to enqueue jobs into (required)
  --cron <expression>      Cron expression (required)
  --timezone <tz>          IANA timezone (required) — e.g. "America/New_York"
  --name <name>            Human-readable name for the schedule
  --payload <json>         Payload for each generated job
```

```bash
rotor schedules create \
  --queue enrichment \
  --cron "0 9 * * 1-5" \
  --timezone "America/New_York" \
  --name "weekday-morning-enrichment" \
  --payload '{"source":"clearbit"}'
```

### `rotor schedules list`

```bash
rotor schedules list
# NAME                          CRON            NEXT FIRE
# weekday-morning-enrichment    0 9 * * 1-5     2026-05-21T09:00:00-04:00
```

### `rotor schedules pause`

```bash
rotor schedules pause <name>
# No new jobs fire until resumed. In-flight jobs are not affected.
```

### `rotor schedules resume`

```bash
rotor schedules resume <name>
```

### `rotor schedules delete`

```bash
rotor schedules delete <name> [--force]
```

---

## rotor secrets

Secrets are workspace-scoped encrypted values. Reference them in callback URLs and job payloads as `${{ secrets.SECRET_NAME }}` — Rotor injects the plaintext at dispatch time. The value never appears in your code or logs.

### `rotor secrets create`

```bash
rotor secrets create <NAME> --value <value>

# Or pipe the value to avoid shell history:
echo -n "sk_live_abc123" | rotor secrets create STRIPE_KEY --stdin
```

Secret names must be `UPPER_SNAKE_CASE`. Values are encrypted at rest (AES-256).

### `rotor secrets list`

```bash
rotor secrets list
# NAME            HINT        CREATED
# STRIPE_KEY      sk_li…      2026-05-01
# RESEND_KEY      re_…        2026-05-10
```

Only the masked hint is shown. Plaintext values are never returned after creation.

### `rotor secrets delete`

```bash
rotor secrets delete <NAME> [--force]
```

---

## rotor status

```bash
rotor status
# API:    ok
# Redis:  ok
# Plan:   pro
# Quota:  12,450 / 100,000 jobs used this period
# Resets: 2026-06-01
```

Add `--watch` to poll every 5 seconds:

```bash
rotor status --watch
```

---

## rotor template

```bash
rotor template add <template-name>
```

Scaffolds a starter workflow into your current directory. Available templates:

| Template | Description |
|----------|-------------|
| `outbound-with-approvals` | Multi-step outbound sequence with human approval gate |
| `multi-step-outbound` | Email sequence with `step.waitForEvent` reply detection |
| `event-driven-enrichment` | Parallel contact enrichment on `campaign.started` |
| `scheduled-with-approval` | Weekly cron report requiring approval before delivery |

```bash
rotor template add outbound-with-approvals
```

---

## Global flags

All commands accept:

```
--api-key <key>     Override saved key for this command only
--json              Output raw JSON instead of formatted table
--quiet             Suppress all output except errors
--help              Show help for any command
```

### MCP quickstart

URL: https://rotor.sh/docs/guides/mcp-quickstart

The MCP quickstart has been consolidated into the [MCP install page](/mcp/install). That page has copy-pasteable `.mcp.json` snippets for Claude Code, Claude Desktop, and Cursor, plus a "first prompt" demo using `rotor_create_queue`.

<Card title="Go to MCP install" href="/mcp/install" icon="arrow-right" />

### Schedules

URL: https://rotor.sh/docs/guides/schedules

A schedule is a named cron rule attached to a queue. When a tick fires, Rotor enqueues a job to that queue with the payload you specified — your callback handler receives it exactly like any other job. You don't run a cron process; Rotor does.

Use schedules when you need to:

- Kick off a nightly enrichment or sync at a specific time in a specific timezone
- Send a weekly digest every Monday morning in each customer's local time
- Trigger a daily report at 9 AM New York time regardless of daylight saving

## Creating a schedule

### CLI

```bash
rotor schedules create \
  --queue enrichment \
  --name nightly-enrichment \
  --cron "0 2 * * *" \
  --timezone "America/New_York" \
  --job-data '{"source":"clearbit"}'
```

### SDK

```typescript
import { Rotor } from "@rotorsh/sdk";

const rotor = new Rotor({ apiKey: process.env.ROTOR_API_KEY! });

const schedule = await rotor.schedules.create({
  queue_name: "enrichment",
  name: "nightly-enrichment",
  cron: "0 2 * * *",
  timezone: "America/New_York",
  job_data: { source: "clearbit" },
  enabled: true,
});

console.log(schedule.external_id); // sched_<32hex> — save this
```

The `external_id` (`sched_<32hex>`) is your handle for all subsequent operations. Save it.

## The timezone requirement

`timezone` is required. Rotor rejects schedule creation without it.

The reason matters: cron expressions like `0 9 * * 1-5` (9 AM weekdays) are meaningless without a timezone. If you use UTC when your team is in New York, the job fires at 9 AM UTC — which is 5 AM ET in winter and 4 AM ET when daylight saving ends. With a timezone, Rotor recalculates the next-fire timestamp whenever clocks change so the job always fires at the wall-clock time you intended.

<Warning>
  Pass a valid IANA timezone string (e.g., `"America/New_York"`, `"Europe/London"`, `"Asia/Tokyo"`). Offsets like `"+05:30"` are not accepted — they don't account for DST transitions.
</Warning>

## `external_id` vs internal `id`

Every schedule has two identifiers:

| Field | Format | Use |
|-------|--------|-----|
| `external_id` | `sched_<32hex>` | Use in all SDK and CLI calls |
| `id` | `ws_abc:name` | BullMQ internal key — ignore this |

Always use `external_id`. The internal `id` is an implementation detail that may change.

## Updating a schedule

You can update the cron expression, timezone, payload, or enabled state independently — any combination of fields.

```typescript
// Change the cron — now fires at 3 AM instead of 2 AM
await rotor.schedules.update("sched_abc123...", {
  cron: "0 3 * * *",
});

// Change the payload
await rotor.schedules.update("sched_abc123...", {
  job_data: { source: "apollo", limit: 500 },
});

// Change timezone — e.g., follow a customer moving regions
await rotor.schedules.update("sched_abc123...", {
  timezone: "Europe/London",
});
```

## Pausing and resuming

Set `enabled: false` to pause a schedule. No ticks fire while paused. Set `enabled: true` to resume — the next tick fires at the next cron-calculated time after resume.

```typescript
// Pause
await rotor.schedules.update("sched_abc123...", { enabled: false });

// Resume
await rotor.schedules.update("sched_abc123...", { enabled: true });
```

<Note>
  Pausing a schedule does not cancel jobs that are already enqueued and waiting in the queue. It only prevents future ticks from creating new jobs.
</Note>

## Deleting a schedule

```typescript
await rotor.schedules.delete("sched_abc123...");
```

CLI:

```bash
rotor schedules delete sched_abc123...
```

Deletion is permanent. The schedule stops firing immediately. Run history is retained for your plan's retention window.

## Fire now

`fireNow` enqueues a job immediately — the same job the schedule would enqueue at its next tick. Use it to test a new schedule without waiting for the first cron tick, or to manually backfill a missed run.

```typescript
const { jobId } = await rotor.schedules.fireNow("sched_abc123...");
console.log(jobId); // job_<id> — track this in rotor.jobs.get()
```

CLI:

```bash
rotor schedules fire-now sched_abc123...
```

Manual `fireNow` calls appear in the run history alongside cron-driven ticks — both are ScheduleRun records.

## Run history

Rotor records every tick (cron-driven and manual) as a `ScheduleRun`.

```typescript
const { runs } = await rotor.schedules.runs("sched_abc123...").list({ limit: 20 });

for (const run of runs) {
  console.log(run.id);              // sr_<32hex>
  console.log(run.scheduled_for);   // ISO timestamp — when the tick was due
  console.log(run.fired_at);        // ISO timestamp — when Rotor actually enqueued
  console.log(run.job_id);          // job_<id> — the resulting job
  console.log(run.callback_status); // pending | 2xx | 4xx | 5xx | timeout | null
  console.log(run.error);           // error message if status is 4xx/5xx/timeout
}
```

### `callback_status` values

| Value | Meaning |
|-------|---------|
| `pending` | Job enqueued, callback not yet called |
| `2xx` | Your handler returned a success response |
| `4xx` | Your handler returned a client error (check `error` field) |
| `5xx` | Your handler returned a server error — will retry per queue config |
| `timeout` | Your handler did not respond within the deadline |
| `null` | Status not yet recorded |

CLI:

```bash
rotor schedules get sched_abc123...
```

## Listing schedules

```bash
rotor schedules list
```

```typescript
const schedules = await rotor.schedules.list();
```

## Common cron patterns

| Expression | Meaning | Typical timezone |
|------------|---------|-----------------|
| `0 9 * * 1-5` | 9 AM weekdays | `America/New_York` |
| `0 2 * * *` | 2 AM daily | `UTC` |
| `0 8 * * 1` | 8 AM every Monday | `America/Chicago` |
| `0 0 1 * *` | Midnight on the 1st of each month | `UTC` |
| `*/15 * * * *` | Every 15 minutes | `UTC` |
| `0 18 * * 5` | 6 PM every Friday | `Europe/London` |

<Tip>
  Use [crontab.guru](https://crontab.guru) to verify expressions before creating a schedule. Paste the expression and confirm the human-readable description matches your intent.
</Tip>

## Next steps

<CardGroup cols={2}>
  <Card title="Node.js Guide" href="/docs/guides/node-quickstart">Full walkthrough of all Rotor resources including schedules.</Card>
  <Card title="CLI Reference" href="/docs/guides/cli-reference">All `rotor schedules` subcommands and flags.</Card>
  <Card title="Failure Alerts" href="/docs/guides/failure-alerts">Get notified when a scheduled job terminally fails.</Card>
  <Card title="Approvals" href="/docs/guides/approvals">Gate jobs behind human approval before they run.</Card>
</CardGroup>

### Approvals

URL: https://rotor.sh/docs/guides/approvals

Approvals are human-in-the-loop gates on jobs. When a queue has approvals enabled, every job enqueued to it pauses and waits for a human to approve or reject before the callback handler is called. The job payload is visible during review so you can make an informed decision.

Use approvals when:

- An AI agent is generating outbound emails and a human should review before send
- A financial transaction above a threshold requires sign-off
- Sensitive data access (e.g., exporting a contact list) needs audit-trail approval
- An automated process touches production systems and you want a manual gate

## How the flow works

<Steps>
  <Step title="Enable approvals on the queue">
    Set `approval_required: true` when creating or updating the queue.
  </Step>
  <Step title="Enqueue a job">
    Your code enqueues a job normally. The job enters a `pending_approval` state instead of `waiting`.
  </Step>
  <Step title="Rotor creates an approval record">
    An `approval_record` row is created with `status: "pending"` and an `apv_<32hex>` ID.
  </Step>
  <Step title="Webhook fires">
    If you've configured a webhook, Rotor sends an `approval.pending` event to your endpoint with the approval ID, job ID, queue name, and payload.
  </Step>
  <Step title="Human approves or rejects">
    Via CLI, SDK, REST API, or the MCP `approve_job` tool in Claude Code.
  </Step>
  <Step title="Job proceeds or fails">
    On approval, the job moves to `waiting` and your callback handler runs. On rejection, the job is marked `failed` with the rejection reason.
  </Step>
</Steps>

## Enable approvals on a queue

### Create a new queue with approvals

```typescript
import { Rotor } from "@rotorsh/sdk";

const rotor = new Rotor({ apiKey: process.env.ROTOR_API_KEY! });

await rotor.queues.create({
  name: "ai-outreach",
  callback_url: "https://your-app.example.com/rotor/ai-outreach",
  approval_required: true,
  defaultJobOptions: { attempts: 3 },
});
```

### Enable approvals on an existing queue

```typescript
await rotor.queues.update("ai-outreach", {
  approval_required: true,
});
```

## Listing pending approvals

### CLI

```bash
rotor approvals list --status pending
```

### SDK

```typescript
const { approvals } = await rotor.approvals.list({
  status: "pending",
  limit: 50,
});

for (const apv of approvals) {
  console.log(apv.id);          // apv_<32hex>
  console.log(apv.jobId);       // the job waiting for approval
  console.log(apv.queueName);   // which queue
  console.log(apv.payload);     // the job payload — inspect before deciding
  console.log(apv.createdAt);   // when the approval record was created
}
```

### Filter by queue

```typescript
const { approvals } = await rotor.approvals.list({
  status: "pending",
  queue: "ai-outreach",
});
```

### Get a single approval

```typescript
const apv = await rotor.approvals.get("apv_abc123...");
console.log(apv.payload); // inspect the job payload
```

## Approving and rejecting

### CLI

```bash
# Approve
rotor approvals approve apv_abc123...

# Reject
rotor approvals reject apv_abc123...
```

### SDK

```typescript
// Approve — job proceeds to your callback handler
await rotor.approvals.approve("apv_abc123...");

// Reject — job is marked failed
await rotor.approvals.reject("apv_abc123...");
```

### REST API

```bash
# Approve
curl -X POST "https://api.rotor.sh/v1/approvals/apv_abc123.../approve" \
  -H "Authorization: Bearer $ROTOR_API_KEY"

# Reject
curl -X POST "https://api.rotor.sh/v1/approvals/apv_abc123.../reject" \
  -H "Authorization: Bearer $ROTOR_API_KEY"
```

## The `approval.pending` webhook

If you configure a webhook endpoint, Rotor sends a POST to it when an approval record is created. The payload type is `ApprovalPendingData`.

```typescript
// Webhook payload shape
type ApprovalPendingData = {
  approvalId: string;   // apv_<32hex>
  jobId: string;        // the waiting job
  queueName: string;    // which queue
  payload: unknown;     // the job payload — safe to log and display
  workspaceId: string;  // your workspace
};
```

Example payload:

```json
{
  "event": "approval.pending",
  "data": {
    "approvalId": "apv_a1b2c3d4...",
    "jobId": "job_e5f6g7h8...",
    "queueName": "ai-outreach",
    "payload": {
      "contactId": "cid_123",
      "subject": "Quick question about your stack",
      "body": "Hi Sarah, ..."
    },
    "workspaceId": "ws_abc123"
  }
}
```

Use this webhook to post a Slack message to your team, create a Notion entry, or trigger any other notification flow. Your webhook handler can embed a direct link to the approval using the `approvalId`.

<Tip>
  You can approve and reject directly from a Slack message by posting the `approvalId` back to the Rotor API via a Slack shortcut or button. The MCP `approve_job` tool makes this even easier from Claude Code.
</Tip>

## Using the MCP `approve_job` tool

If you use the [Rotor MCP server](/docs/mcp/install), Claude Code gains an `approve_job` tool. From your terminal you can list pending approvals and approve or reject them without leaving your editor.

```
# In Claude Code
> List pending approvals for the ai-outreach queue
> Approve apv_abc123...
> Reject apv_xyz789... — the subject line is too aggressive
```

This is the fastest review path for Claude Code-using teams: see the payload, make a call, continue building.

See [MCP quickstart](/docs/guides/mcp-quickstart) to connect the MCP server.

## What happens after rejection

A rejected job is marked `failed` with `reason: "rejected"`. It does not retry — rejection is terminal. The job appears in your failed jobs list with the rejection timestamp.

If you need to re-attempt a rejected job, enqueue a new job with the corrected payload.

<Warning>
  Approvals pause the job indefinitely — there is no automatic timeout. If an approval sits pending for a long time, the job will not run until someone acts. Build a notification flow (webhook → Slack, email) so approvals don't silently stall.
</Warning>

## Filtering by status

The `status` filter accepts: `pending`, `approved`, `rejected`.

```typescript
// All approved approvals for audit purposes
const { approvals } = await rotor.approvals.list({
  status: "approved",
  queue: "ai-outreach",
  limit: 100,
});
```

Use `cursor` for pagination:

```typescript
const page1 = await rotor.approvals.list({ status: "pending", limit: 20 });
const page2 = await rotor.approvals.list({
  status: "pending",
  limit: 20,
  cursor: page1.nextCursor,
});
```

## Next steps

<CardGroup cols={2}>
  <Card title="Guardrails" href="/docs/guides/guardrails">Run automated content checks before jobs ever reach the approval queue.</Card>
  <Card title="MCP Quickstart" href="/docs/guides/mcp-quickstart">Connect Claude Code to Rotor and approve jobs from your terminal.</Card>
  <Card title="Webhooks" href="/docs/api-reference/introduction">Configure webhook endpoints for `approval.pending` events.</Card>
  <Card title="Schedules" href="/docs/guides/schedules">Combine approvals with recurring scheduled jobs.</Card>
</CardGroup>

### Guardrails

URL: https://rotor.sh/docs/guides/guardrails

Guardrails are a pre-dispatch content policy layer. Before Rotor calls your callback URL, it runs each job payload through a configurable pipeline of checks. A job that fails a check is blocked — it never reaches your handler.

This protects you from:

- Accidentally emailing opted-out contacts
- Leaking PII in job payloads that flow into downstream systems
- Referencing competitor names in AI-generated outreach
- Sending off-brand copy that passed no human review

Guardrails apply to **all queues in your workspace**. There is no per-queue override.

## The pipeline

Four rule types run in order, cheapest first. Each stage short-circuits on block — if a job is blocked by the DNC check, the remaining stages don't run.

| Stage | Rule type | Blocks or modifies |
|-------|-----------|-------------------|
| 1 | **DNC** — do-not-contact list | Blocks matching email or domain |
| 2 | **Competitor block** — domain deny-list | Blocks jobs referencing competitor domains |
| 3 | **PII** — personally identifiable information | Redacts or blocks based on `piiMode` |
| 4 | **Brand tone** — LLM judge (Claude Haiku) | Blocks if score falls below rubric threshold |

Running cheaper checks first (DNC and competitor are simple string lookups) means most blocked jobs exit in stage 1 or 2 without incurring LLM costs.

## Viewing your current config

```typescript
import { Rotor } from "@rotorsh/sdk";

const rotor = new Rotor({ apiKey: process.env.ROTOR_API_KEY! });

const config = await rotor.guardrails.get();
console.log(config); // GuardrailConfig
```

The `GuardrailConfig` object has a `grc_` prefixed ID and contains the current state of all four rule types.

## Enabling each rule type

### DNC — do-not-contact list

Block jobs where the payload contains an email address or domain on your DNC list.

```typescript
await rotor.guardrails.update({
  dncEmails: ["alice@example.com", "bob@example.com"],
  dncDomains: ["noemail.example.org"],
});
```

Add entries incrementally without re-posting the full list:

```typescript
// Append a single unsubscribe
await rotor.guardrails.appendDnc({
  emails: ["newunsub@example.com"],
});

// Append a domain block
await rotor.guardrails.appendDnc({
  domains: ["competitor-customer.io"],
});
```

Remove a single email:

```typescript
await rotor.guardrails.removeDncEmail("alice@example.com");
```

<Tip>
  Use `appendDnc` for real-time unsubscribes — call it from your unsubscribe webhook handler so the contact is protected immediately without rebuilding the full list.
</Tip>

### Managing a large DNC list

For bulk imports (thousands of emails), build the full list and pass it to `update`:

```typescript
// Build from your CRM unsubscribe export
const emails = await fetchUnsubscribedEmails(); // string[]

await rotor.guardrails.update({
  dncEmails: emails,
});
```

<Warning>
  `update` replaces the entire DNC list for the fields you pass. If you pass `dncEmails`, the previous `dncEmails` value is replaced. If you only want to add entries, use `appendDnc` instead.
</Warning>

### Competitor block

Block jobs that reference any of your competitor's domains anywhere in the payload.

```typescript
await rotor.guardrails.update({
  competitorDomains: ["rivalcrm.com", "othertool.io", "competitorapp.co"],
});
```

Rotor scans the serialized job payload for any occurrence of these domain strings. A job mentioning `rivalcrm.com` in a message body, subject line, or URL field is blocked.

### PII detection

PII detection scans the payload for patterns like email addresses, phone numbers, SSNs, and credit card numbers. You choose the response mode.

#### Redact mode

Detected PII is replaced with a placeholder before the job is dispatched. The job runs with the redacted payload.

```typescript
await rotor.guardrails.update({
  piiRedactionEnabled: true,
  piiMode: "redact",
});
```

A phone number like `555-867-5309` becomes `[REDACTED-PHONE]` in the payload your handler receives.

#### Block mode

The job is blocked entirely if PII is detected. Use this when your handler should never receive raw PII.

```typescript
await rotor.guardrails.update({
  piiRedactionEnabled: true,
  piiMode: "block",
});
```

| Mode | Behavior | Use when |
|------|---------|---------|
| `redact` | Strips PII, job runs with sanitized payload | Handler is safe to receive data but PII shouldn't flow through |
| `block` | Stops the job entirely | Handler must never see raw PII |

### Brand tone judge

The brand tone judge uses Claude Haiku to score the job payload against a rubric you define. If the score falls below the threshold, the job is blocked.

```typescript
await rotor.guardrails.update({
  brandToneJudgeEnabled: true,
  brandToneRubric:
    "Professional and helpful. Do not use urgency language, scarcity claims, or manipulative CTAs. " +
    "Avoid phrases like 'Act now', 'Limited time', 'Don't miss out'. " +
    "Tone should feel like a knowledgeable peer, not a sales rep.",
});
```

<Note>
  Brand tone is stage 4 — it runs last. It only fires if the job passed DNC, competitor, and PII checks. This means you're only paying for LLM inference on payloads that are otherwise clean.
</Note>

#### Writing an effective rubric

The rubric is a plain-text description of acceptable tone. Be specific:

- Describe what good looks like ("conversational, concise, direct")
- Explicitly call out what to avoid ("no urgency language", "no superlatives like 'best' or 'revolutionary'")
- Include examples of acceptable and unacceptable phrases if needed

Vague rubrics produce inconsistent results. Specific rubrics produce reliable blocks.

## When a job is blocked

When any guardrail stage blocks a job, Rotor:

1. Sets the job status to `guardrail.blocked`
2. Fires a `guardrail.blocked` webhook event (if configured)
3. Records which stage blocked it and why

The job does not reach your callback handler. It does not retry.

### The `guardrail.blocked` webhook payload

```json
{
  "event": "guardrail.blocked",
  "data": {
    "jobId": "job_abc123...",
    "queueName": "ai-outreach",
    "blockedBy": "dnc",
    "reason": "Email address matched DNC list",
    "payload": { "to": "alice@example.com", "subject": "..." }
  }
}
```

`blockedBy` is one of: `dnc`, `competitor`, `pii`, `brand_tone`.

## Use case: AI outbound agent

A common Rotor pattern for GTM teams: an AI agent generates personalized outreach emails, enqueues them to an `ai-outreach` queue, and a callback handler sends via your ESP. Guardrails protect every job before it leaves the system.

```typescript
// Configure guardrails once
await rotor.guardrails.update({
  // Block opted-out contacts
  dncEmails: await loadUnsubscribes(),
  dncDomains: await loadBouncedDomains(),

  // Never mention competitors by name
  competitorDomains: ["rivalcrm.com", "otherplatform.io"],

  // Redact any PII the LLM accidentally included
  piiRedactionEnabled: true,
  piiMode: "redact",

  // Score brand tone before send
  brandToneJudgeEnabled: true,
  brandToneRubric:
    "Helpful and direct. No urgency language. No superlatives. " +
    "Reads like a message from a thoughtful colleague, not marketing copy.",
});

// Enqueue AI-generated emails — guardrails run automatically
await rotor.jobs.enqueueBatch("ai-outreach",
  emails.map((email) => ({ payload: email }))
);
```

Any job that references an opted-out contact, a competitor, slips in a phone number, or fails the tone check is silently dropped — and you get a `guardrail.blocked` webhook so you can review and correct the source.

<Tip>
  Hook `guardrail.blocked` events into your Slack channel. If the brand tone judge starts blocking frequently, it usually means your prompt has drifted — review the rubric and your system prompt together.
</Tip>

## Full config reference

| Field | Type | Description |
|-------|------|-------------|
| `dncEmails` | `string[]` | Email addresses to block |
| `dncDomains` | `string[]` | Domains to block |
| `competitorDomains` | `string[]` | Domains that trigger a competitor block |
| `piiRedactionEnabled` | `boolean` | Whether PII detection is active |
| `piiMode` | `"redact" \| "block"` | What to do when PII is detected |
| `brandToneJudgeEnabled` | `boolean` | Whether brand tone scoring is active |
| `brandToneRubric` | `string` | Plain-text description of acceptable tone |

## Next steps

<CardGroup cols={2}>
  <Card title="Approvals" href="/docs/guides/approvals">Add a human gate after guardrails pass — for final review before send.</Card>
  <Card title="Failure Alerts" href="/docs/guides/failure-alerts">Get notified when a job terminally fails after all retries.</Card>
  <Card title="Job Tags" href="/docs/guides/job-tags">Label blocked jobs with tags to track guardrail hit rates by campaign.</Card>
  <Card title="API Reference" href="/docs/api-reference/introduction">Full REST reference for guardrail endpoints.</Card>
</CardGroup>

### Webhooks

URL: https://rotor.sh/docs/guides/webhooks

Rotor POSTs a signed JSON payload to your endpoint whenever a lifecycle event occurs. You register one or more webhook endpoints and choose which event types each one receives.

## Event types

| Event | When it fires |
|---|---|
| `job.completed` | A job finishes successfully |
| `job.failed` | A job exhausts all retry attempts and enters terminal failure |
| `job.stalled` | A job's lock expires while it was active (usually a crashed worker) |
| `approval.pending` | A job step is waiting for human approval |
| `approval.approved` | An approval request was approved |
| `approval.rejected` | An approval request was rejected |
| `guardrail.blocked` | A guardrail rule blocked a job from executing |
| `queue.drained` | A queue's waiting count dropped to zero |
| `queue.error` | A queue-level error occurred (e.g. Redis unreachable) |
| `webhook.test` | Fired by `rotor.webhooks.test()` to verify your endpoint |

## Create a webhook endpoint

```typescript
import { Rotor } from "@rotorsh/sdk";

const rotor = new Rotor({ apiKey: process.env.ROTOR_API_KEY! });

const webhook = await rotor.webhooks.create({
  url: "https://yourapp.com/webhooks/rotor",
  eventTypes: ["job.completed", "job.failed", "approval.pending"],
});

console.log(webhook.id);     // wh_<32hex>
console.log(webhook.secret); // shown ONCE — store it in your secrets manager immediately
```

<Warning>
  The webhook secret is returned only at creation time. If you lose it, delete
  the webhook and create a new one. Store the secret in an environment variable
  (`ROTOR_WEBHOOK_SECRET`) before the creation response goes out of scope.
</Warning>

## Verify signatures

Every request Rotor sends includes three headers:

| Header | Contents |
|---|---|
| `Rotor-Webhook-Id` | Unique delivery ID |
| `Rotor-Webhook-Timestamp` | Unix timestamp of the delivery |
| `Rotor-Webhook-Signature` | `v1,{base64(HMAC-SHA256(secret, "{id}.{timestamp}.{body}"))}` |

Rotor uses a Svix-compatible HMAC-SHA256 scheme. Verify with the SDK helper:

```typescript
import { verifyRotorSignature } from "@rotorsh/sdk";
import type { IncomingMessage, ServerResponse } from "http";

export async function POST(req: Request): Promise<Response> {
  const body = await req.text();

  const id        = req.headers.get("rotor-webhook-id")!;
  const timestamp = req.headers.get("rotor-webhook-timestamp")!;
  const signature = req.headers.get("rotor-webhook-signature")!;

  let event: unknown;
  try {
    event = verifyRotorSignature(body, signature, process.env.ROTOR_WEBHOOK_SECRET!, {
      id,
      timestamp,
      maxAgeSeconds: 300, // reject deliveries older than 5 minutes
    });
  } catch (err) {
    return new Response("Invalid signature", { status: 401 });
  }

  // event is now typed and verified — handle it
  await handleRotorEvent(event as RotorWebhookEvent);

  return new Response("OK", { status: 200 });
}
```

<Tip>
  Always return `200` quickly — do your processing asynchronously or in a
  background task. If your endpoint takes longer than 10 seconds to respond,
  Rotor treats the delivery as a timeout and schedules a retry.
</Tip>

## Handle specific events

### job.completed

```typescript
import type { RotorWebhookEvent } from "@rotorsh/sdk";

async function handleRotorEvent(event: RotorWebhookEvent) {
  switch (event.type) {
    case "job.completed": {
      const { jobId, queueName, result, completedAt } = event.data;
      console.log(`Job ${jobId} in ${queueName} finished:`, result);
      // update your DB, trigger downstream steps, etc.
      break;
    }

    case "job.failed": {
      const { jobId, queueName, failReason, attempts } = event.data;
      console.error(`Job ${jobId} failed after ${attempts} attempts: ${failReason}`);
      // send a Slack alert, open a support ticket, etc.
      break;
    }

    case "approval.pending": {
      const { jobId, approvalId, queueName, payload } = event.data;
      // notify a reviewer — e.g. post to Slack
      await notifyReviewer({ approvalId, payload });
      break;
    }
  }
}
```

### Pipe job events to Slack

```typescript
case "job.failed": {
  const { jobId, queueName, failReason } = event.data;

  await fetch(process.env.SLACK_WEBHOOK_URL!, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({
      text: `❌ Job \`${jobId}\` failed in \`${queueName}\`\n>${failReason}`,
    }),
  });
  break;
}
```

### Route approval.pending to a bot

```typescript
case "approval.pending": {
  const { approvalId, queueName, payload } = event.data;

  await fetch("https://yourapp.com/internal/approval-bot", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ approvalId, queueName, payload }),
  });
  break;
}
```

## Test a webhook endpoint

Send a `webhook.test` event to confirm your endpoint is reachable and signature verification works:

```typescript
await rotor.webhooks.test("wh_abc123def456...");
```

Your endpoint will receive a payload like:

```json
{
  "type": "webhook.test",
  "webhookId": "wh_abc123def456...",
  "data": {
    "message": "This is a test event from Rotor.",
    "timestamp": "2026-05-20T10:00:00.000Z"
  }
}
```

Verify your endpoint returns `200` and that signature verification passes before relying on it in production.

## Retry behavior

If your endpoint returns a non-2xx status code, times out, or is unreachable, Rotor retries with exponential backoff — up to **5 attempts over approximately 3 hours**. After 5 failed attempts, the delivery is abandoned and marked as permanently failed.

| Attempt | Approximate delay |
|---|---|
| 1 | Immediate |
| 2 | ~30 seconds |
| 3 | ~5 minutes |
| 4 | ~30 minutes |
| 5 | ~2 hours |

<Note>
  Delivery failures do not affect your job execution. Your jobs continue
  running — only the webhook notification is retried.
</Note>

## Manage endpoints

```typescript
// List all webhook endpoints
const webhooks = await rotor.webhooks.list();

// Get a specific endpoint
const webhook = await rotor.webhooks.get("wh_abc123def456...");
console.log(webhook.url, webhook.eventTypes, webhook.createdAt);

// Delete an endpoint
await rotor.webhooks.delete("wh_abc123def456...");
```

Deleting a webhook stops all future deliveries immediately. In-flight deliveries that have already started may still complete.

## callback_status on schedule runs

When a schedule fires a job and that job has a callback configured, the `callback_status` field on the run record reflects the webhook delivery outcome:

| Value | Meaning |
|---|---|
| `pending` | Delivery has not been attempted yet |
| `2xx` | Endpoint responded with a success status |
| `4xx` | Endpoint responded with a client error |
| `5xx` | Endpoint responded with a server error |
| `timeout` | Endpoint did not respond within the timeout window |
| `null` | No callback configured for this run |

### Idempotency

URL: https://rotor.sh/docs/guides/idempotency

BullMQ delivers jobs **at least once**. If a worker crashes mid-execution, loses its Redis lock, or restarts, the job is redelivered and your handler runs again. Without idempotency guards, this causes duplicate emails, double charges, and phantom records.

Rotor gives you two independent layers of protection:

1. **Enqueue dedup** — the same `jobId` is never enqueued twice. BullMQ silently discards the second submission.
2. **Handler dedup** — your handler checks a DB record before doing work, preventing double execution even when the same job is delivered more than once.

Use both layers. Enqueue dedup prevents the common case; handler dedup is the safety net for redeliveries that slip through.

## Layer 1 — Explicit jobId at enqueue

Pass a deterministic `jobId` when enqueueing. BullMQ deduplicates at the queue level: submitting the same `jobId` again is a no-op.

```typescript
import { Rotor } from "@rotorsh/sdk";

const rotor = new Rotor({ apiKey: process.env.ROTOR_API_KEY! });

const job = await rotor.jobs.enqueue("outreach", {
  payload: { contactId: "cid_123", campaignId: "cmp_456" },
  jobId: "outreach:cmp_456:cid_123", // deterministic key
});
```

If `outreach:cmp_456:cid_123` is already in the queue, BullMQ returns without creating a new job. No error is thrown; the call is silent.

<Tip>
  Build your `jobId` from the combination of entities that must be unique: e.g.
  `{queue}:{campaignId}:{contactId}`. This makes duplicate submissions safe
  regardless of how they occur — retried HTTP requests, double-clicks, or
  re-triggered workflows.
</Tip>

## Batch with idempotency keys

For batch enqueue, pass an `idempotencyKeys` array alongside your jobs. The array length must match the jobs array length, and keys must be unique within the batch.

```typescript
const contacts = [
  { id: "cid_001", email: "alice@example.com" },
  { id: "cid_002", email: "bob@example.com" },
  { id: "cid_003", email: "carol@example.com" },
];

await rotor.jobs.addBatch(
  "outreach",
  contacts.map((c) => ({
    payload: { contactId: c.id, email: c.email },
  })),
  {
    idempotencyKeys: contacts.map((c) => `outreach:cmp_456:${c.id}`),
  }
);
```

Rules for `idempotencyKeys`:

- Length must equal the number of jobs in the batch
- No duplicates within a single batch call
- Cannot conflict with explicit `jobId` fields on individual jobs in the same call
- Re-submitting the same key is a silent no-op — BullMQ deduplicates

## Layer 2 — Handler-level dedup

Enqueue dedup prevents most duplicates. Handler dedup catches redeliveries: cases where the job was enqueued once but executed more than once (e.g. worker crash after partial execution).

The pattern: check your DB for `rotorJobId` before doing work; write it after.

```typescript
import { RotorWorker } from "@rotorsh/sdk";

const worker = new RotorWorker({
  workspaceId: process.env.WORKSPACE_ID!,
  queueName: "outreach",
  connection: process.env.REDIS_URL!,
  concurrency: 5,
  processor: async (job) => {
    const { contactId } = job.data.payload;

    // Check if this exact job execution already completed
    const existing = await db.outreachSent.findUnique({
      where: { rotorJobId: job.id },
    });

    if (existing) {
      return { skipped: true, reason: "already-executed" };
    }

    // Do the work
    await sendWelcomeEmail(contactId);

    // Record completion — use upsert in case of a tight race
    await db.outreachSent.upsert({
      where: { rotorJobId: job.id },
      create: { rotorJobId: job.id, contactId, sentAt: new Date() },
      update: {}, // already exists — ignore
    });

    return { sent: true };
  },
});

await worker.ready();
```

### Schema for the dedup table

```sql
CREATE TABLE outreach_sent (
  id          SERIAL PRIMARY KEY,
  rotor_job_id  TEXT    NOT NULL,
  contact_id    TEXT    NOT NULL,
  sent_at       TIMESTAMPTZ NOT NULL DEFAULT now(),
  UNIQUE (rotor_job_id)             -- enforces dedup at DB level
);
```

Or with a compound unique constraint to scope dedup per workspace:

```sql
UNIQUE (workspace_id, rotor_job_id)
```

## The Idempotency-Key header

When making HTTP API calls to Rotor directly (not via the SDK), include an `Idempotency-Key` header. The SDK parses this with `parseIdempotencyKey`:

```typescript
import { parseIdempotencyKey } from "@rotorsh/sdk";

// In an API route handler
const idempotencyKey = parseIdempotencyKey(req.headers);

if (idempotencyKey) {
  await rotor.jobs.enqueue("outreach", {
    payload: { contactId },
    jobId: idempotencyKey, // use the caller's key as the jobId
  });
}
```

This lets upstream callers (e.g. webhook consumers, frontend clients) pass their own idempotency keys through to Rotor.

## RotorIdempotencyConflict

In strict mode, submitting a duplicate idempotency key throws `RotorIdempotencyConflict` (HTTP 409) instead of silently deduplicating. Handle it explicitly when you need to distinguish a genuine duplicate from a new submission:

```typescript
import { Rotor, RotorIdempotencyConflict } from "@rotorsh/sdk";

const rotor = new Rotor({ apiKey: process.env.ROTOR_API_KEY! });

try {
  await rotor.jobs.enqueue("outreach", {
    payload: { contactId: "cid_123" },
    jobId: "outreach:cmp_456:cid_123",
  });
} catch (err) {
  if (err instanceof RotorIdempotencyConflict) {
    // A job with this key already exists
    // Safe to treat as success — the job is already queued
    console.log("Duplicate submission — job already enqueued");
    return;
  }
  throw err;
}
```

## The 24-hour replay window

Rotor does not maintain an application-state dedup table for you. Handler dedup — the DB record you write after executing — is your responsibility.

A robust pattern is a 24-hour replay table: any `rotorJobId` processed in the last 24 hours is recorded and checked at handler entry. Clean up records older than 24 hours with a scheduled job or DB TTL.

```typescript
// Check and record in one atomic upsert
const { created } = await db.jobReplay.upsert({
  where: { rotorJobId: job.id },
  create: { rotorJobId: job.id, processedAt: new Date() },
  update: {}, // already exists — do nothing
  select: { created: true }, // Prisma: true if INSERT, false if no-op
});

if (!created) {
  return { skipped: true, reason: "replay-deduplicated" };
}

// Proceed with actual work
```

<Note>
  The replay table prevents double-execution when a job is redelivered within
  the retention window. Records older than your window can be pruned — Rotor
  guarantees at-least-once, not unbounded redelivery.
</Note>

## Practical example — welcome emails without double-sending

A new user signs up. Your app enqueues a welcome email job. If your worker crashes after calling the email API but before the job is marked complete, BullMQ redelivers and your handler runs again.

Without idempotency, the user gets two welcome emails. With two-layer defense:

```typescript
import { Rotor, RotorWorker } from "@rotorsh/sdk";

const rotor = new Rotor({ apiKey: process.env.ROTOR_API_KEY! });

// At signup — deterministic jobId prevents double-enqueue
await rotor.jobs.enqueue("welcome-email", {
  payload: { userId: user.id, email: user.email },
  jobId: `welcome:${user.id}`, // one welcome email per userId, ever
});

// Worker — handler dedup prevents double-execution on redelivery
const worker = new RotorWorker({
  workspaceId: process.env.WORKSPACE_ID!,
  queueName: "welcome-email",
  connection: process.env.REDIS_URL!,
  concurrency: 10,
  processor: async (job) => {
    const { userId, email } = job.data.payload;

    const alreadySent = await db.welcomeEmailLog.findUnique({
      where: { rotorJobId: job.id },
    });

    if (alreadySent) {
      return { skipped: true };
    }

    await emailClient.send({
      to: email,
      template: "welcome",
      data: { userId },
    });

    await db.welcomeEmailLog.create({
      data: { rotorJobId: job.id, userId, sentAt: new Date() },
    });

    return { sent: true, userId };
  },
});

await worker.ready();
```

Layer 1 (`jobId: "welcome:{userId}"`) ensures a second signup event or retry of the enqueue call never creates a second job. Layer 2 (`welcomeEmailLog` check) ensures a redelivered job never sends a second email.

### Job Tags

URL: https://rotor.sh/docs/guides/job-tags

Tags are short string labels you attach to a job at enqueue time. They pass through to the dashboard, the REST API, and the archived job history — letting you filter and group runs without embedding metadata in your job payload.

## Add tags when enqueueing

```typescript
import { Rotor } from "@rotorsh/sdk";

const rotor = new Rotor({ apiKey: process.env.ROTOR_API_KEY! });

const job = await rotor.jobs.enqueue("outreach", {
  payload: { contactId: "cid_123" },
  tags: ["campaign:q2-outbound", "contact:cid_123", "env:production"],
});
```

Tags are returned in the response:

```json
{
  "id": "job_abc",
  "state": "waiting",
  "tags": ["campaign:q2-outbound", "contact:cid_123", "env:production"]
}
```

**Limits:** up to 10 tags per job, each tag max 64 characters.

## Batch enqueue with tags

Each job in a batch can carry its own tags:

```typescript
await rotor.jobs.enqueueBatch("outreach", [
  { payload: { contactId: "cid_001" }, tags: ["campaign:q2-outbound", "contact:cid_001"] },
  { payload: { contactId: "cid_002" }, tags: ["campaign:q2-outbound", "contact:cid_002"] },
]);
```

## Retrieve tags on a job

```typescript
const job = await rotor.jobs.get("outreach", "job_abc");
console.log(job.tags); // ["campaign:q2-outbound", "contact:cid_123", "env:production"]
```

## Filter the job list by tag

```typescript
const jobs = await rotor.jobs.list("outreach", { tag: "campaign:q2-outbound" });
```

Or via REST:

```bash
curl "https://api.rotor.sh/v1/queues/outreach/jobs?tag=campaign%3Aq2-outbound" \
  -H "Authorization: Bearer $ROTOR_API_KEY"
```

## Tags in the dashboard

The **Runs** page shows tags as grey pill badges on each row. You can scan which jobs belong to a campaign at a glance — no payload inspection required.

## Tags are archived

When a job completes or fails, its tags are written to the `job_history` table alongside the result. Historical tag data is retained for your plan's full retention window (7d free, 30d pro, 90d team).

## Naming conventions

There is no enforced format. Common patterns:

| Pattern | Example | Use case |
|---------|---------|----------|
| `key:value` | `campaign:q2-outbound` | Group by named campaign |
| `entity:id` | `contact:cid_123` | Link a job to a specific record |
| `env:name` | `env:production` | Distinguish prod vs staging runs |
| `source:name` | `source:hubspot` | Track the system that triggered the job |

<Tip>
  Tags are for filtering and grouping — not for routing or conditional logic.
  If you need jobs to behave differently based on a value, put that value in
  the job payload, not in tags.
</Tip>

### Concurrency Keys

URL: https://rotor.sh/docs/guides/concurrency-keys

A concurrency key is a string you attach to a job that guarantees at most one job with that key is processing at a time. If a second job with the same key is dispatched while the first is still running, it waits — it gets delayed 10 seconds and retried automatically.

This is the n8n equivalent of "don't run this workflow if it's already running for this contact."

## Typical use case

You're running contact enrichment. You fan out 500 jobs — one per contact. Without concurrency keys, if the same contact appears in two lists, two enrichment jobs can run simultaneously, causing duplicate API calls and conflicting writes.

```typescript
import { Rotor } from "@rotorsh/sdk";

const rotor = new Rotor({ apiKey: process.env.ROTOR_API_KEY! });

await rotor.jobs.enqueue("enrichment", {
  payload: { contactId: "cid_123" },
  concurrencyKey: "contact:cid_123",
});
```

Any subsequent job with `concurrencyKey: "contact:cid_123"` will wait until the first completes or fails.

## How it works

1. When a job is dispatched to your callback URL, Rotor attempts to acquire a Redis lock: `SET NX EX 300 {workspace}:ckey:{concurrencyKey} {jobId}`.
2. If the lock is **free**, the job proceeds normally. The lock is released when the job completes or fails.
3. If the lock is **held** (another job with the same key is in flight), the current job is delayed by 10 seconds and Rotor retries it automatically. This repeats until the lock is free.
4. A 300-second safety-net TTL ensures the lock is always released, even if the worker crashes mid-job.

Concurrency keys are scoped to your workspace — there is no cross-workspace collision.

## Batch enqueue

```typescript
const contacts = ["cid_001", "cid_002", "cid_003"];

await rotor.jobs.enqueueBatch("enrichment",
  contacts.map((id) => ({
    payload: { contactId: id },
    concurrencyKey: `contact:${id}`,
  }))
);
```

Each contact gets its own lock. Two contacts can enrich in parallel; the same contact cannot.

## REST API

```bash
curl -X POST "https://api.rotor.sh/v1/queues/enrichment/jobs" \
  -H "Authorization: Bearer $ROTOR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "payload": { "contactId": "cid_123" },
    "concurrencyKey": "contact:cid_123"
  }'
```

## Key naming

Any string up to 256 characters. Common patterns:

| Pattern | Example |
|---------|---------|
| Entity ID | `contact:cid_123` |
| Account + action | `account:acc_456:sync` |
| User + resource | `user:usr_789:report` |

<Warning>
  Concurrency keys only apply to **callback-mode queues** (queues with a
  `callback_url`). They have no effect on durable workflow jobs.
</Warning>

## Combining with tags

You can use both on the same job:

```typescript
await rotor.jobs.enqueue("enrichment", {
  payload: { contactId: "cid_123" },
  tags: ["campaign:q2-outbound", "contact:cid_123"],
  concurrencyKey: "contact:cid_123",
});
```

Tags are for filtering and visibility; the concurrency key is for mutual exclusion. They serve different purposes.

### Failure Alerts

URL: https://rotor.sh/docs/guides/failure-alerts

Rotor can notify you when a job fails after all retry attempts are exhausted. Alerts fire per workspace, not per queue — one configuration covers all your queues.

## Configure alerts

Go to **Dashboard → Settings → Notifications**.

You can configure either or both:

- **Email** — enter any email address. Alerts are sent from `alerts@rotor.sh`.
- **Slack incoming webhook** — paste a webhook URL from your Slack app. Messages arrive as a formatted Block Kit card.

Save. That's it.

## What triggers an alert

Alerts fire on **terminal failure** only — when a job exhausts all retry attempts and moves to `failed` state. They do not fire on intermediate retry attempts.

If a queue has `attempts: 3` and a job fails on attempt 1 and 2 (and retries), no alert is sent. The alert fires when attempt 3 fails.

## Slack message format

```
❌ Job failed in `enrichment`

Reason: Connection timeout after 30s
Job ID: job_abc123
Queue: enrichment

[View run →]
```

The "View run" link opens the job directly in your Rotor dashboard.

## Create a Slack incoming webhook

1. Go to **api.slack.com/apps** and create or open an app.
2. Enable **Incoming Webhooks** under Features.
3. Click **Add New Webhook to Workspace** and choose a channel.
4. Copy the `https://hooks.slack.com/services/...` URL.
5. Paste it into **Settings → Notifications → Slack Incoming Webhook**.

<Note>
  The Slack webhook URL must start with `https://hooks.slack.com/`. Rotor
  validates this on save and rejects anything else.
</Note>

## Email alerts

Email alerts are sent from `alerts@rotor.sh` with the subject:

```
Job failed: {jobName} in queue {queueName}
```

The body includes the queue name, job ID, failure reason, and a direct link to the run in your dashboard.

Add `alerts@rotor.sh` to your contacts to prevent spam filtering.

## Testing alerts

The easiest way to test: enqueue a job to a queue whose callback URL returns a 5xx response, with `attempts: 1`.

```typescript
await rotor.jobs.enqueue("test-queue", {
  payload: { test: true },
});
```

With `attempts: 1`, the first failure is terminal and triggers the alert immediately.

<Tip>
  Failure alerts are workspace-level. If you want different channels for
  different queues, that's a future feature — today it's one config per
  workspace.
</Tip>

### Callback mode vs Durable Workflows

URL: https://rotor.sh/docs/guides/callback-vs-workflows

Rotor has two ways to run your code. Knowing which to use will save you time.

## Quick answer

| If you want to… | Use |
|-----------------|-----|
| Call an existing HTTP endpoint you already have | **Callback mode** |
| Build a new multi-step workflow with retries, sleeps, and waits | **Durable Workflows** |
| Replace n8n / Make / Zapier flows | **Callback mode** |
| Replace Inngest / Temporal / Trigger.dev workflows | **Durable Workflows** |
| Keep your code on your own server | **Callback mode** |
| Write workflow logic once without managing retry state | **Durable Workflows** |

---

## Callback mode

You own an HTTP endpoint. Rotor calls it with the job payload. Your endpoint runs your logic and returns 2xx. That's it.

```
[Job enqueued] → Rotor → POST your-app.example.com/handler → [Your code runs]
```

Rotor handles: delivery, retries on non-2xx, dead-letter queue, HMAC signature verification, concurrency limits, cron scheduling.

You handle: running a server, verifying the signature, writing idempotent handlers.

**When to use it:**
- You already have an HTTP server (Next.js API route, Hono, Express, anything)
- You're replacing scheduled Lambda functions, Vercel Cron handlers, or n8n/Make nodes
- You want Rotor to act as the queue and scheduler while your app stays stateless
- The job is a single unit of work (enrich this contact, send this email, sync this record)

```typescript
// Your handler — receives a signed POST for each job
app.post("/rotor/enrichment", async (c) => {
  const body = await c.req.text();
  if (!verifyRotorSignature(body, c.req.header("x-rotor-signature"), secret)) {
    return c.text("unauthorized", 401);
  }
  const { contactId } = JSON.parse(body);
  await enrichContact(contactId);
  return c.json({ ok: true });
});
```

See the [Quickstart](/docs/quickstart) for a full walkthrough.

---

## Durable Workflows

You write a TypeScript function with `step.*` calls. Rotor checkpoints each step to Postgres. If the function retries, already-completed steps return their cached results — no double-execution.

```
[Event published] → Rotor → [Step 1 runs] → [Checkpoint] → [Step 2 runs] → [Checkpoint] → ...
```

Rotor handles: checkpointing, replay on retry, sleeping without occupying a worker, waiting for external events.

You handle: writing the workflow function, deploying a small server that `serveWorkflow()` runs on.

**When to use it:**
- The job has multiple steps with independent retry budgets
- You need `step.sleep("24h")` — pause for a day without a running process
- You need `step.waitForEvent` — pause until something happens externally (email opened, payment received)
- You need `step.waitForSignal` — pause until a human approves
- You need `step.invoke` — fan out to sub-workflows and collect results

```typescript
export const outreachSequence = createFunction(
  { id: "outreach-sequence", trigger: { event: "contact.added" } },
  async ({ event, step }) => {
    const profile = await step.run("fetch-profile", () =>
      fetchContact(event.data.contactId)
    );

    await step.sleep("warm-up", "24h");   // no worker occupied during this wait

    await step.run("send-email", () =>
      sendEmail(profile.email, "intro")
    );

    const opened = await step.waitForEvent<{ at: string }>("wait-open", {
      event: "email.opened",
      match: `data.contactId == "${event.data.contactId}"`,
      timeout: "7d",
    });

    if (opened) {
      await step.run("send-followup", () => sendEmail(profile.email, "followup"));
    }
  }
);
```

See the [Workflows Quickstart](/docs/workflows/quickstart) for setup.

---

## Can you use both?

Yes. They're independent. A common pattern:

- Use **callback mode** for high-volume single-step jobs (contact enrichment, HubSpot sync)
- Use **durable workflows** for multi-step sequences that need to sleep, wait, or branch

A durable workflow step (`step.run`) can enqueue a callback-mode job via the SDK if needed — for example, delegating the heavy lifting of an enrichment step to a callback queue that runs with higher concurrency.

---

## Decision checklist

Pick callback mode if **all** of these are true:
- [ ] The job is a single unit of work (one action, one response)
- [ ] You already have or want a simple HTTP server
- [ ] You don't need to pause mid-job for more than a retry delay

Pick durable workflows if **any** of these are true:
- [ ] The job has 2+ sequential steps that should retry independently
- [ ] You need to wait hours or days between steps
- [ ] You need to pause until an external event (user action, approval, webhook) arrives
- [ ] You're building something that would require a state machine if you wrote it yourself

---

## MCP

### Install the Rotor MCP server

URL: https://rotor.sh/docs/mcp/install

# Install the Rotor MCP server

Add **27 workspace-scoped tools** to Claude Code, Cursor, or Claude Desktop. Once installed, you can prompt Claude to list queues, create schedules, replay DLQ jobs, and more — all scoped to your Rotor workspace API key.

Example prompts after install:
- _"List my Rotor queues"_
- _"Create a schedule named nightly-export that runs at 02:00 UTC"_
- _"Replay DLQ job abc-123 from the outbound queue"_

<Note>
  **Tools are workspace-scoped.** Your API key (`rt_ws_*`) grants access only to the workspace it was created in. These tools do NOT have access to other workspaces or operator-level admin resources.
</Note>

## Prerequisites

- **Node 18+** on PATH (required for `npx`)
- **A workspace API key** — prefix `rt_ws_*`. Generate one:
  ```bash
  rotor api-keys create --label "claude-code"
  ```
  (CLI install: `npm i -g @rotorsh/cli`)
- **One of:** Claude Code 1.x, Claude Desktop, or Cursor

---

## Install

### Claude Code

Paste the following into `~/.claude/.mcp.json` (create the file if it does not exist). Replace `rt_ws_<paste-your-key>` with your actual workspace API key. Save and restart Claude Code.

```json
{
  "mcpServers": {
    "rotor": {
      "command": "npx",
      "args": ["-y", "@rotorsh/mcp"],
      "env": {
        "ROTOR_API_KEY": "rt_ws_<paste-your-key>"
      }
    }
  }
}
```

### Claude Desktop

Open your Claude Desktop config file:
- **macOS:** `~/Library/Application Support/Claude/claude_desktop_config.json`
- **Windows:** `%APPDATA%\Claude\claude_desktop_config.json`
- **Linux:** `~/.config/Claude/claude_desktop_config.json`

Add or merge the following into the `mcpServers` key, then fully quit and reopen Claude Desktop.

```json
{
  "mcpServers": {
    "rotor": {
      "command": "npx",
      "args": ["-y", "@rotorsh/mcp"],
      "env": {
        "ROTOR_API_KEY": "rt_ws_<paste-your-key>"
      }
    }
  }
}
```

### Cursor

Open `~/.cursor/mcp.json` (create if it does not exist), add the following, then restart Cursor.

```json
{
  "servers": {
    "rotor": {
      "command": "npx",
      "args": ["-y", "@rotorsh/mcp"],
      "env": {
        "ROTOR_API_KEY": "rt_ws_<paste-your-key>"
      }
    }
  }
}
```

---

## Verify the connection

<Steps>
  <Step title="Restart your client">
    Fully quit and reopen Claude Code, Claude Desktop, or Cursor after saving the config file.
  </Step>
  <Step title="Open a new conversation and prompt">
    Type: **"List my Rotor queues"**
  </Step>
  <Step title="Confirm the tool was invoked">
    Claude should invoke `rotor_list_queues` and return your workspace queues. If you have no queues yet, the response is `{ queues: [] }` — that is expected and confirms the connection is working.
  </Step>
</Steps>

---

## Your first prompt: create a queue from Claude

Once your client picks up the MCP config, type a prompt like this in any new conversation:

> Create a Rotor queue called `welcome-emails` with concurrency 5 and 3 retry attempts.

Claude should invoke the `rotor_create_queue` tool with arguments matching:

```json
{
  "name": "welcome-emails",
  "concurrency": 5,
  "retry_attempts": 3
}
```

The tool returns the created queue object — confirm in your terminal with:

```bash
rotor queues list
# welcome-emails should appear with concurrency 5
```

This proves three things at once:
- The MCP server is connected (tool resolution).
- Your `ROTOR_API_KEY` is valid (server-side auth succeeded).
- The CLI and MCP share the same backend (queue created via MCP appears in CLI list).

<Tip>
If Claude picks a different tool (`rotor_list_queues` instead of `rotor_create_queue`), reword the prompt with stronger imperatives: "Create a new Rotor queue named welcome-emails..." Tool selection improves with explicit verbs.
</Tip>

---

## Available tools

27 workspace-scoped tools organized by category:

**Queues** — `rotor_list_queues`, `rotor_get_queue`, `rotor_create_queue`, `rotor_update_queue`, `rotor_set_queue_state` (pause/resume/active), `rotor_drain_queue`, `rotor_delete_queue`

**Schedules** — `rotor_list_schedules`, `rotor_get_schedule`, `rotor_create_schedule`, `rotor_update_schedule`, `rotor_delete_schedule`, `rotor_force_fire_schedule`, `rotor_inspect_run_history`

**Jobs** — `rotor_list_jobs`, `rotor_get_job`, `rotor_add_job`, `rotor_retry_job`, `rotor_delete_job`

**DLQ** — `rotor_get_dlq` (list failed jobs), `rotor_replay_dlq` (idempotent replay)

**Status** — `rotor_get_status` (workspace health)

**API Keys** — `rotor_list_api_keys`, `rotor_create_api_key`, `rotor_revoke_api_key`

**Workspaces** — `rotor_list_workspaces`, `rotor_get_workspace`

For full input schema on any tool, call `tools/list` from your MCP client.

---

## Troubleshooting

| Symptom | Resolution |
|---------|------------|
| "Tool not found" after restart | Verify Claude Code can spawn `npx` — run `npx --version` in a terminal. Check the config file syntax is valid JSON. |
| "401 Unauthorized" | Rotate your API key: `rotor api-keys create --label "claude-code-new"`. Ensure the key starts with `rt_ws_` (workspace-scoped), not `rt_q_` (queue-scoped). |
| Cold-start latency (>10s first call) | Install globally to skip `npx` download: `npm i -g @rotorsh/mcp`, then replace `"command": "npx"` with the absolute path to the `rotor-mcp` binary (find it via `which rotor-mcp`). |
| stdout corruption / JSON parse errors | This is a known guard (Pitfall 6 — all logs go to stderr). If you see parse errors, please file an issue — it indicates an unexpected `console.log` in the server. |

---

## Next steps

- [Quickstart guide](/docs/quickstart) — build your first queue-backed workflow
- [Connection guide](/mcp/connection-guide) — the hosted Streamable-HTTP MCP for workflow tools (`trigger_workflow`, `approve_run`, etc.)

### MCP Connection Guide

URL: https://rotor.sh/docs/mcp/connection-guide

# MCP Connection Guide

<Note>
  **MCP v2 ships workflow-led tools.** The previous job-shaped tools were removed. If your agent or client referenced them, see [the v2 migration guide](/mcp/v2-migration).
</Note>

The rotor.sh MCP server exposes **12 tools** that let AI agents trigger workflows, inspect runs, manage queues, and approve pending operations. Workflows are first-class. Runs are how you observe them.

## Prerequisites

1. A rotor.sh account — [sign up free](https://rotor.sh)
2. An API key — generate one at [rotor.sh/dashboard/api-keys](https://rotor.sh/dashboard/api-keys)

Your key looks like `rt_ws_<workspaceId>_<secret>`.

---

## Claude Code (recommended)

Claude Code supports MCP servers natively via the `claude mcp add` command.

```bash
claude mcp add rotor https://rotor.sh/mcp \
  --header "Authorization: Bearer $ROTOR_API_KEY"
```

Replace `$ROTOR_API_KEY` with your actual key, or set the environment variable first:

```bash
export ROTOR_API_KEY=rt_ws_your_key_here
claude mcp add rotor https://rotor.sh/mcp \
  --header "Authorization: Bearer $ROTOR_API_KEY"
```

Verify the connection:

```bash
claude mcp list
# Output should include:
# rotor: https://rotor.sh/mcp (12 tools)
```

### Available tools

| Tool | Description |
|------|-------------|
| `list_workflows` | List the workspace's registered workflow functions |
| `trigger_workflow` | Fire-and-forget trigger of a workflow. Returns `run_id`. |
| `invoke_workflow` | Synchronous wait. Polls until the run completes or times out. |
| `get_run` | Get full run detail including step graph |
| `list_runs` | List runs with filters by workflow, status, and time |
| `cancel_run` | Cancel a running workflow. Idempotent on terminal runs. |
| `upsert_queue` | Create or update a BullMQ-backed queue |
| `set_queue_state` | Pause, resume, or drain a queue |
| `drain_queue` | Destructively drain a queue (requires `confirm: true`) |
| `queue_stats` | Get job counts and throughput for a queue |
| `upsert_schedule` | Create or update a cron schedule |
| `approve_run` | Approve or reject a pending workflow approval |

---

## Claude Desktop

Claude Desktop reads MCP server configuration from a JSON file.

<Steps>
  <Step title="Open your Claude Desktop config">
    - **macOS:** `~/Library/Application Support/Claude/claude_desktop_config.json`
    - **Windows:** `%APPDATA%\Claude\claude_desktop_config.json`
    - **Linux:** `~/.config/Claude/claude_desktop_config.json`
  </Step>
  <Step title="Add the rotor MCP server">
    Paste the following into `mcpServers` (create the key if the file does not have it):

    ```json
    {
      "mcpServers": {
        "rotor": {
          "url": "https://rotor.sh/mcp",
          "headers": {
            "Authorization": "Bearer rt_ws_your_key_here"
          }
        }
      }
    }
    ```

    Replace `rt_ws_your_key_here` with your actual API key.
  </Step>
  <Step title="Restart Claude Desktop">
    Fully quit and reopen Claude Desktop. You should see the rotor tools available in the tools panel.
  </Step>
</Steps>

---

## Cursor

Cursor supports MCP via its settings panel or a config file.

### Via Cursor Settings UI

1. Open **Cursor Settings** (`Cmd/Ctrl + ,`)
2. Navigate to **Features → Model Context Protocol**
3. Click **Add MCP Server**
4. Enter:
   - **Name:** `rotor`
   - **URL:** `https://rotor.sh/mcp`
   - **Auth Header:** `Authorization: Bearer rt_ws_your_key_here`

### Via Config File

Add to your Cursor MCP config (usually `~/.cursor/mcp.json`):

```json
{
  "servers": {
    "rotor": {
      "url": "https://rotor.sh/mcp",
      "headers": {
        "Authorization": "Bearer rt_ws_your_key_here"
      }
    }
  }
}
```

Restart Cursor after saving.

---

## Verify the connection

Regardless of which client you use, verify the MCP server is responding by asking the AI assistant:

> "List my rotor workflows"

or

> "Trigger the lead-enrichment workflow with email lead@acme.com"

or

> "What was the result of run abc-123?"

The assistant should call the appropriate tool and return data from your workspace.

---

## Troubleshooting

**Tools not appearing after connecting:**
- Ensure your API key starts with `rt_ws_` (workspace-scoped key required)
- Check the key is not expired — generate a fresh one from [dashboard/api-keys](https://rotor.sh/dashboard/api-keys)
- Restart the AI client after changing the config

**401 Unauthorized errors:**
- Verify the `Authorization: Bearer ...` header is correctly formatted
- Ensure no extra whitespace around the key value

**Connection timeout:**
- The MCP endpoint is `https://rotor.sh/mcp` (not the API base URL `https://api.rotor.sh`)
- Check your network is not blocking SSE connections (some corporate proxies do)

---

## What changed in v2

The MCP surface was redesigned to put workflows at the center. If you used the previous job-shaped tools, the [v2 migration guide](/mcp/v2-migration) maps every removed tool to its replacement.

**Need help?** Open an issue on [GitHub](https://github.com/shyftai/rotor) or reach out in the [Anthropic Discord](https://discord.gg/anthropic) `#rotor` channel.

---

## Claude Code Plugin

### Claude Code Plugin Install

URL: https://rotor.sh/docs/skill/install

# Claude Code Plugin Install

The rotor.sh Claude Code plugin adds the `/rotor:new-workflow` slash command and agent rules to Claude Code, letting you scaffold GTM workflows without leaving your editor.

## Quick Install (Marketplace)

```bash
/plugin marketplace add rotor-sh/claude-plugin
/plugin install rotor
```

That's it. Restart Claude Code and run `/rotor:new-workflow outbound-with-approvals` to scaffold your first workflow.

---

## Manual Install (Fallback)

If the marketplace is unavailable or you prefer local control:

```bash
git clone https://github.com/rotor-sh/claude-plugin ~/.claude/plugins/rotor
```

Then in Claude Code:

```bash
/plugin install ~/.claude/plugins/rotor
```

---

## What Gets Installed

After installing, Claude Code gains:

| Feature | Description |
|---------|-------------|
| `/rotor:new-workflow` | Skill that scaffolds starter templates via `rotor template apply` |
| Agent rules | Copies `@rotorsh/sdk/CLAUDE.md` to your project root when you scaffold |
| MCP suggestion | Guides you through `claude mcp add rotor ...` if not already wired |

---

## Using `/rotor:new-workflow`

```
/rotor:new-workflow <template-name>
```

Available template names:

| Template | Use case |
|----------|----------|
| `outbound-with-approvals` | Outbound email/message sequences requiring human approval |
| `enrichment-dag` | Parallel data enrichment from multiple providers |
| `attribution-rollup` | Daily cron attribution rollup |
| `reply-classifier` | LLM-based reply intent classification |

### Example

```
/rotor:new-workflow outbound-with-approvals
```

Claude Code will:
1. Check `rotor` CLI is installed (install if missing)
2. Add `@rotorsh/sdk` to your project if absent
3. Run `rotor template apply outbound-with-approvals`
4. Show customization options from the generated README
5. Update `CLAUDE.md` with rotor agent rules
6. Remind you to set `ROTOR_API_KEY`
7. Suggest wiring the MCP server

---

## Verify Installation

After installing, check the skill is available:

```
/rotor:new-workflow --help
```

Or ask Claude Code directly:

> "What rotor templates are available?"

Claude should list the four starter templates.

---

## Update the Plugin

To update to the latest version:

### Marketplace install

```bash
/plugin update rotor
```

### Manual install

```bash
cd ~/.claude/plugins/rotor && git pull
/plugin install ~/.claude/plugins/rotor
```

---

## Troubleshooting

**`/rotor:new-workflow` not recognized after install:**

1. Restart Claude Code completely (quit and reopen)
2. Verify the skill file exists: `ls ~/.claude/plugins/rotor/skills/new-workflow/SKILL.md`
3. Re-run `/plugin install rotor`

**`rotor template apply` command not found:**

The `rotor` CLI is separate from the Claude Code plugin. Install it globally:

```bash
npm install -g rotor-cli
rotor --version
```

**Plugin install fails with "repository not found":**

Use the manual install method above. If `rotor-sh/claude-plugin` is not yet on GitHub, clone from the main repo:

```bash
git clone https://github.com/shyftai/rotor ~/.claude/plugins/rotor-src
cp -r ~/.claude/plugins/rotor-src/packages/rotor-plugin ~/.claude/plugins/rotor
/plugin install ~/.claude/plugins/rotor
```

**Need help?** Open an issue on [GitHub](https://github.com/shyftai/rotor) or ask in the [Anthropic Discord](https://discord.gg/anthropic) `#rotor` channel.

---

## API Reference

### API Reference

URL: https://rotor.sh/docs/api-reference/introduction

## Base URL

```
https://api.rotor.sh
```

All API endpoints are versioned under `/v1/`.

## Authentication

All `/v1/*` routes require a workspace API key passed as a Bearer token:

```http
Authorization: Bearer rt_ws_your_key_here
```

Workspace keys are prefixed `rt_ws_` and scoped to a single workspace. Generate them from your workspace settings at [rotor.sh](https://rotor.sh).

## Rate Limits

Requests are rate-limited per workspace based on your plan:

| Plan       | Requests/minute |
| ---------- | --------------- |
| Free       | 60              |
| Pro        | 300             |
| Team       | 1,000           |
| Enterprise | Custom          |

## Error Format

All errors return a consistent JSON envelope:

```json
{
  "error": {
    "code": "QUOTA_EXCEEDED",
    "message": "Monthly job quota exceeded for this workspace",
    "details": { "used": 10000, "limit": 10000, "plan": "free" }
  }
}
```

## Tag Groups

The OpenAPI spec below groups endpoints by resource:

- **Queues** — create, list, get, delete queues; per-queue metrics
- **Jobs** — enqueue single + batch; get status; retry; DLQ management
- **Schedules** — create, list, pause, resume, delete recurring jobs
- **Status** — workspace health check + quota usage
- **Usage** — billing-period execution breakdown
- **Metrics** — Prometheus-compatible counters (internal; gated at infra level)

## OpenAPI Spec

Download the full machine-readable spec:

```bash
curl https://api.rotor.sh/doc -o openapi.json
```

The spec covers all `/v1` endpoints with full request/response schemas. You can import it into Insomnia, Postman, or any OpenAPI-compatible client.

## Endpoints at a glance

| Method | Path | Description |
|--------|------|-------------|
| GET | `/v1/queues` | List all queues |
| POST | `/v1/queues` | Create a queue |
| GET | `/v1/queues/:name` | Get queue config + metrics |
| PATCH | `/v1/queues/:name` | Update queue config |
| DELETE | `/v1/queues/:name` | Delete a queue |
| POST | `/v1/queues/:name/jobs` | Enqueue a job |
| POST | `/v1/queues/:name/jobs/batch` | Enqueue jobs in bulk |
| GET | `/v1/queues/:name/jobs` | List jobs (with `?state=`, `?tag=`) |
| GET | `/v1/queues/:name/jobs/:id` | Get job status |
| DELETE | `/v1/queues/:name/jobs/:id` | Cancel / delete a job |
| POST | `/v1/queues/:name/jobs/:id/retry` | Retry a failed job |
| GET | `/v1/schedules` | List schedules |
| POST | `/v1/schedules` | Create a schedule |
| PATCH | `/v1/schedules/:name/pause` | Pause a schedule |
| PATCH | `/v1/schedules/:name/resume` | Resume a schedule |
| DELETE | `/v1/schedules/:name` | Delete a schedule |
| GET | `/v1/status` | Workspace health + quota |
| GET | `/v1/usage` | Billing-period execution breakdown |

---

## Workflows

### Durable Workflows Quickstart

URL: https://rotor.sh/docs/workflows/quickstart

# Durable Workflows Quickstart

Rotor durable workflows are multi-step TypeScript functions that execute reliably across retries, sleeps, and external event waits — without occupying a long-lived worker process.

## How it works

Each workflow step is durably checkpointed to Postgres. If a retry occurs, the function body re-executes from the top, but already-completed steps return their cached results immediately — no double-execution of side effects.

## Installation

```bash
npm install @rotorsh/sdk
```

## Define a function

```typescript
import { createFunction } from '@rotorsh/sdk';

export const welcomeSequence = createFunction(
  {
    id: 'welcome-sequence',
    trigger: { event: 'user.created' },
    retries: 3,
  },
  async ({ event, step }) => {
    // step.run() executes exactly once per run — safe to put side effects here
    const profile = await step.run('fetch-profile', async () => {
      const res = await fetch(`/api/users/${event.data.userId}`);
      return res.json();
    });

    // Pause for 24 hours without occupying a worker slot
    await step.sleep('initial-wait', '24h');

    // Send a personalized welcome email
    await step.run('send-welcome', async () => {
      await sendEmail({
        to: profile.email,
        subject: `Welcome, ${profile.name}!`,
        body: `Thanks for signing up...`,
      });
    });

    return { sent: true };
  },
);
```

## Register with serve()

```typescript
import { Hono } from 'hono';
import { serveWorkflow } from '@rotorsh/sdk';
import { welcomeSequence } from './workflows/welcome-sequence';

const app = new Hono();

serveWorkflow(app, {
  functions: [welcomeSequence],
  signingKey: process.env.ROTOR_SIGNING_KEY!, // set in Rotor dashboard
});

export default app;
```

## Register the function with Rotor

```bash
curl -X POST https://api.rotor.sh/v1/functions \
  -H "Authorization: Bearer $ROTOR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "id": "welcome-sequence",
    "serve_url": "https://your-app.example.com/api/rotor",
    "trigger_type": "event",
    "trigger_event": "user.created"
  }'
```

## Trigger the function

```typescript
import { Rotor } from '@rotorsh/sdk';

const rotor = new Rotor({ apiKey: process.env.ROTOR_API_KEY! });

// Publish an event — Rotor fans out to all matching functions
await rotor.send('user.created', { userId: '123', email: 'alice@example.com' });
```

## Step primitives

| Primitive | Description |
|-----------|-------------|
| `step.run(id, fn)` | Executes `fn` exactly once per run; cached on retry |
| `step.sleep(id, duration)` | Suspends for a duration — `'30m'`, `'2d'`, `'1h'` |
| `step.sleepUntil(id, timestamp)` | Suspends until a specific timestamp |
| `step.waitForEvent(id, opts)` | Suspends until a matching event arrives via `rotor.send()` |

## Phase 10 primitives

Three new step primitives extend the base set with sub-workflow composition, event fan-out, and human-in-the-loop capabilities:

**[step.invoke](/docs/workflows/step-invoke)** — Spawn a child workflow and wait for its result. `timeout` is required (TypeScript build fails without it). Returns `{ result, runId }`. Throws `WorkflowFailedError`, `WorkflowTimeoutError`, or `WorkflowCancelledError` on child failure. Memoized on parent retry — child is not re-spawned.

**[step.sendEvent](/docs/workflows/step-send-event)** — Fire-and-forget event emission from inside a step. Parent step completes immediately. Optional `ts` parameter defers emission via BullMQ delay queue. Memoized on retry — no double-emit.

**[step.waitForSignal](/docs/workflows/step-wait-for-signal)** — Named pause/resume. Suspends the run until `POST /v1/signals/:signal_id/complete` is called. The signal ID is workspace-unique — use `approval-${runId}` to avoid collisions. Throws `SignalTimeoutError` if `timeout` elapses before the POST arrives.

### Full-primitive workflow example

This example uses all six step primitives in one workflow — mirrors the backward-compatibility test suite Case E:

```typescript
import { createFunction } from '@rotorsh/sdk';
import {
  WorkflowFailedError,
  WorkflowTimeoutError,
  SignalTimeoutError,
} from '@rotor/sdk/errors';

export const fullPrimitivesWorkflow = createFunction(
  {
    id: 'full-primitives-workflow',
    trigger: { event: 'campaign.requested' },
    retries: 2,
  },
  async ({ event, step, runId }) => {
    // 1. step.run — fetch contact profile (exactly once, cached on retry)
    const profile = await step.run('fetch-profile', async () => {
      const res = await fetch(`/api/contacts/${event.data.contactId}`);
      return res.json() as Promise<{ email: string; name: string }>;
    });

    // 2. step.sleep — wait 1 hour before enriching (no worker slot occupied)
    await step.sleep('initial-delay', '1h');

    // 3. step.invoke — delegate to enrichment specialist workflow
    let enriched: { score: number; company: string };
    try {
      const { result } = await step.invoke<typeof enriched>('enrich-contact', {
        workflow: { id: 'contact-enricher' },
        data: { contactId: event.data.contactId },
        timeout: '5m',   // REQUIRED — TypeScript build fails without this
      });
      enriched = result;
    } catch (err) {
      if (err instanceof WorkflowFailedError || err instanceof WorkflowTimeoutError) {
        enriched = { score: 0, company: 'Unknown' };
      } else {
        throw err;
      }
    }

    // 4. step.sendEvent — fan-out to analytics pipeline (fire and forget)
    await step.sendEvent('notify-analytics', {
      name: 'analytics/contact.scored',
      data: { contactId: event.data.contactId, score: enriched.score },
    });

    // 5. step.waitForSignal — require human approval before sending
    let approved = false;
    try {
      const approval = await step.waitForSignal<{ approved: boolean }>(
        'approval-gate',
        {
          signal: `campaign-approval-${runId}`,
          timeout: '48h',
        }
      );
      approved = approval.approved;
    } catch (err) {
      if (err instanceof SignalTimeoutError) {
        // Auto-reject after 48h
        return { status: 'auto-rejected', contactId: event.data.contactId };
      }
      throw err;
    }

    if (!approved) {
      return { status: 'rejected', contactId: event.data.contactId };
    }

    // 6. step.waitForEvent — wait for the contact to open the email (max 7 days)
    const opened = await step.waitForEvent<{ openedAt: string }>('wait-for-open', {
      event: 'email.opened',
      match: `data.contactId == "${event.data.contactId}"`,
      timeout: '7d',
    });

    return {
      status: 'complete',
      contactId: event.data.contactId,
      score: enriched.score,
      emailOpened: opened !== null,
    };
  }
);
```

## Fan-out with Promise.all

```typescript
const [resultA, resultB] = await Promise.all([
  step.run('enrich-email', () => enrichEmail(contact.email)),
  step.run('enrich-phone', () => enrichPhone(contact.phone)),
]);
```

Both steps run in parallel. The workflow resumes when both complete.

## Cancellation

```typescript
// Cancel a run via event
await rotor.send('rotor/cancel.run', { runId: 'run-abc-123' });
```

In-flight steps complete; subsequent steps are skipped. Run status moves to `cancelled`.

## View step graph

Open `/dashboard/runs/:id` to see a real-time step DAG with status colors and output previews.

## Starter templates

| Template | Description |
|----------|-------------|
| `multi-step-outbound` | Outbound email sequence with waitForEvent reply detection |
| `event-driven-enrichment` | Parallel contact enrichment on campaign.started |
| `scheduled-with-approval` | Weekly cron report requiring Slack/terminal approval |

```bash
npx rotor template add multi-step-outbound
```

### step.invoke — Sync Sub-Workflow Call

URL: https://rotor.sh/docs/workflows/step-invoke

## Overview

`step.invoke` spawns a child workflow run and suspends the parent step until the child completes. When the child finishes, the parent resumes with the child's return value. The parent's step result is durably memoized — on retry, the cached child result is returned without re-spawning.

Unlike enqueueing a job and polling, `step.invoke` gives you a type-safe return value and propagates errors — the parent step throws if the child fails, times out, or is cancelled.

## API shape

```typescript
const { result, runId } = await step.invoke<{ score: number }>("enrich-contact", {
  workflow: { id: "contact-enrichment" },   // target workflow id (must be event-triggered)
  data: { contactId: "c_123" },             // trigger data sent as the event payload
  timeout: "10m",                           // REQUIRED — no infinite default. Max 24h.
});
```

```typescript
// Full type signature:
invoke<R = unknown>(
  id: string,
  opts: {
    workflow: { id: string };
    data: unknown;
    timeout: string;    // REQUIRED. e.g. "5m", "1h", "24h". TypeScript build fails without it.
  }
): Promise<{ result: R; runId: string }>
```

## Return value

`step.invoke` returns `{ result, runId }`:

| Field | Type | Description |
|-------|------|-------------|
| `result` | `R` | The child workflow's return value, typed via the `<R>` generic |
| `runId` | `string` | The child run's UUID — useful for debugging, linking logs, and dashboard navigation |

The `runId` is included because observability matters: when a parent invokes dozens of children, you need to link parent and child spans without scraping logs.

## Error model

`step.invoke` throws one of three errors when the child doesn't complete successfully:

| Error class | When it fires | Import |
|-------------|--------------|--------|
| `WorkflowFailedError` | Child workflow threw or reached terminal failure | `@rotor/sdk/errors` |
| `WorkflowTimeoutError` | Parent-side timeout elapsed (child continues running) | `@rotor/sdk/errors` |
| `WorkflowCancelledError` | Child was cancelled mid-flight (user cancel, future `cancelOn` match) | `@rotor/sdk/errors` |

```typescript
import {
  WorkflowFailedError,
  WorkflowTimeoutError,
  WorkflowCancelledError,
} from "@rotor/sdk/errors";

const { result } = await step.invoke("run-analysis", {
  workflow: { id: "analysis-workflow" },
  data: { reportId },
  timeout: "30m",
}).catch((err) => {
  if (err.name === "WorkflowTimeoutError") {
    // Child is still running independently — record the runId for follow-up
    return { result: null, runId: "unknown" };
  }
  if (err.name === "WorkflowCancelledError") {
    // Child was explicitly cancelled — compensate
    await step.run("record-cancellation", () => recordCancellation(reportId));
    throw err;
  }
  // WorkflowFailedError or unexpected — rethrow
  throw err;
});
```

## Cancellation matrix

| Direction | What happens | Error surfaced |
|-----------|--------------|----------------|
| Parent cancelled → child | `cancelRun(parentId)` recursively cancels all in-flight children via `parent_run_id` scan | N/A — parent itself was cancelled |
| Child cancelled → parent step | Child reaches `cancelled` terminal status; parent's pending `step.invoke` throws | `WorkflowCancelledError` |
| Parent-side timeout fires | `workflow:invoke-timeout` marks parent step failed; **child keeps running independently** | `WorkflowTimeoutError` |

<Warning>
  **`WorkflowTimeoutError` does NOT cancel the child.** The child run continues
  executing in the background. The timeout means "I'm done waiting for you" — it
  is a parent concern, not a child concern. This is intentional and matches the
  "wait for at most N minutes, but don't waste the work already done" semantics.
  Record the `runId` so you can track or cancel the child separately if needed.
</Warning>

## Why timeout is required

<Note>
  **Inngest's blocks-forever default is a billing bug pretending to be a UX
  feature.** Forgetting to set a timeout on a blocking sub-workflow call means
  your run — and your billing meter — runs until the child completes, which could
  be days. Rotor surfaces this as a compile error: `timeout` is required at the
  TypeScript type level. The maximum is 24 hours.

  Recommended default: `"5m"` — matches the step.run cap and is long enough for
  most enrichment/analysis sub-agents. Increase only when you have a concrete
  SLA reason.
</Note>

## Memoization

On parent retry, the cached child result is returned without re-spawning:

1. The serve adapter hashes `"stepId:counter"` to produce a stable step key.
2. If `completedSteps[hash]` is already populated, `step.invoke` returns the cached `{ result, runId }` immediately.
3. No duplicate child runs are created — the child workflow is spawned exactly once per unique step ID.

If the parent timed out on the previous attempt, the cached `WorkflowTimeoutError` is re-thrown without spawning a new child. Catch `WorkflowTimeoutError` inside your workflow function if you want to break out of this retry loop.

## Common patterns

### Sub-agent delegation

Fan out to a specialist workflow and collect structured results:

```typescript
export const analysisOrchestrator = createFunction(
  { id: "analysis-orchestrator", trigger: { event: "analysis.requested" }, retries: 2 },
  async ({ event, step }) => {
    // Delegate to a contact enrichment specialist
    const { result: enriched } = await step.invoke("enrich", {
      workflow: { id: "contact-enricher" },
      data: { contactId: event.data.contactId },
      timeout: "5m",
    });

    // Delegate to a scoring model
    const { result: score } = await step.invoke("score", {
      workflow: { id: "lead-scorer" },
      data: { enriched },
      timeout: "2m",
    });

    return { contactId: event.data.contactId, score: score.value };
  }
);
```

### Pipeline composition

Chain workflows where each stage's output feeds the next:

```typescript
const { result: stage1 } = await step.invoke("stage-1", {
  workflow: { id: "data-cleaner" },
  data: rawPayload,
  timeout: "10m",
});

const { result: stage2 } = await step.invoke("stage-2", {
  workflow: { id: "data-transformer" },
  data: stage1,
  timeout: "10m",
});

const { result: final } = await step.invoke("stage-3", {
  workflow: { id: "data-loader" },
  data: stage2,
  timeout: "5m",
});
```

### Fan-out with Promise.all

Invoke multiple children in parallel and merge results:

```typescript
const [{ result: a }, { result: b }, { result: c }] = await Promise.all([
  step.invoke("scorer-a", { workflow: { id: "scorer-a" }, data: contact, timeout: "3m" }),
  step.invoke("scorer-b", { workflow: { id: "scorer-b" }, data: contact, timeout: "3m" }),
  step.invoke("scorer-c", { workflow: { id: "scorer-c" }, data: contact, timeout: "3m" }),
]);
return { consensus: average(a.score, b.score, c.score) };
```

## Limits

| Constraint | Value |
|------------|-------|
| Maximum `timeout` | 24 hours |
| Target workflow trigger type | Must be `event`-triggered (not `cron`) |
| Recursion depth | No platform limit — user's responsibility to avoid infinite loops |
| Parallel invocations | No platform limit — use `Promise.all` freely |

<Tip>
  **Avoid invoking cron workflows.** Cron workflows are not designed to receive
  event data — they read from schedules, not event payloads. `step.invoke` will
  fail immediately if the target workflow has `trigger_type = 'cron'`.
</Tip>

### step.sendEvent — Fire-and-Forget Event Emission

URL: https://rotor.sh/docs/workflows/step-send-event

## Overview

`step.sendEvent` publishes an event to your workspace event bus from within a running workflow. The parent step completes immediately — it does not wait for any downstream workflow triggered by the event. This is the async counterpart to `step.invoke`: use `step.sendEvent` when you want to fan out work without blocking the current workflow.

The call is memoized: on parent retry, the event is not emitted a second time.

## API shape

```typescript
await step.sendEvent("spawn-enricher", {
  name: "agent/contact.enrich.requested",   // event name — dot-notation convention
  data: { contactId: "c_123", priority: "high" },
  ts: undefined,   // optional: Unix timestamp in ms — omit for immediate emission
});
```

```typescript
// Full type signature:
sendEvent(
  id: string,
  opts: {
    name: string;
    data: unknown;
    ts?: number;   // optional Unix timestamp (ms). Future ts defers emission via BullMQ delay queue.
  }
): Promise<void>
```

## Return value

`step.sendEvent` returns `Promise<void>`. There is no result to await — the operation is fire-and-forget from the parent workflow's perspective.

## Future scheduling

When `ts` is set to a future Unix timestamp (milliseconds), the event emission is deferred via BullMQ's delay queue. The parent step completes immediately upon scheduling — it does not block until the future time.

```typescript
const tomorrow = Date.now() + 24 * 60 * 60 * 1000;

await step.sendEvent("schedule-followup", {
  name: "outreach/followup.scheduled",
  data: { contactId, templateId: "follow-up-v2" },
  ts: tomorrow,   // event fires ~24h from now
});

// Parent workflow continues immediately after this line
await step.run("record-scheduled", () => db.outreach.markScheduled(contactId));
```

BullMQ stores the deferred job and emits the event at `ts`. If the worker restarts before `ts`, BullMQ's durable queue ensures the event fires on recovery — no manual retry logic needed.

## Memoization

On parent retry, `step.sendEvent` is a no-op for an already-emitted step:

1. The serve adapter hashes `"stepId:counter"` to a stable key.
2. If `completedSteps[hash]` is already populated, `step.sendEvent` returns immediately without re-inserting to the events table.
3. No duplicate events are emitted.

This gives the same memoization guarantee as `step.run`: side effects happen exactly once per workflow execution.

## Common patterns

### Async fan-out to parallel agents

Spawn N sub-agents without waiting for any of them:

```typescript
for (const contact of contacts) {
  await step.sendEvent(`spawn-enricher-${contact.id}`, {
    name: "agent/enrich.requested",
    data: { contactId: contact.id, workflowRunId: runId },
  });
}
// All N enrichment workflows start in parallel — parent does not wait
```

### Scheduled sub-agent

Schedule follow-up work at a specific time without running a separate cron:

```typescript
const sendAt = new Date("2026-05-01T09:00:00Z").getTime();

await step.sendEvent("schedule-campaign-blast", {
  name: "campaign/blast.scheduled",
  data: { campaignId, segmentId },
  ts: sendAt,
});
```

### Chain without coupling

Trigger a downstream workflow by name, allowing you to evolve each workflow independently:

```typescript
const { data: enriched } = await step.run("enrich", () => enrichContact(contactId));

// Hand off to a separate "scoring" workflow — no direct import required
await step.sendEvent("trigger-scoring", {
  name: "scoring/contact.ready",
  data: enriched,
});
```

## Limits

| Constraint | Value |
|------------|-------|
| Workspace scope | Events are workspace-scoped — no cross-workspace emit |
| Deferred sendEvent durability | BullMQ delay queue — durable across worker restarts |
| Deferred sendEvent idempotency | NOT idempotent across worker crashes before the delay job is persisted (Phase 11 follow-up) |
| `ts` precision | Millisecond Unix timestamp — use `Date.now()` + offset |

<Warning>
  **Deferred `sendEvent` is not yet idempotent across worker crashes.** If the
  worker crashes between the serve adapter call and the BullMQ `add(delay:)` call,
  the event may not be scheduled. Phase 11 (idempotency keys) closes this window.
  For critical scheduled events, use a dedicated `step.sleep` + `step.sendEvent`
  sequence instead — the sleep memoizes durably and the sendEvent fires after
  resume.
</Warning>

### step.waitForSignal — Named Pause / Resume

URL: https://rotor.sh/docs/workflows/step-wait-for-signal

## Overview

`step.waitForSignal` suspends a workflow run until a named signal arrives via `POST /v1/signals/:signal_id/complete`. This is the idiomatic Rotor pattern for human-in-the-loop workflows: approval gates, review queues, external integration callbacks, and any case where a workflow needs to wait for an action that originates outside the system.

Unlike `step.waitForEvent` — which matches against the global event stream — `step.waitForSignal` targets exactly one run by a signal ID you control. No fan-out, no broadcast: one signal resumes one run.

## API shape

```typescript
const approval = await step.waitForSignal<{ approved: boolean; reason?: string }>(
  "approval-gate",
  {
    signal: `approval-${runId}`,   // unique per run — include runId to avoid collisions
    timeout: "48h",                // REQUIRED — throws SignalTimeoutError if no POST arrives
  }
);

if (!approval.approved) {
  throw new Error(`Rejected: ${approval.reason}`);
}
```

```typescript
// Full type signature:
waitForSignal<R = unknown>(
  id: string,
  opts: {
    signal: string;    // workspace-unique signal ID
    timeout: string;   // e.g. "24h", "7d" — required
  }
): Promise<R>
```

## Return value

`step.waitForSignal` returns the `data` field from the resume POST body, typed via the `<R>` generic. If the POST body is `{ "data": { "approved": true } }`, the step returns `{ approved: true }`.

## Resume mechanism

Resume a waiting run by POSTing to the signal's completion endpoint:

```
POST /v1/signals/:signal_id/complete
Authorization: Bearer <rt_ws_key>
Content-Type: application/json

{
  "data": { "approved": true, "reviewer": "alice@example.com" }
}
```

### Authentication

The API key must belong to the **same workspace** that owns the signal. Cross-workspace resume attempts receive a `404` (not a `401`) to prevent signal enumeration across tenants — see the 404/401 note below.

### Response codes

| Status | Meaning |
|--------|---------|
| `200` | Signal received; run queued for resume |
| `404` | Signal not found (unknown ID or cross-workspace) |
| `409` | Collision — signal_id already in use by another active waiter in this workspace |
| `410` | Signal already resolved (run already resumed or timed out) |
| `422` | Missing or invalid `data` field in body |

### 404 on cross-workspace — not 401

Rotor returns `404` for both unknown signal IDs and cross-workspace resume attempts. Returning `401` for cross-workspace would leak signal existence (an attacker could enumerate live signal IDs by observing `401` vs `404`). The `404` response keeps signal existence opaque across workspace boundaries.

## Error model

| Error class | When it fires |
|-------------|--------------|
| `SignalTimeoutError` | `timeout` elapsed before a `POST` was received |

```typescript
import { SignalTimeoutError } from "@rotor/sdk/errors";

try {
  const result = await step.waitForSignal("human-review", {
    signal: `review-${runId}`,
    timeout: "72h",
  });
  return { status: "approved", data: result };
} catch (err) {
  if (err.name === "SignalTimeoutError") {
    await step.run("auto-reject", () => recordAutoRejection(runId));
    return { status: "auto-rejected" };
  }
  throw err;
}
```

## signal_id uniqueness scope

Signal IDs are **unique per workspace** — not per run and not globally. Two simultaneous runs in the same workspace cannot share the same signal ID. Recommended pattern:

```typescript
// Use runId (unique per workflow execution) as part of the signal ID
const signal = `approval-${runId}`;

// If you need multiple signals per run, add a qualifier:
const signal1 = `budget-approval-${runId}`;
const signal2 = `legal-approval-${runId}`;
```

If a signal ID collision occurs (e.g., you accidentally reuse an ID that's already waiting), you get a `409` response from the POST endpoint, which surfaces as a run failure in the orchestrator.

## Collision behavior

A `409` from the `workflow_run_waiters` UNIQUE constraint (on `workspace_id + signal_id`) surfaces as a run failure with a distinct error message. Your consumer code (the entity POSTing to the signal endpoint) should treat `409` as a bug in signal ID generation — not a retry-able transient error.

## Memoization

On parent retry, `step.waitForSignal` does not create a duplicate waiter:

1. If a `workflow_run_waiters` row already exists for `(run_id, step_id)`, the serve adapter returns the cached result.
2. If the signal was already completed before the retry, the cached resolved value is returned immediately.
3. If the signal is still pending (awaiting POST), the run re-suspends on the existing waiter.

## Common patterns

### Approval gate

```typescript
export const outreachWithApproval = createFunction(
  { id: "outreach-with-approval", trigger: { event: "outreach.requested" }, retries: 2 },
  async ({ event, step, runId }) => {
    const draft = await step.run("draft-message", async () =>
      generateDraft(event.data.contactId)
    );

    // Post to Slack with a link to your approval UI
    await step.run("notify-approver", async () =>
      slack.send({
        channel: "#outreach-approvals",
        text: `New outreach draft for ${event.data.contactId}`,
        actions: [
          { text: "Approve", url: `https://app.example.com/approve?signal=approval-${runId}` },
          { text: "Reject",  url: `https://app.example.com/reject?signal=approval-${runId}` },
        ],
      })
    );

    // Pause until the approver clicks Approve or Reject
    const { approved, reason } = await step.waitForSignal<{ approved: boolean; reason?: string }>(
      "wait-for-approval",
      { signal: `approval-${runId}`, timeout: "48h" }
    );

    if (!approved) {
      return { status: "rejected", reason };
    }

    await step.run("send-message", () => sendOutreach(draft, event.data.contactId));
    return { status: "sent" };
  }
);
```

Your approval UI POSTs to complete the signal:

```typescript
// In your approval UI API handler:
await fetch(`https://api.rotor.sh/v1/signals/approval-${runId}/complete`, {
  method: "POST",
  headers: {
    Authorization: `Bearer ${process.env.ROTOR_API_KEY}`,
    "Content-Type": "application/json",
  },
  body: JSON.stringify({ data: { approved: true, reviewer: "alice@example.com" } }),
});
```

### Human review with fallback

```typescript
let reviewResult: { action: "approve" | "escalate" } | null = null;

try {
  reviewResult = await step.waitForSignal("human-review", {
    signal: `review-${runId}`,
    timeout: "24h",
  });
} catch (err) {
  if (err.name === "SignalTimeoutError") {
    // Auto-escalate after 24h of no response
    await step.run("escalate", () => escalateToManager(runId));
    return { status: "escalated" };
  }
  throw err;
}
```

### External integration callback

Use `step.waitForSignal` when an external service calls back asynchronously — e.g., a KYC provider, a document signing service, or a payment processor:

```typescript
// Register your signal ID with the external service before suspending
await step.run("register-kyc-webhook", () =>
  kycProvider.startVerification({
    userId: event.data.userId,
    callbackSignalId: `kyc-${runId}`,
  })
);

const { verified, failureReason } = await step.waitForSignal<KycResult>(
  "wait-for-kyc",
  { signal: `kyc-${runId}`, timeout: "7d" }
);
```

Your KYC webhook handler then calls `POST /v1/signals/kyc-${runId}/complete` with the result.

## Cancellation

Cancelling a workflow run that is paused on `step.waitForSignal` automatically deletes the waiter row — the signal ID is released and the run moves to `cancelled`. Any subsequent `POST` to that signal ID returns `404`.

## Limits

| Constraint | Value |
|------------|-------|
| Uniqueness scope | Per workspace (not per run) — use `runId` in signal ID |
| Collision | `409` at resume endpoint; surfaces as run failure |
| Timeout maximum | No hard platform max — use `"7d"` for week-long human-review flows |
| Resume body size | Limited by API gateway max body (512 KB) |
| Cross-workspace resume | Returns `404` (not `401`) — workspace boundary enforced silently |

### step.waitForEvent

URL: https://rotor.sh/docs/workflows/step-wait-for-event

`step.waitForEvent` suspends the current workflow run until a specific event is published to your workspace via `rotor.send()`. If no matching event arrives before the timeout, the step returns `null`.

## Signature

```typescript
const result = await step.waitForEvent<T>(
  id: string,
  opts: {
    event: string;        // event name to wait for
    match?: string;       // filter expression (see below)
    timeout: string;      // required — e.g. "1h", "7d", "30m"
  }
): Promise<T | null>
```

Returns the `data` field of the matching event, or `null` if the timeout elapses first.

## Basic example

```typescript
const opened = await step.waitForEvent<{ openedAt: string }>("wait-for-open", {
  event: "email.opened",
  timeout: "7d",
});

if (opened === null) {
  // No open event within 7 days
  return { status: "no-open" };
}

console.log("Email opened at:", opened.openedAt);
```

## Filtering with `match`

The `match` field is a filter expression that lets you wait for a specific event among many. It uses a CEL-like syntax operating on the incoming event's `data` field.

```typescript
const opened = await step.waitForEvent<{ openedAt: string }>("wait-for-open", {
  event: "email.opened",
  match: `data.contactId == "${event.data.contactId}"`,
  timeout: "7d",
});
```

The expression is evaluated against each incoming event of the given name. The step resumes with the first event where the expression evaluates to `true`.

**Supported operators:**

| Operator | Example |
|----------|---------|
| Equality | `data.userId == "u_123"` |
| Inequality | `data.status != "bounced"` |
| Logical AND | `data.campaignId == "c1" && data.step == "opened"` |
| Logical OR | `data.source == "web" \|\| data.source == "mobile"` |
| Numeric comparison | `data.score >= 50` |

<Warning>
  Always use `match` when you have multiple workflow runs that could each be
  waiting for the same event name. Without a filter, any workflow run waiting
  for `email.opened` will resume on the first open event for any contact.
</Warning>

## Triggering the event

From anywhere in your application — a webhook handler, another workflow, or the SDK:

```typescript
const rotor = new Rotor({ apiKey: process.env.ROTOR_API_KEY! });

await rotor.send("email.opened", {
  contactId: "cid_123",
  openedAt: new Date().toISOString(),
  campaignId: "camp_456",
});
```

`rotor.send()` fans out to all workflow runs currently suspended at a matching `step.waitForEvent`.

## Timeout behavior

When the timeout elapses before a matching event arrives, `step.waitForEvent` returns `null` — it does **not** throw. Check the return value:

```typescript
const reply = await step.waitForEvent<{ approved: boolean }>("wait-for-reply", {
  event: "campaign.reply",
  match: `data.contactId == "${contactId}"`,
  timeout: "3d",
});

if (reply === null) {
  // Follow up after 3 days of silence
  await step.run("send-followup", () => sendFollowUp(contactId));
  return { status: "followed-up" };
}
```

## Difference from step.waitForSignal

| | `step.waitForEvent` | `step.waitForSignal` |
|--|---------------------|----------------------|
| Triggered by | `rotor.send(eventName, data)` | `POST /v1/signals/:id/complete` |
| ID collision | Multiple runs can match the same event with filters | Signal ID must be globally unique per workspace |
| On timeout | Returns `null` | Throws `SignalTimeoutError` |
| Best for | Waiting for user behaviour (opens, clicks, replies) | Approval gates, human-in-the-loop steps |

Use `step.waitForEvent` when you're waiting for an event that happens naturally in your system. Use `step.waitForSignal` when you need an explicit approval or external trigger with an unambiguous ID.

## Memoization

`step.waitForEvent` is memoized. If the workflow retries after the event was already received and cached, the step returns the cached event data immediately without re-suspending.

---

## Migrate

### Migrate from Inngest

URL: https://rotor.sh/docs/migrate/from-inngest

## TL;DR

Inngest serves HTTP events; Rotor's **HTTP callback mode** (new in Phase 3) maps 1:1 to your existing handler code. Migration is a handler URL swap and a config copy — your Inngest handlers are already Rotor-compatible HTTP routes. No SDK rewrite, no worker container, no runtime change.

At **1M runs/mo**, Rotor Team costs **$99 flat**; Inngest requires Enterprise pricing.

<Note>
  Rotor is the right fit if you like Inngest's HTTP-function ergonomics but want
  flat per-workspace pricing instead of per-execution usage metering. It is
  **not** a replacement for multi-step Durable Execution (step.waitForEvent,
  step.invoke, cross-step observability). See [What Rotor does NOT
  replace](#what-rotor-does-not-replace).
</Note>

## Pricing at a glance

| Plan | Price | Executions / mo | Retention | Notes |
|------|-------|-----------------|-----------|-------|
| Inngest Free | $0 | 50k runs | — | Low limits |
| Inngest Cloud | $100/mo | 500k runs | — | Per-execution meter |
| Inngest Enterprise | Custom | 1M+ | — | Required above Cloud limits |
| **Rotor Free** | **$0** | **10k** | 7d | Unlimited queues + schedules |
| **Rotor Pro** | **$19/mo** | **100k** | 30d | 14-day trial, no card |
| **Rotor Team** | **$99/mo** | **1M** | 90d | Flat — no overages |
| Rotor Enterprise | Custom | Custom | 365d | Managed workers + SOC 2 + SSO |

<Tip>
  **100k runs/mo:** Inngest Cloud $100 vs Rotor Pro $19 → **5.3x cheaper**.
  **1M runs/mo:** Inngest requires Enterprise (unknown pricing); Rotor Team is
  **$99 flat**. Competitor pricing was verified against inngest.com/pricing on
  the publish date of this guide — re-check before making a purchasing
  decision.
</Tip>

## Architecture mapping

| Inngest concept | Rotor equivalent | Migration notes |
|-----------------|------------------|-----------------|
| `inngest.createFunction({id, event}, handler)` | Queue + `callback_url` | `POST /v1/queues` then `PATCH` the queue with your handler URL |
| `inngest.send({name, data})` | `POST /v1/queues/:name/jobs` | 1:1 — same enqueue-from-API pattern |
| `step.run("name", async () => ...)` | Idempotent handler + BullMQ `attempts` + backoff | Rotor has one handler per queue; split steps into separate queues if you need independent retry budgets |
| `step.sleep("1h")` | Delayed enqueue (`delay: 3600` on the job) | Same outcome — job becomes eligible after delay |
| `step.waitForEvent(...)` | Approval queue (Phase 2 — [see docs](/enterprise)) | Partial match: approval queues pause until approved/rejected; not a general "wait for any event" primitive |
| `createFunction({cron})` | `POST /v1/schedules` | Cron syntax is the same; Rotor **requires** a `timezone` field |
| Dashboard event history | Job history archive (Phase 3) | 7–365 days retention by plan |

## Before / After

<CodeGroup>

```typescript Inngest (before)
// inngest/functions.ts
import { inngest } from "./client";

export const welcomeEmail = inngest.createFunction(
  { id: "welcome-email" },
  { event: "user/created" },
  async ({ event, step }) => {
    await step.run("send-email", async () => {
      return sendEmail({
        to: event.data.email,
        template: "welcome",
      });
    });
  }
);
```

```typescript Rotor (after)
// app/api/rotor/welcome-email/route.ts (your existing handler)
import { createHmac, timingSafeEqual } from "node:crypto";

export async function POST(req: Request) {
  const body = await req.text();
  if (!verifyRotorSignature(req.headers, body)) {
    return new Response("unauthorized", { status: 401 });
  }
  const job = JSON.parse(body);
  const jobId = req.headers.get("x-rotor-job-id")!;

  // Idempotent send keyed on the Rotor job id
  await sendEmailOnce(jobId, {
    to: job.payload.email,
    template: "welcome",
  });

  return new Response("ok");
}
```

</CodeGroup>

<CodeGroup>

```bash One-time setup (create the queue)
curl -X POST https://api.rotor.sh/v1/queues \
  -H "Authorization: Bearer $ROTOR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"name":"welcome-email"}'

# Attach your handler URL (paid plans only)
curl -X PATCH https://api.rotor.sh/v1/queues/welcome-email \
  -H "Authorization: Bearer $ROTOR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "callback_url": "https://your-app.example.com/api/rotor/welcome-email",
    "rotate_callback_secret": true
  }'
# Response includes callback_secret once — save it to your env var.
```

```bash Replacing inngest.send
# Was: inngest.send({ name: "user/created", data: { email, id } })
curl -X POST https://api.rotor.sh/v1/queues/welcome-email/jobs \
  -H "Authorization: Bearer $ROTOR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"payload": {"email":"alice@example.com","id":"u_1"}}'
```

</CodeGroup>

## Signature verification

Every callback from Rotor includes three headers you must verify:

| Header | Purpose |
|--------|---------|
| `X-Rotor-Job-Id` | Unique job id — use as idempotency / dedup key |
| `X-Rotor-Attempt` | Retry attempt counter (`1` on first delivery) |
| `X-Rotor-Signature` | Svix-compatible HMAC-SHA256 signature of the body |

The signature format is identical to the webhook signing format — see [Webhook Signature Verification](/webhooks/signing) for Node, Python, Go, and Ruby examples. The only difference is the header names (`x-rotor-*` instead of `rotor-webhook-*`).

<CodeGroup>

```typescript Node.js
import { createHmac, timingSafeEqual } from "node:crypto";

function verifyRotorSignature(
  headers: Headers,
  body: string,
): boolean {
  const id = headers.get("x-rotor-job-id") ?? "";
  const ts = headers.get("x-rotor-timestamp") ?? "";
  const sig = headers.get("x-rotor-signature") ?? "";
  if (!id || !ts || !sig) return false;

  if (Math.abs(Date.now() / 1000 - Number(ts)) > 300) return false;

  const expected = `v1,${createHmac("sha256", process.env.ROTOR_CALLBACK_SECRET!)
    .update(`${id}.${ts}.${body}`)
    .digest("base64")}`;

  const a = Buffer.from(expected);
  const b = Buffer.from(sig);
  return a.length === b.length && timingSafeEqual(a, b);
}
```

```python Python
import hmac, hashlib, base64, time

def verify_rotor_signature(headers, body, secret):
    jid = headers.get("x-rotor-job-id", "")
    ts  = headers.get("x-rotor-timestamp", "")
    sig = headers.get("x-rotor-signature", "")
    if not (jid and ts and sig): return False
    if abs(time.time() - int(ts)) > 300: return False
    digest = hmac.new(secret.encode(), f"{jid}.{ts}.{body}".encode(), hashlib.sha256).digest()
    expected = f"v1,{base64.b64encode(digest).decode()}"
    return hmac.compare_digest(expected, sig)
```

```go Go
import (
    "crypto/hmac"; "crypto/sha256"; "encoding/base64"
    "fmt"; "math"; "strconv"; "time"
)

func VerifyRotorSignature(headers map[string]string, body, secret string) bool {
    jid, ts, sig := headers["x-rotor-job-id"], headers["x-rotor-timestamp"], headers["x-rotor-signature"]
    if jid == "" || ts == "" || sig == "" { return false }
    tsInt, err := strconv.ParseInt(ts, 10, 64)
    if err != nil || math.Abs(float64(time.Now().Unix()-tsInt)) > 300 { return false }
    mac := hmac.New(sha256.New, []byte(secret))
    mac.Write([]byte(fmt.Sprintf("%s.%s.%s", jid, ts, body)))
    expected := fmt.Sprintf("v1,%s", base64.StdEncoding.EncodeToString(mac.Sum(nil)))
    return hmac.Equal([]byte(expected), []byte(sig))
}
```

</CodeGroup>

## Retry semantics

Rotor uses BullMQ-native retry counts + exponential backoff. Where Inngest retries individual steps, Rotor retries the entire handler invocation.

| Behavior | Inngest | Rotor |
|----------|---------|-------|
| Default attempts | 4 per step | 3 per job (configurable per queue) |
| Backoff | Exponential (step-level) | Exponential (job-level) |
| On exhaustion | Marked failed; visible in dashboard | Moved to workspace **DLQ** — retry via `POST /v1/queues/:name/retry-all` |
| Partial progress | Each step independently retried | Handler re-runs from the top — **must be idempotent** |
| Delivery semantics | At-least-once | At-least-once |

Configure per-queue retry policy at queue creation time:

```json
{
  "name": "welcome-email",
  "defaultJobOptions": {
    "attempts": 5,
    "backoff": { "type": "exponential", "delay": 2000 }
  }
}
```

## Idempotency — read this

<Warning>
  **At-least-once delivery means your handler MUST be idempotent.** Use
  `X-Rotor-Job-Id` as a dedup key on every side-effect. The header is stable
  across all retry attempts of the same job.
</Warning>

```typescript
// Wrong — will send multiple emails on retry
await sendEmail(event.data);

// Right — idempotent via unique DB constraint on job_id
await db.outreach.upsert({
  where: { rotorJobId: jobId },
  create: { rotorJobId: jobId, sentAt: new Date(), ... },
  update: {}, // no-op if already sent
});
if (!alreadySent) await sendEmail(event.data);
```

This is the same rule Inngest imposes — it's just that Rotor makes it explicit.

## Automated import

We ship a small script that statically analyzes your Inngest function definitions and emits the equivalent Rotor configuration as JSON-lines, one object per queue/schedule. Pipe the output to `jq` or save it and replay with `curl` / `rotor queue create`.

```bash
# Download the script (no install needed)
npx tsx https://rotor.sh/docs/scripts/import-from-inngest.ts ./inngest/functions.ts
# or run locally:
npx tsx docs/scripts/import-from-inngest.ts ./inngest/functions.ts

# Example output:
# {"type":"queue","config":{"name":"welcome-email","notes":"event: user/created"}}
# {"type":"schedule","config":{"queue":"nightly-digest","cron":"0 2 * * *","timezone":"UTC"}}
```

The script:
- Walks the directory you point it at looking for `.ts` / `.js` / `.tsx` files
- Greps for `inngest.createFunction({ id: ..., event: ... })` and `cron: ...` shapes
- Emits `{ type: "queue" | "schedule", config: {...} }` per line
- Prints warnings (to stderr) for shapes it couldn't parse — you handle those manually

The output is **diagnostic** — we never modify your code. Paste the queue JSONs into a bootstrap script or run them through `curl` to provision Rotor.

## What Rotor does NOT replace

Be honest with yourself about which Inngest features you actually use:

- **Multi-step observability** (step-level timing, step-level retries, branching). Rotor treats each handler as one atomic invocation — if you need step-level visibility, Rotor is not the right fit.
- **`step.waitForEvent(...)`** — general "wait for any event" coordination. Rotor's [approval queues](/enterprise) solve the "human-in-the-loop pause" subset, but not arbitrary event coordination.
- **`step.invoke(otherFunction)`** — cross-function fan-out with result composition. In Rotor you do this by enqueueing jobs to downstream queues and correlating on your own.
- **Durable, pausable function state across days/weeks** — that's Temporal's domain.

If you are using Inngest primarily as "a queue with good ergonomics + webhook handlers + cron" → Rotor is a perfect fit and you'll save money. If you've built multi-step durable workflows with event-based coordination → evaluate [Temporal](https://temporal.io) or stay on Inngest.

## Next steps

1. **[Start a Pro trial](https://rotor.sh/signup)** — 14 days, no credit card.
2. **Install the SDK:** `pnpm add @rotorsh/sdk` — see the [Node.js quickstart](/guides/node-quickstart).
3. **Join the [Anthropic Discord](https://www.anthropic.com/discord)** — `#rotor` channel for migration help.

---

## Migrating step.invoke / step.sendEvent / step.waitForSignal

Rotor ships native equivalents for all three Inngest durable-execution primitives. The APIs are intentionally similar — most migrations are a rename and a `timeout` addition.

### step.invoke

<CodeGroup>

```typescript inngest
import { inngest } from './inngest';

export const orchestrator = inngest.createFunction(
  { id: 'orchestrator' },
  { event: 'campaign/requested' },
  async ({ event, step }) => {
    // Inngest: timeout is optional — omitting it blocks forever (billing bug)
    const result = await step.invoke('enrich-contact', {
      function: contactEnricher,
      data: { contactId: event.data.contactId },
    });
    return result;
  }
);
```

```typescript rotor
import { createFunction } from '@rotorsh/sdk';
import { WorkflowFailedError, WorkflowTimeoutError } from '@rotor/sdk/errors';

export const orchestrator = createFunction(
  { id: 'orchestrator', trigger: { event: 'campaign.requested' }, retries: 2 },
  async ({ event, step }) => {
    // Rotor: timeout is REQUIRED — TypeScript build fails without it (max 24h)
    const { result } = await step.invoke<{ score: number }>('enrich-contact', {
      workflow: { id: 'contact-enricher' },   // target by workflow id string, not function ref
      data: { contactId: event.data.contactId },
      timeout: '5m',                           // REQUIRED
    });
    return result;
  }
);
```

</CodeGroup>

### step.sendEvent

<CodeGroup>

```typescript inngest
// Inngest: step.sendEvent uses 'at' (ISO string) for future scheduling
await step.sendEvent('spawn-sub-agent', {
  name: 'agent/sub-agent.spawn',
  data: { contactId: '123' },
  ts: new Date('2026-05-01T09:00:00Z'), // Date object or ISO string
});
```

```typescript rotor
// Rotor: ts is Unix timestamp in milliseconds (not ISO string or Date object)
await step.sendEvent('spawn-sub-agent', {
  name: 'agent/sub-agent.spawn',
  data: { contactId: '123' },
  ts: new Date('2026-05-01T09:00:00Z').getTime(), // convert Date to Unix ms
});
```

</CodeGroup>

### step.waitForSignal

Inngest does not have a direct `step.waitForSignal` equivalent. The closest Inngest pattern uses `cancelOn` matchers to respond to specific events — a fundamentally different model (event-bus broadcast vs. single-run named signal).

<CodeGroup>

```typescript inngest
// Inngest: no step.waitForSignal — closest pattern is cancelOn event matcher
// This CANCELS the run (not a pause/resume pattern)
export const withCancelOn = inngest.createFunction(
  {
    id: 'approval-workflow',
    cancelOn: [{ event: 'approval/rejected', match: 'data.runId' }],
  },
  { event: 'approval/requested' },
  async ({ event, step }) => {
    // If 'approval/rejected' fires with matching runId, the whole run is cancelled
    // No way to "wait for approval and branch on it" without external state
    await step.sleep('wait', '48h');
  }
);
```

```typescript rotor
// Rotor: step.waitForSignal — pause/resume with typed return value + timeout
import { SignalTimeoutError } from '@rotor/sdk/errors';

export const approvalWorkflow = createFunction(
  { id: 'approval-workflow', trigger: { event: 'approval.requested' }, retries: 2 },
  async ({ event, step, runId }) => {
    const { approved, reason } = await step.waitForSignal<{ approved: boolean; reason?: string }>(
      'wait-for-approval',
      {
        signal: `approval-${runId}`,   // workspace-unique; POST to /v1/signals/:id/complete
        timeout: '48h',
      }
    ).catch((err) => {
      if (err.name === 'SignalTimeoutError') return { approved: false, reason: 'Timed out' };
      throw err;
    });

    return approved ? { status: 'approved' } : { status: 'rejected', reason };
  }
);
```

</CodeGroup>

### API differences table

| Aspect | Inngest | Rotor Phase 10+ |
|--------|---------|----------------|
| `step.invoke` — `timeout` param | Optional; omitting blocks forever (no cap) | **REQUIRED**; TypeScript build error if missing; max 24h |
| `step.invoke` — target reference | Function reference (`function: myFn`) | Workflow ID string (`workflow: { id: 'my-workflow' }`) |
| `step.invoke` — return shape | Bare result `R` | `{ result: R; runId: string }` — `runId` included for observability |
| Parent-side timeout cancels child | N/A (no timeout concept) | **NO** — child runs independently; parent step throws `WorkflowTimeoutError` |
| Child cancellation surfaces to parent | YES — `WorkflowCancelledError` | YES — `WorkflowCancelledError` (same behavior) |
| `step.sendEvent` — future scheduling | `ts: Date` or ISO string | `ts: number` — Unix timestamp in **milliseconds** |
| `step.waitForSignal` | **Does not exist** | YES — `signal` is workspace-scoped unique ID; resume via `POST /v1/signals/:id/complete` |
| Signal resume auth | N/A | `rt_ws_` API key scoped to the workspace owning the signal |
| Cross-workspace signal resume | N/A | Returns `404` (not `401`) — existence is opaque across workspace boundaries |
| Concurrency keys | Shipped (`concurrency: { key, limit }`) | Shipped — set `concurrencyKey: string` on enqueue; see [Concurrency Keys](/docs/guides/concurrency-keys) |
| `cancelOn` config | Shipped | **E2 STUB DELETED in Phase 10** — Wave 2 reintroduces with real enforcement |
| Idempotent triggers | ✓ (`idempotency` field on `inngest.send`) | ✓ (`idempotencyKey` on `rotor.workflow.trigger`) |
| `onFailure` hook | ✓ | ✓ |
| `onSuccess` hook | ✓ | ✓ |
| `onCancel` hook | partial | ✓ |
| `onStartAttempt` / `onWait` / `onResume` / `catchError` | ✓ | — (deferred) |

### Key migration callouts

**1. Required timeout on step.invoke**

Add `timeout` to every `step.invoke` call. Omitting it is a TypeScript compile error — the build will fail and point at the missing field. Start with `"5m"` and increase only when you have a concrete SLA.

**2. Parent-side timeout does not cancel the child**

When `WorkflowTimeoutError` fires on the parent, the child workflow keeps running independently. This matches "I'm done waiting, but don't waste the work" semantics. If you want to cancel the child on timeout, you must call the Rotor cancel API manually using the `runId` captured before the timeout.

**3. step.waitForSignal is a new mental model — not cancelOn**

Inngest's `cancelOn` terminates the run on a matching event. Rotor's `step.waitForSignal` pauses the run and resumes it with the signal payload as the step result. You can branch on the result, catch `SignalTimeoutError` for fallback logic, or rethrow — the run stays in your control.

For more detail: [step.invoke deep dive](/docs/workflows/step-invoke) · [step.sendEvent deep dive](/docs/workflows/step-send-event) · [step.waitForSignal deep dive](/docs/workflows/step-wait-for-signal)

---

## Idempotency keys

Inngest's `inngest.send({ name, data, idempotency })` returns the same event ID for duplicate calls within a configured window.

Rotor's equivalent: `rotor.workflow.trigger(name, data, { idempotencyKey })`. Same call site, same intent, same dedup behavior — the key is scoped per (workspace, function) so the same Stripe webhook ID won't dedup against itself across two unrelated workflows.

<CodeGroup>

```typescript Before (Inngest)
await inngest.send({
  name: "stripe/charge.succeeded",
  data: { chargeId: charge.id },
  idempotency: `stripe-${event.id}`,
});
```

```typescript After (Rotor)
await rotor.workflow.trigger(
  "stripe/charge.succeeded",
  { chargeId: charge.id },
  { idempotencyKey: `stripe-${event.id}` },
);
```

</CodeGroup>

**What you get:**
- Same `idempotencyKey` twice → second call returns `{ duplicate: true, runIds: [...] }` with the same run ID.
- Different `idempotencyKey` (or none) → fresh run.
- Workspace at quota → 402 still wins. A duplicate webhook from a workspace at the Hobby cap gets the 402 response, not a free duplicate-hit pass. (Per [pricing.md](/docs/pricing).)

**Differences from Inngest:**
- Rotor's idempotency window is "forever" — the unique constraint is on a Postgres column, not a 24h Redis cache. Storage is bounded only by row retention (purged at the same rates as workflow_runs).
- The dedup happens at the workflow level, not the global event-bus level. Same key under two different workflows fires two distinct runs (intended — they have different side effects).

---

## Lifecycle hooks

Inngest exposes per-function `onFailure` (and a half-dozen others). Rotor ships three: `onSuccess`, `onFailure`, `onCancel`.

<CodeGroup>

```typescript Before (Inngest)
inngest.createFunction(
  { id: "send-receipt", name: "Send Receipt" },
  { event: "order/placed" },
  async ({ event, step }) => { /* ... */ },
);
inngest.createFunction(
  { id: "send-receipt-failure-handler" },
  { event: "inngest/function.failed", if: "event.data.function_id == 'send-receipt'" },
  async ({ event }) => { /* send Slack alert */ },
);
```

```typescript After (Rotor)
const sendReceipt = workflow({
  id: "send-receipt",
  trigger: { event: "order/placed" },
  onFailure: async ({ ctx, error }) => {
    // Same Slack alert. ctx.runId, ctx.functionId, ctx.attempt available.
    await slack.send({
      channel: "#alerts",
      text: `Receipt failed: ${error.message} (run ${ctx.runId})`,
    });
  },
  steps: async ({ step }) => { /* ... */ },
});
```

</CodeGroup>

**Locked semantics (matching Inngest where applicable):**
- `onSuccess` fires only on FINAL completion (after retries succeed). A workflow that fails twice then succeeds → `onSuccess` fires once.
- `onFailure` fires only after retries are exhausted. Per-attempt failures do NOT fire `onFailure`.
- `onCancel` fires when the run is cancelled mid-execution (`running | sleeping | waiting` → `cancelled`). It does NOT fire if the run was queued (`pending`) and never started — same as Inngest.
- Hooks are fire-and-forget. Errors thrown inside a hook are logged to your audit_event channel (kind=`workflow.hook.errored`) and visible at `GET /v1/runs/:id` under `hookErrors[]` — but do NOT change the run's terminal status. A workflow that succeeded with a failing `onSuccess` still shows `status: completed`.
- Hooks have a 30s soft timeout. Hooks running longer trigger an audit event (`workflow.hook.slow`) but are NOT killed — handle long work via `step.sendEvent` from inside `steps` instead.

**Hooks Rotor does NOT ship (yet):**
`onStartAttempt`, `onWait`, `onResume`, `onComplete`, `catchError`. Trigger.dev has these. Rotor doesn't, by design — three hooks cover 90% of buyer needs and we won't bloat the surface to mimic competitor feature counts. If you have a specific need, [open an issue](https://github.com/rotor-sh/rotor/issues).

---

## Side-by-side: Inngest vs Rotor createFunction

<CodeGroup>

```typescript inngest
import { inngest } from './inngest';

export const welcomeEmail = inngest.createFunction(
  { id: 'welcome-email' },
  { event: 'user/created' },
  async ({ event, step }) => {
    const profile = await step.run('fetch-profile', async () => {
      return fetchProfile(event.data.userId);
    });

    await step.sleep('wait-1-day', '1d');

    await step.run('send-email', async () => {
      await sendEmail({ to: profile.email });
    });
  }
);
```

```typescript rotor
import { createFunction } from '@rotorsh/sdk';

export const welcomeEmail = createFunction(
  {
    id: 'welcome-email',
    trigger: { event: 'user.created' }, // dot-notation, not slash
    retries: 3,                          // explicit retry count
  },
  async ({ event, step }) => {
    const profile = await step.run('fetch-profile', async () => {
      return fetchProfile(event.data.userId);
    });

    await step.sleep('wait-1-day', '1d'); // identical

    await step.run('send-email', async () => {
      await sendEmail({ to: profile.email });
    });
  }
);
```

</CodeGroup>

### Key differences

| Feature | Inngest | Rotor |
|---------|---------|-------|
| Event name format | `user/created` (slash) | `user.created` (dot) |
| Serve adapter | `serve()` (conflicts with Node.js) | `serveWorkflow()` (no naming conflict) |
| Pricing | $100/mo for 500k steps | $19/mo Pro for 100k jobs (5.3x cheaper) |
| State storage | Inngest cloud | Your Postgres (you own the data) |
| Open source | Server is closed | `rotor-core` is MIT |
| Step graph | Inngest Cloud UI | `/dashboard/runs/:id` on your domain |

### Migrating `serve()`

```typescript
// Inngest
import { serve } from 'inngest/hono';
app.use('/api/inngest', serve({ client: inngest, functions: [myFn] }));

// Rotor — same pattern, different import
import { serveWorkflow } from '@rotorsh/sdk';
serveWorkflow(app, { functions: [myFn], signingKey: process.env.ROTOR_SIGNING_KEY! });
```

### Migrating event publishing

```typescript
// Inngest
await inngest.send({ name: 'user/created', data: { userId: '123' } });

// Rotor
const rotor = new Rotor({ apiKey: process.env.ROTOR_API_KEY! });
await rotor.send('user.created', { userId: '123' });
```

### Migrate from Vercel Cron

URL: https://rotor.sh/docs/migrate/from-vercel-cron

## TL;DR

Vercel Cron triggers HTTP GETs (or POSTs) on a schedule. **Rotor schedules + HTTP callback mode** replace it — with no **1-cron-per-day Hobby cap** and no per-project Pro ceiling. Your existing `/api/cron/*` handlers work as-is; Rotor becomes the scheduler that calls them.

At **Rotor Free** ($0) you get **unlimited schedules** against a 10k/mo execution budget — vs. Vercel Hobby's hard **1 cron per day** limit.

## Pricing at a glance

| Plan | Price | Schedule limits | Notes |
|------|-------|-----------------|-------|
| Vercel Hobby | $0 | **1 cron per day** | Daily granularity only |
| Vercel Pro | $20/mo | Every minute | Per-project Vercel plan cost |
| **Rotor Free** | **$0** | **Unlimited schedules** | 10k executions/mo total |
| **Rotor Pro** | **$19/mo** | **Unlimited schedules** | 100k executions/mo (covers most cron use cases) |
| **Rotor Team** | **$99/mo** | **Unlimited schedules** | 1M executions/mo |

<Tip>
  **Key differentiator:** Rotor Free lets you run 10,000 scheduled executions
  per month across **as many schedules as you want**. Vercel Hobby caps you at
  1 cron per day regardless of execution volume. If you're hitting the Hobby
  cron cap, Rotor Free is a drop-in replacement at the same price ($0).
</Tip>

<Note>
  Competitor pricing was verified against vercel.com/pricing on the publish
  date of this guide. Vercel frequently adjusts their cron allowances — re-read
  their pricing page before making a purchasing decision.
</Note>

## Architecture mapping

| Vercel Cron concept | Rotor equivalent | Migration notes |
|---------------------|------------------|-----------------|
| `vercel.json` `crons[]` entry | `POST /v1/schedules` | One schedule per entry |
| Handler route (e.g. `/api/cron/cleanup`) | Queue with `callback_url` | Same handler — just attach its URL to a Rotor queue |
| `CRON_SECRET` bearer header | `X-Rotor-Signature` HMAC verification | Swap the auth check; handler body stays the same |
| Default UTC | **Explicit `timezone` required** | Rotor schedules MUST declare a timezone — see below |
| Per-project isolation | Per-workspace isolation | Works across any deploy target (Vercel, Railway, Fly, self-hosted) |

**No handler-side changes required:** Vercel Cron already hits an HTTP route on your app. Rotor does the same thing — you just change what authenticates the incoming request (HMAC instead of bearer).

## Before / After

<CodeGroup>

```json vercel.json (before)
{
  "crons": [
    {
      "path": "/api/cron/cleanup",
      "schedule": "0 */6 * * *"
    },
    {
      "path": "/api/cron/daily-digest",
      "schedule": "0 12 * * *"
    }
  ]
}
```

```bash Rotor (after)
# 1. Create one queue per cron, pointing at your existing handler URL
curl -X POST https://api.rotor.sh/v1/queues \
  -H "Authorization: Bearer $ROTOR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"name":"cron-cleanup"}'

curl -X PATCH https://api.rotor.sh/v1/queues/cron-cleanup \
  -H "Authorization: Bearer $ROTOR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "callback_url": "https://your-app.vercel.app/api/cron/cleanup",
    "rotate_callback_secret": true
  }'
# Response returns callback_secret ONCE — save to env var ROTOR_CALLBACK_SECRET.

# 2. Attach a schedule
curl -X POST https://api.rotor.sh/v1/schedules \
  -H "Authorization: Bearer $ROTOR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "queue": "cron-cleanup",
    "cron": "0 */6 * * *",
    "timezone": "UTC"
  }'
```

</CodeGroup>

Delete the `crons` array from `vercel.json` once the Rotor schedule is live. Your handler route stays exactly where it is.

## Timezone — read this first

<Warning>
  **Rotor schedules REQUIRE an explicit `timezone` parameter** (API-12). Vercel
  Cron defaults to UTC silently — when you migrate, **decide** what timezone
  you want.

  - If your existing cron was "run at 12:00 every day" and you intended that in
    UTC, set `"timezone": "UTC"`.
  - If you intended local business hours (e.g. "noon eastern"), set
    `"timezone": "America/New_York"` — Rotor will handle DST.

  Pick wrong and your reports will fire an hour off twice a year.
</Warning>

Supported timezone values: any IANA zone name (e.g. `UTC`, `America/New_York`, `Europe/London`, `Asia/Tokyo`). See the full list at [IANA tz database](https://www.iana.org/time-zones).

## Signature verification

Vercel Cron authenticates callbacks via the `CRON_SECRET` env var + bearer header:

```typescript
// Before: Vercel Cron handler
export async function GET(req: Request) {
  const auth = req.headers.get("authorization");
  if (auth !== `Bearer ${process.env.CRON_SECRET}`) {
    return new Response("unauthorized", { status: 401 });
  }
  // ...do cron work
}
```

Rotor authenticates via **HMAC signature** on the request body — stronger because it binds the signature to the exact payload, preventing replay with altered bodies. See the full [Webhook Signature Verification](/webhooks/signing) reference for Node, Python, Go, and Ruby.

```typescript
// After: Rotor callback handler
import { createHmac, timingSafeEqual } from "node:crypto";

export async function POST(req: Request) {
  const body = await req.text();
  if (!verifyRotorSignature(req.headers, body)) {
    return new Response("unauthorized", { status: 401 });
  }
  const jobId = req.headers.get("x-rotor-job-id");
  // ...do cron work (same as before); use jobId for idempotency if needed
  return new Response("ok");
}

function verifyRotorSignature(headers: Headers, body: string): boolean {
  const id = headers.get("x-rotor-job-id") ?? "";
  const ts = headers.get("x-rotor-timestamp") ?? "";
  const sig = headers.get("x-rotor-signature") ?? "";
  if (!id || !ts || !sig) return false;
  if (Math.abs(Date.now() / 1000 - Number(ts)) > 300) return false;

  const expected = `v1,${createHmac("sha256", process.env.ROTOR_CALLBACK_SECRET!)
    .update(`${id}.${ts}.${body}`)
    .digest("base64")}`;
  const a = Buffer.from(expected);
  const b = Buffer.from(sig);
  return a.length === b.length && timingSafeEqual(a, b);
}
```

**Request method changes:** Vercel Cron sends `GET`. Rotor sends `POST` with a JSON body (empty `{}` for schedules with no payload). Update your route handler from `GET` to `POST` — or accept both if you want to leave Vercel Cron running in parallel during migration.

## Auth migration cheatsheet

| Check | Vercel Cron | Rotor callback |
|-------|-------------|----------------|
| Header read | `authorization: Bearer $CRON_SECRET` | `x-rotor-signature: v1,<base64>` |
| Secret source | `process.env.CRON_SECRET` | `process.env.ROTOR_CALLBACK_SECRET` (minted once via `PATCH /v1/queues/:name`) |
| Replay window | None | ±300s (checked against `x-rotor-timestamp`) |
| Idempotency | — | `x-rotor-job-id` — stable across retries |
| Request method | `GET` | `POST` |

## Automated import

We ship a small script that reads your `vercel.json`, extracts the `crons[]` array, and emits the Rotor queue + schedule configuration as JSON-lines.

```bash
# Run it
npx tsx docs/scripts/import-from-vercel-cron.ts ./vercel.json
```

Example input and output:

```json
// vercel.json
{
  "crons": [
    { "path": "/api/cron/cleanup", "schedule": "0 */6 * * *" },
    { "path": "/api/cron/daily-digest", "schedule": "0 12 * * *" }
  ]
}
```

```
// stdout — one object per line
{"type":"queue","config":{"name":"cron-cleanup","callback_url":"https://YOUR_DOMAIN/api/cron/cleanup"}}
{"type":"schedule","config":{"queue":"cron-cleanup","cron":"0 */6 * * *","timezone":"UTC"}}
{"type":"queue","config":{"name":"cron-daily-digest","callback_url":"https://YOUR_DOMAIN/api/cron/daily-digest"}}
{"type":"schedule","config":{"queue":"cron-daily-digest","cron":"0 12 * * *","timezone":"UTC"}}
```

After running the script:

1. Search-and-replace `YOUR_DOMAIN` with your actual production host (e.g. `your-app.vercel.app`).
2. Replay the `type:queue` lines as `POST /v1/queues` + `PATCH .../callback_url`.
3. Replay the `type:schedule` lines as `POST /v1/schedules`.
4. Verify one scheduled execution fires and hits your handler, **then** remove the `crons` array from `vercel.json`.

## Next steps

1. **[Start a Pro trial](https://rotor.sh/signup)** — 14 days, no credit card.
2. **Install the SDK:** `pnpm add @rotorsh/sdk` — see the [Node.js quickstart](/guides/node-quickstart).
3. **Join the [Anthropic Discord](https://www.anthropic.com/discord)** — `#rotor` channel for migration help.

### Move your Vercel env vars to Rotor

URL: https://rotor.sh/docs/migrate/from-vercel-env-vars

## TL;DR

If your Rotor jobs reference `process.env.STRIPE_KEY` or `process.env.SLACK_TOKEN` in callback URLs, headers, or payloads, you can move those into Rotor's secrets vault and reference them as `${{ secrets.STRIPE_KEY }}` instead — gaining a centralized audit trail, rotation API, and team-shared access without redeploying your app.

<Note>
  This guide covers secrets **used by Rotor jobs** at dispatch time (callback
  URLs, callback headers, job payloads). It does NOT cover secrets your app
  reads at boot. See [What stays in Vercel/Railway](#what-stays-in-vercelrailway)
  for the full scope boundary.
</Note>

## Why this matters

- **Audit trail** — every secret access is recorded in `audit_event` with actor and timestamp. Know who changed what and when, without digging through Vercel's team activity log.
- **Rotation without redeploy** — `PATCH /v1/secrets/:name` re-encrypts the value. Queued jobs pick up the new plaintext at next dispatch. No redeploy, no downtime.
- **Team-shared** — secrets are workspace-scoped. Teammates with workspace access read the same values without sharing your Railway or Vercel login credentials.

## What stays in Vercel/Railway

Rotor secrets are resolved at **job dispatch time**. Anything your application reads **outside** of a Rotor job should stay where it is:

- Database connection strings (your app needs them at boot)
- Supabase service role key and anon key (read by your backend at startup)
- Auth provider secrets (NextAuth, Clerk, etc.)
- Anything consumed outside a Rotor callback handler

If you're not sure, ask: "Does Rotor need this value to dispatch a job?" YES → vault it. NO → leave it in Vercel/Railway.

## Architecture mapping

| Concept | Vercel / Railway env vars | Rotor secrets vault |
|---------|--------------------------|---------------------|
| Storage | Platform UI or `.env` file | Encrypted at rest, AES-256-GCM |
| Reference | `process.env.MY_KEY` | `${{ secrets.MY_KEY }}` |
| Rotation | Edit in dashboard, redeploy | `PATCH /v1/secrets/:name`, no redeploy |
| Audit | None (or Vercel team log) | `audit_event` rows on every create / rotate / delete / access |
| Scope | Per project | Per workspace (team-shared) |
| Resolution | At app boot / request time | At job dispatch time (plaintext never in BullMQ or Postgres) |

## Before / After

The common pattern: your callback handler reads `process.env.NOTIFY_TOKEN` to authenticate outbound requests. After migration, Rotor injects the token directly into the callback URL or header — your handler no longer needs the env var.

<Note>
  **`ROTOR_CALLBACK_SECRET` stays in your env.** That secret verifies inbound
  requests *from* Rotor to your handler. Rotor can't inject it into itself. Only
  secrets that Rotor *sends outbound* (in callback URLs, headers, or payloads)
  belong in the vault.
</Note>

<CodeGroup>

```typescript Before — env var in callback URL
// your-app/api/rotor-callback.ts

export async function POST(req: Request) {
  const auth = req.headers.get("authorization");
  // ROTOR_CALLBACK_SECRET stays in your env — it verifies inbound Rotor requests
  if (auth !== `Bearer ${process.env.ROTOR_CALLBACK_SECRET}`) {
    return new Response("unauthorized", { status: 401 });
  }

  const job = await req.json();

  // NOTIFY_TOKEN used to authenticate outbound request — this moves to Rotor vault
  await fetch(
    `https://api.example.com/notify?token=${process.env.NOTIFY_TOKEN}`,
    { method: "POST", body: JSON.stringify(job) }
  );

  return new Response("ok");
}
```

```typescript After — secret vaulted, URL interpolation
// 1. Store the secret once:
//    POST /v1/secrets  {"name":"NOTIFY_TOKEN","value":"abc123..."}

// 2. Configure your queue's callback URL to interpolate the secret:
//    PATCH /v1/queues/notifications {
//      "callback_url": "https://api.example.com/notify?token=${{ secrets.NOTIFY_TOKEN }}"
//    }
//    Rotor resolves ${{ secrets.NOTIFY_TOKEN }} at dispatch — plaintext never
//    persists in Redis or Postgres.

// 3. Your callback handler no longer needs NOTIFY_TOKEN:
export async function POST(req: Request) {
  const auth = req.headers.get("authorization");
  if (auth !== `Bearer ${process.env.ROTOR_CALLBACK_SECRET}`) {
    return new Response("unauthorized", { status: 401 });
  }

  // The token is already in the URL Rotor called — handler doesn't need it.
  return new Response("ok");
}
```

</CodeGroup>

### Using secrets in callback headers

If your downstream API expects the token as a header rather than a query parameter:

```bash
curl -X PATCH https://api.rotor.sh/v1/queues/notifications \
  -H "Authorization: Bearer $ROTOR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "callback_headers": {
      "X-API-Key": "${{ secrets.NOTIFY_TOKEN }}"
    }
  }'
```

Rotor resolves the template at dispatch and forwards the plaintext header value to your URL. The template string `${{ secrets.NOTIFY_TOKEN }}` is what gets stored in Postgres — never the plaintext.

## Migration steps

1. **Create the secret** — use the dashboard at `/dashboard/settings/secrets` or the API:

   ```bash
   curl -X POST https://api.rotor.sh/v1/secrets \
     -H "Authorization: Bearer $ROTOR_API_KEY" \
     -H "Content-Type: application/json" \
     -d '{"name":"NOTIFY_TOKEN","value":"your-token-here"}'
   # Response: {"name":"NOTIFY_TOKEN","value_hint":"your-t...","created_at":"..."}
   ```

2. **Update your queue's callback URL or `callback_headers`** to reference `${{ secrets.YOUR_KEY }}`:

   ```bash
   curl -X PATCH https://api.rotor.sh/v1/queues/notifications \
     -H "Authorization: Bearer $ROTOR_API_KEY" \
     -H "Content-Type: application/json" \
     -d '{"callback_url":"https://api.example.com/notify?token=${{ secrets.NOTIFY_TOKEN }}"}'
   ```

3. **Verify with a test job** — enqueue a job and check that your downstream endpoint received the resolved token (not the template string) in the URL or header.

4. **Remove the env var from Vercel/Railway** — once you've confirmed delivery works end-to-end.

5. **Redeploy if needed** — only if the env var was previously read at app boot. For most callback-token use cases (env var only needed in outbound calls), no redeploy is required.

<Warning>
  **Run a 24-hour observation window before removing the env var.** Confirm zero
  callback failures with the new vault-sourced value before deleting the
  original env var from Railway or Vercel. If Rotor's resolution fails, you can
  roll back instantly by re-adding the env var.
</Warning>

## Bulk migration script

The script below reads env var names from your environment (or from `.env`), filters the ones you want to migrate, and POSTs each to `/v1/secrets`. Review the list before running — it only creates secrets you explicitly include.

<CodeGroup>

```typescript Node.js (npx tsx)
#!/usr/bin/env node
// scripts/migrate-to-rotor-vault.ts
// Usage:  ROTOR_API_KEY=xxx npx tsx scripts/migrate-to-rotor-vault.ts
// Add the env var names you want to migrate to KEYS_TO_MIGRATE below.

const ROTOR_API_URL = process.env.ROTOR_API_URL ?? "https://api.rotor.sh";
const ROTOR_API_KEY = process.env.ROTOR_API_KEY ?? "";

const KEYS_TO_MIGRATE = [
  "NOTIFY_TOKEN",
  "SLACK_BOT_TOKEN",
  "STRIPE_WEBHOOK_SECRET",
  // add more...
];

if (!ROTOR_API_KEY) {
  console.error("ROTOR_API_KEY is required");
  process.exit(1);
}

for (const name of KEYS_TO_MIGRATE) {
  const value = process.env[name];
  if (!value) {
    console.warn(`[skip] ${name} — not set in current env`);
    continue;
  }

  const res = await fetch(`${ROTOR_API_URL}/v1/secrets`, {
    method: "POST",
    headers: {
      "Authorization": `Bearer ${ROTOR_API_KEY}`,
      "Content-Type": "application/json",
    },
    body: JSON.stringify({ name, value }),
  });

  if (res.status === 201) {
    const data = (await res.json()) as { value_hint: string };
    console.log(`[ok]   ${name} → hint: ${data.value_hint}`);
  } else if (res.status === 409) {
    console.log(`[skip] ${name} — already exists in vault`);
  } else {
    const text = await res.text();
    console.error(`[fail] ${name} — HTTP ${res.status}: ${text}`);
    process.exit(1);
  }
}

console.log("\nNext step: update your queue callback_url / callback_headers");
console.log('  PATCH /v1/queues/<queue> {"callback_headers":{"X-Token":"${{ secrets.YOUR_KEY }}"}}');
```

</CodeGroup>

The script never logs plaintext values — it only prints the `value_hint` returned by the API (first 8 characters + `...`).

## Reference values

| Field | Details |
|-------|---------|
| Secret name format | `[A-Z][A-Z0-9_]*` (uppercase, underscores, digits — no lowercase) |
| Surfaces supported | Callback URL, callback headers, job payload fields, workflow step inputs |
| Resolution timing | Dispatch-time — Rotor substitutes plaintext before sending; template string stored in Postgres/Redis |
| Audit trail | Every create / rotate / delete / access logged to `audit_event` with `resource_type = 'secret'` |
| Rotation | `PATCH /v1/secrets/:name` with `{"value":"new-value"}` — zero-downtime, no redeploy |
| Listing | `GET /v1/secrets` returns names + hints, never plaintext |

## Audit verification

After migrating, confirm the audit trail is clean:

```sql
-- No plaintext values should appear in input/output blobs
SELECT event_type, occurred_at, actor_id
FROM audit_event
WHERE resource_type = 'secret'
ORDER BY occurred_at DESC
LIMIT 10;
```

Expected: `secret.created` rows for each migrated key. Confirm the `input` and `output` columns do not contain the raw secret values.

## Next steps

1. **[Dashboard secrets page](https://rotor.sh/dashboard/settings/secrets)** — create, rotate, and delete secrets in the UI.
2. **[/v1/secrets API reference](/api-reference/introduction)** — full CRUD: create, list, rotate (`PATCH`), delete.
3. **[Audit log API](/api-reference/introduction)** — query `audit_event` for secret access history.
4. **[Start a Pro trial](https://rotor.sh/signup)** — 14 days, no credit card.

---

## Enterprise

### Enterprise

URL: https://rotor.sh/docs/enterprise/index

## Audit Export

Export your workspace's full audit trail to your data warehouse for compliance,
forensic analysis, and long-term retention.

- [Audit Export to Snowflake](/enterprise/audit-export-snowflake) — pipe `audit_event` rows
  to Snowflake via S3 staging with a daily `COPY INTO` task.
- [Audit Export to S3](/enterprise/audit-export-s3) — write parquet files to your S3 bucket
  for Databricks, BigQuery, Redshift, or Athena ingestion.

The `GET /v1/audit?format=csv` endpoint is available **now** on all plans for ad-hoc exports.
Automated S3/Snowflake push is Enterprise-tier and ships in Phase 4.

## SSO / SAML

<Info>
  **Coming in Phase 4.** Enterprise customers will be able to configure SAML 2.0 single
  sign-on with Okta, Azure AD, and Google Workspace. Team members log in via your IdP; Rotor
  maps group memberships to workspace roles. Contact [enterprise@rotor.sh](mailto:enterprise@rotor.sh)
  to be notified when SSO reaches general availability.
</Info>

## SOC 2 Type II

<Info>
  **Coming in Phase 4.** Rotor is pursuing SOC 2 Type II certification covering Security,
  Availability, and Confidentiality trust service criteria. The audit period begins once the
  managed-worker and dedicated-Redis infrastructure (Phase 4) is stable in production for
  90 days. Expected report availability: Q4 2026. Contact [enterprise@rotor.sh](mailto:enterprise@rotor.sh)
  to receive a draft controls summary under NDA.
</Info>

## Contact

For Enterprise pricing, custom contracts, or data-residency requirements:
[enterprise@rotor.sh](mailto:enterprise@rotor.sh)

### Snowflake Audit Export

URL: https://rotor.sh/docs/enterprise/audit-export-snowflake

<Warning>
  Audit-event data is available **now** via `GET /v1/audit?format=csv`. Automated push to
  S3/Snowflake is an Enterprise-tier add-on shipping in Phase 4 — until then, customers can
  run the documented `pg_dump` + `COPY INTO` recipe themselves on a CSV export.
</Warning>

## Why Snowflake

The `audit_event` table is append-only and partitioned by month — a perfect fit for Snowflake's
billing model. Rotor generates structured JSON in the `metadata` column (JSONB in Postgres,
VARIANT in Snowflake), which Snowflake queries natively with `metadata:key` dot-notation.

Key advantages of routing audit data through Snowflake:

- **Cost-efficient long-term storage.** Monthly parquet files are ~10× smaller than row-based
  exports. A 10M-row workspace accumulates ~200 MB parquet/month vs 2 GB raw CSV.
- **Clustered tables.** Snowflake's automatic clustering on `occurred_at` keeps query costs
  predictable for time-range scans regardless of table size.
- **VARIANT for metadata.** No schema migration is needed as Rotor adds fields to the metadata
  payload — Snowflake's semi-structured storage handles schema evolution transparently.

## Pipeline Overview

```
Postgres (audit_event, partitioned by month)
   │
   │  pg_dump --format=custom → parquet conversion (or direct CSV → S3 → Snowflake)
   ▼
S3 bucket (customer-managed or Rotor-managed)
   s3://bucket/rotor-audit/workspace=<ws_id>/year=YYYY/month=MM/
   │
   │  Snowflake COPY INTO (scheduled daily task, X-SMALL warehouse)
   ▼
Snowflake table: rotor_audit_events
```

**Data residency note:** Enterprise customers may require that Rotor writes parquet files to a
customer-owned S3 bucket, and that no data transits Rotor-managed S3. See the
[Customer-Managed Bucket Variant](#customer-managed-s3-bucket-variant) section below.

## Schema

The `audit_event` Postgres table maps to Snowflake as follows:

| Postgres column  | Postgres type    | Snowflake type   | Notes                                      |
|------------------|------------------|------------------|--------------------------------------------|
| `id`             | UUID             | VARCHAR(36)      | Hyphen-separated UUID string               |
| `workspace_id`   | TEXT             | VARCHAR(64)      | Foreign key; use for row-level security    |
| `event_type`     | TEXT             | VARCHAR(128)     | e.g. `webhook.created`, `job.completed`    |
| `actor_id`       | TEXT             | VARCHAR(64)      | API key ID or system actor                 |
| `resource_id`    | TEXT             | VARCHAR(256)     | Affected resource ID (nullable)            |
| `metadata`       | JSONB            | VARIANT          | Event-specific payload; query with `:`     |
| `occurred_at`    | TIMESTAMPTZ      | TIMESTAMP_TZ(9)  | Cluster key; used for COPY INTO partition  |
| `ip_address`     | TEXT             | VARCHAR(45)      | IPv4 or IPv6 (nullable)                    |

## S3 Stage Setup

Create a Snowflake external stage pointing to your S3 bucket. Replace `<ACCOUNT_ID>`,
`<BUCKET>`, and `<ROLE_ARN>` with your values:

```sql
-- Create storage integration (run once as ACCOUNTADMIN)
CREATE OR REPLACE STORAGE INTEGRATION rotor_audit_s3
  TYPE = EXTERNAL_STAGE
  STORAGE_PROVIDER = 'S3'
  ENABLED = TRUE
  STORAGE_AWS_ROLE_ARN = '<ROLE_ARN>'
  STORAGE_ALLOWED_LOCATIONS = ('s3://<BUCKET>/rotor-audit/');

-- Retrieve the Snowflake AWS account ID and external ID for IAM trust policy
DESC INTEGRATION rotor_audit_s3;

-- Create the stage
CREATE OR REPLACE STAGE rotor_audit
  STORAGE_INTEGRATION = rotor_audit_s3
  URL = 's3://<BUCKET>/rotor-audit/'
  FILE_FORMAT = (TYPE = 'PARQUET');
```

### IAM Role (trust policy)

Attach this trust policy to the IAM role you pass as `STORAGE_AWS_ROLE_ARN`. Replace
`<SNOWFLAKE_ACCOUNT_ID>` and `<SNOWFLAKE_EXTERNAL_ID>` with the values from `DESC INTEGRATION`:

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "AWS": "arn:aws:iam::<SNOWFLAKE_ACCOUNT_ID>:root"
      },
      "Action": "sts:AssumeRole",
      "Condition": {
        "StringEquals": {
          "sts:ExternalId": "<SNOWFLAKE_EXTERNAL_ID>"
        }
      }
    }
  ]
}
```

### IAM permissions policy

The role needs minimal S3 access on the audit prefix:

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": ["s3:GetObject", "s3:ListBucket"],
      "Resource": [
        "arn:aws:s3:::<BUCKET>",
        "arn:aws:s3:::<BUCKET>/rotor-audit/*"
      ]
    }
  ]
}
```

## COPY INTO Command

Run this after each daily parquet drop to load the previous day's partition:

```sql
COPY INTO rotor_audit_events
FROM @rotor_audit/workspace=<YOUR_WS_ID>/year=2026/month=04/
FILE_FORMAT = (TYPE = 'PARQUET')
MATCH_BY_COLUMN_NAME = CASE_INSENSITIVE
ON_ERROR = 'CONTINUE';
```

For a full historical backfill, omit the date-prefix to copy all partitions:

```sql
COPY INTO rotor_audit_events
FROM @rotor_audit/workspace=<YOUR_WS_ID>/
FILE_FORMAT = (TYPE = 'PARQUET')
MATCH_BY_COLUMN_NAME = CASE_INSENSITIVE
ON_ERROR = 'CONTINUE';
```

## Scheduling (Daily Snowflake Task)

Schedule the COPY INTO to run automatically each night. This keeps warehouse size
at X-SMALL since parquet files for a typical 10M-row workspace are ~20 MB/day:

```sql
CREATE OR REPLACE TASK load_rotor_audit_daily
  WAREHOUSE = 'COMPUTE_WH'
  SCHEDULE = 'USING CRON 0 3 * * * UTC'   -- 03:00 UTC daily
AS
  COPY INTO rotor_audit_events
  FROM @rotor_audit/workspace=<YOUR_WS_ID>/
  FILE_FORMAT = (TYPE = 'PARQUET')
  MATCH_BY_COLUMN_NAME = CASE_INSENSITIVE
  ON_ERROR = 'CONTINUE';

ALTER TASK load_rotor_audit_daily RESUME;
```

### Cost estimate

| Workspace size     | Parquet/day | Warehouse time | Approx cost/month |
|--------------------|-------------|----------------|-------------------|
| 1M events/month    | ~2 MB       | ~10s X-SMALL   | under $1          |
| 10M events/month   | ~20 MB      | ~30s X-SMALL   | ~$2               |
| 100M events/month  | ~200 MB     | ~3 min X-SMALL | ~$15              |

Prices based on Snowflake on-demand at $3.00/credit; X-SMALL = 1 credit/hour.

## Customer-Managed S3 Bucket Variant

Enterprise customers with data-residency requirements can configure Rotor to write parquet
files directly to a customer-owned S3 bucket. In this mode:

1. Rotor assumes a customer-provided IAM role via `sts:AssumeRole` (cross-account).
2. The customer S3 bucket receives parquet files at the standard Hive-style prefix:
   `s3://customer-bucket/rotor-audit/workspace=<ws_id>/year=YYYY/month=MM/day=DD/`.
3. Rotor never reads the data back — write-only access via `s3:PutObject`.
4. The customer runs their own Snowflake COPY INTO on their own schedule.

To enable, contact [enterprise@rotor.sh](mailto:enterprise@rotor.sh) with your AWS account ID
and the ARN of the role you want Rotor to assume.

## Verification

After each COPY INTO, verify the row count matches the Postgres source:

```sql
-- Snowflake: count for a specific day
SELECT COUNT(*) FROM rotor_audit_events
WHERE occurred_at::DATE = '2026-04-13'
  AND workspace_id = '<YOUR_WS_ID>';
```

```sql
-- Postgres: equivalent count (run against your audit replica)
SELECT COUNT(*) FROM audit_event
WHERE occurred_at::DATE = '2026-04-13'
  AND workspace_id = '<YOUR_WS_ID>';
```

Row counts should match. A discrepancy of ≤0.01% is acceptable (COPY INTO `ON_ERROR=CONTINUE`
skips malformed rows; these are logged in `COPY_HISTORY`).

```sql
-- Check for skipped rows
SELECT * FROM TABLE(INFORMATION_SCHEMA.COPY_HISTORY(
  TABLE_NAME => 'ROTOR_AUDIT_EVENTS',
  START_TIME => DATEADD('hour', -25, CURRENT_TIMESTAMP())
))
WHERE STATUS = 'PARTIALLY_LOADED';
```

---

*Reference: Snowflake audit log to S3 pipeline pattern — see [Snowflake docs: Loading from S3](https://docs.snowflake.com/en/user-guide/data-load-s3).*

### S3 Audit Export

URL: https://rotor.sh/docs/enterprise/audit-export-s3

<Warning>
  Audit-event data is available **now** via `GET /v1/audit?format=csv`. Automated push to
  S3 is an Enterprise-tier add-on shipping in Phase 4 — until then, customers can download
  CSV exports from `GET /v1/audit?format=csv` and upload them manually to their S3 bucket
  using the schema and prefix layout documented below.
</Warning>

## Output Schema

Parquet files produced by Rotor follow this column schema, derived from the `audit_event`
Postgres table (shipped in plan 02-01):

| Column         | Parquet type              | Nullable | Notes                                      |
|----------------|---------------------------|----------|--------------------------------------------|
| `id`           | STRING                    | No       | UUID (hyphen-separated)                    |
| `workspace_id` | STRING                    | No       | Workspace key (`ws_...`)                   |
| `event_type`   | STRING                    | No       | e.g. `job.completed`, `guardrail.blocked`  |
| `actor_id`     | STRING                    | No       | API key ID or `system`                     |
| `resource_id`  | STRING                    | Yes      | Affected resource ID (job ID, webhook ID)  |
| `metadata`     | STRING (JSON-encoded)     | Yes      | Event-specific payload as JSON string      |
| `occurred_at`  | INT96 (TIMESTAMP_MICROS)  | No       | UTC; partition key                         |
| `ip_address`   | STRING                    | Yes      | IPv4 or IPv6                               |

`metadata` is written as a JSON string (not a nested struct) to maximise compatibility across
Databricks, BigQuery, and Redshift, all of which handle JSON strings with native functions
(`JSON_VALUE`, `from_json`, `JSON_EXTRACT_PATH_TEXT`).

## Bucket Layout

Files are written in **Hive-style partitioning** for compatibility with Athena, BigQuery
wildcard queries, and Databricks Auto Loader:

```
s3://customer-bucket/rotor-audit/
  workspace=ws_abc123/
    year=2026/
      month=04/
        day=13/
          part-00000.parquet
          part-00001.parquet
          _checksum.sha256       ← row count + sha256 of all parts
```

Each `part-NNNNN.parquet` file is at most 128 MB uncompressed (snappy-compressed in practice).
A `_checksum.sha256` file is written atomically after all parts for a day complete, enabling
idempotent detection of complete vs. in-progress drops.

## Schedule

By default, Rotor exports the previous day's events at **02:00 UTC** each day. The schedule
is configurable per workspace (Enterprise tier). Contact [enterprise@rotor.sh](mailto:enterprise@rotor.sh)
to change frequency or timezone.

## IAM Setup

Rotor assumes a customer-provided IAM role to write to your S3 bucket. The role needs only
`s3:PutObject` on the audit prefix — Rotor never reads from your bucket:

### Step 1: Create the IAM role

Create a new IAM role with a trust policy that allows Rotor's AWS account to assume it:

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "AWS": "arn:aws:iam::123456789012:root"
      },
      "Action": "sts:AssumeRole",
      "Condition": {
        "StringEquals": {
          "sts:ExternalId": "<PROVIDED_BY_ROTOR>"
        }
      }
    }
  ]
}
```

Replace `123456789012` with Rotor's AWS account ID (provided during Enterprise onboarding).
The `ExternalId` is unique per workspace and is provided by Rotor to prevent confused-deputy
attacks.

### Step 2: Attach a minimal permissions policy

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "s3:PutObject",
        "s3:AbortMultipartUpload",
        "s3:ListMultipartUploadParts"
      ],
      "Resource": "arn:aws:s3:::customer-bucket/rotor-audit/*"
    }
  ]
}
```

### Step 3: Provide Rotor with the role ARN

Contact [enterprise@rotor.sh](mailto:enterprise@rotor.sh) with your:
- AWS account ID
- IAM role ARN
- S3 bucket name and region
- Preferred export schedule (default: daily at 02:00 UTC)

## Athena Example Query

Once files are in S3, register a Glue table pointing to the Hive partition layout:

```sql
CREATE EXTERNAL TABLE rotor_audit (
  id           STRING,
  workspace_id STRING,
  event_type   STRING,
  actor_id     STRING,
  resource_id  STRING,
  metadata     STRING,
  occurred_at  TIMESTAMP,
  ip_address   STRING
)
PARTITIONED BY (year STRING, month STRING, day STRING)
STORED AS PARQUET
LOCATION 's3://customer-bucket/rotor-audit/workspace=ws_abc123/'
TBLPROPERTIES ('parquet.compression'='SNAPPY');

-- Load partitions
MSCK REPAIR TABLE rotor_audit;
```

Run queries filtered to a specific event type:

```sql
-- Find all guardrail blocks for a workspace in April
SELECT occurred_at, actor_id, JSON_EXTRACT_SCALAR(metadata, '$.reason') AS reason
FROM rotor_audit
WHERE workspace = 'ws_abc123'
  AND event_type = 'guardrail.blocked'
  AND year = '2026' AND month = '04'
ORDER BY occurred_at DESC
LIMIT 100;
```

Cost estimate: Athena charges $5/TB scanned. A 10M-row workspace generates ~20 MB
parquet/day; scanning one month of data costs approximately $0.003.

## BigQuery Transfer

Use [BigQuery Data Transfer Service](https://cloud.google.com/bigquery/docs/s3-transfer)
to ingest S3 parquet files into BigQuery automatically:

1. In BigQuery, go to **Data Transfers** → **Create Transfer**.
2. Select **Amazon S3** as the source.
3. Set the source URI pattern:
   `s3://customer-bucket/rotor-audit/workspace=<ws_id>/year=*/month=*/day=*/*.parquet`
4. Configure the service account with cross-account S3 read access.
5. Set schedule to **Daily** starting at 04:00 UTC (after Rotor's 02:00 UTC export completes).

BigQuery automatically infers the Parquet schema. The `metadata` column is ingested as
`STRING` and can be queried with `JSON_VALUE(metadata, '$.reason')`.

## Databricks Auto Loader

Databricks Auto Loader natively supports S3 + Hive partitioning:

```python
df = (
  spark.readStream
    .format("cloudFiles")
    .option("cloudFiles.format", "parquet")
    .option("cloudFiles.schemaLocation", "/mnt/checkpoints/rotor-audit-schema")
    .load("s3://customer-bucket/rotor-audit/workspace=ws_abc123/")
)

df.writeStream \
  .format("delta") \
  .option("checkpointLocation", "/mnt/checkpoints/rotor-audit") \
  .table("rotor_audit_events")
```

Auto Loader processes only new files on each trigger, making it efficient for daily batch
drops as well as near-real-time streaming if Rotor is configured for hourly exports.

## Redshift Spectrum

For Redshift customers, use Redshift Spectrum to query directly from S3 without ingestion:

```sql
-- Create external schema pointing to your S3 audit data
CREATE EXTERNAL SCHEMA rotor_audit
FROM DATA CATALOG
DATABASE 'rotor_audit_db'
IAM_ROLE 'arn:aws:iam::<your-account>:role/RedshiftSpectrumRole';

-- Query directly
SELECT event_type, COUNT(*) as cnt
FROM rotor_audit.audit_events
WHERE workspace_id = 'ws_abc123'
  AND occurred_at > GETDATE() - INTERVAL '30 days'
GROUP BY 1
ORDER BY cnt DESC;
```

## Verification

After each daily export, verify completeness using the `_checksum.sha256` file:

```bash
# Download checksum file
aws s3 cp \
  s3://customer-bucket/rotor-audit/workspace=ws_abc123/year=2026/month=04/day=13/_checksum.sha256 \
  ./checksum.sha256

# File format: "<row_count> <sha256_hex_of_all_parts>"
cat checksum.sha256
# Example: 142857 a3f9...

# Compare with Postgres source count
psql $DATABASE_URL -c "
  SELECT COUNT(*)
  FROM audit_event
  WHERE workspace_id = 'ws_abc123'
    AND occurred_at::date = '2026-04-13'
"
```

Row counts should match. Contact [enterprise@rotor.sh](mailto:enterprise@rotor.sh) if you
observe consistent discrepancies greater than 0.01%.

---

## Operators

### Admin MCP Server

URL: https://rotor.sh/docs/operators/admin-mcp

The **Rotor admin MCP server** lets a Claude Code session on your laptop inspect and triage a Rotor deployment in natural language. No SSH, no `redis-cli`.

<Note>
This is separate from the customer-facing Rotor MCP at `rotor.sh/mcp` (which uses `rt_ws_` / `rt_team_` keys). The admin MCP is operator-only and gated by a single `ADMIN_TOKEN` shared with the API.
</Note>

## Install

<Steps>
  <Step title="Build the server">
    From the rotor monorepo root:
    ```bash
    pnpm --filter=@rotor/mcp-server build
    ```
  </Step>
  <Step title="Register with Claude Code">
    ```bash
    claude mcp add rotor-admin -- node $(pwd)/packages/mcp-server/dist/index.js
    ```
  </Step>
  <Step title="Set environment variables">
    Add to your shell profile or your Claude Code MCP config:
    ```bash
    export ROTOR_ADMIN_TOKEN="<value of ADMIN_TOKEN on the API>"
    export ROTOR_API_URL="https://api.rotor.sh"  # optional override
    ```
  </Step>
  <Step title="Verify">
    Open a fresh Claude Code session and ask "What workspaces are running on Rotor?" — Claude should call `rotor_list_workspaces` and render the result.
  </Step>
</Steps>

## Available tools

| Tool                           | Destructive | Description                                                |
| ------------------------------ | ----------- | ---------------------------------------------------------- |
| `rotor_list_workspaces`        | —           | List all workspaces with plan and creation date            |
| `rotor_get_workspace_queues`   | —           | Queue depth per state for every queue in a workspace       |
| `rotor_get_workspace_schedules`| —           | Schedules with BullMQ state and `drift` flag               |
| `rotor_inspect_job`            | —           | Full job detail (data, return value, failure reason)       |
| `rotor_get_dlq`                | —           | DLQ depth and recent failed jobs                           |
| `rotor_replay_dlq`             | yes         | Idempotently re-enqueue DLQ jobs (deterministic IDs)       |
| `rotor_fire_schedule_now`      | yes         | Out-of-band schedule execution (does not reset cron)       |

Destructive tools require `confirm: true` in the call arguments. Claude Code will ask you to confirm before invoking them.

## Idempotency guarantees

`rotor_replay_dlq` uses BullMQ's job-id uniqueness to dedup replays. Calling it twice with the same parameters produces the same `replay-${originalId}` IDs and BullMQ silently no-ops the second `addBulk`.

`rotor_fire_schedule_now` enqueues a one-shot job — it does **not** call `upsertJobScheduler`. The cron's normal `next_run_at` is unchanged.

## Troubleshooting

See the package README at `packages/mcp-server/README.md` for env var and 403 troubleshooting.

---

## Trust

### Security

URL: https://rotor.sh/docs/security/index

# Security

<Note>SOC 2 Type II observation window started 2026-04-20; target report ~2026-10-20.</Note>

## Posture

- **SOC 2 Type II** in observation (start: 2026-04-20; target report: ~2026-10-20)
- All production traffic **TLS 1.3+**
- Data at rest: **AES-256** (Supabase Postgres, AWS S3)
- Secrets: Railway vault + AWS KMS; rotated on incident + annually
- Penetration test: annual (next: 2026-10)
- Vanta-monitored: Supabase, Fly.io, AWS, GitHub, Slack

## Sub-processors

| Sub-processor | Purpose | Data | Region |
| --- | --- | --- | --- |
| Supabase | Auth, Postgres | PII, audit\_event, team\_member | us-east-1 |
| Railway | Redis, application hosting | transient queue state | us-west2 / us-east-1 |
| Fly.io | Managed workers (Enterprise) | customer compute | multi-region |
| Stripe | Billing | customer\_id, invoice | US |
| Anthropic | Brand-tone LLM judge | outbound payload samples | US |
| Slack | Approval callbacks | approval metadata | US |
| AWS | S3 staging (audit export) | Parquet exports | us-east-1 |
| Sentry | Error tracking | stack traces, user\_id | US |

## Data Handling

- **Encryption at rest**: AES-256-GCM for customer secrets (webhook, callback, mTLS CA) and audit-export credentials
- **Encryption in transit**: TLS 1.3+
- **Retention**: plan-based — Free 7d · Pro 30d · Team 90d · Enterprise 365d+
- **PII redaction**: applied before worker receives payload (see Guardrail Engine)
- **Audit logs**: partitioned by month, immutable, 1-year retention minimum

## Incident Response

We commit to disclosing confirmed breaches affecting customer data **within 24 hours** of confirmation.

To report a vulnerability or security incident:
- **Email**: security@rotor.sh
- We acknowledge reports within 24 hours
- Critical issues are remediated within 7 days

## Policies

The following policies govern our security program:

- [Acceptable Use Policy](/compliance/acceptable-use-policy)
- [Access Control Policy](/compliance/access-control-policy)
- [Change Management Policy](/compliance/change-management-policy)
- [Incident Response Policy](/compliance/incident-response-policy)
- [Data Retention Policy](/compliance/data-retention-policy)
- [Vendor Management Policy](/compliance/vendor-management-policy)
- [Risk Assessment Policy](/compliance/risk-assessment-policy)

### compliance/acceptable-use-policy

URL: https://rotor.sh/docs/compliance/acceptable-use-policy

# Acceptable Use Policy

Status: ACTIVE
Owner: Daan (daan@shyft.ai)
Effective: 2026-04-20
Last reviewed: 2026-04-20
Next review: 2026-10-20

## Purpose

Define expected behavior for all users of rotor.sh services — employees, contractors, and customers — to protect the platform's integrity, security, and other users' quality of service.

## Scope

All Rotor employees, contractors, and end customers (including free-tier, Pro, Team, and Enterprise plans).

## Policy

### 1. Authorized Use Only

Users may only interact with rotor.sh APIs, SDKs, MCP tools, and CLI using credentials they have legitimately obtained. Sharing API keys outside one's organization is prohibited.

### 2. Tenant Isolation Respect

Users must not attempt to circumvent BullMQ tenant isolation (the `{ws_<id>}` prefix boundary). Probing, enumerating, or accessing another workspace's queues, jobs, or audit events is prohibited regardless of technical feasibility.

### 3. PII and Sensitive Data Handling

Payloads must not contain unencrypted PII outside the supported PII-redaction path (Guardrail Engine). Users who need to process PII must enable PII redaction in their workspace's guardrail config before enqueuing jobs containing personal data.

### 4. Rate and Quota Compliance

Users must not attempt to circumvent per-plan job-execution quotas (Free: 10k/mo, Pro: 100k/mo, Team: 1M/mo). Artificially spreading load across multiple free-tier workspaces to exceed limits is prohibited and will result in account termination.

### 5. No Unauthorized Scraping or Reverse Engineering

Users must not attempt to reverse-engineer rotor.sh internal APIs, Redis key structures, or BullMQ Lua scripts beyond what is documented in the public API reference.

### 6. No Abuse of the Approval Flow

The approvals system is intended for genuine human-in-the-loop oversight. Automated scripts that auto-approve all jobs to bypass guardrail review violate the spirit of the approval system and may trigger account review.

### 7. No Malicious Payloads

Job payloads must not contain instructions intended to exploit Rotor's infrastructure, other customers' handlers, or downstream callback recipients. This includes code injection, prompt injection against the brand-tone LLM judge, and SSRF payloads targeting callback URLs.

### 8. Compliance with Laws

Users are responsible for ensuring their use of rotor.sh complies with applicable laws and regulations, including data protection laws (GDPR, CCPA) and export control regulations.

## Enforcement

Violations may result in:
- Immediate API key revocation
- Workspace suspension (BIL-06 kill-switch)
- Account termination without refund
- Legal action where warranted

Suspected violations are logged as `compliance.aup_violation` audit events and reviewed by the security team.

## Review Cadence

This policy is reviewed annually. Next review: 2026-10-20.

### compliance/access-control-policy

URL: https://rotor.sh/docs/compliance/access-control-policy

# Access Control Policy

Status: ACTIVE
Owner: Daan (daan@shyft.ai)
Effective: 2026-04-20
Last reviewed: 2026-04-20
Next review: 2026-10-20

## Purpose

Define how access to rotor.sh systems, data, and administrative functions is granted, maintained, and revoked — covering both internal team access and customer-facing RBAC.

## Scope

All Rotor employees, contractors, and enterprise customers using SSO/SAML + RBAC features.

## Policy

### 1. Customer-Facing RBAC (ENT-01)

rotor.sh implements a fixed four-role model:
- **admin**: Full workspace control; manages members, API keys, SSO config, billing.
- **developer**: Manages queues, jobs, schedules, guardrails, webhooks; cannot manage members.
- **approver**: Approves or rejects pending jobs; cannot manage queues or members.
- **viewer**: Read-only access to runs, audit events, and metrics.

Role assignments follow least-privilege: new invites default to `viewer` unless explicitly set by an admin. The `approver` capability is additive — a developer's team member record may include `approver` in their capabilities array.

### 2. API Key Hierarchy

API keys follow a three-tier hierarchy (AUTH-03):
- **Team keys** (`rtr_t_*`): Cross-workspace access; issued only to admins.
- **Workspace keys** (`rtr_w_*`): Single-workspace access; issued by admins/developers.
- **Queue keys** (`rtr_q_*`): Single-queue access; issued by admins/developers.

Keys expire after 90 days unless extended by an admin. Rotated keys (via `POST /v1/api-keys/:id/rotate`) inherit the parent key's expiry.

### 3. SSO / SAML (ENT-01)

Enterprise customers may configure SAML 2.0 SSO via Supabase Auth. SAML role assertions (`raw_app_meta_data.role`) override invite-time roles where the asserted role is a valid four-role value. The Supabase SAML provider toggle must be enabled per-project by the Rotor operator.

### 4. Internal Team Access

Rotor employees access production infrastructure via:
- Supabase Dashboard: email + TOTP 2FA required.
- Railway Dashboard: SSO via GitHub; 2FA required on GitHub.
- Fly.io Dashboard: email + TOTP 2FA required.
- AWS Console: IAM roles with MFA enforcement; no root account usage.

New employee access is provisioned on their first day; deprovisioned within 24 hours of offboarding.

### 5. Secret Rotation

- Webhook signing secrets: rotated via `POST /v1/webhooks/:id/rotate-secret`.
- Callback signing secrets: rotated via `PATCH /v1/queues/:id` (`rotate_callback_secret: true`).
- WEBHOOK_SECRET_ENCRYPTION_KEY: rotated annually and on any credential-exposure incident.
- All secrets stored in Railway vault; never in `.env` files committed to version control.

### 6. Orphan / Legacy Keys

Orphan API key rows (null or broken `team_member_id`) default to `role=viewer` (least-privilege). `scripts/audit-orphan-api-keys.sh` runs as a pre-deploy CI gate before production cutover.

## Review Cadence

Reviewed annually. Next review: 2026-10-20.

### compliance/change-management-policy

URL: https://rotor.sh/docs/compliance/change-management-policy

# Change Management Policy

Status: ACTIVE
Owner: Daan (daan@shyft.ai)
Effective: 2026-04-20
Last reviewed: 2026-04-20
Next review: 2026-10-20

## Purpose

Ensure that changes to rotor.sh production systems are reviewed, tested, and deployed in a controlled manner to minimize risk and maintain availability.

## Scope

All changes to production infrastructure: code deployments (apps/api, apps/worker, apps/www), database migrations, Railway/Fly configuration changes, and infrastructure-as-code changes.

## Policy

### 1. Pull Request Review

All code changes require at least one approving review from a Rotor team member before merge. Self-approval is prohibited except for hotfixes under the emergency procedure below.

### 2. CI Gates (must pass before merge)

The following CI checks must pass on every PR:
- `pnpm typecheck` — TypeScript compilation across all packages
- `pnpm test` — full unit test suite (≥95% pass rate; flaky tests are fixed within 24h)
- `bash scripts/audit-prefix.sh` — BullMQ prefix isolation invariant
- `bash scripts/audit-compliance-docs.sh` — no PLACEHOLDER compliance docs
- `bash scripts/audit-openapi.sh` — all /v1/ routes documented

### 3. Deploy Cadence

- **Staging**: automatic deploy on merge to `main` via Railway CI
- **Production**: manual promotion from staging after smoke tests pass (minimum 30 min soak)
- **Database migrations**: applied via `pnpm db:migrate` before worker/API restart; rollback script required in PR description for all DDL changes

### 4. Feature Flags

New features that affect billing, quotas, or enterprise behavior must be gated behind a feature flag in the `feature_flags` table. This allows instant rollback without a code deployment.

### 5. Emergency Hotfix Procedure

For critical production incidents (data loss, auth bypass, billing error):
1. Create a hotfix branch directly from the last production tag.
2. Implement the minimal fix; add a regression test.
3. Get async review from at least one team member (Slack DM acceptable for P0).
4. Deploy directly to production; backfill the PR review within 2 business days.
5. File an incident report within 24 hours.

### 6. Secrets and Environment Variables

Environment variable changes (Railway vault) must be:
- Documented in the PR description with the variable name (not value).
- Applied in staging first, then production.
- Never committed to version control.

## Review Cadence

Reviewed annually. Next review: 2026-10-20.

### compliance/incident-response-policy

URL: https://rotor.sh/docs/compliance/incident-response-policy

# Incident Response Policy

Status: ACTIVE
Owner: Daan (daan@shyft.ai)
Effective: 2026-04-20
Last reviewed: 2026-04-20
Next review: 2026-10-20

## Purpose

Define the process for detecting, responding to, and communicating security incidents and service disruptions affecting rotor.sh customers.

## Scope

All security incidents (data breaches, credential exposure, unauthorized access) and significant service disruptions (>1h downtime, data loss, billing errors).

## Definitions

- **P0 — Critical**: Data breach, auth bypass, billing fraud, or complete service outage. Response starts within 1 hour.
- **P1 — High**: Partial service outage, significant performance degradation (>2x latency), or credential exposure. Response starts within 4 hours.
- **P2 — Medium**: Non-critical bug affecting a subset of customers. Response within 24 hours.
- **P3 — Low**: Cosmetic issues, documentation errors. Response within 72 hours.

## Incident Response Process

### 1. Detection

Incidents may be detected via:
- Sentry error alerts (apps/api, apps/worker)
- Railway service health monitors
- Customer reports to support@rotor.sh or security@rotor.sh
- Vanta compliance monitoring alerts
- Internal monitoring (BullMQ job failure spikes, Postgres connection errors)

### 2. Triage (within 1h for P0, 4h for P1)

1. Confirm the incident is real (not a false positive).
2. Classify severity (P0–P3).
3. Identify affected scope: which workspaces, customers, data types.
4. Open an incident Slack channel: `#incident-YYYY-MM-DD-brief-description`.
5. Assign an incident commander (IC).

### 3. Containment

- For credential exposure: rotate affected secrets immediately (WEBHOOK_SECRET_ENCRYPTION_KEY, Supabase service key, Railway tokens).
- For unauthorized access: revoke affected API keys / sessions via Supabase admin.
- For service disruption: engage Railway/Fly support and activate the kill-switch (BIL-06) for affected workspaces if data integrity is at risk.

### 4. Eradication and Recovery

- Identify root cause; implement the minimum fix.
- Apply the fix via the emergency hotfix procedure (see Change Management Policy).
- Validate fix in staging before applying to production.
- Restore service; verify affected customers can access their data.

### 5. Customer Communication (P0 / P1 — 24h disclosure commitment)

We commit to disclosing confirmed breaches affecting customer data **within 24 hours** of confirmation.

Communication channels:
- Email to affected workspace admins (sourced from Supabase team_member table).
- Status page update at status.rotor.sh.
- For Enterprise: direct phone/Slack contact if account has a CSM.

Disclosure must include:
- What happened and when (UTC timestamps).
- What data was affected.
- What we have done to contain it.
- What customers should do (e.g. rotate API keys).

### 6. Post-Incident Review

Within 5 business days of P0/P1 resolution:
- Write a blameless post-mortem (5 Whys format).
- Document timeline, contributing factors, and action items.
- Share with affected customers on request.
- Add action items to engineering backlog with due dates.

## Escalation Contacts

| Role | Contact |
|------|---------|
| Incident Commander | daan@shyft.ai |
| Supabase support | support.supabase.com |
| Railway support | railway.app/help |
| Fly.io support | fly.io/docs/support |
| Sentry on-call | via Sentry alerting rules |

## Review Cadence

Reviewed annually. Next review: 2026-10-20.

### compliance/data-retention-policy

URL: https://rotor.sh/docs/compliance/data-retention-policy

# Data Retention Policy

Status: ACTIVE
Owner: Daan (daan@shyft.ai)
Effective: 2026-04-20
Last reviewed: 2026-04-20
Next review: 2026-10-20

## Purpose

Define how long rotor.sh retains customer data and system logs, and how data is deleted when it is no longer needed.

## Scope

All customer data stored by rotor.sh: job history (Redis + Postgres), audit events, webhook delivery logs, billing records, and team/user PII.

## Retention Schedule

| Data Type | Free | Pro | Team | Enterprise |
|-----------|------|-----|------|------------|
| Active jobs (Redis) | 7 days after completion | 30 days | 90 days | 365 days |
| job_history (Postgres archive) | 7 days | 30 days | 90 days | 365 days |
| audit_event | 1 year | 1 year | 1 year | 1 year (+ custom on request) |
| Webhook delivery logs | Same as job history | Same | Same | Same |
| Billing records (Stripe) | Indefinite (legal) | Indefinite | Indefinite | Indefinite |
| Team / user PII | Until account deletion + 30 days | Same | Same | Same |
| COGS daily metrics | 2 years | N/A | N/A | 2 years |

### Notes

- **Retention enforcement**: The nightly history archiver (`0 2 * * *` UTC) moves completed/failed jobs from Redis to the `job_history` Postgres table according to the workspace's plan retention. After the retention period, jobs are permanently deleted from the archive.
- **Partition cleanup**: Monthly Postgres partitions for `audit_event` and `job_history` older than the applicable retention period are dropped by the monthly partition preflight job.
- **Enterprise custom retention**: Enterprise customers with contractual requirements beyond 365 days should contact enterprise@rotor.sh to arrange a custom export pipeline.

## PII Handling

- PII is redacted at the Guardrail Engine layer before job payloads reach workers (when `pii_redaction_enabled = true` in the workspace guardrail config).
- Auth PII (email, user_id) stored in Supabase `auth.users` is deleted within 30 days of account deletion via Supabase auth admin API.
- Stripe billing PII is retained per Stripe's own retention policy (Stripe is the controller for payment card data; Rotor is a processor).

## Data Deletion Requests

Customers may request deletion of their workspace data by emailing privacy@rotor.sh. Rotor will acknowledge within 5 business days and complete deletion within 30 days (or within the timeframe required by applicable law).

Deletion covers:
- All jobs, job history, and audit events for the workspace
- Webhook endpoints and delivery history
- COGS daily metrics (if Enterprise)
- Team members and API keys

Deletion does NOT cover:
- Billing records required for legal/tax purposes (retained 7 years)
- Anonymized aggregate metrics that cannot identify the workspace

## Review Cadence

Reviewed annually. Next review: 2026-10-20.

### compliance/vendor-management-policy

URL: https://rotor.sh/docs/compliance/vendor-management-policy

# Vendor Management Policy

Status: ACTIVE
Owner: Daan (daan@shyft.ai)
Effective: 2026-04-20
Last reviewed: 2026-04-20
Next review: 2026-10-20

## Purpose

Define how rotor.sh evaluates, onboards, monitors, and offboards third-party vendors (sub-processors) who handle customer data or provide critical infrastructure.

## Scope

All third-party services that (a) store or process customer data, or (b) are in the critical path of rotor.sh production availability.

## Current Sub-processors

| Vendor | Purpose | Data Processed | Region | SOC 2 / Compliance |
|--------|---------|----------------|--------|-------------------|
| Supabase | Auth, Postgres | PII, audit_event, team_member | us-east-1 | SOC 2 Type II |
| Railway | Redis, app hosting | transient queue state | us-west2 / us-east-1 | SOC 2 (in progress) |
| Fly.io | Managed workers (Enterprise) | customer compute | multi-region | SOC 2 Type II (Compliance package) |
| Stripe | Billing | customer_id, invoice | US | PCI DSS Level 1, SOC 2 |
| Anthropic | Brand-tone LLM judge | outbound payload samples | US | Trust & Safety policy |
| Slack | Approval callbacks | approval metadata | US | SOC 2 Type II |
| AWS | S3 staging (audit export) | Parquet exports | us-east-1 | SOC 2 Type II, ISO 27001 |
| Sentry | Error tracking | stack traces, user_id | US | SOC 2 Type II |

## Vendor Evaluation Criteria

New sub-processors must be evaluated against:
1. **Security posture**: SOC 2 Type II report (or equivalent ISO 27001 / HIPAA BAA if applicable)
2. **Data residency**: Does the vendor support US-only data residency where required by Enterprise customers?
3. **Incident response SLA**: Does the vendor commit to notification within 72 hours of a data breach?
4. **Business continuity**: What is the vendor's documented uptime SLA and DR capability?
5. **Data deletion**: Does the vendor support deletion within 30 days of contract termination?

## Vendor Monitoring

### Vanta-Integrated Vendors (automated evidence collection)
The following vendors have active Vanta integrations and are continuously monitored:
- Supabase (OAuth integration)
- Fly.io (Compliance package + Vanta OAuth)
- AWS (native Vanta integration)
- GitHub (native Vanta integration)
- Slack (native Vanta integration)

### Manual Evidence Cadence (Railway)
Railway does not have a Vanta integration. Manual evidence is collected monthly per `docs/compliance/vanta-manual-evidence.md`:
- Railway Team Members list screenshot
- Railway Project/Environment configuration screenshot
- Railway Billing invoice screenshot

### Annual Review
All sub-processors are reviewed annually for:
- Updated SOC 2 / compliance reports
- Changes to data processing terms (DPA updates)
- Service changes that affect data handling

## Sub-processor Change Process

Adding or removing a sub-processor requires:
1. Evaluation against the criteria above.
2. PR updating this document and `docs/security/index.mdx`.
3. If the vendor processes customer PII: update the customer-facing Data Processing Agreement (DPA) and notify Enterprise customers with 30 days notice.

## Vendor Offboarding

On contract termination:
1. Revoke all credentials and API tokens within 24 hours.
2. Request data deletion confirmation within 30 days.
3. Update `docs/compliance/vendor-management-policy.md` and `docs/security/index.mdx`.

## Review Cadence

Reviewed annually. Next review: 2026-10-20.

### compliance/risk-assessment-policy

URL: https://rotor.sh/docs/compliance/risk-assessment-policy

# Risk Assessment Policy

Status: ACTIVE
Owner: Daan (daan@shyft.ai)
Effective: 2026-04-20
Last reviewed: 2026-04-20
Next review: 2026-10-20

## Purpose

Define how rotor.sh identifies, assesses, prioritizes, and mitigates risks to the confidentiality, integrity, and availability of customer data and platform services.

## Scope

All risks affecting rotor.sh production systems, customer data, and business operations.

## Risk Assessment Cadence

| Activity | Frequency | Owner |
|----------|-----------|-------|
| Full risk review | Quarterly | Daan |
| New feature risk assessment | Per feature (in PR description) | Feature author |
| Vendor risk review | Annual | Daan |
| Penetration test | Annual | External firm |

## Risk Categories

### 1. Infrastructure Risks

| Risk | Likelihood | Impact | Mitigation |
|------|-----------|--------|-----------|
| Redis maxmemory eviction causing queue corruption | Low (guarded by assertNoEviction) | High | FND-07 startup guard; Railway noeviction confirmed (Phase 0) |
| Railway service outage | Medium | High | Railway 99.9% SLA; BullMQ persists jobs; worker auto-restarts |
| Supabase outage | Low | High | Supabase 99.9% SLA; read-through cache on resolver; jobs still queue in Redis |
| Fly.io managed worker failure | Medium | Medium (Enterprise only) | Autoscaler retries provisioning; BullMQ job retries with backoff |

### 2. Security Risks

| Risk | Likelihood | Impact | Mitigation |
|------|-----------|--------|-----------|
| Credential leak (API key, Railway secret) | Low | Critical | Railway vault; key rotation on detection; audit log; Sentry alerts |
| DDoS on public API | Medium | High | Railway load balancer; rate limiting (manual Redis INCR+PEXPIRE); quota middleware |
| SSRF via callback URLs | Low | High | validateCallbackUrl DNS check; SSRF guard in delivery worker |
| Tenant cross-contamination (Redis prefix escape) | Very Low | Critical | BullMQ prefix audit (audit-prefix.sh); single source of truth in prefix.ts |
| PII leakage in job payloads | Medium | High | Guardrail Engine PII redaction; configurable per-workspace |
| Compromised LLM judge output | Low | Medium | Brand-tone circuit breaker fails open (safe default); humans approve |
| Billing abuse / quota circumvention | Medium | Medium | Per-plan hard caps at API write time; Stripe reconciler detects drift |

### 3. Compliance Risks

| Risk | Likelihood | Impact | Mitigation |
|------|-----------|--------|-----------|
| SOC 2 observation window gap (Railway manual evidence) | Medium | Medium | vanta-manual-evidence.md monthly cadence |
| GDPR data subject request beyond 30-day SLA | Low | Medium | Manual process documented in data-retention-policy.md |
| API misuse / AUP violation | Medium | Low-Medium | Audit logs; kill-switch (BIL-06); AUP enforcement process |

### 4. Operational Risks

| Risk | Likelihood | Impact | Mitigation |
|------|-----------|--------|-----------|
| Stalled-job duplication (at-least-once semantics) | Medium | Medium | SDK SIGTERM drain; idempotency keys; history archiver dedup |
| BullMQ Lua script version drift | Low | Medium | peerDep pin `>=5.73 <6`; version check at startup |
| Claude Code skill distribution drift (Anthropic registry change) | Medium | Low | Fall back to local skill install; monitor Claude Code releases |

## Risk Acceptance

Risks rated **Low × Low** may be accepted without further mitigation. All other risks require a documented mitigation plan and owner.

Accepted risks must be documented in the Vanta risk register with:
- Risk description
- Current mitigation
- Residual risk level
- Acceptance date and owner

## Review Cadence

Reviewed quarterly. Next full review: 2026-07-20.