Orkidata API documentation

Quickstart

Two steps to your first authenticated API call. This page assumes you already have an Orkidata account; sign-up happens through the web UI, not the API.

1. Mint a Bearer token

Tokens are minted from Profile → API tokens in the web UI. Click Generate, copy the token (it is shown only once, so store it somewhere safe), and send it in the Authorization header of every API request.

Each account holds one active token at a time. Generating a new token immediately revokes the previous one, so when you rotate, update every system that uses the old token right away: any request still carrying the old token returns 401.

2. First authenticated request

List the workflows in your account:

curl -H "Authorization: Bearer $ORKIDATA_TOKEN" \
  https://orkidata.com/api/workflows

Response: empty account

{
  "confirmEmpty": true,
  "items": []
}

Response: populated account

{
  "items": [
    {
      "id":            "id_1777515532786_jwshcxvic",
      "name":          "Daily sales report",
      "type":          "workflow",
      "createdAt":     "2026-04-30T02:18:52Z",
      "showInSidebar": false
    },
    {
      "id":   "fold_customer_pipelines",
      "name": "Customer pipelines",
      "type": "folder"
    }
  ]
}

The response may also include a top-level confirmEmpty field. It is a write-time safety flag (see tutorial step 3) that the server persists from your last save; it carries no meaning in read responses.
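The same request can be issued from a script. The following is a minimal Python sketch using only the standard library; the helper names build_request and list_workflows are illustrative and not part of any Orkidata SDK, and the token is assumed to be supplied by the caller.

```python
import json
import urllib.request

BASE_URL = "https://orkidata.com"


def build_request(path, token, method="GET", body=None):
    """Build a request carrying the Bearer token; JSON-encode the body if given."""
    headers = {"Authorization": f"Bearer {token}"}
    data = None
    if body is not None:
        data = json.dumps(body).encode("utf-8")
        headers["Content-Type"] = "application/json"
    return urllib.request.Request(
        BASE_URL + path, data=data, headers=headers, method=method
    )


def list_workflows(token):
    """GET /api/workflows and return the items array (empty for a fresh account)."""
    request = build_request("/api/workflows", token)
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response).get("items", [])
```

Calling list_workflows(os.environ["ORKIDATA_TOKEN"]) (after import os) should return the same items array that the curl command prints.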

From here, jump to the tutorial for an end-to-end walkthrough that creates a workflow, runs it, and downloads its export, entirely via curl.

Authentication

The public API uses Bearer-token authentication only. Every request must carry an Authorization: Bearer <token> header.

Token lifecycle

Operation	Where	Notes
Mint	Profile → API tokens (UI) or `POST /api/auth/token`	The plaintext token is shown ONCE. Store it in a secret manager.
Revoke	Profile → API tokens (UI) or `DELETE /api/auth/token`	Revocation is immediate. The next request with that token returns 401.
Rotate	Mint a new one	Re-minting auto-revokes the previous active token.

Each request counts toward your account's RPM quota; see the rate limits table.

Connect AI assistants (MCP)

Orkidata is also a Model Context Protocol (MCP) server: connect it to an AI assistant and the assistant can build, run, and inspect your workflows conversationally — 44 curated tools covering the full build flow (list_step_types → create_workflow → add_step → run_workflow → results).

Server URL	`https://orkidata.com/api/mcp`
Transport	Streamable HTTP (stateless JSON responses — no SSE)
Auth (chat clients)	OAuth 2.1 with PKCE — sign in with your Orkidata account, no keys to paste
Auth (headless clients)	Bearer token from Profile → API tokens

A connected assistant acts as you. Your tier, your data, your rate limits — and workflow runs execute real side effects (emails, database writes, SFTP/Drive delivery). Only approve the consent screen for connections you started yourself, and review what the assistant proposes before letting it run side-effecting workflows.

Claude (claude.ai)

Open Settings → Connectors → Add custom connector.
Name: Orkidata. Remote MCP server URL: https://orkidata.com/api/mcp.
Leave the Advanced settings fields (OAuth Client ID / OAuth Client Secret) blank — Orkidata registers Claude automatically during the handshake.
Click Add, then Connect. A window opens on orkidata.com: sign in if prompted, check the consent screen (it names the client and where it redirects), and click Allow.
Back in Claude, Orkidata's tools appear in the tools menu. Try: “List my Orkidata workflows.”

ChatGPT (developer mode)

In Settings, enable Developer mode (menu placement varies by app version), then browse connectors/plugins and click + to add a custom one.
Name: Orkidata (icon and description optional).
MCP Server URL: https://orkidata.com/api/mcp — exactly this. The form's placeholder shows an /sse URL; Orkidata uses Streamable HTTP, so do not append /sse.
Authentication: OAuth. The advanced OAuth settings are discovered automatically from the server URL — nothing to fill in.
Acknowledge ChatGPT's third-party notice and click Create, then complete the same Orkidata sign-in and consent flow.

Claude Code, IDEs & other API clients

Headless clients authenticate with a Bearer token instead of OAuth:

claude mcp add --transport http orkidata https://orkidata.com/api/mcp \
  --header "Authorization: Bearer $ORKIDATA_TOKEN"

Any MCP client that supports Streamable HTTP with a custom Authorization header works the same way.

Behavior & limits

run_workflow is always asynchronous: it returns a runId; the assistant polls get_run_status, then fetches results or an export download link.
Every MCP request counts 1 request toward your RPM quota (see rate limits); runs count toward your hourly execution cap like any other run, and appear in usage with source mcp.
Inline uploads are capped at 3 MB; larger files go through the web UI or presigned upload URLs.
Admin, billing-mutation, and account-management surfaces are deliberately not exposed as tools.
Disconnecting: remove the connector in your assistant's settings. The assistant discards (and typically revokes) its tokens; access tokens expire on their own within an hour in any case.

Rate limits & tiers

Every account has per-minute, per-hour, and concurrency budgets sized to its tier. GET /api/usage returns your current consumption and the active limits at any time.

Tier	RPM	Exec/hr	Concurrent	Max data	Max exec sec	Emails/day
Free	30	10	1	10 MB	60	0
Entry	120	60	5	500 MB	300	500
Pro	600	300	20	5 GB	870	5,000

Inspect your current usage

curl -H "Authorization: Bearer $ORKIDATA_TOKEN" \
  https://orkidata.com/api/usage

Response (free tier, idle)

{
  "tier": "free",
  "current": {
    "concurrent_active": 0,
    "exec_hour_used": 0,
    "rpm_used": 2
  },
  "limits": {
    "rpm": 30,
    "exec_per_hour": 10,
    "concurrent": 1,
    "max_data_mb": 10,
    "max_exec_seconds": 60,
    "max_emails_per_day": 0
  }
}
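To turn that response into remaining headroom, subtract the current block from the limits block. This is a small sketch that assumes the field names shown in the sample response above; the function names are hypothetical.

```python
def remaining_budget(usage):
    """Subtract current consumption from the active limits of a /api/usage response."""
    current, limits = usage["current"], usage["limits"]
    return {
        "rpm": limits["rpm"] - current["rpm_used"],
        "exec_per_hour": limits["exec_per_hour"] - current["exec_hour_used"],
        "concurrent": limits["concurrent"] - current["concurrent_active"],
    }


def min_request_interval(usage):
    """Seconds between requests that keep a steadily paced client at or under the RPM limit."""
    return 60.0 / usage["limits"]["rpm"]
```

For the idle Free-tier sample, this leaves 28 requests in the current minute, and a client spacing its calls 2 seconds apart stays within 30 RPM.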

Error responses

Every non-2xx response is JSON with at minimum an error field:

{ "error": "Human-readable description", "retry_after": 60 }

The retry_after field appears on rate-limit responses (429).

Status	Meaning
400	Validation error in the request body or query.
401	Missing, invalid, or expired Bearer token.
403	Authenticated, but not authorized for this resource.
404	Resource doesn't exist (or you can't see it; the same code is returned on purpose).
409	Conflict; for example, re-adding a previously unsubscribed email recipient.
429	Rate limit hit. `retry_after` gives the number of seconds to wait before retrying.
500	Unhandled server error. Please report at support@orkidata.com.
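A client can lean on that shape when handling failures. The sketch below is one possible way to read an error body, assuming only the error and retry_after fields described above; parse_error is a hypothetical helper name, and the fallback message for non-JSON bodies is a defensive assumption rather than documented behavior.

```python
import json


def parse_error(status, raw_body):
    """Return (message, retry_after_seconds_or_None) for a non-2xx response."""
    try:
        payload = json.loads(raw_body)
    except (ValueError, TypeError):
        payload = {}
    if not isinstance(payload, dict):
        payload = {}
    # Fall back to the status code if the body is not the documented JSON shape.
    message = payload.get("error", f"HTTP {status}")
    # retry_after is only meaningful on rate-limit responses.
    retry_after = payload.get("retry_after") if status == 429 else None
    return message, retry_after
```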

Example: missing token

curl -i https://orkidata.com/api/workflows

Response

HTTP/1.1 401 Unauthorized
Content-Type: application/json

{"error":"Authentication required"}

Tutorial: build & introspect a complete pipeline via API

A nine-step end-to-end walkthrough: build a workflow, run it, pull per-step results, download the export, and sample everything else your account exposes. Every command and response below was captured from a real Free-tier account on https://orkidata.com.

Set your token once for the rest of the tutorial:

export ORKIDATA_TOKEN='etl_…your token…'
export H="Authorization: Bearer $ORKIDATA_TOKEN"
export BASE='https://orkidata.com'

1. List what's in your account

curl -s -H "$H" "$BASE/api/workflows"

Response: fresh account

{ "items": [] }

A populated tree returns {"items":[…]} with one entry per workflow or folder. The response may also carry a confirmEmpty flag persisted from your last write; see step 3 for what it does on the write side.

2. Upload a tiny CSV (data for the workflow)

Uploads use a 2-step presigned-URL pattern (it bypasses the API Gateway 10 MB body limit). First, ask for the URL:

curl -s -H "$H" \
  "$BASE/api/data-storage/upload-url?filename=orders.csv&contentType=text/csv"

Response

{
  "bucket":     "etl-platform-prod-…",
  "expiresIn":  300,
  "key":        "data-storage/<user-id>/1777513758_orders.csv",
  "uploadUrl":  "https://…s3.amazonaws.com/…?AWSAccessKeyId=…&Signature=…&Expires=…"
}

Then PUT the file bytes to that signed URL, and confirm the upload:

curl -X PUT -H "Content-Type: text/csv" \
  --data-binary @orders.csv \
  "<uploadUrl from above>"

curl -s -X POST -H "$H" -H 'Content-Type: application/json' \
  -d '{"name":"orders.csv","s3Key":"data-storage/<user-id>/…orders.csv","format":"csv","mimeType":"text/csv"}' \
  "$BASE/api/data-storage/confirm-upload"

Response

{
  "status": "success",
  "file": {
    "id":          "491ec56f-0e2d-41b4-8f8c-db530fdaeea8",
    "type":        "file",
    "name":        "orders.csv",
    "format":      "csv",
    "mimeType":    "text/csv",
    "rowCount":    10,
    "columnCount": 5,
    "s3Key":       "data-storage/<user-id>/…orders.csv",
    "uploadedAt":  "2026-04-30T01:49:54Z"
  }
}

Hold on to file.id: the workflow's load_data step references it.
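The three calls above can be expressed as request builders. This Python sketch assumes that the key returned by upload-url is the value to send back as s3Key in confirm-upload (the sample values suggest this, but it is an inference), and all function names are hypothetical. Note that the PUT to the presigned URL carries no Authorization header, matching the curl command.

```python
import json
import urllib.request
from urllib.parse import urlencode

BASE_URL = "https://orkidata.com"


def upload_url_request(token, filename, content_type):
    """Step 1: ask the API for a presigned upload URL."""
    query = urlencode({"filename": filename, "contentType": content_type})
    return urllib.request.Request(
        f"{BASE_URL}/api/data-storage/upload-url?{query}",
        headers={"Authorization": f"Bearer {token}"},
    )


def put_request(upload_info, data, content_type):
    """Step 2: PUT the raw bytes to the signed URL (no Bearer token here)."""
    return urllib.request.Request(
        upload_info["uploadUrl"], data=data,
        headers={"Content-Type": content_type}, method="PUT",
    )


def confirm_request(token, upload_info, name, file_format, mime_type):
    """Step 3: confirm the upload so the file appears in Data Storage."""
    body = {"name": name, "s3Key": upload_info["key"],
            "format": file_format, "mimeType": mime_type}
    return urllib.request.Request(
        f"{BASE_URL}/api/data-storage/confirm-upload",
        data=json.dumps(body).encode("utf-8"),
        headers={"Authorization": f"Bearer {token}",
                 "Content-Type": "application/json"},
        method="POST",
    )
```

Each builder returns a urllib.request.Request; pass it to urllib.request.urlopen to send it, and feed the parsed JSON of step 1 into steps 2 and 3 as upload_info.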

3. Add the workflow to your tree

POST /api/workflows replaces the entire tree. Always GET first, append your new entry, then POST the full result. Sending a partial tree wipes anything missing. An empty-tree wipe requires confirmEmpty: true, but partial trees have no such guard.

curl -s -X POST -H "$H" -H 'Content-Type: application/json' \
  -d '{"items":[{"id":"docs-tutorial","name":"docs-tutorial","type":"workflow","parentId":null}]}' \
  "$BASE/api/workflows"

Response

{ "success": true }
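Because the POST replaces the whole tree, the safe pattern is read, merge, write. The sketch below covers only the merge step: it takes the parsed GET /api/workflows response and returns the full payload to POST. The function name is hypothetical, and leaving confirmEmpty out of the payload is an assumption made because the result here is never empty.

```python
import copy


def tree_with_entry(current_tree, new_entry):
    """Return a full-tree payload containing every existing item plus new_entry."""
    items = copy.deepcopy(current_tree.get("items", []))
    for index, item in enumerate(items):
        if item.get("id") == new_entry["id"]:
            # Same id already present: replace it instead of duplicating it.
            items[index] = dict(new_entry)
            break
    else:
        items.append(dict(new_entry))
    return {"items": items}
```

The input tree is not modified, so the original GET response remains available if the write fails and needs to be retried.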

4. Save the step graph

Three steps: load_data reads the file you just uploaded, filter_rows keeps the paid orders, and export_file writes the result to CSV.

curl -s -X POST -H "$H" -H 'Content-Type: application/json' \
  -d '{
    "steps": [
      {
        "id":   "load",
        "type": "load_data",
        "name": "Load orders",
        "config": {
          "sources": [
            { "fileId": "491ec56f-…", "outputVariable": "orders" }
          ]
        }
      },
      {
        "id":   "filter",
        "type": "filter_rows",
        "name": "Paid only",
        "config": {
          "source":         "orders",
          "outputVariable": "paidOrders",
          "conditions": [
            { "field": "status", "operator": "==", "value": "paid" }
          ]
        }
      },
      {
        "id":   "export",
        "type": "export_file",
        "name": "Export paid orders",
        "config": {
          "source":    "paidOrders",
          "delimiter": "comma",
          "filename":  "paid-orders.csv"
        }
      }
    ]
  }' \
  "$BASE/api/workflow/docs-tutorial/definition"

Response

{ "success": true }
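Before saving a definition, it can help to check locally that each step's source names a variable produced by an earlier step, as orders and paidOrders do in the tutorial graph. The following is a client-side sanity check only, not the server's validation. It assumes steps run in array order and that variables come from config.outputVariable or from load_data's sources[].outputVariable; paths starting with steps. are skipped.

```python
def undefined_sources(steps):
    """Return (step_id, source) pairs whose source variable was not produced earlier."""
    defined = set()
    problems = []
    for step in steps:
        config = step.get("config", {})
        source = config.get("source")
        if isinstance(source, str):
            root = source.split(".", 1)[0]
            # Paths under "steps." refer to step outputs and are not checked here.
            if root != "steps" and root not in defined:
                problems.append((step["id"], source))
        # load_data declares one variable per source entry.
        for entry in config.get("sources", []):
            if "outputVariable" in entry:
                defined.add(entry["outputVariable"])
        if "outputVariable" in config:
            defined.add(config["outputVariable"])
    return problems
```

An empty list means every source reference resolves; a typo such as "order" instead of "orders" is reported with the id of the step that uses it.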

5. Inspect the saved definition

The same definition you just stored, now read back:

curl -s -H "$H" "$BASE/api/workflow/docs-tutorial/definition"

Response

{
  "_permission": "owner",
  "steps": [
    { "id": "load",   "type": "load_data",    "name": "Load orders",
      "config": { "sources": [ { "fileId": "491ec56f-…", "outputVariable": "orders" } ] } },
    { "id": "filter", "type": "filter_rows",  "name": "Paid only",
      "config": { "source": "orders", "outputVariable": "paidOrders",
                  "conditions": [ { "field": "status", "operator": "==", "value": "paid" } ] } },
    { "id": "export", "type": "export_file",  "name": "Export paid orders",
      "config": { "source": "paidOrders", "delimiter": "comma", "filename": "paid-orders.csv" } }
  ]
}

6. Run it

Tier 1 (≤10 MB estimated peak) returns the full result inline. Tier 2 or 3 returns {"async": true, "runId": "…"}; poll /api/execution/{runId}/status until the status is terminal (completed, failed, timeout, or cancelled). See step 8 below.

curl -s -X POST -H "$H" -H 'Content-Type: application/json' \
  -d '{"input":{},"source":"api"}' \
  "$BASE/api/workflow/docs-tutorial/execute"

Response (Tier 1, sync)

{
  "workflow_id":      "docs-tutorial",
  "success":          true,
  "started_at":       "2026-04-30T01:50:54Z",
  "completed_at":     "2026-04-30T01:50:54Z",
  "duration_seconds": 0.000109,
  "step_count":       3,
  "steps": [
    {
      "step_id":   "load", "step_type": "load_data", "success": true,
      "message":   "Loaded 10 row(s) from orders.csv into 'orders'",
      "output": {
        "totalRows":   10,
        "sources": [
          { "fileName": "orders.csv", "rowCount": 10, "columnCount": 5,
            "columns":   ["order_id","customer_email","status","amount_usd","created_at"],
            "preview":   [/* first 3 rows */] }
        ]
      }
    },
    {
      "step_id":   "filter", "step_type": "filter_rows", "success": true,
      "message":   "Filtered 10 → 6 rows (4 removed)",
      "output": {
        "originalCount": 10, "filteredCount": 6, "removedCount": 4,
        "logic":         "AND",
        "conditions":    ["status = paid"],
        "preview":       [/* first 3 matching rows */]
      }
    },
    {
      "step_id":   "export", "step_type": "export_file", "success": true,
      "message":   "Export ready: 6 rows, 5 columns (comma)",
      "output": {
        "filename":    "paid-orders.csv",
        "delimiter":   "comma",
        "rowCount":    6,
        "columnCount": 5,
        "exportReady": true
      }
    }
  ]
}

7. Inspect run history (per-step results)

List recent runs (use ?slim=true for a lightweight payload with exports and HTML stripped):

curl -s -H "$H" "$BASE/api/workflow/docs-tutorial/history?slim=true"

Response (slim)

{
  "runs": [
    {
      "id":             "run_1777513854059",
      "runId":          "run_1777513854059",
      "status":         "success",
      "apiTriggered":   true,
      "startedAt":      "2026-04-30T01:50:54Z",
      "completedAt":    "2026-04-30T01:50:54Z",
      "duration":       0.000109,
      "stepCount":      3,
      "hasExports":     true,
      "exportSummary": [
        { "stepId": "export", "name": "Export paid orders", "rows": 6, "delimiter": "comma" }
      ],
      "dataMetrics": {
        "totalInputRows":  10,
        "totalOutputRows": 6,
        "emptyResult":     false
      }
    }
  ]
}

Pull the full per-step detail for a specific run:

curl -s -H "$H" "$BASE/api/workflow/docs-tutorial/history/run/run_1777513854059"

Returns the same top-level fields as the slim list plus result.steps[]: each step's full output payload, duration, errors, and the stepsSnapshot (the exact config the run executed against).

8. Async variant (for larger data)

When a run's estimated memory peak exceeds 10 MB, execute returns immediately with an async-dispatch response:

{ "async": true, "runId": "async_id_17772_a0c5ffd8", "tier": 2 }

Poll until terminal:

RUN_ID='async_id_17772_a0c5ffd8'
while true; do
  STATUS=$(curl -s -H "$H" "$BASE/api/execution/$RUN_ID/status" | jq -r .status)
  case "$STATUS" in
    completed|failed|timeout|cancelled) break ;;
  esac
  sleep 3
done
echo "Final: $STATUS"

Once completed, fetch the full result via the same /history/run/{runId} endpoint as step 7.
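The same polling logic in Python, as a sketch. The status lookup is injected as a callable so the loop itself stays independent of any HTTP client; the function names, the poll cap, and the injectable sleep are illustrative choices, while the terminal statuses are the four from the shell loop above.

```python
import time

TERMINAL_STATUSES = {"completed", "failed", "timeout", "cancelled"}


def async_run_id(execute_response):
    """Return the runId if execute dispatched asynchronously, else None."""
    if execute_response.get("async") is True:
        return execute_response["runId"]
    return None


def wait_for_run(fetch_status, run_id, interval=3.0, max_polls=200, sleep=time.sleep):
    """Call fetch_status(run_id) until it returns a terminal status, then return it."""
    for _ in range(max_polls):
        status = fetch_status(run_id)
        if status in TERMINAL_STATUSES:
            return status
        sleep(interval)
    raise TimeoutError(f"run {run_id} not terminal after {max_polls} polls")
```

In real use, fetch_status would GET /api/execution/{runId}/status and return the status field of the JSON body. Capping the number of polls keeps a stuck run from blocking the caller forever, which the shell loop does not guard against.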

9. Download the export

curl -X POST -H "$H" \
  -o paid-orders.csv \
  "$BASE/api/workflow/docs-tutorial/history/run_1777513854059/export/export"

Tier-1 (sync) runs return the CSV bytes directly with Content-Type: text/csv. Tier-2/3 (async) runs instead return {"downloadUrl": "https://s3…","filename": "…"}; fetch the file from downloadUrl with a second request (for example with curl -o). The presigned URL is valid for 5 minutes.

paid-orders.csv

order_id,customer_email,status,amount_usd,created_at
ord-1001,alice@example.com,paid,49.50,2026-04-21
ord-1003,carol@example.com,paid,128.75,2026-04-22
ord-1005,eve@example.com,paid,21.40,2026-04-24
ord-1006,frank@example.com,paid,87.25,2026-04-25
ord-1008,heidi@example.com,paid,9.99,2026-04-27
ord-1009,ivan@example.com,paid,210.00,2026-04-28
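Since the export endpoint answers in two shapes, a client has to branch on the response. The sketch below tells them apart by Content-Type; it assumes the async variant is served as JSON with a content type other than text/csv, which is not stated explicitly, and interpret_export is a hypothetical name.

```python
import json


def interpret_export(content_type, body):
    """Return ("file", bytes) for an inline CSV or ("url", str) for a presigned link."""
    media_type = content_type.split(";", 1)[0].strip().lower()
    if media_type == "text/csv":
        # Sync run: the body already is the exported file.
        return ("file", body)
    # Async run: the body is JSON pointing at a short-lived download URL.
    payload = json.loads(body)
    return ("url", payload["downloadUrl"])
```

In the "url" case the caller still has to download the file, and should do so promptly given the 5-minute validity of the link.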

You can see your entire account from `curl`

The endpoints used above are a subset. The same Bearer token also reads:

What	Endpoint
Current usage & limits	`GET /api/usage`
Day-by-day usage history	`GET /api/usage/history`
Subscription state	`GET /api/billing/subscription`
Plan catalog	`GET /api/billing/plans`
Data Storage tree	`GET /api/data-storage`
Connections list	`GET /api/connections`
Dashboards	`GET /api/dashboards`
Verified email recipients	`GET /api/email/recipients`
Per-workflow usage breakdown	`GET /api/workflow/{id}/usage`

Every one of those is a plain curl -H "$H" away. See the API reference below for the full list with request/response schemas.

Step types reference

Each card shows the step's purpose, config schema, and a minimal create example: the JSON you'd embed in steps[] when saving a workflow definition (see tutorial step 4). To see the runtime output of any step, run the workflow and read result.steps[] from GET /history/run/{runId}.

Free-tier accounts cannot execute webhook or send_email; those step types require Entry tier or higher (the Gmail mode of send_email requires Enterprise).

load_data: read uploaded files, DB query results, or SFTP files

One step can load multiple sources in parallel. Each source becomes a context variable downstream. Pick exactly one source mode per entry: Data Storage file, DB connection + query, or SFTP connection + path.

Config

{
  "sources": [
    // Data Storage file
    { "fileId":         "<data-storage file id>",
      "outputVariable": "myData",
      "maxRows":        10000  /* optional cap */ },

    // DB connector (requires a connection of type postgres/mysql/etc.)
    { "connectionId":   "<connection id>",
      "query":          "SELECT … FROM …",
      "outputVariable": "myData",
      "maxRows":        10000 },

    // SFTP file (requires a connection of type sftp)
    { "connectionId":   "<sftp connection id>",
      "remotePath":     "/exports/orders.csv",
      "outputVariable": "myData",
      "matchMode":      "specific"  /* or "pattern" with folderPath + pattern + aggregateMode */ }
  ]
}

Example

{ "id": "load", "type": "load_data", "name": "Load orders",
  "config": { "sources": [
    { "fileId": "491ec56f-…", "outputVariable": "orders" }
  ] } }

Output: { totalRows, sourceCount, sources: [{ outputVariable, fileName, fileFormat, rowCount, columnCount, columns, preview, truncated }] }

print_data: dump variables to execution history

Diagnostic step. Surfaces the resolved value of each listed variable path in the run's output. Doesn't transform data.

Config

{
  "variables": ["orders", "steps.filter.output.outputRowCount"]
}

join_data: SQL-style join across two variables

Config

{
  "leftSource":     "orders",
  "rightSource":    "customers",
  "leftKey":        "customer_email",
  "rightKey":       "email",
  "joinType":       "left",            // inner | left | right | outer  (default: left)
  "rightPrefix":    "customer",        // optional, prefix for right-side columns to avoid collisions
  "flatten":        false,             // optional, flatten nested right-side object into top-level keys
  "outputVariable": "joined"
}

select_columns: keep / drop / rename columns

Each columns[] entry can be either a plain string (use the original name as-is) or a {original, rename} object to alias on the way through.

Config

{
  "source":         "orders",
  "outputVariable": "trimmed",
  "mode":           "keep",                                  // keep | drop
  "columns": [
    "order_id",                                              // plain, keep / drop as-is
    { "original": "amount_usd", "rename": "amount" }         // alias while keeping
  ]
}

filter_rows: keep rows that match conditions

Config

{
  "source":         "orders",
  "outputVariable": "filtered",
  "logic":          "and",            // and | or  (default: and)
  "conditions": [
    { "field": "status",     "operator": "==",  "value": "paid" },
    { "field": "amount_usd", "operator": ">",   "value": 50 }
  ]
}

Operators: ==, !=, >, <, >=, <=, contains, not_contains, is_null, is_not_null. Empty strings count as null for the null checks.

split_column: explode one column into many by delimiter

Config

{
  "source":         "rows",
  "outputVariable": "rowsSplit",
  "column":         "full_name",
  "delimiter":      " ",                       // any string; default is "|"
  "newColumns":     ["first_name", "last_name"],   // required unless dynamicSplit=true
  "dynamicSplit":   false,                     // if true, auto-detect max parts → column_1, column_2, …
  "removeOriginal": true                       // drop the source column from output rows (default: true)
}

return_data: pick which variables to surface in the run output

Map one or more variables into the named keys you want returned. Useful when you want a tidy result object instead of dumping the entire context.

Config

{
  "outputs": [
    { "name": "orders",  "source": "paidOrders" },
    { "name": "summary", "source": "steps.filter.output" }
  ],
  "dataOnly": false  // optional, when true, the run response is just the data dict
}

export_file: mark a variable for download as CSV/TSV/text

The step itself doesn't write a file; it records metadata (delimiter, columns, row count). The actual file bytes are generated on demand by the download endpoint.

Config

{
  "source":    "paidOrders",
  "delimiter": "comma",                          // comma | tab | semicolon | pipe
  "filename":  "paid-orders.csv"                 // free-form
}

File extension is auto-derived from the delimiter: comma → .csv, tab → .tsv, anything else → .txt. Download via POST /api/workflow/{id}/history/{runId}/export/{stepId}; see tutorial step 9.
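The delimiter-to-extension rule can be written as a lookup. A tiny sketch with a hypothetical function name; how the derived extension interacts with an extension already present in filename is not stated here, so the sketch covers only the mapping itself.

```python
def export_extension(delimiter):
    """Map an export_file delimiter name to the derived file extension."""
    return {"comma": ".csv", "tab": ".tsv"}.get(delimiter, ".txt")
```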

move_file: relocate a Data Storage or SFTP file

Two systems, two match modes. system: data_storage moves entries inside your Data Storage tree (recorded as a deferred intent that applies after the workflow succeeds). system: sftp moves remote files immediately during the step. Path templates support {{variable}} placeholders that resolve from context.

Config: Data Storage, specific file

{
  "system":     "data_storage",     // default
  "matchMode":  "specific",         // default
  "sourcePath": "inbox/orders.csv",
  "destPath":   "processed/{{date}}"
}

Config: Data Storage, glob pattern

{
  "system":           "data_storage",
  "matchMode":        "pattern",
  "sourceFolderPath": "inbox",
  "sourcePattern":    "orders-*.csv",
  "destPath":         "processed/{{date}}"
}

Config: SFTP

{
  "system":       "sftp",
  "connectionId": "<sftp connection id>",
  "sourcePath":   "/inbox/orders.csv",
  "destPath":     "/processed/orders-{{date}}.csv"
}

webhook: POST a variable to your URL (Entry+)

For the detailed shape, retry rules, and SSRF guards, see Outbound webhook step below.

stop: terminate execution early

Config

{ "reason": "All upstream rows already processed" }

Marks the run successful and skips remaining steps. Pairs naturally with conditional.

conditional: branch on one or more conditions

Config

{
  "conditions": [
    { "variable": "steps.load.output.totalRows", "operator": "==",        "value": 0 },
    { "variable": "orders",                       "operator": "has_rows",  "value": null }
  ],
  "logic":      "and",   // and | or  (default: and)
  "then_steps": [
    /* step definitions to execute if conditions pass */
  ],
  "else_steps": [
    /* step definitions to execute otherwise (optional) */
  ]
}

Operators: ==, !=, >, <, >=, <=, contains, not_contains, has_rows, is_empty, exists, not_exists.

send_email: send a workflow email (Entry+)

Config

{
  "recipients":    ["ops@acme.com"],     // must be in your verified-recipients list (status=verified)
  "subject":       "Daily report, ${steps.summary.output.row_count} rows",
  "body_mode":     "inline_data",         // plain | inline_data
  "body_text":     "…${path}…",            // when body_mode=plain
  "body_variable": "steps.summary.output.rows",  // when body_mode=inline_data
  "attachment": {
    "kind":          "variable",          // none | variable | step
    "source":        "steps.transform.output.results",
    "format":        "csv",                // csv | tsv | psv | ssv | json  (variable kind only)
    "filename":      "report-${run_date}.csv",
    "delivery":      "auto",               // auto | attach | link  (auto flips to link at >5MB)
    "requires_auth": false                 // link/auto kinds only
  }
}

Recipients are managed via /api/email/recipients with a verification flow. Daily caps apply per tier; see the rate limits table. The sender's account email must be confirmed.

Send via Gmail (Enterprise)

{
  "via":         "gmail",               // default "orkidata" = the classic flow above
  "recipients":  ["ana@client.com"],    // ANY addresses — up to 100 per run
  "senderLabel": "Acme Reports",        // optional From display name
  "attestation": { "consented": true }  // REQUIRED — see below
}

Sends as you, from your connected Gmail — no per-recipient verification. Google's Gmail policy allows this only for recipients who consented to receive your emails (the required attestation); cold or unsolicited outreach is prohibited. Every email carries a one-click unsubscribe that stops all your Gmail-mode sends (sender-scoped) and is always honored. Attachments: link delivery only, public links. Google caps total sending (~500/day consumer, ~2,000/day Workspace); a scheduled workflow re-sends to the full list on every run.

write_to_db: append or upsert rows to a SQL connection

v1 supports PostgreSQL via the connector framework. Single-transaction semantics with batched commits.

Config

{
  "source":       "rows",
  "connectionId": "<db-connection id>",
  "targetTable":  "fact_orders",
  "targetSchema": "analytics",       // optional
  "mode":         "append",          // append | upsert
  "keyColumns":   ["order_id"],       // required when mode=upsert
  "batchSize":    5000,               // optional (default 5000)
  "maxRows":      5000000             // optional safety cap (default 5M)
}

Column names are taken from the source rows verbatim; there's no per-column source→target mapping. Rename columns upstream with select_columns if you need to.

write_to_sftp: upload a serialized variable to SFTP

Atomic upload: payload is written to <remotePath>.tmp.<run_id>, then renamed to the final path on success. Consumers polling remotePath never see a partial file.

Config

{
  "source":       "rows",
  "connectionId": "<sftp-connection id>",
  "remotePath":   "/exports/orders-${date}.csv",
  "outputFormat": "csv"             // csv | tsv | psv | ssv | json
}
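The write-then-rename pattern described above is the same one used for atomic file replacement on a local filesystem. The sketch below shows that local analogue with os.replace; it is not the platform's SFTP code, and the run id passed in demo is a made-up placeholder.

```python
import os
import tempfile


def atomic_write(final_path, payload, run_id):
    """Write payload to <final_path>.tmp.<run_id>, then rename it into place."""
    tmp_path = f"{final_path}.tmp.{run_id}"
    try:
        with open(tmp_path, "wb") as handle:
            handle.write(payload)
            handle.flush()
            os.fsync(handle.fileno())
        # The rename is the only moment the final path changes.
        os.replace(tmp_path, final_path)
    except BaseException:
        # Leave no partial temp file behind on failure.
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise
    return final_path


def demo():
    """Write one file into a scratch directory and list what remains."""
    with tempfile.TemporaryDirectory() as folder:
        target = os.path.join(folder, "orders.csv")
        atomic_write(target, b"order_id\nord-1001\n", "run_1")
        return sorted(os.listdir(folder))
```

A reader that only ever opens the final path sees either the previous file or the complete new one, never a half-written payload.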

write_to_drive: deliver a file to a Google Drive folder (Entry+)

Creates a new file in a Google Drive folder you pre-selected (stays within the non-sensitive drive.file scope). Pick the destination folder once under Data Storage → Google Drive destinations (the picker grant is what allows writing into it); the step references it by destinationId. A new file is created each run — nothing is overwritten.

Config

{
  "source":        "rows",
  "destinationId": "<drive destination id>",
  "filename":      "orders-{date}.csv",   // {run_id} {user_id} {date} {datetime} {timestamp}
  "outputFormat":  "csv"                  // csv | tsv | psv | ssv | json
}

reminders: scheduled reminder emails to your contacts (Enterprise)

Sends scheduled reminder emails as you (through your connected Gmail account via gmail.send) to contacts who are "due now." Recipients come from a built-in registry (managed through the Reminders endpoints or a public intake link/QR) or, in external mode, from the row output of an upstream load_data step (source.variable). The step owns its own daily cron, always runs async, and is protected by a consent ledger with a forced unsubscribe on every send. Requires Enterprise tier and a connected Gmail account.

Config

{
  "type":             "reminders",
  "subjectTemplate":  "Reminder: your appointment",
  "bodyTemplate":     "Hi ${name} — ${days_until_text} until your appointment.",
  "milestoneOffsets": [-6, -2, 0],   // signed days vs the date (− before, 0 = on the day, + after)
                                     //   ...or "repeatEveryDays": 7 for a recurring cadence
  "sendHour":         9,             // hour (workflow timezone) the daily check runs
  "scheduleActive":   true,
  "testRecipients":   ["you@example.com"], // these send for real while the global live-send gate is off
  "bccAddresses":     ["archive@yourco.com"], // verified recipients only; BCC, never CC
  "perRunBudget":     400,           // max sends per run; 0 pauses sending
  "resultVariable":   "remindersResult",   // optional run-summary output variable
  "source": {                        // external mode (optional) — drive off an upstream step's rows
    "variable":    "loadedContacts",
    "mapping":     { "email": "Email", "anchor_date": "Due", "dateFormat": "iso" },
    "attestation": { "consented": true }
  }
}

Template tokens: ${days_until} / ${days_until_text} (e.g. "today", "in 3 days"), ${reminder_reason}, and any recipient field as ${field} (HTML-escaped). The registry CRUD, management view, intake link/QR, and consent ledger live under the Reminders tag in the API reference.
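To see what milestoneOffsets means for a given contact, the send dates can be computed by adding each signed offset to the contact's anchor date. This is a sketch of that arithmetic only; the function names are hypothetical, and time zones and the sendHour check are ignored.

```python
from datetime import date, timedelta


def milestone_dates(anchor, offsets):
    """Dates on which a reminder falls: negative offsets before the anchor, 0 on the day."""
    return [anchor + timedelta(days=offset) for offset in offsets]


def is_due(today, anchor, offsets):
    """True if today matches one of the milestone dates for this anchor."""
    return today in milestone_dates(anchor, offsets)
```

With offsets [-6, -2, 0] and an anchor date of 2026-05-10, the milestones fall on 2026-05-04, 2026-05-08, and 2026-05-10.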

Outbound `webhook` step

The webhook step takes one of your variables, serializes it to JSON, and POSTs the bytes to a URL you control. The body is the variable's value as-is; there is no payload templating layer. If you want a custom shape, build it upstream and feed that variable in.

Config

{
  "id":   "notify",
  "type": "webhook",
  "name": "Notify ops API",
  "config": {
    "source":          "summary",                              // path to the variable to POST
    "url":             "https://ops.acme.com/orkidata-events",  // https only, http:// is rejected
    "method":          "POST",                                  // POST | PUT
    "headers":         { "X-Source": "orkidata", "Authorization": "Bearer …" },
    "outputVariable":  "ops_response",                          // optional, captures {statusCode, responseSnippet}
    "maxPayloadRows":  10000                                    // optional, caps row-count when source is a list (default 10,000)
  }
}

Free tier is gated out: webhook destinations require Entry tier or higher (egress reputation hygiene).

Authentication on your receiver

Put any auth your receiver needs (Bearer tokens, basic auth, custom keys) into the headers object. The platform doesn't add its own signature header; what you send is what your receiver sees. Headers matching authorization, api-key, token, secret are redacted from execution history so credentials don't leak through the run UI.
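To predict which of your headers would show up masked in execution history, the rule can be approximated locally. The sketch assumes "matching" means a case-insensitive substring match on the header name, which is not spelled out above; the function name and the placeholder text are also hypothetical.

```python
SENSITIVE_MARKERS = ("authorization", "api-key", "token", "secret")


def redact_headers(headers, placeholder="[redacted]"):
    """Return a copy of headers with values masked where the name looks sensitive."""
    return {
        name: placeholder
        if any(marker in name.lower() for marker in SENSITIVE_MARKERS)
        else value
        for name, value in headers.items()
    }
```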

SSRF guard

URLs resolving to private CIDR ranges (10/8, 172.16/12, 192.168/16, 127/8, 169.254/16, IPv6 loopback / link-local / ULA) are rejected before any request goes out. http:// URLs are also rejected; the step is HTTPS-only.
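You can run a comparable check on your own receiver URL before saving a workflow. The sketch below uses Python's ipaddress module with the ranges listed above (the IPv6 ones written as ::1/128, fe80::/10, and fc00::/7). It is a local approximation, not the platform's guard: DNS resolution is left to the caller, and the function names are hypothetical.

```python
import ipaddress
from urllib.parse import urlsplit

BLOCKED_NETWORKS = [
    ipaddress.ip_network(cidr)
    for cidr in (
        "10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16",
        "127.0.0.0/8", "169.254.0.0/16",
        "::1/128",    # IPv6 loopback
        "fe80::/10",  # IPv6 link-local
        "fc00::/7",   # IPv6 unique local addresses (ULA)
    )
]


def is_blocked_address(address):
    """True if the IP address string falls inside one of the listed ranges."""
    ip = ipaddress.ip_address(address)
    return any(ip.version == net.version and ip in net for net in BLOCKED_NETWORKS)


def precheck(url, resolved_addresses):
    """Return None if the URL passes, else a short reason string."""
    if urlsplit(url).scheme.lower() != "https":
        return "scheme is not https"
    for address in resolved_addresses:
        if is_blocked_address(address):
            return f"{address} is in a private range"
    return None
```

The resolved_addresses argument would typically come from socket.getaddrinfo on the URL's hostname; every address a name resolves to is checked, not just the first.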

Retry behavior

Tier 1 (sync, ≤10 MB): 1 attempt, no retries (the API Gateway 29 s budget can't accommodate them).
Tier 2/3 (async worker): up to 3 attempts with 1 s / 2 s / 4 s exponential backoff on connection errors and 5xx responses.
4xx responses: terminal failure, no retry. The step fails immediately.

The per-attempt timeout is 30 seconds. Final-attempt failure fails the workflow step; the response status code and body snippet land in execution history (and in outputVariable if you set one).
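The retry rules above can be sketched as a small decision function. This illustrates the stated rules and is not the platform's code: should_retry and backoff_seconds are hypothetical names, and how the three listed delays map onto a three-attempt cap is an assumption (here the delay after failed attempt n is 2^(n-1) seconds).

```python
def should_retry(status, attempt, is_async, max_attempts=3):
    """Decide whether another attempt follows a failed one.

    status is the HTTP status code, or None for a connection error;
    attempt is the 1-based number of the attempt that just failed.
    """
    if not is_async:
        # Tier 1 (sync) runs get a single attempt.
        return False
    if attempt >= max_attempts:
        return False
    # Only connection errors and 5xx responses are retried; 4xx is terminal.
    return status is None or 500 <= status <= 599


def backoff_seconds(attempt):
    """Delay after the given failed attempt: 1 s, 2 s, 4 s, ..."""
    return 2 ** (attempt - 1)
```

A receiver that wants to stop retries early can therefore answer with a 4xx status, while a 5xx or a dropped connection invites another attempt on async runs.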

API reference (interactive)

Every public endpoint above (plus the rest of the surface) lives in a dedicated full-width interactive reference: try requests inline, see exact request/response schemas, and copy generated curl commands.

Open API reference Download spec (.yaml)
