Audit log

Org-scoped, read-only view over every privileged action, every SCIM call, and every webhook delivery in your org. Three event sources, three tabs, one CSV export, one filter convention. Cursor-paginated, default 30-day window, RFC 4180 export.

Why audit logs #

SOC 2 / GDPR / ISO 27001 evidence — auditors ask "show me every privileged change in the last 90 days" and the CSV export is the answer. Post-incident forensics — "who removed Alice on Tuesday and when?". Provisioning sanity — "is the IdP actually pushing user updates, or did it stop without telling us?".

What gets logged #

Three append-only tables, three different writers, three tabs on the viewer.

| Event source | Trigger | Where it shows up |
| --- | --- | --- |
| admin_actions | every privileged write: sso_providers / mcp_sources / cli_sources / sso_jit toggle / scim_token mint / webhook_endpoint mutation / etc. | /settings/audit?tab=admin |
| scim_event_log | every SCIM IdP call: Users + Groups CRUD, list, get, bearer-auth check (every row, including 4xx auth failures) | /settings/audit?tab=scim |
| webhook_deliveries | every webhook delivery attempt by the worker (initial POST + every retry) | /settings/audit?tab=webhooks |

Three tables, three tabs, by intent. Not every audit-shaped event lives in admin_actions: SCIM IdP traffic doesn't fit the admin_user_id-centred shape (no human actor, the IdP itself is the caller), and the webhook dispatch surface is its own log altogether (per-endpoint retry state, response codes, no human actor either). The viewer keeps them on separate tabs because the schemas and the natural filter sets don't overlap — mixing them in one render lands somewhere between "messy" and "useless".

Where to look #

Three surfaces, in decreasing scope:

  • /settings/audit — org admin viewer. Any-member access (audit transparency to all members is a deliberate product choice — mutations remain elevated-role gated at their source). Three tabs: admin, scim, webhooks.
  • /settings/webhooks/<id>/deliveries — per-endpoint delivery detail. Same data as the webhooks tab on /settings/audit, but scoped to one endpoint and with the truncated response body visible. See webhooks.md.
  • /admin/audit — platform admin viewer. Cross-org, only used by stech-internal staff. Org admins can't reach it — for support cases you can ask us to look up rows you don't have visibility into.

API reference #

All endpoints share an auth gate — bearer token (CLI / scripts) or session cookie (dashboard) — and resolve the caller's org from :slug. Cross-org isolation: every query binds organization_id = <resolved org>. An API key for org A can't query org B's audit even with raw cursor manipulation.

Errors: { "error": "<code>", "detail"?: "<message>" }. 401 unauthorized on missing / invalid auth, 403 forbidden when the caller is not a member of the resolved org, 400 invalid_tab / 400 csv_export_too_large from the CSV route.

GET /v1/orgs/:slug/audit #

admin_actions list — every privileged write in the org. Default window: last 30 days when neither from nor to is supplied.

| Query param | Type | Default | Meaning |
| --- | --- | --- | --- |
| limit | int | 50 | Page size, clamped to [1, 200]. Malformed → 50. |
| cursor | base64 | | Opaque {ts, id} cursor; absent → first page. |
| from | ISO 8601 | to - 30d | Lower bound on created_at (inclusive). |
| to | ISO 8601 | now | Upper bound on created_at (exclusive). |
| action | csv | | Comma-separated action filter; trailing * is prefix wildcard (e.g. webhook_endpoints.*). |
| actorUserId | string | | Filter to a specific actor's user id. |
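The window defaults can be sketched as follows. This is a minimal illustration, not the server's code: `resolve_window`, `in_window`, and `_parse_bound` are hypothetical names, the fallback for a malformed bound follows the rule stated under Filter conventions, and treating a timezone-less timestamp as UTC is an assumption of this sketch.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional, Tuple

DEFAULT_WINDOW = timedelta(days=30)


def _parse_bound(raw: Optional[str]) -> Optional[datetime]:
    """Parse an ISO 8601 bound; return None when absent or malformed."""
    if not raw:
        return None
    try:
        parsed = datetime.fromisoformat(raw.replace("Z", "+00:00"))
    except ValueError:
        return None
    # Assumption: naive timestamps are interpreted as UTC.
    return parsed if parsed.tzinfo else parsed.replace(tzinfo=timezone.utc)


def resolve_window(
    raw_from: Optional[str], raw_to: Optional[str], now: datetime
) -> Tuple[datetime, datetime]:
    """Apply the documented defaults: to = now, from = to - 30d."""
    to = _parse_bound(raw_to) or now
    frm = _parse_bound(raw_from) or (to - DEFAULT_WINDOW)
    return frm, to


def in_window(created_at: datetime, frm: datetime, to: datetime) -> bool:
    """from is inclusive, to is exclusive."""
    return frm <= created_at < to
```

Because the upper bound is exclusive, two adjacent windows that share a boundary timestamp never return the same row twice.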

Response:

{
  "events": [
    {
      "id": "ev_3z9...",
      "action": "webhook_endpoints.secret_rotated",
      "actor": {
        "userId": "usr_9q1...",
        "name": "Alice Chen",
        "email": "[email protected]"
      },
      "createdAt": "2026-05-08T14:22:08.554Z"
    }
    // ...
  ],
  "nextCursor": "eyJ0cyI6Ij...",         // null on the last page
  "aggregations": {
    "totalEvents": 4218,
    "uniqueActors": 12,
    "topAction": { "action": "webhook_endpoints.secret_rotated", "count": 41 }
  },
  "window": { "from": "2026-04-08T...", "to": "2026-05-08T..." }
}

actor is null only if the underlying user row was deleted (FK is restrict today so this is essentially impossible at the DB layer; the null-tolerant render preserves forward-compat with a future set null relaxation). Aggregations are computed against the same WHERE as the row fetch minus the cursor — totals don't shrink as you page deeper.

curl -fsSL "https://api.stech.com/v1/orgs/$ORG/audit?action=webhook_endpoints.*&limit=100" \
  -H "Authorization: Bearer $STECH_API_KEY"
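A client that needs every row follows nextCursor until it comes back null. The sketch below shows that loop with the HTTP call injected, so it runs without a network; `iter_audit_events` and `fetch_page` are hypothetical names, and only the `events` / `nextCursor` fields come from the response shape above.

```python
from typing import Callable, Iterator, Optional


def iter_audit_events(
    fetch_page: Callable[[Optional[str]], dict],
) -> Iterator[dict]:
    """Yield every event by following nextCursor until it is null.

    fetch_page is a hypothetical callable: given a cursor (None for the
    first page) it returns the decoded JSON body of one list response.
    """
    cursor: Optional[str] = None
    while True:
        page = fetch_page(cursor)
        yield from page["events"]
        cursor = page.get("nextCursor")
        if cursor is None:  # null on the last page
            return


# Stubbed two-page response standing in for the real endpoint.
_pages = {
    None: {"events": [{"id": "ev_1"}, {"id": "ev_2"}], "nextCursor": "c1"},
    "c1": {"events": [{"id": "ev_3"}], "nextCursor": None},
}
collected = [e["id"] for e in iter_audit_events(lambda c: _pages[c])]
```

In a real script `fetch_page` would issue the GET with the same filters on every call and pass the cursor through unchanged; the cursor is opaque and should not be decoded or edited.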

GET /v1/orgs/:slug/audit/scim #

scim_event_log list — every SCIM IdP call. Same cursor / window / pagination conventions.

| Query param | Type | Default | Meaning |
| --- | --- | --- | --- |
| limit | int | 50 | Page size, clamped to [1, 200]. |
| cursor | base64 | | {ts, id} cursor. |
| from | ISO 8601 | to - 30d | Inclusive. |
| to | ISO 8601 | now | Exclusive. |
| resourceType | csv | | Subset of User, Group, ServiceProviderConfig, Schemas, ResourceTypes. Unknown tokens dropped. |
| operation | csv | | Subset of create, update, replace, patch, delete, list, get. |
| status | csv | | Mix of class buckets (2xx / 4xx / 5xx) and exact codes (404, 200, ...). |
| scimTokenLabel | string | | Exact match against the SCIM token's label. Deleted-token rows excluded when this is set. |

Response:

{
  "events": [
    {
      "id": "se_7m1...",
      "resourceType": "User",
      "operation": "patch",
      "resourceId": "usr_9q1...",       // null for list/metadata ops
      "statusCode": 200,
      "tokenLabel": "Okta production",  // null when token row was deleted
      "durationMs": 47,                  // null tolerated for forward-compat
      "createdAt": "2026-05-08T14:15:09.001Z"
    }
    // ...
  ],
  "nextCursor": null,
  "aggregations": {
    "total": 1842,
    "byResourceType": { "User": 1612, "Group": 230 },
    "byOperation": { "patch": 988, "create": 412, "list": 348, "get": 94 },
    "byStatusClass": { "2xx": 1799, "4xx": 41, "5xx": 2 },
    "errorRate": 0.023                   // (4xx + 5xx) / total; 0 when total === 0
  },
  "window": { "from": "...", "to": "..." }
}

request_body and response_body are never in the response — see Privacy posture below. tokenLabel is null when the token row was deleted out from under the audit row (the FK is set null so the row survives revocation).

curl -fsSL "https://api.stech.com/v1/orgs/$ORG/audit/scim?resourceType=User&status=4xx,5xx" \
  -H "Authorization: Bearer $STECH_API_KEY"
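The status filter and the errorRate aggregation can be illustrated with a short sketch. It is an approximation under stated assumptions, not the server's parser: the function names are hypothetical, and lowercasing tokens before comparison is an assumption.

```python
import re
from typing import Set, Tuple

CLASS_BUCKETS = {"2xx": 2, "4xx": 4, "5xx": 5}
EXACT_CODE = re.compile(r"[0-9]{3}")


def parse_status_filter(raw: str) -> Tuple[Set[int], Set[int]]:
    """Split a csv status filter into (class leading digits, exact codes)."""
    classes: Set[int] = set()
    codes: Set[int] = set()
    for token in raw.split(","):
        token = token.strip().lower()  # assumption: case-insensitive tokens
        if token in CLASS_BUCKETS:
            classes.add(CLASS_BUCKETS[token])
        elif EXACT_CODE.fullmatch(token):
            codes.add(int(token))
        # anything else is dropped
    return classes, codes


def status_matches(status_code: int, classes: Set[int], codes: Set[int]) -> bool:
    """A row matches when it falls in a class bucket or equals an exact code."""
    return status_code // 100 in classes or status_code in codes


def error_rate(by_status_class: dict, total: int) -> float:
    """(4xx + 5xx) / total, and 0 when total is 0."""
    if total == 0:
        return 0.0
    errors = by_status_class.get("4xx", 0) + by_status_class.get("5xx", 0)
    return errors / total
```

With the sample aggregations above, `error_rate({"2xx": 1799, "4xx": 41, "5xx": 2}, 1842)` rounds to the 0.023 shown in the response.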

GET /v1/orgs/:slug/audit/webhook-deliveries #

webhook_deliveries list — every dispatch attempt by the worker, across all endpoints in the org. The per-endpoint view at /v1/orgs/:slug/webhook-endpoints/:id/deliveries (webhooks.md) stays — that's the per-endpoint roll-up; this is the org-wide cross-endpoint view.

| Query param | Type | Default | Meaning |
| --- | --- | --- | --- |
| limit | int | 50 | Page size, clamped to [1, 200]. |
| cursor | base64 | | {ts, id} cursor. |
| from | ISO 8601 | to - 30d | Inclusive. |
| to | ISO 8601 | now | Exclusive. |
| status | csv | | Subset of pending, delivered, failed, gave_up. |
| eventType | csv | | Comma-separated event types; trailing * is prefix wildcard (e.g. deployment.*). |
| responseClass | csv | | Subset of 2xx, 4xx, 5xx, none (see below). |
| endpointId | string | | Exact match against webhook_deliveries.endpoint_id. |

Response:

{
  "events": [
    {
      "id": "wd_4kp...",
      "endpointId": "whe_3z9...",
      "endpointUrl": "https://hooks.acme.example/stech",   // null if endpoint row vanished
      "endpointDescription": "prod alerting",
      "eventType": "deployment.failed",
      "eventId": "0d8c5e44-3e2a-4f8d-9d2e-9b6f8a1f1e25",   // server-minted, stable across retries
      "status": "delivered",
      "attemptCount": 1,
      "lastResponseStatus": 200,                            // null on pending / network error
      "responseClass": "2xx",                               // 2xx | 4xx | 5xx | none
      "deliveredAt": "2026-05-08T14:02:11.482Z",            // null on non-delivered
      "createdAt": "2026-05-08T14:02:10.998Z"
    }
    // ...
  ],
  "nextCursor": null,
  "aggregations": {
    "total": 4218,
    "byStatus":   { "delivered": 4170, "pending": 12, "failed": 28, "gave_up": 8 },
    "byEventType": { "deployment.failed": 41, "agent_run.completed": 4112, "...": 65 },
    "byResponseClass": { "2xx": 4170, "4xx": 36, "5xx": 0, "none": 12 },
    "deliveredRate": 0.9886,                                // delivered / total
    "uniqueEndpoints": 7
  },
  "window": { "from": "...", "to": "..." }
}

responseClass is derived: delivered → 2xx (always, even on a re-fired delivery whose stale last_response_status would otherwise disagree); 400-499 → 4xx; 500-599 → 5xx; everything else (network error / DNS / TLS / timeout / pending / null) → none. The server and client share the same derivation so there's no drift between the JSON and the UI badge.
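The derivation rules read naturally as a small function. The sketch below applies them in the order stated; `derive_response_class` is a hypothetical name, and the text does not say how a pending row that still carries a stale 4xx/5xx code is bucketed, so this sketch simply follows the stated order for that case.

```python
from typing import Optional


def derive_response_class(status: str, last_response_status: Optional[int]) -> str:
    """Map a delivery row to 2xx / 4xx / 5xx / none."""
    if status == "delivered":
        # Delivered always wins, even over a stale last_response_status.
        return "2xx"
    if last_response_status is not None:
        if 400 <= last_response_status <= 499:
            return "4xx"
        if 500 <= last_response_status <= 599:
            return "5xx"
    # Network error, DNS, TLS, timeout, pending, or null.
    return "none"
```

Keeping this in one shared helper is what lets the JSON field and the UI badge agree by construction.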

curl -fsSL "https://api.stech.com/v1/orgs/$ORG/audit/webhook-deliveries?status=failed,gave_up" \
  -H "Authorization: Bearer $STECH_API_KEY"

GET /v1/orgs/:slug/audit.csv #

Single CSV route that switches on ?tab=. One auth gate, one filename helper, one error path. Filter contract per tab mirrors the JSON endpoints exactly so a customer copy-pasting the page URL into a CSV URL gets the same row set.

| Query param | Type | Default | Meaning |
| --- | --- | --- | --- |
| tab | enum | admin | admin / scim / webhooks. Unknown → 400 invalid_tab. Empty / absent → admin. |
| limit | int | 1000 | Capped at MAX_CSV_ROWS = 50000. Beyond → 400 csv_export_too_large. |
| from / to | ISO 8601 | last 30 days | Same inclusive-exclusive window as the JSON endpoints. |
| action / actorUserId | | | When tab=admin. Same shape as the JSON endpoint. |
| resourceType / operation / status / scimTokenLabel | | | When tab=scim. |
| status / eventType / endpointId / responseClass | | | When tab=webhooks. |

Response:

  • Content-Type: text/csv; charset=utf-8
  • Content-Disposition: attachment; filename="audit-<tab>-<orgSlug>-<YYYY-MM-DD>.csv" (the date suffix is the from lower bound, or today's UTC date when from is absent).
  • Body: header row + N data rows, RFC 4180 escaping (cells containing ", a comma, \n, or \r are quoted; an embedded " is doubled), CRLF line endings, trailing CRLF.

curl -fsSL -OJ \
  "https://api.stech.com/v1/orgs/$ORG/audit.csv?tab=admin&from=2026-04-08T00:00:00Z" \
  -H "Authorization: Bearer $STECH_API_KEY"
# writes audit-admin-acme-2026-04-08.csv via the server's filename

-OJ tells curl to honor the Content-Disposition filename and write the file with the server-provided name.
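For scripts that name the file themselves instead of relying on -OJ, the filename rule can be reproduced locally. This is a sketch: `csv_filename` is a hypothetical helper, it expects timezone-aware datetimes, and converting the from bound to UTC before taking its date is an assumption.

```python
from datetime import datetime, timezone
from typing import Optional


def csv_filename(
    tab: str, org_slug: str, from_bound: Optional[datetime], now: datetime
) -> str:
    """Build audit-<tab>-<orgSlug>-<YYYY-MM-DD>.csv.

    The date is the from lower bound, or today's UTC date when from is absent.
    Both datetimes must be timezone-aware.
    """
    day = (from_bound or now).astimezone(timezone.utc).date().isoformat()
    return f"audit-{tab}-{org_slug}-{day}.csv"
```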

CSV export #

Stable header columns per tab. Appending columns to the end is a non-breaking change; reordering or renaming would silently break downstream pipelines, so we don't.

tab=admin (audit-admin-<slug>-<date>.csv):

event_id,created_at_iso,action,actor_user_id,actor_name,actor_email

tab=scim (audit-scim-<slug>-<date>.csv):

event_id,created_at_iso,resource_type,operation,resource_id,status_code,duration_ms,token_label

tab=webhooks (audit-webhooks-<slug>-<date>.csv):

event_id,created_at_iso,endpoint_id,endpoint_url,event_type,scim_event_id,status,attempt_count,last_response_status,response_class,delivered_at_iso

scim_event_id on the webhooks export is the server-minted webhook event uuid (the same one receivers dedupe on — see webhooks.md), not the SCIM event id; the column name is historical.

MAX_CSV_ROWS is 50000. Beyond that we return 400 csv_export_too_large with detail "audit CSV export exceeds 50000 rows; narrow the date range", rather than streaming an unbounded export that could OOM the api or your analysis tool. Narrow from / to and retry.

Empty result set → header row + trailing CRLF only (so the file isn't zero bytes — useful as a template). Nullable cells render as the empty string, not the literal word null.
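The body rules (RFC 4180 quoting, CRLF line endings, trailing CRLF, empty string for nullable cells, header-only output for an empty result) can be sketched in a few lines. `escape_cell` and `render_csv` are hypothetical names for illustration; the header is the tab=admin column list from above.

```python
from typing import Iterable, Sequence

ADMIN_HEADER = [
    "event_id", "created_at_iso", "action",
    "actor_user_id", "actor_name", "actor_email",
]


def escape_cell(value: object) -> str:
    """Render one cell: None becomes empty, special characters force quoting."""
    if value is None:
        return ""  # the empty string, not the literal word null
    text = str(value)
    if any(ch in text for ch in '",\r\n'):
        # Quote the cell and double any embedded quote characters.
        return '"' + text.replace('"', '""') + '"'
    return text


def render_csv(header: Sequence[str], rows: Iterable[Sequence[object]]) -> str:
    """Header row plus data rows, CRLF-separated, with a trailing CRLF."""
    lines = [",".join(escape_cell(c) for c in header)]
    lines.extend(",".join(escape_cell(c) for c in row) for row in rows)
    return "\r\n".join(lines) + "\r\n"
```

An empty row set still yields the header line followed by CRLF, which matches the "not zero bytes" behaviour described above.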

Privacy posture #

The list views and CSV exports are metadata-only. We never include the bytes that could carry secrets / PII at the writer's discretion:

  • admin_actions: before_json, after_json (full state diffs at write time; some action shapes carry secret hints, e.g. partial api key fingerprints).
  • scim_event_log: request_body, response_body (full SCIM payloads with email / name / IdP-side custom attributes).
  • webhook_deliveries: payload, last_response_body (the event body sent to the receiver + the receiver's response body).

Customers who want full per-row drilldown (with the redacted columns) can request that via support; we surface it manually for a specific row on demand. A self-serve drilldown is deferred until customers ask. The audit goal is what happened + who triggered it + what status — not "what every byte was".

Cross-org isolation #

Every query binds organization_id = <resolved org>, where the org id comes from the slug + caller membership lookup, not from any caller-supplied value. The cursor only carries {ts, id} — it cannot widen scope. The cross-org isolation tests in api/tests/audit-list.test.ts / audit-scim.test.ts / audit-webhook-deliveries.test.ts exercise this directly.

The webhook deliveries tab is a special case: webhook_deliveries has no organization_id column, so the org-bind lives on the JOIN to webhook_endpoints.organization_id. A delivery whose endpoint is from a different org cannot match even via a leaked id — the JOIN filters it out before any rows return.
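The JOIN-based org bind can be demonstrated with an in-memory database. This is a sketch only: SQLite stands in for postgres, the two tables are reduced to the columns the paragraph mentions (webhook_endpoints.id is an assumed primary key), and `list_deliveries` is a hypothetical helper.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE webhook_endpoints (
        id TEXT PRIMARY KEY,
        organization_id TEXT NOT NULL
    );
    CREATE TABLE webhook_deliveries (
        id TEXT PRIMARY KEY,
        endpoint_id TEXT NOT NULL
    );
    INSERT INTO webhook_endpoints VALUES ('whe_a', 'org_a'), ('whe_b', 'org_b');
    INSERT INTO webhook_deliveries VALUES ('wd_1', 'whe_a'), ('wd_2', 'whe_b');
    """
)


def list_deliveries(org_id: str, endpoint_id: str = None) -> list:
    """Return delivery ids visible to org_id; the org bind lives on the JOIN."""
    sql = """
        SELECT d.id
        FROM webhook_deliveries AS d
        JOIN webhook_endpoints AS e ON e.id = d.endpoint_id
        WHERE e.organization_id = ?
    """
    params = [org_id]
    if endpoint_id is not None:
        # A caller-supplied endpoint id only narrows; it cannot widen scope.
        sql += " AND d.endpoint_id = ?"
        params.append(endpoint_id)
    sql += " ORDER BY d.id"
    return [row[0] for row in conn.execute(sql, params)]
```

Asking as org_a for whe_b, an endpoint id that belongs to org_b, returns no rows because the organization_id predicate on the joined endpoint still applies.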

Filter conventions #

Pulled from the actual code; same conventions as observability (/agent-runs/metrics).

  • Date bounds are inclusive-exclusive. from is inclusive (events at exactly from are surfaced); to is exclusive (events at exactly to are NOT surfaced). Defaults: to = now, from = to - 30d. A malformed bound falls back to the default rather than 400-ing.
  • Action / event-type prefix wildcards. A trailing * flips a token to prefix mode: webhook_endpoints.* matches webhook_endpoints.created / updated / deleted / secret_rotated / etc. A bare * is dropped (it's a no-op LIKE).
  • Status classes vs exact codes. The SCIM status filter accepts a mix: 2xx / 4xx / 5xx are class buckets; three-digit values (200, 404, 500) are exact matches. Anything else is dropped.
  • responseClass=none. On the webhooks tab, none covers in-flight pending rows AND finished rows that never got an HTTP response back (network error / DNS / TLS handshake failure / timeout). It's the catch-all bucket for "no usable response code".
  • Empty parsed-filter set = match nothing. A ?action= whose every token is bare * / whitespace / unknown short-circuits to an empty result, NOT to "ignore the filter". Mirrors the agent-runs convention so a typo'd filter doesn't silently widen the result set.
  • Cursor pagination over offset. Stable for large windows; doesn't double-count rows that land mid-pagination. The cursor encodes (createdAt, id) so duplicate timestamps tiebreak deterministically. Filter changes drop the cursor (the page resets to row 1) — a stale cursor against a different filter window would point into nowhere.
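The wildcard and "empty parsed-filter set = match nothing" conventions above can be sketched together. The function names are hypothetical and this is an in-memory illustration, not the server's query builder (which the text implies uses LIKE).

```python
from typing import Iterable, List, Optional, Tuple

Matcher = Tuple[str, bool]  # (text, is_prefix)


def parse_action_filter(raw: str) -> List[Matcher]:
    """Turn a csv filter into matchers; a trailing * means prefix mode."""
    matchers: List[Matcher] = []
    for token in raw.split(","):
        token = token.strip()
        if not token or token == "*":
            continue  # whitespace and a bare * are dropped
        if token.endswith("*"):
            matchers.append((token[:-1], True))
        else:
            matchers.append((token, False))
    return matchers


def action_matches(action: str, matchers: List[Matcher]) -> bool:
    """True when any matcher accepts the action; no matchers means no match."""
    return any(
        action.startswith(text) if is_prefix else action == text
        for text, is_prefix in matchers
    )


def filter_actions(actions: Iterable[str], raw_filter: Optional[str]) -> List[str]:
    """None means the filter was not supplied; a supplied filter that parses
    to nothing matches nothing rather than being ignored."""
    if raw_filter is None:
        return list(actions)
    matchers = parse_action_filter(raw_filter)
    return [a for a in actions if action_matches(a, matchers)]
```

The distinction between "absent" and "present but empty after parsing" is the part that keeps a typo'd filter from silently widening the result set.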

Compliance use cases #

  • SOC 2 evidence pull. At audit time, export 90 days of admin and SCIM events:
    curl -fsSL -OJ "https://api.stech.com/v1/orgs/$ORG/audit.csv?tab=admin&from=2026-02-08T00:00:00Z" \
      -H "Authorization: Bearer $STECH_API_KEY"
    curl -fsSL -OJ "https://api.stech.com/v1/orgs/$ORG/audit.csv?tab=scim&from=2026-02-08T00:00:00Z" \
      -H "Authorization: Bearer $STECH_API_KEY"
    RFC 4180 + Excel-compatible. Hand the CSVs to the auditor.
  • "Who removed Alice last Tuesday?" Open /settings/audit?tab=admin&action=memberships.deleted&from=2026-04-29 — the row + actor (name, email, user id) is the answer.
  • "Is SCIM provisioning working?" Open /settings/audit?tab=scim&from=<7d ago>. If errorRate is below 5% and byOperation is dominated by create / patch / replace, healthy. A spike in 4xx against delete usually means the IdP is re-pushing a deactivated user into a different group — not a stech problem, but visible here.
  • Live-alerting on flagged actions. This viewer is for retrospective forensics. For real-time, subscribe a webhook endpoint to the audit.flagged event type — the curated subset of admin actions worth alerting on (token revocation, OAuth disconnect, source deletion, SSO updates, webhook secret rotation, plan changes, deployment supersede). See webhooks.md.
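The SCIM health heuristic from the list above can be scripted against the aggregations object. This is a rough sketch under explicit assumptions: `scim_looks_healthy` is a hypothetical name, "dominated by" is read as "more than half of all calls", and treating a window with zero traffic as unhealthy is this sketch's own choice rather than documented behaviour.

```python
def scim_looks_healthy(aggregations: dict, max_error_rate: float = 0.05) -> bool:
    """Apply the rule of thumb: errorRate below 5% and mostly write traffic."""
    total = aggregations.get("total", 0)
    if total == 0:
        # Assumption: no calls at all suggests the IdP stopped pushing.
        return False
    by_op = aggregations.get("byOperation", {})
    writes = sum(by_op.get(op, 0) for op in ("create", "patch", "replace"))
    low_errors = aggregations.get("errorRate", 0.0) < max_error_rate
    mostly_writes = writes * 2 > total  # assumption: "dominated" = majority
    return low_errors and mostly_writes


# The sample aggregations from the SCIM endpoint section.
sample = {
    "total": 1842,
    "byOperation": {"patch": 988, "create": 412, "list": 348, "get": 94},
    "errorRate": 0.023,
}
healthy = scim_looks_healthy(sample)
```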

Limitations #

  • No real-time streaming. Use webhooks (audit.flagged) for live alerting; this viewer is for retrospective lookups.
  • No long-term archival yet. Rows live in postgres and are subject to the same retention as the rest of the org's data. WORM-backed archival export is filed under #108.
  • No full state-diff in list views. before_json / after_json / request_body / response_body / payload / last_response_body stay redacted from list + CSV. Per-row drilldown is a future surface; today, customers who need it go through support.
  • System-event audit gap. Non-human actors (the webhook worker's own auto-disable bookkeeping, etc.) don't fan out to webhooks by design and don't always land in admin_actions. Tracked in #266.
  • No charts or heatmaps in the UI. Pull the CSV and pivot in your tool of choice. The aggregations the API returns (byStatusClass, byOperation, deliveredRate, ...) are also exposed for scripting.

Troubleshooting #

I see fewer rows than I expected. Check the date range — the default window is 30 days. The aggregations strip echoes the resolved window so you can confirm what was used. Then check the active filter chips / inputs — clear resets them.

Actor shows as null / "(deleted)". The user row was deleted (possible only after a future set null relaxation; today the FK is restrict). The historical action still references the (now-null) actor for compliance evidence — we keep the audit row even though the attribution is gone. Same posture for tokenLabel on the SCIM tab when the SCIM token was revoked + deleted.

CSV export returns 400 csv_export_too_large. Your row set exceeds 50000. Narrow from / to (or pin a tighter filter), then retry. We deliberately fail closed rather than silently truncate — a truncated export of "all admin actions in the last year" is worse than a clear "narrow the window" message.
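One way to automate "narrow from / to, then retry" is to halve the window whenever the export is refused. The sketch below is a client-side illustration, not a documented feature: `export_in_chunks`, `CsvExportTooLarge`, and the injected `export` callable are hypothetical, and the caller is expected to raise `CsvExportTooLarge` when the API answers 400 csv_export_too_large.

```python
from datetime import datetime, timezone
from typing import Callable, List


class CsvExportTooLarge(Exception):
    """Raised by the caller's export function on 400 csv_export_too_large."""


def export_in_chunks(
    frm: datetime, to: datetime, export: Callable[[datetime, datetime], str]
) -> List[str]:
    """Return CSV bodies in chronological order, halving the window on refusal.

    Because from is inclusive and to is exclusive, [frm, mid) and [mid, to)
    neither overlap nor skip rows.
    """
    try:
        return [export(frm, to)]
    except CsvExportTooLarge:
        mid = frm + (to - frm) / 2
        if mid <= frm or mid >= to:
            raise  # the window cannot be narrowed any further
        return export_in_chunks(frm, mid, export) + export_in_chunks(mid, to, export)


# Stub standing in for the API: five rows, with a cap of two rows per export.
_rows = [datetime(2026, 5, d, tzinfo=timezone.utc) for d in (1, 2, 3, 4, 5)]


def _stub_export(frm: datetime, to: datetime) -> str:
    hits = [t for t in _rows if frm <= t < to]
    if len(hits) > 2:
        raise CsvExportTooLarge()
    return ",".join(t.date().isoformat() for t in hits)


chunks = export_in_chunks(
    datetime(2026, 5, 1, tzinfo=timezone.utc),
    datetime(2026, 5, 6, tzinfo=timezone.utc),
    _stub_export,
)
```

Each real chunk carries its own header row, so drop the header from every chunk after the first before concatenating.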

I want to see what specifically changed in this admin action. The list view doesn't expose before_json / after_json — see Privacy posture. For now, cross-reference the affected resource's current state against an external snapshot. A self-serve per-row drilldown surface is on the roadmap (no shipped issue yet).

The webhooks tab shows a delivery I redelivered with two rows of different ids but the same event_id. That's intended. Redeliver clones the original delivery row with a fresh delivery id (status = pending, attempt_count = 0, next_attempt_at = now()) but preserves the event_id and the payload bytes. Receivers dedupe on event_id, not delivery id. See webhooks.md § Idempotency.
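On the receiving side, deduping on the event id looks roughly like the sketch below. `WebhookReceiver` is hypothetical, the in-memory set stands in for durable storage, and the `eventId` / `eventType` keys mirror the audit JSON above; the actual envelope field names are defined in webhooks.md and may differ.

```python
from typing import List, Set


class WebhookReceiver:
    """Processes each event once, however many delivery rows carry it."""

    def __init__(self) -> None:
        self._seen_event_ids: Set[str] = set()
        self.processed: List[str] = []

    def handle(self, delivery: dict) -> bool:
        """Return True when the event was processed, False for a duplicate."""
        event_id = delivery["eventId"]  # assumed field name
        if event_id in self._seen_event_ids:
            return False  # redelivery or retry of an event already handled
        self._seen_event_ids.add(event_id)
        self.processed.append(delivery["eventType"])
        return True


receiver = WebhookReceiver()
first = receiver.handle(
    {"id": "wd_1", "eventId": "evt-1", "eventType": "deployment.failed"}
)
# A redelivery: fresh delivery id, same event id.
second = receiver.handle(
    {"id": "wd_2", "eventId": "evt-1", "eventType": "deployment.failed"}
)
```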

  • Webhooks: audit.flagged envelope, signing scheme, retry policy. The audit viewer is for retrospective forensics on dispatch failures; subscribe to audit.flagged for live alerting on the curated subset of admin actions.
  • Observability: when an agent crosses the failure-rate threshold the watchdog fires audit.flagged; this log is where you go to ask "who triggered the agent run that crossed it?" after the fact.
  • Billing and usage: when an org crosses a soft / hard usage cap the cost-control worker fires audit.flagged (data.kind = 'usage_soft_cap' / 'usage_hard_cap'); this log is where the corresponding agent_runs.cap_refused admin_actions surface for retrospective forensics.