CLI tool sources

A CLI source plugs a command-line binary into an agent the same way an MCP source plugs in a server. The runtime forks the binary at tool-use time and feeds stdout back to the model.

What changed from the curated registry #

The previous design (#207) shipped a maintainer-curated runtime/cli-registry.json: every new CLI required a PR to stech to register the URL + SHA. Epic #220 replaced that with BYO: the org admin registers any binary by URL + SHA-256 + version, and the runtime fetches + verifies + installs it at agent boot. No PR to stech, no image rebake. The trade is deliberate: max flexibility, and the org admin owns the trust call.

Adding a CLI source #

Three paths, same backing table.

Dashboard: /settings/cli-sources → add source. Fields:

  • command — the binary name on PATH inside the runtime image. Must match /^[a-z][a-z0-9_-]*$/ (no dots, no slashes).
  • version — free-form, surfaced in the dashboard list. Bump it when you bump the URL + SHA.
  • label — display name on the source card.
  • binary URL: https://… only (beyond the host checks listed under Security model, the api only requires that the URL parses and uses https:). The runtime invokes tar -xzf at boot, so the URL must serve a gzipped tarball (.tar.gz in practice). The form's Compute SHA button hits POST /v1/cli-sources/compute-sha (api fetches once, streams the bytes through SHA-256, returns the hex) so you don't have to curl … | sha256sum locally.
  • SHA-256 — 64-char lowercase hex.
  • extract path — relative path of the binary inside the tarball, e.g. gh_2.50.0_linux_amd64/bin/gh. No .. segments, no leading slash.
  • token — optional; injected into the binary's env at exec time. Tri-state on edit: omit to keep, send null to clear, send a string to replace.
  • allowed — optional CSV of allowed subcommand names.
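
The field constraints above can be checked before a source is ever submitted. The following is a minimal sketch, not the api's actual validator: the `CliSourceInput` type and `validateCliSource` function are hypothetical names, and only the rules stated in the list (command pattern, https-only URL, 64-char lowercase hex SHA, relative extract path with no `..`) are encoded.

```typescript
// Hypothetical input shape; field names follow the API body described in this document.
interface CliSourceInput {
  command: string;
  binaryUrl: string;
  binarySha256: string;
  binaryExtractPath: string;
}

const COMMAND_RE = /^[a-z][a-z0-9_-]*$/; // pattern quoted in the field list
const SHA256_RE = /^[0-9a-f]{64}$/; // 64-char lowercase hex

// Returns a list of problems; an empty list means every rule passed.
function validateCliSource(input: CliSourceInput): string[] {
  const problems: string[] = [];

  if (!COMMAND_RE.test(input.command)) {
    problems.push("command must match /^[a-z][a-z0-9_-]*$/");
  }

  let parsed: URL | null = null;
  try {
    parsed = new URL(input.binaryUrl);
  } catch {
    problems.push("binaryUrl is not parseable");
  }
  if (parsed !== null && parsed.protocol !== "https:") {
    problems.push("binaryUrl must use https:");
  }

  if (!SHA256_RE.test(input.binarySha256)) {
    problems.push("binarySha256 must be 64 lowercase hex characters");
  }

  const path = input.binaryExtractPath;
  if (path === "" || path.startsWith("/") || path.split("/").includes("..")) {
    problems.push("binaryExtractPath must be relative with no .. segments");
  }

  return problems;
}
```

Returning a list rather than throwing on the first failure lets a form surface every problem in one pass.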

CLI — parity with the dashboard:

stech cli add gh --version=2.50.0 \
  --url=https://github.com/cli/cli/releases/download/v2.50.0/gh_2.50.0_linux_amd64.tar.gz \
  --extract-path=gh_2.50.0_linux_amd64/bin/gh \
  --compute-sha
stech cli list
stech cli remove gh

--compute-sha calls the api helper and pins the result; --sha=<hex> takes a precomputed SHA. stech cli edit is not yet wired — use the dashboard for partial updates.
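
For intuition, the compute-sha helper's "stream the bytes through SHA-256" step can be sketched as incremental hashing with a size cap. This is an illustrative sketch only: `hashChunks` is a hypothetical name, the chunks come from an in-memory iterable rather than a network stream, and the 200 MiB default mirrors the cap mentioned in the operator runbook.

```typescript
import { createHash } from "node:crypto";

const MAX_BYTES = 200 * 1024 * 1024; // 200 MiB cap, per the operator runbook

// Hashes chunks incrementally so the whole download never has to sit in memory.
// Throws as soon as the running total exceeds the cap.
function hashChunks(chunks: Iterable<Uint8Array>, maxBytes: number = MAX_BYTES): string {
  const hash = createHash("sha256");
  let total = 0;
  for (const chunk of chunks) {
    total += chunk.byteLength;
    if (total > maxBytes) {
      throw new Error(`download exceeds ${maxBytes} bytes`);
    }
    hash.update(chunk);
  }
  return hash.digest("hex");
}
```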

API — for scripts: POST /v1/orgs/:slug/cli-sources with the same body shape ({ command, version, label, binaryUrl, binarySha256, binaryExtractPath, token?, allowed? }).
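
A script might prepare that request as follows. This is a sketch under assumptions: the `prepareCreateCliSource` helper is hypothetical, and the `Authorization: Bearer` header is an assumed auth scheme that this document does not specify for this route. Only the path and the body field names come from the text above.

```typescript
// Body shape as listed above; `allowed` is the optional CSV of subcommand names.
interface CreateCliSourceBody {
  command: string;
  version: string;
  label: string;
  binaryUrl: string;
  binarySha256: string;
  binaryExtractPath: string;
  token?: string;
  allowed?: string;
}

interface PreparedRequest {
  url: string;
  method: "POST";
  headers: Record<string, string>;
  body: string;
}

// Builds the request without sending it, so it can be inspected or logged first.
function prepareCreateCliSource(
  apiBase: string,
  orgSlug: string,
  apiKey: string, // assumption: the route accepts a bearer API key
  body: CreateCliSourceBody,
): PreparedRequest {
  return {
    url: `${apiBase}/v1/orgs/${encodeURIComponent(orgSlug)}/cli-sources`,
    method: "POST",
    headers: {
      "content-type": "application/json",
      authorization: `Bearer ${apiKey}`,
    },
    body: JSON.stringify(body),
  };
}
```

The result can be passed to `fetch(req.url, req)` in any runtime that provides a global `fetch`.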

Declaring CLI tools (developer) #

In agent.ts:

import { defineAgent, cli } from "@stech/agent";

export default defineAgent({
  name: "code-reviewer",
  model: "claude-sonnet-4-6",
  tools: [
    cli("gh", {
      version: "2.50.0",
      tokenEnv: "GH_TOKEN",
      tools: [
        {
          name: "gh_pr_view",
          description: "Read a GitHub pull request",
          inputSchema: {
            type: "object",
            properties: { prNumber: { type: "integer" } },
            required: ["prNumber"],
          },
          exec: (input) => [
            "pr",
            "view",
            String((input as { prNumber: number }).prNumber),
            "--json",
            "title,body,state",
          ],
        },
      ],
    }),
  ],
});

cli() brands the source. The runtime expander (runtime/src/agent/expand-cli.ts) emits one tool per declared subcommand; each tool's exec builds argv from the model's input. Default token env var is STECH_CLI_TOKEN; override with tokenEnv.
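
To make the expansion step concrete, here is a rough sketch of what "one tool per declared subcommand" could look like. It is not the code in runtime/src/agent/expand-cli.ts; the `DeclaredCliSource`, `RuntimeTool`, and `expandCliSource` names are hypothetical, and only the default env var name `STECH_CLI_TOKEN` is taken from the text.

```typescript
// Hypothetical shapes mirroring the cli("gh", { ... }) declaration above.
interface DeclaredTool {
  name: string;
  description: string;
  exec: (input: unknown) => string[];
}

interface DeclaredCliSource {
  command: string;
  tokenEnv?: string;
  tools: DeclaredTool[];
}

interface RuntimeTool {
  name: string;
  description: string;
  command: string; // binary to fork
  tokenEnv: string; // env var that receives the token at exec time
  argvFor: (input: unknown) => string[];
}

const DEFAULT_TOKEN_ENV = "STECH_CLI_TOKEN";

// Emits one runtime tool per declared subcommand; argv is built lazily from model input.
function expandCliSource(source: DeclaredCliSource): RuntimeTool[] {
  return source.tools.map((tool) => ({
    name: tool.name,
    description: tool.description,
    command: source.command,
    tokenEnv: source.tokenEnv ?? DEFAULT_TOKEN_ENV,
    argvFor: (input) => tool.exec(input),
  }));
}
```

Because argv is an array rather than a shell string, model-supplied values are never interpreted by a shell.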

Security model #

The trust boundary is unchanged from MCP and stech deploy: an org admin already has full code-exec on their fly machines. BYO doesn't add a new privilege — it shifts the supply-chain decision from stech to the org admin. The defenses are about keeping that decision honest.

  • HTTPS + non-local hosts at validate time. The api refuses http://, refuses literal IPs in obviously-private ranges (loopback, RFC1918, link-local incl. AWS metadata 169.254.169.254, IPv6 link-local + ULA), and refuses .local / .internal / .lan suffixes.
  • DNS-resolution-time SSRF guard at fetch time. Both the api compute-sha helper and the runtime cli-bootstrap resolve the URL's hostname and reject if any returned address is private. Closes the DNS-rebinding angle the validate-time hostname check alone can't.
  • Redirect rejection. compute-sha sets redirect: "manual" and rejects 3xx outright. An attacker can't register a public URL that 302s to a private IP.
  • SHA-256 verification on every boot. cli-bootstrap downloads the bytes, hashes them, compares against the stored binary_sha256 (constant-time). Mismatch = boot fails loudly; the agent doesn't serve traffic.
  • url and sha move as a pair. PATCH rejects updating one without the other — a stale SHA against a new URL would brick verification at the next boot.
  • Tokens encrypted at rest, env-only at exec. Same AES-256-GCM posture as MCP tokens. Never returned by api routes (hasToken: true is all the dashboard sees), never logged.
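
The two host checks (validate-time hostname check, fetch-time check on resolved addresses) could be sketched roughly as below. This is a simplified illustration with hypothetical function names, not the api's guard: it covers only the ranges named in the list, and it deliberately ignores edge cases such as IPv4-mapped IPv6 addresses and non-canonical spellings of loopback.

```typescript
import { isIP } from "node:net";

const BLOCKED_SUFFIXES = [".local", ".internal", ".lan"];

// Loopback, RFC1918, and link-local (which includes 169.254.169.254).
function isPrivateIPv4(addr: string): boolean {
  const parts = addr.split(".").map(Number);
  const a = parts[0] ?? -1;
  const b = parts[1] ?? -1;
  return (
    a === 127 ||
    a === 10 ||
    (a === 172 && b >= 16 && b <= 31) ||
    (a === 192 && b === 168) ||
    (a === 169 && b === 254)
  );
}

// Loopback (::1), link-local (fe80::/10), and ULA (fc00::/7). Simplified.
function isPrivateIPv6(addr: string): boolean {
  const lower = addr.toLowerCase();
  if (lower === "::1") return true;
  const head = lower.split(":")[0] ?? "";
  const firstGroup = parseInt(head === "" ? "0" : head, 16);
  return (firstGroup & 0xffc0) === 0xfe80 || (firstGroup & 0xfe00) === 0xfc00;
}

function isPrivateAddress(addr: string): boolean {
  const family = isIP(addr);
  if (family === 4) return isPrivateIPv4(addr);
  if (family === 6) return isPrivateIPv6(addr);
  return false; // not an IP literal
}

// Validate-time check on the hostname as written in the URL.
function isBlockedHostname(hostname: string): boolean {
  const host = hostname.toLowerCase().replace(/^\[|\]$/g, ""); // strip IPv6 brackets
  return isPrivateAddress(host) || BLOCKED_SUFFIXES.some((suffix) => host.endsWith(suffix));
}

// Fetch-time check: reject if ANY resolved address is private.
function anyResolvedAddressPrivate(resolved: string[]): boolean {
  return resolved.some((addr) => isPrivateAddress(addr));
}
```

Rejecting on any private address, rather than only when all are private, is what closes the case where a hostname resolves to a mix of public and private records.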

How runtime cold start works #

When a fly machine boots, runtime/src/entry.ts does, in order:

  1. Fetch the agent tarball (existing artifact flow).
  2. For each entry in STECH_CLI_SOURCES: bootstrapCliSource does the DNS check, fetches the tarball, verifies the SHA, extracts under /tmp/stech-cli-cache/<sha>/, and copies the binary to /usr/local/bin/<command>. Warm restarts hit the cache and skip the download; cold starts re-fetch (/tmp is fresh on machine creation).
  3. Boot the agent process.

Failure on any source aborts boot — no partial CLI surface.
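
The verify step in the sequence above can be sketched with Node's crypto module. This is an illustration rather than the body of bootstrapCliSource: the helper names are hypothetical, and only the cache and install path layouts are taken from step 2.

```typescript
import { Buffer } from "node:buffer";
import { createHash, timingSafeEqual } from "node:crypto";

function sha256Hex(bytes: Uint8Array): string {
  return createHash("sha256").update(bytes).digest("hex");
}

// Compares the computed digest to the stored one in constant time.
function verifySha256(bytes: Uint8Array, expectedHex: string): boolean {
  const actual = Buffer.from(sha256Hex(bytes), "hex");
  const expected = Buffer.from(expectedHex, "hex");
  // timingSafeEqual throws on length mismatch, so check lengths first.
  return actual.length === expected.length && timingSafeEqual(actual, expected);
}

// Path layouts described in step 2.
function cacheDirFor(sha256: string): string {
  return `/tmp/stech-cli-cache/${sha256}/`;
}

function installPathFor(command: string): string {
  return `/usr/local/bin/${command}`;
}
```

Keying the cache directory on the SHA means a changed binary can never be served from a stale cache entry.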

Operator runbook #

  • Register a new CLI — paste the URL into the dashboard, click Compute SHA, fill version + extract path, save. No PR to stech.
  • Binary released a new version: edit the source, change version + URL + recomputed SHA in one save. The api enforces that URL and SHA move together. The token stays.
  • Token rotated / wrong: edit the source; the token field is tri-state (omit / null / replace), same UX as MCP.
  • Compute SHA stalls or 502s — usually means the URL redirects (we reject 3xx) or the upstream is slow (30s ceiling, 200MiB cap). Paste the redirect target directly, or compute the SHA locally with curl -fL <url> | sha256sum.
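
The two edit rules the runbook leans on (URL and SHA move as a pair; the token field is tri-state) can be expressed compactly. The sketch below uses hypothetical types and a plaintext token purely for illustration; the real row stores ciphertext.

```typescript
// Hypothetical, simplified view of a stored source.
interface StoredSource {
  version: string;
  binaryUrl: string;
  binarySha256: string;
  token: string | null;
}

// A key that is absent means "keep"; token: null means "clear".
interface SourcePatch {
  version?: string;
  binaryUrl?: string;
  binarySha256?: string;
  token?: string | null;
}

function applySourcePatch(current: StoredSource, patch: SourcePatch): StoredSource {
  const hasUrl = patch.binaryUrl !== undefined;
  const hasSha = patch.binarySha256 !== undefined;
  if (hasUrl !== hasSha) {
    // A new URL with a stale SHA (or the reverse) would fail verification at next boot.
    throw new Error("binaryUrl and binarySha256 must be updated together");
  }
  return {
    version: patch.version ?? current.version,
    binaryUrl: patch.binaryUrl ?? current.binaryUrl,
    binarySha256: patch.binarySha256 ?? current.binarySha256,
    // Tri-state: omitted keeps, null clears, string replaces.
    token: patch.token === undefined ? current.token : patch.token,
  };
}
```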

Per-user OAuth (auth_mode='user_oauth') #

The default auth_mode on a CLI source is org_token — one shared token in org_cli_sources.token_ciphertext is injected into the binary's env for every agent run regardless of who triggered it. That fits service-account CLIs: the agent acts as the org, with the same posture for every caller.

It does not fit CLIs whose authorization is per-user. gh, kubectl, aws, gcloud all gate by user identity — a shared token would let any agent run modify any resource that one identity has access to, regardless of who asked. For these, set auth_mode='user_oauth' and each user connects their own credentials through an OAuth flow. At agent run time the runtime injects the caller's token, not a shared one.
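
The difference between the two modes comes down to which token is selected at run time. A minimal sketch follows, assuming hypothetical `SourceAuth` and `CallerCredential` shapes; the error name and the fail-closed behavior are taken from this document, the rest is illustrative.

```typescript
type AuthMode = "org_token" | "user_oauth";

class MissingUserCliCredentialError extends Error {
  constructor(command: string) {
    super(`no usable credential for CLI source "${command}"`);
    this.name = "MissingUserCliCredentialError";
  }
}

// Hypothetical, decrypted views of the source row and the caller's credential.
interface SourceAuth {
  command: string;
  authMode: AuthMode;
  orgToken: string | null; // optional shared token
}

interface CallerCredential {
  accessToken: string;
  expiresAt: Date | null;
}

// Picks the token to inject into the binary's env for this run.
function resolveToken(
  source: SourceAuth,
  caller: CallerCredential | null,
  now: Date = new Date(),
): string | null {
  if (source.authMode === "org_token") {
    return source.orgToken; // same token for every caller
  }
  const expired =
    caller !== null && caller.expiresAt !== null && caller.expiresAt.getTime() <= now.getTime();
  if (caller === null || expired) {
    // Fail closed: never fall back to a shared token in user_oauth mode.
    throw new MissingUserCliCredentialError(source.command);
  }
  return caller.accessToken;
}
```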

Choosing between org_token and user_oauth #

The decision rule is "does the CLI's auth represent the org or the human?":

  • org_token: octopus-review (the agent acts as the org's review service account; same posture for every reviewer who triggers a run), Superset MCP-style server tokens, internal CLIs bound to a service account.
  • user_oauth: gh, kubectl, aws, gcloud, and anything else where RBAC is per-user and "the agent acts as the calling human" is the right semantic.

If you'd be uncomfortable with one user's run mutating something the caller doesn't personally have access to, you want user_oauth.

Setting up a user_oauth source (admin) #

/settings/cli-sources → add source. The BYO kernel fields (command, version, label, binary URL, SHA-256, extract path) are unchanged from the org_token flow; fill them in the same way.

For the auth section, pick per-user OAuth and fill in the provider config:

  • OAuth authorize URL — the IdP's authorize endpoint, e.g. https://github.com/login/oauth/authorize.
  • OAuth token URL — the IdP's token-exchange endpoint, e.g. https://github.com/login/oauth/access_token. Must be https://; the api refuses http:// and runs the same SSRF guard at validate time and on every token exchange.
  • client_id — from the OAuth app you registered on the provider's developer portal.
  • client_secret — from the same place. Encrypted at rest, never returned by api routes.
  • scopes — comma- or space-separated, depending on what the provider accepts. Forwarded verbatim to the authorize URL.

To get client_id and client_secret, register an OAuth App on the provider's developer portal:

  • GitHub — Settings → Developer settings → OAuth Apps → New OAuth App.
  • GitLab — Edit profile → Applications.
  • Google Cloud — APIs & Services → Credentials → Create OAuth client ID.

Set the redirect URI on the registration to https://api.stech.com/oauth/cli/callback. Mismatched redirect URIs are the most common reason the token exchange fails (see Troubleshooting).

Save the source. It immediately shows up at /settings/cli-credentials for every member of the org as not connected — admins don't connect on behalf of users.

Connecting a source (end user) #

/settings/cli-credentials lists every user_oauth source across every org you're a member of. For each:

  1. Click Connect. The browser navigates to the api's /oauth/cli/start?source_id=<id>, which generates a CSPRNG state, stores it with a 10-minute TTL, and 302s to the provider's authorize page with redirect_uri pointing at the api's /oauth/cli/callback.
  2. Approve at the provider. The provider redirects back to /oauth/cli/callback with code + state.
  3. The api exchanges the code for an access token, encrypts it, and upserts a row in user_cli_credentials keyed on (user_id, cli_source_id). You bounce to /settings/cli-credentials?connected=<source> and a toast confirms.

After that, any agent run you trigger that forks the source's binary runs with your token, not another user's. The dashboard shows connected, the granted scopes, and the connected-at / expires-at line.
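
The state handling in step 1 can be sketched as a small single-use store plus an authorize-URL builder. This is an assumption-laden illustration: `OAuthStateStore` and `buildAuthorizeUrl` are hypothetical names, the in-memory Map stands in for whatever storage the api uses, and the query parameter names follow the standard OAuth 2.0 authorization request rather than anything this document specifies.

```typescript
import { randomBytes } from "node:crypto";

const STATE_TTL_MS = 10 * 60 * 1000; // 10-minute TTL

class OAuthStateStore {
  private readonly pending = new Map<string, { sourceId: string; expiresAt: number }>();

  // 32 bytes from a CSPRNG, hex-encoded (64 characters).
  issue(sourceId: string, now: number = Date.now()): string {
    const state = randomBytes(32).toString("hex");
    this.pending.set(state, { sourceId, expiresAt: now + STATE_TTL_MS });
    return state;
  }

  // Single-use: the entry is removed on first lookup, valid or not.
  consume(state: string, now: number = Date.now()): string | null {
    const entry = this.pending.get(state);
    if (entry === undefined) return null;
    this.pending.delete(state);
    return now <= entry.expiresAt ? entry.sourceId : null;
  }
}

// Parameter names per the OAuth 2.0 authorization request (assumed, not confirmed here).
function buildAuthorizeUrl(
  authorizeUrl: string,
  clientId: string,
  scopes: string, // forwarded verbatim
  state: string,
  redirectUri: string,
): string {
  const url = new URL(authorizeUrl);
  url.searchParams.set("response_type", "code");
  url.searchParams.set("client_id", clientId);
  url.searchParams.set("redirect_uri", redirectUri);
  url.searchParams.set("scope", scopes);
  url.searchParams.set("state", state);
  return url.toString();
}
```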

Disconnecting and reconnecting #

Disconnect on the row deletes the user_cli_credentials entry for (you, source). The source itself stays — you'll see it as not connected and can reconnect later. Other users' credentials on the same source are untouched.

When the provider's token expires, the dashboard flips the row to expired (the api compares expires_at to now in the list endpoint). The action button changes to Reconnect and points at the same /oauth/cli/start URL — the callback's upsert overwrites the existing row in place. Until you reconnect, agent runs that need that CLI fail with MissingUserCliCredentialError.

Refresh tokens are stored when the provider issues them, but auto-refresh isn't wired yet — tracked in #245. Until that lands, expired tokens require a manual reconnect.
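
The row states described in this section (not connected, connected, expired) follow from a single comparison of expires_at against the current time. A minimal sketch, with hypothetical type and function names:

```typescript
type CredentialStatus = "not connected" | "connected" | "expired";

// Hypothetical view of a user_cli_credentials row; null expiresAt means no known expiry.
interface CredentialRow {
  expiresAt: Date | null;
}

function credentialStatus(row: CredentialRow | null, now: Date = new Date()): CredentialStatus {
  if (row === null) return "not connected";
  if (row.expiresAt !== null && row.expiresAt.getTime() <= now.getTime()) return "expired";
  return "connected";
}

// The row's action button follows from the status.
function actionLabel(status: CredentialStatus): "Connect" | "Reconnect" | "Disconnect" {
  if (status === "not connected") return "Connect";
  return status === "expired" ? "Reconnect" : "Disconnect";
}
```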

Security model #

Same posture as MCP source tokens, with two additions for the authorize/exchange round-trip:

  • Tokens encrypted at rest. Access tokens and refresh tokens go through the AES-256-GCM facade keyed on STECH_SECRETS_MASTER_KEY. client_id and client_secret on the source row use the same facade. Plaintext never lands in api responses, logs, or audit rows.
  • State is single-use, 10-minute TTL. 32-byte CSPRNG, hex-encoded. An attacker can't brute-force a valid state inside the window, and consuming it on /callback prevents replay.
  • Multi-layer SSRF guard on the token exchange. Validate-time hostname blocklist (rejects local / RFC1918 / .internal / metadata IPs), DNS-resolution-time guard at fetch time (closes the rebinding gap), redirect: "manual" on the POST so a 3xx can't bounce the request to a private host. Same shape as the Compute SHA helper.
  • State is not trusted for authorization. /callback re-fetches the source row by id and re-checks auth_mode='user_oauth' plus the calling user's org membership. An admin who flips the source to org_token mid-flow, or removes the user from the org between click and approval, blocks the credential from landing.
  • Cross-org isolation. Tokens are scoped to (user, cli_source), and cli_sources are org-scoped, so the same human in two orgs has independent credentials with potentially different scopes. Disconnecting in one org doesn't touch the other.
  • /oauth/cli/start requires a session. Bearer / API-key auth is refused — "API key clicked Connect" has no sensible meaning, and the granted token has to bind to a real user identity.

Troubleshooting #

  • agent run failed with MissingUserCliCredentialError — the user who triggered the run hasn't connected their credentials for one of the run's user_oauth sources. Open /settings/cli-credentials and connect the source. The runtime fails closed rather than fall back to a service account, by design.
  • OAuth state expired / invalid_state — more than 10 minutes passed between Connect and the provider's redirect back, or the state was already consumed. Restart from /settings/cli-credentials.
  • Token exchange returned 401 / token_exchange_status_401 — the source's client_id / client_secret are wrong, or the OAuth App registration's redirect URI doesn't match https://api.stech.com/oauth/cli/callback. Re-check both on the provider's developer portal, then PATCH the source from the dashboard (the oauthClientSecret field is tri-state on edit — same UX as token).
  • source_misconfigured on connect — the source is auth_mode='user_oauth' but missing one of oauth_authorize_url, oauth_token_url, or the encrypted oauth_client_id / oauth_client_secret. Edit the source and re-enter the missing fields.
  • I see my source on /settings/cli-sources but not on /settings/cli-credentials — the source's auth_mode is org_token, so there's no per-user credential to connect. Switch it to user_oauth from the admin page if you actually want per-user delegation. (The reverse is also true — sources you see on the credentials page won't show on a non-admin's /settings/cli-sources.)
  • Provider error on the redirect back — provider-side errors (access_denied, app suspended, etc.) bounce you back to /settings/cli-credentials?error=<reason>. The toast renders the reason; nothing landed in user_cli_credentials.

Walk-through: registering gh as a user_oauth source #

An end-to-end recipe that an org admin follows once. After that, every member of the org connects their own GitHub credentials at /settings/cli-credentials and agents can call gh on their behalf.

1. Create a GitHub OAuth App #

  1. Visit github.com/settings/developers → OAuth Apps → New OAuth App. (For org-bound apps, do this under Settings → Developer settings → OAuth Apps on the org page instead; the flow is the same.)
  2. Fill in:
    • Application name — anything; users see this on the consent screen. e.g. Acme on stech.
    • Homepage URL: https://app.stech.com
    • Authorization callback URL: https://api.stech.com/oauth/cli/callback (or the appropriate UAT host). This must match exactly: GitHub strict-parses, and trailing slashes matter.
  3. Click Register application. GitHub gives back a Client ID.
  4. Click Generate a new client secret and copy it immediately — GitHub only shows it once.

2. Run the bootstrap script #

From the api host (so STECH_SECRETS_MASTER_KEY and DATABASE_URL are already in scope via .env):

docker compose exec api sh -c '
  STECH_ORG_SLUG=acme \
  GITHUB_OAUTH_CLIENT_ID=Iv1.xxxxxxxxxxxxxxxx \
  GITHUB_OAUTH_CLIENT_SECRET=ghoxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx \
  bun run register-gh-as-user-oauth
'

What it does:

  1. Resolves the org by slug.
  2. Fetches the gh release tarball, computes the SHA-256.
  3. Encrypts the OAuth Client ID + Secret via SecretsCrypto.
  4. Upserts an org_cli_sources row with auth_mode='user_oauth', command='gh', the OAuth provider URLs, and the encrypted credentials. Idempotent — re-runs update the row in place.

Optional env:

  • GH_CLI_VERSION — default v2.69.0
  • GH_CLI_SCOPES — default repo,read:user (comma-separated)
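
As an illustration of how the two optional variables might feed the script, the sketch below derives the release URL and extract path from a version tag and splits the scopes list. The function names are hypothetical and the real script may differ; the URL and path patterns are inferred from the gh 2.50.0 example earlier in this document.

```typescript
interface GhReleaseCoordinates {
  version: string;
  binaryUrl: string;
  binaryExtractPath: string;
}

// Accepts "v2.69.0" or "2.69.0"; the default matches GH_CLI_VERSION's documented default.
function ghReleaseCoordinates(versionTag: string = "v2.69.0"): GhReleaseCoordinates {
  const version = versionTag.replace(/^v/, "");
  const dir = `gh_${version}_linux_amd64`;
  return {
    version,
    binaryUrl: `https://github.com/cli/cli/releases/download/v${version}/${dir}.tar.gz`,
    binaryExtractPath: `${dir}/bin/gh`,
  };
}

// Splits the comma-separated GH_CLI_SCOPES value, dropping empty entries.
function parseScopes(raw: string = "repo,read:user"): string[] {
  return raw
    .split(",")
    .map((scope) => scope.trim())
    .filter((scope) => scope.length > 0);
}
```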

3. Each member connects their account #

Members visit /settings/cli-credentials, find the gh row marked not connected, click Connect → GitHub authorize page → approve → redirected back. Their token lives encrypted in user_cli_credentials; agents called by them use that token.

4. Use it from agent.ts #

import { defineAgent, cli } from "@stech/agent";

export default defineAgent({
  name: "support",
  model: "claude-sonnet-4-6",
  tools: [
    cli("gh", {
      // The token comes from the calling user's credential at run
      // time. Don't pass `tokenEnv` — `gh` reads `GH_TOKEN`
      // automatically and the runtime sets it to the per-user token.
      allowed: ["api", "issue list", "pr list"],
    }),
  ],
});

Updating the gh version later #

Re-run the bootstrap script with a different GH_CLI_VERSION. The SHA gets recomputed; the row updates atomically. Already-connected user credentials are unaffected (they don't depend on the binary version).

Limitations (v1) #

  • Linux-x64 only — the runtime image is alpine.
  • Tarballs only — package raw binaries as tar -czf before registration.
  • One exec per tool_use — no streaming stdout to the model mid-call.
  • No auto-refresh of OAuth tokens. Refresh tokens are persisted but the worker that consumes them is tracked in #245; expired tokens require a manual reconnect today.

Related #

  • Persisted conversations — the streaming chat pane narrates CLI tool calls inline.
  • Magic-link sign-in — same encrypted-token-in-env pattern on the agent side.
  • Webhooks — agent runs that fork these binaries fire agent_run.completed / agent_run.failed like any other run.