Octocode is built for agent workflows where the context window can fill with secrets, tokens, and untrusted paths. Its core security principle:
Every byte that reaches the model is scanned and redacted first. Secrets are stripped on the way in (tool inputs) and on the way out (tool results) — they never reach the LLM or error messages.
Security enforcement is centralized in @octocodeai/octocode-engine: native Rust scanning/minification plus the engine's TypeScript validation wrappers, registries, and path/command guards. The same package is used by the MCP server and the CLI.
Every tool call — over MCP or CLI — follows the same security-first path:
client args
→ validate + sanitize INPUTS (withSecurityValidation → ContentSanitizer.validateInputParameters)
→ run tool (GitHub API / local FS / LSP)
→ sanitize FETCHED CONTENT on read (ContentSanitizer.sanitizeContent per file / response)
→ mask OUTPUT at the boundary (sanitizeCallToolResult / ContentBuilder → ContentSanitizer)
→ compact YAML/JSON result + hints[]
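The flow above can be pictured as a wrapper around each tool handler. The following is a minimal sketch, not the engine's implementation: `withBoundarySanitization`, `Handler`, and `Sanitize` are hypothetical names, the handler is synchronous for brevity, and only top-level string arguments are sanitized.

```typescript
// Hypothetical types for illustration; the real engine types are not shown in this document.
type Args = Record<string, unknown>;
type Sanitize = (text: string) => string;
type Handler = (args: Args) => string;

// Wraps a tool handler so that inputs are sanitized before the tool runs
// and the returned text is sanitized again before it leaves the boundary.
function withBoundarySanitization(handler: Handler, sanitize: Sanitize): Handler {
  return (args: Args): string => {
    const cleanArgs: Args = {};
    for (const key of Object.keys(args)) {
      const value = args[key];
      // Simplification: only top-level strings are sanitized in this sketch.
      cleanArgs[key] = typeof value === "string" ? sanitize(value) : value;
    }
    const result = handler(cleanArgs);
    return sanitize(result);
  };
}
```

Because the same sanitizer runs on both sides of the handler, a value that the input stage misses can still be caught on the way out.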
Redaction happens at three points, not one:
| Stage | Where | What it protects against |
|---|---|---|
| Input | `withSecurityValidation` wraps every tool handler | A secret pasted into a query argument being echoed back |
| Content | Each reader (`localGetFileContent`, ripgrep, structural search, binary inspect, find, view-structure, GitHub code/file fetch, npm) sanitizes content as it is read | A `.env`, key file, or repo file with embedded credentials being surfaced verbatim |
| Output | `callToolResult` scans and masks every returned text item | Any secret that slipped through earlier stages reaching the model |
When content contains secrets, they are masked and the result carries a warning, e.g. `Secrets detected and redacted: <types>`, so the agent knows redaction occurred rather than silently seeing altered text.
The scanner (RegexSet-based, compiled in Rust, with a TypeScript fallback from the same canonical pattern list) covers 300+ provider-specific credential patterns plus generic high-risk formats:
- Cloud: AWS (access key ID, secret access key, session token, ARNs, account IDs), Azure (AD client secret, storage/Cosmos/Service Bus connection strings, subscription IDs, OpenAI), GCP / Google API keys, Alibaba Cloud.
- AI providers: OpenAI, Anthropic, Azure OpenAI, Amazon Bedrock, AI21, AssemblyAI, and more.
- SaaS / dev tools: GitHub (PAT, fine-grained, OAuth, app), Slack, Stripe, npm, Atlassian, Asana, Airtable, Algolia, Auth0, Adyen, 1Password, Artifactory, and many others.
- Generic / structural: JWTs, PEM / private keys, SSH keys, bearer tokens, passwords in URLs, database connection strings (Postgres, MySQL, MongoDB, Redis, CouchDB, Elasticsearch), .NET connection strings, and high-entropy strings.
Detected values are replaced with a masked placeholder; the original is never returned.
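As a rough illustration of pattern-based masking with a warning, here is a minimal sketch. It assumes a two-entry pattern table (the widely documented AWS access key ID and classic GitHub PAT shapes) in place of the engine's canonical list, and the `[REDACTED-...]` placeholder format and the `maskSecrets` name are hypothetical.

```typescript
// Hypothetical, tiny stand-in for the engine's canonical pattern list.
interface SecretPattern {
  type: string;
  regex: RegExp;
}

const PATTERNS: SecretPattern[] = [
  { type: "aws-access-key-id", regex: /\bAKIA[0-9A-Z]{16}\b/g },
  { type: "github-pat", regex: /\bghp_[A-Za-z0-9]{36}\b/g },
];

interface MaskResult {
  text: string;
  warning?: string;
}

// Replaces every match with a placeholder and reports which types were found.
function maskSecrets(input: string): MaskResult {
  const found: string[] = [];
  let text = input;
  for (const pattern of PATTERNS) {
    text = text.replace(pattern.regex, () => {
      if (found.indexOf(pattern.type) === -1) found.push(pattern.type);
      // Placeholder format is an assumption made for this sketch.
      return "[REDACTED-" + pattern.type.toUpperCase() + "]";
    });
  }
  if (found.length === 0) return { text };
  return { text, warning: "Secrets detected and redacted: " + found.join(", ") };
}
```

The warning lists types only, so the agent learns that redaction happened without the original value appearing anywhere in the result.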
Local filesystem access is bounded by a multi-layer validator before any read:
- Resolve relative input from `WORKSPACE_ROOT` / configured workspace root / cwd, then normalize it (resolve `.`/`..`, expand `~`).
- Prefix-check against the engine's allowed roots: the user's home directory by default, `ALLOWED_PATHS` entries, and roots registered by Octocode itself (using `path + separator`, so a sibling like `/repo-evil` cannot bypass `/repo`).
- Apply the ignore filter (sensitive files/dirs such as keys, `.env`, and credential stores are blocked by default).
- Resolve symlinks with `realpath`, then re-validate the real target (symlink escapes are caught).
`ALLOWED_PATHS` adds roots; an empty list does not make access unrestricted. The default validator includes HOME, while stricter embedded/test validators can opt out with `includeHomeDir: false`. MCP local tools remain disabled unless enabled by config; the CLI local surface is enabled by default but uses the same validator.
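The separator-aware prefix check is the step that stops `/repo-evil` from passing as a child of `/repo`. A minimal sketch follows, assuming both arguments are absolute POSIX paths that have already been normalized and symlink-resolved; `isInsideRoot` and `isAllowed` are hypothetical names, not the engine's API.

```typescript
// Assumes `resolvedPath` and `root` are absolute, normalized POSIX paths.
function isInsideRoot(resolvedPath: string, root: string): boolean {
  if (resolvedPath === root) return true;
  // Append the separator so "/repo" only matches "/repo/..." and not "/repo-evil".
  const prefix = root === "/" ? root : root + "/";
  return resolvedPath.slice(0, prefix.length) === prefix;
}

// A path is allowed only if it falls under at least one root;
// an empty root list therefore allows nothing.
function isAllowed(resolvedPath: string, allowedRoots: string[]): boolean {
  return allowedRoots.some((root) => isInsideRoot(resolvedPath, root));
}
```

A plain string-prefix test without the appended separator would accept `/repo-evil/a.ts` for the root `/repo`, which is exactly the sibling bypass described above.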
The ignore filter rejects known secret-bearing paths wherever they appear (a blocked path returns a redacted error, never contents). It matches on both file-name patterns and directory patterns:
- Keys and certs: `*.pem`, `*.key`, `*.crt`, `*.cer`, `*.csr`, `*.p12`, `*.pfx`, `*.jks`, `*.keystore`, `*.ppk`; SSH material (`id_rsa`/`id_dsa`/`id_ecdsa`/`id_ed25519` and `.pub` variants, `*_rsa`, `*_ed25519`, `authorized_keys`, `known_hosts`, `.ssh/`, `.ssh/config`).
- Credentials and tokens: `.env` and `.env.*`, `.netrc`/`_netrc`, `.npmrc`, `.pypirc`, `.dockercfg`, `.pgpass`, `.my.cnf`, `.git-credentials`, `.htpasswd`, `*_token`/`.token`/`access_token`/`refresh_token`/`bearer_token`, `client_secret*.json`, `oauth2*.json`, `auth.json`, `*service-account*.json`, `application_default_credentials.json`, `.npm-token`, `.slack-token`, `.github-token`, vault pass files.
- Cloud and infra: `.aws/` (`credentials`, `config`), `.azure/`, `.config/gcloud/`, `.kube/` / `kubeconfig`, `.docker/`, `.terraform/`, `*.tfstate`, `*.tfvars`, `.s3cfg`.
- OS and application secret stores: `.git/`, `secrets/`, `private/`, `.password-store/`; browser login data and key DBs (Chrome/Chromium, Firefox), OS keychains (`Library/Keychains/`, `*.keychain`), password managers (`*.kdbx`, `*.kdb`, 1Password vaults); shell history (`.bash_history`, `.zsh_history`, `.*_history`, DB shell histories); crypto wallets (`wallet.dat`, `.bitcoin/`, `.ethereum/`, `.electrum/`); OS account stores (`shadow`, `SAM`, `NTUSER.DAT`); core dumps and DB dumps.
- Misc app secrets: `wp-config.php`, `google-services.json`, `GoogleService-Info.plist`, `*.mobileprovision`, `.idea/dataSources.xml`, Maven `settings.xml`, `gradle.properties`, `*.ovpn`, `wireguard.conf`, `secrets.yml`, `master.key`, and `PASSWORDS.md`/`SECRETS.md`/`CREDENTIALS.md`.
The canonical lists are `IGNORED_FILE_PATTERNS` and `IGNORED_PATH_PATTERNS` in the engine (`src/security/filePatterns.ts`, `src/security/pathPatterns.ts`).
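To show how file-name and directory patterns can work together, here is a minimal sketch. It uses a heavily abbreviated subset of the patterns listed above, supports only the `*` wildcard, and the names `FILE_GLOBS`, `BLOCKED_DIRS`, `globToRegExp`, and `isIgnored` are hypothetical rather than taken from the engine.

```typescript
// Abbreviated, illustrative subsets; the engine's canonical lists are far longer.
const FILE_GLOBS = ["*.pem", "*.key", ".env", ".env.*", "id_rsa", "*.tfstate"];
const BLOCKED_DIRS = [".ssh", ".aws", ".git", "secrets"];

// Converts a simple glob (only "*" is supported here) into an anchored RegExp.
function globToRegExp(glob: string): RegExp {
  const escaped = glob
    .replace(/[.+^${}()|[\]\\?]/g, "\\$&") // escape regex metacharacters
    .replace(/\*/g, "[^/]*"); // "*" matches within a single path segment
  return new RegExp("^" + escaped + "$");
}

const FILE_REGEXES = FILE_GLOBS.map(globToRegExp);

// True when the base name matches a file pattern or any parent
// directory segment is on the blocked-directory list.
function isIgnored(filePath: string): boolean {
  const segments = filePath.split("/").filter((s) => s.length > 0);
  if (segments.length === 0) return false;
  const baseName = segments[segments.length - 1];
  const dirs = segments.slice(0, -1);
  if (dirs.some((dir) => BLOCKED_DIRS.indexOf(dir) !== -1)) return true;
  return FILE_REGEXES.some((regex) => regex.test(baseName));
}
```

Checking every directory segment, not just the immediate parent, is what lets a blocked directory apply wherever it appears in the path.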
- Normal local text search runs ripgrep in-process inside `octocode-engine`; structural search and file enumeration also run through engine/file APIs rather than user-provided shell strings.
- Where Octocode invokes external helpers, command names and arguments are allowlisted. The base command allowlist is `rg`, `grep`, `find`, `ls`, and restricted `git` (`clone`/`sparse-checkout` only); binary/archive modes register a fixed set of format and archive helpers such as `file`, `tar`, `unzip`, `7z`, and decompression readers.
- External commands run via `child_process.spawn()` with an argument array, never `exec` with a shell string, and dangerous shell metacharacters are rejected before execution.
- `include`/`exclude`/`excludeDir` are glob patterns, not paths, and cannot escape the validated search root.
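The allowlist and metacharacter checks can be sketched as a validation step that runs before anything is spawned. This is an illustration under assumptions: `validateCommand` and `CommandSpec` are hypothetical names, only the base allowlist is modeled, and the metacharacter set is a guess because this document does not list the engine's exact set.

```typescript
const BASE_COMMANDS = ["rg", "grep", "find", "ls", "git"];
const GIT_SUBCOMMANDS = ["clone", "sparse-checkout"];
// Assumed metacharacter set; the engine's actual list is not given here.
const SHELL_METACHARACTERS = /[;&|`$<>\n\r]/;

interface CommandSpec {
  command: string;
  args: string[];
}

// Throws unless the command, the git subcommand, and every argument pass the checks.
function validateCommand(command: string, args: string[]): CommandSpec {
  if (BASE_COMMANDS.indexOf(command) === -1) {
    throw new Error("Command not allowlisted: " + command);
  }
  const subcommand = args[0] ?? "";
  if (command === "git" && GIT_SUBCOMMANDS.indexOf(subcommand) === -1) {
    throw new Error("git subcommand not allowlisted: " + subcommand);
  }
  for (const arg of args) {
    if (SHELL_METACHARACTERS.test(arg)) {
      throw new Error("Argument contains a shell metacharacter");
    }
  }
  // Return a copy so later mutation of the caller's array cannot bypass validation.
  return { command, args: args.slice() };
}
```

A validated `CommandSpec` would then be passed to Node's `child_process.spawn(command, args)`, which takes the arguments as an array and does not interpret them through a shell by default.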
- GitHub auth resolves in priority order: `OCTOCODE_TOKEN` → `GH_TOKEN` → `GITHUB_TOKEN`, then encrypted on-disk Octocode OAuth credentials, then the `gh` CLI token.
- On-disk OAuth credentials are stored AES-256-GCM encrypted under `OCTOCODE_HOME`.
- Tokens are read from the environment / secure store at request time and are themselves subject to output masking; they are never echoed in results.
- See Authentication and Credentials Architecture.
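The priority order can be read as a short fall-through chain. The sketch below assumes the stored-credential and `gh` CLI lookups are supplied as callbacks, treats empty strings as unset, and uses hypothetical names (`resolveGitHubToken`, `TokenSources`, and the `source` labels); the decryption and CLI calls themselves are out of scope.

```typescript
type Env = Record<string, string | undefined>;

// Hypothetical shape: the two callbacks stand in for the encrypted
// OAuth store and the gh CLI lookup, which are not modeled here.
interface TokenSources {
  env: Env;
  readStoredOAuthToken: () => string | undefined;
  readGhCliToken: () => string | undefined;
}

interface ResolvedToken {
  token: string;
  source: string;
}

const ENV_PRIORITY = ["OCTOCODE_TOKEN", "GH_TOKEN", "GITHUB_TOKEN"];

// Returns the first available token, checking sources in priority order.
function resolveGitHubToken(sources: TokenSources): ResolvedToken | undefined {
  for (const name of ENV_PRIORITY) {
    const value = sources.env[name];
    // Assumption: an empty string counts as "not set".
    if (value) return { token: value, source: name };
  }
  const stored = sources.readStoredOAuthToken();
  if (stored) return { token: stored, source: "oauth-store" };
  const cli = sources.readGhCliToken();
  if (cli) return { token: cli, source: "gh-cli" };
  return undefined;
}
```

Because the lookups are lazy, lower-priority sources such as the `gh` CLI are never consulted when an environment variable already supplies a token.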
ContentSanitizer also bounds untrusted input shape: strings are capped (~10K chars), arrays (~100 items), and object nesting (~20 levels). Agent-facing numeric ranges (depth, context lines, limits, pagination offsets) are clamped at the schema layer.
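A shape bound of this kind can be sketched as a recursive walk. The limits below mirror the approximate figures above, but the behavior is an assumption: this sketch truncates over-long strings and arrays and throws on excessive nesting, whereas the document does not say whether the engine truncates or rejects. `boundInput` and `clamp` are hypothetical names.

```typescript
const MAX_STRING_LENGTH = 10000;
const MAX_ARRAY_ITEMS = 100;
const MAX_DEPTH = 20;

// Returns a bounded copy of `value`; throws if nesting exceeds MAX_DEPTH.
function boundInput(value: unknown, depth: number = 0): unknown {
  if (depth > MAX_DEPTH) {
    throw new Error("Input nested too deeply");
  }
  if (typeof value === "string") {
    return value.length > MAX_STRING_LENGTH ? value.slice(0, MAX_STRING_LENGTH) : value;
  }
  if (Array.isArray(value)) {
    return value.slice(0, MAX_ARRAY_ITEMS).map((item) => boundInput(item, depth + 1));
  }
  if (value !== null && typeof value === "object") {
    const source = value as Record<string, unknown>;
    const out: Record<string, unknown> = {};
    for (const key of Object.keys(source)) {
      out[key] = boundInput(source[key], depth + 1);
    }
    return out;
  }
  return value; // numbers, booleans, null, undefined pass through unchanged
}

// Clamps a numeric argument such as depth or limit into an inclusive range.
function clamp(n: number, min: number, max: number): number {
  return Math.min(max, Math.max(min, n));
}
```

Bounding shape before scanning keeps the cost of sanitization predictable even when an argument is adversarially large or deeply nested.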
Octocode protects the agent context boundary — what flows between untrusted code/content and the model. It does not replace repository secret-scanning, OS-level sandboxing, or network egress controls; run those alongside it.
To report a vulnerability, open a private advisory on the repository rather than a public issue.