Secrets Management¶
AI systems have a broader secrets surface than traditional applications. Beyond standard credentials (database passwords, API keys, service tokens), AI workloads introduce model endpoint credentials, data store access tokens, experiment tracker API keys, model registry credentials, and embedding service tokens.
Each secret is a potential attack vector. Compromise one credential, and the blast radius depends on what that credential can access.
The AI secrets surface¶
| Secret type | Where it is used | What compromise enables |
|---|---|---|
| Model API keys | Inference calls to cloud AI services | Unauthorised model access, cost abuse, data exfiltration through prompts |
| Model registry credentials | Pushing/pulling models from registries | Model tampering, supply chain attacks, model theft |
| Training data credentials | Accessing training datasets | Data exfiltration, data poisoning |
| Vector database credentials | RAG pipeline data access | Knowledge base poisoning, data exfiltration |
| Experiment tracker tokens | Logging experiments and metrics | IP theft (hyperparameters, results), experiment manipulation |
| Embedding service keys | Generating embeddings for RAG | Cost abuse, data exfiltration through embedding inputs |
| Pipeline service accounts | CI/CD and MLOps automation | Full pipeline compromise, model and data manipulation |
Core principles¶
Never in code, never in config files¶
Secrets must not appear in:
- Source code (obvious, but still happens)
- Configuration files committed to version control
- Container images or Dockerfiles
- IaC templates (Terraform, CloudFormation, etc.)
- Jupyter notebooks (cell outputs can contain credentials)
- Experiment tracking logs (hyperparameters sometimes include credentials)
- Model training scripts checked into repositories
Use a secrets manager¶
All secrets should be stored in a dedicated secrets manager and injected at runtime:
| Platform | Secrets manager | Notes |
|---|---|---|
| AWS | Secrets Manager, Parameter Store | Use Secrets Manager for rotation; Parameter Store for static config |
| Azure | Key Vault | Integrates with Azure AI services via managed identity |
| GCP | Secret Manager | Integrates with Vertex AI via service accounts |
| Self-hosted | HashiCorp Vault, CyberArk | Requires operational investment but provides full control |
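The inject-at-runtime pattern can be sketched as follows for AWS Secrets Manager. The secret structure and the injectable `client` parameter are illustrative choices here, not a prescribed API; the injection point exists so the function can be exercised without live AWS access.

```python
import json


def fetch_secret(secret_id: str, client=None) -> dict:
    """Fetch a secret from AWS Secrets Manager at runtime.

    `client` is injectable so the function can be tested without AWS
    access; in production it defaults to a boto3 Secrets Manager client.
    """
    if client is None:
        import boto3  # assumed to be installed in the runtime environment
        client = boto3.client("secretsmanager")
    response = client.get_secret_value(SecretId=secret_id)
    # Secrets Manager returns the payload as a string; JSON is a common
    # convention for grouping related credentials in one secret.
    return json.loads(response["SecretString"])
```

The same shape applies to Azure Key Vault or GCP Secret Manager; only the client and call names change. The key point is that the credential exists only in memory at runtime, never in code or config.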
Least privilege for every credential¶
Each credential should grant the minimum access required:
- Inference-only credentials should not allow model upload or modification
- Training job credentials should access only the specific training data needed
- Pipeline credentials should be scoped to the specific actions the pipeline performs
- Human user credentials should not be used for automated processes
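An inference-only credential from the list above can be sketched as an IAM-style policy. The action name follows AWS Bedrock conventions and is an assumption; adjust it for your platform. The point is the shape: one resource, an invoke-only action, and no upload or modification rights.

```python
def inference_only_policy(model_arn: str) -> dict:
    """Build an IAM-style policy granting invoke-only access to one model.

    Deliberately omits any create/update/delete actions, so a leaked
    inference key cannot be used to tamper with the model itself.
    """
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": ["bedrock:InvokeModel"],  # invoke only, no write actions
                "Resource": [model_arn],            # one specific model, not "*"
            }
        ],
    }
```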
Rotate regularly¶
AI secrets are often long-lived because rotation is seen as operationally complex. Long-lived credentials widen the window in which a leaked secret can be exploited undetected.
- Automate rotation where the secrets manager supports it
- Set rotation schedules appropriate to the sensitivity (model API keys quarterly at minimum)
- Test that rotation does not break running services before deploying to production
- Monitor for use of revoked credentials
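A rotation schedule is only useful if something checks it. As a minimal sketch (the secret names and periods are illustrative, not a recommendation for your environment), an overdue-rotation check might look like:

```python
from datetime import datetime, timedelta

# Illustrative schedules; tune the periods to each secret's sensitivity.
ROTATION_PERIODS = {
    "model-api-key": timedelta(days=90),          # quarterly at minimum
    "pipeline-service-account": timedelta(days=30),
}


def overdue_secrets(last_rotated: dict, now: datetime) -> list:
    """Return names of secrets past their rotation deadline.

    Secrets without an explicit schedule fall back to 90 days.
    """
    overdue = []
    for name, rotated_at in last_rotated.items():
        period = ROTATION_PERIODS.get(name, timedelta(days=90))
        if now - rotated_at > period:
            overdue.append(name)
    return overdue
```

A check like this can run in CI or as a scheduled job and page the credential's owner, rather than relying on someone remembering.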
Common pitfalls¶
Notebook credentials¶
Jupyter notebooks are the most common place AI credentials leak. Developers connect to data stores, model endpoints, and experiment trackers from notebooks, often hardcoding credentials for convenience.
Mitigations:
- Use environment variables or IAM roles, never hardcoded credentials
- Clear notebook outputs before committing to version control
- Use pre-commit hooks that scan for secrets (tools: detect-secrets, gitleaks, truffleHog)
- Configure notebook environments to pull credentials from a secrets manager
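The environment-variable approach can be as simple as the helper below (the variable name is illustrative). Failing loudly when the variable is unset is the point: it prevents the "just hardcode it for now" fallback.

```python
import os


def credential_from_env(name: str) -> str:
    """Read a credential from the environment, failing loudly if absent.

    Hardcoding the value in a notebook cell would leak it into cell
    outputs and version control; this keeps it out of the .ipynb file.
    """
    value = os.environ.get(name)
    if not value:
        raise RuntimeError(
            f"{name} is not set; export it or inject it from your secrets manager"
        )
    return value
```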
Experiment tracking leaks¶
Experiment trackers (MLflow, Weights & Biases, Neptune) log parameters, metrics, and artefacts. Credentials sometimes end up in logged parameters.
Mitigations:
- Review what is logged before enabling automatic parameter logging
- Filter sensitive keys from parameter logging
- Restrict access to experiment tracking data
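Filtering sensitive keys can be done with a simple redaction pass before parameters reach the tracker. A minimal sketch, assuming a name-based heuristic (real filtering should be reviewed against what your team actually logs):

```python
# Substrings that suggest a parameter holds a credential, not a hyperparameter.
SENSITIVE_MARKERS = ("key", "token", "password", "secret", "credential")


def redact_params(params: dict) -> dict:
    """Mask parameters whose names suggest credentials before logging."""
    return {
        k: ("***REDACTED***" if any(m in k.lower() for m in SENSITIVE_MARKERS) else v)
        for k, v in params.items()
    }
```

Calling this on the parameter dict before `log_params` (or equivalent) means a credential that slips into a config object never reaches the tracker's storage.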
Pipeline credential sprawl¶
As AI pipelines grow, credentials accumulate. Each new data source, model endpoint, or service adds a credential. Without discipline, nobody knows what credentials exist, what they access, or when they were last rotated.
Mitigations:
- Maintain an inventory of all credentials used in AI pipelines
- Assign an owner to each credential
- Audit credential usage regularly
- Remove unused credentials promptly
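The inventory can start as a simple structured record per credential. As a sketch (the fields and 90-day threshold are assumptions to adapt), stale-credential detection then becomes a one-line filter:

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional


@dataclass
class CredentialRecord:
    name: str
    owner: str                     # every credential has a named owner
    scope: str                     # what it can access
    last_used: Optional[datetime]  # None means never observed in use
    last_rotated: datetime


def stale_credentials(inventory: list, now: datetime,
                      unused_after: timedelta = timedelta(days=90)) -> list:
    """Flag credentials never used, or unused past the threshold.

    These are the candidates for prompt removal.
    """
    return [
        c for c in inventory
        if c.last_used is None or now - c.last_used > unused_after
    ]
```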
Shared credentials between environments¶
A common anti-pattern: development and production share the same model API key, the same data store credentials, or the same experiment tracker token. A compromise in development then compromises production.
Mitigation: Separate credentials per environment. No exceptions.
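One way to make environment separation structural rather than a convention is to namespace every secret path by environment, so dev and prod can never resolve to the same credential. A minimal sketch (the path scheme is illustrative):

```python
ENVIRONMENTS = ("dev", "staging", "prod")


def secret_path(env: str, name: str) -> str:
    """Build an environment-namespaced secret path.

    Rejecting unknown environments prevents a typo from silently
    resolving to the wrong credential.
    """
    if env not in ENVIRONMENTS:
        raise ValueError(f"unknown environment: {env}")
    return f"{env}/ai/{name}"
```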
Secrets scanning is not optional¶
Run secrets scanning in your CI/CD pipeline and as pre-commit hooks. Tools like gitleaks, detect-secrets, and truffleHog catch most accidental credential commits. A secret committed to version control is a secret that exists in Git history forever, even after deletion.
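The core mechanism of these scanners is pattern matching over text. A heavily simplified sketch, assuming just two illustrative rules (real tools like gitleaks ship far larger rule sets plus entropy checks):

```python
import re

# Illustrative patterns only; real scanners maintain much broader rule sets.
SECRET_PATTERNS = {
    "aws-access-key-id": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "generic-assignment": re.compile(
        r"(?i)(api_key|secret|token|password)\s*=\s*['\"][^'\"]{8,}['\"]"
    ),
}


def scan_text(text: str) -> list:
    """Return (rule_name, matched_text) pairs for likely secrets."""
    findings = []
    for rule, pattern in SECRET_PATTERNS.items():
        for match in pattern.finditer(text):
            findings.append((rule, match.group(0)))
    return findings
```

In practice, run the real tools rather than rolling your own; the sketch only shows why they are cheap to run on every commit.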