CI artifacts and logs: what to keep, and what not to upload to S3 as-is
Retention, PII, test dumps, screenshots, junit.xml, and env archives: how to store CI artifacts safely without turning S3 into a secret archive.
CI artifacts help debug failures: report.html, junit.xml, screenshots, coverage, crash dumps. But if you upload everything to S3 as-is, artifacts quickly become an archive of secrets and personal data.
This guide is for teams that already store artifacts but have not formalized retention, access, and cleanup.
Log vs artifact
A log is the text output of a job. An artifact is a file the job saves after it finishes: a report, archive, dump, or screenshot. Artifacts live longer than logs and are often accessible to more people, so the risk is higher.
What is usually safe to keep
- JUnit XML reports without request/response payloads;
- coverage summaries;
- build metadata;
- sanitized screenshots;
- short failure snippets;
- SBOM and dependency reports without secrets.
What not to upload as-is
- .env, kubeconfig, .npmrc, .pypirc;
- database dumps;
- full request/response payloads;
- screenshots with user data;
- debug archives;
- Docker config and cloud credentials;
- raw logs with tokens.
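The list above can be turned into a pre-upload check. The following is a minimal sketch, not a complete scanner: the pattern list and the names `DENY_PATTERNS`, `is_blocked`, and `split_uploadable` are hypothetical, and name matching alone will not catch secrets embedded inside otherwise harmless files such as raw logs.

```python
from fnmatch import fnmatchcase
from pathlib import PurePosixPath

# Hypothetical deny list derived from the bullets above; tune it per project.
DENY_PATTERNS = [
    ".env", ".env.*",          # environment files
    "kubeconfig", "*.kubeconfig",
    ".npmrc", ".pypirc",       # package registry credentials
    "*.sql", "*.sql.gz", "*.dump",  # database dumps
    "*.har",                   # full request/response captures
    "*.docker/config.json",    # Docker client config
    "*.aws/credentials",       # cloud credentials
]

def is_blocked(path: str) -> bool:
    """Return True if the file name or the full path matches a deny pattern."""
    name = PurePosixPath(path).name
    return any(
        fnmatchcase(name, pattern) or fnmatchcase(path, pattern)
        for pattern in DENY_PATTERNS
    )

def split_uploadable(paths):
    """Split candidate paths into (allowed, blocked) lists, preserving order."""
    allowed, blocked = [], []
    for path in paths:
        (blocked if is_blocked(path) else allowed).append(path)
    return allowed, blocked
```

A CI step could call `split_uploadable` on the artifact directory listing, upload only the allowed paths, and fail or warn when the blocked list is not empty.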
Retention and access
Ask these questions for every artifact type:
- who needs it;
- how many days it is needed;
- whether it can be regenerated;
- whether it contains PII or secrets;
- whether it needs a public URL.
For many CI reports, 7–14 days is enough. Compliance artifacts may need different rules, and they should go through a separate cleanup path rather than the default one.
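One way to make retention explicit is to keep a per-prefix table and generate S3 lifecycle expiration rules from it. The sketch below assumes artifacts are grouped under key prefixes; the prefixes, the day counts, and the names `RETENTION_DAYS` and `build_lifecycle_rules` are illustrative, not a recommendation for any particular bucket.

```python
# Hypothetical prefixes and retention periods; adjust to your bucket layout.
RETENTION_DAYS = {
    "ci/reports/": 14,
    "ci/screenshots/": 7,
    "ci/coverage/": 14,
}

def build_lifecycle_rules(retention_days):
    """Build an S3 lifecycle configuration with one expiration rule per prefix."""
    rules = []
    for prefix, days in sorted(retention_days.items()):
        if days <= 0:
            raise ValueError(f"retention for {prefix!r} must be positive")
        rules.append({
            # Rule ID derived from the prefix, e.g. "expire-ci-reports".
            "ID": "expire-" + prefix.strip("/").replace("/", "-"),
            "Filter": {"Prefix": prefix},
            "Status": "Enabled",
            "Expiration": {"Days": days},
        })
    return {"Rules": rules}
```

The returned dictionary follows the shape that boto3's `put_bucket_lifecycle_configuration` accepts as its `LifecycleConfiguration` argument. Compliance artifacts would live under their own prefix with their own rule, or in a separate bucket, so the default expiration never touches them.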
Where Exlogare fits
RCA usually does not need the whole artifact bucket. A short sanitized failure context is enough: final log lines, exit code, job name, and pipeline URL. Exlogare analyzes that context and posts an explanation in the MR/PR. Start with generic ingest, and see Security for data-handling principles.
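A short sanitized failure context of that kind could be assembled as follows. This is a sketch under stated assumptions: the field names, the `tail` default, and the redaction patterns are hypothetical, the patterns are deliberately small and will miss many secret formats, and nothing here reflects Exlogare's actual ingest format.

```python
import re

# Illustrative redaction rules only; real pipelines need a broader, tested set.
REDACTIONS = [
    # KEY=value or key: value where the key name suggests a secret.
    (re.compile(
        r"(?P<keep>[\w-]*(?:token|password|passwd|secret|api[_-]?key)[\w-]*\s*[=:]\s*)\S+",
        re.IGNORECASE,
    ), r"\g<keep>[REDACTED]"),
    # HTTP bearer tokens.
    (re.compile(r"(?P<keep>authorization:\s*bearer\s+)\S+", re.IGNORECASE),
     r"\g<keep>[REDACTED]"),
]

def redact(line: str) -> str:
    """Apply every redaction rule to a single log line."""
    for pattern, replacement in REDACTIONS:
        line = pattern.sub(replacement, line)
    return line

def build_failure_context(log_text, exit_code, job_name, pipeline_url, tail=50):
    """Return the last `tail` log lines, redacted, plus basic job metadata."""
    lines = log_text.splitlines()
    start = max(len(lines) - tail, 0)
    return {
        "job_name": job_name,
        "exit_code": exit_code,
        "pipeline_url": pipeline_url,
        "log_tail": [redact(line) for line in lines[start:]],
    }
```

Keeping only the tail limits how much data leaves the CI environment, and redacting before serialization means the raw lines are never part of the payload.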
Related reading
- How to send CI logs to an external service without leaking secrets
- CI log secrets: 10 places where teams usually forget them
Checklist
- Every artifact type has a defined retention period.
- Raw dumps are not published automatically.
- PII and secrets are redacted.
- Bucket access is restricted.
- Public links are disabled by default.
- RCA receives a short sanitized context, not the entire archive.