Sed in CI/CD Pipelines: Safe Patterns for GitHub Actions and Jenkins

The 2 AM Pipeline Failure That Taught Me to Distrust sed
I once worked on a Jenkins pipeline that promoted release candidates from staging to production. One step of the pipeline ran one line of bash:
sed -i 's/replicas: 2/replicas: 6/g' k8s/deployment.yaml
It worked perfectly for a long time. Then late one night, a flaky network call between Jenkins and the artifact registry caused an earlier step to fail. The team retried the job from the failed step. Jenkins, helpfully, also re-ran the sed step. The first run flipped replicas: 2 to replicas: 6. The second run searched for replicas: 2, found nothing, and exited 0. The next run (someone hit retry again) did the same.
That part was fine. The actual incident came later when a junior engineer copied the same pattern into a different stage that updated image: app:1.0 to image: app:1.1. The retry semantics were the same, but this time a hotfix had already been merged that bumped the image to app:1.2. The retry replaced app:1.0 with app:1.1 — except app:1.0 was no longer in the file. Silent no-op. The pipeline went green. Production rolled forward to the wrong image. Customer transactions started 502’ing.
The bug was simple. The fix took most of a working day, mostly because nobody believed sed could “fail successfully.”
CI is not the terminal. In the terminal you run a sed command, look at the output, and move on. In CI you have re-runs, retry-on-flaky plugins, and webhooks that re-trigger workflows. A sed command that “works” on your laptop will run multiple times across a pipeline’s lifecycle. Every single one of those runs has to do the right thing or do nothing — never something different.
This article is the playbook I wrote after that post-mortem. I’ve since used the same patterns across many Jenkins pipelines, a sprawling set of GitHub Actions services, and a fleet of CodeBuild projects. Six rules, in order of how often I see them violated.
1. Idempotency: The First Rule
Running the same sed command N times against the same file should produce the same result every time. If it doesn’t, you don’t have a deploy script — you have a coin flip.
The naive pattern that broke that pipeline:
sed -i 's/version: 1/version: 2/' app.yaml
This works once. The first run finds version: 1 and replaces it with version: 2. The second run finds nothing and exits 0. If the second run was expected to perform the substitution (because of a retry, or because the file was reset by a previous step), you’ve silently shipped the wrong config.
The fix is to anchor on the field name, not the value:
sed -i -E 's/^version: [0-9]+$/version: 2/' app.yaml
Now any integer value of version: becomes version: 2. Run it once, twice, or many times and the file always lands in the same state. An app.yaml containing version: 7 becomes version: 2 on the first run and stays at version: 2 on every retry.
The same logic applies to every “set this field to that value” pattern. Anchor the regex to the key, the line start, or both. Match anything for the value. Replace with the desired value:
# Idempotent: replicas always becomes 6
sed -i -E 's/^([[:space:]]*replicas:)[[:space:]]+[0-9]+/\1 6/' deployment.yaml
# Idempotent: image tag always becomes the value of $TAG
sed -i -E 's|^([[:space:]]*image:[[:space:]]+app:)[^[:space:]]+|\1'"$TAG"'|' deployment.yaml
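One way to check this property before a script reaches a pipeline is a small convergence test: feed the sed script several starting values and confirm each one ends at the desired state, and stays there on a second pass. The sketch below is illustrative only; the converges helper and the sample values are assumptions, not part of any existing tool.

```shell
#!/usr/bin/env bash
# Hypothetical check: for every starting line, one pass and two passes of the
# sed script must both produce the expected line.
converges() {
  local script="$1" expected="$2" start once twice
  shift 2
  for start in "$@"; do
    once=$(printf '%s\n' "$start" | sed -E "$script")
    twice=$(printf '%s\n' "$once" | sed -E "$script")
    if [ "$once" != "$expected" ] || [ "$twice" != "$expected" ]; then
      return 1
    fi
  done
}

naive='s/version: 1/version: 2/'
anchored='s/^version: [0-9]+$/version: 2/'

if converges "$anchored" 'version: 2' 'version: 1' 'version: 2' 'version: 7'; then
  echo "anchored: converges"
fi
if ! converges "$naive" 'version: 2' 'version: 1' 'version: 2' 'version: 7'; then
  echo "naive: depends on the starting value"
fi
```

The naive pattern passes when the file starts at version: 1 and fails as soon as the starting value is anything else, which is the situation a retry or a merged hotfix creates.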
If you can’t write an idempotent sed, you probably want a templating tool (Helm, Kustomize, envsubst) instead. Sed is for surgical edits, not for managing state.
2. Exit Codes and set -e Don’t Save You
Most CI shells run with set -e so the job fails on the first non-zero exit code. People assume this protects them from sed failures. It does not.
Sed returns 0 even when the substitution matched zero lines. There is no --fail-if-no-match flag. From sed’s point of view, “I read the file, applied your script, found nothing to change” is a successful run.
This is the silent failure mode that put the wrong image into production in the incident I opened with. The retry ran sed against a file where the source pattern no longer existed, sed shrugged and exited 0, the pipeline carried on.
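The behaviour is easy to reproduce locally. This is a minimal sketch using a throwaway file; the file contents are made up to mirror the incident, with the hotfix tag already in place.

```shell
#!/usr/bin/env bash
f=$(mktemp)
printf 'image: app:1.2\n' > "$f"
before=$(cat "$f")

# The pattern targets app:1.0, which is no longer in the file.
sed -i.bak 's/image: app:1\.0/image: app:1.1/' "$f"
status=$?
rm -f "$f.bak"

echo "sed exit status: $status"
if [ "$(cat "$f")" = "$before" ]; then
  echo "file unchanged"
fi
```

The exit status is 0 and the file is untouched, so set -e has nothing to react to.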
The pattern I now put at the top of every CI step that uses sed:
set -euo pipefail
CONFIG=k8s/deployment.yaml
VERSION="${GITHUB_REF_NAME#v}"
grep -qE '^[[:space:]]*version:' "$CONFIG" || {
  # Send the message to stderr so it shows up as an error in the job log
  echo "ERROR: no 'version:' field found in $CONFIG" >&2
  exit 1
}
sed -i -E "s/^([[:space:]]*version:)[[:space:]]+.*/\1 ${VERSION}/" "$CONFIG"
The grep -q runs first. It exits 0 if the field exists, 1 if it doesn’t. With set -e (or the explicit || clause) the job fails loudly with a useful message before sed ever runs. If the schema of your config file changed and the field you were targeting no longer exists, you find out at the start of the deploy step instead of three steps later when health checks start flapping.
A more defensive variant verifies the substitution actually happened:
before=$(sha256sum "$CONFIG" | cut -d' ' -f1)
sed -i -E "s/^([[:space:]]*version:)[[:space:]]+.*/\1 ${VERSION}/" "$CONFIG"
after=$(sha256sum "$CONFIG" | cut -d' ' -f1)
if [ "$before" = "$after" ]; then
  echo "ERROR: sed did not modify $CONFIG (already at version ${VERSION}?)"
  # Decide: is this a hard failure or expected on retry?
  exit 1
fi
Whether “no change” is an error depends on context. For a one-shot deploy step, it’s a bug. For an idempotent retry, it’s expected. Either way, log it, don’t swallow it.
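One possible way to encode that decision is a helper that separates three outcomes: field missing (fail), field already at the desired value (log and continue), and field changed (log). The set_field name and its messages are hypothetical, and the sketch assumes the value contains no |, &, or backslash characters.

```shell
#!/usr/bin/env bash
# Hypothetical helper: set "key: value" in a flat YAML-like file.
# Returns 1 if the key is missing; logs whether the file actually changed.
set_field() {
  local file="$1" key="$2" value="$3"
  if ! grep -qE "^[[:space:]]*${key}:" "$file"; then
    echo "ERROR: no '${key}:' field found in $file" >&2
    return 1
  fi
  sed -i.bak -E "s|^([[:space:]]*${key}:)[[:space:]]*.*|\1 ${value}|" "$file" || return 1
  if cmp -s "$file" "$file.bak"; then
    echo "NOTE: $file already has ${key}: ${value} (retry?)"
  else
    echo "UPDATED: ${key} set to ${value} in $file"
  fi
  rm -f "$file.bak"
}

cfg=$(mktemp)
printf 'name: app\nversion: 1.0.0\n' > "$cfg"
set_field "$cfg" version 1.2.3      # first run: UPDATED
set_field "$cfg" version 1.2.3      # retry: NOTE, still exit 0
set_field "$cfg" appVersion 1.2.3 || echo "missing field rejected"
```

For a one-shot deploy step, the NOTE branch could return non-zero instead; the point is that each outcome leaves a line in the log.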
3. Dry-Run Before Mutation
In the terminal, you can run sed, eyeball the output, and decide whether to add -i. In CI, the diff between “what I expected” and “what sed produced” lives in a job log nobody reads until the post-mortem.
Run the substitution without -i first, diff against the original, and emit that diff to the build log:
diff -u "$CONFIG" <(sed -E "s/^([[:space:]]*version:)[[:space:]]+.*/\1 ${VERSION}/" "$CONFIG") || true
The || true is important — diff returns 1 when files differ, which is the case you actually want. Without it, set -e will kill the job.
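A caveat with a bare || true is that it also hides exit status 2, which diff uses for real trouble such as a missing file. A slightly stricter variant, sketched here with a hypothetical show_plan helper, tolerates status 1 and fails on anything higher.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the planned change to the log.
# diff exits 0 (identical) or 1 (different) in the normal case; 2 or more means
# something went wrong, and that should still fail the step.
show_plan() {
  local script="$1" file="$2" rc=0
  diff -u "$file" <(sed -E "$script" "$file") || rc=$?
  [ "$rc" -le 1 ]
}

cfg=$(mktemp)
printf 'version: 1\n' > "$cfg"
show_plan 's/^(version:).*/\1 2/' "$cfg"
```

Because the function's last command is a test, it is safe under set -e when files differ and still stops the job when diff itself fails.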
Real example from a GitHub Actions step:
- name: Show planned config change
  env:
    TAG: ${{ github.ref_name }}   # pass the ref through env instead of inlining it into the script
  run: |
    diff -u helm/values.yaml \
      <(sed -E "s|^([[:space:]]+tag: ).*|\1${TAG}|" helm/values.yaml) \
      || true
- name: Apply config change
  env:
    TAG: ${{ github.ref_name }}
  run: |
    sed -i -E "s|^([[:space:]]+tag: ).*|\1${TAG}|" helm/values.yaml
A trivial amount of CI time, every deploy. When something goes wrong much later, the diff is right there in the log next to the failing step. I have searched for those diffs in the small hours enough times to never skip them.
For bigger scripts (multiple sed commands, ranges, deletions), write the planned changes to a temp file and diff that. The build log then shows the final state:
sed -E '
  s/^(name:).*/\1 my-app/
  s/^(replicas:).*/\1 6/
  /^debug:/d
' "$CONFIG" > "$CONFIG.planned"
diff -u "$CONFIG" "$CONFIG.planned" || true
mv "$CONFIG.planned" "$CONFIG"
That mv is worth a moment. A rename within one filesystem is atomic: readers see either the old file or the new one, never a half-written one. That is why the planned file sits next to the target rather than in /tmp, which is often a separate filesystem where mv degrades to copy-and-delete. GNU sed -i also works through a temp file and a rename, so it is less of a truncation risk than it is often assumed to be, but it gives you no chance to diff or validate between the write and the swap. The explicit version does.
4. In-Place with Backup: The GNU vs BSD Trap
This one bites every team eventually. GNU sed and BSD sed disagree on what -i means.
GNU sed (Linux runners, most Docker images):
sed -i 's/foo/bar/' file.txt # works
sed -i.bak 's/foo/bar/' file.txt # works, creates file.txt.bak
BSD sed (macOS runners, FreeBSD agents):
sed -i 's/foo/bar/' file.txt # ERROR: 's/foo/bar/' is consumed as the backup suffix, file.txt as the script
sed -i '' 's/foo/bar/' file.txt # works on BSD, FAILS on GNU
sed -i.bak 's/foo/bar/' file.txt # works on BOTH
That pipeline ran on Linux Jenkins agents. A developer on a MacBook hit the BSD error locally and “fixed” it by adding -i ''. The next CI run blew up because GNU sed parsed '' as the script argument and then tried to open the real script as an input file.
The portable form is sed -i.bak with a real backup suffix. It works on GNU sed, BSD sed, and most other implementations. If you don’t want the backup, delete it explicitly:
sed -i.bak -E "s/^(version:).*/\1 2/" config.yaml
rm -f config.yaml.bak
Two lines instead of one, but portable. The backup also makes rollback trivial:
sed -i.bak -E "s/^(version:).*/\1 ${VERSION}/" config.yaml
if ! validate_config config.yaml; then
  mv config.yaml.bak config.yaml
  exit 1
fi
rm -f config.yaml.bak
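If even a short-lived .bak file is a problem (for example, a later step globs the directory), another option is to detect the sed flavor at runtime. This is a sketch, not a recommendation over the suffix form. It relies on the assumption that GNU sed accepts --version and BSD sed rejects it, and sed_inplace is a hypothetical name.

```shell
#!/usr/bin/env bash
# Hypothetical wrapper: pick the -i form that the local sed understands.
# Assumption: `sed --version` succeeds on GNU sed and fails on BSD sed.
sed_inplace() {
  if sed --version >/dev/null 2>&1; then
    sed -i "$@"       # GNU: -i takes an optional suffix attached to the flag
  else
    sed -i '' "$@"    # BSD: -i takes a separate, possibly empty, suffix argument
  fi
}

f=$(mktemp)
printf 'foo\n' > "$f"
sed_inplace 's/foo/bar/' "$f"
cat "$f"
```

The trade-off is one more branch to keep correct; the sed -i.bak form needs no detection at all.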
If you want to avoid the in-place mode entirely (and I do, in any pipeline that touches a file the build artifact will consume), the portable wrapper writes through a temp file:
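safe_sed() {
  local script="$1"
  local file="$2"
  local tmp
  # Create the temp file next to the target so the final mv is a same-filesystem rename
  tmp=$(mktemp "${file}.XXXXXX") || return 1
  if sed -E "$script" "$file" > "$tmp"; then
    mv "$tmp" "$file"
  else
    rm -f "$tmp"   # do not leave a partial file behind
    return 1
  fi
}
safe_sed 's/^(version:).*/\1 2/' config.yaml
This avoids -i entirely, so the GNU/BSD difference disappears (it still needs a sed that accepts -E, which current GNU and BSD versions do). Because the temp file is created next to the target, the final mv is a same-filesystem rename, and no .bak file is left lying around.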
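One detail a temp-file wrapper has to handle: mktemp creates files with mode 0600, and a rename carries that mode onto the target. A possible variation, sketched below with the hypothetical name safe_sed_keep_mode, copies the original with cp -p first so the temp file inherits its mode, then overwrites only the content.

```shell
#!/usr/bin/env bash
# Hypothetical variant of a temp-file sed wrapper that keeps the file mode.
safe_sed_keep_mode() {
  local script="$1" file="$2" tmp
  tmp=$(mktemp "${file}.XXXXXX") || return 1
  # cp -p copies content and mode; the redirect then replaces the content only.
  if cp -p "$file" "$tmp" && sed -E "$script" "$file" > "$tmp"; then
    mv "$tmp" "$file"
  else
    rm -f "$tmp"
    return 1
  fi
}

f=$(mktemp)
printf 'version: 1\n' > "$f"
chmod 644 "$f"
safe_sed_keep_mode 's/^(version:).*/\1 2/' "$f"
ls -l "$f"
```

This matters most for files that another user or a container process reads later, where a silent switch to 0600 shows up as a permission error far from the sed step.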
5. Variable Interpolation Without Injection
The moment you put a CI variable inside a sed command, you’ve created two problems: a small attack surface, and a much bigger reliability one.
The naive pattern:
sed -i "s/version: .*/version: $VERSION/" config.yaml
This works when $VERSION is 1.2.3. It breaks when $VERSION is 1.2.3/main (a slash kills the substitution). It does worse things when $VERSION contains & (sed expands & to the entire matched string) or \1 (sed treats it as a backreference). And if $VERSION comes from a PR title, a Git tag from a fork, or a JSON payload — congratulations, you have a sed-injection bug.
A real example I caught in a PR review. The pipeline pulled a “release notes” string from a webhook and tried to put it into a YAML file:
sed -i "s|notes: .*|notes: $NOTES|" release.yaml
$NOTES was Bug fix & improvements. Sed expanded & to the matched string, which is the line being replaced. If the existing line was notes: TBD, the file ended up with notes: Bug fix notes: TBD improvements. The deploy still went out, but the changelog was unreadable.
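The effect can be reproduced in a few lines. The starting line notes: TBD is an assumed placeholder; any existing value behaves the same way.

```shell
#!/usr/bin/env bash
f=$(mktemp)
printf 'notes: TBD\n' > "$f"
NOTES='Bug fix & improvements'

# No -i here: print the result instead of modifying the file.
result=$(sed "s|notes: .*|notes: $NOTES|" "$f")
echo "$result"
# prints: notes: Bug fix notes: TBD improvements
```

The & in the replacement stands for the whole match, so the old line is spliced into the middle of the new one.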
Two layers of escaping matter here: the shell layer (your double-quoted string) and the sed layer (the replacement string). For values from a known-safe source like GITHUB_REF_NAME for tags matching vX.Y.Z, single-quoting and string concatenation is enough:
VERSION="${GITHUB_REF_NAME#v}"
sed -i -E 's/^(version:).*/\1 '"${VERSION}"'/' config.yaml
The single quotes around s/^(version:).*/ keep the shell from interpreting the regex specials. The double-quoted "${VERSION}" lets the shell substitute. The trailing / is back inside single quotes. This works as long as $VERSION doesn’t contain /, &, or \.
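"Known-safe" is easier to rely on when the step enforces it. A minimal guard, assuming tags are expected to look like vX.Y.Z (the require_semver name is hypothetical), rejects anything else before sed sees it.

```shell
#!/usr/bin/env bash
# Hypothetical guard: accept only MAJOR.MINOR.PATCH, with an optional leading "v".
# Prints the bare version on success, returns 1 otherwise.
require_semver() {
  local raw="$1" v="${1#v}"
  if [[ ! "$v" =~ ^[0-9]+\.[0-9]+\.[0-9]+$ ]]; then
    echo "ERROR: '$raw' is not a vX.Y.Z tag" >&2
    return 1
  fi
  printf '%s\n' "$v"
}

VERSION=$(require_semver "v2.4.1") && echo "ok: $VERSION"
require_semver "1.2.3/main" || echo "rejected"
```

With the value restricted to digits and dots, none of /, &, or \ can reach the replacement string.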
For untrusted input — anything from a PR, an external API, a user-controlled file — escape the replacement before it touches sed:
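sed_escape_replacement() {
  # Escape /, &, and \ for the right side of sed s/// (single-line values only)
  printf '%s' "$1" | sed -e 's/[\/&\\]/\\&/g'
}
NOTES_SAFE=$(sed_escape_replacement "$NOTES")
# Use / as the delimiter here: it is the one the helper escapes.
# With | as the delimiter, a literal | in $NOTES would end the replacement early.
sed -i "s/^notes: .*/notes: ${NOTES_SAFE}/" release.yaml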
When the value can be wild, switch to a tool designed for the format: yq for YAML, jq for JSON. Sed is for known-safe substitutions on known-shape files.
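Before trusting an escaping helper, it is worth round-tripping a deliberately hostile value through it. The sketch below repeats the escaping approach from this section and pairs it with the / delimiter, since / is the delimiter the helper escapes; the sample value is invented for the test.

```shell
#!/usr/bin/env bash
# Escape /, &, and backslash for the replacement side of s/.../.../
# Assumption: the value is a single line.
sed_escape_replacement() {
  printf '%s' "$1" | sed -e 's/[\/&\\]/\\&/g'
}

f=$(mktemp)
printf 'notes: TBD\n' > "$f"
NOTES='Fix a/b & c|d \ e'

NOTES_SAFE=$(sed_escape_replacement "$NOTES")
result=$(sed "s/^notes: .*/notes: ${NOTES_SAFE}/" "$f")

if [ "$result" = "notes: $NOTES" ]; then
  echo "round trip ok"
fi
```

If the consuming command uses a different delimiter, the helper has to escape that character instead, which is an easy mismatch to introduce.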
The other half of this is the pattern delimiter. The default s/.../.../ form uses / as the separator, so every slash in a URL or path has to be escaped, and one missed escape breaks the command. Switch to |:
# Bad: works only because every slash is escaped by hand
sed -i 's/url: .*/url: https:\/\/api.example.com/' config.yaml
# Good
sed -i 's|^url: .*|url: https://api.example.com|' config.yaml
I default to | for any sed command that handles paths, URLs, or arbitrary strings.
6. Pipeline-Specific Recipes
The patterns above are platform-agnostic. Here are the three I’ve shipped most often.
GitHub Actions: Helm chart version from a tag
When a Git tag like v2.4.1 triggers a release workflow, push that version into the Helm chart before packaging.
jobs:
  package:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Update Helm chart
        env:
          V: ${{ github.ref_name }}
        run: |
          set -euo pipefail
          V="${V#v}"
          grep -qE '^version:' charts/myapp/Chart.yaml
          grep -qE '^[[:space:]]+tag:' charts/myapp/values.yaml
          diff -u charts/myapp/Chart.yaml \
            <(sed -E "s/^(version:).*/\1 ${V}/" charts/myapp/Chart.yaml) || true
          sed -i.bak -E "s/^(version:).*/\1 ${V}/" charts/myapp/Chart.yaml
          sed -i.bak -E "s|^([[:space:]]+tag:).*|\1 \"${V}\"|" charts/myapp/values.yaml
          rm -f charts/myapp/Chart.yaml.bak charts/myapp/values.yaml.bak
      - name: Package
        run: helm package charts/myapp
The grep -q precheck catches schema drift. The diff line writes a “before/after” record into the workflow log. sed -i.bak with explicit cleanup keeps the step portable across ubuntu-latest and macos-latest runners.
Jenkins: per-environment Jenkinsfile parameters
A single Jenkinsfile that needs different parameters for dev, staging, and prod. Don’t fork the file — rewrite the parameter block in a setup stage.
stage('Configure') {
    steps {
        sh '''#!/usr/bin/env bash
            # The shebang selects bash: the sh step defaults to /bin/sh, where pipefail may be unsupported
            set -euo pipefail
            CONFIG=deploy/config.yaml
            grep -qE '^environment:' "$CONFIG"
            case "${ENV}" in
                dev)     REPLICAS=1; DOMAIN=dev.example.com ;;
                staging) REPLICAS=2; DOMAIN=staging.example.com ;;
                prod)    REPLICAS=6; DOMAIN=app.example.com ;;
                *) echo "unknown env: ${ENV}"; exit 1 ;;
            esac
            sed -i.bak -E \
                -e "s/^(environment:).*/\\1 ${ENV}/" \
                -e "s/^(replicas:).*/\\1 ${REPLICAS}/" \
                -e "s|^(domain:).*|\\1 ${DOMAIN}|" \
                "$CONFIG"
            rm -f "${CONFIG}.bak"
        '''
    }
}
Note the doubled backslashes — Groovy strings eat one layer before the shell sees them. The first time you use sed inside a Jenkinsfile this looks like a sed bug. It isn’t.
Retry-safety matters more in Jenkins than anywhere else, because a retry block or a restart from a failed stage re-runs individual steps, and the “Replay” button re-runs the whole thing. Every sed in a Jenkinsfile must be idempotent. If it isn’t, the second or third retry will mangle the file.
CodeBuild: rewriting appspec.yml before CodeDeploy
In a CodeBuild project that produces a CodeDeploy-compatible artifact, appspec.yml often needs the build’s image URI baked in. The buildspec uses sed in a pre_build phase.
version: 0.2
env:
  shell: bash
phases:
  pre_build:
    commands:
      - set -euo pipefail
      - SHORT_SHA="${CODEBUILD_RESOLVED_SOURCE_VERSION:0:7}"
      - IMAGE_URI="${ACCOUNT_ID}.dkr.ecr.${AWS_REGION}.amazonaws.com/myapp:${SHORT_SHA}"
      - grep -qE '^[[:space:]]*image:' deploy/appspec.yml
      - |
        diff -u deploy/appspec.yml \
          <(sed -E "s|^([[:space:]]*image:[[:space:]]+).*|\1${IMAGE_URI}|" deploy/appspec.yml) || true
      - |
        sed -i.bak -E "s|^([[:space:]]*image:[[:space:]]+).*|\1${IMAGE_URI}|" deploy/appspec.yml
        rm -f deploy/appspec.yml.bak
artifacts:
  files:
    - deploy/appspec.yml
    - deploy/scripts/**/*
The image URI contains / characters, so the sed delimiter is |. The short-SHA tag from CODEBUILD_RESOLVED_SOURCE_VERSION is stable across retries: the same commit produces the same URI, so a re-run of the build phase rewrites the file to the same content.
For more on the buildspec format around this sed call, see buildspec.yml for AWS CodeBuild: A Practical Tutorial.
The Real Story: What the Fix Looked Like
Back to that incident. The post-mortem produced four specific changes to every pipeline that used sed:
- Every sed call was preceded by a grep -q precheck. If the field wasn’t there, the pipeline failed loudly instead of silently no-op’ing.
- Every regex was anchored to the field name. No more s/version: 1/version: 2/ patterns. Always s/^(version:).*/\1 2/.
- Every step that mutated a file emitted a diff -u first. The build logs got slightly bigger. That was fine.
- Every sed -i became sed -i.bak with explicit rm of the backup. Portability and a free rollback hook in one change.
The migration took a small team a few days across the Jenkinsfiles and CodeBuild projects. A while later we had our first sed-related “incident” under the new rules: a developer renamed version to appVersion in a YAML file. The next pipeline run failed at the grep -q step with a clear error message. Total downtime: zero. Total time to root-cause: under a minute.
The other lesson I bake into every CI review: if your sed command is more than two substitutions long, or if the file nests more than two levels deep, you should probably be using yq, jq, or a real templating tool. Sed is a scalpel. The patterns above keep it from slipping. They don’t make sed the right tool for every job.
If you take one thing from this article: never run sed -i in a pipeline without a grep -q precheck and a diff log line. That alone would have caught the original incident.
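As a closing sketch, the precheck, the logged diff, and the apply step can be folded into one function. The name ci_sed and its argument order are hypothetical; treat it as a starting point under the assumptions in the comments rather than a finished tool.

```shell
#!/usr/bin/env bash
# Hypothetical wrapper: precheck, log the diff, then swap the file in.
#   $1 = grep -E pattern that must match before editing
#   $2 = sed -E script
#   $3 = target file
ci_sed() {
  local precheck="$1" script="$2" file="$3" planned rc=0
  if ! grep -qE "$precheck" "$file"; then
    echo "ERROR: '$precheck' matched nothing in $file" >&2
    return 1
  fi
  # Temp file beside the target: same filesystem, so mv is a rename.
  planned=$(mktemp "${file}.XXXXXX") || return 1
  # cp -p first so the planned file inherits the original mode.
  if ! { cp -p "$file" "$planned" && sed -E "$script" "$file" > "$planned"; }; then
    rm -f "$planned"
    return 1
  fi
  # diff exits 1 when files differ (expected); 2 or more is a real error.
  diff -u "$file" "$planned" || rc=$?
  if [ "$rc" -gt 1 ]; then
    rm -f "$planned"
    return 1
  fi
  mv "$planned" "$file"
}

cfg=$(mktemp)
printf 'name: app\nversion: 1\n' > "$cfg"
ci_sed '^version:' 's/^(version:).*/\1 2/' "$cfg"
```

A renamed field fails at the grep, a surprising edit is visible in the diff, and a killed runner leaves either the old file or the new one.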
