Most Unix command lists are full of novelty tools like cowsay and cmatrix. Fun, but useless in production. The commands that actually save you at 2am during an incident are the ones nobody teaches: pv, xargs, column, watch, and tee.
We’ll go through each one with real production scenarios, then build a Go tool that implements the most useful pattern: watching and processing log files in real time.
Prerequisites
- A Linux terminal (native, WSL, or SSH to a server)
- Go 1.21+ installed (for Steps 6-8)
Step 1: Progress Bars for Long Operations (pv)
What: Add a progress bar to any pipe.
Why: You’re copying a 10GB database dump and have no idea how long it’ll take. cp shows nothing. pv (pipe viewer) shows speed, elapsed time, ETA, and a progress bar.
Install it:
sudo apt install pv # Debian/Ubuntu
sudo yum install pv # RHEL/CentOS
Use it anywhere you’d use cat:
pv database-dump.sql | mysql -u root mydb
Expected output:
2.35GiB 0:01:42 [23.5MiB/s] [========> ] 23% ETA 0:05:38
You can also use it between any two piped commands:
tar cf - /var/log/ | pv -s $(du -sb /var/log/ | awk '{print $1}') | gzip > logs.tar.gz
The -s flag tells pv the total size so it can show a percentage. Without it, you get speed and total transferred but no ETA.
Where this matters in DevOps: database restores, large file transfers, image builds, anything where “is it stuck or just slow?” is the question.
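pv is also useful when the destination is remote. A sketch, assuming a gzipped dump and an SSH-reachable host called db-host (both hypothetical):
pv backup.sql.gz | ssh db-host 'gunzip | mysql -u root mydb'
Because pv reads a regular file here, it already knows the total size and shows an ETA without -s. The bar tracks how much of the dump has been sent, not how far the remote import has progressed.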
Step 2: Parallel Execution (xargs)
What: Run commands in parallel across multiple inputs.
Why: You need to delete 5,000 old Docker images, curl 200 health endpoints, or restart 50 services. Doing them one at a time is slow. xargs -P runs them in parallel.
Delete old Docker images, one at a time (slow):
docker images -q --filter "before=myapp:latest" | xargs docker rmi
Delete in parallel, 10 at a time (fast):
docker images -q --filter "before=myapp:latest" | xargs -P 10 -n 1 docker rmi
-P 10 runs 10 processes in parallel. -n 1 passes one image ID per command. The difference on 500 images is 5 minutes vs 30 seconds.
Health check 50 services in parallel:
cat endpoints.txt | xargs -P 20 -I {} curl -s -o /dev/null -w "{}: %{http_code}\n" {}
-I {} replaces {} with each line from the input. -P 20 runs 20 curls at once.
Expected output:
https://auth-api.internal:8080/health: 200
https://payment-svc.internal:8080/health: 200
https://user-svc.internal:8080/health: 503
You just built a parallel health checker in one line.
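During an incident you usually only want the failures. One way, reusing the same endpoints.txt, is to filter out the 200s:
cat endpoints.txt | xargs -P 20 -I {} curl -s -o /dev/null -w "{}: %{http_code}\n" {} | grep -v ': 200$'
Now only the endpoints returning something other than 200 print, which is the short list you actually need.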
Step 3: Live Monitoring (watch)
What: Repeat any command every N seconds and show the output.
Why: You’re waiting for pods to come up, connections to drain, or disk space to free up. Instead of running the same command over and over, watch does it for you.
Watch Kubernetes pods:
watch -n 2 kubectl get pods
Refreshes every 2 seconds. Press Ctrl+C to stop.
Watch with differences highlighted:
watch -d -n 5 'df -h | grep /dev/sda'
-d highlights what changed between refreshes. Instantly see when disk usage moves.
Watch Docker container count:
watch -n 1 'docker ps -q | wc -l'
During a deploy, you see containers going from 10 → 5 → 0 → 5 → 10 as old ones drain and new ones start.
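Newer procps-ng builds of watch also support -g, which exits as soon as the output changes. A sketch for blocking a script until pod state moves (the label is hypothetical; check that your watch has -g first):
watch -g -n 5 'kubectl get pods -l app=my-app --no-headers | grep -c Running'
The command re-runs every 5 seconds and returns control the moment the count of Running pods changes, so the next line of your script can proceed.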
Step 4: Split Output Two Ways (tee)
What: Send output to both a file and stdout at the same time.
Why: You’re running a deploy script and want to see the output live AND save it to a log file. Without tee, you pick one.
./deploy.sh 2>&1 | tee deploy-$(date +%Y%m%d).log
2>&1 merges stderr into stdout so both go to the file. You see everything in real time and have a log file for later.
Append instead of overwrite:
./deploy.sh 2>&1 | tee -a deploy.log
Chain with other commands:
kubectl logs -f my-pod | tee pod.log | grep ERROR
This saves the full log to pod.log while only showing ERROR lines on screen. Three things happening in one pipe: capture everything, filter for display, save for later.
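tee is also the standard fix for the sudo redirect problem: in something like sudo echo "line" > /etc/protected, the redirect is performed by your unprivileged shell, so it fails. Routing the write through tee works; for example (path and content are only illustrative):
echo "10.0.0.5 build-agent" | sudo tee -a /etc/hosts > /dev/null
sudo tee -a does the privileged append, and > /dev/null keeps the duplicate copy off your screen.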
Step 5: Format Output as Tables (column)
What: Align messy output into readable columns.
Why: Command output is often hard to read because fields don’t align. column -t fixes that instantly.
Before:
echo -e "service status port\nauth-api healthy 8080\npayment-svc degraded 8081\nuser-svc healthy 8082"
service status port
auth-api healthy 8080
payment-svc degraded 8081
user-svc healthy 8082
After:
echo -e "service status port\nauth-api healthy 8080\npayment-svc degraded 8081\nuser-svc healthy 8082" | column -t
service      status    port
auth-api     healthy   8080
payment-svc  degraded  8081
user-svc     healthy   8082
Use with any command:
docker ps --format "{{.Names}}\t{{.Status}}\t{{.Ports}}" | column -t -s $'\t'
cat /etc/passwd | awk -F: '{print $1, $3, $7}' | column -t
Small tool, big readability improvement in scripts and reports.
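By default column -t splits on any run of whitespace, so comma- or colon-separated data needs a hint. A quick sketch with a hypothetical CSV file:
column -t -s ',' report.csv
The -s flag sets the input delimiter; the output is still padded with spaces into aligned columns.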
Step 6: Build a Log Watcher in Go (The pv + watch Pattern)
What: Build a Go tool that watches a log file in real time, counts lines per second, and shows a summary, combining the pv (progress) and watch (live refresh) patterns.
Why: The Unix commands above work great individually. But when you need custom logic (filtering, counting patterns, or alerting) you need code. Go is perfect for this because it handles file I/O and concurrency well.
Create your project:
mkdir go-logwatch && cd go-logwatch
go mod init go-logwatch
main.go
package main

import (
    "bufio"
    "fmt"
    "io"
    "log"
    "os"
    "time"
)

func main() {
    if len(os.Args) < 2 {
        fmt.Println("usage: go-logwatch <file>")
        os.Exit(1)
    }

    file, err := os.Open(os.Args[1])
    if err != nil {
        log.Fatal(err)
    }
    defer file.Close()

    // Seek to end of file (like tail -f)
    file.Seek(0, io.SeekEnd)

    scanner := bufio.NewScanner(file)
    lines := 0

    for {
        for scanner.Scan() {
            line := scanner.Text()
            lines++
            fmt.Printf("[%d] %s\n", lines, line)
        }
        time.Sleep(100 * time.Millisecond)
    }
}
This reads from the end of a file and prints new lines as they appear, like tail -f. But there’s a problem: bufio.Scanner doesn’t re-read the file after reaching EOF. Once it hits the end, Scan() returns false and the loop just sleeps forever.
Run it to see the problem:
go run main.go /var/log/syslog
It prints nothing, even when new log lines appear. The scanner is stuck at EOF.
Step 7: Fix the EOF Problem
What: Make the log watcher actually detect new lines after EOF.
Why: bufio.Scanner caches the EOF state. We need to re-read from the last position. The fix is to track the file offset and re-open or re-seek.
main.go — updated:
package main

import (
    "bufio"
    "fmt"
    "io"
    "log"
    "os"
    "time"
)

func main() {
    if len(os.Args) < 2 {
        fmt.Println("usage: go-logwatch <file>")
        os.Exit(1)
    }

    filename := os.Args[1]
    file, err := os.Open(filename)
    if err != nil {
        log.Fatal(err)
    }
    defer file.Close()

    // Start at end of file
    offset, _ := file.Seek(0, io.SeekEnd)
    lines := 0

    for {
        // Seek to where we left off
        file.Seek(offset, io.SeekStart)
        scanner := bufio.NewScanner(file)
        for scanner.Scan() {
            line := scanner.Text()
            lines++
            fmt.Printf("[%d] %s\n", lines, line)
        }
        // Save current position for next iteration
        offset, _ = file.Seek(0, io.SeekCurrent)
        time.Sleep(200 * time.Millisecond)
    }
}
Now we create a new scanner each iteration and seek to the last known offset. New lines after EOF get picked up on the next loop.
Test it — open two terminals:
# Terminal 1: Watch a test file
go run main.go /tmp/test.log
# Terminal 2: Append lines
echo "deploy started" >> /tmp/test.log
echo "building image" >> /tmp/test.log
echo "deploy complete" >> /tmp/test.log
Expected output (Terminal 1):
[1] deploy started
[2] building image
[3] deploy complete
It works. But there’s no summary — we can’t see lines per second or error counts. Let’s add that.
Step 8: Add Live Statistics (The watch Pattern)
What: Show a live stats line that updates every second — total lines, lines/sec, and error count.
Why: This is the watch and pv pattern combined. You see both the log lines and a dashboard of what’s happening.
main.go — updated:
package main

import (
    "bufio"
    "fmt"
    "io"
    "log"
    "os"
    "strings"
    "sync/atomic"
    "time"
)

var (
    totalLines  atomic.Int64
    errorCount  atomic.Int64
    recentLines atomic.Int64
)

func main() {
    if len(os.Args) < 2 {
        fmt.Println("usage: go-logwatch <file>")
        os.Exit(1)
    }

    filename := os.Args[1]
    file, err := os.Open(filename)
    if err != nil {
        log.Fatal(err)
    }
    defer file.Close()

    offset, _ := file.Seek(0, io.SeekEnd)

    // Stats goroutine — prints summary every second
    go func() {
        for {
            time.Sleep(1 * time.Second)
            recent := recentLines.Swap(0)
            fmt.Printf("\r\033[K[stats] total=%d errors=%d rate=%d lines/sec",
                totalLines.Load(), errorCount.Load(), recent)
        }
    }()

    fmt.Printf("watching %s (Ctrl+C to stop)\n", filename)

    for {
        file.Seek(offset, io.SeekStart)
        scanner := bufio.NewScanner(file)
        for scanner.Scan() {
            line := scanner.Text()
            totalLines.Add(1)
            recentLines.Add(1)

            // Count errors
            lower := strings.ToLower(line)
            if strings.Contains(lower, "error") || strings.Contains(lower, "fatal") {
                errorCount.Add(1)
                fmt.Printf("\n\033[31m[ERROR] %s\033[0m\n", line)
            }
        }
        offset, _ = file.Seek(0, io.SeekCurrent)
        time.Sleep(200 * time.Millisecond)
    }
}
The stats goroutine runs independently, printing a status line every second using \r\033[K to overwrite the previous line (like pv does). Error lines print in red using ANSI codes. We use atomic.Int64 for thread-safe counters between the reader and stats goroutines.
Test with a simulated log stream:
# Terminal 1
go run main.go /tmp/test.log
# Terminal 2: Generate log lines
for i in $(seq 1 100); do
echo "$(date +%T) request processed $i" >> /tmp/test.log
[ $((i % 10)) -eq 0 ] && echo "$(date +%T) ERROR: connection timeout" >> /tmp/test.log
sleep 0.05
done
Expected output (Terminal 1):
watching /tmp/test.log (Ctrl+C to stop)
[ERROR] 14:30:05 ERROR: connection timeout
[ERROR] 14:30:06 ERROR: connection timeout
[stats] total=110 errors=10 rate=22 lines/sec
You now have a custom log watcher that combines tail -f (follow new lines), pv (live throughput stats), grep (error highlighting), and watch (periodic refresh) — all in about 60 lines of Go.
What We Built
Starting from basic Unix commands, we worked up to a custom tool:
- pv — progress bars for long operations (database restores, file transfers)
- xargs -P — parallel execution (health checks, bulk deletes, mass operations)
- watch — live monitoring (pods, disk, containers)
- tee — split output to file and screen simultaneously
- column — instant table formatting for any output
- Go log watcher — combined the patterns: tail -f + pv stats + grep filtering
These aren’t novelty commands. They’re the tools you reach for during incidents, deploys, and debugging sessions.
Cheat Sheet
Progress bar on any pipe:
pv file.sql | mysql mydb
tar cf - /data | pv -s $(du -sb /data | awk '{print $1}') | gzip > data.tar.gz
Parallel execution:
cat urls.txt | xargs -P 20 -I {} curl -s -o /dev/null -w "{}: %{http_code}\n" {}
find . -name "*.log" -mtime +30 | xargs -P 10 -n 50 rm
Live monitoring:
watch -d -n 2 kubectl get pods
watch -n 1 'docker ps -q | wc -l'
Split output:
./script.sh 2>&1 | tee output.log
kubectl logs -f pod | tee full.log | grep ERROR
Table formatting:
some-command | column -t
docker ps --format "{{.Names}}\t{{.Status}}" | column -t -s $'\t'
Key rules to remember:
- pv works anywhere cat works — just replace cat with pv
- xargs -P N runs N processes in parallel — use -n 1 for one argument per process
- watch -d highlights changes between refreshes — essential during deploys
- tee -a appends instead of overwriting — use for persistent logs
- column -t auto-detects whitespace delimiters — add -s ',' for CSV
- In Go, bufio.Scanner caches EOF — create a new scanner each iteration to detect new lines
- Use atomic.Int64 when sharing counters between goroutines — not regular ints
Keep Reading
- Mastering Bash: The Ultimate Guide to Command Line Productivity — go deeper on shell patterns and productivity tricks.
- Sed Cheat Sheet: 30 Essential One-Liners — more text processing power for the command line.
- Nginx Log Analysis: From grep to a Go Log Parser — apply these Unix tools to real log analysis with Go.
What's your go-to Unix one-liner for production debugging? The one you always reach for first during an incident?