
Unix Power Tools Every DevOps Engineer Should Know

Karandeep Singh
• 10 minute read

Summary

Learn Unix commands that actually matter in production: pv for progress bars, xargs for parallel execution, watch for live monitoring, tee for split output, column for readable tables. Then build a Go log watcher using the same patterns.

Most Unix command lists are full of novelty tools like cowsay and cmatrix. Fun, but useless in production. The commands that actually save you at 2am during an incident are the ones nobody teaches: pv, xargs, watch, tee, and column.

We’ll go through each one with real production scenarios, then build a Go tool that implements the most useful pattern: watching and processing log files in real time.

Prerequisites

  • A Linux terminal (native, WSL, or SSH to a server)
  • Go 1.21+ installed (for Steps 6-8)

Step 1: Progress Bars for Long Operations (pv)

What: Add a progress bar to any pipe.

Why: You’re copying a 10GB database dump and have no idea how long it’ll take. cp shows nothing. pv (pipe viewer) shows speed, elapsed time, ETA, and a progress bar.

Install it:

sudo apt install pv   # Debian/Ubuntu
sudo yum install pv   # RHEL/CentOS

Use it anywhere you’d use cat:

pv database-dump.sql | mysql -u root mydb

Expected output:

2.35GiB 0:01:42 [23.5MiB/s] [========>                ] 23% ETA 0:05:38

You can also use it between any two piped commands:

tar cf - /var/log/ | pv -s $(du -sb /var/log/ | awk '{print $1}') | gzip > logs.tar.gz

The -s flag tells pv the total size so it can show a percentage. Without it, you get speed and total transferred but no ETA.
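
pv only knows the total size when it can stat its input; with a compressed stream it can't, which is where -s earns its keep. A sketch (dump.sql.gz and the 10g uncompressed size are hypothetical):

# No size hint: speed and bytes transferred, but no percentage or ETA
gunzip -c dump.sql.gz | pv | mysql -u root mydb

# With a size hint (-s accepts k/m/g suffixes): percentage and ETA appear
gunzip -c dump.sql.gz | pv -s 10g | mysql -u root mydb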

Where this matters in DevOps: database restores, large file transfers, image builds, anything where “is it stuck or just slow?” is the question.

Step 2: Parallel Execution (xargs)

What: Run commands in parallel across multiple inputs.

Why: You need to delete 5,000 old Docker images, curl 200 health endpoints, or restart 50 services. Doing them one at a time is slow. xargs -P runs them in parallel.

Delete old Docker images, one at a time (slow):

docker images -q --filter "before=myapp:latest" | xargs docker rmi

Delete in parallel, 10 at a time (fast):

docker images -q --filter "before=myapp:latest" | xargs -P 10 -n 1 docker rmi

-P 10 runs 10 processes in parallel. -n 1 passes one image ID per command. The difference on 500 images is 5 minutes vs 30 seconds.
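
If you want to convince yourself -P is doing what it claims, here is a throwaway sketch using sleep as a stand-in for real work:

# Sequential: ten 1-second jobs take about 10 seconds
time (seq 1 10 | xargs -I {} sh -c 'sleep 1; echo {}')

# Parallel: the same ten jobs finish in about 1 second
time (seq 1 10 | xargs -P 10 -I {} sh -c 'sleep 1; echo {}')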

Health check 50 services in parallel:

cat endpoints.txt | xargs -P 20 -I {} curl -s -o /dev/null -w "{}: %{http_code}\n" {}

-I {} replaces {} with each line from the input. -P 20 runs 20 curls at once.

Expected output:

https://auth-api.internal:8080/health: 200
https://payment-svc.internal:8080/health: 200
https://user-svc.internal:8080/health: 503

You just built a parallel health checker in one line.

Step 3: Live Monitoring (watch)

What: Repeat any command every N seconds and show the output.

Why: You’re waiting for pods to come up, connections to drain, or disk space to free up. Instead of running the same command over and over, watch does it for you.

Watch Kubernetes pods:

watch -n 2 kubectl get pods

Refreshes every 2 seconds. Press Ctrl+C to stop.

Watch with differences highlighted:

watch -d -n 5 'df -h | grep /dev/sda'

-d highlights what changed between refreshes. Instantly see when disk usage moves.

Watch Docker container count:

watch -n 1 'docker ps -q | wc -l'

During a deploy, you see containers going from 10 → 5 → 0 → 5 → 10 as old ones drain and new ones start.
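
watch can also be the waiting part of a script. If your build supports it (procps-ng does), -g exits as soon as the command's output changes. A sketch:

# Block until the pod list changes, then carry on
watch -g -n 2 'kubectl get pods' && echo "pods changed"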

Step 4: Split Output Two Ways (tee)

What: Send output to both a file and stdout at the same time.

Why: You’re running a deploy script and want to see the output live AND save it to a log file. Without tee, you pick one.

./deploy.sh 2>&1 | tee deploy-$(date +%Y%m%d).log

2>&1 merges stderr into stdout so both go to the file. You see everything in real time and have a log file for later.
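
To see why 2>&1 matters, compare what tee captures with and without it:

# Without the redirect, stderr bypasses the pipe: errors reach your
# screen but never land in the log file
./deploy.sh | tee deploy.log

# With it, stdout and stderr are both shown and both logged
./deploy.sh 2>&1 | tee deploy.log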

Append instead of overwrite:

./deploy.sh 2>&1 | tee -a deploy.log

Chain with other commands:

kubectl logs -f my-pod | tee pod.log | grep ERROR

This saves the full log to pod.log while only showing ERROR lines on screen. Three things happening in one pipe: capture everything, filter for display, save for later.

Step 5: Format Output as Tables (column)

What: Align messy output into readable columns.

Why: Command output is often hard to read because fields don’t align. column -t fixes that instantly.

Before:

echo -e "service status port\nauth-api healthy 8080\npayment-svc degraded 8081\nuser-svc healthy 8082"
service status port
auth-api healthy 8080
payment-svc degraded 8081
user-svc healthy 8082

After:

echo -e "service status port\nauth-api healthy 8080\npayment-svc degraded 8081\nuser-svc healthy 8082" | column -t
service      status    port
auth-api     healthy   8080
payment-svc  degraded  8081
user-svc     healthy   8082

Use with any command:

docker ps --format "table {{.Names}}\t{{.Status}}\t{{.Ports}}" | column -t
cat /etc/passwd | awk -F: '{print $1, $3, $7}' | column -t
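
It handles delimited data too. By default column -t splits on whitespace; -s sets a custom separator (a sketch with inline CSV):

printf 'name,cpu,mem\napi,250m,512Mi\nworker,1,2Gi\n' | column -s ',' -t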

Small tool, big readability improvement in scripts and reports.

Step 6: Build a Log Watcher in Go (The pv + watch Pattern)

What: Build a Go tool that watches a log file in real time, counts lines per second, and shows a summary, combining the pv (progress) and watch (live refresh) patterns.

Why: The Unix commands above work great individually. But when you need custom logic (filtering, counting patterns, or alerting) you need code. Go is perfect for this because it handles file I/O and concurrency well.

Create your project:

mkdir go-logwatch && cd go-logwatch
go mod init go-logwatch

main.go

package main

import (
	"bufio"
	"fmt"
	"io"
	"log"
	"os"
	"time"
)

func main() {
	if len(os.Args) < 2 {
		fmt.Println("usage: go-logwatch <file>")
		os.Exit(1)
	}

	file, err := os.Open(os.Args[1])
	if err != nil {
		log.Fatal(err)
	}
	defer file.Close()

	// Seek to end of file (like tail -f)
	file.Seek(0, io.SeekEnd)

	scanner := bufio.NewScanner(file)
	lines := 0

	for {
		for scanner.Scan() {
			line := scanner.Text()
			lines++
			fmt.Printf("[%d] %s\n", lines, line)
		}
		time.Sleep(100 * time.Millisecond)
	}
}

This is meant to read from the end of a file and print new lines as they appear, like tail -f. But there's a problem: bufio.Scanner treats EOF as final. Once it hits the end of the file, Scan() returns false on every subsequent call, and the loop just sleeps forever.

Run it to see the problem:

go run main.go /var/log/syslog

It prints nothing, even when new log lines appear. The scanner is stuck at EOF.
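
You can isolate the failure in a few lines. A minimal sketch (the file path is hypothetical; append a line to it while the program waits):

package main

import (
	"bufio"
	"fmt"
	"os"
)

func main() {
	f, err := os.Open("/tmp/test.log") // hypothetical test file
	if err != nil {
		panic(err)
	}
	defer f.Close()

	scanner := bufio.NewScanner(f)
	for scanner.Scan() {
	} // drain to EOF; the scanner's done state is now set

	fmt.Println("append a line to /tmp/test.log, then press Enter")
	bufio.NewReader(os.Stdin).ReadString('\n')

	// Still false: the scanner never re-checks the file after EOF
	fmt.Println("Scan() after EOF:", scanner.Scan())
}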

Step 7: Fix the EOF Problem

What: Make the log watcher actually detect new lines after EOF.

Why: bufio.Scanner caches the EOF state. We need to re-read from the last position. The fix is to track the file offset and re-open or re-seek.

main.go — updated:

package main

import (
	"bufio"
	"fmt"
	"io"
	"log"
	"os"
	"time"
)

func main() {
	if len(os.Args) < 2 {
		fmt.Println("usage: go-logwatch <file>")
		os.Exit(1)
	}
	filename := os.Args[1]

	file, err := os.Open(filename)
	if err != nil {
		log.Fatal(err)
	}
	defer file.Close()

	// Start at end of file
	offset, _ := file.Seek(0, io.SeekEnd)
	lines := 0

	for {
		// Seek to where we left off
		file.Seek(offset, io.SeekStart)
		scanner := bufio.NewScanner(file)

		for scanner.Scan() {
			line := scanner.Text()
			lines++
			fmt.Printf("[%d] %s\n", lines, line)
		}

		// Save current position for next iteration
		offset, _ = file.Seek(0, io.SeekCurrent)
		time.Sleep(200 * time.Millisecond)
	}
}

Now we create a new scanner each iteration and seek to the last known offset. New lines after EOF get picked up on the next loop. (One caveat: if a writer has flushed a partial line with no trailing newline yet, the scanner returns the fragment as a full line, and the remainder shows up as a separate line on the next pass.)

Test it — open two terminals:

# Terminal 1: Watch a test file
go run main.go /tmp/test.log

# Terminal 2: Append lines
echo "deploy started" >> /tmp/test.log
echo "building image" >> /tmp/test.log
echo "deploy complete" >> /tmp/test.log

Expected output (Terminal 1):

[1] deploy started
[2] building image
[3] deploy complete

It works. But there’s no summary — we can’t see lines per second or error counts. Let’s add that.

Step 8: Add Live Statistics (The watch Pattern)

What: Show a live stats line that updates every second — total lines, lines/sec, and error count.

Why: This is the watch and pv pattern combined: a stats line that refreshes in place, plus error lines surfaced the moment they appear. Regular lines are counted rather than printed, so the dashboard stays readable.

main.go — updated:

package main

import (
	"bufio"
	"fmt"
	"io"
	"log"
	"os"
	"strings"
	"sync/atomic"
	"time"
)

var (
	totalLines  atomic.Int64
	errorCount  atomic.Int64
	recentLines atomic.Int64
)

func main() {
	if len(os.Args) < 2 {
		fmt.Println("usage: go-logwatch <file>")
		os.Exit(1)
	}
	filename := os.Args[1]

	file, err := os.Open(filename)
	if err != nil {
		log.Fatal(err)
	}
	defer file.Close()

	offset, _ := file.Seek(0, io.SeekEnd)

	// Stats goroutine — prints summary every second
	go func() {
		for {
			time.Sleep(1 * time.Second)
			recent := recentLines.Swap(0)
			fmt.Printf("\r\033[K[stats] total=%d errors=%d rate=%d lines/sec",
				totalLines.Load(), errorCount.Load(), recent)
		}
	}()

	fmt.Printf("watching %s (Ctrl+C to stop)\n", filename)

	for {
		file.Seek(offset, io.SeekStart)
		scanner := bufio.NewScanner(file)

		for scanner.Scan() {
			line := scanner.Text()
			totalLines.Add(1)
			recentLines.Add(1)

			// Count errors
			lower := strings.ToLower(line)
			if strings.Contains(lower, "error") || strings.Contains(lower, "fatal") {
				errorCount.Add(1)
				fmt.Printf("\n\033[31m[ERROR] %s\033[0m\n", line)
			}
		}

		offset, _ = file.Seek(0, io.SeekCurrent)
		time.Sleep(200 * time.Millisecond)
	}
}

The stats goroutine runs independently, printing a status line every second using \r\033[K to overwrite the previous line (like pv does). Error lines print in red using ANSI codes. We use atomic.Int64 for thread-safe counters between the reader and stats goroutines.
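
The escape-sequence trick is easy to test in isolation. A self-contained sketch (the message and loop bound are made up):

package main

import (
	"fmt"
	"time"
)

func main() {
	// \r returns the cursor to column 0 and \033[K erases to the end
	// of the line, so each Printf overwrites the previous status line
	// instead of scrolling (the same trick pv uses).
	for i := 1; i <= 5; i++ {
		fmt.Printf("\r\033[Kprogress: %d/5", i)
		time.Sleep(500 * time.Millisecond)
	}
	fmt.Println()
}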

Test with a simulated log stream:

# Terminal 1
go run main.go /tmp/test.log

# Terminal 2: Generate log lines
for i in $(seq 1 100); do
  echo "$(date +%T) request processed $i" >> /tmp/test.log
  [ $((i % 10)) -eq 0 ] && echo "$(date +%T) ERROR: connection timeout" >> /tmp/test.log
  sleep 0.05
done

Expected output (Terminal 1):

watching /tmp/test.log (Ctrl+C to stop)

[ERROR] 14:30:05 ERROR: connection timeout

[ERROR] 14:30:06 ERROR: connection timeout
[stats] total=110 errors=10 rate=22 lines/sec

You now have a custom log watcher that combines tail -f (follow new lines), pv (live throughput stats), grep (error highlighting), and watch (periodic refresh) — all in about 60 lines of Go.

What We Built

Starting from basic Unix commands, we worked up to a custom tool:

  1. pv — progress bars for long operations (database restores, file transfers)
  2. xargs -P — parallel execution (health checks, bulk deletes, mass operations)
  3. watch — live monitoring (pods, disk, containers)
  4. tee — split output to file and screen simultaneously
  5. column — instant table formatting for any output
  6. Go log watcher — combined the patterns: tail -f + pv stats + grep filtering

These aren’t novelty commands. They’re the tools you reach for during incidents, deploys, and debugging sessions.

Cheat Sheet

Progress bar on any pipe:

pv file.sql | mysql mydb
tar cf - /data | pv -s $(du -sb /data | awk '{print $1}') | gzip > data.tar.gz

Parallel execution:

cat urls.txt | xargs -P 20 -I {} curl -s -o /dev/null -w "{}: %{http_code}\n" {}
find . -name "*.log" -mtime +30 -print0 | xargs -0 -P 10 rm

Live monitoring:

watch -d -n 2 kubectl get pods
watch -n 1 'docker ps -q | wc -l'

Split output:

./script.sh 2>&1 | tee output.log
kubectl logs -f pod | tee full.log | grep ERROR

Table formatting:

some-command | column -t
docker ps --format "{{.Names}}\t{{.Status}}" | column -t

Key rules to remember:

  • pv works anywhere cat works — just replace cat with pv
  • xargs -P N runs N processes in parallel — use -n 1 for one argument per process
  • watch -d highlights changes between refreshes — essential during deploys
  • tee -a appends instead of overwriting — use for persistent logs
  • column -t auto-detects whitespace delimiters — add -s ',' for CSV
  • In Go, bufio.Scanner caches EOF — create a new scanner each iteration to detect new lines
  • Use atomic.Int64 when sharing counters between goroutines — not regular ints

Question

What's your go-to Unix one-liner for production debugging? The one you always reach for first during an incident?
