/user/KayD @ karandeepsingh.ca :~$ cat nodejs-deployments-cicd.md

Deployment Automation: From SSH Scripts to a Go Deploy Tool

Karandeep Singh
• 27 minute read

Summary

Master deployment automation with Linux commands and Go. From SSH and rsync basics to building a complete deploy tool with health checks, rollbacks, and multi-server support.

Most deployment guides start with a framework or a platform. This one starts with a terminal.

You will deploy an application using SSH, rsync, and shell scripts. Then you will build a Go tool that does the same thing, but with health checks, rollbacks, and parallel execution across multiple servers.

Each step follows the same pattern. Do it with Linux commands first. Build the same thing in Go. Make a mistake. Fix it.

By the end, you will have a working deployment tool written in Go that handles real problems: failed deploys, unhealthy servers, and rollbacks under pressure.


Step 1: Deploying with SSH and rsync

Before any tool or framework, deployment is this: copy files to a server, then run a command. SSH and rsync do both.

SSH: Run a command on a remote server

SSH connects to a remote machine and runs a command. The simplest deployment check is listing what is on the server right now.

ssh user@server 'ls /opt/app'

This connects to server as user and runs ls /opt/app. You see the files on the remote machine printed to your local terminal.

If the directory does not exist, you get an error. That tells you the server is not set up yet.

ssh user@server 'mkdir -p /opt/app'

The -p flag creates the directory and any parent directories that are missing. No error if it already exists.

rsync: Copy files efficiently

rsync copies files from your local machine to the remote server. It only sends files that changed, which makes it fast for repeated deploys.

rsync -avz --delete ./build/ user@server:/opt/app/

Here is what each flag does:

  • -a: archive mode. Preserves permissions, timestamps, symlinks, and directory structure.
  • -v: verbose. Shows each file being transferred.
  • -z: compress data during transfer. Saves bandwidth.
  • --delete: remove files on the server that no longer exist locally. Keeps the remote directory in sync.

The trailing slash on ./build/ matters. With the slash, rsync copies the contents of build/ into /opt/app/. Without it, rsync creates /opt/app/build/ and puts files inside that.
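
To make the difference concrete, compare the two forms:

# Contents of build/ land directly in /opt/app/
rsync -avz ./build/ user@server:/opt/app/

# Creates /opt/app/build/ and copies into it
rsync -avz ./build user@server:/opt/app/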

Preview before you deploy

Before pushing files to production, preview what rsync will do.

rsync -avz --delete --dry-run ./build/ user@server:/opt/app/

The --dry-run flag shows what would happen without making any changes. You see which files would be added, updated, or deleted. Run this every time before a real deploy.

Build it in Go

Now build the same thing in Go. This function runs an SSH command on a remote server using os/exec.

package main

import (
	"fmt"
	"os"
	"os/exec"
)

func runSSH(host, command string) (string, error) {
	cmd := exec.Command("ssh", host, command)
	output, err := cmd.CombinedOutput()
	return string(output), err
}

func deploy(host, localPath, remotePath string) error {
	cmd := exec.Command("rsync", "-avz", "--delete", localPath, host+":"+remotePath)
	cmd.Stdout = os.Stdout
	cmd.Stderr = os.Stderr
	return cmd.Run()
}

func main() {
	host := "user@server"

	output, err := runSSH(host, "ls /opt/app")
	if err != nil {
		fmt.Println("Remote directory check failed:", err)
		os.Exit(1)
	}
	fmt.Println("Current files on server:")
	fmt.Println(output)

	err = deploy(host, "./build/", "/opt/app/")
	if err != nil {
		fmt.Println("Deploy failed:", err)
		os.Exit(1)
	}
	fmt.Println("Deploy complete")
}

The runSSH function calls the ssh binary and captures its output. The deploy function calls rsync and streams output directly to your terminal so you can watch the transfer.

The bug: paths with spaces

Try deploying to a path with a space in it.

ssh user@server ls /opt/my app

This does not list the directory /opt/my app. SSH sees three arguments after the hostname: ls, /opt/my, and app. It joins them with spaces and hands the result to the remote shell, which re-splits it on whitespace. ls receives two separate paths, /opt/my and app, and neither is the directory you meant.

The same thing happens with any command that takes the path as an argument.

ssh user@server cat /opt/my app/config.txt

This fails. The remote shell runs cat with two arguments, /opt/my and app/config.txt. Neither path exists, so cat reports an error for both.

The fix: quote the remote command

Wrap the entire remote command in quotes so SSH passes it as one argument to the remote shell.

ssh user@server 'cat "/opt/my app/config.txt"'

The single quotes prevent your local shell from interpreting anything. The double quotes inside tell the remote shell to treat the path as one argument.

In Go, this is simpler. When you pass the command as a single string argument to exec.Command("ssh", host, command), SSH already receives it as one argument. But the remote shell still interprets it, so you still need to quote paths inside the command string.

func runSSH(host, command string) (string, error) {
	cmd := exec.Command("ssh", host, command)
	output, err := cmd.CombinedOutput()
	return string(output), err
}

// This works because the command is one string argument
output, err := runSSH("user@server", `cat "/opt/my app/config.txt"`)

Use raw string literals (backticks) in Go for commands that contain quotes. It avoids escaping issues.
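
If the remote command is assembled from variables, quoting by hand gets error-prone. A small helper using standard POSIX single-quote escaping keeps it manageable. This is a minimal sketch, not a library function, and it assumes the strings package is imported:

func shellQuote(s string) string {
	// Wrap s in single quotes for a POSIX shell, escaping any
	// embedded single quote as '\''. Minimal sketch, not a vetted library.
	return "'" + strings.ReplaceAll(s, "'", `'\''`) + "'"
}

// The path is quoted once, for the remote shell
output, err := runSSH("user@server", "cat "+shellQuote("/opt/my app/config.txt"))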

For rsync, paths with spaces need care too. Your local shell strips one layer of quotes, and older rsync versions pass the remote path through the remote shell, which splits it again. The -s (--protect-args) flag sends the path as a single argument:

rsync -avz --delete -s ./build/ "user@server:/opt/my app/"

Since rsync 3.2.4, remote arguments are protected by default, so the flag only matters on older versions.


Step 2: Pre-deploy and Post-deploy Hooks

Copying files is only part of a deploy. You also need to stop the running application, sync the new files, then start the application again. These steps happen in order. If any step fails, the rest should not run.

Linux: a basic deploy script

Here is a deploy script that stops the service, syncs files, and starts the service again.

#!/bin/bash
SERVER="user@server"
APP_PATH="/opt/app"

echo "Stopping service..."
ssh $SERVER 'systemctl stop myapp'

echo "Syncing files..."
rsync -avz --delete ./build/ $SERVER:$APP_PATH/

echo "Starting service..."
ssh $SERVER 'systemctl start myapp'

echo "Deploy complete"

This works, but there is a problem. The time between systemctl stop and systemctl start is downtime. Your application is not running during the file sync. If the sync takes 30 seconds, your users see errors for 30 seconds.

You can reduce this by syncing files first, then doing a quick restart.

#!/bin/bash
SERVER="user@server"
APP_PATH="/opt/app"

echo "Syncing files..."
rsync -avz --delete ./build/ $SERVER:$APP_PATH/

echo "Restarting service..."
ssh $SERVER 'systemctl restart myapp'

echo "Deploy complete"

The restart is a single command. The downtime is only the few seconds it takes for the process to stop and start again.
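
It is worth confirming the service actually came back. systemctl is-active prints the unit state and exits non-zero when it is not running, so the script can fail loudly:

echo "Verifying service..."
ssh $SERVER 'systemctl is-active myapp' || exit 1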

Build it in Go

Now build a deployer in Go with pre-deploy, deploy, and post-deploy hooks.

package main

import (
	"fmt"
	"os"
	"os/exec"
)

type Deployer struct {
	Host        string
	LocalPath   string
	RemotePath  string
	ServiceName string
}

func (d *Deployer) runRemote(command string) error {
	cmd := exec.Command("ssh", d.Host, command)
	cmd.Stdout = os.Stdout
	cmd.Stderr = os.Stderr
	return cmd.Run()
}

func (d *Deployer) PreDeploy() error {
	fmt.Println("Pre-deploy: stopping service")
	return d.runRemote(fmt.Sprintf("systemctl stop %s", d.ServiceName))
}

func (d *Deployer) Deploy() error {
	fmt.Println("Deploying: syncing files")
	cmd := exec.Command("rsync", "-avz", "--delete",
		d.LocalPath, d.Host+":"+d.RemotePath)
	cmd.Stdout = os.Stdout
	cmd.Stderr = os.Stderr
	return cmd.Run()
}

func (d *Deployer) PostDeploy() error {
	fmt.Println("Post-deploy: starting service")
	return d.runRemote(fmt.Sprintf("systemctl start %s", d.ServiceName))
}

func main() {
	d := &Deployer{
		Host:        "user@server",
		LocalPath:   "./build/",
		RemotePath:  "/opt/app/",
		ServiceName: "myapp",
	}

	d.PreDeploy()
	d.Deploy()
	d.PostDeploy()
}

The bug: no error checking between steps

Look at the main function. It calls PreDeploy(), Deploy(), and PostDeploy() in order. But it ignores every error. If PreDeploy fails, Deploy still runs. If Deploy fails, PostDeploy still tries to start the service with broken files.

Run this when the service does not exist.

Pre-deploy: stopping service
Failed to stop myapp.service: Unit myapp.service not loaded.
Deploying: syncing files
...files sync...
Post-deploy: starting service
Failed to start myapp.service: Unit myapp.service not loaded.

All three steps ran. The pre-deploy failed, but the deploy continued anyway.

The fix: check errors and abort on failure

func (d *Deployer) Run() error {
	if err := d.PreDeploy(); err != nil {
		return fmt.Errorf("pre-deploy failed: %w", err)
	}

	if err := d.Deploy(); err != nil {
		fmt.Println("Deploy failed, running rollback")
		d.Rollback()
		return fmt.Errorf("deploy failed: %w", err)
	}

	if err := d.PostDeploy(); err != nil {
		fmt.Println("Post-deploy failed, running rollback")
		d.Rollback()
		return fmt.Errorf("post-deploy failed: %w", err)
	}

	return nil
}

func (d *Deployer) Rollback() {
	fmt.Println("Rolling back: restarting previous version")
	err := d.runRemote(fmt.Sprintf("systemctl restart %s", d.ServiceName))
	if err != nil {
		fmt.Println("Rollback also failed:", err)
	}
}

func main() {
	d := &Deployer{
		Host:        "user@server",
		LocalPath:   "./build/",
		RemotePath:  "/opt/app/",
		ServiceName: "myapp",
	}

	if err := d.Run(); err != nil {
		fmt.Println("Deployment failed:", err)
		os.Exit(1)
	}
	fmt.Println("Deployment succeeded")
}

Now if any step fails, the pipeline stops. If the deploy or post-deploy fails, it runs a rollback. The %w verb wraps the original error so you can see exactly what went wrong.
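
Because the errors are wrapped with %w, callers can still reach the underlying failure. Since runRemote ultimately calls cmd.Run, a failed remote command surfaces as an *exec.ExitError inside the chain. A short sketch, assuming the errors package is added to the imports:

if err := d.Run(); err != nil {
	// Unwrap through the %w chain to the underlying exec error
	var exitErr *exec.ExitError
	if errors.As(err, &exitErr) {
		fmt.Println("remote command exited with code:", exitErr.ExitCode())
	}
	os.Exit(1)
}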


Step 3: Health Checks After Deploy

A deploy can succeed in every step and still leave your application broken. The files synced. The service started. But the application crashes on startup, or it starts but cannot connect to its database.

You need a health check. After the deploy, ask the application if it is actually working.

Linux: curl the health endpoint

Most applications expose a /health endpoint. Use curl to check it.

curl -sf http://server:8080/health

The flags:

  • -s: silent. No progress bar or error messages.
  • -f: fail silently on HTTP errors. Returns exit code 22 for 4xx/5xx responses instead of printing the HTML error page.

If the application is healthy, you get the response body and exit code 0. If it is not, you get exit code 22 and no output.

Retry loop

The application might take a few seconds to start. One failed health check does not mean the deploy failed. You need to retry.

for i in 1 2 3 4 5; do
    curl -sf http://server:8080/health && echo "Healthy" && break
    echo "Attempt $i failed, waiting..."
    sleep 2
done

This tries five times with two seconds between attempts. If curl succeeds, it prints “Healthy” and exits the loop. If all five attempts fail, the loop ends without the “Healthy” message.

Check just the status code

Sometimes you want the HTTP status code without the response body.

STATUS=$(curl -o /dev/null -w "%{http_code}" -s http://server:8080/health)
echo "Health check returned: $STATUS"

The -o /dev/null sends the body to nowhere. The -w "%{http_code}" prints just the status code. You can use this in a script to decide what to do based on the response.
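
For example, branch on the code and fail the script when it is not 200:

if [ "$STATUS" -eq 200 ]; then
    echo "Healthy"
else
    echo "Unhealthy, got $STATUS"
    exit 1
fi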

Build it in Go

Now build a health checker in Go.

package main

import (
	"fmt"
	"net/http"
	"time"
)

type HealthChecker struct {
	URL        string
	MaxRetries int
	Interval   time.Duration
	Timeout    time.Duration
}

func (h *HealthChecker) Check() error {
	client := &http.Client{
		Timeout: h.Timeout,
	}

	for i := 0; i < h.MaxRetries; i++ {
		resp, err := client.Get(h.URL)
		if err != nil {
			fmt.Printf("  Attempt %d: connection failed: %v\n", i+1, err)
			time.Sleep(h.Interval)
			continue
		}
		resp.Body.Close()

		if resp.StatusCode == http.StatusOK {
			fmt.Printf("  Attempt %d: healthy (status %d)\n", i+1, resp.StatusCode)
			return nil
		}

		fmt.Printf("  Attempt %d: unhealthy (status %d)\n", i+1, resp.StatusCode)
		time.Sleep(h.Interval)
	}

	return fmt.Errorf("health check failed after %d attempts", h.MaxRetries)
}

func main() {
	hc := &HealthChecker{
		URL:        "http://server:8080/health",
		MaxRetries: 5,
		Interval:   2 * time.Second,
		Timeout:    5 * time.Second,
	}

	if err := hc.Check(); err != nil {
		fmt.Println("Health check failed:", err)
	} else {
		fmt.Println("Application is healthy")
	}
}

This creates an HTTP client with a timeout, then polls the health endpoint. If it gets a 200 response, the check passes. Otherwise it retries.

The bug: trusting any 200 response

The health checker only looks at the HTTP status code. But many applications return 200 even when they are not ready. They respond with a body like this:

{"status": "starting"}

The status code is 200. The health checker says “healthy.” But the application is still initializing. It is not ready to serve traffic.

Run it against an application that is starting up.

  Attempt 1: healthy (status 200)
Application is healthy

It passed on the first try. But the application was still loading its configuration. Requests to other endpoints would fail.

The fix: parse the response body

Check the actual content of the response. Only pass the health check when the body says the application is ready.

package main

import (
	"encoding/json"
	"fmt"
	"io"
	"net/http"
	"time"
)

type HealthResponse struct {
	Status string `json:"status"`
}

type HealthChecker struct {
	URL        string
	MaxRetries int
	Interval   time.Duration
	Timeout    time.Duration
}

func (h *HealthChecker) Check() error {
	client := &http.Client{
		Timeout: h.Timeout,
	}

	for i := 0; i < h.MaxRetries; i++ {
		resp, err := client.Get(h.URL)
		if err != nil {
			fmt.Printf("  Attempt %d: connection failed: %v\n", i+1, err)
			time.Sleep(h.Interval)
			continue
		}

		body, err := io.ReadAll(resp.Body)
		resp.Body.Close()
		if err != nil {
			fmt.Printf("  Attempt %d: failed to read body: %v\n", i+1, err)
			time.Sleep(h.Interval)
			continue
		}

		if resp.StatusCode != http.StatusOK {
			fmt.Printf("  Attempt %d: unhealthy (status %d)\n", i+1, resp.StatusCode)
			time.Sleep(h.Interval)
			continue
		}

		var health HealthResponse
		if err := json.Unmarshal(body, &health); err != nil {
			fmt.Printf("  Attempt %d: invalid response body: %s\n", i+1, string(body))
			time.Sleep(h.Interval)
			continue
		}

		if health.Status == "healthy" {
			fmt.Printf("  Attempt %d: healthy\n", i+1)
			return nil
		}

		fmt.Printf("  Attempt %d: not ready (status: %s)\n", i+1, health.Status)
		time.Sleep(h.Interval)
	}

	return fmt.Errorf("health check failed after %d attempts", h.MaxRetries)
}

func main() {
	hc := &HealthChecker{
		URL:        "http://server:8080/health",
		MaxRetries: 5,
		Interval:   2 * time.Second,
		Timeout:    5 * time.Second,
	}

	if err := hc.Check(); err != nil {
		fmt.Println("Health check failed:", err)
	} else {
		fmt.Println("Application is healthy")
	}
}

Now the health checker reads the response body and parses it as JSON. It only passes when the status field is exactly "healthy". If the application returns "starting" or "degraded", the check fails and retries.

  Attempt 1: not ready (status: starting)
  Attempt 2: not ready (status: starting)
  Attempt 3: healthy
Application is healthy

This is the correct behavior. The application was not ready on the first two attempts. The health checker waited and tried again. On the third attempt, the application was ready.
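
For reference, here is a minimal sketch of the kind of /health endpoint this checker expects. The ready flag and the simulated startup delay are assumptions; in a real application, readiness means things like configuration loaded and the database reachable.

package main

import (
	"fmt"
	"net/http"
	"sync/atomic"
	"time"
)

// ready flips to true once startup work is done
var ready atomic.Bool

func main() {
	http.HandleFunc("/health", func(w http.ResponseWriter, r *http.Request) {
		w.Header().Set("Content-Type", "application/json")
		if ready.Load() {
			fmt.Fprint(w, `{"status": "healthy"}`)
		} else {
			fmt.Fprint(w, `{"status": "starting"}`)
		}
	})

	go func() {
		// Stand-in for real startup work: config, database, caches
		time.Sleep(3 * time.Second)
		ready.Store(true)
	}()

	http.ListenAndServe(":8080", nil)
}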


Step 4: Rollback on Failure

When a deploy fails, you need to go back to the previous version. Fast. The simplest way is to keep old releases around and switch between them with a symlink.

Instead of deploying directly to /opt/app, create a numbered release directory for each deploy.

RELEASE="v$(date +%Y%m%d%H%M%S)"
RELEASE_DIR="/opt/app/releases/$RELEASE"

ssh user@server "mkdir -p $RELEASE_DIR"
rsync -avz --delete ./build/ user@server:$RELEASE_DIR/

Each deploy gets its own directory. The directory name includes a timestamp, so releases sort chronologically.

Now point a symlink at the current release.

ssh user@server "ln -sfn /opt/app/releases/$RELEASE /opt/app/current"
ssh user@server "systemctl restart myapp"

The -sfn flags:

  • -s: create a symbolic link.
  • -f: remove the existing destination if it exists.
  • -n: treat the destination as a normal file if it is a symlink. Without this, ln would create a link inside the target directory instead of replacing it.

Your application’s systemd service points at /opt/app/current/. When you change the symlink, the next restart uses the new release.
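
For context, the systemd unit would look something like this. The binary name and paths are illustrative:

[Unit]
Description=myapp
After=network.target

[Service]
ExecStart=/opt/app/current/myapp
WorkingDirectory=/opt/app/current
Restart=on-failure

[Install]
WantedBy=multi-user.target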

Rollback

To roll back, point the symlink at the previous release.

ssh user@server "ln -sfn /opt/app/releases/v20250128020100 /opt/app/current"
ssh user@server "systemctl restart myapp"

This takes less than a second. No file copying. No downloading. Just a symlink change and a restart.

List the available releases to find the one you want.

ssh user@server "ls -la /opt/app/releases/"

Clean up old releases

Keep the last five releases. Delete everything else.

ssh user@server 'cd /opt/app/releases && ls -t | tail -n +6 | xargs rm -rf'

The ls -t sorts by modification time, newest first. tail -n +6 skips the first five entries and prints the rest. xargs rm -rf deletes them.
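
Before wiring this into automation, preview the list without the rm:

ssh user@server 'cd /opt/app/releases && ls -t | tail -n +6'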

Build it in Go

Now build a release manager in Go.

package main

import (
	"fmt"
	"os"
	"os/exec"
	"strings"
	"time"
)

type ReleaseManager struct {
	Host         string
	ReleasesDir  string
	CurrentLink  string
	ServiceName  string
	KeepReleases int
}

func (r *ReleaseManager) runRemote(command string) (string, error) {
	cmd := exec.Command("ssh", r.Host, command)
	output, err := cmd.CombinedOutput()
	return strings.TrimSpace(string(output)), err
}

func (r *ReleaseManager) CreateRelease(localPath string) (string, error) {
	release := fmt.Sprintf("v%s", time.Now().Format("20060102150405"))
	releaseDir := fmt.Sprintf("%s/%s", r.ReleasesDir, release)

	fmt.Printf("Creating release %s\n", release)

	_, err := r.runRemote(fmt.Sprintf("mkdir -p %s", releaseDir))
	if err != nil {
		return "", fmt.Errorf("failed to create release dir: %w", err)
	}

	cmd := exec.Command("rsync", "-avz", "--delete",
		localPath, r.Host+":"+releaseDir+"/")
	cmd.Stdout = os.Stdout
	cmd.Stderr = os.Stderr
	if err := cmd.Run(); err != nil {
		return "", fmt.Errorf("failed to sync files: %w", err)
	}

	return release, nil
}

func (r *ReleaseManager) Activate(release string) error {
	releaseDir := fmt.Sprintf("%s/%s", r.ReleasesDir, release)
	fmt.Printf("Activating release %s\n", release)

	_, err := r.runRemote(fmt.Sprintf("ln -sfn %s %s", releaseDir, r.CurrentLink))
	if err != nil {
		return fmt.Errorf("failed to update symlink: %w", err)
	}

	_, err = r.runRemote(fmt.Sprintf("systemctl restart %s", r.ServiceName))
	if err != nil {
		return fmt.Errorf("failed to restart service: %w", err)
	}

	return nil
}

func (r *ReleaseManager) Rollback() error {
	output, err := r.runRemote(fmt.Sprintf("ls -t %s", r.ReleasesDir))
	if err != nil {
		return fmt.Errorf("failed to list releases: %w", err)
	}

	releases := strings.Split(output, "\n")
	if len(releases) < 2 {
		return fmt.Errorf("no previous release to roll back to")
	}

	previous := releases[1]
	fmt.Printf("Rolling back to %s\n", previous)
	return r.Activate(previous)
}

func (r *ReleaseManager) Cleanup() error {
	command := fmt.Sprintf("cd %s && ls -t | tail -n +%d | xargs rm -rf",
		r.ReleasesDir, r.KeepReleases+1)
	_, err := r.runRemote(command)
	return err
}

func main() {
	rm := &ReleaseManager{
		Host:         "user@server",
		ReleasesDir:  "/opt/app/releases",
		CurrentLink:  "/opt/app/current",
		ServiceName:  "myapp",
		KeepReleases: 5,
	}

	release, err := rm.CreateRelease("./build/")
	if err != nil {
		fmt.Println("Failed to create release:", err)
		os.Exit(1)
	}

	if err := rm.Activate(release); err != nil {
		fmt.Println("Activation failed, rolling back:", err)
		if rbErr := rm.Rollback(); rbErr != nil {
			fmt.Println("Rollback also failed:", rbErr)
		}
		os.Exit(1)
	}

	if err := rm.Cleanup(); err != nil {
		fmt.Println("Warning: cleanup failed:", err)
	}

	fmt.Println("Deploy complete")
}

The ReleaseManager creates timestamped release directories, manages the symlink, and handles rollbacks by listing releases and switching to the previous one.

The bug: the symlink swap is not atomic

The ln -sfn command does two operations internally. It removes the old symlink, then creates the new one. If your process crashes between those two operations, there is no symlink at all. The application is down.

This is unlikely but possible. On a busy system, another process could try to read the symlink in the gap between delete and create.

The fix: rename over the symlink

Create a temporary symlink with a different name, then rename it over the old one. The rename system call on Linux is atomic on the same filesystem.

# Create temp symlink
ln -s /opt/app/releases/v20250128020100 /opt/app/current.tmp
# Atomically replace the old symlink
mv -T /opt/app/current.tmp /opt/app/current

The -T flag on mv treats the destination as a file, not a directory. Without it, mv would move current.tmp inside current/ if current is a directory.

In Go, os.Symlink and os.Rename give you the same primitives for a local swap. Here the swap happens on a remote server, so build the same sequence as a shell command and run it over SSH.

func (r *ReleaseManager) ActivateAtomic(release string) error {
	releaseDir := fmt.Sprintf("%s/%s", r.ReleasesDir, release)
	tmpLink := r.CurrentLink + ".tmp"

	fmt.Printf("Activating release %s (atomic)\n", release)

	// Create the atomic swap command
	// 1. Remove any leftover temp symlink
	// 2. Create new temp symlink
	// 3. Atomically rename it over the current symlink
	command := fmt.Sprintf(
		"rm -f %s && ln -s %s %s && mv -T %s %s",
		tmpLink, releaseDir, tmpLink, tmpLink, r.CurrentLink,
	)

	_, err := r.runRemote(command)
	if err != nil {
		return fmt.Errorf("failed to swap symlink: %w", err)
	}

	_, err = r.runRemote(fmt.Sprintf("systemctl restart %s", r.ServiceName))
	if err != nil {
		return fmt.Errorf("failed to restart service: %w", err)
	}

	return nil
}

The sequence is: remove any leftover temp link, create the new symlink under a temporary name, then atomically rename it. If the process crashes at any point, either the old symlink is still in place (safe) or the rename has completed (new version is active). There is no gap where neither exists.
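
To confirm which release is live after a swap, read the symlink target:

ssh user@server 'readlink /opt/app/current'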


Step 5: Multi-Server Parallel Deployment

Deploying to one server is straightforward. Deploying to three or ten servers needs to happen in parallel. Doing it one at a time is too slow.

Linux: parallel SSH with background processes

Run deploys in the background with & and wait for all of them.

#!/bin/bash
SERVERS="web1 web2 web3"

for SERVER in $SERVERS; do
    echo "Deploying to $SERVER..."
    ssh user@$SERVER '/opt/deploy.sh' &
done

echo "Waiting for all deploys to finish..."
wait
echo "All deploys complete"

The & at the end of each ssh command runs it in the background. The wait command pauses until all background processes finish.

The problem: which server failed?

If one server fails, the script still says “All deploys complete.” You have no idea which server had the problem.

Check exit codes individually.

#!/bin/bash
SERVERS="web1 web2 web3"
PIDS=""
FAILED=""

for SERVER in $SERVERS; do
    echo "Deploying to $SERVER..."
    ssh user@$SERVER '/opt/deploy.sh' &
    PIDS="$PIDS $!:$SERVER"
done

for ENTRY in $PIDS; do
    PID=$(echo $ENTRY | cut -d: -f1)
    SERVER=$(echo $ENTRY | cut -d: -f2)
    wait $PID
    if [ $? -ne 0 ]; then
        echo "FAILED: $SERVER"
        FAILED="$FAILED $SERVER"
    else
        echo "OK: $SERVER"
    fi
done

if [ -n "$FAILED" ]; then
    echo "Deploy failed on:$FAILED"
    exit 1
fi
echo "All deploys succeeded"

This tracks each background process ID alongside its server name. After waiting for each one, it checks the exit code and records which servers failed.

Build it in Go

Now build the same thing in Go using goroutines and channels.

package main

import (
	"fmt"
	"os"
	"os/exec"
	"sync"
	"time"
)

type ServerResult struct {
	Server   string
	Success  bool
	Error    error
	Duration time.Duration
}

type MultiDeployer struct {
	Servers    []string
	LocalPath  string
	RemotePath string
}

func (m *MultiDeployer) deployToServer(server string) ServerResult {
	start := time.Now()

	cmd := exec.Command("rsync", "-avz", "--delete",
		m.LocalPath, server+":"+m.RemotePath)
	output, err := cmd.CombinedOutput()
	if err != nil {
		return ServerResult{
			Server:   server,
			Success:  false,
			Error:    fmt.Errorf("%v: %s", err, string(output)),
			Duration: time.Since(start),
		}
	}

	// Restart the service
	cmd = exec.Command("ssh", server, "systemctl restart myapp")
	output, err = cmd.CombinedOutput()
	if err != nil {
		return ServerResult{
			Server:   server,
			Success:  false,
			Error:    fmt.Errorf("restart failed: %v: %s", err, string(output)),
			Duration: time.Since(start),
		}
	}

	return ServerResult{
		Server:   server,
		Success:  true,
		Duration: time.Since(start),
	}
}

func (m *MultiDeployer) DeployAll() []ServerResult {
	results := make([]ServerResult, len(m.Servers))
	var wg sync.WaitGroup

	for i, server := range m.Servers {
		wg.Add(1)
		go func(index int, srv string) {
			defer wg.Done()
			results[index] = m.deployToServer(srv)
		}(i, server)
	}

	wg.Wait()
	return results
}

func main() {
	md := &MultiDeployer{
		Servers:    []string{"user@web1", "user@web2", "user@web3"},
		LocalPath:  "./build/",
		RemotePath: "/opt/app/",
	}

	fmt.Println("Deploying to all servers...")
	results := md.DeployAll()

	failed := 0
	for _, r := range results {
		if r.Success {
			fmt.Printf("  OK:     %s (%v)\n", r.Server, r.Duration)
		} else {
			fmt.Printf("  FAILED: %s (%v) - %v\n", r.Server, r.Duration, r.Error)
			failed++
		}
	}

	if failed > 0 {
		fmt.Printf("\n%d of %d servers failed\n", failed, len(results))
		os.Exit(1)
	}
	fmt.Println("\nAll servers deployed successfully")
}

Each server gets its own goroutine. The sync.WaitGroup waits until all goroutines complete. Results are collected in a slice.

The bug: race condition on shared slice

Look at this version of DeployAll that has a subtle problem.

func (m *MultiDeployer) DeployAllBuggy() []ServerResult {
	var results []ServerResult
	var wg sync.WaitGroup

	for _, server := range m.Servers {
		wg.Add(1)
		go func(srv string) {
			defer wg.Done()
			result := m.deployToServer(srv)
			results = append(results, result)
		}(server)
	}

	wg.Wait()
	return results
}

Multiple goroutines call append on the same slice at the same time. The append function is not safe for concurrent use. It reads the slice length, writes to the next position, and updates the length. If two goroutines do this at the same time, one result overwrites the other.

Run this with the Go race detector.

go run -race main.go

You get output like this:

WARNING: DATA RACE
Write at 0x00c0000a4018 by goroutine 7:
  main.(*MultiDeployer).DeployAllBuggy.func1()
Previous write at 0x00c0000a4018 by goroutine 8:
  main.(*MultiDeployer).DeployAllBuggy.func1()

The race detector confirms it. Two goroutines wrote to the same memory at the same time.

The fix: use index-based assignment or channels

The earlier correct version avoids this by using index-based assignment. Each goroutine writes to its own index in a pre-allocated slice. No two goroutines write to the same index, so there is no race.

func (m *MultiDeployer) DeployAll() []ServerResult {
	results := make([]ServerResult, len(m.Servers))
	var wg sync.WaitGroup

	for i, server := range m.Servers {
		wg.Add(1)
		go func(index int, srv string) {
			defer wg.Done()
			results[index] = m.deployToServer(srv)
		}(i, server)
	}

	wg.Wait()
	return results
}

The key is make([]ServerResult, len(m.Servers)). This creates a slice with the exact size needed. Each goroutine gets its own index. No append. No race.

Another approach is to use a channel.

func (m *MultiDeployer) DeployAllChan() []ServerResult {
	ch := make(chan ServerResult, len(m.Servers))

	for _, server := range m.Servers {
		go func(srv string) {
			ch <- m.deployToServer(srv)
		}(server)
	}

	var results []ServerResult
	for range m.Servers {
		results = append(results, <-ch)
	}
	return results
}

The channel approach has only one goroutine reading from the channel (the main goroutine), so append is safe. Each deploy goroutine sends its result into the channel. The main goroutine collects all results.

Both approaches work. The index-based version preserves the order of servers. The channel version returns results in completion order.
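
One refinement: you may not want to hit every server at once. A buffered channel works as a semaphore to cap concurrency. This is a sketch; the maxParallel limit is an assumption about your rollout policy:

func (m *MultiDeployer) DeployAllLimited(maxParallel int) []ServerResult {
	results := make([]ServerResult, len(m.Servers))
	sem := make(chan struct{}, maxParallel)
	var wg sync.WaitGroup

	for i, server := range m.Servers {
		wg.Add(1)
		go func(index int, srv string) {
			defer wg.Done()
			sem <- struct{}{}        // acquire a slot
			defer func() { <-sem }() // release it when done
			results[index] = m.deployToServer(srv)
		}(i, server)
	}

	wg.Wait()
	return results
}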


Step 6: Complete Deploy Tool

Now combine everything from the previous steps into a single, complete deployment tool. This tool handles SSH deployment, release management, health checks, rollbacks, and multi-server parallel execution.

Configuration

The tool needs to know where to deploy, how to check health, and how many retries to allow.

package main

import (
	"encoding/json"
	"fmt"
	"io"
	"net/http"
	"os"
	"os/exec"
	"strings"
	"sync"
	"time"
)

type Config struct {
	Servers       []string
	LocalPath     string
	ReleasesDir   string
	CurrentLink   string
	ServiceName   string
	HealthPath    string
	MaxRetries    int
	RetryWait     time.Duration
	HealthTimeout time.Duration
	KeepReleases  int
}

Health check

The health checker from Step 3, integrated into the tool.

type HealthResponse struct {
	Status string `json:"status"`
}

func checkHealth(url string, maxRetries int, interval, timeout time.Duration) error {
	client := &http.Client{Timeout: timeout}

	for i := 0; i < maxRetries; i++ {
		resp, err := client.Get(url)
		if err != nil {
			fmt.Printf("      health attempt %d: connection failed\n", i+1)
			time.Sleep(interval)
			continue
		}

		body, err := io.ReadAll(resp.Body)
		resp.Body.Close()
		if err != nil {
			fmt.Printf("      health attempt %d: failed to read response\n", i+1)
			time.Sleep(interval)
			continue
		}

		if resp.StatusCode != http.StatusOK {
			fmt.Printf("      health attempt %d: status %d\n", i+1, resp.StatusCode)
			time.Sleep(interval)
			continue
		}

		var health HealthResponse
		if err := json.Unmarshal(body, &health); err != nil {
			fmt.Printf("      health attempt %d: invalid response\n", i+1)
			time.Sleep(interval)
			continue
		}

		if health.Status == "healthy" {
			return nil
		}
		fmt.Printf("      health attempt %d: %s\n", i+1, health.Status)
		time.Sleep(interval)
	}

	return fmt.Errorf("health check failed after %d attempts", maxRetries)
}

Remote execution

A helper to run commands on remote servers.

func runRemote(server, command string) (string, error) {
	cmd := exec.Command("ssh", server, command)
	output, err := cmd.CombinedOutput()
	return strings.TrimSpace(string(output)), err
}

Per-server deployment

Each server goes through the full cycle: create release, sync files, swap symlink, restart, health check. If any step fails, roll back.

type ServerResult struct {
	Server     string
	Success    bool
	Error      error
	Duration   time.Duration
	RolledBack bool
}

func deployToServer(server string, release string, cfg Config) ServerResult {
	start := time.Now()
	releaseDir := fmt.Sprintf("%s/%s", cfg.ReleasesDir, release)
	result := ServerResult{Server: server}

	// Step 1: Create release directory
	fmt.Printf("  [%s] creating release directory\n", server)
	_, err := runRemote(server, fmt.Sprintf("mkdir -p %s", releaseDir))
	if err != nil {
		result.Error = fmt.Errorf("mkdir failed: %w", err)
		result.Duration = time.Since(start)
		return result
	}

	// Step 2: Sync files
	fmt.Printf("  [%s] syncing files\n", server)
	cmd := exec.Command("rsync", "-az", "--delete",
		cfg.LocalPath, server+":"+releaseDir+"/")
	if output, err := cmd.CombinedOutput(); err != nil {
		result.Error = fmt.Errorf("rsync failed: %v: %s", err, string(output))
		result.Duration = time.Since(start)
		return result
	}

	// Step 3: Atomic symlink swap
	fmt.Printf("  [%s] activating release\n", server)
	tmpLink := cfg.CurrentLink + ".tmp"
	swapCmd := fmt.Sprintf(
		"rm -f %s && ln -s %s %s && mv -T %s %s",
		tmpLink, releaseDir, tmpLink, tmpLink, cfg.CurrentLink,
	)
	_, err = runRemote(server, swapCmd)
	if err != nil {
		result.Error = fmt.Errorf("symlink swap failed: %w", err)
		result.Duration = time.Since(start)
		return result
	}

	// Step 4: Restart service
	fmt.Printf("  [%s] restarting service\n", server)
	_, err = runRemote(server, fmt.Sprintf("systemctl restart %s", cfg.ServiceName))
	if err != nil {
		result.Error = fmt.Errorf("restart failed: %w", err)
		result.Duration = time.Since(start)
		rollback(server, cfg)
		result.RolledBack = true
		return result
	}

	// Step 5: Health check
	fmt.Printf("  [%s] checking health\n", server)
	// The health URL needs the bare host; strip the SSH user@ prefix
	host := server
	if at := strings.Index(server, "@"); at >= 0 {
		host = server[at+1:]
	}
	healthURL := fmt.Sprintf("http://%s%s", host, cfg.HealthPath)
	err = checkHealth(healthURL, cfg.MaxRetries, cfg.RetryWait, cfg.HealthTimeout)
	if err != nil {
		result.Error = fmt.Errorf("health check failed: %w", err)
		result.Duration = time.Since(start)
		fmt.Printf("  [%s] health check failed, rolling back\n", server)
		rollback(server, cfg)
		result.RolledBack = true
		return result
	}

	fmt.Printf("  [%s] healthy\n", server)
	result.Success = true
	result.Duration = time.Since(start)
	return result
}

Rollback

Roll back to the previous release by listing releases and switching the symlink.

func rollback(server string, cfg Config) {
	fmt.Printf("  [%s] rolling back\n", server)

	output, err := runRemote(server, fmt.Sprintf("ls -t %s", cfg.ReleasesDir))
	if err != nil {
		fmt.Printf("  [%s] rollback failed: cannot list releases: %v\n", server, err)
		return
	}

	releases := strings.Split(output, "\n")
	if len(releases) < 2 {
		fmt.Printf("  [%s] rollback failed: no previous release\n", server)
		return
	}

	previous := releases[1]
	previousDir := fmt.Sprintf("%s/%s", cfg.ReleasesDir, previous)
	tmpLink := cfg.CurrentLink + ".tmp"

	swapCmd := fmt.Sprintf(
		"rm -f %s && ln -s %s %s && mv -T %s %s",
		tmpLink, previousDir, tmpLink, tmpLink, cfg.CurrentLink,
	)
	_, err = runRemote(server, swapCmd)
	if err != nil {
		fmt.Printf("  [%s] rollback failed: symlink swap: %v\n", server, err)
		return
	}

	_, err = runRemote(server, fmt.Sprintf("systemctl restart %s", cfg.ServiceName))
	if err != nil {
		fmt.Printf("  [%s] rollback warning: restart failed: %v\n", server, err)
		return
	}

	fmt.Printf("  [%s] rolled back to %s\n", server, previous)
}

Cleanup

Remove old releases, keeping only the most recent ones.

func cleanup(server string, cfg Config) {
	command := fmt.Sprintf("cd %s && ls -t | tail -n +%d | xargs rm -rf",
		cfg.ReleasesDir, cfg.KeepReleases+1)
	_, err := runRemote(server, command)
	if err != nil {
		fmt.Printf("  [%s] cleanup warning: %v\n", server, err)
	}
}

Parallel execution

Deploy to all servers at once. Collect results.

func deployAll(cfg Config) []ServerResult {
	release := fmt.Sprintf("v%s", time.Now().Format("20060102150405"))
	results := make([]ServerResult, len(cfg.Servers))
	var wg sync.WaitGroup

	fmt.Printf("Starting deploy: release %s\n", release)
	fmt.Printf("Servers: %s\n\n", strings.Join(cfg.Servers, ", "))

	for i, server := range cfg.Servers {
		wg.Add(1)
		go func(index int, srv string) {
			defer wg.Done()
			results[index] = deployToServer(srv, release, cfg)
		}(i, server)
	}

	wg.Wait()
	return results
}

Putting it together

The main function sets up the configuration, runs the deploy, prints results, and cleans up.

func printResults(results []ServerResult) {
	fmt.Println("\n--- Deploy Results ---")
	succeeded := 0
	failed := 0
	rolledBack := 0

	for _, r := range results {
		if r.Success {
			fmt.Printf("  OK:         %-20s  %v\n", r.Server, r.Duration.Round(time.Millisecond))
			succeeded++
		} else if r.RolledBack {
			fmt.Printf("  ROLLED BACK %-20s  %v  %v\n", r.Server, r.Duration.Round(time.Millisecond), r.Error)
			rolledBack++
			failed++
		} else {
			fmt.Printf("  FAILED:     %-20s  %v  %v\n", r.Server, r.Duration.Round(time.Millisecond), r.Error)
			failed++
		}
	}

	fmt.Printf("\nTotal: %d succeeded, %d failed (%d rolled back)\n",
		succeeded, failed, rolledBack)
}

func main() {
	cfg := Config{
		Servers:       []string{"user@web1", "user@web2", "user@web3"},
		LocalPath:     "./build/",
		ReleasesDir:   "/opt/app/releases",
		CurrentLink:   "/opt/app/current",
		ServiceName:   "myapp",
		HealthPath:    ":8080/health",
		MaxRetries:    5,
		RetryWait:     2 * time.Second,
		HealthTimeout: 5 * time.Second,
		KeepReleases:  5,
	}

	results := deployAll(cfg)
	printResults(results)

	// Clean up old releases on successful servers
	for _, r := range results {
		if r.Success {
			cleanup(r.Server, cfg)
		}
	}

	// Exit with error if any server failed
	for _, r := range results {
		if !r.Success {
			os.Exit(1)
		}
	}
}

What a deploy looks like

Here is the output from running this tool against three servers. Server web2 has a broken application that fails the health check.

Starting deploy: release v20250128143022
Servers: user@web1, user@web2, user@web3

  [user@web1] creating release directory
  [user@web2] creating release directory
  [user@web3] creating release directory
  [user@web1] syncing files
  [user@web2] syncing files
  [user@web3] syncing files
  [user@web1] activating release
  [user@web3] activating release
  [user@web2] activating release
  [user@web1] restarting service
  [user@web3] restarting service
  [user@web2] restarting service
  [user@web1] checking health
  [user@web3] checking health
  [user@web2] checking health
      health attempt 1: starting
  [user@web1] healthy
  [user@web3] healthy
      health attempt 2: starting
      health attempt 3: starting
      health attempt 4: starting
      health attempt 5: starting
  [user@web2] health check failed, rolling back
  [user@web2] rolling back
  [user@web2] rolled back to v20250128130000

--- Deploy Results ---
  OK:         user@web1               1.823s
  ROLLED BACK user@web2               12.451s  health check failed: health check failed after 5 attempts
  OK:         user@web3               1.917s

Total: 2 succeeded, 1 failed (1 rolled back)

Server web1 and web3 deployed successfully. Server web2 failed the health check after five attempts and was automatically rolled back to the previous release. The tool exits with a non-zero code so your CI pipeline knows the deploy was not fully successful.

What you built

This tool does the same thing that large deployment platforms do, using standard Linux tools and about 300 lines of Go:

  • SSH and rsync for file transfer. No agents to install on servers.
  • Release directories with atomic symlink swaps. No downtime during the switch.
  • Health checks that parse the response body. Not just status codes.
  • Automatic rollbacks when health checks fail. The server goes back to the previous working version.
  • Parallel deployment across multiple servers. Each server reports its own status.
  • Error handling at every step. No silent failures.

You can extend this further. Add a configuration file instead of hardcoded values. Add a --dry-run flag that shows what would happen. Add a --rollback flag that rolls back all servers to a specific release. Add timeouts using context.WithTimeout so a stuck SSH connection does not block the entire deploy.
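
The last of those extensions is a small change. Here is a sketch of a timeout-aware variant of runRemote, assuming context is added to the imports; the function name and timeout handling are illustrative:

func runRemoteTimeout(server, command string, timeout time.Duration) (string, error) {
	ctx, cancel := context.WithTimeout(context.Background(), timeout)
	defer cancel()

	// CommandContext kills the ssh process if the deadline passes
	cmd := exec.CommandContext(ctx, "ssh", server, command)
	output, err := cmd.CombinedOutput()
	if ctx.Err() == context.DeadlineExceeded {
		return "", fmt.Errorf("ssh to %s timed out after %v", server, timeout)
	}
	return strings.TrimSpace(string(output)), err
}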

The point is not the tool itself. The point is that deployment automation is not magic. It is SSH, file copying, symlinks, health checks, and error handling. Once you understand these pieces, you can build or debug any deployment system.
