Logging, Exit Codes, and Alerts

Your scripts should log enough detail to understand failures quickly.

Quick Summary

If a backup job fails but nobody notices, the job is not useful. At minimum you need:

a log file
a non-zero exit code on failure
an alert when the exit code is non-zero

If you already alert on failed rsync/rclone jobs, use the same idea for Restic.
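As a minimal sketch of the "alert when the exit code is non-zero" idea, the wrapper below runs any command, keeps its exit code, and calls an alert hook on failure. The names `run_with_alert` and `send_alert` are hypothetical, and the hook only prints to stderr here; replace its body with whatever your existing rsync/rclone alerting uses.

```shell
#!/bin/sh
# Sketch: run a command, keep its exit code, and alert when it is non-zero.

# Hypothetical alert hook: replace the body with mail, a webhook call, etc.
send_alert() {
  echo "ALERT: $*" >&2
}

# Run the given command; on failure, alert and return the same exit code.
run_with_alert() {
  local status=0
  "$@" || status=$?
  if [ "$status" -ne 0 ]; then
    send_alert "command '$*' failed with exit code $status"
  fi
  return "$status"
}
```

A job could then call `run_with_alert restic backup /srv/app --tag daily --host app-01`, and the scheduler still sees the original exit code.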

Logging Pattern

run-restic-job.sh
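#!/usr/bin/env bash
# Exit on the first failing command so a failed backup yields a non-zero exit code.
set -euo pipefail

LOG_FILE="/var/log/restic/app-01.log"

{
  echo "- $(date -Is) restic backup start -"
  # If restic fails, the script stops here and the "ok" line is never written.
  restic backup /srv/app --tag daily --host app-01
  echo "- $(date -Is) restic backup ok -"
} >> "$LOG_FILE" 2>&1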
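If you want the log to record failures explicitly, one possible variation is a small helper that writes a start line, then either an "ok" line or a "failed" line with the exit code, and returns that code to the caller. This is a sketch, not part of Restic: `log_run` is a hypothetical name, and the default `LOG_FILE` simply reuses the path from the script above.

```shell
#!/bin/sh
# Sketch: log the start and end of a job with timestamps, including the exit code on failure.
# Assumption: LOG_FILE defaults to the path used in run-restic-job.sh.
LOG_FILE="${LOG_FILE:-/var/log/restic/app-01.log}"

# Usage: log_run LABEL COMMAND [ARGS...]
log_run() {
  local label="$1"
  shift
  local status=0
  {
    echo "- $(date -Is) $label start -"
    # Capture the exit code instead of letting the failure go unrecorded.
    "$@" || status=$?
    if [ "$status" -eq 0 ]; then
      echo "- $(date -Is) $label ok -"
    else
      echo "- $(date -Is) $label failed (exit $status) -"
    fi
  } >> "$LOG_FILE" 2>&1
  # Propagate the command's exit code so alerting can act on it.
  return "$status"
}
```

For example, `log_run "restic backup" restic backup /srv/app --tag daily --host app-01` logs every run, successful or not, which also makes trends visible over time.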

note

If your environment does not have grep/tail available in the job context, keep the script simple and rely on command exit codes.

Alert Triggers

Non-zero exit code from backup, forget, or check
No successful snapshot in expected time window
Large, unexpected growth in repository size as reported by restic stats
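The last two triggers come down to simple comparisons once you have the numbers. The sketch below assumes you have already extracted the latest snapshot time as epoch seconds and the repository size in bytes, for example from the `time` field of `restic snapshots --json` and the `total_size` field of `restic stats --json`; that parsing step is left out. The function names and the default thresholds (24 hours, 20 percent) are illustrative assumptions.

```shell
#!/bin/sh
# Sketch: threshold checks for snapshot freshness and repository growth.

# Return 0 if the snapshot is at most max_age_hours old, non-zero otherwise.
# Usage: snapshot_is_fresh SNAPSHOT_EPOCH NOW_EPOCH [MAX_AGE_HOURS]
snapshot_is_fresh() {
  local snapshot_epoch="$1"
  local now_epoch="$2"
  local max_age_hours="${3:-24}"   # assumed default: 24 hours
  local age_seconds=$(( $now_epoch - $snapshot_epoch ))
  [ "$age_seconds" -le $(( $max_age_hours * 3600 )) ]
}

# Return 0 (alert condition) if new_bytes exceeds old_bytes by more than max_growth_pct percent.
# Usage: growth_exceeds OLD_BYTES NEW_BYTES [MAX_GROWTH_PCT]
growth_exceeds() {
  local old_bytes="$1"
  local new_bytes="$2"
  local max_growth_pct="${3:-20}"  # assumed default: 20 percent
  [ $(( $new_bytes * 100 )) -gt $(( $old_bytes * (100 + $max_growth_pct) )) ]
}
```

A scheduled check might run `snapshot_is_fresh "$latest_epoch" "$(date +%s)" 24 || exit 1`, so a stale snapshot produces a non-zero exit code and feeds the same alerting path as a failed backup.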

Common Mistakes

Mistake	Result	Fix
Logs only on failure	you miss trends	log every run with timestamps
No snapshot freshness check	jobs can silently stop	alert if no snapshot in 24h
No restore drills	you do not know restores work	schedule a monthly restore drill
