Logging, Exit Codes, and Alerts

Your backup scripts should log enough detail that you can diagnose failures quickly.

Quick Summary

If a backup job fails but nobody notices, the job is not useful. At minimum you need:

  • a log file
  • a non-zero exit code on failure
  • an alert when the exit code is non-zero
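
The three requirements above can be tied together with a small wrapper. This is a minimal sketch, not a definitive implementation: `run_with_alert` and `send_alert` are hypothetical names, and `send_alert` only writes to stderr as a stand-in for whatever notification channel you actually use (mail, webhook, pager).

```shell
#!/usr/bin/env bash
# Sketch: run a job, preserve its exit code, and call an alert hook on failure.

# Hypothetical placeholder. Assumption: stderr stands in for a real
# notification channel; replace the body with your own mechanism.
send_alert() {
  echo "ALERT: $*" >&2
}

# run_with_alert NAME COMMAND [ARGS...]
# Runs COMMAND, alerts if it exits non-zero, and returns the same exit code.
run_with_alert() {
  local name="$1"; shift
  local rc=0
  "$@" || rc=$?
  if [ "$rc" -ne 0 ]; then
    send_alert "$name failed with exit code $rc"
  fi
  return "$rc"
}
```

A scheduler entry could then call something like `run_with_alert restic-backup ./run-restic-job.sh`, so the alert fires on any non-zero exit while the original exit code is still visible to the caller.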

If you already alert on failed rsync/rclone jobs, use the same idea for Restic.

Logging Pattern

run-restic-job.sh
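#!/usr/bin/env bash
LOG_FILE="/var/log/restic/app-01.log"

{
  echo "- $(date -Is) restic backup start -"
  # Log "ok" only on success; on failure, log and pass on restic's exit code.
  if restic backup /srv/app --tag daily --host app-01; then
    echo "- $(date -Is) restic backup ok -"
  else
    rc=$?
    echo "- $(date -Is) restic backup FAILED (exit $rc) -"
    exit "$rc"
  fi
} >> "$LOG_FILE" 2>&1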

Note: If grep or tail is not available in the job's environment, keep the script simple and rely on command exit codes instead of parsing log output.

Alert Triggers

  • Non-zero exit code from backup, forget, or check
  • No successful snapshot within the expected time window
  • Large, unexpected growth in the size reported by restic stats
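
The growth trigger can be sketched as a simple percentage comparison between two size readings. This is an illustration under assumptions: `growth_exceeds` is a hypothetical helper, the sizes are plain byte counts that you record yourself after each run (for example from the `total_size` field of `restic stats --json`, parsed with a JSON tool of your choice), and the threshold is whatever you consider unusual for your data.

```shell
#!/usr/bin/env bash
# Sketch: flag unexpected growth between a previous and a current size reading.
# Sizes are in bytes. How they are obtained is left to the caller (assumption).

# growth_exceeds PREV CUR MAX_PCT
# Returns 0 (true) when CUR is more than MAX_PCT percent larger than PREV.
growth_exceeds() {
  local prev="$1" cur="$2" max_pct="$3"
  # No baseline yet: nothing to compare against, so do not alert.
  [ "$prev" -gt 0 ] || return 1
  # Integer arithmetic avoids floating point: cur * 100 > prev * (100 + max_pct)
  [ $(( cur * 100 )) -gt $(( prev * (100 + max_pct) )) ]
}
```

For instance, `growth_exceeds "$previous_bytes" "$current_bytes" 20` succeeds only when the current reading is more than 20% above the previous one. The 20% figure is arbitrary and should be tuned to your data.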

Common Mistakes

| Mistake | Result | Fix |
| --- | --- | --- |
| Logs only on failure | you miss trends | log every run with timestamps |
| No snapshot freshness check | jobs can silently stop | alert if no snapshot in 24h |
| No restore drills | you do not know restores work | schedule a monthly restore drill |
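
The 24-hour freshness check from the table can be sketched as a timestamp comparison. This is a hedged example: `snapshot_is_stale` is a hypothetical helper that takes Unix epoch seconds, and obtaining the latest snapshot time is left to the caller (one option, assumed here and not shown, is to read the `time` field from `restic snapshots --json` and convert it to epoch seconds).

```shell
#!/usr/bin/env bash
# Sketch: decide whether the latest snapshot is older than the allowed window.
# All timestamps are Unix epoch seconds supplied by the caller (assumption).

# snapshot_is_stale LAST_EPOCH NOW_EPOCH [MAX_AGE_SECONDS]
# Returns 0 (true) when the snapshot is older than MAX_AGE_SECONDS.
# MAX_AGE_SECONDS defaults to 86400 (24 hours).
snapshot_is_stale() {
  local last="$1" now="$2" max_age="${3:-86400}"
  [ $(( now - last )) -gt "$max_age" ]
}
```

A monitoring job could call `snapshot_is_stale "$last_snapshot_epoch" "$(date +%s)"` and raise an alert when it succeeds. Running this check separately from the backup job matters, because a job that never starts cannot report its own failure.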