Logging, Exit Codes, and Alerts
Your scripts should log enough detail to understand failures quickly.
Quick Summary
If a backup job fails but nobody notices, the job is not useful. At minimum you need:
- a log file
- a non-zero exit code on failure
- an alert when the exit code is non-zero
If you already alert on failed rsync/rclone jobs, use the same idea for Restic.
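As a minimal sketch of that idea, the wrapper below runs any job, preserves its exit code, and calls an alert hook only on failure. The names `run_with_alert` and `send_alert` are hypothetical, and the hook only writes to stderr here; in practice you would replace its body with whatever your existing rsync/rclone alerting already uses.

```shell
#!/usr/bin/env bash
# Sketch only: run_with_alert and send_alert are illustrative names,
# not part of restic.

# Hypothetical alert hook. Replace the body with mail, a webhook, etc.
send_alert() {
  printf 'ALERT: %s\n' "$*" >&2
}

# Run a command; on a non-zero exit, alert and return the same exit code.
run_with_alert() {
  local rc=0
  "$@" || rc=$?
  if (( rc != 0 )); then
    send_alert "job failed (exit $rc): $*"
  fi
  return "$rc"
}

# Example (assumes restic is installed and the repository is configured):
#   run_with_alert restic backup /srv/app --tag daily --host app-01
```

Because the wrapper returns the original exit code, a scheduler or monitoring agent that watches the job status still sees the failure.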
Logging Pattern
run-restic-job.sh
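LOG_FILE="/var/log/restic/app-01.log"
{
  echo "- $(date -Is) restic backup start -"
  # Log "ok" only on success; otherwise log the failure and pass the exit code on.
  if restic backup /srv/app --tag daily --host app-01; then
    echo "- $(date -Is) restic backup ok -"
  else
    rc=$?
    echo "- $(date -Is) restic backup FAILED (exit $rc) -"
    exit "$rc"
  fi
} >> "$LOG_FILE" 2>&1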
Note: If your environment does not have grep/tail available in the job context, keep the script simple and rely on command exit codes.
Alert Triggers
- Non-zero exit code from `backup`, `forget`, or `check`
- No successful snapshot in the expected time window
- Large unexpected growth from `restic stats`
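The growth trigger can be sketched as a comparison between the current repository size and the size recorded on the previous run. The function name `growth_exceeds`, the 20% threshold, and the state file path in the usage comment are all assumptions for illustration; the usage comment also assumes `jq` is installed and reads the `total_size` field from `restic stats --json`.

```shell
#!/usr/bin/env bash
# Sketch only: growth_exceeds is an illustrative helper, not a restic feature.

# growth_exceeds PREV_BYTES CUR_BYTES MAX_PERCENT
# Returns 0 (true) when CUR is more than MAX_PERCENT percent larger than PREV.
growth_exceeds() {
  local prev=$1 cur=$2 max_pct=$3
  (( prev > 0 )) || return 1   # no baseline yet, nothing to compare against
  (( (cur - prev) * 100 > prev * max_pct ))
}

# Example wiring (assumes restic and jq; the state file path is made up):
#   cur=$(restic stats --mode raw-data --json | jq '.total_size')
#   prev=$(cat /var/lib/restic/app-01.size 2>/dev/null || echo 0)
#   growth_exceeds "$prev" "$cur" 20 && echo "ALERT: repo grew more than 20%" >&2
#   echo "$cur" > /var/lib/restic/app-01.size
```

Integer arithmetic keeps the check dependency-free; pick a threshold that matches how much your data normally changes between runs.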
Common Mistakes
| Mistake | Result | Fix |
|---|---|---|
| Logs only on failure | you miss trends | log every run with timestamps |
| No snapshot freshness check | jobs can silently stop | alert if no snapshot in 24h |
| No restore drills | you do not know restores work | schedule a monthly restore drill |
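The snapshot freshness fix from the table can be sketched as a small age check. The helper name `snapshot_is_fresh` is hypothetical, it assumes GNU `date` (for `date -d`), and the usage comment assumes `jq` is available to pull the `time` field out of `restic snapshots --json`.

```shell
#!/usr/bin/env bash
# Sketch only: snapshot_is_fresh is an illustrative helper, not a restic feature.

# snapshot_is_fresh ISO_TIME MAX_AGE_SECONDS [NOW_EPOCH]
# Returns 0 when the snapshot is no older than MAX_AGE_SECONDS,
# 1 when it is older, and 2 when the timestamp is missing or unparseable.
snapshot_is_fresh() {
  local snap_time=$1 max_age=$2 now=${3:-$(date +%s)}
  local snap_epoch
  [[ -n $snap_time ]] || return 2                      # no snapshot found
  snap_epoch=$(date -d "$snap_time" +%s) || return 2   # requires GNU date
  (( now - snap_epoch <= max_age ))
}

# Example wiring (assumes restic and jq are installed):
#   latest=$(restic snapshots --host app-01 --latest 1 --json \
#            | jq -r 'max_by(.time).time // empty')
#   snapshot_is_fresh "$latest" 86400 || echo "ALERT: no snapshot in 24h" >&2
```

Running this check from a separate schedule than the backup job itself means it still fires when the backup job stops running entirely.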