
How to monitor systemd timers

March 2026

Systemd timers are replacing cron on most modern Linux systems. They're more powerful — dependencies, randomized delays, persistent timers that catch up after downtime. But they have the same fundamental monitoring problem: when a timer stops firing or its service fails silently, nobody gets told.

Here's how to fix that, from built-in systemd features to external monitoring.

First: check what's running

Before setting up monitoring, know what you have. systemctl list-timers is the equivalent of crontab -l:

# Show all active timers:
systemctl list-timers --all

# Output looks like:
NEXT                        LEFT          LAST                        PASSED   UNIT
Tue 2026-03-25 02:00:00 UTC 4h left       Mon 2026-03-24 02:00:00 UTC 20h ago  backup.timer
Tue 2026-03-25 06:00:00 UTC 8h left       Mon 2026-03-24 06:00:00 UTC 16h ago  certbot.timer
n/a                         n/a           Mon 2026-03-24 03:15:00 UTC 19h ago  cleanup.timer

The n/a in the NEXT column is your first red flag: it means the timer is loaded but will never fire again, usually because the timer unit is disabled or its OnCalendar= expression is invalid.
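If you suspect a bad OnCalendar= expression, systemd can validate it for you: systemd-analyze calendar (part of systemd itself) parses the expression and prints when it would next elapse.

```shell
# Validate a calendar expression and show the next elapse time:
systemd-analyze calendar "*-*-* 02:00:00"

# A malformed expression exits non-zero:
systemd-analyze calendar "every night at 2" || echo "invalid expression"
```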

# Check a specific timer's status:
systemctl status backup.timer

# Check if the associated service succeeded:
systemctl status backup.service

# See recent logs for the service:
journalctl -u backup.service --since "24 hours ago" --no-pager
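One more quick audit worth knowing: systemctl can list every unit currently in the failed state, which surfaces services whose last timer-driven run failed.

```shell
# List all units currently in the failed state:
systemctl --failed

# Exit code is script-friendly: is-failed returns 0 if the unit is failed
systemctl is-failed backup.service
```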

Three ways to get alerted

Method 1: OnFailure= (built-in systemd)

Systemd has a native failure handler: the OnFailure= directive. When a service unit fails, systemd starts a specified "failure" unit. You can use this to send an email, a Slack message, or any notification.

First, create a notification template service:

# /etc/systemd/system/[email protected]
[Unit]
Description=Send failure notification for %i

[Service]
Type=oneshot
ExecStart=/usr/local/bin/notify-failure.sh %i
#!/bin/bash
# /usr/local/bin/notify-failure.sh
UNIT="$1"
HOST=$(hostname)
STATUS=$(systemctl status "$UNIT" --no-pager 2>&1 | head -20)

# Email:
echo "$STATUS" | mail -s "FAILED: $UNIT on $HOST" [email protected]

# Or Slack webhook:
curl -s -X POST "$SLACK_WEBHOOK" \
  -H 'Content-type: application/json' \
  -d "{\"text\":\"systemd unit failed: *${UNIT}* on ${HOST}\"}"

Then add OnFailure= to any service you want to monitor:

# /etc/systemd/system/backup.service
[Unit]
Description=Nightly backup
OnFailure=notify-failure@%n.service

[Service]
Type=oneshot
ExecStart=/usr/local/bin/backup.sh

# Reload after editing:
systemctl daemon-reload

Catches

  • Non-zero exit code
  • Timeout (if TimeoutStartSec is set)
  • OOM kills (service goes to failed state)

Misses

  • Timer never fires (disabled, wrong schedule)
  • Timer unit not started after reboot
  • Entire server is down

Good for: catching failures inside the job. Doesn't help when the timer itself is the problem.
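It's worth testing the notification path before you rely on it. One way (a sketch; the unit name test-fail is arbitrary, and this assumes your systemd supports setting dependencies like OnFailure= on transient units, which modern versions do) is to start a deliberately failing transient unit wired to the same notification template:

```shell
# Start a transient service that fails immediately, with OnFailure= wired
# to the notification template:
systemd-run --unit=test-fail \
  -p [email protected] \
  /bin/false

# Confirm the unit failed and that the notifier ran:
systemctl status test-fail.service --no-pager
journalctl -u notify-failure@test-fail.service --no-pager

# Clean up the failed transient unit:
systemctl reset-failed test-fail.service
```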

Method 2: Journald monitoring

Every systemd service logs to journald. You can monitor these logs for failure patterns, either with a script or a log aggregation tool.

# Find all failed units in the last 24 hours:
journalctl --since "24 hours ago" -p err -u "*.service" --no-pager

# Check for specific failure patterns:
journalctl -u backup.service --since "24 hours ago" | grep -i "fail\|error\|exit"

# Watch for failures in real time:
journalctl -f -p err

You can automate this with a checker script that runs on its own timer:

#!/bin/bash
# /usr/local/bin/check-timers.sh
# Alert if any critical timer's service is in the failed state

CRITICAL_TIMERS="backup certbot db-vacuum"

for timer in $CRITICAL_TIMERS; do
  result=$(systemctl show "${timer}.service" -p Result --value)

  if [ "$result" = "failed" ]; then
    echo "FAILED: ${timer}.service"
    # send alert...
  fi
done
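To run the checker on its own schedule, wrap it in a small service and timer pair (a sketch; the unit names and the hourly cadence are placeholder choices):

```shell
# /etc/systemd/system/check-timers.service
[Unit]
Description=Check critical timers for failures

[Service]
Type=oneshot
ExecStart=/usr/local/bin/check-timers.sh

# /etc/systemd/system/check-timers.timer
[Unit]
Description=Run timer check hourly

[Timer]
OnCalendar=hourly
Persistent=true

[Install]
WantedBy=timers.target

# Enable and start:
systemctl daemon-reload
systemctl enable --now check-timers.timer
```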

Useful for auditing. But monitoring infrastructure with more infrastructure on the same server is fragile — if the server has problems, your monitoring has the same problems.

Method 3: Heartbeat monitoring (external)

Different approach: your service pings an external endpoint after each successful run. If the ping stops arriving, the external service alerts you. The key advantage: the alert comes from outside your server.

# /etc/systemd/system/backup.service
[Unit]
Description=Nightly backup
OnFailure=notify-failure@%n.service

[Service]
Type=oneshot
ExecStart=/usr/local/bin/backup.sh
ExecStartPost=/usr/bin/curl -fsS --max-time 10 https://ping.trebben.dk/YOUR_TOKEN

ExecStartPost only runs if ExecStart succeeds (exits 0). So the ping is proof that the job completed successfully.

Tip: A - prefix (ExecStartPost=-/usr/bin/curl ...) tells systemd to ignore the curl exit code, so a transient network error can't mark an otherwise successful backup as failed. Note that it does not make the ping fire when the job fails: ExecStartPost= lines only run after ExecStart= has succeeded. If you want a ping that fires whether the job succeeded or not (to track that the timer is at least triggering), use ExecStopPost= instead, which runs in both cases.
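The "did the timer fire at all" variant can be sketched as a unit fragment (reusing the placeholder token URL from above). ExecStopPost= commands run after the service stops, whether it succeeded or failed, so a ping there proves the timer triggered rather than that the job worked:

```shell
# /etc/systemd/system/backup.service (fragment)
[Service]
Type=oneshot
ExecStart=/usr/local/bin/backup.sh
# Runs on success AND failure; the - ignores curl's own exit code:
ExecStopPost=-/usr/bin/curl -fsS --max-time 10 https://ping.trebben.dk/YOUR_TOKEN
```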

Catches

  • Service failures (no ping sent)
  • Timer not firing (no ping sent)
  • Timer disabled or deleted
  • Server offline or unreachable
  • Disk full, OOM, network down

Misses

  • Job runs but produces bad data
  • Monitoring service itself is down

The only method that catches timers which never fire. Combined with OnFailure=, you have complete coverage.

CronPulse does exactly this.

20 monitors free. No agents, no containers, no config files.
Add ExecStartPost=curl to your service unit — get alerted within minutes when a timer stops.

Start monitoring →

The recommended setup

For any systemd timer that matters, use two layers:

# /etc/systemd/system/backup.service
[Unit]
Description=Nightly backup
# Layer 1: immediate alert on failure (local)
OnFailure=notify-failure@%n.service

[Service]
Type=oneshot
ExecStart=/usr/local/bin/backup.sh
# Layer 2: heartbeat ping on success (external)
ExecStartPost=/usr/bin/curl -fsS --max-time 10 https://ping.trebben.dk/YOUR_TOKEN

# /etc/systemd/system/backup.timer
[Unit]
Description=Run backup nightly

[Timer]
OnCalendar=*-*-* 02:00:00
Persistent=true

[Install]
WantedBy=timers.target

# Enable and start:
systemctl daemon-reload
systemctl enable --now backup.timer

OnFailure= gives you immediate, detailed failure notifications. The heartbeat ping catches everything else — timer disabled, server down, schedule misconfiguration. Between them, there's no silent failure mode left.

Systemd timers vs cron

If you're deciding between cron and systemd timers, here's the short version: systemd timers are better for anything that needs dependencies, logging, resource limits, or persistent scheduling. Cron is simpler for quick one-liners.

The monitoring problem is identical. Both schedulers can fail silently. Both need external heartbeat monitoring to catch the "job never started" case. The only difference is syntax — && curl in your crontab vs ExecStartPost=curl in your service unit.
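For completeness, the cron side of that comparison is a one-line crontab change (a sketch; the schedule and token URL are placeholders):

```shell
# crontab entry: the ping only fires if backup.sh exits 0
0 2 * * * /usr/local/bin/backup.sh && curl -fsS --max-time 10 https://ping.trebben.dk/YOUR_TOKEN
```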
