six demon bag

Wind, fire, all that kind of thing!

2020-04-30

Shell Patterns (4) - Limiting Execution Time

This is a short series describing some Bash constructs that I frequently use in my scripts.

Sometimes you want a script to give up on what it's trying to do after some period of time. The simplest way to limit the time a given command may take is the timeout command.

$ timeout 2 ping 127.0.0.1
PING 127.0.0.1 (127.0.0.1) 56(84) bytes of data.
64 bytes from 127.0.0.1: icmp_seq=1 ttl=64 time=0.031 ms
64 bytes from 127.0.0.1: icmp_seq=2 ttl=64 time=0.012 ms
$ echo $?
124
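In a script you would usually branch on that exit status rather than print it. The following is a minimal sketch, assuming the timeout command from GNU coreutils (which reports 124 when the time limit was hit); sleep 5 is just a stand-in for a slow command, and the variable names are made up for illustration.

```shell
# Stand-in for a slow command: would take 5 seconds, but is limited to 1.
timeout 1 sleep 5
status=$?

if [ "$status" -eq 124 ]; then
  # 124 is what GNU timeout returns when the limit expired
  result='timed out'
elif [ "$status" -ne 0 ]; then
  # any other non-zero status comes from the command itself
  result="failed with status $status"
else
  result='completed'
fi
echo "$result"
```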

However, timeout is only useful for limiting the execution time of a single (blocking) command. Consider for instance a situation where you deployed a VM or an LXD container, and need to wait for cloud-init to complete on that system. Or a situation where you sent an asynchronous request to a REST API. timeout won't help you there. You need to poll the system or API repeatedly until the respective "operation completed" indicator appears.

If you're doing the operation as part of an automated process you cannot wait indefinitely, though, since an unforeseen issue may have prevented the operation from completing. Instead you want to run the check for a limited period of time and report a timeout to the caller if the "operation completed" indicator hasn't appeared by then. To do so you first add the timeout period to the current timestamp. That gives you the point in time after which you want to cancel the operation.

timeout='5'  # minutes
end_time="$(date -d "now + ${timeout} minutes" +%s)"

Using the epoch format (number of seconds since 1970-01-01 00:00:00 UTC) ensures that timestamps can be compared numerically, e.g. like this:

if [ "$(date +%s)" -gt "$end_time" ]; then
  echo 'operation timed out'
  exit 1
fi
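Note that the -d option used for computing end_time is specific to GNU date. On systems without it the same deadline can probably be computed with plain shell arithmetic. This is a sketch under the assumption that date +%s is available (it is in both GNU and BSD date); the rest is unchanged.

```shell
timeout='5'  # minutes

# Current epoch time plus the timeout converted to seconds.
# Assumption: "date +%s" is supported by the local date implementation.
end_time="$(( $(date +%s) + timeout * 60 ))"

if [ "$(date +%s)" -gt "$end_time" ]; then
  echo 'operation timed out'
  exit 1
fi
```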

Add the conditional to the body of a loop that uses the polling statement as the loop condition and you have a control structure that will terminate when either the polling statement returns "success" or the timeout expires, whichever comes first.

until lxc exec mycontainer -- test -f /var/lib/cloud/instance/boot-finished >/dev/null 2>&1; do
  if [ "$(date +%s)" -gt "$end_time" ]; then
    echo 'operation timed out'
    exit 1
  fi
  sleep 1
done

Use break instead of exit if you just want to leave the loop rather than terminate the script. The sleep statement yields CPU time to other processes, so your loop doesn't busy-wait and consume an entire CPU core.
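With break the script keeps running, so you usually need to remember why the loop ended. One way to do that is a flag variable, sketched below. The marker file, the background subshell that creates it, and the timed_out variable are all hypothetical stand-ins so the example can run on its own; the deadline is given in seconds to keep the demo short.

```shell
marker="${TMPDIR:-/tmp}/poll-demo.$$"   # hypothetical "operation completed" indicator
( sleep 1; touch "$marker" ) &          # stand-in for an operation that takes about 1s

end_time="$(( $(date +%s) + 5 ))"       # give up 5 seconds from now
timed_out=0

until test -f "$marker"; do
  if [ "$(date +%s)" -gt "$end_time" ]; then
    timed_out=1                         # remember that we gave up
    break                               # leave the loop, but keep the script running
  fi
  sleep 1
done

wait                                    # reap the background stand-in
rm -f "$marker"

if [ "$timed_out" -eq 1 ]; then
  echo 'operation timed out, continuing without it'
fi
```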

This can also be made into a reusable function:

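waitfor() {
  # usage: waitfor TIMEOUT_MINUTES DELAY_SECONDS COMMAND [ARG...]
  local timeout="${1:-5}"  # minutes
  local delay="${2:-1}"    # seconds
  local end_time
  # assigned separately so that "local" doesn't mask a failing date call
  end_time="$(date -d "now + ${timeout} minutes" +%s)" || return 1
  shift 2

  # poll until the command succeeds or the deadline has passed
  until "$@" >/dev/null 2>&1; do
    if [ "$(date +%s)" -gt "$end_time" ]; then
      echo "operation timed out: $*"
      return 1
    fi
    sleep "$delay"
  done
}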

waitfor 10 5 lxc exec mycontainer -- test -f /var/lib/cloud/instance/boot-finished
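Since waitfor() returns 1 on timeout, the caller can branch on its exit status. The sketch below repeats the function so that it runs on its own, with one deviation: the end time is computed with shell arithmetic instead of date -d, on the assumption that this makes the demo less dependent on GNU date. The marker file and the background subshell are hypothetical stand-ins for a real operation.

```shell
# waitfor() as defined above, except for the arithmetic end-time calculation.
waitfor() {
  local timeout="${1:-5}"  # minutes
  local delay="${2:-1}"    # seconds
  local end_time
  end_time="$(( $(date +%s) + timeout * 60 ))"
  shift 2

  until "$@" >/dev/null 2>&1; do
    if [ "$(date +%s)" -gt "$end_time" ]; then
      echo "operation timed out: $*"
      return 1
    fi
    sleep "$delay"
  done
}

marker="${TMPDIR:-/tmp}/waitfor-demo.$$"   # hypothetical completion indicator
( sleep 2; touch "$marker" ) &             # stand-in for the real operation

# Wait up to 1 minute, polling once per second.
if waitfor 1 1 test -f "$marker"; then
  result='ready'
else
  result='gave up'
fi
echo "$result"

wait
rm -f "$marker"
```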

However, this approach has some limitations because of how Bash parses command lines. For instance, passing statements with pipes to waitfor() isn't going to work (unless you want to use eval, which you probably shouldn't). In a statement like

waitfor echo 'foo' | grep 'bar'

the shell will not call waitfor() with the arguments echo, 'foo', |, grep, and 'bar'. Instead it will first evaluate waitfor echo 'foo' and then pass the output of that to grep 'bar' via the pipe. If you want to execute a pipelined statement you need to wrap the statement in a function of its own and then pass that function call to waitfor().

myfunc() {
  echo "$1" | grep "$2"
}

waitfor 2 '' myfunc 'foo' 'bar'
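If you want to convince yourself that the pipe never reaches the function as an argument, a small experiment helps. The helper show_args below is hypothetical and exists only for this illustration: it reports the arguments it received on stderr and then runs them as a command.

```shell
# Hypothetical helper: print what was passed in, then run it.
show_args() {
  echo "got $# arguments: $*" >&2
  "$@"
}

# The shell splits this line at the pipe before calling the function,
# so show_args only ever sees "echo" and "foo".
show_args echo 'foo' | grep 'bar'
pipeline_status=$?   # status of grep, the last command in the pipeline

# Capture just the stderr report for inspection.
report="$(show_args echo 'foo' 2>&1 >/dev/null)"
echo "$report"
```

The exit status of the whole line is that of grep, not of the function, which is another reason to wrap a pipeline in a function before handing it to waitfor().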

Posted 22:35