UnixReview.com
September 2006

Shell Corner: Logeasy and Logez — Handling Log Files

Hosted by Ed Schaefer

I'm always intrigued by problems solved by using recursion — especially in the shell. This month, Julie and Michael Wang use recursion in their logez function — one of their two methods for handling log files.

logeasy and logez: Handling Log Files

by Julie Wang and Michael Wang

When writing a new shell program, we often need to handle common logistic tasks such as creating and cleaning up log files. Instead of duplicating the same number of lines again and again, it is ideal to use a generic, reusable function. We call this function in the main program and then are done with it.

In this article, we will introduce two such functions, "logeasy" and "logez". They provide the same functionality but use different implementations. This is a continuation of our effort to to use functions to simplify common shell tasks. We have written "my_getopts" to process command line arguments, and lock_unlock to create a lock file.

However, in order for this simple idea to work, we have to standardize the format and the location of the log file.

This is our scheme — let the log file be located in:

  <script_dir>/log/<script_basename>/<script_basename>.log.<date>
where <date> is presented in ISO 8601 date format [3], i.e. 2006-01-27T15:26:31.

This way, knowing the script location sets the log file's location.

"logeasy" function

To use the "logeasy" function (Listing 1), embed the following code in your script:

  type shopt >/dev/null 2>&1 && shopt -s extglob
  . /path/to/logeasy
  . /path/to/trapadd
  logeasy path=$0 <options>

  # <rest of your script>
where shopt -s extglob is executed only for a bash script.

Logeasy has one mandatory argument and three optional arguments:

The following is this function's main considerations:

Passing the path of the calling program

In ksh93, $0 inside the function defined with the function name {} syntax is the function name itself instead of the calling script path.

  function foo { echo $0; } 
  foo
would produce foo. Since we need the script path to construct the log file location, we choose to pass the path to the function. This is a a more portable solution.

The path=$0 is a fixed, mandatory argument.

Using the function name {} syntax allows the use of local variables, and creates a modular, reusable utility.

What to log

It is not only tedious to selectively log output from certain commands, but it is easy to miss critical debugging information coming from unexpected sources. Consider this example:

  if true; then
    ls > /tmp/my_log_file
  end if
The stub above tries to log the output of the ls command, but my_log_file contains no ls output nor any other errors as the end if syntax error is not logged.

For these reasons, it is easier and safer to log everything. The function uses the exec utility to redirect the stdout and stderr file descriptors to the log file.

In an interactive shell, both the stdout and stderr output goes to the terminal by default. The function redirects them to the same place emulating the interactive environment. What goes to the log file is what would go to the terminal without logging.

What device the log goes to

Of the four values in device=<...> option, both is worth a special mention; in this case, the log goes to a file as well as standard output — exactly what the tee command does. This feature is required when the parent program needs the stdout for processing, and when you need to save a copy.

An example of this is Legato's Networker Module for Oracle nsrnmo. You can modify it to save a copy of the log file, yet you do not want to change the communication between it and the remote Legato server that requires the output.

Another situation is when you are debugging a program. You want to see the output in real time while still logging the output.

The function implements this feature using background process tail -f <log file>. You could add filters to trim the output that goes to the display, i.e.:

  tail -f <log file> | grep -E "^(INFO|ERROR)" &
Obviously, the background process or processes need to be killed when the program terminates. For the case shown above, both the grep process ($!) and its child tail process needs to be located and killed.

However, it is not safe to simply add a kill command at the end of the program; the program may terminate prematurely before it reaches the end due to an error or various other reasons. It is safer to use an exit trap to kill the process.

The use of an exit trap creates another layer of difficulty. The function needs to work with code around the function that may have a different exit trap set. We do not want your trap overwriting ours, nor ours overwriting yours. For this reason, we created function trapadd ( Listing 2), which prepends [4] or appends the trap action to an existing trap. The description of this function is not in the scope of this article. We think it sufficient to know what it does.

Armed with this function, we still need to know if the trap before the function is visible inside the function. We test it. If it is, then we append our trap inside the function, otherwise we do it outside the function.

Another trap difficulty is whether the trap defined inside the function executes at the completion of the function (local trap, ksh93 behavior), or executes at the completion of the calling program (global trap, bash behavior). We test the behavior and let the local trap — which is executed at the completion of the function — set up another trap that will be executed at the end of the calling program.

With all these difficult maneuvers, we arrived at logeasy.

logez

logez (Listing 3) performs the same task as logeasy, but uses a different approach. Unlike logeasy, which is called with only one line, logez requires four lines:

  . /path/to/logez

  if [[ $1 == "--use-logez" ]]
  then shift
  else logez <options> -- $0 "$@"; exit $?
  fi

  # <rest of your script>
However, it does not need shopt and trapadd. The options keep=<n>, silent=no|yes, device=stdout|null|file|both are exactly the same as in logeasy function.

With logez, the calling program passes its execution information (command $0 and arguments $@) to the logez function. The function starts another instance of the calling program, adding --use-logez as its first argument. As in any recursive program, we provide an exit. The second instance recognizes the --use-logez argument, and enters the then branch, which restores the positional parameters with shift. From here, it passes control to the rest of the script. When the second instance finishes, the function finishes and so does the first instance of the program. The exit code is passed from the second instance of the function, and then to the first instance of the program.

The process can be illustrated with a diagram as:

  21898 12883 ./logez.ksh a b c                  # first instance
    21903 21898 ./logez.ksh a b c                # logez function
      21904 21903 ./logez.ksh --use-logez a b c  # second instance
To implement the device=both option, we use tee. As with the logeasy version, filters can be added to customize the display.

The only tricky thing is that the exit code for <prog> | tee is tee, which is almost always true, not the <prog>. We save the exit code for <prog> and use that as the function's return code.

Summary

logez is a lot easier to implement because it forks a copy of the original program, and the handling of logs is done outside of the program. The performance hit is nil on modern computer hardware. With logeasy, the maneuvering is done within the original program trying to co-exist peacefully with the rest of the code and coping with the differences in shell behavior.

Both versions seem to work well. The one you choose depends on which version you like. No matter which version you use, we hope it helps to simplify your life.

Error fixes and enhancements of these two functions, if any, can be found at: http://www.unixlabplus.com/unix-prog/logez/.

Reference

  1. Wang, Michael. "Processing Command-line Arguments with my_getopts." January 2003. Unix Review.
  2. Wang, Michael. "lock_unlock - Creating and Removing a Lock File." February 2004. Unix Review. Also see http://www.unixlabplus.com/unix-prog/lock_unlock/.
  3. "ISO 8601". Wikipedia, the free encyclopedia.
  4. "prepend", the opposite of "append". Hacker dictionary."prepend" is not an English word (yet).
Julie Wang works for Intellidyne LLC in Falls Church, Virginia as a senior system engineer. She specializes in Oracle database, Unix systems, and Lawson enterprise systems. She can be reached at: Julie.Wang@flyi.com.

Michael Wang studied Physics at (Beijing University, MS, 1987) and Statistics at (Columbia, MA, 2001). Currently, he is an independent consultant working in the area of Unix Systems, Oracle database, and Unix system programing. He can be reached at: xw73@columbia.edu.

Their past technical writings can be obtained from: http://www.unixlabplus.com/unix-prog/Publication.txt.
Copyright 2005 UnixReview.com, UnixReview.com's Privacy Policy. Comments about the Web site: webmaster@unixreview.com