UnixReview.com
September 2005
This month, Julie Wang and Michael Wang weigh in with five date-related shell
functions based on the Unix cal calendar command.
Date Related Shell Functions
by Michael Wang and Julie Wang
Many shell programs need to compute dates, for example, to retrieve yesterday's backup, to create Oracle table partitions for next week, or to run a job on the first Saturday of every month. In this article, we present the following date-related functions written in shell:
pn_month: Previous and next x months relative to the given month
end_month: End of month of the given month
pn_day: Previous and next x days of the given day
cur_weekday: Day of week for the given day
pn_weekday: Previous and next xth occurrence of a weekday relative to the given day
Review of Current Tools
To begin, we will survey what tools are currently available; then we'll explain why we created our shell functions and how they work.
Most languages have expansive built-in date functions. For example, to compute the date of the day before 20050801, we do this in Perl:
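use Time::Local;
# Noon on 1 August 2005 (sec, min, hour, mday, mon, year)
my $t = timelocal(0,0,12,01,07,105);
# Step back one day (86400 seconds) and split the result again
my ($mday, $mon, $year) = (localtime($t-86400))[3,4,5];
$mon++;
$year += 1900;
printf("%04d%02d%02d\n", $year, $mon, $mday);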
In Perl, months are represented by 0 to 11 with 0 indicating January, and years
are represented as the number of years since 1900. Thus, August is represented
by 07, and the year 2005 is represented by 105. The day is 1. Since we are not
concerned with hours, minutes, and seconds, any time will do. We use 12:00:00.
The code simply converts the date and time to Epoch seconds, subtracts the number of seconds in one day (86400), and converts the result back to a date and time.
Date functions in other scripting languages that have their roots in C, such as PHP, work similarly to Perl. Databases such as Oracle and MySQL support their own date functions. For example, in MySQL, we use:
select date_format(date_sub('20050801', interval 1 day), '%Y%m%d');
Shell does not have built-in date/time handling. It does not have to, because
all the Unix commands are at its disposal, though normally it does not call
PHP, Oracle, or MySQL.
GNU date is a dedicated command for date manipulation. The same job can be done with GNU date like this:
$ date -d '20050801 1 day ago' +"%Y%m%d"
20050731
Unfortunately, GNU date is not universally available. While it is the default date command on Linux, it is generally not available on traditional Unix machines, such as Solaris, HP-UX, AIX, etc., unless the Unix admin installs it.
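A script that may run on both kinds of machines can probe for this feature before relying on it. The following is only a sketch: has_gnu_date is a hypothetical helper name, and the probe simply reruns the example above and checks for the expected answer.

```shell
# has_gnu_date (hypothetical helper): returns 0 when the local date
# command understands GNU-style relative dates, non-zero otherwise.
function has_gnu_date {
    [ "$(date -d '20050801 1 day ago' +%Y%m%d 2>/dev/null)" = "20050731" ]
}

if has_gnu_date; then
    date -d '20050801 1 day ago' +%Y%m%d
else
    printf "%s\n" "GNU date not found; a portable method is needed" >&2
fi
```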
When we first wrote pman, a utility to manage Oracle tables partitioned by date [1], we utilized GNU date. When we deployed the utility on a failover cluster, we had to ensure GNU date was available on all nodes of the cluster, as well as the disaster-recovery site. This increases the cost of maintenance and simply is not possible in an environment that does not support open source software. That was when we started to look for more portable date-related functions.
Perl is less of a problem in this regard. However, it is more than what we need for the tasks at hand. We chose to use the old, simple, omnipresent Unix tool "cal" and shell arithmetic to implement the functions. It is simply easier for us to write and for others to understand (especially for those who do not speak Perl).
For these functions, we use a four-digit number YYYY to represent the year, a six-digit number YYYYMM to represent the year and month (with January as 01), and an eight-digit number YYYYMMDD representing the year, month, and day. The day of the week is represented by 0-6, with Sunday as 0.
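As a small illustration of this representation, the sketch below splits an eight-digit date into its parts with shell arithmetic. The name split_ymd is an assumption made for this example; it is not one of the five functions.

```shell
# split_ymd (hypothetical helper): break YYYYMMDD into year, month, day.
function split_ymd {
    typeset ymd=$1 y m d
    (( y = ymd / 10000 ))        # leading four digits
    (( m = ymd / 100 % 100 ))    # middle two digits
    (( d = ymd % 100 ))          # trailing two digits
    printf "%04d %02d %02d\n" $y $m $d
}

split_ymd 20050801    # prints: 2005 08 01
```

The arithmetic drops leading zeros (08 becomes 8), so printf is used to restore the fixed widths.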
pn_month YYYYMM (+|-)x
The pn_month function calculates the previous or next x months from the given month.
It takes two parameters. The first is the given month in YYYYMM format. The second is the offset x: negative for previous months, positive for next months.
Common-Sense Version
The month after December is the January of the next year, and the month before January is December of the previous year. An implementation using this common-sense method is shown below:
function pn_month {
    typeset ym=$1 pn=$2
    (( m = ym % 100 ))
    (( y = ym / 100 ))
    while (( pn != 0 )); do
        if (( pn > 0 )); then
            if (( m == 12 ))
            then (( m = 1 )); (( y = y + 1 ))
            else (( m = m + 1 ))
            fi
            (( pn = pn - 1 ))
        else
            if (( m == 1 ))
            then (( m = 12 )); (( y = y - 1 ))
            else (( m = m - 1 ))
            fi
            (( pn = pn + 1 ))
        fi
    done
    printf "%s\n" $(( 100*y + m ))
}
Consider this example:
$ for i in -9 -8 -7 0 3 4 5; do pn_month 200508 $i; done
200411
200412
200501
200508
200511
200512
200601
This function uses pure shell arithmetic. The modulus of YYYYMM over 100 calculates the MM portion. The integer division of YYYYMM over 100 delivers the YYYY portion. 100*YYYY+MM gives the YYYYMM representation.
Formula Version
Note that to jump from 200512 to the next month 200601, simply add 89. To jump from 200601 back to the previous month, subtract 89. This is true for any given year, proven with this equation:
100*(YYYY+1)+01 - (100*YYYY + 12) = 89
Thus, to compute the previous or next month, we must subtract or add (respectively) an additional 88 each time the step crosses a year boundary.
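The identity can be checked directly in the shell. This is just a quick sanity check; year_boundary_step is a hypothetical name and the sample years are arbitrary.

```shell
# year_boundary_step (hypothetical helper): difference between January of
# the following year and December of the given year, both as YYYYMM.
function year_boundary_step {
    typeset y=$1
    printf "%s\n" $(( (100*(y+1) + 1) - (100*y + 12) ))
}

for y in 1752 1999 2005 2038; do
    year_boundary_step $y    # prints 89 each time
done
```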
An implementation with this method is shown below:
function pn_month {
    typeset ym=$1 pn=$2 x n
    (( x = ym % 100 + pn ))
    if (( x > 0 ))
    then (( n = (x-1) / 12 ))
    else (( n = - (12-x) / 12 ))
    fi
    printf "%s\n" $(( ym + pn + 88*n ))
}
First, we get the given month and add or subtract the month offset. If the resulting
number x is between 1 and 12, obviously it did not go over the year boundary.
However, if x is between 13 and 24, it goes over the year boundary once; if
x is between 25 and 36, it goes over twice; and so on. The number of times it
goes over the year boundary n is the integer division of x-1 over 12.
Similarly, for x <= 0, if x is between 0 and -11, it goes over the year boundary once. If x is between -12 and -23, it goes over twice, and so on. The number of times it goes over the year boundary n is the integer division of 12-x over 12, in this case. We add a minus sign because it goes back in years.
Add or subtract an additional 88 each time it goes over the year boundary to get the result.
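One way to gain confidence in the formula is to compare it with the step-by-step version over a range of offsets. The sketch below copies both versions under the assumed names pn_month_loop and pn_month_formula so that they can coexist; compare_pn_month is a hypothetical test driver.

```shell
# Step-by-step version, renamed for the comparison
function pn_month_loop {
    typeset ym=$1 pn=$2 y m
    (( m = ym % 100 ))
    (( y = ym / 100 ))
    while (( pn != 0 )); do
        if (( pn > 0 )); then
            if (( m == 12 )); then (( m = 1 )); (( y = y + 1 )); else (( m = m + 1 )); fi
            (( pn = pn - 1 ))
        else
            if (( m == 1 )); then (( m = 12 )); (( y = y - 1 )); else (( m = m - 1 )); fi
            (( pn = pn + 1 ))
        fi
    done
    printf "%s\n" $(( 100*y + m ))
}

# Formula version, renamed for the comparison
function pn_month_formula {
    typeset ym=$1 pn=$2 x n
    (( x = ym % 100 + pn ))
    if (( x > 0 ))
    then (( n = (x-1) / 12 ))
    else (( n = - (12-x) / 12 ))
    fi
    printf "%s\n" $(( ym + pn + 88*n ))
}

# compare_pn_month YYYYMM LOW HIGH (hypothetical driver): report every
# offset between LOW and HIGH for which the two versions disagree.
function compare_pn_month {
    typeset ym=$1 lo=$2 hi=$3 i bad=0
    i=$lo
    while (( i <= hi )); do
        if [[ "$(pn_month_loop $ym $i)" != "$(pn_month_formula $ym $i)" ]]; then
            printf "mismatch at offset %s\n" $i
            bad=1
        fi
        i=$(( i + 1 ))
    done
    return $bad
}

compare_pn_month 200508 -40 40 && printf "%s\n" "both versions agree"
```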
Let us verify the function against GNU date:
$ pn_month 200508 -835
193601
$ date -d '20050801 835 month ago' +%Y%m
193601
$ pn_month 200508 -836
193512
$ date -d '20050801 836 month ago' +%Y%m
203801
pn_month and GNU date agree only within a certain range. On our Linux box, GNU date suffers from the "Year 2038" problem: 2,147,483,647 seconds after the epoch, at 03:14:07 UTC on January 19, 2038, a signed 32-bit time value overflows, and systems that store time that way will fail. Our shell function does not have this limitation.
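The time of day quoted for the overflow can be reproduced with the same kind of shell arithmetic. In this sketch, overflow_moment is a hypothetical helper that splits the largest signed 32-bit value into whole days since the epoch and a time of day.

```shell
# overflow_moment (hypothetical helper): express 2^31 - 1 seconds as
# "days-since-epoch HH:MM:SS".
function overflow_moment {
    typeset max days rest
    (( max = 2147483647 ))      # largest signed 32-bit value
    (( days = max / 86400 ))    # whole days since 1970-01-01
    (( rest = max % 86400 ))    # seconds into the final day
    printf "%d %02d:%02d:%02d\n" $days $(( rest / 3600 )) \
        $(( rest % 3600 / 60 )) $(( rest % 60 ))
}

overflow_moment    # prints: 24855 03:14:07
```

Day 24855 counted from January 1, 1970 falls on January 19, 2038, which matches the date given above.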
end_month YYYYMM
end_month takes 1 parameter, YYYYMM, and outputs the date at the end of the month in the format YYYYMMDD.
Here is how it works:
$ end_month 200501
20050131
$ end_month 200502
20050228
Here is the function:
function end_month {
    typeset ym=$1 y m ld
    (( y = ym / 100 ))
    (( m = ym % 100 ))
    for ld in $(cal $m $y); do :; done
    printf "%s\n" $(( ym*100 + ld ))
}
Simply loop through every element in the "cal" output for the given month to
get the last day, and add it to the year and month to produce the output.
Instead of using the for loop, we could use this:
set -- $(cal $m $y)
eval ld=\${$#}
or this:
set -- $(cal $m $y)
ld=${@:$#}
The for loop is probably more portable.
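The three ways of picking the last word can be tried side by side on a captured calendar. In this sketch the text of "cal 2 2005" is stored in a variable so that the example does not depend on cal being installed; the helper names are hypothetical.

```shell
# Captured output of "cal 2 2005"
sample='   February 2005
Su Mo Tu We Th Fr Sa
       1  2  3  4  5
 6  7  8  9 10 11 12
13 14 15 16 17 18 19
20 21 22 23 24 25 26
27 28'

# Variant 1: loop over every word and keep the last one
function last_word_loop {
    typeset w
    for w in $1; do :; done
    printf "%s\n" "$w"
}

# Variant 2: positional parameters plus eval
function last_word_eval {
    set -- $1
    eval "printf '%s\n' \"\${$#}\""
}

# Variant 3: positional parameters plus a slice (needs ksh93 or bash)
function last_word_slice {
    set -- $1
    printf "%s\n" "${@:$#}"
}

last_word_loop "$sample"     # prints: 28
last_word_eval "$sample"     # prints: 28
last_word_slice "$sample"    # prints: 28
```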
pn_day YYYYMMDD (+|-)x
pn_day computes the previous or next x days from the given date. It takes two parameters. The first one is the given date in YYYYMMDD format, and the second one is the previous x days for a negative number and the next x days for a positive number. It outputs the resulting date in the YYYYMMDD format.
Here is how it works:
$ pn_day 20050102 -1
20050101
$ pn_day 20050102 -2
20041231
$ pn_day 20050102 -3
20041230
Here is the function:
function pn_day {
    typeset ymd=$1 pn=${2:-0} ym y m d x
    (( d = ymd % 100 ))
    (( ym = ymd / 100 ))
    (( y = ym / 100 ))
    (( m = ym % 100 ))
    if (( pn < 0 )); then
        if (( d > 1 )); then
            (( x = ymd - 1 ))
            (( x > 17520902 && x < 17520914 )) && (( x = 17520902 ))
            pn_day $x $(( pn + 1 ))
        else
            pn_day $(end_month $(pn_month $ym -1)) $(( pn + 1 ))
        fi
    elif (( pn > 0 )); then
        if (( ymd < $(end_month $ym) )); then
            (( x = ymd + 1 ))
            (( x > 17520902 && x < 17520914 )) && (( x = 17520914 ))
            pn_day $x $(( pn - 1 ))
        else
            pn_day $(( 100*$(pn_month $ym +1) + 1 )) $(( pn - 1 ))
        fi
    else
        printf "%s\n" $ymd
        return 0
    fi
}
Consider the case for x=-1 - the previous day. If the given date
is not the beginning of the month, then the answer is simply the given date
decremented by 1. If the date is the beginning of the month, then the answer
is the end_month of previous month. Both end_month and pn_month functions are
available, so nothing is new.
A similar analysis can be done for the case for x=1 - the next day.
If the given date is not the end of the month, then the answer is simply the
given date incremented by 1. If the given date is the end of the month, then
the answer is the beginning of the next month. Again, the pn_month function
is used, so nothing is new.
If |x| is greater than 1, the function calls itself recursively, each time moving one day and reducing |x| by 1, until x becomes 0. Unlike pn_month, it is difficult to build a formula. This is where the calendar becomes handy.
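For comparison, the same day-by-day stepping can be written as a loop instead of a recursion. The sketch below is not the authors' method: it swaps cal for a Gregorian leap-year rule in a hypothetical days_in_month helper, so it runs without cal but does not reproduce the calendar oddities that cal reports for old dates. pn_day_iter is likewise an assumed name.

```shell
# days_in_month YYYYMM (hypothetical helper; Gregorian rules only)
function days_in_month {
    typeset ym=$1 y m
    (( y = ym / 100 ))
    (( m = ym % 100 ))
    case $m in
        4|6|9|11) printf "30\n" ;;
        2)  if (( (y % 4 == 0 && y % 100 != 0) || y % 400 == 0 ))
            then printf "29\n"
            else printf "28\n"
            fi ;;
        *)  printf "31\n" ;;
    esac
}

# pn_day_iter YYYYMMDD (+|-)x: same interface as pn_day, written as a loop
function pn_day_iter {
    typeset ymd=$1 pn=${2:-0} ym d
    while (( pn != 0 )); do
        (( d = ymd % 100 ))
        (( ym = ymd / 100 ))
        if (( pn > 0 )); then
            if (( d < $(days_in_month $ym) )); then
                (( ymd = ymd + 1 ))
            else
                # first day of the next month; December rolls over by 89
                if (( ym % 100 == 12 )); then (( ym = ym + 89 )); else (( ym = ym + 1 )); fi
                (( ymd = ym * 100 + 1 ))
            fi
            pn=$(( pn - 1 ))
        else
            if (( d > 1 )); then
                (( ymd = ymd - 1 ))
            else
                # last day of the previous month; January rolls back by 89
                if (( ym % 100 == 1 )); then (( ym = ym - 89 )); else (( ym = ym - 1 )); fi
                (( ymd = ym * 100 + $(days_in_month $ym) ))
            fi
            pn=$(( pn + 1 ))
        fi
    done
    printf "%s\n" "$ymd"
}

pn_day_iter 20050102 -3    # prints: 20041230
```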
There is an abnormality in the calendar in September 1752:
$ cal 9 1752
   September 1752
Su Mo Tu We Th Fr Sa
       1  2 14 15 16
17 18 19 20 21 22 23
24 25 26 27 28 29 30
1752 is the year when England (and its American colonies) switched from the
Julian calendar to the Gregorian calendar. The "cal" man page states,
"The Gregorian Reformation is assumed to have occurred in 1752 on the 3rd
of September... Ten days following that date were eliminated by the reformation,
so the calendar for that month is a bit unusual." Counting September 3 through
September 13, the cal output actually skips 11 days, as if they had been
eliminated from human history.
We could have built an array of calendar dates indexed by position. To get the previous or next date within the given month, we would then decrement or increment the index instead of the value of the date. This is how we read the calendar. However, since this is the only exception, we simply added two lines to black out those 11 days.
Here is how it works:
$ pn_day 17520902 +1
17520914
$ pn_day 17520914 -1
17520902
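The array-based alternative mentioned above was not implemented in the article, but a minimal sketch could look like the following. It works on the captured date rows of "cal 9 1752" rather than calling cal, and step_in_month is a hypothetical helper name. The array syntax assumes ksh93 or bash.

```shell
# Date rows from "cal 9 1752", with the title and weekday header removed
sept1752='1 2 14 15 16
17 18 19 20 21 22 23
24 25 26 27 28 29 30'

# step_in_month DAYS DAY OFFSET (hypothetical helper): find DAY in the
# list and print the date OFFSET positions away; fail if that position
# falls outside the month or DAY is not in the list.
function step_in_month {
    typeset -a days
    typeset day=$2 off=$3 i
    days=( $1 )
    for (( i = 0; i < ${#days[@]}; i++ )); do
        if (( days[i] == day )); then
            (( i + off >= 0 && i + off < ${#days[@]} )) || return 1
            printf "%s\n" "${days[i + off]}"
            return 0
        fi
    done
    return 1
}

step_in_month "$sept1752" 2 1      # prints: 14
step_in_month "$sept1752" 14 -1    # prints: 2
```

Because the index moves instead of the date value, the gap needs no special case, at the cost of building the array for every month visited.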
cur_weekday YYYYMMDD
This function takes YYYYMMDD as an input and outputs the day of week represented by 0-6 with 0 being Sunday.
Here is how it works:
$ cur_weekday 20050815
1
This is the function:
function cur_weekday {
    typeset ymd=$1 ym y m d i
    (( ymd >= 17520914 && ymd <= 17520930 )) && (( ymd = ymd - 11 ))
    (( d = ymd % 100 ))
    (( ym = ymd / 100 ))
    (( y = ym / 100 ))
    (( m = ym % 100 ))
    cal $m $y | while read i; do
        set -- $i
        [[ $1 == 1 ]] && {
            printf "%s\n" $(( ( 6 + d - $# ) % 7 ))
            break
        }
    done
}
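The day of the week cycles through 0 to 6, so it can be computed as the date plus an adjusting number, modulo 7. The last date in the first row of the cal output is always a Saturday (6). Calling that date n, the adjusting number can be chosen as 6 - n, which gives (6 + d - n) % 7 for day d. Because the first row starts at 1, n is also the number of dates in that row, which is what $# holds in the function code.
We take care of the September 1752 issue by subtracting the 11 missing days from any date after the gap:
$ cur_weekday 17520914
4
September 14, 1752 is a Thursday; it would have been a Monday had the 11 days been present.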
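The arithmetic can be worked through on its own, without calling cal. In this sketch, weekday_from_first_row is a hypothetical helper that receives the first row of dates as text and applies the same expression as cur_weekday.

```shell
# weekday_from_first_row ROW DAY (hypothetical helper): ROW is the first
# row of dates in a cal listing, DAY is the day of the month.
function weekday_from_first_row {
    typeset row=$1 d=$2
    set -- $row                              # $# = number of dates in the row
    printf "%s\n" $(( (6 + d - $#) % 7 ))
}

# August 2005: the first row is "1 2 3 4 5 6", so day 15 gives (6+15-6)%7
weekday_from_first_row "1 2 3 4 5 6" 15    # prints: 1

# September 1752: the first row is "1 2 14 15 16"; day 14 is passed as 14-11
weekday_from_first_row "1 2 14 15 16" 3    # prints: 4
```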
pn_weekday (+)YYYYMMDD W (+|-)x
pn_weekday computes the previous or next xth occurrence of the specified weekday relative to the given date. It takes three parameters: the given date, the weekday to find (0-6), and x, the occurrence to find, counted backward for a negative number and forward for a positive number. If the given date is written with a leading +, as in +YYYYMMDD, the given date itself is included in the search.
For example, to find the third Monday in August 2005, use this:
$ pn_weekday +20050801 1 3
20050815
This is the pn_weekday function:
function pn_weekday {
    typeset ymd=$1 weekday=$2 pn=${3:-0} i x sign found=0 IN=0
    [[ $ymd == +* ]] && IN=1
    if (( pn < 0 ))
    then (( sign = -1 ))
    elif (( pn > 0 ))
    then (( sign = +1 ))
    else (( sign = 0 ))
    fi
    (( i = pn*sign*7 ))
    while (( i > 0 )); do
        (( IN == 0 )) && ymd=$(pn_day $ymd $sign)
        x=$(cur_weekday $ymd)
        (( x == weekday )) && {
            (( found = ymd ))
        }
        (( IN == 1 )) && ymd=$(pn_day $ymd $sign)
        (( i = i - 1 ))
    done
    printf "%s\n" $found
}
The function simply uses pn_day to walk through the calendar one day at a time, for 7*|x| days, and uses cur_weekday to check each day. The last match found is the answer.
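The walk visits every day in the range. As a possible shortcut, the number of days to move can also be computed from the weekday of the starting date alone. The sketch below is an alternative, not the authors' implementation; weekday_offset and its parameters are hypothetical.

```shell
# weekday_offset CUR W X [INCLUDE] (hypothetical helper)
#   CUR     weekday of the starting date (0-6)
#   W       weekday to find (0-6)
#   X       signed occurrence count, as in pn_weekday
#   INCLUDE 1 if the starting date itself counts, 0 (default) otherwise
# Prints the signed number of days from the starting date to the answer.
function weekday_offset {
    typeset cur=$1 w=$2 x=$3 inc=${4:-0} first n
    if (( x == 0 )); then printf "0\n"; return 0; fi
    if (( x > 0 )); then
        (( n = x ))
        first=$(( (w - cur + 7) % 7 ))    # days forward to the nearest W
    else
        (( n = -x ))
        first=$(( (cur - w + 7) % 7 ))    # days backward to the nearest W
    fi
    # If the starting date does not count, a distance of 0 means a full week
    (( inc == 0 && first == 0 )) && first=7
    if (( x > 0 ))
    then printf "%s\n" $(( first + 7*(n-1) ))
    else printf "%s\n" $(( -(first + 7*(n-1)) ))
    fi
}

weekday_offset 1 1 3 1    # from a Monday, third Monday inclusive: prints 14
```

With the article's functions loaded, the result could be handed to pn_day, for example pn_day 20050801 $(weekday_offset $(cur_weekday 20050801) 1 3 1), which replaces the day-by-day calls to cur_weekday with a single one.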
Summary
In this article, we introduced five functions: pn_month, end_month, pn_day, cur_weekday, and pn_weekday. pn_month, end_month, and cur_weekday are independent of the rest of the functions. pn_day is built on top of pn_month and end_month, and pn_weekday is built on top of pn_day and cur_weekday.
There are three ways to use these functions:
These functions demonstrate shell arithmetic. We hope you find the technique interesting and these functions useful. See you next time.
This article and the five date-related functions introduced are also available from http://www.unixlabplus.com/unix-prog/date_function/. Error fixes and enhancements, if any, will be available at the URL.
References
[1] "pman - Oracle partition manager." Michael Wang. Retrieved 14 August 2005.
Julie Wang works for Independence Air. She manages Oracle databases, Unix operating systems, and Lawson enterprise systems. She can be reached at: Julie.Wang@flyi.com.
Michael Wang earned Master's degrees in Physics (Peking, 1987) and Statistics (Columbia, 2001). Currently, he is studying Unix, Oracle, and corporate politics. He can be reached at: xw73@columbia.edu.
Their past technical writings are listed here: http://www.unixlabplus.com/unix-prog/Publication.txt.
Copyright © 2005 UnixReview.com