10.3 Updating A Timestamp File
==============================
Suppose we have a directory full of files which is maintained with a set
of automated tools; perhaps one set of tools updates them and another
set of tools uses the result. In this situation, it might be useful for
the second set of tools to know if the files have recently been changed.
It might be useful, for example, to have a 'timestamp' file which gives
the timestamp on the newest file in the collection.
We can use 'find' to achieve this, but there are several different
ways to do it.
10.3.1 Updating the Timestamp The Wrong Way
-------------------------------------------
The obvious but wrong answer is just to use '-newer':
find subdir -newer timestamp -exec touch -r {} timestamp \;
This does the right sort of thing but has a bug. Suppose that two
files in the subdirectory have been updated, and that these are called
'file1' and 'file2'. The command above will update 'timestamp' with the
modification time of 'file1' or that of 'file2', but we don't know which
one. Since the timestamps on 'file1' and 'file2' will in general be
different, this could well be the wrong value.
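The bug can be reproduced with a small experiment. The sketch below is only an illustration: the scratch directory, the dates and the '-type f' test are assumptions added here (without '-type f', 'subdir' itself would also match, since it was modified just now).

```shell
#!/bin/sh
# Sketch: '-newer' compares against the reference time read when 'find' starts.
work=$(mktemp -d) && cd "$work" || exit 1
trap 'rm -rf "$work"' EXIT

mkdir subdir
touch -t 201901010000 timestamp       # old reference file
touch -t 202001020000 subdir/file1    # the newest file
touch -t 202001010000 subdir/file2    # newer than timestamp, older than file1

# Both files match, because neither is compared against an updated timestamp.
matched=$(find subdir -type f -newer timestamp -print | wc -l)
echo "files matched: $matched"

# The command from the text: 'touch' runs once per match, in traversal order.
find subdir -type f -newer timestamp -exec touch -r {} timestamp \;

# If file2 happened to be visited last, file1 is still newer than 'timestamp'.
if [ subdir/file1 -nt timestamp ]; then
    echo "timestamp is stale"
else
    echo "timestamp is correct, but only because file1 was visited last"
fi
```

Which of the two messages appears depends on the order in which the directory entries are returned, which is exactly the problem.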
One solution to this problem is to modify 'find' to recheck the
modification time of 'timestamp' every time a file is to be compared
against it, but that will reduce the performance of 'find'.
10.3.2 Using the test utility to compare timestamps
---------------------------------------------------
The 'test' command can be used to compare timestamps:
find subdir -exec test {} -nt timestamp \; -exec touch -r {} timestamp \;
This will ensure that any changes made to the modification time of
'timestamp' that take place during the execution of 'find' are taken
into account. This resolves our earlier problem, but unfortunately it
runs much more slowly, because 'test' is executed once for every file
under 'subdir'.
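To check that this variant gives the right answer whatever the traversal order, we can rerun the experiment. As before, this is a sketch: the scratch directory, the dates and the '-type f' test are assumptions made for the demonstration.

```shell
#!/bin/sh
# Sketch: 'test -nt' rereads the timestamp file for every comparison.
work=$(mktemp -d) && cd "$work" || exit 1
trap 'rm -rf "$work"' EXIT

mkdir subdir
touch -t 201901010000 timestamp
touch -t 202001020000 subdir/file1    # the newest file
touch -t 202001010000 subdir/file2

find subdir -type f -exec test {} -nt timestamp \; -exec touch -r {} timestamp \;

# Afterwards no file should be newer than 'timestamp'.
remaining=$(find subdir -type f -newer timestamp -print | wc -l)
echo "files still newer than timestamp: $remaining"
```

If 'file2' is visited first, 'timestamp' briefly takes its time and is then overwritten when 'file1' is reached; if 'file1' is visited first, 'file2' simply fails the '-nt' comparison.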
10.3.3 A combined approach
--------------------------
We can of course still use '-newer' to cut down on the number of calls
to 'test':
find subdir -newer timestamp -and \
-exec test {} -nt timestamp \; -and \
-exec touch -r {} timestamp \;
Here, the '-newer' test excludes all the files which are definitely
older than the timestamp, but all the files which are newer than the old
value of the timestamp are compared against the current updated
timestamp.
This is indeed faster in general, but the speed difference will
depend on how many updated files there are.
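One way to see where the saving comes from is to count how many times the comparison is actually run. The sketch below replaces 'test' with a small 'sh -c' wrapper that logs each call; the wrapper, the 'LOG' variable and the file names are hypothetical and exist only for this measurement.

```shell
#!/bin/sh
# Sketch: count comparison calls with and without the '-newer' pre-filter.
work=$(mktemp -d) && cd "$work" || exit 1
trap 'rm -rf "$work"' EXIT

mkdir subdir
touch -t 201901010000 timestamp
for n in 1 2 3; do
    touch -t 201801010000 "subdir/old$n"    # older than timestamp
done
touch -t 202001010000 subdir/new1           # newer than timestamp
touch -t 202001020000 subdir/new2

LOG=$work/calls.log
export LOG

# Without '-newer': the comparison runs for every file.
: > "$LOG"
find subdir -type f \
    -exec sh -c 'echo call >> "$LOG"; test "$1" -nt timestamp' sh {} \;
plain=$(wc -l < "$LOG")

# With '-newer': only files newer than the original timestamp are compared.
: > "$LOG"
find subdir -type f -newer timestamp \
    -exec sh -c 'echo call >> "$LOG"; test "$1" -nt timestamp' sh {} \;
combined=$(wc -l < "$LOG")

echo "calls without -newer: $plain, with -newer: $combined"
```

With three old files and two updated ones, the pre-filter removes three of the five calls. When nearly every file has been updated, the two counts converge and little is gained.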
10.3.4 Using '-printf' and 'sort' to compare timestamps
-------------------------------------------------------
It is possible to use the '-printf' action to abandon the use of 'test'
entirely:
newest=$(find subdir -newer timestamp -printf "%A@:%p\n" |
sort -n |
tail -n1 |
cut -d: -f2- )
touch -r "${newest:-timestamp}" timestamp
The command above works by generating a list of the timestamps and
names of all the files which are newer than the timestamp. The 'sort',
'tail' and 'cut' commands simply pull out the name of the file with the
largest timestamp value (that is, the latest file). The 'touch' command
is then used to update the timestamp.
The '"${newest:-timestamp}"' expression simply expands to the value
of '$newest' if that variable is set and not empty, but to 'timestamp'
otherwise. When no file is newer than 'timestamp', the command
substitution leaves '$newest' empty, so this ensures that an argument
is always given to the '-r' option of the 'touch' command.
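The behaviour of the ':-' expansion can be checked in isolation. In this small sketch the 'ref_*' variable names are made up for the illustration.

```shell
#!/bin/sh
# Sketch: how "${newest:-timestamp}" behaves in the three possible cases.

newest=""                        # set but empty, as when 'find' prints nothing
ref_empty=${newest:-timestamp}
echo "empty:   $ref_empty"

newest="subdir/file1"            # a file was found
ref_set=${newest:-timestamp}
echo "set:     $ref_set"

unset newest                     # variable does not exist at all
ref_unset=${newest:-timestamp}
echo "unset:   $ref_unset"
```

Only the second case uses the value of the variable; in the other two, 'touch -r timestamp timestamp' is run, which leaves the file's times unchanged.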
This approach seems quite efficient, but unfortunately it has a
problem. Many operating systems now keep file modification time
information at a granularity which is finer than one second. Findutils
version 4.3.3 and later will print a fractional part with %A@, but older
versions will not.
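The pipeline can be tried out on a scratch directory, as in the sketch below. Two things here are assumptions rather than part of the text above: the '-type f' test, and the use of '%T@' (the modification time directive in GNU 'find') in place of '%A@' (the access time), on the grounds that '-newer' itself compares modification times.

```shell
#!/bin/sh
# Sketch: pick the newest file with -printf, sort, tail and cut (GNU find).
work=$(mktemp -d) && cd "$work" || exit 1
trap 'rm -rf "$work"' EXIT

mkdir subdir
touch -t 201901010000 timestamp
touch -t 202001020000 subdir/file1    # the newest file
touch -t 202001010000 subdir/file2

# Each line is "SECONDS[.FRACTION]:NAME"; 'sort -n' orders by the leading
# number, including the fractional part when 'find' prints one.
newest=$(find subdir -type f -newer timestamp -printf '%T@:%p\n' |
         LC_ALL=C sort -n |
         tail -n1 |
         cut -d: -f2-)
echo "newest: ${newest:-none}"

touch -r "${newest:-timestamp}" timestamp
```

Because the records are separated by newlines, this still fails for file names that contain a newline.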
10.3.5 Solving the problem with 'make'
--------------------------------------
Another tool which often works with timestamps is 'make'. We can use
'find' to generate a 'Makefile' file on the fly and then use 'make' to
update the timestamps:
makefile=$(mktemp)
find subdir \
\( \! -xtype l \) \
-newer timestamp \
-printf "timestamp:: %p\n\ttouch -r %p timestamp\n\n" > "$makefile"
make -f "$makefile"
rm -f "$makefile"
Unfortunately, although the solution above is quite elegant, it fails
to cope with white space within file names, and adjusting it to do so
would require a rather complex shell script.
10.3.6 Coping with odd filenames too
------------------------------------
We can fix both of these problems (looping and problems with white
space), and do things more efficiently too. The following command works
even with file names that contain newlines, and doesn't need to sort the
list of file names.
find subdir -newer timestamp -printf "%A@:%p\0" |
perl -0 newest.pl |
xargs --no-run-if-empty --null -I {} \
find {} -maxdepth 0 -newer timestamp -exec touch -r {} timestamp \;
The first 'find' command generates a list of files which are newer
than the original timestamp file, and prints a list of them with their
timestamps. The 'newest.pl' script simply filters out all the filenames
which have timestamps which are older than whatever the newest file is:
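#! /usr/bin/perl -0
use strict;
use warnings;
my @newest = ();
my $latest_stamp = undef;
while (<>) {
    chomp;    # strip the trailing NUL record separator
    # Split on the first colon only, so file names may contain colons.
    my ($stamp, $name) = split(/:/, $_, 2);
    if (!defined($latest_stamp) || ($stamp > $latest_stamp)) {
        # A strictly newer file: discard the previous candidates.
        $latest_stamp = $stamp;
        @newest = ();
    }
    if ($stamp >= $latest_stamp) {
        push @newest, $name;
    }
}
print join("\0", @newest);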
This prints a list of zero or more files, all of which are newer than
the original timestamp file, and which have the same timestamp as each
other, to the nearest second. The second 'find' command takes each
resulting file one at a time, and if that is newer than the timestamp
file, the timestamp is updated.
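Where Perl is not available, a similar result can be had from NUL-terminated records alone. The sketch below is an alternative, not the method described above: it does sort the list, and it assumes GNU 'find' together with a GNU coreutils recent enough to provide the '-z' option of 'sort', 'tail' and 'cut'. The '-type f' test, the '%T@' directive and the file names are likewise choices made for the demonstration.

```shell
#!/bin/sh
# Sketch: newline-safe selection of the newest file using NUL-terminated records.
work=$(mktemp -d) && cd "$work" || exit 1
trap 'rm -rf "$work"' EXIT

mkdir subdir
nl='
'
touch -t 201901010000 timestamp
touch -t 202001010000 "subdir/plain file"        # contains a space
touch -t 202001020000 "subdir/odd${nl}name"      # contains a newline; newest

# Records look like "SECONDS.FRACTION:NAME" followed by a NUL byte.
find subdir -type f -newer timestamp -printf '%T@:%p\0' |
    LC_ALL=C sort -z -n |
    tail -z -n 1 |
    cut -z -d: -f2- |
    xargs --null --no-run-if-empty -I {} touch -r {} timestamp
```

If several files share the newest time, this picks just one of them, which is sufficient because they would all set 'timestamp' to the same value.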