[tclug-list] Finding the date of the newest file in a directory tree

Thu Jan 10 09:15:26 CST 2008

Actually, I just wanted to find the time, not the file, so by removing the file name, life gets really simple.

I agree, there are just so many tricks out there.  Yes, Google is a good place to start, but at times I've spent hours researching, given up, then asked this list and gotten an answer in minutes.  The amount of Linux experience we have online in TCLUG is huge, and just having someone help narrow down the subject area helps.  Besides, we all learn a little from these inquiries.  I consider myself a pretty experienced programmer and admin, but sometimes even I don't know the answer, or I like to hear how others have solved the problem.  There is always some new trick; that's what makes Linux/UNIX so fun.

Thanks all.

--- 
Wayne Johnson,                         | There are two kinds of people: Those 
3943 Penn Ave. N.          | who say to God, "Thy will be done," 
Minneapolis, MN 55412-1908 | and those to whom God says, "All right, 
(612) 522-7003                         | then,  have it your way." --C.S. Lewis

----- Original Message ----
From: Mike Miller <mbmiller at taxa.epi.umn.edu>
To: TCLUG List <tclug-list at mn-linux.org>
Sent: Wednesday, January 9, 2008 11:40:37 PM
Subject: Re: [tclug-list] Finding the date of the newest file in a directory tree

On Wed, 9 Jan 2008, Florin Iucha wrote:

> And the Oscar goes to:
>
> find /some/dir -type f -printf "%h/%f %T@\n" | awk '{ if ($2 > the_max) { the_max = $2; file_name = $1; } }
> END { print file_name }'
>
> I would like to thank Google for its search engine and the find man
> page for its thorough description of the million options and switches...

This is the stuff I like most on LUG lists -- learning all the cool tricks
with GNU/UNIX/Linux commands.  So much can be done but it takes years to
learn all the efficient ways of doing things.  I've used awk/gawk a
gazillion times but only in a few ways, so using it to find a maximum was
not in my repertoire, but that is an excellent idea.  I always would have
sorted the file even though I knew that couldn't be the best way to go.

That said, there are still some problems with the one-liner above.  First
and foremost, if any file in the tree contains a space in the filename,
the command will fail.  At first I was going to say that the problem is in
the printf argument because it uses a space as the delimiter between
the file name and date stamp:

$ find . -type f -printf "%h/%f %T@\n"
./Lee, Alvin - I'm Going Home.txt 1182200822
./0_TABLATURE_EXPLANATION.txt 1118104853
./Semisonic - FNT.txt 1153491460
./Animals - House of the Rising Sun.tab.txt 1142214281
[snip]

But maybe it is better to say that the problem is with the awk command.
If we replace $2 with $NF and replace $1 with $0, we get this:

find /some/dir -type f -printf "%h/%f %T@\n" | awk '{ if ($NF > the_max) { the_max = $NF; file_name = $0; } }
END { print file_name }'

But the problem with that is that it retains the date stamp at the end 
like so:

./Lee, Alvin - I'm Going Home.txt 1182200822

But that can be removed by adding a little perl (or sed) regexp thingy at
the end:

find /some/dir -type f -printf "%h/%f %T@\n" | awk '{ if ($NF > the_max) { the_max = $NF; file_name = $0; } }
END { print file_name }' | perl -pe 's/^(.+) [0-9]+$/$1/'

That will run almost exactly as fast as the earlier suggestion because the
perl bit at the end is very fast and it is only done on the single line of
output at the end.  On the other hand, you didn't say that you wanted the
filename, you said that you wanted the date.  That simplifies things a
bit!  You can do this:

find /some/dir -type f -printf "%T@\n" | awk '{ if ($1 > the_max) { the_max = $1; } } END { print the_max }'

That returns the modification date of the newest file in seconds since
1970-01-01 00:00:00 UTC.  If you want a different date format, we can
discuss that.  There must be a good trick.  You can get the current time
in that format using the date command as follows:

date +%s
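
As one possible trick for a different format (a sketch only, assuming GNU
date, whose -d option accepts "@N" for N seconds since the epoch), the epoch
value can be converted as shown here.  The time stamp is just the example
value from the listing above, and the variable names are made up for
illustration:

```shell
# Hypothetical example value: a time stamp taken from the listing above.
newest=1182200822

# Drop any fractional part; newer versions of GNU find print one for %T@.
newest=${newest%.*}

# GNU date: "@N" means N seconds since the epoch; -u prints UTC.
formatted=$(date -u -d "@$newest" '+%Y-%m-%d %H:%M:%S')
echo "$formatted"
```

For that example value this prints 2007-06-18 21:07:02.  Dropping -u gives
local time instead.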

There are other forms of weirdness with UNIX filenames, like they can
include a newline, and that will also mess you up, but maybe that never
happens on your system (and if you and your users and your software are
all sane, it won't happen!).
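
If odd filenames are a worry, one option is to have find terminate each
record with a NUL byte instead of a newline, and to put the time stamp first
so that spaces in the name no longer matter.  This is only a sketch: it
assumes GNU find and a GNU coreutils recent enough that sort, tail and cut
all accept -z.  The demo directory and file names are invented for
illustration:

```shell
# Hypothetical demo tree with a space and a newline in the file names.
dir=$(mktemp -d)
touch -d '2007-06-18 21:07:02 UTC' "$dir/old file.txt"
touch -d '2008-01-09 12:00:00 UTC' "$dir/new
file.txt"

# One NUL-terminated record per file: "<epoch seconds> <path>".
# sort -z -n orders the records by the leading number, tail -z keeps the
# last (newest) one, cut -z drops the time stamp, tr removes the final NUL.
newest=$(find "$dir" -type f -printf '%T@ %p\0' \
    | sort -z -n | tail -z -n 1 | cut -z -d ' ' -f 2- | tr -d '\0')

printf '%s\n' "$newest"
rm -rf "$dir"
```

Because the name is everything after the first space, no regexp is needed to
strip the date stamp afterwards.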

Do you want to find the newest file as of the moment your script starts
running, or will you want to detect new files that are created after the
script starts running but before it finishes?  Maybe this isn't an
important consideration for you, but you should be aware that what you
mean by the "newest file" isn't defined precisely by the method you are
using to identify it.
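
If a fixed cutoff is what you want, one way to pin it down (a sketch, not
the only approach) is to create a reference file when the script starts and
have find skip anything with a newer modification time.  The temporary names
and dates below are hypothetical, and note that a file modified during the
run is skipped entirely rather than reported with its old time:

```shell
# Hypothetical demo tree; "$ref" marks the moment the script started.
dir=$(mktemp -d)
ref=$(mktemp)
touch -d '2007-06-18 21:07:02 UTC' "$dir/before.txt"
# A future date stands in for a file written after the script started.
touch -d '2037-01-01 00:00:00 UTC' "$dir/after.txt"

# "! -newer" keeps only files whose mtime is not later than that of "$ref".
newest=$(find "$dir" -type f ! -newer "$ref" -printf '%T@\n' \
    | awk '$1 > max { max = $1 } END { print max }')
newest=${newest%.*}   # drop the fractional seconds newer find versions print

echo "$newest"
rm -rf "$dir" "$ref"
```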

Best,

Mike

