On Wed, 10 Jan 2007, G. Scott Walters wrote:

> I've got a couple hundred PDF files that have been malformed with some
> extra lines AFTER the EOF. This keeps them from doing important
> things like printing, or displaying properly on some versions of
> Acrobat. Not all PDFs are necessarily affected by this issue...
>
> Since these files are hosted on a linux server, I figured the proper
> tool to solve this problem would be PERL. The question is, how....if I
> open the file with a standard open function, won't it read the file until
> the EOF and not beyond?
>
> I understand that SED might be helpful, but I'm sed-impaired, though I'm
> working on that.

Perl reads to the physical end of the file; the %%EOF line is just text to it, so anything after it is read too. This should do it:

    perl -pi -e 'BEGIN{undef $/} ; s/\A(.+?%%EOF).*\z/$1\n/gs' *.pdf

That will remove everything after the first %%EOF in all .pdf files in the current directory, leaving a single newline after the marker. I tested it on some files and it worked.

It can also be run on a file that is not corrupted, as long as the file contains only one %%EOF: the contents are then left unchanged (apart from reducing whatever follows %%EOF to a single newline), but the date stamp will change. A PDF that contains more than one %%EOF, such as one saved with incremental updates, would be cut at the first marker. It is pretty fast.

Best,
Mike