[tclug-list] Problem,

Thu Jan 11 14:54:36 CST 2007

You may want to take a look at pdftk (http://www.accesspdf.com/pdftk/)
and see whether it can read the problem files and write them back out,
which will hopefully give you a valid PDF on output.

There are also some Perl modules that can read and write PDF files:
http://search.cpan.org/~antro/PDF-111/PDF.pm

I've used both fairly successfully to create/manage/manipulate a couple
of million pages of survey results and they are both pretty solid.

I think Ghostscript could also 'print' the PDFs to new PDFs, which
might clear out the errors as well. Finally, ImageMagick can do a
conversion from pdf -> ps -> pdf that might fix things too, although
it is a lot slower than the other options.
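
For what it's worth, here is a rough sketch of how the pdftk or
Ghostscript rewrite could be scripted over a whole directory. It is in
Python rather than Perl purely for illustration; the helper names are
made up, and it assumes `pdftk` and `gs` are installed and on the PATH.

```python
import subprocess
from pathlib import Path


def pdftk_cmd(src, dst):
    # pdftk reads the input and writes a fresh copy:
    #   pdftk in.pdf output out.pdf
    return ["pdftk", str(src), "output", str(dst)]


def gs_cmd(src, dst):
    # Ghostscript "prints" the input to a new PDF via the pdfwrite device.
    # -o names the output file and implies -dBATCH -dNOPAUSE.
    return ["gs", "-q", "-o", str(dst), "-sDEVICE=pdfwrite", str(src)]


def rewrite_all(src_dir, dst_dir, build_cmd=pdftk_cmd):
    """Rewrite every *.pdf in src_dir into dst_dir; return names that failed."""
    dst_dir = Path(dst_dir)
    dst_dir.mkdir(parents=True, exist_ok=True)
    failed = []
    for src in sorted(Path(src_dir).glob("*.pdf")):
        result = subprocess.run(build_cmd(src, dst_dir / src.name))
        if result.returncode != 0:
            # Keep going; collect the files the tool could not handle.
            failed.append(src.name)
    return failed
```

Writing into a separate directory leaves the originals untouched, so
the output can be spot-checked in Acrobat before anything is replaced.
Passing `build_cmd=gs_cmd` switches the same loop over to Ghostscript.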

--rick

G. Scott Walters wrote:
> I've got a couple hundred PDF files that have been malformed with some
> extra lines AFTER the EOF. This keeps them from doing important
> things like printing, or displaying properly on some versions of
> Acrobat. Not all PDFs are necessarily affected by this issue...
> 
> Since these files are hosted on a Linux server, I figured the proper
> tool to solve this problem would be Perl. The question is, how... if I
> open the file with a standard open function, won't it read the file
> until the EOF and not beyond?
> 
> I understand that sed might be helpful, but I'm sed-impaired, though
> I'm working on that.
> 
> Any thoughts or ideas?
>
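
On the open() question: the `%%EOF` in a PDF is just a text marker
inside the file, not an end-of-file condition the OS knows about, so a
normal read returns every byte, trailing junk included (in Perl, calling
binmode on the filehandle keeps the bytes untranslated). Below is a
minimal sketch of trimming whatever follows the last `%%EOF`, again in
Python purely for illustration. The function names are made up, and it
assumes the junk lines do not themselves contain `%%EOF`.

```python
def strip_after_eof(data: bytes) -> bytes:
    """Return data cut just after the last %%EOF marker, plus a newline."""
    marker = b"%%EOF"
    pos = data.rfind(marker)
    if pos == -1:
        # No marker at all: leave the bytes alone rather than guess.
        return data
    return data[: pos + len(marker)] + b"\n"


def fix_file(path):
    """Trim a file in place; return True if anything was changed."""
    # Binary mode so no byte is translated or dropped.
    with open(path, "rb") as fh:
        original = fh.read()
    fixed = strip_after_eof(original)
    if fixed != original:
        with open(path, "wb") as fh:
            fh.write(fixed)
    return fixed != original
```

Searching for the last marker rather than the first matters because a
PDF that has been incrementally updated can legitimately contain more
than one `%%EOF`. Since `fix_file` overwrites in place, it is worth
trying on copies first.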