Data extraction from /Filter /FlateDecode PDF stream in PHP

Since you didn't say whether you need just one decompressed stream or all of them, I'll suggest a simple command-line tool that decompresses every stream in the PDF in one go: Jay Berkenbilt's qpdf.

Example command line:

 qpdf --qdf --object-streams=disable in.pdf out.pdf

out.pdf can then be inspected in a text editor (only embedded ICC profiles, images and fonts may still be binary).

qpdf also reorders the objects automatically and writes the PDF syntax in a normalized form, adding a comment that tells you the original object ID of each decompressed object.
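If you want to script this step, the following is a minimal sketch rather than a definitive recipe. The helper names pdf_expand and list_original_ids are made up for illustration; only the qpdf options come from the command above. It also assumes that the comments qpdf writes contain the phrase "Original object ID" and that your grep supports -a (GNU and BSD grep do).

```shell
#!/bin/sh
# Hypothetical helpers; only the qpdf options are taken from the answer.

pdf_expand() {
    # $1 = input PDF, $2 = output file with decompressed streams
    [ "$#" -eq 2 ] || { echo "usage: pdf_expand in.pdf out.pdf" >&2; return 2; }
    qpdf --qdf --object-streams=disable "$1" "$2"
}

list_original_ids() {
    # Assumption: the comment text contains "Original object ID".
    # -a treats the file as text even if binary streams remain,
    # -n prefixes each match with its line number.
    grep -a -n 'Original object ID' "$1"
}
```

Typical use would be `pdf_expand in.pdf out.pdf` followed by `list_original_ids out.pdf`, which gives you line numbers to jump to in your editor.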

Should you need to re-compress the file (for example after editing it), just run this command:

 qpdf out-edited.pdf out-recompressed.pdf

(You may see a warning message saying that the utility tried to repair a damaged file. That is expected after manual edits, because stream lengths and cross-reference offsets no longer match.)
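For scripted use, here is a hedged sketch of the re-compression step. The function name pdf_recompress is hypothetical. It relies on qpdf's documented exit statuses (0 for success, 2 for errors, 3 for warnings only), so a repair warning is not treated as a failure.

```shell
#!/bin/sh
# Hypothetical helper; the qpdf invocation itself is the one from the answer.

pdf_recompress() {
    # $1 = edited file, $2 = re-compressed output
    [ "$#" -eq 2 ] || { echo "usage: pdf_recompress edited.pdf out.pdf" >&2; return 2; }
    qpdf "$1" "$2"
    status=$?
    # Exit status 3 means "warnings only": the output file was still written.
    if [ "$status" -eq 3 ]; then
        return 0
    fi
    return "$status"
}
```

Call it as `pdf_recompress out-edited.pdf out-recompressed.pdf`. qpdf also ships a separate fix-qdf program intended for repairing hand-edited QDF files before re-compression, which may avoid the warning altogether.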

qpdf is cross-platform and available from SourceForge.
