|
Hello iText users,
I have attached a redacted PDF that I created with PDFTron. I iterate over the objects in this file with iText using the following code:

    ArrayList<String> ids = new ArrayList<String>(reader.getXrefSize());
    for (int i = 0; i < reader.getXrefSize(); i++) {
        PdfObject pdfObject = reader.getPdfObject(i);
    }

When getPdfObject(133) is called, the following exception is thrown:

    java.io.IOException: Unexpected end of file at file pointer 33542

I have had a look at the PDF in a hex editor but couldn't figure out whether the PDF is malformed or iText is behaving incorrectly. Does anyone know the PDF specification well enough to say where the problem lies?

All the best,
Keith |
|
Keith,
This is funny. Just yesterday I learned about the trouble an integer as the final object in an object stream can cause (cf. "PDFReader fails when loading the PDF"), and already there is the next file demonstrating that trouble... ;)

The troubling content is located in the object stream 66 0:

    67 0 ... 133 32959 <</DA (/Helv 0 Tf 0 g ) ... <</SM 0.1/TR2 /Default>>4791

The behavior of iText depends on which iText version you use (I compared 4.2 and trunk) and on which PdfReader constructor you use (I compared PdfReader(String filename) and PdfReader(RandomAccessFileOrArray raf, byte ownerPassword[])):

iText 4.2, PdfReader(String): Reading the object stream fails during PdfReader construction, resulting in an InvalidPdfException: trailer not found.
iText 4.2, PdfReader(RandomAccessFileOrArray, byte[]): Reading the single object 133 0 fails, resulting in a java.io.IOException: Unexpected end of file.
iText TRUNK, PdfReader(String): Reading the single object 133 0 returns null.
iText TRUNK, PdfReader(RandomAccessFileOrArray, byte[]): Reading the single object 133 0 returns 4791.

So I assume you are using an older version of iText in which integers in object streams still meant trouble. Looking at the above results, you might want to consider updating.

Regards,
Michael

PS: The different results of the two iText trunk runs are due to the PdfReader(String) constructor eventually removing all unused objects, while in the other case the data is read just-in-time without a usage check. In the light of this, you might want to look into your choice of redaction tool: if it leaves behind unused objects, these objects might contain data which was hoped to be redacted out of the document. |
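To illustrate why an integer at the very end of an object stream can trip up a parser, here is a minimal Java sketch. It rests on an assumption that is not confirmed anywhere in this thread: that the reader, after seeing an integer, looks ahead for a generation number and an "R" to decide whether the value is really an indirect reference such as "4791 0 R". The class ObjStmTailSketch and all of its methods are hypothetical and are not part of the iText API; the whitespace-only tokenizer is a simplification.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration only; this is NOT iText code.
public class ObjStmTailSketch {

    // Simplified tokenizer: splits on whitespace only.
    static List<String> tokenize(String data) {
        return Arrays.asList(data.trim().split("\\s+"));
    }

    static boolean isInteger(String token) {
        return token.matches("\\d+");
    }

    // Reads the value starting at pos. After an integer, it looks ahead
    // for "<generation> R" to detect an indirect reference.
    static String readValue(List<String> tokens, int pos, boolean tolerateEof) {
        String first = tokens.get(pos);
        if (!isInteger(first)) {
            return first;
        }
        if (pos + 1 >= tokens.size()) {
            return atEof(first, tolerateEof); // nothing left to look ahead at
        }
        String second = tokens.get(pos + 1);
        if (!isInteger(second)) {
            return first; // plain integer followed by something else
        }
        if (pos + 2 >= tokens.size()) {
            return atEof(first, tolerateEof);
        }
        if ("R".equals(tokens.get(pos + 2))) {
            return first + " " + second + " R"; // indirect reference
        }
        return first;
    }

    // A strict reader treats running out of data as an error;
    // a tolerant one accepts the integer as the final object.
    static String atEof(String value, boolean tolerateEof) {
        if (tolerateEof) {
            return value;
        }
        throw new IllegalStateException("Unexpected end of file");
    }

    public static void main(String[] args) {
        List<String> tail = tokenize("/Default 4791");
        System.out.println(readValue(tail, 1, true)); // prints "4791"
        try {
            readValue(tail, 1, false);
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage()); // prints "Unexpected end of file"
        }
    }
}
```

In this sketch the strict mode corresponds to the failures reported for the older version, and the tolerant mode to the trunk run that returns 4791. Whether iText's tokenizer works this way internally is a guess made for illustration.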
