Hi,
For history, see here:
http://itext-general.2136553.n4.nabble.com/5-1-3-PDF-parsing-changes-td4145786.html

part4.chapter15.MyImageRenderListener => http://itextpdf.com/examples/iia.php?id=283
part3.chapter10.ImageTypes => http://itextpdf.com/examples/iia.php?id=183

Tested with iTextSharp 5.2.0.0, don't know about the Java version.

The C# version still tries to extract the JBIG2 image from the sample PDF "image_types.pdf" created by part3.chapter10.ImageTypes. A corrupt file named "Image18.jbig2" is still being created, even though an Exception is thrown:

>> part4.chapter15.ExtractImages.exe
[UnsupportedPdfException: The color depth 4 is not supported.]
   at iTextSharp.text.pdf.parser.PdfImageObject.DecodeImageBytes()
   at iTextSharp.text.pdf.parser.ImageRenderInfo.PrepareImageObject()
   at iTextSharp.text.pdf.parser.ImageRenderInfo.GetImage()
   at part4.chapter15.MyImageRenderListener.RenderImage(ImageRenderInfo renderInfo)

Thanks

_______________________________________________
iText-questions mailing list
[hidden email]
https://lists.sourceforge.net/lists/listinfo/itext-questions

iText(R) is a registered trademark of 1T3XT BVBA.
Many questions posted to this list can (and will) be answered with a reference to the iText book: http://www.itextpdf.com/book/
Please check the keywords list before you ask for examples: http://itextpdf.com/themes/keywords.php

JBIG2 requires special decoders that aren't part of iText. At some point, I intend to make the decoders extensible (so folks can register their own decoders, even if iText doesn't provide one out of the box). At this stage, the stream bytes that PdfImageObject returns are valid JBIG2 content (not complete though, because JBIG2 decoders also have to handle the 'globals' data - see JBIG2GLOBALS, which is stored in a separate stream in the PDF).

Long and short, the simple extraction you are trying to do isn't going to work with JBIG2. You'll need to configure a JBIG2 decoder with the globals, then use PdfImageObject.getImageAsBytes() to get the raw bytes and run them through the decoder.

Hi Kevin,
On Wed, Mar 7, 2012 at 10:14 PM, Kevin Day <[hidden email]> wrote:
> jbig2 requires special decoders that aren't part of iText. At some point, I
> intend to make the decoders extensible (so folks can register their own
> decoders, even if iText doesn't provide it out of the box). At this stage,
> the stream bytes that PdfImageObject returns are valid jbig2 content (not
> complete though, b/c JBig2 decoders have to also handle the 'globals' data -
> see JBIG2GLOBALS, which is stored in a separate stream in the PDF).
>
> Long and short, the simple extraction you are trying to do isn't going to
> work with Jbig2. You'll need to configure a jbig2 decoder with the globals,
> then use PdfImageObject.getImageAsBytes() to get the raw bytes and run them
> through the decoder.

Sorry, I think I may not have explained well enough again...

According to the 5.1.3 changelog:
http://itextpdf.com/history/?branch=51&node=513

<quote>
Parsing PDF for images: add jbig2 streams to pass through
</quote>

Maybe I'm misinterpreting or reading too much into that, but to me that means jbig2 streams should be ignored, since parsing **isn't** fully supported yet. I understand that part - no problem :)

But even if you try and catch the Exception, there's no way to stop the stream from being written. Or very possibly, I'm not smart enough to figure out how to ignore the jbig2 content :(

In other words, I don't have a need to process jbig2, I'm just trying to understand the correct/recommended process of dealing with parsing images in general.

Thanks for the great work on all the parsers.

The ExtractImages code you are executing is a *sample* program. All it does is take the image contents for each image it finds and write them to file. If you want to do something different with some of the images, put an if statement in your code and handle those images differently. The effective type of the image bytes is made available via PdfImageObject#getImageBytesType().
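
A minimal sketch of the "if statement" idea, for anyone following along. To keep it runnable without the iText jar, the decision is pulled out into a small helper; the class name ImageTypeFilter and the method shouldWrite are hypothetical and not part of iText. It assumes the type is passed in as the file-extension string (the "jbig2" in "Image18.jbig2" from the report above), which in the listener would presumably come from PdfImageObject.getFileType(); comparing against the getImageBytesType() value would work the same way.

```java
import java.util.Collections;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

/**
 * Hypothetical helper (not part of iText): decides whether an extracted
 * image should be written to disk, based on its file-type string.
 */
final class ImageTypeFilter {

    // Types this sketch refuses to write as-is. Only "jbig2" is listed,
    // because the raw stream is incomplete without the JBIG2 globals.
    private static final Set<String> SKIPPED;
    static {
        Set<String> s = new HashSet<String>();
        s.add("jbig2");
        SKIPPED = Collections.unmodifiableSet(s);
    }

    /** Returns true if an image of the given type should be written. */
    static boolean shouldWrite(String fileType) {
        if (fileType == null) {
            return false; // unknown type: do not write anything
        }
        return !SKIPPED.contains(fileType.toLowerCase(Locale.ROOT));
    }

    public static void main(String[] args) {
        for (String type : new String[] {"png", "jpg", "jbig2"}) {
            System.out.println(type + " -> " + (shouldWrite(type) ? "write" : "skip"));
        }
    }

    // Intended use inside a RenderListener (assumption, not compiled here):
    //
    //   PdfImageObject image = renderInfo.getImage();
    //   if (image != null && ImageTypeFilter.shouldWrite(image.getFileType())) {
    //       // open the output file and write image.getImageAsBytes()
    //   }
    //
    // Opening the output file only after this check means a skipped
    // image never leaves a file behind.
}
```

The same check could be inverted into an allow-list if only a few known types should ever be written.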
On Thu, Mar 8, 2012 at 2:49 PM, Kevin Day <[hidden email]> wrote:
> The ExtractImages code you are executing is a *sample* program. All it does
> it take the image contents for each image it finds and write them to file.
> If you want to do something different with some of the images, put in an if
> statement in your code and handle the images differently. The effective
> type of the image bytes is made available via
> PdfImageObject#getImageBytesType().

Sorry about that, my bad. Looked at the source code - ImageBytesType has been available since 5.0.4.

What confused me was the 5.1.3 changelog reference to jbig2 pass through, along with the fact that the *book examples* (commonly referenced on the mailing list as part of the documentation) prior to 5.1.3 did *explicitly* test what kind of image was being parsed:

http://itext.svn.sourceforge.net/viewvc/itext/book/src/part4/chapter15/MyImageRenderListener.java?revision=4547&view=markup

while the book examples from 5.1.3 onward no longer check the type. So I incorrectly assumed that from 5.1.3 and up you no longer need to test the file type.

Thanks for the tip!
