One table in PDF contains HTML and needs processed

jamesbury
I need help figuring out how to parse one PdfPTable that contains a Paragraph with one Phrase in it that may contain HTML. I have tried using XMLWorkerHelper.parseXHtml, but it isn't working as I expect: it leaves the HTML tags as they are in the table and then puts the processed HTML at the end of the document.



My code is as follows:

document.open();

// generateEventDescriptionPdfTable returns a PdfPTable object
document.addTable(manager.generateEventDescriptionPdfTable(fieldMaster));

// call this right before the .close()
XMLWorkerHelper.getInstance().parseXHtml(
        PdfWriter.getInstance(document, response.getOutputStream()),
        document,
        new ByteArrayInputStream(
                manager.getSourcingEvent().getDescription().getBytes("UTF-8")));
Re: One table in PDF contains HTML and needs processed

jamesbury
I was able to get it to process the text using the following code:

HtmlPipelineContext htmlContext = new HtmlPipelineContext(null);
htmlContext.setTagFactory(Tags.getHtmlTagProcessorFactory());
htmlContext.autoBookmark(false);

// Pipelines
ElementList elements = new ElementList();
ElementHandlerPipeline end = new ElementHandlerPipeline(elements, null);
HtmlPipeline html = new HtmlPipeline(htmlContext, end);

// XML Worker
XMLWorker worker = new XMLWorker(html, true);
XMLParser p = new XMLParser(worker);
p.parse(new ByteArrayInputStream(text.getBytes("UTF-8")));

Paragraph para = new Paragraph();
para.setFont(font);

for (Element elem : elements) {
    para.add(elem);
}

cell = new PdfPCell(para);
cell.setBorder(border);
cell.setColspan(colSpan);

The problem now is that the formatting doesn't "stick": bold text isn't bold, bullet lists come out as plain text, and so on. Is something obviously wrong?
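
A possible explanation, offered as a sketch rather than a confirmed diagnosis: the pipeline above has no CssResolverPipeline, and XML Worker appears to apply tag styling such as bold through its default CSS, so without that stage the elements may come out unstyled. Separately, passing a Paragraph to the PdfPCell constructor puts the cell in text mode, where block elements such as lists tend to be flattened; adding the parsed elements with cell.addElement(...) uses composite mode instead. The names HtmlCellSketch, parseHtml, and htmlToCell below are hypothetical, and the code assumes iText 5 with the XML Worker jar on the classpath.

```java
import com.itextpdf.text.Element;
import com.itextpdf.text.pdf.PdfPCell;
import com.itextpdf.tool.xml.ElementList;
import com.itextpdf.tool.xml.XMLWorker;
import com.itextpdf.tool.xml.XMLWorkerHelper;
import com.itextpdf.tool.xml.html.Tags;
import com.itextpdf.tool.xml.parser.XMLParser;
import com.itextpdf.tool.xml.pipeline.css.CSSResolver;
import com.itextpdf.tool.xml.pipeline.css.CssResolverPipeline;
import com.itextpdf.tool.xml.pipeline.end.ElementHandlerPipeline;
import com.itextpdf.tool.xml.pipeline.html.HtmlPipeline;
import com.itextpdf.tool.xml.pipeline.html.HtmlPipelineContext;

import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;

// Hypothetical helper class, not part of iText.
final class HtmlCellSketch {

    // Parses an HTML fragment into iText elements, with a CSS stage in front
    // of the HTML stage so the default stylesheet can be applied.
    static ElementList parseHtml(String html) throws IOException {
        // true = load XML Worker's built-in default CSS
        CSSResolver cssResolver = XMLWorkerHelper.getInstance().getDefaultCssResolver(true);

        HtmlPipelineContext htmlContext = new HtmlPipelineContext(null);
        htmlContext.setTagFactory(Tags.getHtmlTagProcessorFactory());
        htmlContext.autoBookmark(false);

        // Pipelines, chained as CSS -> HTML -> element collector
        ElementList elements = new ElementList();
        ElementHandlerPipeline end = new ElementHandlerPipeline(elements, null);
        HtmlPipeline htmlPipeline = new HtmlPipeline(htmlContext, end);
        CssResolverPipeline cssPipeline = new CssResolverPipeline(cssResolver, htmlPipeline);

        // The worker starts at the CSS pipeline, not the HTML pipeline
        XMLWorker worker = new XMLWorker(cssPipeline, true);
        XMLParser parser = new XMLParser(worker);
        parser.parse(new ByteArrayInputStream(html.getBytes(StandardCharsets.UTF_8)));
        return elements;
    }

    // Builds a cell in composite mode by adding each parsed element directly,
    // instead of wrapping everything in a Paragraph first.
    static PdfPCell htmlToCell(String html) throws IOException {
        PdfPCell cell = new PdfPCell();
        for (Element elem : parseHtml(html)) {
            cell.addElement(elem);
        }
        return cell;
    }
}
```

The returned cell can still take the same setBorder(border) and setColspan(colSpan) calls as before. Note that in composite mode the text font comes from the parsed elements, so para.setFont(font) no longer has an equivalent here; any font choice would have to be expressed through CSS or the HTML itself.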