Re: pdf extractpages function


Re: pdf extractpages function

Stephen LeCompte

To Whom It May Concern:

Question:  How come Adobe rectangle does not get extracted with iTextSharp?

Version:  itextsharp-all-5.5.0

I have an Adobe PDF that has a rectangle highlighting some data on the 2nd page (provided as an attachment), and yet when I do an ExtractPages using the following code, the rectangle is not transferred to the new document.

The rectangle highlight is important because it sets one item apart from the others.

Otherwise, great job, and thank you sincerely for the .dll!

In case the following code does not show: I took the ExtractPages function from the following website:

http://forums.asp.net/t/1630140.aspx?extracting+pdf+pages+using+itextsharp

Thanks again,

Stephen

private static void ExtractPages(string inputFile, string outputFile, int start, int end)
{
    // get input document
    PdfReader inputPdf = new PdfReader(inputFile);

    // retrieve the total number of pages
    int pageCount = inputPdf.NumberOfPages;

    if (end < start || end > pageCount)
    {
        end = pageCount;
    }

    // load the input document
    Document inputDoc =
        new Document(inputPdf.GetPageSizeWithRotation(1));

    // create the filestream
    using (FileStream fs = new FileStream(outputFile, FileMode.Create))
    {
        // create the output writer
        PdfWriter outputWriter = PdfWriter.GetInstance(inputDoc, fs);
        inputDoc.Open();
        PdfContentByte cb1 = outputWriter.DirectContent;

        // copy pages from input to output document
        for (int i = start; i <= end; i++)
        {
            inputDoc.SetPageSize(inputPdf.GetPageSizeWithRotation(i));
            inputDoc.NewPage();

            PdfImportedPage page =
                outputWriter.GetImportedPage(inputPdf, i);
            int rotation = inputPdf.GetPageRotation(i);

            if (rotation == 90 || rotation == 270)
            {
                cb1.AddTemplate(page, 0, -1f, 1f, 0, 0,
                    inputPdf.GetPageSizeWithRotation(i).Height);
            }
            else
            {
                cb1.AddTemplate(page, 1f, 0, 0, 1f, 0, 0);
            }
        }

        inputDoc.Close();
    }
    inputPdf.Close();
}

 

Kind Regards,

Stephen LeCompte
Application Developer

Shen Milsom & Wilke LLC
712 Main Street Suite 730, Houston, TX 77002
Main 713.278.8228  |  Fax 713.278.5330 | Mobile 832.607.8659
[hidden email]  www.smwllc.com

Acoustics | Audiovisual | Information Technology | Security | Medical Equipment Planning

 



_______________________________________________
iText-questions mailing list
[hidden email]
https://lists.sourceforge.net/lists/listinfo/itext-questions

iText(R) is a registered trademark of 1T3XT BVBA.
Many questions posted to this list can (and will) be answered with a reference to the iText book: http://www.itextpdf.com/book/
Please check the keywords list before you ask for examples: http://itextpdf.com/themes/keywords.php

Attachment: Example.pdf (326K)

Re: pdf extractpages function

mkl
Stephen LeCompte wrote:
> Question:  How come Adobe rectangle does not get extracted with iTextSharp?
>
> Version:  itextsharp-all-5.5.0
>
> I have an Adobe PDF that has a rectangle highlighting some data on the 2nd page (provided as an attachment) and yet when I do an ExtractPages using the following code - the rectangle is not transferred to the new document.

You use code which only copies page content. That rectangle, though, is an annotation, and annotations are not part of the page content but instead extra data.
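
One way to check whether the rectangle really is an annotation is to list the entries of the page's /Annots array. The following is only a sketch, assuming iTextSharp 5.x; the class and method names (AnnotationLister, ListAnnotations) are made up for illustration and are not part of the library.

```csharp
using System;
using iTextSharp.text.pdf;

static class AnnotationLister
{
    // Hypothetical helper: prints the subtype of every annotation on one page.
    public static void ListAnnotations(string inputFile, int pageNumber)
    {
        PdfReader reader = new PdfReader(inputFile);
        try
        {
            // The page dictionary holds the annotations in its /Annots array,
            // separate from the page's content stream.
            PdfDictionary page = reader.GetPageN(pageNumber);
            PdfArray annots = page.GetAsArray(PdfName.ANNOTS);
            if (annots == null)
            {
                Console.WriteLine("Page {0} has no annotations.", pageNumber);
                return;
            }

            for (int i = 0; i < annots.Size; i++)
            {
                PdfDictionary annot = annots.GetAsDict(i);
                if (annot == null)
                {
                    continue;
                }
                // A rectangle drawn with the commenting tools would normally
                // show up with the subtype /Square.
                PdfName subtype = annot.GetAsName(PdfName.SUBTYPE);
                Console.WriteLine("Annotation {0}: {1}", i, subtype);
            }
        }
        finally
        {
            reader.Close();
        }
    }
}
```

If the page of interest lists an entry here, the rectangle lives outside the content stream, which would explain why a content-only copy drops it.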

You'll find some background information in this stackoverflow answer: http://stackoverflow.com/a/15945467/1729265

In essence: If you want to copy all aspects of a page, you have to use PdfCopy*. If you merely want to insert the contents of a page, you may use PdfWriter.

Regards,   Michael
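
For reference, a minimal sketch of the PdfCopy route mentioned above, assuming iTextSharp 5.x. It keeps the signature of the ExtractPages function from the question; the class name PdfCopyExtractor is hypothetical, and the page range is assumed to be valid (no clamping is done here).

```csharp
using System.IO;
using iTextSharp.text;
using iTextSharp.text.pdf;

static class PdfCopyExtractor
{
    // Copies pages start..end (1-based, inclusive) into a new file.
    // PdfCopy copies whole pages, so annotations travel with them.
    public static void ExtractPages(string inputFile, string outputFile, int start, int end)
    {
        PdfReader reader = new PdfReader(inputFile);
        try
        {
            using (FileStream fs = new FileStream(outputFile, FileMode.Create))
            {
                Document document = new Document();
                PdfCopy copy = new PdfCopy(document, fs);
                document.Open();

                for (int i = start; i <= end; i++)
                {
                    // GetImportedPage on a PdfCopy returns a page that AddPage
                    // writes out as a complete page, not as a template.
                    copy.AddPage(copy.GetImportedPage(reader, i));
                }

                document.Close();
            }
        }
        finally
        {
            reader.Close();
        }
    }
}
```

Unlike the PdfWriter/AddTemplate version, no page size or rotation handling is needed, because each page is taken over as it is.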

Re: pdf extractpages function

info-2
In reply to this post by Stephen LeCompte
This question was answered on StackOverflow:
http://stackoverflow.com/questions/23256944/itextsharp-when-extracting-a-page-it-fails-to-carry-over-adobe-rectangle-highl
You're probably throwing away an annotation (and you shouldn't copy/paste from bad examples).

On 4/24/2014 1:04 AM, Stephen LeCompte wrote:
> Question:  How come Adobe rectangle does not get extracted with iTextSharp?

Re: [SPAM] Re: pdf extractpages function

info-2
In reply to this post by mkl
On 4/24/2014 8:21 AM, mkl wrote:
> You'll find some background information in this stackoverflow answer:
> http://stackoverflow.com/a/15945467/1729265
>
> In essence: If you want to copy all aspects of a page, you have to use
> PdfCopy*. If you merely want to insert the contents of a page, you may use
> PdfWriter.
OK, I upvoted that answer.
In some cases (if only one PDF document is at play), one should use
PdfStamper.


Re: [SPAM] Re: pdf extractpages function

mkl
iText Info wrote:
> On 4/24/2014 8:21 AM, mkl wrote:
>> You'll find some background information in this stackoverflow answer:
>> http://stackoverflow.com/a/15945467/1729265
> In some cases (if only one PDF document is at play), one should use PdfStamper.

You are right, of course.

Unfortunately Stephen was not 100% clear about his task in his post here. The stackoverflow post you found (and which very well might be by him, too) more clearly indicates that he wants "to produce a new PDF based on range of page numbers", and in that case a PdfStamper in combination with a PdfReader and a selected page range most likely is the solution of choice.

Regards,   Michael
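
A minimal sketch of the PdfStamper variant described above, assuming iTextSharp 5.x: the reader is restricted to the wanted range with SelectPages, and the stamper then writes the reduced document. The class name PageRangeExtractor is hypothetical, and the range is assumed to lie within the document.

```csharp
using System.IO;
using iTextSharp.text.pdf;

static class PageRangeExtractor
{
    // Writes pages start..end (1-based, inclusive) of inputFile to outputFile.
    public static void ExtractPages(string inputFile, string outputFile, int start, int end)
    {
        PdfReader reader = new PdfReader(inputFile);
        try
        {
            // Keep only the requested pages, e.g. "2-4".
            reader.SelectPages(start + "-" + end);

            using (FileStream fs = new FileStream(outputFile, FileMode.Create))
            {
                // The stamper works on the single, already reduced document;
                // closing it writes the result to the stream.
                PdfStamper stamper = new PdfStamper(reader, fs);
                stamper.Close();
            }
        }
        finally
        {
            reader.Close();
        }
    }
}
```

Because only one source document is involved, the pages are kept as they are, annotations included, and nothing has to be imported page by page.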

Re: [SPAM] Re: [SPAM] Re: pdf extractpages function

info-2
mkl wrote on 24/04/2014 8:50:
> Unfortunately Stephen was not 100% clear about his task in his post here.
> The stackoverflow post you found (and which very well might be by him, too)
> more clearly indicates that he wants "to produce a new PDF based on range of
> page numbers", and in that case a PdfStamper in combination with a PdfReader
> and a selected page range most likely is the solution of choice.

And the answer on StackOverflow was edited to reflect that ;-)

Also read the advice given on Meta StackOverflow:
http://meta.stackoverflow.com/questions/251946/duplicate-questions-versus-rtfm

We probably should extend the iText Wiki as well as create a question
that covers many specific cases and then give an answer we can refer to
when we mark a question as duplicate.
