PDF tagging document tag

Jones Tim
Hi,

Version 5.4.3 iText(Sharp)

I'm generating tags the composite way, using MarkedContentSequence etc.  This all works fine; however, once the method Document.SetTagged() has been run, a "<Document>" tag is created at some point.  I think this happens on Document.Close(), because it appears after all of my own tags have been produced.

My issue is that I add my own structure element with a PdfName.Document tag to the root element and generate tags that hook onto it.  But iText(Sharp) generates an additional "<Document>" tag, which contains less-than-desirable tagged content and throws the reading order off a little.

Please could we have functionality to set the tagging of the document at the very basic level of setting user properties and creating the default root structure of the document, and NOT to create the "<Document>" tag and its other tags.  Maybe add an overload to SetTagged, or another property, to omit generating iText's own "<Document>" tag.  I'd like to add this tag to the root structure myself and add only my own PDF tags.

This would be super useful if I can have full control! (Apart from the fundamentals required behind the scenes to create and then add to a StructureTreeRoot).

Thank You

Tim Jones

_______________________________________________
iText-questions mailing list
[hidden email]
https://lists.sourceforge.net/lists/listinfo/itext-questions

iText(R) is a registered trademark of 1T3XT BVBA.
Many questions posted to this list can (and will) be answered with a reference to the iText book: http://www.itextpdf.com/book/
Please check the keywords list before you ask for examples: http://itextpdf.com/themes/keywords.php
Re: PDF tagging document tag

info-2
On 19/09/2013 16:46, Jones Tim wrote:
> Please could we have functionality to set the tagging of the document
> at the very basic level of setting user properties and creating the
> default root structure of the document and NOT to create the
> "<Document>" tag and it's other tags.

Adding the <Document> tag to the document is really important. It's
being added not only as a root to the structure tree, but to the page
content of every page as well. Without having <Document> as a root
element you can't have a proper PDF/UA document. That's why we have
decided to add the <Document> tag as root element by default.

It's not trivial to pinpoint a single method that adds this tag:
- part of the job is done on opening the document,
- part is done on opening and closing pages,
- the rest is done on closing the document.
As you can see, it's not as simple as one would think.

Unfortunately, it's currently not possible to tell iText to "skip writing
the root element". You could use a workaround in some cases. For instance,
you could redefine the document role. If you call
Document.setRole(PdfName.DIV), the <Document> tag is replaced with a
<Div> tag in both the structure tree and the page content.

For the next release, we can add an option that allows skipping certain
tags. Currently it's possible to call Document.setRole(null). This means
that the <Document> tag will not be written, but in that case none of the
internal tags are written either. I think we can extend this
functionality a bit so that a certain tag is not written, but all
internal tags are written.

I also have a question: what do you want to achieve by redefining the
<Document> tag? What extra functionality do you need? Maybe we can
give you some advice or push this functionality into the next release.
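
To make the three cases from this reply concrete (default root role, redefined role, null role), here is a small plain-Java model. It is only a sketch and does not call iText: the `Tag` class and the `tagsFor` method are hypothetical names invented for illustration, the H1 and P children are assumed sample content, and the string argument merely mirrors what would be passed to Document.setRole(...).

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model of the root-role behaviour described in the reply; NOT iText code. */
public class RootRoleModel {

    /** Hypothetical stand-in for a structure element: a role name plus children. */
    public static final class Tag {
        final String role;
        final List<Tag> kids = new ArrayList<>();

        public Tag(String role) { this.role = role; }

        public Tag add(Tag kid) { kids.add(kid); return this; }

        /** Renders the subtree, e.g. "Document[H1, P]". */
        public String render() {
            if (kids.isEmpty()) return role;
            List<String> parts = new ArrayList<>();
            for (Tag kid : kids) parts.add(kid.render());
            return role + "[" + String.join(", ", parts) + "]";
        }
    }

    /**
     * Returns the tag tree for a document holding one heading and one paragraph
     * (assumed sample content). rootRole mirrors the setRole(...) argument:
     * "Document" is the default, "Div" is the workaround, null switches tagging off.
     */
    public static String tagsFor(String rootRole) {
        if (rootRole == null) {
            // Per the reply: no root tag, and no internal tags either.
            return "";
        }
        return new Tag(rootRole).add(new Tag("H1")).add(new Tag("P")).render();
    }

    public static void main(String[] args) {
        System.out.println(tagsFor("Document"));        // Document[H1, P]
        System.out.println(tagsFor("Div"));             // Div[H1, P]
        System.out.println("[" + tagsFor(null) + "]");  // []
    }
}
```

The null case yields nothing at all because, as described above, suppressing the root currently suppresses the internal tags as well; the option proposed for the next release would keep the children while dropping only the root.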


Re: PDF tagging document tag

Jones Tim
Hi,

I think this is more of a misunderstanding on my part.  I'm basically writing the tags myself, including a <Document> tag at the root.  I wish to have full control over which tags are written and which are not.  I had problems with the default auto-generated tags.  As far as I can remember, the logical reading order of the PDF was not quite right and the tagging did not pass the accessibility checks in Adobe.

I'm now left with a PDF with two <Document> tags: one auto-generated by iText and one defined by me in code.  I do use Role = null to stop tags being set, but I'm left with a root document tag with a bunch of path:path (etc.) tags.  Maybe I don't understand how to achieve what I need.  Being able to define the logical reading order would be good, because it sometimes doesn't quite work in particular scenarios.

Tim
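
As far as I understand tagged PDF, the logical reading order follows a depth-first walk of the structure tree, so a second <Document> element under the root shows up in that order together with whatever it contains. The plain-Java sketch below illustrates this with a toy tree; it does not use iText, all class and method names are hypothetical, and the sample tree (including the position of the auto-generated <Document> and its Path children) is an assumption made up to mirror the symptom described above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Toy structure tree used to illustrate reading order; NOT the iText API. */
public class ReadingOrderSketch {

    /** Hypothetical structure element: a role name and ordered children. */
    public static final class Node {
        final String role;
        final List<Node> kids;

        public Node(String role, Node... kids) {
            this.role = role;
            this.kids = Arrays.asList(kids);
        }
    }

    /** Collects roles in pre-order: parent first, then children left to right. */
    public static List<String> readingOrder(Node root) {
        List<String> out = new ArrayList<>();
        walk(root, out);
        return out;
    }

    private static void walk(Node node, List<String> out) {
        out.add(node.role);
        for (Node kid : node.kids) walk(kid, out);
    }

    /** Counts the direct children of parent that carry the given role. */
    public static int countDirectChildren(Node parent, String role) {
        int count = 0;
        for (Node kid : parent.kids) {
            if (kid.role.equals(role)) count++;
        }
        return count;
    }

    /** Assumed sample: a hand-built Document next to an auto-generated one. */
    public static Node sampleTree() {
        Node handBuilt = new Node("Document", new Node("H1"), new Node("P"));
        Node autoGenerated = new Node("Document", new Node("Path"), new Node("Path"));
        return new Node("StructTreeRoot", handBuilt, autoGenerated);
    }

    public static void main(String[] args) {
        Node root = sampleTree();
        // [StructTreeRoot, Document, H1, P, Document, Path, Path]
        System.out.println(readingOrder(root));
        // 2
        System.out.println(countDirectChildren(root, "Document"));
    }
}
```

A check like `countDirectChildren(root, "Document") > 1` is one simple way to detect the duplicate-root situation in such a model before worrying about the order of the individual tags.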



From: iText Info <[hidden email]>
To: Jones Tim <[hidden email]>; Post all your questions about iText here <[hidden email]>
Sent: Wednesday, 2 October 2013, 17:01
Subject: Re: [iText-questions] PDF tagging document tag

Re: PDF tagging document tag

info-2
I'm attaching 2 pdf files.

1.pdf is the one from our test suite. It contains a root Document tag that includes Header (H1) and Paragraph tags. This is an absolutely correct PDF/UA document.

2.pdf is generated in the same way as 1.pdf, with one exception: I've set role=null on the Document. In Java code it looks like:

document = new Document();
document.setRole(null);

You can see that 2.pdf contains no tags: nothing is auto-generated, so you can control the tagged output manually. Is this what you want to achieve? If not, please share your code and sample files; that will help me understand the problem.


On 10/4/2013 11:21 AM, Jones Tim wrote:

1.pdf (26K)
2.pdf (14K)
Re: PDF tagging document tag

nbedf
In reply to this post by Jones Tim
Hello,

Could you please tell me how you obtained your 1.pdf?

Regards,

Nicolas
Re: PDF tagging document tag

info-2
nbedf wrote on 11/02/2014 13:35:
> Could you please tell me how did you obtain your 1.pdf ?
I'll ask sales to answer that question personally.

Re: PDF tagging document tag

windowslm
This post has NOT been accepted by the mailing list yet.
Was an answer/code example ever given for this?

I am having the same issue: I can generate a PDF with tags like number 2, but I would like it to look like number 1.

Thanks.
Re: PDF tagging document tag

blowagie
This post has NOT been accepted by the mailing list yet.
See http://itextpdf.com/nabble
(Your question didn't make it to the mailing-list.)
Re: PDF tagging document tag

windowslm
This post has NOT been accepted by the mailing list yet.
Thanks, hopefully I got it right this time.

I'll re-ask the question once I get approved.
Re: PDF tagging document tag

windowslm
iText mailing list wrote
> nbedf wrote on 11/02/2014 13:35:
>> Could you please tell me how did you obtain your 1.pdf ?
> I'll ask sales to answer that question personally.

Hopefully I got the mailing list set up correctly this time.

Was an answer/code example ever given for this?

I am having the same issue: I can generate a PDF with tags like number
2, but I would like it to look like number 1.

Thanks.




--
View this message in context: http://itext-general.2136553.n4.nabble.com/PDF-tagging-document-tag-tp4659177p4659972.html
Sent from the iText - General mailing list archive at Nabble.com.

Re: PDF tagging document tag

Dejan Milosavljevic
This is part of the iText test suite, in

http://svn.code.sf.net/p/itext/code/trunk/itext/src/test/java/com/itextpdf/text/pdf/TaggedPdfTest.java

The test method is createTaggedPdf1()

-----Original Message-----
From: windowslm [mailto:[hidden email]]
Sent: Friday, May 02, 2014 12:24 PM
To: [hidden email]
Subject: Re: [iText-questions] PDF tagging document tag
