Publications by topic: PDF, Tagging, Accessibility
ACM Symposium on Document Engineering 2024 (DocEng 2024) San Jose, USA
Automatically producing accessible and reusable PDFs with LaTeX
- Frank Mittelbach
- David Carlisle
- Ulrike Fischer
- Joseph Wright
- ACM Symposium on Document Engineering (DocEng 2024), San Jose, 2024
- Abstract
In this application note we outline the goals of the “LaTeX Tagged PDF” project, describe its current status, show how it can already be used to create accessible and reusable PDFs, and outline our future plans for a successful completion. Further information can be found at https://latex3.github.io/tagging-project/.
This application note was presented at the ACM Symposium on Document Engineering (DocEng 2024); the official version is available in the ACM Digital Library.
While, as described in the paper, it is now possible to automatically generate accessible and PDF/UA-2 compliant documents with LaTeX, this is not necessarily the case when special journal classes are required by the publisher.
The acmart class needed for the DocEng proceedings does not support tagging yet, which is one of the reasons why the ACM DL contains only an inaccessible PDF of the paper.
It may take some time to make the acmart class fully compatible with the tagging extensions in all situations, because the class supports various journals (all with different frontmatter requirements), which is an area that the project hasn’t yet fully addressed.
However, for the current article (which is fairly simple from a structural point of view) only a few modifications to the class were necessary to make it work. Thus, the version of the paper available from this site is compliant with PDF/UA-2 and the Well-Tagged PDF (WTPDF-1.0) standard.
It has been produced using lualatex-dev (instead of pdflatex) and a patched version of the class to support tagging as far as necessary for this article. Other than that, no modifications to the LaTeX source were made.
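As a rough illustration of what enabling tagging looks like in a document source, here is a minimal sketch. It is not the source of the paper: it assumes a recent LaTeX release, the standard article class rather than the patched acmart class, and compilation with lualatex-dev. The key values shown are illustrative choices, and the title and author are placeholders.

```latex
% Minimal sketch, not the paper's actual source.
% \DocumentMetadata must appear before \documentclass.
\DocumentMetadata{
  lang        = en-US,   % document language recorded in the PDF
  pdfversion  = 2.0,     % PDF/UA-2 is based on PDF 2.0
  pdfstandard = ua-2,    % declare the targeted standard
  testphase   = {phase-III, math, table, title}, % load the tagging code
}
\documentclass{article}

\title{A small tagged document}   % placeholder title
\author{A. N. Author}             % placeholder author

\begin{document}
\maketitle

\section{Introduction}
Headings, paragraphs, and lists are tagged without extra markup.

\begin{itemize}
  \item First item
  \item Second item
\end{itemize}
\end{document}
```

Whether the resulting file actually validates against PDF/UA-2 depends on the document content, the packages loaded, and the installed LaTeX version, so the output should be checked with a validator.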
LaTeX Tagged PDF project progress report for summer 2024
- Frank Mittelbach
- Ulrike Fischer
- TUGboat 45:2, 2024
- Abstract
The LaTeX Tagged PDF project was started in spring 2020 and announced to the TeX community by the LaTeX team at the (online) 2020 TUG conference. This short report describes some of the progress in this multi-year project made during 2024.
Digitization and E-Inclusion in Mathematics and Science 2024 (DEIMS 2024) Tokyo, Japan
Enhancing LaTeX to automatically produce tagged and accessible PDF
- Frank Mittelbach
- Ulrike Fischer
- TUGboat 45:1, 2024
- Abstract
At the TUG 2020 online conference the LaTeX Project Team announced the start of a multi-year project to enhance LaTeX so that it will fully and naturally support the creation of structured document formats, in particular the “tagged PDF” format as required by accessibility standards such as PDF/UA.
In this talk we present the current achievements of this project and the issues we encountered along the way. We also outline open areas of research and the future steps that we shall take to automatically produce well-tagged PDF that supports accessibility standards (in particular, the recently finalized PDF/UA-2) as well as general reuse and further conversions. This will be achieved by embedding in the PDF a comprehensive description of the document structure.
The paper was originally presented at the DEIMS 2024 conference in Tokyo. A video of the talk, including a semi-live demonstration, is available on YouTube.
Automated tagging of LaTeX documents – what is possible today, in 2023?
- Ulrike Fischer and Frank Mittelbach
- TUGboat 44:2, 2023
- Keywords: LaTeX, tagging, accessibility
- Abstract
The LaTeX Tagged PDF project was started in spring 2020 and announced to the TeX community by the LaTeX Team at the Online TUG Conference 2020. This short report describes the progress and status of this multi-year project achieved with the LaTeX summer release 2023.
Report on the LaTeX Tagged PDF workshop, TUG 2023
- David Carlisle, Ulrike Fischer and Frank Mittelbach
- TUGboat 44:2, 2023
- Keywords: LaTeX, tagging, accessibility, table tagging
- Abstract
On the afternoon before the formal conference program, the LaTeX project held a workshop, led by Ulrike Fischer, on generating tagged PDF from LaTeX. The workshop was well attended with more than thirty people participating — a good mix of package developers and end users. We thank DANTE e.V. for very generous financial support.
The workshop was split into three parts. Firstly, a general introduction to tagging in PDF. Secondly, a demonstration of the process that a class or package maintainer should take to modify the code to produce well-tagged PDF. The acmart class was used for the example as its author, Boris Veytsman, was attending the workshop. Finally, we had a more open discussion on issues and desired syntax for structured tables.
TUG Conference 2023 (Bonn, Germany)
Automated tagging of LaTeX documents—what is possible today?
- Ulrike Fischer
- Video of the talk at TUG 2023, Bonn, Germany
- Keywords: LaTeX, tagging, accessibility
- Abstract
With the summer 2023 release of the LaTeX format it is now possible to create tagged PDF in an automated way from many “Lamport documents”: documents using the commands described in the LaTeX manual from Leslie Lamport.
In this talk I will show what is possible and what still needs manual intervention. I will also describe some of the challenges we faced on the technical side and when designing the mapping between LaTeX structures and the set of PDF tags.
From the PDF Days Europe, September 2022 (Berlin)
Tagged and Accessible PDF with LaTeX – project state, achievements, and plans for the future
- Frank Mittelbach and Ulrike Fischer
- Video of the talk presented at PDF Days Europe September 2022
- Keywords: LaTeX, tagging, accessibility, project status
- Abstract
In Summer 2020 the LaTeX Project Team announced the start of a multi-year project [1, 2] to produce tagged and accessible PDF from existing LaTeX sources with no or only minimal configuration adjustments. In this talk we describe the current state of the project, the existing achievements, and our plans for the future.
References
[1] Frank Mittelbach, Ulrike Fischer, and Chris Rowley: LaTeX Tagged PDF Feasibility Evaluation Study. LaTeX Project, Sept. 2020.
[2] Frank Mittelbach and Chris Rowley: LaTeX Tagged PDF — A blueprint for a large project. TUGboat 41(3):292–298, 2020.
The talk was recorded and is available on the PDFA website. The slides of the presentation are available here.
The LaTeX Tagged PDF project — A status and progress report
- Frank Mittelbach and Ulrike Fischer
- TUGboat 43:3, 2022
- Keywords: LaTeX, tagging, accessibility, project status
- Abstract
The LaTeX Tagged PDF project was started in spring 2020 and announced to the TeX community by the LaTeX Team at the (online) 2020 TUG conference. This short report describes the progress and status of this multi-year project.
Adding XMP metadata in LaTeX
- Ulrike Fischer and Frank Mittelbach
- TUGboat 43:3, 2022
- Keywords: LaTeX, tagging, accessibility, XMP metadata
- Abstract
One task of the “LaTeX Tagged PDF Project” is to evaluate existing solutions to add XMP metadata to a PDF, and if needed, to design and implement a new standard interface for this. In this article we will describe the current state of this task.
From the TUG Conference 2021 (Online conference)
Taming the beast — Advances in paragraph tagging with pdfTeX and XeTeX
- Frank Mittelbach
- Video of the talk at the TUG 2021 online conference
- Keywords: LaTeX, tagging, paragraph handling
- Abstract
In this talk I demonstrate and describe our solution for automatically tagging paragraphs when using engines such as pdfTeX or XeTeX. The situation with LuaTeX is different, and simpler, and therefore not the subject of this talk. I briefly touch on the problems one encounters and explain the approaches we used to overcome them. This will be done with a number of demonstrations intermixed with theoretical explanations.
This work is part of our multi-year journey to gradually modernize LaTeX so that it can automatically produce high-quality tagged and “accessible” PDF without the need to post-process the result of the LaTeX run.
On the road to Tagged PDF: About StructElem, Marked Content, PDF/A and Squeezed Bärs
- Ulrike Fischer
- TUGboat 42:2, 2021
- Abstract
In this article I present two packages as part of the LaTeX Project’s “Tagged PDF” effort:
- tagpdf which contains the core code to create a tagged PDF and is used by the LaTeX team to test new code.
- pdfmanagement-testphase which contains a large number of PDF-related commands and tools and installs a new management command for central PDF dictionaries.
I will show how to use these packages and the benefits they will bring for the average user, while also mentioning resulting incompatibilities and required changes in documents.
A video of the talk on this topic, given at the TUG 2021 online conference, is also available on YouTube.
LaTeX Tagged PDF — A blueprint for a large project
- Frank Mittelbach
- Chris Rowley
- TUGboat 41:3, 2020
- Abstract
In Frank’s talk at the TUG 2020 online conference we announced the start of a multi-year project to enhance LaTeX to fully and naturally support the creation of structured document formats, in particular the “tagged PDF” format as required by accessibility standards such as PDF/UA.
In this short article we outline the background to this project and some of its history so far. We then describe the major features of the project and the tasks involved, of which more details can be found in the Feasibility Study that was prepared as the first part of our co-operation with Adobe.
This leads on to a description of how we plan to use the study as the basis for our work on the project and some details of our planned working methodologies, illustrated by what we have achieved so far and leading to a discussion of some of the obstacles we foresee.
Finally there is also a summary of recent, current and upcoming activities on and around the project.
LaTeX Tagged PDF Feasibility Evaluation Study
- Frank Mittelbach
- Ulrike Fischer
- Chris Rowley
- Written: December 2019 with minor updates September 2020
This forty-page document contains information about a multi-year project, started by the LaTeX Project Team in 2020, that will extend LaTeX to produce tagged, and hence accessible, PDF with minimal manual intervention. It explains in detail both the project goals and the tasks that need to be undertaken, concluding with a detailed project plan. It is our blueprint for how we think the project should be undertaken.
The Introduction contains an overview of the benefits of the project and explains why LaTeX documents make a good starting point for the production of tagged PDF. More information about this blueprint and the project can be found in the article “LaTeX Tagged PDF — A blueprint for a large project”, TUGboat 41:3 (2020).
The original version of this study dates from late 2019 and was addressed primarily to an audience within Adobe which consisted of engineers and managers with a wide knowledge of digital typography and electronic publishing but not necessarily much background within the specialized world of TeX, LaTeX and friends. This version of the study was updated in September 2020 with some minor redactions, corrections and clarifications.
TUG Conference 2020 (Online conference)
Quo vadis LaTeX(3) Team — A look back and at the upcoming years
- Frank Mittelbach
- TUGboat 41:2, 2020
- Abstract
This is a brief write-up of a talk given by the author at the TUG’20 online conference.
The talk touches briefly on the questions “where we are coming from” (we being the LaTeX Project Team) and “where we are now”, and then focusses on the LaTeX Project’s plans for the upcoming years. These will primarily be focussed on providing an out-of-the-box solution for generating tagged PDF with LaTeX, and will include gentle refactoring of parts of the core LaTeX and providing important functionality, such as extended standard support for color, hyperlinks etc., as part of the kernel.
This is a multi-year journey that we have just started and we will briefly explain the places this will take us through. At its end we expect that LaTeX users will be able to produce tagged and “accessible” PDF without the need to post-process the result of their LaTeX run.
A video of the presentation given by Frank is available on the TUG YouTube channel.
Creating accessible pdfs with LaTeX
- Ulrike Fischer
- TUGboat 41:1, 2020
- Abstract
This article describes the current state and planned actions to improve accessibility of pdfs created with LaTeX, as currently undertaken by the LaTeX Team.
Accessibility in the LaTeX kernel — experiments in Tagged PDF
- Chris Rowley
- Ulrike Fischer
- Frank Mittelbach
- TUGboat 40:2, 2019
- Abstract
This is a brief summary of a talk given by the first author at the TUG’19 conference, together with some references for further reading and viewing.
TUG Conference 2019 (Palo Alto, USA)
Accessibility in the LaTeX kernel — experiments in tagged PDF (slides)
- Chris Rowley and Ulrike Fischer
- TUG Conference 2019 (Palo Alto, USA)
Publications by topic
Under each topic you will find relevant articles and papers on related subjects published by the LaTeX3 project as well as links to videos of their conference presentations.
Publications by year
An alternative view of all publications ordered by year is given on the Publications by Year page.
Books by project members and others
A list of books that we think are useful is given on the Books Page. By buying documentation through this website you support the volunteer work of project members to keep LaTeX useful for you.