How do you make Acrobat DC preserve alt text from web pages?

by Will Martin   Last Updated April 15, 2019 23:16 PM

We are converting a large number of old HTML-formatted newsletters to PDF format for long-term archiving. We want to ensure that the resulting PDFs are accessible. We're using Adobe Acrobat DC to produce the PDFs.

Acrobat's internal converter (File > Create > PDF from web page) picks up much of the HTML markup when the "Create PDF tags" option is enabled in the conversion settings. However, it does not seem to carry over the ALT text of images referenced in the HTML source code.

We do not want to manually correct the ALT text for hundreds upon hundreds of images after the PDFs are created, particularly since many of the images are repeated decorative elements that we could fix much more easily by running search-and-replace operations on the original HTML.
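
To illustrate the kind of batch fix on the original HTML that we have in mind, here is a minimal Python sketch. The file names, the alt strings, and the `set_alt_text` helper are all hypothetical, and the regular expressions assume simple, quoted `src` attributes rather than arbitrary HTML. It only prepares the HTML; whether a given converter then carries the ALT text into the PDF is a separate question.

```python
import re

# Hypothetical mapping from image file name to the alt text it should carry.
# An empty string marks the image as decorative.
ALT_BY_FILENAME = {
    "divider.gif": "",
    "logo.png": "Newsletter logo",
}

# Assumption: tags are simple enough for regular expressions
# (quoted src values, no ">" inside attribute values).
IMG_TAG = re.compile(r"<img\b[^>]*>", re.IGNORECASE)
SRC_ATTR = re.compile(r"""\bsrc\s*=\s*["']([^"']*)["']""", re.IGNORECASE)
ALT_ATTR = re.compile(r"""\s+alt\s*=\s*(?:"[^"]*"|'[^']*'|[^\s>]+)""", re.IGNORECASE)


def set_alt_text(html, alt_by_filename=ALT_BY_FILENAME):
    """Return html with alt attributes rewritten for known image file names."""

    def fix_tag(match):
        tag = match.group(0)
        src = SRC_ATTR.search(tag)
        if src is None:
            return tag  # no quoted src attribute: leave the tag alone
        filename = src.group(1).rsplit("/", 1)[-1]
        if filename not in alt_by_filename:
            return tag  # not one of the images we know how to describe
        # Drop any existing alt attribute, then add the new one after "<img".
        tag = ALT_ATTR.sub("", tag)
        alt = alt_by_filename[filename].replace('"', "&quot;")
        return tag[:4] + ' alt="' + alt + '"' + tag[4:]

    return IMG_TAG.sub(fix_tag, html)


if __name__ == "__main__":
    sample = '<p><img src="images/divider.gif" width="10"></p>'
    print(set_alt_text(sample))
```

Images whose file names are not in the mapping are left untouched, so they would still need individual attention.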

So is there some way to make Acrobat DC preserve the ALT text from web pages? My attempts at finding an answer via Google have come up dry.

We would also consider using different software if there is a better option out there.

