mush42, 6 months ago @scruss For this project, I'd forgo parsing the PDF stream directly at first, and instead extract the semantic structure from a visual rendition of the pages. Then I'd use that semantic metadata to guide parsing of the PDF stream and extract the text.
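
To make the second step concrete, here's a rough sketch of how labelled regions from the visual pass could be used to group text pulled from the PDF stream. Everything in it is an assumption for illustration: `Region`, `Word`, and `group_words_by_region` are hypothetical names, the sample boxes and words are made up, and it presumes both sources have already been mapped into one shared coordinate system. No real layout model or PDF library is involved.

```python
from dataclasses import dataclass


@dataclass
class Region:
    """A labelled box, as a layout pass over the rendered page might report it (hypothetical)."""
    label: str
    x0: float
    y0: float
    x1: float
    y1: float

    def contains(self, x: float, y: float) -> bool:
        return self.x0 <= x <= self.x1 and self.y0 <= y <= self.y1


@dataclass
class Word:
    """A text run with its position, as read from the PDF content stream (hypothetical)."""
    text: str
    x: float
    y: float


def group_words_by_region(regions, words):
    """Assign each word to the first region that contains it, keeping stream order."""
    grouped = {region.label: [] for region in regions}
    grouped["unassigned"] = []
    for word in words:
        for region in regions:
            if region.contains(word.x, word.y):
                grouped[region.label].append(word.text)
                break
        else:
            # No region matched, so keep the word visible instead of dropping it.
            grouped["unassigned"].append(word.text)
    return grouped


# Made-up sample data: two regions and four words in one shared coordinate system.
regions = [Region("heading", 0, 0, 100, 20), Region("body", 0, 20, 100, 200)]
words = [
    Word("Chapter", 10, 10),
    Word("One", 40, 10),
    Word("Hello", 10, 50),
    Word("7", 150, 300),
]
result = group_words_by_region(regions, words)
```

The "unassigned" bucket is there so that text the visual pass missed (page numbers, marginalia) stays available for inspection rather than silently disappearing.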