alcinnz,
@alcinnz@floss.social

This morning I'm skimming over the rest of LibPoppler's core library & describing what I missed when I used the pdftohtml command as a guide for studying it.

Which includes:

  • Parsed & binary-searched Unicode mapping tables, used in various OutputDevs & the Global Params.
  • Various data headers, especially related to Unicode & builtin fonts. May be paired with lookup routines.
  • Conversion between UTF-8, UTF-16, & UCS-4.
  • UTF-8-handling utils.
  • Parsed viewer preferences.
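
To illustrate the binary-searched mapping tables, here is a minimal Python sketch. The table contents and the names `RANGES` & `map_unicode` are hypothetical, made up for illustration; they are not LibPoppler's actual data or API.

```python
from bisect import bisect_right

# Hypothetical table: (first code point, last code point, output for first).
# Entries must be sorted by first code point and must not overlap.
RANGES = [
    (0x0020, 0x007E, 0x20),  # printable ASCII maps to itself
    (0x00A0, 0x00FF, 0xA0),  # Latin-1 supplement maps to itself
    (0x2013, 0x2013, 0x2D),  # en dash -> hyphen-minus
    (0x2018, 0x2018, 0x27),  # left single quote -> apostrophe
    (0x2019, 0x2019, 0x27),  # right single quote -> apostrophe
]
STARTS = [start for start, _end, _out in RANGES]


def map_unicode(code_point):
    """Return the mapped code for code_point, or None if it is unmapped."""
    # Find the last range whose start is <= code_point.
    index = bisect_right(STARTS, code_point) - 1
    if index < 0:
        return None
    start, end, out = RANGES[index]
    if code_point > end:
        return None  # falls in the gap after this range
    return out + (code_point - start)
```

Each lookup costs O(log n) comparisons, which is presumably why sorted range tables are attractive for large Unicode maps.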

1/2?

  • Parsed sounds; used by links, annotations, & embedding frameworks.
  • Stream base-class & various subclasses.
  • Catalogs include a dynamically-typed tree structure.
  • An OutputDev which extracts rearranged & decoded text from a PDF file, much harder than it needs to be!
  • Key-value caches used in graphics & cross-reference.
  • An OutputDev which strips away degenerate PDF data.
  • An OutputDev outputting PostScript files utilizing an extra prolog.
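
As a rough picture of what a small key-value cache might look like, here is a Python sketch. The class name `SmallCache` and the least-recently-used eviction policy are assumptions for illustration, not a description of LibPoppler's own cache.

```python
from collections import OrderedDict


class SmallCache:
    """Bounded key-value cache that evicts the least recently used entry."""

    def __init__(self, capacity):
        if capacity < 1:
            raise ValueError("capacity must be at least 1")
        self.capacity = capacity
        self._items = OrderedDict()  # oldest entry first

    def lookup(self, key):
        """Return the cached value, or None on a miss."""
        if key not in self._items:
            return None
        self._items.move_to_end(key)  # mark as most recently used
        return self._items[key]

    def put(self, key, value):
        """Insert or update an entry, evicting old entries if over capacity."""
        self._items[key] = value
        self._items.move_to_end(key)
        while len(self._items) > self.capacity:
            self._items.popitem(last=False)  # drop the oldest entry
```

A cache like this suits lookups that are expensive to recompute but keyed by something cheap to compare, such as an object reference number.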

2/3!

  • Parsed parameters for media playback, used by internal links.
  • Parsed PageTransitions, called by embedders.
  • Parsed Movies & their activation parameters, called by embedders.
  • Lookup character names, called by Global Parameters.
  • Stream subclasses to help decode JPEG images.
  • Log info on link-triggered JS.
  • An OutputDev extracting your annotations.
  • Utils for embedding images, used by the PDFDoc class itself.
  • Decode compact arithmetic instructions, used by JPG decoding.
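
Character-name lookup can be sketched in Python as a table plus a fallback for `uniXXXX`-style names (the Adobe Glyph List naming convention). The tiny `NAME_TABLE` & the function name are hypothetical; a real table holds thousands of entries.

```python
# Hypothetical subset of a glyph-name table.
NAME_TABLE = {
    "space": 0x0020,
    "A": 0x0041,
    "eacute": 0x00E9,
    "emdash": 0x2014,
}

_HEX_DIGITS = set("0123456789ABCDEF")


def name_to_unicode(name):
    """Map a glyph name to a Unicode code point, or None if unknown."""
    if name in NAME_TABLE:
        return NAME_TABLE[name]
    # Fallback: "uni" followed by exactly four uppercase hex digits.
    if name.startswith("uni") and len(name) == 7:
        digits = name[3:]
        if all(ch in _HEX_DIGITS for ch in digits):
            return int(digits, 16)
    return None
```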

3/4!!

  • Python script for generating Unicode data headers.
  • General logic for parsing out files (fonts, images, video, audio) embedded in the PDF.
  • Filepath parsing.
  • ZLib compression, used by PostScript & the PDF doc itself. With a Stream subclass.
  • Parsed FontInfo, used by Text Output & the global params.
  • Loading PDFs via LibCURL
  • Reading & writing dates.
  • Discrete Cosine Transform streams.
  • Hashmaps & arrays.
  • "Distinguished name" parsing, used by cryptography.
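
For the date reading & writing, PDF date strings look like `D:YYYYMMDDHHmmSS+HH'mm'`, with everything after the year optional. Here is a minimal Python sketch of a parser & formatter; the function names are my own, and this is not how LibPoppler implements it.

```python
import re
from datetime import datetime, timedelta, timezone

_PDF_DATE = re.compile(
    r"^D:(\d{4})(\d{2})?(\d{2})?(\d{2})?(\d{2})?(\d{2})?"
    r"(?:([Zz+\-])(?:(\d{2})'?(?:(\d{2})'?)?)?)?$"
)


def parse_pdf_date(text):
    """Parse a PDF date string into a datetime, or return None if invalid."""
    m = _PDF_DATE.match(text.strip())
    if m is None:
        return None
    # Missing fields default to January 1st, midnight.
    year, month, day, hour, minute, second = (
        int(value) if value is not None else default
        for value, default in zip(m.groups()[:6], (0, 1, 1, 0, 0, 0))
    )
    sign, tz_hours, tz_minutes = m.group(7), m.group(8), m.group(9)
    try:
        if sign is None:
            tz = None  # no offset given: return a naive datetime
        elif sign in "Zz":
            tz = timezone.utc
        else:
            offset = timedelta(hours=int(tz_hours or 0),
                               minutes=int(tz_minutes or 0))
            tz = timezone(-offset if sign == "-" else offset)
        return datetime(year, month, day, hour, minute, second, tzinfo=tz)
    except ValueError:
        return None  # out-of-range field, e.g. month 13


def format_pdf_date(dt):
    """Format a datetime as a PDF date string."""
    stamp = dt.strftime("D:%Y%m%d%H%M%S")
    offset = dt.utcoffset()
    if offset is None:
        return stamp
    total_minutes = int(offset.total_seconds()) // 60
    if total_minutes == 0:
        return stamp + "Z"
    sign = "+" if total_minutes > 0 else "-"
    hours, minutes = divmod(abs(total_minutes), 60)
    return f"{stamp}{sign}{hours:02d}'{minutes:02d}'"
```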

4/4.5!!

  • Error reporting.
  • Gathering font metrics, used by Forms.
  • Parsing char maps; used by graphics, global params, & text extraction.
  • An OutputDev for computing the bbox.
  • OutputDev hooking LibPoppler up to LibCairo vector graphics, with a box-rescaling util.
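
The bbox idea can be sketched in Python as an object that receives drawing callbacks & tracks the extremes of every point it sees. `BBoxCollector` & its methods are hypothetical names for illustration, not LibPoppler's OutputDev interface, and a real implementation would also need to account for transforms, stroke widths, & clipping.

```python
class BBoxCollector:
    """Accumulates the bounding box of all points passed to it."""

    def __init__(self):
        # (x_min, y_min, x_max, y_max), or None until a point arrives.
        self.bbox = None

    def add_point(self, x, y):
        """Grow the box just enough to include the point (x, y)."""
        if self.bbox is None:
            self.bbox = (x, y, x, y)
            return
        x_min, y_min, x_max, y_max = self.bbox
        self.bbox = (min(x_min, x), min(y_min, y),
                     max(x_max, x), max(y_max, y))

    def add_path(self, points):
        """Include every (x, y) pair of a path in the box."""
        for x, y in points:
            self.add_point(x, y)
```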

And I think that pretty much covers everything else in LibPoppler's core library!

4.5/4.5! Fin for today! Tomorrow: LibSplash, one of its vendored libs!
