brian_gettler, to history
@brian_gettler@mas.to

Héritage Canadiana, a large repository of digitized microfilm reels from Library and Archives Canada, has quietly begun rolling out full-text-searchable transcriptions (now mostly OCR on typed archival docs, but #Transkribus generated transcriptions on manuscript sources are on their way). The search engine is awful - you can't search only transcribed sources - but the collection is already impressive and promises to get better fast. @histodons

https://heritage.canadiana.ca/

#histodons #CdnHist

archivist_Liz, to powerlifting
@archivist_Liz@digipres.club

Hi, I’m Liz! I’m an working on & . I mostly post on topics.

I also post about or my . 😻

I live in , sometimes I share my , other times .

Besides English, I speak/read German, French, Italian, & a little Czech. I’m learning some Portuguese too!

Other stuff: Working on a project using

DigitalHistory, to histodons German
@DigitalHistory@fedihum.org

Machine-processing extensive French-language source corpora from the Middle Ages?

In the next , Pauline Spychala (DHI Paris) takes a close look at the text recognition platforms & . The aim of her project is to develop a workflow that combines both tools effectively in order, among other things, to do justice to the characteristics of the sources under study.

🔜 Wed, 22 Nov, 4-6 pm - via Zoom

ℹ️ Info: https://dhistory.hypotheses.org/6384

@histodons

daieuxetdailleurs, to Quebec French
@daieuxetdailleurs@framapiaf.org
petrichor, to random
@petrichor@digipres.club

Here's a few more details about my progress training a handwriting model with

https://erambler.co.uk/blog/training-a-handwriting-model-update-1/

petrichor, to random
@petrichor@digipres.club

OK, I've finally got round to transcribing enough pages of my own handwriting to train up a model with , and the results are surprisingly good! I expected to need more than the minimal 25 pages to get a decent level of accuracy but it's already noticeably better than the generic recognition on my reMarkable tablet or OneNote.

petrichor,
@petrichor@digipres.club

Since is open source, it should be possible now to recreate this training on my own desktop with the same parameters, and apply the model to recognise new pages, and from there figure out a workflow to simplify getting handwritten notes into plain text for reference or publication.

Has done any of these stages? Any pointers?

dta_cthomas,
@dta_cthomas@mstdn.social

@jacobward @polarbear @histodons

have you heard of/tried https://app.transkribus.eu/? (NB They also have a very handy "Scan Tent", https://readcoop.eu/de/scantent/).

They also have a pricing scheme, but you might get far enough with your free start credits.

Would allow you to concentrate on photographing in the archive, and then batch- resp. your document folders later.

You could even train your own model, but the existing ones work for regular scripts well enough, I guess.

jonaskjoller, to earlymodern

I hear a frequent complaint about applying quantitative methods on texts that have been through tools, such as , that the expected error rate means that you will miss too many occurrences of the word you are looking for. (1/n)

@histodons @digitalhumanities @earlymodern
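
To put a rough number on the complaint in the post above, here is a minimal sketch. It assumes recognition errors hit each character independently at a fixed character error rate, which is a simplification of mine and not something the post states. The function name `exact_match_recall` and the sample word are hypothetical.

```python
def exact_match_recall(word: str, char_error_rate: float) -> float:
    """Estimate the share of occurrences of `word` that an exact search finds.

    Assumption (illustrative only): each character is misrecognised
    independently with probability `char_error_rate`, so a word is found
    only if all of its characters came through intact.
    """
    if not 0.0 <= char_error_rate <= 1.0:
        raise ValueError("char_error_rate must be between 0 and 1")
    return (1.0 - char_error_rate) ** len(word)


if __name__ == "__main__":
    # Hypothetical search term and error rates, chosen for illustration.
    for cer in (0.01, 0.05, 0.10):
        recall = exact_match_recall("parliament", cer)
        print(f"CER {cer:.0%}: about {recall:.0%} of occurrences found")
```

Under this assumption longer words are missed more often than short ones, because every extra character is another chance for an error. Real recognition errors tend to cluster on particular letter shapes or damaged pages, so the figures are only a first approximation.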
