#Transkribus - kbin.social

brian_gettler, 1 month ago to history

Héritage Canadiana, a large repository of digitized microfilm reels from Library and Archives Canada, has quietly begun rolling out full-text-searchable transcriptions (for now mostly OCR of typed archival documents, but #Transkribus-generated transcriptions of manuscript sources are on their way). The search engine is awful - you can't restrict a search to transcribed sources - but the collection is already impressive and promises to get better fast. @histodons

https://heritage.canadiana.ca/

#histodons #CdnHist

archivist_Liz, 6 months ago to powerlifting

#Introduction

Hi, I’m Liz! I’m an #archivist working on #recordsManagement & #digitalPreservation. I mostly post on #digipres topics.

I also post about #powerlifting or my #cat. 😻

I live in #Vienna, sometimes I share my #WienLiebe, other times #grant.

Besides English, I speak/read German, French, Italian, & a little Czech. I’m learning some Portuguese too!

Other stuff: #antifascism #feminism #histGender #communityArchives #antifa #vegan Working on a project using #Transkribus

DigitalHistory, 6 months ago to histodons German

Machine-processing large French-language source corpora from the Middle Ages?

In the next #DigitalHistoryOFK, Pauline Spychala (DHI Paris) takes a close look at the text recognition platforms #eScriptorium & #Transkribus. The aim of her project is to develop a workflow that combines both tools effectively in order to, among other things, do justice to the characteristics of the sources under study.

🔜 Wed, 22 Nov, 4-6 pm - via Zoom

ℹ️ Info: https://dhistory.hypotheses.org/6384

#DigitalHistory #HTR @histodons

daieuxetdailleurs, 6 months ago to Quebec French

[#veille #HTR] Deciphering handwritten archival documents with the help of artificial intelligence | UdeMNouvelles
https://nouvelles.umontreal.ca/article/2023/11/08/dechiffrer-des-documents-d-archives-manuscrits-a-l-aide-de-l-intelligence-artificielle/
#UdeM #transkribus #IA #archives #quebec @archivistodon

petrichor, 9 months ago to random

Here are a few more details about my progress training a handwriting model with #Transkribus

https://erambler.co.uk/blog/training-a-handwriting-model-update-1/

#htr

petrichor, 9 months ago to random

OK, I've finally got round to transcribing enough pages of my own handwriting to train up a model with #Transkribus, and the results are surprisingly good! I expected to need more than the minimal 25 pages to get a decent level of accuracy but it's already noticeably better than the generic recognition on my reMarkable tablet or OneNote.

#htr

petrichor, 9 months ago

Since #PyLaia is open source, it should be possible now to recreate this training on my own desktop with the same parameters, and apply the model to recognise new pages, and from there figure out a workflow to simplify getting handwritten notes into plain text for reference or publication.

Has anyone done any of these stages? Any pointers?

#Transkribus #htr

dta_cthomas, 10 months ago

@jacobward @polarbear @histodons

Have you heard of or tried #Transkribus (https://app.transkribus.eu/)? NB: they also have a very handy "Scan Tent" (https://readcoop.eu/de/scantent/).

They also have a pricing scheme, but you might get far enough with your free start credits.

That would let you concentrate on photographing in the archive and then batch-process your document folders with #OCR or #HTR later.

You could even train your own model, but the existing ones work for regular scripts well enough, I guess.

jonaskjoller, 1 year ago to earlymodern

A frequent complaint I hear about applying quantitative methods to texts that have been through #HTR tools such as #Transkribus is that the expected error rate means you will miss too many occurrences of the word you are looking for. (1/n)

@histodons @digitalhumanities @earlymodern
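
To put a rough number on that complaint, here is a back-of-envelope sketch in Python. It assumes character errors are independent and occur at a uniform character error rate (CER), which is a simplification and not something stated in the post. The function names `exact_match_recall` and `fuzzy_hits` and the 0.8 similarity threshold are hypothetical, chosen only for illustration. The second function shows one common mitigation, approximate matching with the standard library's `difflib`, and does not describe how any particular HTR tool searches.

```python
from difflib import SequenceMatcher


def exact_match_recall(cer: float, word_length: int) -> float:
    """Chance that a word comes through HTR with every character correct.

    Assumes independent character errors at rate `cer` (a simplification).
    """
    if not 0.0 <= cer <= 1.0:
        raise ValueError("cer must be between 0 and 1")
    return (1.0 - cer) ** word_length


def fuzzy_hits(query: str, tokens: list[str], threshold: float = 0.8) -> list[str]:
    """Return the tokens whose similarity to `query` is at least `threshold`."""
    q = query.lower()
    return [
        t for t in tokens
        if SequenceMatcher(None, q, t.lower()).ratio() >= threshold
    ]


if __name__ == "__main__":
    # Hypothetical figures: a 5% CER and an eight-letter search term.
    print(f"{exact_match_recall(0.05, 8):.2f}")
    # A misread "m" -> "rn" is still caught by the fuzzy search.
    print(fuzzy_hits("parliament", ["parliament", "parliarnent", "payment"]))
```

Under these assumptions an eight-letter word survives intact only about two times in three at a 5% CER, so an exact-match count would miss roughly a third of its occurrences. Approximate matching recovers some of those, at the cost of false positives that need checking.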
