Transkribus Lite

https://app.transkribus.eu/

Reviewed by Jonathan Lawler, Archivist and Digital Collections Manager, Southeastern Baptist Theological Seminary

Transkribus Lite is a web browser–based text recognition and transcription tool that is helpful to archivists at any career stage and in any type of institution. The tool was developed by the Recognition and Enrichment of Archival Documents (READ) project and is now supported by READ-COOP. READ-COOP counts prominent institutions as members, including the University of Cambridge, the National Archives of Finland, and the University of Maryland.[1] Transkribus Lite contains the essential features of the Transkribus Expert Client, but because it runs in a web browser it is more accessible.

Transkribus Lite provides over one hundred publicly available AI models that run “handwritten text recognition” (HTR), similar to “optical character recognition” (OCR), on scanned documents. Many institutions have developed publicly available models to recognize text from different languages in documents spanning the eleventh through twenty-first centuries. However, a particularly beneficial element of Transkribus Lite is the ability to train an HTR model to accurately recognize and transcribe the handwriting of a single writer. This is especially helpful for projects transcribing a collection largely consisting of text by one author.

Figure 1: Transkribus Lite public text recognition model choices

The archives staff at Southeastern Baptist Theological Seminary (SEBTS) fortunately discovered Transkribus Lite during a project to transcribe early twentieth-century handwritten letters. The tool gave us a single platform for processing scans of correspondence and harnessing AI-powered text recognition, which limited manual text entry. Transkribus Lite's intuitive user interface also eased our work, especially when training employees to use the tool for transcription projects.

Figure 2: Transkribus Lite user interface

SEBTS staff used Transkribus Lite to provide researchers with searchable PDF/A files of handwritten letters. In addition to PDF files, the platform can export transcriptions in XML, DOCX, and TXT formats. We were thus able to fulfill our mission of providing free access to the archival materials in our care. This increased access was accomplished within the tool’s free capabilities.

Using Transkribus Lite requires “credits.” Users begin with 500 free credits, which is enough to transcribe roughly 3,000 pages of printed text or 500 pages of handwritten material, essentially enough for a small project like ours. Once the existing credits are exhausted, users must purchase additional credits.
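For readers planning a project, the credit figures above can be turned into a rough budgeting check. The following is only a sketch: the per-page rates are derived from the approximate numbers in this review (500 credits for about 500 handwritten pages or about 3,000 printed pages), not from official READ-COOP pricing, and the function names are hypothetical.

```python
from fractions import Fraction

# Assumed rates, derived only from the approximate figures in this review.
# They are NOT official pricing and may differ from current Transkribus terms.
FREE_CREDITS = 500
CREDITS_PER_HANDWRITTEN_PAGE = Fraction(500, 500)  # about 1 credit per page
CREDITS_PER_PRINTED_PAGE = Fraction(500, 3000)     # about 1/6 credit per page


def credits_needed(handwritten_pages: int, printed_pages: int = 0) -> Fraction:
    """Estimate the credits a project would use (hypothetical helper)."""
    if handwritten_pages < 0 or printed_pages < 0:
        raise ValueError("page counts must be non-negative")
    return (handwritten_pages * CREDITS_PER_HANDWRITTEN_PAGE
            + printed_pages * CREDITS_PER_PRINTED_PAGE)


def fits_free_allowance(handwritten_pages: int, printed_pages: int = 0) -> bool:
    """Return True if the estimate stays within the 500 free credits."""
    return credits_needed(handwritten_pages, printed_pages) <= FREE_CREDITS


if __name__ == "__main__":
    # Example project: 350 handwritten letters plus 600 printed pages.
    estimate = credits_needed(handwritten_pages=350, printed_pages=600)
    print(f"Estimated credits: {float(estimate):.1f}")
    print(f"Within free allowance: {fits_free_allowance(350, 600)}")
```

Under these assumed rates, a mixed project of this size would stay within the free allowance, while a collection of more than 500 handwritten pages would require purchased credits.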

Transkribus Lite can help smaller institutions with limited funding begin transcription projects of handwritten documents. You can get a feel for the abilities of Transkribus Lite for free at the web address given at the top of this review.


[1] “About,” READ-COOP, accessed May 16, 2023, https://readcoop.eu/about/.
