2019-05-07 ALTO Board Meeting Minutes

Agenda

  1. Open discussion of supporting handwriting in ALTO. [All]
  2. Other business. [All]

Attending members

  • Ahmed Samir
  • Ashok Popat
  • Art Rhyno
  • Evelien Ket
  • Frederick Zarndt
  • Hany Abdel Hamid Elsawy
  • Joachim Bauer
  • Ralph Marschall

Guests

  • Benjamin Kiessling *
  • David Smith *
  • Gerald Schreiber (CCS)
  • Sebastian Colutto *

* - connected virtually

Minutes

The Board welcomed Ben, David, Gerald and Sebastian to the meeting. After some introductions, the group discussed handwriting in depth, as well as options for preserving the internal uncertainty of OCR processing in its output. Preserving that uncertainty would be useful for the recognition of typeset text and is especially significant for handwriting.

There was consensus that handwriting support is a desirable target for the ALTO format and general agreement that encoding multiple hypotheses within a standard and interoperable lattice structure would be valuable. Beneficial use cases include:

  • search/discovery (a rejected word, for example, might match a query and raise the estimated probability that a page is relevant)

  • downstream natural language processing

  • deriving OCR confidence values

  • specific outputs based on word/term distributions, such as topic maps
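
To make the search/discovery case concrete, here is a minimal sketch of how alternative hypotheses might be kept and queried. It assumes a simplified structure with one list of ranked alternatives per word position, which is less general than a full lattice. It is not the Rostock format and not an ALTO schema; the names `Hypothesis`, `page_relevance` and `best_transcript`, and all confidence values, are invented for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Hypothesis:
    """One candidate reading of a word (hypothetical structure)."""
    text: str
    confidence: float  # assumed recognizer score in [0, 1]


def best_transcript(slots: List[List[Hypothesis]]) -> str:
    """Join the highest-confidence hypothesis of each word position."""
    return " ".join(max(slot, key=lambda h: h.confidence).text for slot in slots)


def page_relevance(slots: List[List[Hypothesis]], query: str) -> float:
    """Sum the confidence of every hypothesis that matches the query,
    including readings that lost to a higher-scoring alternative."""
    q = query.casefold()
    return sum(
        h.confidence
        for slot in slots
        for h in slot
        if h.text.casefold() == q
    )


# Illustrative data only: three word positions with two readings each.
page = [
    [Hypothesis("The", 0.9), Hypothesis("Tho", 0.1)],
    [Hypothesis("clear", 0.6), Hypothesis("dear", 0.4)],
    [Hypothesis("friend", 0.8), Hypothesis("fiend", 0.2)],
]

print(best_transcript(page))         # top-ranked reading of the line
print(page_relevance(page, "dear"))  # non-zero although "dear" was rejected
```

With only the top-ranked transcript stored, a query for "dear" would find nothing on this page. Keeping the rejected reading lets the page still contribute a score to the search.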

There is a possible candidate lattice format developed at the University of Rostock in connection with the Transkribus project. Sebastian may be able to liaise with the ALTO Board on behalf of Transkribus and will inquire about the availability of the model. This information will be shared with the Board, and the group will continue to work toward handwriting support in ALTO. The meeting wrapped up after 3 hours. Many thanks to all involved.