Science and Research Content

ProQuest project with TAMU Scholars to train OCR tech to read early modern fonts

Information resources and technologies provider ProQuest, US, has participated in a project that is expected to vastly accelerate research of 15th through 17th century cultural history. The company will provide access to page images from the venerable Early English Books Online and newcomer Early European Books to the Early Modern OCR Project (eMOP) at Texas A&M. eMOP will use the content to create a database of typefaces used in the early modern era, train OCR software to read them and then apply crowd-sourcing for editing. The project will turn the corpus of works from this historical period into fully searchable digital documents.

ProQuest is stated to have played a key worldwide role in preservation of and access to early modern history, ensuring the survival of printed works from as early as 1450. In the 1930s, the company reportedly became a pioneer of microfilm, when it filmed the contents of the vast archives of the British Library and other major libraries across England, covering virtually every English-language book printed in the 15th, 16th and 17th centuries. The microfilm collection, ProQuest's flagship Early English Books, opened these works to global study and created an avenue for preservation. It has since become the quintessential collection for study of the early modern era, according to the company.

In the 1990s, ProQuest began a massive effort to capture the collection digitally. Early English Books Online seeks to enable scholars to manage, share and collaborate on their research virtually. The company even created a social network that allows the scholars who use the collection as a base for their research to connect with each other.

Then, early in the 21st century, ProQuest expanded the programme to include major European libraries, launching Early European Books with the Danish Royal Library in Copenhagen and the Biblioteca Nazionale Centrale di Firenze in Italy. Digitisation projects are also underway with the UK's scientific and medical library — The Wellcome — and the National Library of the Netherlands.

eMOP is led by Texas A&M Professors Laura Mandell, Director of the Initiative for Digital Humanities, Media, and Culture (IDHMC), Ricardo Gutierrez-Osuna of Computer Science, and Richard Furuta, Director of the Center for the Study of Digital Libraries (CSDL), along with Anton DuPlessis and Todd Samuelson, book historians from Cushing Rare Books Library. The scholars earned a two-year, $734,000 development grant from the Andrew W. Mellon Foundation to support the work. ProQuest is one of a variety of participating publishers and software organisations that are collaborating on the project.

