IBM has developed a new digitisation technology for the National Diet Library of Japan, the country’s only national library.
The company said the prototype technology, created by IBM Research, will allow the library to digitise its literary artifacts on a wide scale, making them available and searchable online by all information seekers.
The new collaborative technology allows full-text digitisation of Japanese literature through expansive recognition of Japanese characters and enables users to review and correct language characters, script and structure.
IBM said the technology will improve the digitisation of library collections worldwide and is designed to promote future international collaboration and standardisation among libraries.
The full-text digitisation system ensures quality recognition of Japanese characters, which are extremely diverse in terms of script.
IBM researchers aimed to optimise the amount of time needed to review and verify the accuracy of the digitised texts.
Using the latest collaborative tools based on crowdsourcing, the technology allows many users to quickly read through the texts and make corrections at a much higher rate.
The system comes with a collaborative correction feature that allows simultaneous corrections by multiple users through web browsers and improves the accuracy of optical character recognition (OCR).
The new offering also has a collaborative data structuring feature, which can digitise texts so that visually impaired people can read books using a voice browser. The feature lets users add structural information and correct the read-out order, both of which are supported by inference engines to reduce workload.