PipeOnline, a web-interface for DAISY Pipeline

Olav Indergaard

The Norwegian Library of Talking Books and Braille, P.O.Box 2764 Solli, N-0204 Oslo, Norway

Abstract

This paper describes the development and usage of a centralized production platform for the altformats DAISY and DTBook.

1. Introduction

In January 2007 the Norwegian national coordinator of accessibility in higher education, "Universell", started the project "Universal structuring and accessibility in students literature". A working group consisting of members from the higher education sector (Norwegian University of Science and Technology, University of Oslo, University of Bergen) and The Norwegian Library of Talking Books and Braille was formed to work on the project.

The goals of the project were:

1.1 The development of PipeOnline

A wide range of tools for producing accessible content was already available, but these tools all required a local software installation, and no centralized production platform existed. The need for a centralized production platform for altformats resulted in the sub-project PipeOnline. PipeOnline is a tool for producing and distributing DAISY audio books and DTBook files over the internet. The PipeOnline project started in June 2008 as a joint effort between NLB (The Norwegian Library of Talking Books and Braille) and the DAISY Consortium. The core functions were ready in July 2009, and the service was launched in September 2009. NLB and the DAISY Consortium collaborate on maintenance and further development of the service.

1.2 DAISY and DTBook, the main formats in PipeOnline

DAISY (Digital Accessible Information System) is an international standard on the structuring of digital audio books. DTBook (Digital Talking Book) is a document format used in e-books and DAISY. DTBook is XML based, and is also known as DAISY XML. Both standards are developed and maintained by the DAISY Consortium.

A DAISY publication contains different types of navigation elements (e.g. headings, page numbers, notes, sidebars, block quotes, production notes). This makes the publication easily accessible for visually impaired users. The publication can be used in a software or hardware DAISY playback system, which allows the user to utilize specific DAISY features such as bookmarks and navigation between different text elements. DAISY audio books can also be played in MP3 and CD players, but the DAISY features are then lost. A DAISY publication containing approximately 50 hours of audio can normally be distributed on a single CD-ROM. DAISY is based on open standards (HTML/XHTML/XML/SMIL), which makes the format independent of software and operating systems.
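As an illustration of how such navigation elements appear in a DTBook file, the following minimal sketch lists headings and page numbers in document order. The sample is a simplified, hand-written fragment (a complete DTBook file also has a head section with metadata), and the function name navigation_points is an assumption made for this example, not part of any DAISY tool.

```python
import xml.etree.ElementTree as ET

# Namespace of DTBook 2005 documents.
DTBOOK_NS = "http://www.daisy.org/z3986/2005/dtbook/"

# Simplified, hand-written fragment for illustration only.
SAMPLE = """<dtbook xmlns="http://www.daisy.org/z3986/2005/dtbook/" version="2005-3">
  <book>
    <bodymatter>
      <level1>
        <pagenum page="normal" id="p1">1</pagenum>
        <h1>Introduction</h1>
        <p>Some body text.</p>
        <level2>
          <h2>Background</h2>
          <pagenum page="normal" id="p2">2</pagenum>
          <p>More body text.</p>
        </level2>
      </level1>
    </bodymatter>
  </book>
</dtbook>"""

HEADING_TAGS = {"h1", "h2", "h3", "h4", "h5", "h6"}


def navigation_points(dtbook_xml):
    """Return (kind, text) pairs for headings and page numbers, in document order."""
    root = ET.fromstring(dtbook_xml)
    points = []
    for element in root.iter():
        # Strip the namespace prefix that ElementTree puts in front of tag names.
        tag = element.tag.replace("{" + DTBOOK_NS + "}", "")
        text = "".join(element.itertext()).strip()
        if tag in HEADING_TAGS:
            points.append(("heading", text))
        elif tag == "pagenum":
            points.append(("page", text))
    return points


if __name__ == "__main__":
    for kind, text in navigation_points(SAMPLE):
        print(kind, text)
```

A playback system can use a list like this to let the user jump between headings or go directly to a page.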

The figure shows a handshake between two hands. One hand is titled DTBook, the other is titled DAISY

Figure 1. DAISY and DTBook, two merging standards

2. Motivation for creating a centralized production platform

2.1 Services for disabled students in Norway

Since 1999 all Norwegian universities and university colleges have been required to have an appointed contact person for students with disabilities, and also a plan of action on how to include this group of students and make each higher education institution (HEI) more accessible. This requirement was introduced by the Ministry of Education, partly as a response to the development of a national policy to make all parts of society more inclusive. The institutions are all required to make the necessary adjustments to ensure that disabled candidates have equal access to education, but it is up to the institutions themselves to define which services they can offer their students.

In 2009 the Norwegian Anti-discrimination and Accessibility Act, which prohibits discrimination based on disability, was introduced. This Act also emphasizes the responsibility of universities and colleges to ensure equal opportunities in education for disabled students.

The main motivation for the creation of a centralized production platform was that it would give all institutions the same prerequisites to produce accessible content.

3. Problem

Norway has seven accredited universities, six accredited specialized university institutions, 25 accredited university colleges, two accredited national colleges of the arts and 29 private institutions of higher education. Their size varies from about 30 000 students at the University of Oslo to fewer than a thousand students at the smallest university colleges. There is no established national norm for the services offered to disabled students by Norwegian universities and colleges. Different institutions offer different kinds of accessibility services, and the methods used by the institutions to provide altformats for students also differ significantly. General knowledge of altformats seems to range from very low in some institutions to very high in others.

When it comes to production, the quality of the Norwegian speech-engines used in commercial production tools for altformats is not acceptable for the production of students' literature. Some of the institutions have no tools available for converting printed documents to accessible text.

4. Approach on constructing a centralized production platform

Some requirements for the centralized production platform were outlined.

The platform should:

4.1 Architecture of PipeOnline

PipeOnline was developed as a thin web client for DAISY Pipeline. DAISY Pipeline is an open source framework for DAISY-related conversions and manipulations. PipeOnline adds several components to DAISY Pipeline (e-mail, speech-engines, a web-interface and a user database). The components chosen were Apache Tomcat, a MySQL database, the Neospeech Paul TTS-engine, the Brage TTS-engine and Google mail. The PipeOnline code is fully decoupled from the DAISY Pipeline core, which makes the system easy to maintain when the core is updated.

The figure shows the different components of PipeOnline

Figure 3. Components in PipeOnline
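The idea of a thin front end that is decoupled from the conversion core can be sketched as follows. This is not the PipeOnline code: every name here (ConversionBackend, FakeBackend, Job, FrontEnd, the script name "rtf-to-dtbook") is hypothetical and chosen only for the example. The point is that the front end depends on a small interface, so the core behind it can be updated or replaced without changing the front end.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Protocol


class ConversionBackend(Protocol):
    """Hypothetical interface: the only thing the front end knows about the core."""

    def run(self, script: str, source: str) -> str:
        ...


class FakeBackend:
    """Stand-in for a real conversion core; it only builds an output name."""

    def run(self, script: str, source: str) -> str:
        return f"{source}.{script}.out"


@dataclass
class Job:
    user: str
    script: str
    source: str
    status: str = "queued"
    result: Optional[str] = None


@dataclass
class FrontEnd:
    backend: ConversionBackend
    jobs: List[Job] = field(default_factory=list)

    def submit(self, user: str, script: str, source: str) -> Job:
        """Register a job; no conversion happens yet."""
        job = Job(user, script, source)
        self.jobs.append(job)
        return job

    def process_all(self) -> None:
        """Hand every queued job to the backend and record the result."""
        for job in self.jobs:
            if job.status == "queued":
                job.result = self.backend.run(job.script, job.source)
                job.status = "done"


if __name__ == "__main__":
    front_end = FrontEnd(FakeBackend())
    job = front_end.submit("alice", "rtf-to-dtbook", "book.rtf")
    front_end.process_all()
    print(job.status, job.result)
```

Because FrontEnd never imports anything from the core, swapping FakeBackend for another implementation of the same interface requires no changes to the job handling.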

4.2 The development of the Norwegian speech-engine "Brage"

Alongside the PipeOnline-project, the Norwegian speech-engine "Brage" was developed. This was a joint effort between NLB and TPB (The Swedish Library of Talking Books and Braille). The Brage speech-engine is a unit selection TTS, where the utterances are automatically generated through selection and concatenation of segments from a large corpus of recorded sentences.
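The principle of unit selection can be illustrated with a toy example. The sketch below is not Brage's algorithm: the Unit fields, the pitch-based join cost and the tiny inventory are assumptions made for illustration. For each target segment there are several recorded candidates, and a dynamic-programming search picks the sequence whose joins are smoothest.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class Unit:
    phone: str    # label of the recorded segment
    pitch: float  # pitch of the segment in Hz (toy acoustic feature)
    source: str   # recorded sentence the segment was cut from


def join_cost(left: Unit, right: Unit) -> float:
    """Toy join cost: a large pitch jump between neighbours sounds worse."""
    return abs(left.pitch - right.pitch)


def select_units(targets: List[str],
                 inventory: Dict[str, List[Unit]]) -> Tuple[float, List[Unit]]:
    """Return (total cost, units) for the cheapest candidate sequence."""
    # best holds, for each candidate of the current target, the cheapest
    # path that ends in that candidate.
    best = [(0.0, [unit]) for unit in inventory[targets[0]]]
    for phone in targets[1:]:
        extended = []
        for unit in inventory[phone]:
            cost, path = min(
                ((c + join_cost(p[-1], unit), p) for c, p in best),
                key=lambda item: item[0],
            )
            extended.append((cost, path + [unit]))
        best = extended
    return min(best, key=lambda item: item[0])


if __name__ == "__main__":
    inventory = {
        "h": [Unit("h", 120.0, "s1"), Unit("h", 180.0, "s2")],
        "a": [Unit("a", 125.0, "s1"), Unit("a", 200.0, "s3")],
        "i": [Unit("i", 130.0, "s1"), Unit("i", 90.0, "s2")],
    }
    cost, units = select_units(["h", "a", "i"], inventory)
    print(cost, [u.source for u in units])
```

A production system would use richer target and join costs and a far larger inventory, but the search has the same shape: many candidates per segment, one lowest-cost path through them.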

The development of a TTS system for the production of university level textbooks calls for considerations that are not always required for a conventional TTS system. The text corpus should preferably consist of text from the same area as the intended production purpose. Consequently, the corpus should contain a lot of non-fiction literature to cover various topics such as religion, medicine, biology, and law. From this corpus, high frequency terms and names are collected and added to the pronunciation dictionary. An important feature of the Brage speech-engine is that the production team has total control of the system components. The team can add an unlimited number of new pronunciations, modify and extend the text processing system, and rebuild the speech database.
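Collecting high frequency terms that are missing from the pronunciation dictionary can be sketched in a few lines. This is a simplified illustration, not the procedure used for Brage: the function name, the min_count threshold and the English sample text are assumptions, and lowercasing the text ignores the special handling that names would need.

```python
import re
from collections import Counter


def missing_frequent_terms(corpus_text, lexicon, min_count=2):
    """Return (word, count) pairs for frequent words absent from the lexicon."""
    # Split into runs of letters (Unicode aware) and ignore case.
    words = re.findall(r"[^\W\d_]+", corpus_text.lower())
    counts = Counter(word for word in words if word not in lexicon)
    return [(word, n) for word, n in counts.most_common() if n >= min_count]


if __name__ == "__main__":
    text = ("The mitochondria produce ATP. Mitochondria need oxygen. "
            "ATP powers the cell.")
    lexicon = {"the", "produce", "need", "oxygen", "powers", "cell"}
    for word, n in missing_frequent_terms(text, lexicon):
        print(word, n)
```

The resulting list is a set of candidates for manual review; each pronunciation still has to be written and checked before it is added to the dictionary.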

5. Results

5.1 PipeOnline

PipeOnline serves as a tool for producing and distributing DAISY audio books and DTBook files over the internet. PipeOnline is free and open source software (LGPL). Users (currently only universities and university colleges) are granted access to PipeOnline by signing an agreement with NLB. NLB are also considering granting access to other organizations. DAISY audio books are narrated by Norwegian and English speech-engines and produced as DAISY full text. The audio books are compressed (as zip files) and distributed by e-mail. Royalties can be calculated on demand.

The figure shows the production cycle in PipeOnline

Figure 2. The PipeOnline production cycle
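The final step of the cycle, compressing a finished book and preparing it for delivery by e-mail, can be sketched with the Python standard library. This is an illustration under assumptions, not the PipeOnline implementation: the function names, the subject line and the example.org addresses are placeholders, and the message is only built, not sent.

```python
import zipfile
from email.message import EmailMessage
from pathlib import Path


def zip_book(book_dir: Path, zip_path: Path) -> Path:
    """Compress all files under book_dir into zip_path.

    zip_path should lie outside book_dir, otherwise the archive
    would try to include itself.
    """
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as archive:
        for file in sorted(book_dir.rglob("*")):
            if file.is_file():
                # Store paths relative to the book directory.
                archive.write(file, file.relative_to(book_dir).as_posix())
    return zip_path


def build_delivery_mail(zip_path: Path, sender: str, recipient: str,
                        title: str) -> EmailMessage:
    """Build (but do not send) a message with the zipped book attached."""
    message = EmailMessage()
    message["From"] = sender
    message["To"] = recipient
    message["Subject"] = f"Your DAISY book: {title}"
    message.set_content("The finished book is attached as a zip file.")
    message.add_attachment(
        zip_path.read_bytes(),
        maintype="application",
        subtype="zip",
        filename=zip_path.name,
    )
    return message
```

Sending the message would additionally need an SMTP connection, for example through smtplib from the standard library, with server details that depend on the mail provider.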

5.2 Useful features in PipeOnline

5.3 The benefits of using PipeOnline

Production and distribution are significantly faster than with common TTS-based DAISY production. The server environment greatly improves TTS capacity (15 x normal speed). Distribution by e-mail reduces delivery time to a number of hours (compared with a number of days when productions are distributed by the Norwegian postal service). The production and distribution costs are also significantly lower because no physical media or transportation is required. This also makes the production more environmentally friendly.

PipeOnline offers a choice of output formats: DTBook, DAISY 2.02 and DAISY 3 file sets can all be generated (DAISY 3 supports the use of MathML). New features and support for new standards can easily be implemented through the DAISY Pipeline core. DAISY Pipeline also serves as a reference implementation for future DAISY standards.

6. Conclusion

The PipeOnline service has so far had limited use, but some crucial experiences have been made.

6.1 Successful achievements of PipeOnline

Most users find the service easy to use, and visually impaired users report that the user interface is intuitive and accessible. The RTF to DTBook conversion seems to be especially appreciated, as it enables non-technical users to create DTBooks. The detailed view of running processes is also found to be very valuable by technically advanced users.

The most important achievement of the PipeOnline project is that major parts of the project code will be reused in DAISY Pipeline 2 (the next version of DAISY Pipeline). This will benefit a large number of organizations and persons worldwide.

6.2 Challenges in the use of PipeOnline

6.3.1 Document languages and content structuring

A lot of the material produced through PipeOnline has excerpts in languages other than Norwegian and English. This content is handled poorly, as only Norwegian and English speech-engines are available in PipeOnline. The inclusion of STEM content (Science, Technology, Engineering, and Mathematics) is also very challenging. Knowledge of XML and related standards (MathML, LaTeX) is crucial, and the conversion from RTF to DTBook in PipeOnline is suitable only for documents with simpler structures. Other free and open source conversion tools that are capable of handling STEM content (e.g. Save As DAISY and Odt2DAISY) are available, but these tools demand a lot of training to be used successfully.

6.3.2 Royalties

Royalties to the copyright holders make institutions with limited funds cautious about doing audio production in PipeOnline. Alongside the development of PipeOnline, the first DAISY players capable of narrating DTBook files emerged. As a result, some institutions use only the DTBook conversion in PipeOnline and then distribute the DTBook to students who have DAISY players with DTBook playback functionality. A DTBook production is not subject to royalties. While this is good from an accessibility point of view, the TTS-engines bundled with DAISY players leave a lot to be desired when it comes to correct pronunciation of higher education literature. A further drawback is that the developers of the Brage TTS used in PipeOnline receive limited feedback.

6.3.3 Local issues

Some institutions still struggle to convert printed material to accessible text because of limited knowledge of OCR tools. Accessibility personnel in some institutions find structuring and editing in XML intimidating. This sometimes causes invalid document output and makes the user give up on further production through PipeOnline.

Training the users of PipeOnline to make good accessible structures also takes a lot of effort. What is learned in training is often not put to regular use, and further structuring efforts are therefore abandoned.

6.4 The future of PipeOnline

NLB are currently arranging workshops on PipeOnline for universities and university colleges. This will hopefully increase the adoption of PipeOnline as a production tool.

New formats (especially EPUB) are gaining momentum among publishers. Improved support for EPUB will probably be added to PipeOnline in the near future. Support for other formats such as DocBook and TEI will also be considered.

6.5 Future development of DAISY Pipeline

The next release of DAISY Pipeline (DAISY Pipeline 2) will focus on the integration of new technologies (XProc, XSLT 2.0, OSGi). The release will also serve as a reference implementation of DAISY 4 (also known as ZedNext), the next DAISY standard. DAISY Pipeline 2 will have better integration in heterogeneous environments and will also be a new framework for the DAISY Pipeline front-ends: Pipeline Lite, PipeOnline and Pipeline UI.

Dictionary

Altformats Alternative formats
DAISY Digital Accessible Information System
DTBook Digital Talking Book
HTML HyperText Markup Language
LGPL Lesser General Public License
MathML Mathematical Markup Language
OSGi Open Services Gateway initiative
RTF Rich Text Format
SAPI Speech Application Programming Interface
SMIL Synchronized Multimedia Integration Language
STEM Science, Technology, Engineering, and Mathematics
TEI Text Encoding Initiative
TTS Text To Speech
XHTML Extensible HyperText Markup Language
XML Extensible Markup Language
XSLT Extensible Stylesheet Language Transformations

References