Guidelines for Accessible E-text

November 2009

Round Table on Information Access for People with Print Disabilities Guidelines for Accessible E-text

Copyright © 2009 Round Table on Information Access for People with Print Disabilities

All rights reserved. No part of this publication may be reproduced, stored in a retrieval system, or transmitted in any form or by any means – electronic, mechanical, photocopying, recording, or otherwise – without prior permission of the publishers and copyright owner.

Published by Round Table on Information Access for People with Print Disabilities Inc., PO Box 229, Lindisfarne, Tasmania 7015

Email: RoundtableAdmn@bigpond.com

Web address: http://www.e-bility.com/roundtable/

Opening announcement

This DAISY version was produced by the Royal New Zealand Foundation of the Blind in February 2010. It has 4 navigation levels. The main sections are on level 1, the sections are on level 2, the sub-sections are on level 3 and the sub-sub-sections are on level 4.

National Library of Australia Cataloguing-in-Publication entry

Author: Round Table on Information Access for People with Print Disabilities.

Title: Guidelines for accessible e-text / Round Table on Information Access for People with Print Disabilities.

ISBN: 9780980706406 (pbk.)

Subjects: People with disabilities--Services for. Electronic publications--Handbooks, manuals, etc.

Dewey Number: 362.4

About these guidelines

These guidelines are published by the Round Table on Information Access for People with Print Disabilities Inc. The Round Table is an umbrella organisation which brings together producers, distributors and consumers of information in alternative formats to print; blindness agencies, tertiary institutions and government departments in Australia and New Zealand.

These guidelines are available from Round Table in accessible formats.

Acknowledgements

Compiled by the E-Text Working Party of the Round Table.

Members of the Working Party:

Jane Wegener, Vision Australia (Working Party leader)

Moira Clunie, Royal New Zealand Foundation of the Blind

Peter Le, Vision Australia

Kathy Riessen, South Australian School for Vision Impaired

Nicola Stowe, Royal Institute for Deaf and Blind Children

Sandra Vassallo, e-Bility Pty Ltd

Contents

Introduction to these guidelines - Page 1

What is e-text? - Page 1

About these guidelines - Page 1

Background - Page 2

Scope - Page 3

Who these guidelines are for - Page 3

General principles - Page 4

Summary checklist - Page 4

1. Equivalent to print - Page 5

2. Accessible - Page 6

3. Clear visual style - Page 10

4. Standards, Guidelines and Best Practice - Page 11

Particular formats - Page 12

HTML - Page 12

Rich Text Format (RTF) - Page 13

Plain text - Page 14

DAISY text - Page 16

Appendix 1: A note about PDF - Page 17

Appendix 2: Glossary - Page 18

Appendix 3: Round Table markup - Page 21

Appendix 4: Tips for working in Word - Page 23

Appendix 5: Tips for using Scanners and OCR - Page 25

Closing Announcement - Page 27

Page 1

Introduction to these guidelines

What is e-text?

E-text is structured electronic text which is accessible to people with a print disability, that is, to people who can't access information from regular print.

E-text may be an alternative version of a print document that provides a print-disabled person with equivalent access to the information. The term "e-text" can also refer to a document that is originally created in digital format and is accessible to all readers.

People with print disabilities use a range of technologies to access e-text. Typical reading methods include:

Synthetic speech - using screen reader software like JAWS or Window-Eyes, or speech built into software like Adobe Acrobat.

Refreshable braille - using a portable braille display attached to a desktop computer or laptop, or a stand-alone braille notetaker.

Viewing on screen - using screen enlarging software, modified colour combinations, tools that visually highlight words while they are read with synthetic speech, magnifiers and zoom functions built into different software or operating systems, or not using any special software.

Typical formats for accessible electronic text include:

About these guidelines

These guidelines have been produced to provide document creators with an understanding of accessibility principles, and some best practice accessibility methods across a variety of electronic formats in common use.

Page 2

In a print-based society, people are significantly disadvantaged if they cannot access information in print format. The trend towards mainstream electronic communication provides the opportunity for equity of information access, but the reality is that in 2009, many electronic documents are not designed to be fully accessible to readers with a print disability. Electronic documents need to be created with a standards-based approach to ensure accessibility.

The Round Table's goal has been to make the guidelines clear and easy to read in the hope they will be widely adopted.

Background

Version 1.0 of Guidelines for preparation of text materials on computer disk for people with print disabilities was produced by a sub-committee of the Round Table in July 1995.

In 2007 the Round Table Committee set up a working group to review the guidelines, inviting representation from people with experience in developing accessible formats. The group's purpose was to bring the guidelines up-to-date with newer technologies and to reflect the increase in end reading formats and editing processes as well as the increasing practice of converting electronic documents between different formats.

The new version reflects input received on earlier versions and takes into consideration the Australian Government style guide, as well as e-text guidelines produced by other organisations around the world.

In compiling the guidelines the working group sought input from people with experience in preparing different e-text formats and consulted widely with professionals and people with a print disability to determine the methods currently considered best practice.

Related standards, guidelines and production manuals were reviewed, to ensure that the draft guidelines were consistent where applicable. These included Round Table's Guidelines for Conveying Visual Information, the World Wide Web Consortium's Web Content Accessibility Guidelines, the DAISY standard, and production manuals from Vision Australia and RNZFB.

Workshops on the draft guidelines were presented at the 2007 and 2008 Round Table Conferences to stimulate discussion on the draft, and feedback from participants was incorporated into the document.

Page 3

Scope

The guidelines describe basic principles for preparing accessible e-text documents which apply regardless of which file format is used. These principles will mean people with a print disability are able to use and understand the content and have equivalent access to the information contained in a document.

The guidelines then elaborate on how these principles apply to text in particular file formats. The new guidelines focus on e-text as an end reading format to be used by people with a print disability rather than as a base file that can be converted to other formats, but many of the principles of good document formatting will mean that an e-text document will convert more easily to braille or large print.

Who these guidelines are for

Anyone preparing electronic documents will find the guidelines useful. They have been written for professional transcribers as well as a general audience. Possible uses include:

Page 4

General principles

Summary checklist

1. Equivalent to print

An accessible e-text version of a print document should provide the same information as the original print. Differences from the print, or producer's interpretations of visual material, should be marked with a producer's note.

2. Accessible

An accessible e-text document must be fully usable by a person with a print disability, in the sense of perceivable, operable, understandable and robust (http://www.webaim.org/articles/pour/).

2.1 Arrange text in a linear reading order

Text should be arranged so that it can be read in a linear way, that is, pieces of information follow each other in logical sequential order. For example, "floating" textboxes should be incorporated into the flow of the text.

2.2 Include structural markup

Structural elements of the text, such as headings, paragraphs, lists and tables, should be "marked up" so that adaptive technology can interpret their significance, meaning and context. Structural markup methods include using "Styles" in word processing software, Round Table codes in plain text documents, or semantically-correct HTML elements.

For example, mark the main headings in a Word document with the Heading 1 style, then modify the visual appearance of that Style, rather than selecting each heading individually and applying visual markup (bold, bigger text size). In HTML, use structural elements like <h1> and control the visual appearance with stylesheets, rather than <font size=+2>.

Page 8

The resulting document might look exactly the same but the significance of the visual structuring can be interpreted for different reading methods. For example, a screen reader can extract a list of headings in a document and use them for navigation.

Structural elements that should be marked up include headings, paragraphs, lists, tables, and emphasised words.

Heading styles are usually hierarchical. Most texts can be divided into main sections such as chapters, then into smaller sections, then into sub-sections, and so forth. Headings are used to label each piece of the text. The major headings through the document should be assigned Heading 1. Heading 2 applies to subheadings, and so on. HTML allows a maximum of 6 levels of heading.

Common structures should be treated consistently; for example, all level 2 headings should be structurally marked in the same way, even if they don't look the same visually.

2.3 Verbalise images and visual elements

Where an image or visual element adds meaning to the text, that meaning should be expressed in words, or verbalised, within the accessible e-text file.

Text explanations can be included as a producer's note in the text, as "alt" attributes attached to an image file or as captions below an image.
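As a rough illustration of checking for "alt" attributes, the following Python sketch lists images in an HTML document that have no "alt" attribute at all. The function name images_missing_alt and the sample markup are hypothetical and not part of these guidelines. The check only detects a missing attribute; it cannot judge whether an existing description is adequate.

```python
from html.parser import HTMLParser


class MissingAltFinder(HTMLParser):
    """Collect the src of every <img> element that has no alt attribute."""

    def __init__(self):
        super().__init__()
        self.missing = []

    def handle_starttag(self, tag, attrs):
        if tag == "img":
            attributes = dict(attrs)
            # An empty alt="" counts as present; only a missing attribute is flagged.
            if "alt" not in attributes:
                self.missing.append(attributes.get("src", "(no src)"))


def images_missing_alt(html):
    """Return the src values of images that lack an alt attribute."""
    finder = MissingAltFinder()
    finder.feed(html)
    finder.close()
    return finder.missing


# Hypothetical sample: the second image has no text alternative.
sample = (
    '<p><img src="chart.png" alt="Bar chart of sales by year"></p>'
    '<p><img src="photo.jpg"></p>'
)
print(images_missing_alt(sample))
```

Each image reported this way still needs a person to write the verbalisation.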

For some readers, it may be more appropriate to provide images in an alternative format such as a tactile or enlarged graphic. If an image has been produced as a physical enlarged or tactile image, this should be explained with a producer's note.

The key to conveying visual information in e-text format is to interpret what information the illustration is adding to the text, then describe this information in a simple, structured and straightforward manner. A good guideline is to imagine reading the document to another person, and think about what would be said in place of the image.

Some general guidelines for verbalising images are:

Page 9

Refer to the Round Table's Guidelines for Conveying Visual Information for more detailed examples of verbalisation.

2.4 Express special characters and languages unambiguously

3. Clear visual style

An accessible e-text document should have a clear, legible visual style that considers the needs of people with a range of print disabilities.

4. Standards, Guidelines and Best Practice

An accessible e-text document should follow relevant standards and best practice where they exist, deviating from these only where necessary for particular reasons such as limitations in adaptive technology. A document that is produced according to standards is more likely to work with a wider range of adaptive technology, and is easier to adapt - whether to other e-text formats including webpages or to other formats like braille, large print or synthetic audio.

Particular formats

HTML

HTML is the language in which the content of most webpages is written. It is simple to create, highly navigable and supported by most adaptive technology.

HTML elements wrap pieces of text content within short codes that provide information about the structure and meaning of content. For example:

<h1>This is a top level heading</h1>

<li>This is a list item</li>

HTML is designed to describe the structure of a document, rather than its appearance. For example, the HTML table element implies a data table and the blockquote element indicates that content inside this element is a direct quote. Adaptive technology can interpret information from structural elements and change behaviour accordingly - for example, most screen readers can extract a list of headings from a document and use them like a table of contents.
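To show why structural markup matters, here is a minimal Python sketch of how software might build a list of headings from an HTML document. It is only an illustration: the function name extract_headings and the sample markup are hypothetical, and real screen readers work differently in detail.

```python
from html.parser import HTMLParser

HEADING_TAGS = {"h1", "h2", "h3", "h4", "h5", "h6"}


class HeadingOutline(HTMLParser):
    """Collect (level, text) pairs for every heading element."""

    def __init__(self):
        super().__init__()
        self.outline = []
        self._level = None   # level of the heading currently open, if any
        self._parts = []     # text fragments inside that heading

    def handle_starttag(self, tag, attrs):
        if tag in HEADING_TAGS:
            self._level = int(tag[1])
            self._parts = []

    def handle_data(self, data):
        if self._level is not None:
            self._parts.append(data)

    def handle_endtag(self, tag):
        if tag in HEADING_TAGS and self._level is not None:
            # Collapse runs of whitespace so the heading reads as one line.
            text = " ".join("".join(self._parts).split())
            self.outline.append((self._level, text))
            self._level = None


def extract_headings(html):
    """Return the document's headings in reading order."""
    parser = HeadingOutline()
    parser.feed(html)
    parser.close()
    return parser.outline


# Hypothetical sample document.
sample = """
<h1>Particular formats</h1>
<p>Introductory text.</p>
<h2>HTML</h2>
<h2>Plain text</h2>
"""
for level, text in extract_headings(sample):
    print("  " * (level - 1) + text)
```

Text that is only made to look like a heading, for example with bold or a larger font size, would not appear in this outline.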

Once the content has been properly marked-up, all the HTML elements can be styled using CSS, and in this way can be easily customised to achieve a desired presentational effect, such as formatting in columns or indenting blocks of text, without affecting the meaning of the markup. Separating the content structure (HTML) from the layout/appearance (CSS) means that visual style can be adjusted easily for different users.

Check documents carefully when converting to HTML from another format, as many built-in conversion tools also embed visual formatting information within the document (for example, font colour and size).
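One possible way to support this check is sketched below in Python. It flags two common signs of embedded visual formatting: font elements and inline style attributes. The function name find_presentational_markup and the sample markup are hypothetical, and the two patterns are only examples; a converted document may contain other presentational markup that this sketch does not detect.

```python
from html.parser import HTMLParser


class PresentationalMarkupFinder(HTMLParser):
    """Record where <font> elements and inline style attributes occur."""

    def __init__(self):
        super().__init__()
        self.findings = []   # (line number, description)

    def handle_starttag(self, tag, attrs):
        line, _offset = self.getpos()
        if tag == "font":
            self.findings.append((line, "<font> element"))
        for name, _value in attrs:
            if name == "style":
                self.findings.append((line, f"inline style on <{tag}>"))


def find_presentational_markup(html):
    """Return a list of (line, description) pairs for review by a person."""
    finder = PresentationalMarkupFinder()
    finder.feed(html)
    finder.close()
    return finder.findings


# Hypothetical output of a conversion tool.
sample = (
    "<h1>Title</h1>\n"
    '<p><font color="red" size="+2">Big</font></p>\n'
    '<p style="font-size: 18pt">Text</p>\n'
)
for line, description in find_presentational_markup(sample):
    print(f"line {line}: {description}")
```

Flagged items can then be replaced with structural elements and stylesheet rules.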

When creating HTML documents:

Page 13

Rich Text Format (RTF)

Word processing formats such as RTF, Microsoft Word and OpenDocument are widely supported by current adaptive technology. RTF is preferable because it is supported by a wider range of software; however, MS Word or OpenDocument can be used if requested by individuals. RTF can be created in a variety of word processing programs, and files in Word and OpenDocument can be saved as RTF.

RTF can be useful for providing downloadable content on the web, so that a whole document is available as one file.

When creating documents in these formats, use "Styles" to add semantics and ensure consistent presentation.

Plain text

Plain text is the most universally accessible file format. It is useful when producers are unsure of what adaptive technology will be used to access the file.

Different conventions exist for formatting plain text documents.

DAISY text

DAISY is a standard for structured digital books that has been developed by the DAISY Consortium, which is made up of print disability organisations around the world. A DAISY book can include audio, text and images. The treatment of text in a DAISY book is based on XML, and allows a closer representation of the print than any other electronic text format. For example:

Full text DAISY books consist of marked-up text that can be accessed and navigated using DAISY software (http://daisy.org/about_us/dtbooks.shtml).

The DAISY website contains guidelines for producing DAISY books: http://www.daisy.org/

Page 17

Appendix 1: A note about PDF

Portable Document Format (PDF) is a file format developed by Adobe that makes it possible to reliably create, combine, and control text and graphical documents with the original formatting intact.

Although there have been some recent improvements in the software used to create PDFs, this file format is not regarded by the Human Rights and Equal Opportunity Commission (HREOC), or by other accessibility experts worldwide including the Royal National Institute of Blind People (RNIB) in the UK, as sufficiently accessible for documents designed for general distribution.

HREOC's position is: "Despite considerable work done by Adobe, PDF remains a relatively inaccessible format to people who are blind or vision-impaired. Software exists to provide some access to the text of some PDF documents, but for a PDF document to be accessible to this software, it must be prepared in accordance with the guidelines that Adobe have developed. Even when these guidelines are followed, the resulting document will only be accessible to those people who have the required software and the skills to use it. The Commission's view is that organisations who distribute content only in PDF format, and who do not also make this content available in another format such as RTF, HTML, or plain text, are liable for complaints under the Disability Discrimination Act (DDA). Where an alternative file format is provided, care should be taken to ensure that it is the same version of the content as the PDF version, and that it is downloadable by the user as a single document, just as the PDF version is downloaded as a single file." Source: Human Rights and Equal Opportunity Commission (August 2002). World Wide Web Access: Disability Discrimination Act Advisory Notes [Electronic Version 3.2]. Retrieved 29 August 2007 from http://www.hreoc.gov.au/disability_rights/standards/www_3/www_3.html#s2_3

Where several alternative formats of a document are provided, the best practice approach is:

Page 18

Appendix 2: Glossary

Accessible: Usable by a person with a print disability, in the sense of perceivable, operable, understandable and robust. http://www.webaim.org/articles/pour/

Alternative Format: Any text that has been reproduced in either large print, electronic text (e-text), braille, or audio format.

Adaptive Technology: Software programs that enable individuals with a print disability to view printed materials.

Caption: Written description that normally accompanies a picture, chart or diagram.

Character Set: A defined list of characters recognised by computer software. Each character is represented by a number. The ASCII character set, for example, uses the numbers 0 through 127 to represent all English characters as well as special control characters. European ISO character sets are similar to ASCII, but they contain additional characters for European languages. Unicode is a widely used character set, backwards compatible with ASCII, for representing characters in any language.
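The numbers behind a character set can be inspected directly. The short Python sketch below is an illustration only; the function name is_ascii and the sample strings are hypothetical. It checks whether text stays within the ASCII range of 0 through 127. It also shows that, in the UTF-8 encoding of Unicode (an encoding not discussed in these guidelines), plain English text produces the same bytes as ASCII.

```python
def is_ascii(text):
    """Return True if every character has a number in the range 0-127."""
    return all(ord(character) <= 127 for character in text)


plain = "Round Table"
accented = "caf\u00e9"   # "café", with e-acute written as a Unicode escape

print(is_ascii(plain))      # True
print(is_ascii(accented))   # False

# Each character is represented by a number.
print(ord("A"))             # 65
print(ord("\u00e9"))        # 233

# For text within the ASCII range, UTF-8 bytes and ASCII bytes are identical.
print(plain.encode("utf-8") == plain.encode("ascii"))   # True

# The accented character needs two bytes in UTF-8, so 4 characters become 5 bytes.
print(len(accented.encode("utf-8")))                    # 5
```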

Colour Contrast: The contrast between the brightness and hue of text and background colours.

Cascading Style Sheets (CSS): A style sheet language used to control the presentation of a document written in a markup language. Its most common application is to style documents written in HTML and XHTML. CSS is used to define colours, fonts, layout, and other aspects of document presentation. It is designed primarily to enable the separation of document content (written in HTML or a similar markup language) from document presentation (written in CSS).

DAISY: An open standard for accessible digital books. http://www.daisy.org

Endnote: A note placed at the end of an article, chapter, or book that comments on or cites a reference for a designated part of the text.

Footnote: A note placed at the bottom of a page of a book that comments on or cites a reference for a designated part of the text.

Large Print: Print that is enlarged and reformatted for clarity, designed to be optimally legible for people with low vision.

Page 19

LaTeX: A document markup language and document preparation system for the TeX typesetting program, which can be used to lay out complex mathematical equations.

Markdown: A human-readable markup system based on plain text email conventions: http://daringfireball.net/projects/markdown/

MathML: Mathematical Markup Language (MathML) is an application of XML developed by the W3C for describing mathematical notation and capturing both its structure and content. It aims at integrating mathematical formulae into text documents. http://www.w3.org/Math/

Metadata: Information about a document that is used to facilitate the understanding, use and management of the document.

Optical Character Recognition (OCR): The electronic identification and digital encoding of printed or handwritten characters by means of an optical scanner and specialised software.

Print Disability: An individual with a print disability is unable to access the information disseminated in a regular print format because they 1. are blind or vision impaired; 2. have physical disabilities which limit their ability to hold or manipulate information in a printed form; 3. have perceptual or other disabilities which limit their ability to follow a line of print or which affect their concentration; or 4. cannot comprehend information in a print format due to insufficient literacy or language skills.

Producer: Someone responsible for the production of alternate format material.

Producer's Note: A note included by the producer to indicate differences from the print or the producer's interpretation of visual material.

Project Gutenberg: Project Gutenberg is the first and largest single online collection of free electronic books, or eBooks. http://www.gutenberg.org

Semantic Markup: Using the most meaningful tag to describe the type of content.

Styles: A style is a set of formatting characteristics that can be applied to text in a document to quickly change its appearance.

Subscripts and Superscripts: A number, figure, symbol, or indicator that appears smaller than the normal line of type and is set slightly below or above it - subscripts appear below the baseline, while superscripts are above. Subscripts and superscripts are typically used in formulas, mathematical expressions, and descriptions of chemical compounds or isotopes, but have many other uses as well.

Page 20

Tactile Graphics: Images that use raised surfaces so that a visually impaired person can feel them. They are used to convey non-textual information such as maps, paintings, graphs and diagrams.

W3C: World Wide Web Consortium. The main international standards organisation for the World Wide Web. http://www.w3.org/

WCAG: Web Content Accessibility Guidelines (WCAG) are published by the W3C's Web Accessibility Initiative. They give guidance on making content accessible, primarily for disabled users, but also for all user agents, including highly limited devices, such as mobile phones. http://www.w3.org/WAI/

XML: Stands for Extensible Markup Language. A general-purpose markup language used to encode structured information. A document written in XML contains content (text) and structure (elements or "tags" which describe the content's purpose), but does not describe how the content should be displayed. This separation of content and presentation allows an XML document to be easily converted between different presentations such as a webpage, print document, braille or synthetic speech.

Page 21

Appendix 3: Round Table markup

Conventions for Round Table markup were originally described in Guidelines for preparation of text materials on computer disk for people with print disabilities, 1998. In practice, actual codes used have differed between organisations, with new codes developed as they were needed. The following list is based on the original Guidelines, as well as production manuals by Vision Australia and RNZFB.

If using Round Table codes within an e-text document, always include a list of codes at the beginning or end of the file, or in a separate readme file.

Page numbers: conventions include:

<pp> #

<p #>

<opp> # (for original print page)

<hp #> (for handout pages)

Headings: <h#> where # is the level of heading (up to <h6>)

Emphasis:

<b> for bold

<it> or <i> for italics

<ul> or <u> for underlining

<other> for other types of emphasis e.g. small caps.

Footnotes:

<fn> or <fn#> for footnotes

<mn> for margin notes

Tables:

<table>

<row # col #> where each cell needs separate markup

Superscript and subscript:

<sub>

<super> or <sup>

Images:

<figure>

<caption>

<diagram>, <graph>, <photograph>, <picture>, <map>, <cartoon>

Page 22

Boxes and indented text:

<box>

<box#>

<indent>

Special print symbols:

<sym copyright>

<sym registered trademark>

Producer's note:

<transcriber's note>

<tn>

Line numbers:

<l#>
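Because these codes are plain text, simple software can use them for navigation. The Python sketch below pulls out a list of headings marked with <h#> codes. It assumes each heading code sits at the start of a line and is followed by the heading text on the same line. That layout is an assumption made for this example, since conventions differ between organisations. The function name headings_from_plain_text and the sample text are hypothetical.

```python
import re

# Matches a heading code such as <h1> to <h6> at the start of a line,
# followed by the heading text (assumed layout, see the note above).
HEADING_CODE = re.compile(r"^<h([1-6])>\s*(.*)$")


def headings_from_plain_text(text):
    """Return (level, heading text) pairs for lines that begin with <h#>."""
    outline = []
    for line in text.splitlines():
        match = HEADING_CODE.match(line.strip())
        if match:
            outline.append((int(match.group(1)), match.group(2).strip()))
    return outline


# Hypothetical sample using the codes listed in this appendix.
sample = """<h1> General principles
<pp> 4
<h2> Summary checklist
Body text follows here.
<hp 3>
"""
for level, heading in headings_from_plain_text(sample):
    print("  " * (level - 1) + heading)
```

Other codes, such as <pp> for page numbers or <hp #> for handout pages, are ignored by this sketch.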

Page 23

Appendix 4: Tips for working in Word

Viewing hidden characters

Hidden characters include line breaks and spaces. It can be useful to work with hidden characters visible. Click Show/Hide Paragraph mark on the Standard toolbar.

Turning off Smart Quotes

On the Tools menu, click AutoCorrect Options,

Word Short Cut Keys

Find-and-replace tips

The ^p character finds a paragraph mark, or hard line break.

To add an additional line between paragraphs:

Find ^p and replace with ^p^p.

Where a document has retained line breaks within paragraphs, with an additional line break between paragraphs:

  1. Find ^p^p and replace with *****

  2. Find ^p and replace with a space character

  3. Find ***** and replace with ^p (or ^p^p if an additional line break between paragraphs is needed)
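The same three steps can be applied outside Word. The Python sketch below is one possible version for a plain text file. It assumes that each paragraph mark (^p) corresponds to a single newline character in the text, which may not hold for files with other line-ending conventions. The function name reflow_paragraphs and the sample text are hypothetical.

```python
def reflow_paragraphs(text, blank_line_between=False):
    """Join lines within paragraphs, keeping the breaks between paragraphs."""
    placeholder = "*****"
    if placeholder in text:
        raise ValueError("text already contains the placeholder *****")

    # Step 1: protect the double line breaks that separate paragraphs.
    text = text.replace("\n\n", placeholder)
    # Step 2: turn the remaining line breaks (inside paragraphs) into spaces.
    text = text.replace("\n", " ")
    # Step 3: restore the paragraph breaks, optionally with a blank line.
    separator = "\n\n" if blank_line_between else "\n"
    return text.replace(placeholder, separator)


# Hypothetical sample: two paragraphs with line breaks retained inside them.
sample = "The quick brown\nfox jumps.\n\nA second\nparagraph."
print(reflow_paragraphs(sample))
```

The placeholder check guards against a document that already contains five asterisks in a row, which would otherwise be turned into a paragraph break in step 3.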

Page 25

Appendix 5: Tips for using Scanners and OCR

The advent of scanners and Optical Character Recognition (OCR) software has enabled the easy transfer of printed text into an electronic format. However, production of a well formatted and correct document will usually require additional proofreading and formatting.

There are a number of OCR software packages available and the following suggested tips are generic in nature and not specific to any particular software.

General tips

Formatting

Proofreading

There are a number of common errors which can occur during OCR, not all of which will be picked up by a spell checker. Careful proofreading is still required for an accurate document. Listed below are some common errors which are worth looking for.

Page 26

Foreign languages

Make sure correct language settings are used so that accents on characters are recognised correctly.

Page 27

Closing Announcement

That concludes Guidelines for Accessible E-text, published by the Round Table on Information Access for People with Print Disabilities.