Library and Archives Canada / Bibliothèque et Archives Canada


Open Standards

Open Standards are standards with specifications which are open to the public and can be freely implemented by any developer. Open standards are usually developed and maintained by formal bodies and/or communities of interested parties, such as the Free Software/Open Source community. Open standards exist in opposition to proprietary standards, which are developed and maintained by commercial companies.

Open standards work to ensure that the widest possible group of contemporary readers may access a publication. In a world of multiple hardware and software platforms, it is virtually impossible to guarantee that a given electronic publication will retain its intended look and feel for all viewers, but open standards at least increase the likelihood that a publication can be opened in some form.

From a business perspective, open standards also make good sense. They help to ensure that product development and debugging occur quickly, cheaply and effectively by dispersing these tasks among wide groups of users. Open standards also work to promote customer loyalty, because the use of open standards suggests that a company trusts its clients and is willing to engage in honest conversations with them.

Open standards also facilitate the archiving and storage of electronic publications. NLC's collection of electronic publications, for example, emphasizes the acquisition of 'format neutral' publications in order to avoid conversion of electronic documents whenever possible. In fact, when converting electronic documents from proprietary formats to standard formats is deemed necessary for the purpose of long-term preservation, the NLC is prepared to sacrifice a superior presentation format to long-term preservation needs. Publishers that adopt open standards for their publications from the outset can help to ensure that they remain relatively stable for posterity.

The Free Software Foundation / Open Source

The GNU Project www.gnu.org/home.html was launched by Richard Stallman in 1984 to develop a completely free Unix-like operating system. With the addition of a kernel created by Linus Torvalds, this system has become known as GNU/Linux.

GNU/Linux is the ultimate open standard -- an entire universe of software applications and document types that are extremely powerful and reliable, and are available to all, for nothing. For more information about GNU/Linux, visit Slashdot www.slashdot.org, Linux.com, and The Open Source Development Network http://www.osdn.com.

There are a number of commonly used open standards for digital publications:

JPEG (.jpg) Graphics

Characteristics

The JPEG (Joint Photographic Experts Group) graphics format is a 24-bit compression method developed specifically for the online display of photographic images.

JPEG uses a 'lossy' compression method, which means that it removes information from the source image during the file creation process in order to make the final image smaller for online use. Most software applications that are capable of producing JPEGs also allow the user to specify the level of compression, which allows for smaller files, but with a corresponding loss in image quality.
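The trade-off between compression level and image quality can be illustrated with a toy sketch. This is not the JPEG algorithm itself (JPEG quantizes frequency coefficients rather than raw samples); it only assumes a hypothetical row of 8-bit sample values and rounds them to coarser steps before applying a general-purpose compressor. The names `quantize`, `max_error` and `samples` are invented for the illustration.

```python
import zlib


def quantize(samples, step):
    """Round each 0-255 sample to the nearest multiple of `step`.

    With step == 1 nothing is lost; larger steps discard more detail.
    """
    return bytes(min(255, (v + step // 2) // step * step) for v in samples)


def max_error(original, degraded):
    """Largest difference between an original sample and its degraded copy."""
    return max(abs(a - b) for a, b in zip(original, degraded))


# Hypothetical "scanline": a smooth ramp with a little texture added.
samples = bytes((i // 4 + (i * 37) % 5) % 256 for i in range(1024))

for step in (1, 8, 32):
    degraded = quantize(samples, step)
    size = len(zlib.compress(degraded))
    error = max_error(samples, degraded)
    print(f"step={step:2d}  compressed size={size} bytes  max error={error}")
```

A coarser step usually produces a smaller compressed result, at the price of a larger error that cannot be undone afterwards. That is the essence of a lossy method.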

Aside from standard JPEGs, there are also 'progressive' JPEGs, which feature both higher compression rates than standard JPEGs, and support for 'interlacing', which loads the image into a browser in a series of incrementally clearer steps.

Applications

JPEGs were specifically designed for the display of photographs in an online environment. They work best with complex images that display a range of tones. For line art and other types of simple images, GIFs and PNGs produce smaller files, with no noticeable loss in image quality.

Examples

The Online Image Archive www.maths.tcd.ie/pub/images/images.html is an extensive archive of images stored in both the GIF and JPEG formats.

Resources

The official JPEG homepage www.jpeg.org/public/jpeghomepage.htm leads to links to JPEG's committee members' sites, as well as to other useful sources of information about JPEG. It also provides information about how to join the JPEG committee.

The JPEG Image Compression FAQ www.jpeg.org/public/jpeghomepage.htm appears in HTML on this web page.

PNG (.png) Graphics

Characteristics

PNG stands for 'Portable Network Graphics' and, unofficially, 'PNG's Not GIF'. PNG is a lossless compression standard which allows files to be stored at 8-, 24- or 32-bit depth.

PNG was designed as a replacement for the GIF file format, and has many advantages over it. Its interlacing method is superior: PNGs begin to display after a much smaller proportion of the file has been loaded than GIFs require. PNGs can also carry gamma information describing the display system on which they were created, which computers can use to automatically adjust how they display the image. Like GIFs, PNGs support transparency.
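To make the 'lossless' property concrete, the following is a minimal sketch of a PNG writer and reader that uses only Python's standard library. It assumes the simplest possible case: an 8-bit greyscale image, no interlacing, and filter type 0 on every row. The function names are invented for the example, and the reader deliberately handles only files written in that simple form. It is not a general PNG decoder.

```python
import struct
import zlib

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def chunk(kind, data):
    """Build one PNG chunk: length, type, data, then a CRC over type + data."""
    crc = zlib.crc32(kind + data) & 0xFFFFFFFF
    return struct.pack(">I", len(data)) + kind + data + struct.pack(">I", crc)


def encode_gray_png(rows):
    """Encode equal-length rows of 8-bit grey samples as a PNG file."""
    height, width = len(rows), len(rows[0])
    # IHDR fields: width, height, bit depth 8, colour type 0 (greyscale),
    # compression 0, filter method 0, interlace 0.
    ihdr = struct.pack(">IIBBBBB", width, height, 8, 0, 0, 0, 0)
    # Each scanline is prefixed with its filter type (0 = none).
    raw = b"".join(b"\x00" + bytes(row) for row in rows)
    return (PNG_SIGNATURE
            + chunk(b"IHDR", ihdr)
            + chunk(b"IDAT", zlib.compress(raw))
            + chunk(b"IEND", b""))


def decode_gray_png(png):
    """Decode a file produced by encode_gray_png (filter type 0 only)."""
    if png[:8] != PNG_SIGNATURE:
        raise ValueError("not a PNG file")
    pos, width, height, idat = 8, 0, 0, b""
    while pos < len(png):
        (length,) = struct.unpack(">I", png[pos:pos + 4])
        kind = png[pos + 4:pos + 8]
        data = png[pos + 8:pos + 8 + length]
        pos += 12 + length  # 4 length + 4 type + data + 4 CRC
        if kind == b"IHDR":
            width, height = struct.unpack(">II", data[:8])
        elif kind == b"IDAT":
            idat += data
    raw = zlib.decompress(idat)
    stride = width + 1  # one filter byte plus the samples
    return [raw[y * stride + 1:(y + 1) * stride] for y in range(height)]


rows = [bytes([0, 64, 128, 255]), bytes([255, 128, 64, 0])]
png = encode_gray_png(rows)
print(decode_gray_png(png) == rows)  # every sample survives the round trip
```

Because the compression step is lossless, the decoded samples are identical to the originals, which is not true of a JPEG saved at a reduced quality setting.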

Applications

Because PNG is a relatively new graphics format, it is not backwards-compatible with older browsers. PNGs also tend to be slightly larger than GIFs.

However, unlike GIFs, PNGs don't use the LZW compression algorithm, which is patented by Unisys. If publishers create GIFs using a program containing an unlicensed copy of the Lempel-Ziv-Welch compression algorithm (http://www.rasip.fer.hr/research/compress/algorithms/fund/lz/lzw.html), that may leave them open to a charge of 'contributory infringement' from Unisys. See this article http://community.borland.com/devnews/article/1,1714,20002,00.html and the Burn All GIFs http://zgp.org/~dmarti/burnallgifs/ page for more information.

Examples

Many pages produced by members of the Open Source/Free Software movement feature PNGs. PNGArt http://www.pngart.com/ features over 50000 royalty-free PNGs available for download. This page at WonderStorm (www.wonderstorm.com/techstuff/) provides information about using PNGs as Web page backgrounds.

Resources

The W3 Consortium officially endorses the PNG standard, and maintains links to its specifications www.w3.org/Graphics/PNG/ and other resources, including a page that will test your browser for PNG compatibility www.w3.org/Graphics/PNG/.

Plain text/ASCII text

Characteristics

The most basic of open standards for electronic publication is an electronic document without any formatting (i.e. 'plain text'). Virtually all word processors are capable of saving documents in plain text; for example, Microsoft Word provides a 'Text Only' option in its 'File/Save As' menu (the default Word file format, sometimes called a '.doc' document, is a proprietary format -- see the section on proprietary standards for further explanation). Among PC users, it is customary to append plain text files with the '.txt' extension for easy identification; this practice has continued even into the current era of long filenames.

Sometimes people mistakenly refer to plain text as ASCII. The Webopedia provides the following definition for ASCII:

Acronym for the American Standard Code for Information Interchange. Pronounced ask-ee, ASCII is a code for representing English characters as numbers, with each letter assigned a number from 0 to 127. Text files stored in ASCII format are sometimes called ASCII files. Text editors and word processors are usually capable of storing data in ASCII format, although ASCII format is not always the default storage format.
(For more information on ASCII, visit Yahoo!'s ASCII Page).

In other words, ASCII is a character coding scheme, and ASCII text is text coded using that scheme. Plain text refers specifically to unformatted text and does not make any reference to the coding. When people or institutions request a document in ASCII, what they usually want is plain text.
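The distinction can be shown in a few lines of Python. This is a small sketch with invented helper names (`ascii_codes`, `is_ascii`): the first returns the numbers that ASCII assigns to each character, and the second checks whether a piece of plain text can be coded in ASCII at all.

```python
def ascii_codes(text):
    """Return the ASCII number of each character.

    Raises UnicodeEncodeError if the text contains a non-ASCII character.
    """
    return list(text.encode("ascii"))


def is_ascii(text):
    """True when every character falls in the ASCII range 0 to 127."""
    return all(ord(ch) < 128 for ch in text)


print(ascii_codes("ASCII"))      # [65, 83, 67, 73, 73]
print(is_ascii("Plain text"))    # True
print(is_ascii("Bibliothèque"))  # False: 'è' lies outside ASCII

# The same unformatted text is still plain text under another coding scheme.
print(len("Bibliothèque".encode("utf-8")))  # 13 bytes for 12 characters
```

Both strings are plain text, since neither carries any formatting. Only the first two can be stored as ASCII.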

Publishers that find plain text too restrictive may want to consider the other minimal formatting options that word processors provide, such as 'Text Only with Line Breaks', which inserts carriage returns at the end of typed lines, making them easier to read in plain text editing programs such as BBEdit, SimpleText and WordPad.
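As a rough illustration of what such an option does, the sketch below uses Python's standard `textwrap` module to insert hard line breaks into a paragraph. The 40-character width and the function name `add_line_breaks` are assumptions made for the example. A word processor's own rules for where lines end, and which line-ending character it writes, may differ.

```python
import textwrap

paragraph = ("Open standards work to ensure that the widest possible group of "
             "contemporary readers may access a publication.")


def add_line_breaks(text, width=40):
    """Return the text with a newline inserted so no line exceeds `width`."""
    return textwrap.fill(text, width=width)


print(add_line_breaks(paragraph))
```

The words are unchanged, and each line now ends with an explicit break, so the text stays readable in an editor that does not wrap long lines.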

Applications

The most frequent use of plain text documents is in the area of software documentation. Plain text is the best option whenever there is uncertainty about the technological capabilities of a publication's intended audience, because virtually any text reader/editor can open plain text documents.

Examples

Michael Hart began Project Gutenberg (http://promo.net/pg/) in 1971 with the avowed intention of transferring as many public-domain works of literature as possible into plain text datafiles. Many of these files were written before the advent of sophisticated word processors; their existence today is a testament to the archival value of the textfile.

HTML

Characteristics

HTML, or Hypertext Markup Language, uses a system of tags to describe the structure and layout of a document in a manner that makes it viewable by web browsers and other forms of software. Like its cousin XML, HTML was designed to describe a document's structure, not its appearance (though it has been heavily retrofitted with the use of various plug-ins and supplementary languages to provide greater control over appearance). HTML also makes it possible to connect documents to each other via hyperlinks, an essential characteristic of the online medium.
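Because the tags describe structure, software other than a browser can read that structure directly. The following sketch uses Python's standard `html.parser` module to pull the headings and hyperlinks out of a small, made-up HTML page. The class name `OutlineParser` and the sample markup are assumptions for the example.

```python
from html.parser import HTMLParser

HEADING_TAGS = ("h1", "h2", "h3")


class OutlineParser(HTMLParser):
    """Collect heading text and hyperlink targets from an HTML document."""

    def __init__(self):
        super().__init__()
        self.headings = []
        self.links = []
        self._in_heading = False

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)
        elif tag in HEADING_TAGS:
            self._in_heading = True
            self.headings.append("")

    def handle_endtag(self, tag):
        if tag in HEADING_TAGS:
            self._in_heading = False

    def handle_data(self, data):
        # Text is recorded only while a heading tag is open.
        if self._in_heading:
            self.headings[-1] += data


SAMPLE = """<html><body>
<h1>Open Standards</h1>
<p>See the <a href="http://www.w3.org/Graphics/PNG/">PNG specifications</a>.</p>
<h2>Resources</h2>
<p><a href="http://www.gnu.org/home.html">The GNU Project</a></p>
</body></html>"""

parser = OutlineParser()
parser.feed(SAMPLE)
print(parser.headings)  # ['Open Standards', 'Resources']
print(parser.links)     # ['http://www.w3.org/Graphics/PNG/', 'http://www.gnu.org/home.html']
```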

Applications

HTML is the formatting or 'markup' language that is used to prepare documents for viewing on the World Wide Web. Increasingly, HTML is also being used as documentation for software distributed on CD-ROM, as its hyperlinks provide an excellent structure for reference materials.

Examples

Examples of HTML in its varying degrees of complexity lie under every page of the World Wide Web. To view the HTML source of a Web page, select the 'View Source' option from your Web browser's tool bar.

HighWire Press is one of the two largest free full-text science archives on earth, featuring over 260 sites, and more than 255,000 free, full-text articles.

Coach House Books, a small Canadian literary press, has been publishing HTML editions of its frontlist poetry, fiction, drama and art book titles since 1997. This case study presents a short description of the online publishing practices at Coach House Books.

Resources

The W3 Consortium's (W3C) home page for HTML provides links to their specifications for HTML, guidelines on how to use HTML to the best effect, and links to related subjects on the W3C site.

The W3C also provides a free HTML Validation Service. This web-based service checks HTML documents for conformance to W3C HTML, XHTML and other HTML-related standards. Publishers using cascading style sheets (CSS) to format their HTML documents will also want to visit W3C's CSS validator.

Dave Raggett's HTML TIDY is a free utility for fixing HTML coding mistakes automatically and tidying up sloppy editing. This tool can also help publishers to identify where they need to pay further attention to making HTML pages more accessible to people with disabilities.

The HTML Writers Guild is the world's largest international organization of Web authors, with over 123,000 members in more than 150 nations worldwide. Its site provides resources, support, representation, and education for web authors at all skill levels.

Webopedia's HTML page provides a wide variety of links to other related resources.

XML

Characteristics

XML, or Extensible Markup Language, is exactly what it sounds like: a highly customizable, robust markup language that takes the kinds of tasks that HTML performs one huge step forward. In fact, it is a sort of cousin to HTML, since both derive from SGML (the Standard Generalized Markup Language), an international standard created to solve problems in exchanging data between different types of computer systems.

Applications

XML describes a document's structure, not its appearance. Briefly, it allows a publisher to identify, quantify and define the information in their documents in a manner that makes sense to them and others in their industry, and allows the document to be reshaped for different needs (such as different types of displays) without disturbing its underlying structure. The relevance of XML today stems from the fact that basic HTML simply will not be capable of providing an adequate environment for the next wave of networked digital publishing, where data will have to be displayed in an increasingly varied number of environments, and searched in more sophisticated ways.
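A short sketch may help to show what 'reshaping' a document means in practice. The XML below uses invented element names (`catalogue`, `book`, `title`, `author`, `year`) of the kind a publisher might define for itself. The same structure is then rendered two ways with Python's standard `xml.etree.ElementTree` module, without the source being altered.

```python
import xml.etree.ElementTree as ET
from html import escape

# A hypothetical publisher-defined vocabulary; the tag names are assumptions.
SOURCE = """<catalogue>
  <book>
    <title>First Sample Title</title>
    <author>A. Author</author>
    <year>1999</year>
  </book>
  <book>
    <title>Second Sample Title</title>
    <author>B. Writer</author>
    <year>2001</year>
  </book>
</catalogue>"""


def books(root):
    """Yield (title, author, year) for each book element."""
    for book in root.iter("book"):
        yield book.findtext("title"), book.findtext("author"), book.findtext("year")


def as_plain_text(root):
    """One line per book, suitable for a text-only display."""
    return "\n".join(f"{t} / {a} / {y}" for t, a, y in books(root))


def as_html(root):
    """The same information as an HTML list for a web browser."""
    items = "".join(
        f"<li><cite>{escape(t)}</cite>, {escape(a)} ({escape(y)})</li>"
        for t, a, y in books(root)
    )
    return f"<ul>{items}</ul>"


root = ET.fromstring(SOURCE)
print(as_plain_text(root))
print(as_html(root))
```

Only the two rendering functions know anything about appearance. The XML itself records what each piece of information is, which is what allows it to be searched or redisplayed in other environments.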

Examples

The XML 101 site provides a number of examples of XML documents of varying complexity, as well as other useful resources, tutorials and links.

Resources

XML in 10 points provides a Cook's Tour of the language. Answers to Frequently Asked Questions about the Extensible Markup Language can be found here.

O'Reilly's XML portal site and The XML Industry Portal go into considerably greater detail than the aforementioned sites, and provide access to discussion forums, resources and relevant recent news stories about XML.

For more technologically advanced publishers, The XML Cover Pages is a comprehensive online reference work for XML and SGML.

The Open eBook Publication Structure

Characteristics

The Open eBook Publication Structure is a specification for representing the content of electronic books. It is based on the premise that in order for electronic book technology to achieve widespread success in the marketplace, all electronic reading systems must be able to access a large number and variety of titles, and to present content with a reasonable degree of fidelity, accuracy, accessibility, and uniformity across various platforms.

The specification itself is based on HTML and XML, and is designed to allow publishers and authors to deliver their material in a single format. It describes a set of common, minimal guidelines which strive to reflect existing content format standards.

Applications

The chief reason for using The Open eBook Publication Structure is to attempt to capture a slice of the emerging eBook market. The more eBooks that exist, the more likely it is that the reader platforms that support standard formats will become ubiquitous. Companies that distribute eBooks may also offer more in the way of guarantees about tracking electronic documents and managing rights and licensing than most publishers may be able to manage on their own.

Examples

Netlibrary has a collection of over 3500 free eBooks.

Resources

The Open eBook Forum (OeBF), an association of hardware and software companies, publishers, authors and users of electronic books, is the primary source of information about this specification.

Brown University's Scholarly Technology Group (STG), in conjunction with NuvoMedia, Inc., makers of the Rocket eBook, has developed the Open eBook Validator, a free service that enables authors and publishers to quickly and easily test their publications for conformance with the Open eBook Publication Structure Specification.