
Electronic books and their formats: talking about EPUB - its history, pros and cons

Earlier on the blog, we wrote about how the DjVu and FB2 e-book formats came about.

The topic of today's article is EPUB.


Image: Nathan Oakley / CC BY

Format history


In the 1990s, proprietary solutions dominated the e-book market, and many reader manufacturers had their own formats. For example, NuvoMedia used files with the .rb extension. These were containers holding an HTML file and an .info file with metadata. This state of affairs complicated publishers' work: they had to lay out books for each format separately. A group of engineers from Microsoft, the aforementioned NuvoMedia, and SoftBook Press set out to fix the situation.

At that time, Microsoft was planning to conquer the e-book market and was developing a reader application for Windows 95. Creating a new format was arguably part of the IT giant's business strategy.

As for NuvoMedia, the company is considered the maker of the first mass-market e-reader, the Rocket eBook. The device had only eight megabytes of internal memory, and its battery life did not exceed 40 hours. SoftBook Press also developed e-readers, but its devices had a distinctive feature: a built-in modem that let users download digital literature directly from the SoftBookstore.

In the early 2000s, both companies, NuvoMedia and SoftBook, were acquired by the media company Gemstar and merged into the Gemstar eBook Group. For several years this organization sold readers (for example, the RCA REB 1100) and digital books, but in 2003 it went out of business.

But back to the development of a single standard. In 1999, Microsoft, NuvoMedia and SoftBook Press founded the Open eBook Forum, whose work included a draft document that laid the groundwork for EPUB. Initially, the standard was called OEBPS (Open eBook Publication Structure). It allowed a digital publication to be distributed as a single file (a ZIP archive) and simplified the transfer of books between different hardware platforms.

Later, the IT companies Adobe, IBM, HP, Nokia and Xerox, as well as the publishers McGraw Hill and Time Warner, joined the Open eBook Forum. Together, they continued to develop OEBPS and the digital literature ecosystem as a whole. In 2005, the organization was renamed the International Digital Publishing Forum, or IDPF.

In 2007, IDPF renamed the OEBPS format to EPUB and began developing its second version, which was presented to the public in 2010. The new version differed little from its predecessor but added support for vector graphics and embedded fonts.

By this time, EPUB was conquering the market and becoming the default standard for many publishers and device manufacturers. The format was already used by O'Reilly and Cisco Press, and it was supported by devices from Apple, Sony, Barnes & Noble and ONYX BOOX.

In 2009, the Google Books project announced support for EPUB and used it to distribute more than a million free books. The format also began to gain popularity among writers. In 2011, J. K. Rowling announced plans to launch the Pottermore site and make it the only point of sale for the Harry Potter books in digital form.

EPUB was chosen as the distribution standard primarily because it makes copy protection (DRM) possible. All books in the writer's online store are still available only in this format.

The third version of the EPUB format was released in 2011. The developers added support for audio and video files and for footnotes. The standard continues to evolve today: in 2017, IDPF merged into the W3C, the consortium that develops technology standards for the World Wide Web.

How EPUB works


A book in the EPUB format is a ZIP archive. It stores the text of the publication as XHTML or HTML pages or PDF files. The archive also holds media content (audio, video or images), fonts and metadata. It may contain additional files as well, such as CSS style sheets or PLS documents with pronunciation information for speech synthesis services.
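The container structure described above can be explored with ordinary ZIP tooling. The following is a minimal Python sketch, not a validator: it builds a tiny, incomplete EPUB-like archive in memory and sorts its entries by kind. The file names (`OEBPS/chapter1.xhtml` and so on) and the `KINDS` table are illustrative assumptions; the `mimetype` entry and `META-INF/container.xml` are part of the EPUB container convention.

```python
import io
import zipfile
from pathlib import PurePosixPath

# Rough mapping from file extension to content kind (an illustrative assumption).
KINDS = {
    ".xhtml": "text", ".html": "text", ".css": "style",
    ".jpg": "image", ".png": "image", ".mp3": "audio", ".mp4": "video",
    ".otf": "font", ".ttf": "font", ".pls": "pronunciation lexicon",
}

def build_sample_epub() -> bytes:
    """Build a tiny, incomplete EPUB-like archive in memory (hypothetical contents)."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", compression=zipfile.ZIP_DEFLATED) as zf:
        # The mimetype entry goes first and is stored without compression.
        zf.writestr("mimetype", "application/epub+zip",
                    compress_type=zipfile.ZIP_STORED)
        zf.writestr("META-INF/container.xml", "<container/>")
        zf.writestr("OEBPS/chapter1.xhtml", "<html/>")
        zf.writestr("OEBPS/css/style.css", "p { margin: 0; }")
        zf.writestr("OEBPS/images/cover.jpg", b"\xff\xd8\xff")
    return buf.getvalue()

def classify_entries(epub_bytes: bytes) -> dict:
    """Map every archive entry to a rough content kind based on its extension."""
    with zipfile.ZipFile(io.BytesIO(epub_bytes)) as zf:
        return {
            name: KINDS.get(PurePosixPath(name).suffix.lower(), "other")
            for name in zf.namelist()
        }

if __name__ == "__main__":
    for name, kind in classify_entries(build_sample_epub()).items():
        print(f"{kind:>6}  {name}")
```

To inspect a real book, the bytes of an .epub file read from disk can be passed to `classify_entries` in place of the generated sample.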

XML markup is responsible for displaying the content. A fragment of a book with embedded audio and an image may look like this:

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE html>
<html xmlns="http://www.w3.org/1999/xhtml"
      xmlns:epub="http://www.idpf.org/2007/ops"
      xmlns:ev="http://www.w3.org/2001/xml-events"
      epub:prefix="media: http://idpf.org/epub/vocab/media/#">
  <head>
    <meta charset="utf-8" />
    <link rel="stylesheet" type="text/css" href="../css/shared-culture.css" />
  </head>
  <body>
    <section class="base">
      <h1>the entire transcript</h1>
      <audio id="bgsound" epub:type="media:soundtrack media:background"
             src="../audio/asharedculture_soundtrack.mp3" autoplay="" loop="">
        <div class="errmsg">
          <p>Your Reading System does not support (this) audio</p>
        </div>
      </audio>
      <p>What does it mean to be human if we don't have a shared culture? What does a shared culture mean if we can't share it? It's only in the last 100, or 150 years or so, that we started tightly restricting how that culture gets used.</p>
      <img class="left" src="../images/326261902_3fa36f548d.jpg" alt="child against a wall" />
    </section>
  </body>
</html>

In addition to the content files, the archive holds a special navigation document (Navigation Document). It describes the layout of the text and images in the book. Reader applications consult it when the reader wants to skip ahead several pages.
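As a rough illustration of how a reader application might use the navigation document, the Python sketch below pulls the table-of-contents links out of an EPUB 3 style `nav` element. The sample markup, the chapter file names and the `read_toc` helper are hypothetical; only the XHTML and `epub` namespace URIs are taken from the standard.

```python
import xml.etree.ElementTree as ET

XHTML_NS = "http://www.w3.org/1999/xhtml"
EPUB_NS = "http://www.idpf.org/2007/ops"

# Hypothetical navigation document with two entries.
SAMPLE_NAV = """\
<html xmlns="http://www.w3.org/1999/xhtml"
      xmlns:epub="http://www.idpf.org/2007/ops">
  <body>
    <nav epub:type="toc">
      <ol>
        <li><a href="chapter1.xhtml">Chapter 1</a></li>
        <li><a href="chapter2.xhtml#s2">Chapter 2</a></li>
      </ol>
    </nav>
  </body>
</html>
"""

def read_toc(nav_xml: str) -> list:
    """Return (href, label) pairs from the first nav element marked as a toc."""
    root = ET.fromstring(nav_xml)
    for nav in root.iter(f"{{{XHTML_NS}}}nav"):
        if nav.get(f"{{{EPUB_NS}}}type") == "toc":
            return [
                (a.get("href"), "".join(a.itertext()).strip())
                for a in nav.iter(f"{{{XHTML_NS}}}a")
            ]
    return []

if __name__ == "__main__":
    for href, label in read_toc(SAMPLE_NAV):
        print(f"{label} -> {href}")
```

A reading system would typically resolve each `href` relative to the navigation document's location inside the archive before jumping to the target.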

Another mandatory file in the archive is the package document. It includes metadata: information about the author, publisher, language, title, and so on. It also includes a list (the spine) of the book's subsections. An example of a package document can be found in the IDPF repository on GitHub.
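To make the roles of the metadata and the spine more concrete, here is a small Python sketch that reads both from a package document. The sample document, its title, author and file names are made up for illustration, and `read_package` is a hypothetical helper rather than part of any EPUB library; the two namespace URIs are the ones used by package documents and Dublin Core metadata.

```python
import xml.etree.ElementTree as ET

NS = {
    "opf": "http://www.idpf.org/2007/opf",
    "dc": "http://purl.org/dc/elements/1.1/",
}

# Hypothetical package document for a two-chapter book.
SAMPLE_OPF = """\
<package xmlns="http://www.idpf.org/2007/opf" version="3.0" unique-identifier="pub-id">
  <metadata xmlns:dc="http://purl.org/dc/elements/1.1/">
    <dc:identifier id="pub-id">urn:uuid:00000000-0000-0000-0000-000000000000</dc:identifier>
    <dc:title>Sample Book</dc:title>
    <dc:creator>Jane Doe</dc:creator>
    <dc:language>en</dc:language>
  </metadata>
  <manifest>
    <item id="nav" href="nav.xhtml" media-type="application/xhtml+xml" properties="nav"/>
    <item id="ch1" href="chapter1.xhtml" media-type="application/xhtml+xml"/>
    <item id="ch2" href="chapter2.xhtml" media-type="application/xhtml+xml"/>
  </manifest>
  <spine>
    <itemref idref="ch1"/>
    <itemref idref="ch2"/>
  </spine>
</package>
"""

def read_package(opf_xml: str):
    """Return (metadata dict, reading order as a list of hrefs)."""
    root = ET.fromstring(opf_xml)
    metadata = {
        tag: root.findtext(f"opf:metadata/dc:{tag}", namespaces=NS)
        for tag in ("title", "creator", "language")
    }
    # The manifest maps item ids to files; the spine lists ids in reading order.
    manifest = {
        item.get("id"): item.get("href")
        for item in root.findall("opf:manifest/opf:item", NS)
    }
    spine = [
        manifest[ref.get("idref")]
        for ref in root.findall("opf:spine/opf:itemref", NS)
    ]
    return metadata, spine

if __name__ == "__main__":
    meta, order = read_package(SAMPLE_OPF)
    print(meta)
    print(order)
```

The split between manifest and spine is why the navigation file appears in the manifest here but not in the reading order.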

Advantages


The main advantage of the format is its flexibility. EPUB lets you create a dynamic document layout that adapts to the device's screen size. This is one of the main reasons why so many readers (and other electronic devices) support the format. For example, all ONYX BOOX readers work with EPUB out of the box, from the basic 6-inch Caesar 3 to the premium 9.7-inch Euclid.


Image: ONYX BOOX Caesar 3

Since the format is based on popular standards (XML), it is easy to convert for reading on the web. EPUB also supports interactive elements. PDF has similar elements, but they can be added to a PDF document only with proprietary software. In EPUB, they are added to the book with markup and XML tags in any text editor.

Another advantage of EPUB is its features for people with vision problems or dyslexia. The standard lets you modify how text is displayed on the screen, for example by highlighting certain letter combinations.

As noted above, EPUB gives the publisher the option of adding copy protection. If desired, e-book sellers can use their own mechanisms to restrict access to the document. To do this, they modify the rights.xml file in the archive.
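One simple way to see whether a book carries rights information is to look for the reserved files in the archive's META-INF directory. The Python sketch below does only that, under the assumption that `META-INF/rights.xml` and `META-INF/encryption.xml` are the files of interest. Their presence is a hint rather than proof of DRM (encryption.xml, for instance, is also used for font obfuscation). The `make_archive` helper exists only to produce test data and is hypothetical.

```python
import io
import zipfile

# Reserved container files that may indicate rights management or encryption.
RESERVED = ("META-INF/rights.xml", "META-INF/encryption.xml")

def make_archive(names) -> bytes:
    """Build an in-memory ZIP with empty placeholder entries (test data only)."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        for name in names:
            zf.writestr(name, "")
    return buf.getvalue()

def protection_hints(epub_bytes: bytes) -> list:
    """List the reserved rights or encryption files present in the archive."""
    with zipfile.ZipFile(io.BytesIO(epub_bytes)) as zf:
        names = set(zf.namelist())
    return [name for name in RESERVED if name in names]

if __name__ == "__main__":
    protected = make_archive(["mimetype", "META-INF/rights.xml"])
    print(protection_hints(protected))
```

The contents of rights.xml depend on the seller's scheme, so this sketch deliberately stops at detecting the file and does not try to interpret it.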

Disadvantages


To create an EPUB publication, you need to understand the syntax of XML, XHTML and CSS, and you have to work with a large number of tags. For comparison, the FB2 standard includes only a minimal set of tags, sufficient for laying out fiction. Creating PDF documents requires no special knowledge at all, since specialized software takes care of everything.

EPUB is also criticized for how hard it is to lay out comics and other books with many illustrations. In this case, the publisher has to create a static layout with fixed coordinates for each image, which can take a lot of time and effort.

What's next


IDPF is currently working on new specifications for the format. For example, one of them will help create interactive textbooks with hidden sections. The same book will look different for the teacher and the student: for the student, for example, the answers to tests or quiz questions will be hidden.


Image: Guian Bolisay / CC BY-SA

The new feature is expected to help reorganize the educational process. Today, EPUB is actively used by large universities such as Oxford University. A few years ago, it added support for EPUB 3.0 to its digital library application.

IDPF is also creating a specification for embedding Open Annotation notes in EPUB. This standard was developed at W3C in 2013 and simplifies working with complex types of annotations. For example, it can be used to attach a note to a specific region of a JPEG image. In addition, the standard implements a mechanism for synchronizing annotation changes between copies of the same EPUB document. Open Annotation notes can already be added to EPUB files, but the formal specification for them has not yet been adopted.

Work is also underway on a new version of the standard, EPUB 3.2. It will add support for the WOFF 2.0 and SFNT font formats (in some cases, compressed fonts can reduce file sizes by 30%). The developers will also replace some obsolete markup. For example, instead of a separate trigger element for activating audio and video files, the new standard will rely on the native HTML audio and video elements.

The draft specification and change list are already available in the W3C GitHub repository.




Source: https://habr.com/ru/post/456958/

