On eBook Reading Systems

Abstract

In this page I collect some notes about Reading Systems for eBooks, and their (current) lack of functions to properly support advanced usages of eBooks. The opinions contained in this page are personal. Feel free to disagree, to make them yours, even to exploit them in commercial products. In the latter case, I would like to be kept posted of your progress. Comments are welcome, especially if you point out mistakes or you have useful suggestions: just drop me an email or send me a tweet.

Important Update: on 2013-09-09, we published the project on GitHub: https://github.com/pettarin/epub3reader/, where you can find screenshots, a pre-compiled APK, and the source code.

Note: on 2013-05-08, I updated/clarified parts of this page, partially in response to comments and critiques I gathered. Original parts that I wanted to edit/delete have been struck out ~~like this~~. A discussion takes place here.

Definitions

I wrote this page with the least possible technical language, so that it can be read by real-world people, not only software developers and ebook producers. (Kidding, of course.) To avoid misunderstandings, let me define the precise meaning I will give to some frequently occurring words:

book: a self-contained, packaged cognitive unit, possibly composed of different resources, like text, illustrations, audio, etc. (used here mainly to refer to the content)
eBook: the digital instantiation of a book (used here mainly to refer to content+markup+metadata) — bowerbird suggested: “one or more files in a directory”
ecosystem: an integrated platform (like Amazon Kindle or Kobo) where a user can buy/store eBooks, read them from multiple devices/apps, synchronize notes and bookmarks, etc.
pBook: the physical (usually, paper) instantiation of a book
Reading System (RS): software used to get a rendition (visual, aural, both, etc.) of an eBook. It might be a stand-alone PC application, an app, the reading software of an eReader, etc.

My View on eBooks and Reading Systems

Let me start with a bold statement: reading eBooks today is quite a disappointing experience, compared with their (abstract) potential.

Do not get me wrong: if you simply want to read novels or comics in your spare time, you might find decently formatted eBooks, and decent Reading Systems (RS) to enjoy them. (Even in this case, though, few RS allow the reader to apply really deep custom settings to the rendition of the eBook, and almost all of them focus on typographical aspects, like changing the font, its size, line density, margins, justification, and so on.)

But when you start using books for more complex activities, like studying, learning a foreign language, consulting professional manuals, and the like, current RS prove to be very poor tools. Have you ever experienced extremely different renditions of the same eBook in two different RS? Have you ever tried taking annotations on an eBook, expecting them to be embedded into the eBook while, at the same time, being able to use them outside the RS/ecosystem where you created them? Have you ever gone through the tiring, frustrating experience of using footnotes, dictionaries, translations, and glossaries on today's RS? Have you tried referencing a text fragment of an eBook ~~in another book (eBook or pBook)~~?

I think that the main reasons for all this mess are:

eBooks are reaching critical mass (in terms of serious users) only now;
non-recreational eBooks are a marginal fraction of the eBook market, but they have an enormous potential (just think about the educational segment);
these high commercial stakes make big publishers/vendors cultivate their own ecosystem, for commercial gain, instead of fostering open, interoperable standards and tools, sometimes even with the acquiescence of governments/regulatory agencies;
on the other hand, open initiatives from public institutions seem to be geared toward favoring content production/digitization, rather than improving standards and tools;
very few eBooks offer a real advantage, beyond de-materialization, over their pBook equivalents (i.e., a lot of eBooks tend to be just the digitization of a pBook, instead of being an augmentation of the corresponding content).

I do not have a magical recipe to improve the digital publishing ecosystem, especially in its financial dynamics: I leave this task to those who can pull the right levers. But I can certainly contribute to the technical side of the discussion about eBooks and RS. The only prediction I have the guts to make is that the future belongs to open formats and tools, and that building walled gardens full of amazing proprietary tools will not pay off in the long term. Hence, in what follows, I will speak about open formats, RS and tools only.

~~Please, Give Us~~ The Need for Intelligent RS

I think that one of the worst pitfalls in the industry is that current RS are way dumber than they should be, given the level of sophistication that formats like EPUB3 allow. (As bowerbird noted, authoring tools are quite ineffective too. I agree with his view that even a six-year-old should be able to produce an eBook. This theme calls for another, deeper discussion.)

For example, no RS that I am aware of takes any advantage of the semantic vocabulary defined by EPUB3 (with the exception of footnotes in iBooks). But I do not want to enter the dangerous field of discussing a particular format. So, let’s consider the generic theme of how links are handled in eBooks: how many RS allow the user to split the viewport to display the target (footnote, other chapter of the same text, an external Web page) concurrently with, say, the text fragment referencing it? (AFAIK, only ASTRI Bee can split the viewport, but it does not manage links automatically.) Along this line, the list of examples can be made arbitrarily long.

The effect is that eBook creators tend to spend insane amounts of time embedding code (CSS, JavaScript) in their eBooks to make up for these missing functions. I say insane because these attempts usually do not work well on all RS and, more importantly, they are the digital publishing equivalent of reinventing the wheel. Any programmer knows that she should avoid this as much as possible.

Besides support/standardization issues, I feel that eBooks are not fully recognized as cognitive objects different from the corresponding pBooks. For example, in principle, eBooks allow the reader to actively query the data contained in the eBook, unlike pBooks, where the reader is forced into a passive role in the book-to-brain transfer process, unable to dynamically alter the content being displayed. Have you heard about any RS supporting XQuery lately? Of course you have not.
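To give a rough idea of what "querying the eBook" could mean, here is a minimal sketch in Python. It is not XQuery and it is not taken from any existing RS: it simply walks one XHTML page with the standard library and collects the elements carrying a given epub:type value. The sample chapter and the function name find_by_epub_type are made up for this illustration.

```python
import xml.etree.ElementTree as ET

# Namespace of the epub:type attribute in EPUB3 content documents.
EPUB_NS = "http://www.idpf.org/2007/ops"
EPUB_TYPE = f"{{{EPUB_NS}}}type"

# Hypothetical chapter content, used only for this illustration.
SAMPLE_CHAPTER = """<html xmlns="http://www.w3.org/1999/xhtml"
      xmlns:epub="http://www.idpf.org/2007/ops">
  <body>
    <p>Some sentence.<a epub:type="noteref" href="#n1">1</a></p>
    <aside epub:type="footnote" id="n1">A first note.</aside>
    <aside epub:type="footnote" id="n2">A second note.</aside>
  </body>
</html>"""


def find_by_epub_type(xhtml_text, wanted):
    """Return (id, text) pairs for elements whose epub:type includes `wanted`."""
    root = ET.fromstring(xhtml_text)
    results = []
    for element in root.iter():
        # epub:type may hold several space-separated values.
        types = element.get(EPUB_TYPE, "").split()
        if wanted in types:
            # Collapse whitespace so the text reads as one line.
            text = " ".join("".join(element.itertext()).split())
            results.append((element.get("id"), text))
    return results


footnotes = find_by_epub_type(SAMPLE_CHAPTER, "footnote")
```

A RS with this kind of function built in could answer "show me all the footnotes of this chapter" without the eBook producer writing a single line of JS. A real query language would of course be far more expressive than this one-attribute filter.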

To sum up, I think that making eBooks should be as simple as writing in any semantic-aware language, with the assurance that the eBook will be rendered uniformly by different RS. Special functions should be provided by the RS, not coded from scratch by the eBook producer every time! If someone wants to include her exotic JS stuff, that is fine with me, but, at least for common functions, there should be no need to code your own JS library and embed it into every single eBook. Finally, a good eBook should contain high-quality metadata and marked-up content, two things that are necessary to make complex usages possible.

Some Experiments with Android

So far, I have just spewed generic ideas (to borrow bowerbird’s words), so let me go a little deeper technically and describe some coding experiments.

Currently, I am supervising a team of three students at my (former) Department of Information Engineering of the University of Padova, Padova, Italy. They are coding a demo Android app for reading EPUB2/EPUB3 eBooks, with a focus on complex books.

The goal of this project is to show that, by putting some intelligence in the RS, the reading experience can be greatly enhanced. We focus on managing the following features:

Internal links
External links
Interaction with the dictionary
Parallel text

A first focus of the project is letting the user efficiently explore linked resources, by splitting the viewport to give her simultaneous access to both the linking and the linked elements. For example, if you are reading an eBook full-screen and you click on a link, the app splits the viewport into two panels and renders both the linking passage and the linked resource, which might be another part of the same eBook or an external Web page. The same interaction applies to consulting a dictionary or a glossary.
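Before the second panel can be filled, the RS has to work out where a link actually points. The following Python sketch shows one possible way to do that. It is not the code of our Android app: the function name classify_link, the three category labels and the sample paths are assumptions made for this illustration, and percent-encoded paths are not handled.

```python
import posixpath
from urllib.parse import urlsplit


def classify_link(current_doc, href):
    """Classify a link found in `current_doc` (a path inside the EPUB container).

    Returns a (kind, target, fragment) tuple, where kind is one of
    "external", "same-document" or "other-document".
    """
    parts = urlsplit(href)
    if parts.scheme or parts.netloc:
        # Anything with a scheme (http, https, mailto, ...) leaves the eBook.
        return ("external", href, parts.fragment)
    if not parts.path:
        # A bare "#id" points inside the document being read.
        return ("same-document", current_doc, parts.fragment)
    # Resolve the relative path against the directory of the current document.
    base_dir = posixpath.dirname(current_doc)
    target = posixpath.normpath(posixpath.join(base_dir, parts.path))
    kind = "same-document" if target == current_doc else "other-document"
    return (kind, target, parts.fragment)


# Example: a footnote link inside the chapter currently on screen.
footnote_link = classify_link("OEBPS/ch01.xhtml", "#note3")
```

With the link classified, the RS can decide what to load in the second panel: the same page scrolled to the fragment, another page of the eBook, or a Web view for an external resource.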

A second focus of the project is allowing a natural reading of parallel texts (for example, a book in its original language (say, EN) along with its translation in another language (say, IT)). It should not be that difficult to achieve with eBooks, right? Wrong. So far, the only ways of doing it are:

display text in EN, then display text in IT and add links to go back and forth between corresponding paragraphs (very tedious, not natural)
interleave text in EN and in IT or use a table structure (bad from a visual point of view)
use JS to make a popup with the translation appear (works only in few EPUB3 RS)
use an FXL (fixed-layout) eBook (works only in few EPUB3 RS)

Plus, the last two solutions require a lot of coding just to make the rendition work.

Our approach to the problem is different. First of all, the eBook should contain only the textual materials, marked up like any other EPUB book (so it is still compatible with other RS). Then, with the same split-the-viewport technique, we will offer the user the choice of reading only the EN text, only the IT text, or both simultaneously, in two half-screen panels. To keep things simple, we assume that the XHTML pages inside the EPUB container follow a naming convention that lets our app recognize the correspondence between the original text and the translation (say, p001.en.xhtml and p001.it.xhtml), while also allowing for shared, non-translated material (cover, introduction, etc.). The same mechanism might be achieved by other means (e.g., multi-OPF, etc.), but this is not the focus of the project.
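The naming convention can be sketched in a few lines of Python. This is only an illustration of the idea, not the code of our app: the regular expression (a two-letter lowercase language code before the .xhtml extension) and the function names group_parallel_pages and pages_for_mode are assumptions.

```python
import re
from collections import defaultdict

# Matches names such as "p001.en.xhtml": base name, language code, extension.
# The exact pattern is an assumption made for this illustration.
LOCALIZED_NAME = re.compile(r"^(?P<base>.+)\.(?P<lang>[a-z]{2})\.xhtml$")


def group_parallel_pages(filenames):
    """Split file names into translated groups and shared pages.

    Returns (parallel, shared): `parallel` maps a base name to a
    {language: filename} dict, `shared` lists files without a language tag.
    """
    parallel = defaultdict(dict)
    shared = []
    for name in filenames:
        match = LOCALIZED_NAME.match(name)
        if match:
            parallel[match["base"]][match["lang"]] = name
        else:
            shared.append(name)
    return dict(parallel), shared


def pages_for_mode(parallel, base, languages):
    """Return the files to render for one page, one per visible panel."""
    return [parallel[base][lang] for lang in languages if lang in parallel[base]]


files = ["cover.xhtml", "p001.en.xhtml", "p001.it.xhtml",
         "p002.en.xhtml", "p002.it.xhtml"]
parallel, shared = group_parallel_pages(files)
both_panels = pages_for_mode(parallel, "p001", ["en", "it"])
```

Reading only the EN text, only the IT text, or both then becomes a matter of which language list is passed to pages_for_mode, while shared pages such as the cover are rendered in a single panel.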

(As bowerbird noted, a similar mechanism might be applied to TOC, search results, metadata, supplementary materials, etc.)

We will release this demo app by summer 2013, under a free software license. If the project receives enough interest (and, possibly, funding) we will try to develop it further, including regular functions (typography management, bookshelf, etc.) and some other, more interesting features, including some of those listed below.

Extra Ideas

In what follows, I list, in no particular order, some ideas for functions that current RS lack, partially or entirely, and that I would like to see implemented.

Export annotations, bookmarks, highlights to an exchange format (say, XML or CSV), and their remote backup/synchronization
Bundle annotations, bookmarks, highlights to the eBook container, so that they can be stored together
If an Audio-eBook contains an M3U playlist, let the user choose between an MP3-like player and normal (text+audio) rendition
Media Overlay in reflowable eBooks, providing good playback controls (volume, speed, delay, highlight style) and tap-to-play mechanism
Media Overlay with multi text fragment association
Ability to swap text/audio/video language association in Media Overlays
Configurable multi-layout multi-panel views with auto-tiling
If in single-viewport mode, after coming back from an internal link, highlight the original anchor point (like the red dot in Marvin)
Support book status (e.g., via local storage), which is greatly needed by interactive fiction and game-eBooks
Automatic citation recognition (e.g., via DOI)
Automatic lexicon generation from user dictionary usage
Automatic generation of factual context by relating eBook content to (Web?) resources
Accessing online databases to get the right pronunciation of foreign words and names (with automatic entity resolution)
Resolution of external entities
Self-updating function: an eBook can pull updates from the publisher/store
Typo-reporting function
Custom, user-defined CSS, with clear control over its place in the cascade
Support for custom, user-generated dictionaries (e.g., StarDict)
Anonymous reading statistics
RegEx search, support for XQuery-like interrogation

Clearly, this list is not exhaustive: feel free to email me your own suggestions.
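As an example of how small some of these functions are, here is a sketch of the first item of the list, exporting annotations to an exchange format (CSV in this case). It is written in Python for brevity and does not describe any existing RS: the Annotation record and all its field names are hypothetical.

```python
import csv
import io
from dataclasses import astuple, dataclass, fields


@dataclass
class Annotation:
    # All field names are hypothetical, chosen for this illustration.
    document: str  # path of the XHTML file inside the eBook
    anchor: str    # id of the element the annotation is attached to
    kind: str      # "note", "bookmark" or "highlight"
    text: str      # the user's note or the highlighted passage


def export_annotations_csv(annotations):
    """Serialize annotations to CSV text, with a header row."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow([field.name for field in fields(Annotation)])
    for annotation in annotations:
        writer.writerow(astuple(annotation))
    return buffer.getvalue()


def import_annotations_csv(csv_text):
    """Rebuild Annotation objects from text produced by the exporter."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [Annotation(**row) for row in reader]


notes = [
    Annotation("p001.en.xhtml", "para12", "highlight", 'He said "ciao", then left'),
    Annotation("p002.en.xhtml", "para03", "note", "Check this date"),
]
exported = export_annotations_csv(notes)
```

Since the exported text can be read back into the same records, the file can be backed up, synchronized, or opened in any spreadsheet, outside the RS/ecosystem where the annotations were created.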

Interesting Links

Cinque cose che vorrei – da editore – per progettare ebook: Fabrizio Venerandi’s blog post (in Italian) on five missing features of eBooks
Department of Information Engineering, University of Padova, Padova, Italy
Smuuks: my own company, mainly producing EPUB3 Audio-eBooks
il Narratore audiolibri: audiobooks and EPUB3 Audio-eBooks
Readium: EPUB3 reader by the IDPF
Marvin: EPUB2 RS with smart functions
Blio: promising EPUB2/EPUB3 RS (now stalled?)
Digital Education Content 0.101: post by Richard Pipe on eBooks and education
e0: rethinking the (EPUB) eBook format

A Few Comments on bowerbird’s Notes

A few clarifications:

I agree that authoring tools are problematic as well (if not more so). I just wanted to primarily discuss RS, because I think that if people (i.e., the straw man) see what eBooks might look like, compared with the poor stuff sold (!) now, that might give momentum to a better digital publishing ecosystem.
“Please Give Us” => it was a bad choice of title for that section. I agree that people with coding skills (especially from open communities) should start working on coherent projects to “make them (=RS, authoring tools, “good” eBooks) happen”. I amended my Web page.
I strongly agree with the statement that eBooks should be “for people”: they should be remixable, also with user-generated content, and kids too should be able to make them. I will push the “remixability” thing even further: I strongly believe that current IP policies (most notably, DRM) are doomed to a giant defeat in the long run (fortunately). I hope the transition will be quick enough.
We picked EPUB (open, even if overcomplicated) for our project to be able to show some results to the common person, who might have a mild interest in eBooks and perhaps has some EPUBs that they found too difficult to peruse.

Now, a question: is the ZML format (like this) the answer to all the questions raised so far? Or do you have “a bigger picture”, hinged around it? I fear that overfocusing on formats is problematic too.