The eTextReader Project

Overview of system architecture

The eTextReader project is made up of several components:

Each of these components is described in this document.

Textbook reading/browsing software

Almost all of the eTextReader software is written in Java. The primary exception is the code that allows users to create and edit ink annotations, which is a C# .NET application. The software is divided into several packages, whose boundaries are unfortunately not as well defined as they could be:

Overviews of the individual packages should be contained in the package overview file in the API documentation.

Database of users and annotations

As previously mentioned, an original design goal was to allow any sort of backing store to be used to hold the annotation database. Over time, this design has not been carefully followed, and much of the interface has effectively been specified inside the DBClient class. In particular, calls to obtain a reference to the annotation database hard-code the creation of a DBClient instance, rather than using a factory to obtain access to the data store. All further discussion will therefore focus on this particular implementation of the data store.
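One way to return to the original design goal would be to hide the concrete store behind an interface and hand out instances through a factory. The following is only a minimal sketch under stated assumptions, not the project's actual API: `AnnotationStore`, `InMemoryStore`, and `AnnotationStoreFactory` are hypothetical names, and the in-memory class merely stands in for an implementation that would delegate to DBClient.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Supplier;

// Hypothetical interface: the operations any backing store must offer.
interface AnnotationStore {
    void save(String url, String body);
    List<String> findByUrl(String url);
}

// Stand-in implementation used only for this sketch; a real system would
// register an implementation that delegates to DBClient instead.
class InMemoryStore implements AnnotationStore {
    private final Map<String, List<String>> byUrl = new HashMap<>();

    @Override
    public void save(String url, String body) {
        byUrl.computeIfAbsent(url, k -> new ArrayList<>()).add(body);
    }

    @Override
    public List<String> findByUrl(String url) {
        List<String> found = byUrl.get(url);
        // Return a copy so callers cannot modify the store's internal list.
        return found == null ? new ArrayList<>() : new ArrayList<>(found);
    }
}

// Callers ask the factory for a store rather than constructing one directly.
final class AnnotationStoreFactory {
    private static Supplier<AnnotationStore> provider = InMemoryStore::new;

    private AnnotationStoreFactory() {}

    // Swap the backing store in one place, for example at application start-up.
    static synchronized void setProvider(Supplier<AnnotationStore> newProvider) {
        provider = newProvider;
    }

    static synchronized AnnotationStore create() {
        return provider.get();
    }
}
```

With this arrangement, client code would call `AnnotationStoreFactory.create()` and depend only on the interface, so a different backing store could be substituted without touching the callers.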

The data store is implemented using a Microsoft SQL Server database. In order to facilitate use of the application in both on- and off-line environments, a version of the database server is installed locally when the application is installed, and this local version is periodically synchronized with the central database. Previously, the "normal" method of operation was assumed to be on-line, with off-line status considered to be exceptional. For performance reasons, this assumption is likely to be reversed, so that the local data store is always the current source of annotation information, with periodic updates still made to the central data store.
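The reversed assumption, in which the local store is always authoritative and the central store is updated periodically, can be sketched roughly as follows. This is an illustration only: `LocalFirstStore` and its methods are hypothetical, plain lists stand in for the local and central SQL Server databases, and only the push direction of synchronization is shown.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Hypothetical sketch: in-memory lists stand in for the two databases.
class LocalFirstStore {
    private final List<String> local = new ArrayList<>();
    private final List<String> central = new ArrayList<>();
    private final Deque<String> pending = new ArrayDeque<>();

    // Writes always go to the local store, whether or not the network is up,
    // and are queued for a later push to the central store.
    void save(String annotation) {
        local.add(annotation);
        pending.addLast(annotation);
    }

    // Reads are always served from the local store.
    List<String> readAll() {
        return new ArrayList<>(local);
    }

    // Intended to be called periodically. Pushes queued changes when the
    // central store is reachable and returns how many were sent.
    int synchronize(boolean centralReachable) {
        if (!centralReachable) {
            return 0; // keep the queue for the next attempt
        }
        int sent = 0;
        while (!pending.isEmpty()) {
            central.add(pending.removeFirst());
            sent++;
        }
        return sent;
    }

    // Number of annotations that have reached the central store.
    int centralSize() {
        return central.size();
    }
}
```

A complete design would also have to pull other users' changes from the central store and resolve conflicts, which this sketch leaves out.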

The most important table in the database is the notations table. The current schema of this table is as follows:

Columns in the notations table
Column Name   Data type         Description
id            int               A unique identifier for the annotation. See the note on generation of these IDs below.
url           varchar(256)      The URL of the content this annotation refers to.
addressStart  varchar(256)      The starting address of the annotation's text anchor. See the note on addresses below.
addressEnd    varchar(256)      The ending address of the annotation's text anchor.
type          varchar(32)       The type of the annotation, such as text note, bookmark, etc.
author        varchar(64)       The creator of the annotation. This should really contain the user ID rather than the name.
target        varchar(32)       A singleton field describing the target of the annotation, whose purpose has been subsumed by the NotationModes table.
viewableBy    varchar(64)       A singleton field describing who can view the annotation, whose purpose has been subsumed by the NotationModes table.
subject       varchar(64)       A user-generated subject for the annotation, generally used as a human-readable identifier. Some exceptions to this exist.
discussionID  int               If this annotation is part of a discussion, the discussion ID that ties the postings of that discussion together. If the annotation is not part of a discussion, this field is null.
regarding     int               Another annotation in this table that this annotation refers to. Currently used only by the discussion mechanism.
created       datetime          The date and time at which this annotation was created. Generally system supplied.
modified      datetime          The date and time at which this annotation was last modified. Generally system supplied.
body          text              The content of the annotation; usually user-entered text, although exceptions exist.
rowguid       uniqueidentifier  A unique system-generated identifier used for synchronization purposes.
Diagram       image             If the annotation is a diagram (or an ink annotation), this field contains the data associated with the diagram.
viewmode      varchar(32)       Specifies how the annotation should be displayed.
isReference   varchar(8)        If set to true, the body of the annotation actually contains a URL; the content stored at this URL is then the content of the annotation.
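
On the Java side, a row of this table could be represented by a simple value class. The sketch below is hypothetical and not taken from the eTextReader source: the class name `Notation`, the choice of Java types for each SQL type, the assumption that `regarding` may be null, and the case-insensitive reading of `isReference` are all assumptions made for illustration.

```java
import java.time.LocalDateTime;
import java.util.UUID;

// Hypothetical value class mirroring the columns of the notations table.
class Notation {
    int id;
    String url;
    String addressStart;
    String addressEnd;
    String type;
    String author;
    String target;         // purpose subsumed by the NotationModes table
    String viewableBy;     // purpose subsumed by the NotationModes table
    String subject;
    Integer discussionID;  // null when not part of a discussion
    Integer regarding;     // assumed nullable: id of the annotation referred to
    LocalDateTime created;
    LocalDateTime modified;
    String body;
    UUID rowguid;          // uniqueidentifier used for synchronization
    byte[] diagram;        // image column holding diagram or ink data
    String viewmode;
    String isReference;    // stored as varchar(8), e.g. "true"

    boolean isPartOfDiscussion() {
        return discussionID != null;
    }

    // When isReference is "true", body holds a URL rather than the content.
    // Ignoring case here is an assumption of this sketch.
    boolean bodyIsUrl() {
        return "true".equalsIgnoreCase(isReference);
    }
}
```

Using `Integer` rather than `int` for `discussionID` keeps the null case from the table distinguishable from a real discussion ID.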

More details on the tables in the database are available here.

Generation of ID values

Addressing Scheme

A scheme is needed to identify the portions of text referenced by annotations. The current scheme employed by the eTextReader system is similar to that used by XPath: document content is modeled as a tree of nodes, and a portion of the content is selected by giving the path through this tree needed to reach it.

Since the content being displayed by the eTextReader is often outside of the user's control, an address that is correct when the annotation is created can be broken by subsequent changes to the content. The eTextReader attempts to reduce this risk by starting all paths at an element containing an XML ID attribute. If this is done, then changes outside the range of content specified by the ID should not break any addresses that start at the given ID value. If an ID attribute cannot be found along the path from the selection to the root node of the tree, a path is generated starting at the top of the tree.
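
The path-building rule described here can be illustrated with the standard `org.w3c.dom` API. This is a minimal sketch rather than the eTextReader implementation: the class `AddressBuilder`, the `id(...)/n/n` address syntax, and the use of an attribute literally named `id` as the ID attribute are all assumptions made for the example.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;

class AddressBuilder {
    // Walks upward from the target, recording each node's 1-based position
    // among its siblings. Stops at the first element (the target included)
    // that carries an "id" attribute; otherwise continues to the root.
    static String addressOf(Node target) {
        StringBuilder steps = new StringBuilder();
        Node node = target;
        while (node != null && node.getNodeType() != Node.DOCUMENT_NODE) {
            if (node instanceof Element && ((Element) node).hasAttribute("id")) {
                return "id(" + ((Element) node).getAttribute("id") + ")" + steps;
            }
            steps.insert(0, "/" + positionAmongSiblings(node));
            node = node.getParentNode();
        }
        return steps.toString(); // no ID found: path from the top of the tree
    }

    // 1-based index of the node among all of its parent's child nodes.
    private static int positionAmongSiblings(Node node) {
        int position = 1;
        for (Node s = node.getPreviousSibling(); s != null; s = s.getPreviousSibling()) {
            position++;
        }
        return position;
    }

    // Helper for the example: parses an XML string into a DOM document.
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        } catch (Exception e) {
            throw new IllegalStateException("could not parse sample XML", e);
        }
    }
}
```

For the document `<book><chapter id="c2"><p>one</p><p>two</p></chapter></book>`, the second `p` element yields `id(c2)/2`, so edits outside that chapter leave the address intact. Without the `id` attribute, the same element yields `/1/1/2`, which breaks as soon as anything is inserted ahead of it at any level.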

Currently, no effort is made to identify links that may have been broken. One possible way of doing this is to store a checksum of the selected content and verify that this checksum is still correct when the annotation is loaded. Note that this only reveals that a link has been broken; it is unlikely to be of much use in finding the correct new location of the link.
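
The checksum idea could look roughly like the following. This is a sketch of a possible approach, not existing eTextReader code: the class `AnchorChecksum` is hypothetical, CRC-32 is just one convenient choice of checksum, and storing the value would require a column that the current notations table does not have.

```java
import java.nio.charset.StandardCharsets;
import java.util.zip.CRC32;

class AnchorChecksum {
    // CRC-32 of the selected text, computed when the annotation is created
    // and stored alongside it.
    static long of(String selectedText) {
        CRC32 crc = new CRC32();
        crc.update(selectedText.getBytes(StandardCharsets.UTF_8));
        return crc.getValue();
    }

    // True when the text now found at the stored address no longer matches
    // the text that was originally selected.
    static boolean isBroken(long storedChecksum, String currentText) {
        return of(currentText) != storedChecksum;
    }
}
```

On load, the reader would resolve the address, extract the text it points to, and call `isBroken` with the stored value. A mismatch flags the annotation for attention but, as noted above, says nothing about where the original text has moved.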