The eTextReader Project
The eTextReader project is made up of several components:
Each of these components is described in this document.
Almost all of the eTextReader software is written in Java. The primary exception is the code that allows users to create and edit ink annotations, which is a C# .NET application. The software is broken down into several packages, which are unfortunately not as well defined as they could be:
Overviews of the individual packages should be contained in the package overview file in the API documentation.
As previously mentioned, an original design goal was to allow any sort of backing store to be used to hold the annotation database. Over time, this design has not been carefully followed, and much of the interface specification has taken place inside the DBClient class. In particular, calls to obtain a reference to the annotation database hard-code the creation of a DBClient instance, rather than using a factory implementation to obtain access to the data store. All further discussion of the implementation therefore focuses on this particular implementation of the data store.
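For illustration, the factory approach that the original design called for could look roughly like the sketch below. The names `AnnotationStore`, `InMemoryStore`, and `AnnotationStoreFactory` are hypothetical and are not part of the eTextReader code base; only `DBClient` is named in this document, and its real interface is not reproduced here.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Supplier;

// Hypothetical abstraction over the annotation backing store.
interface AnnotationStore {
    void save(String url, String body);
    List<String> findByUrl(String url);
}

// Hypothetical in-memory store, standing in for a real implementation such as DBClient.
class InMemoryStore implements AnnotationStore {
    private final List<String[]> rows = new ArrayList<>();

    @Override
    public void save(String url, String body) {
        rows.add(new String[] {url, body});
    }

    @Override
    public List<String> findByUrl(String url) {
        List<String> bodies = new ArrayList<>();
        for (String[] row : rows) {
            if (row[0].equals(url)) {
                bodies.add(row[1]);
            }
        }
        return bodies;
    }
}

// Callers ask the factory for a store instead of constructing one directly.
final class AnnotationStoreFactory {
    private static Supplier<AnnotationStore> provider = InMemoryStore::new;

    private AnnotationStoreFactory() {}

    // Swap the backing store in one place, for example at application start-up.
    static synchronized void setProvider(Supplier<AnnotationStore> newProvider) {
        provider = newProvider;
    }

    static synchronized AnnotationStore create() {
        return provider.get();
    }
}
```

With this arrangement, client code would call `AnnotationStoreFactory.create()` wherever it currently constructs a `DBClient`, so a different backing store could be substituted without touching the callers.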
The data store is implemented using a Microsoft SQL Server database. In order to facilitate use of the application in both on- and off-line environments, a version of the database server is installed locally when the application is installed, and this local version is periodically synchronized with the central database. Previously, the "normal" method of operation was assumed to be on-line, with off-line status considered to be exceptional. For performance reasons, this assumption is likely to be reversed, so that the local data store is always the current source of annotation information, with periodic updates still made to the central data store.
The most important table in the database is the notations table. The current schema of this table is as follows:
Column Name | Data type | Description |
---|---|---|
id | int | A unique identifier for the annotation. See note on generation of these ids below |
url | varchar(256) | The URL of the content this annotation refers to |
addressStart | varchar(256) | The starting address of the annotation's text anchor. See note on addresses below |
addressEnd | varchar(256) | The ending address of the annotation's text anchor. |
type | varchar(32) | The type of the annotation, such as text note, bookmark, etc. |
author | varchar(64) | The creator of the annotation. This should really contain the user ID rather than the name |
target | varchar(32) | A singleton field describing the target of the annotation, whose purpose has been subsumed by the NotationModes table |
viewableBy | varchar(64) | A singleton field describing who can view the annotation, whose purpose has been subsumed by the NotationModes table |
subject | varchar(64) | A user-generated subject for the annotation, generally used as a human-readable identifier. Some exceptions to this exist. |
discussionID | int | If this annotation is part of a discussion, a reference to the discussion ID which correlates postings to a discussion together. If the annotation is not part of a discussion, this field is null. |
regarding | int | Another annotation in this table that this annotation refers to. Used only by the discussion mechanism currently. |
created | datetime | The date and time on which this annotation was created. Generally system supplied. |
modified | datetime | The date and time on which this annotation was last modified. Generally system supplied. |
body | text | The content of the annotation; usually user entered text, although exceptions exist |
rowguid | uniqueidentifier | A unique system generated identifier used for synchronization purposes |
Diagram | image | If the annotation is a diagram (or an ink annotation), this field contains the data associated with the diagram. |
viewmode | varchar(32) | Specifies how the annotation should be displayed. |
isReference | varchar(8) | If set to true, then the body of the annotation actually contains a URL; the content stored at this URL is then the content of the annotation. |
More details on the tables in the database are available here.
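As a rough guide to how the notations schema above might map onto Java types, consider the following sketch. The class `Notation` and its helper methods are hypothetical, the deprecated `target` and `viewableBy` columns are omitted, and the assumption that `isReference` holds the literal string "true" is taken from the column description rather than from the actual code.

```java
import java.util.Date;
import java.util.UUID;

// Hypothetical value class mirroring most columns of the notations table.
class Notation {
    int id;
    String url;
    String addressStart;
    String addressEnd;
    String type;
    String author;
    String subject;
    Integer discussionID;  // null when the annotation is not part of a discussion
    Integer regarding;     // id of another annotation, or null
    Date created;
    Date modified;
    String body;
    UUID rowguid;          // uniqueidentifier used for synchronization
    byte[] diagram;        // image column; diagram or ink data, or null
    String viewmode;
    String isReference;    // varchar(8); assumed here to hold "true" or "false"

    // An annotation belongs to a discussion only if discussionID is set.
    boolean isPartOfDiscussion() {
        return discussionID != null;
    }

    // When true, the body holds a URL whose content is the real annotation content.
    boolean bodyIsUrl() {
        return isReference != null && "true".equalsIgnoreCase(isReference.trim());
    }
}
```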
A scheme is needed to identify the portions of text referenced by annotations. The scheme currently employed by the eTextReader system is similar to that used by XPath. Document content is modeled as a tree of nodes, and a portion of the content is selected by describing the path through this tree needed to reach the content in question.
Since the content being displayed by the eTextReader is often outside of the user's control, it is possible for an address to be correct at the time of annotation creation, but to have subsequent changes to the content break this address. The eTextReader attempts to minimize this probability by starting all paths at an element containing an XML ID attribute. If this is done, then changes outside the range of content specified by the ID should not break any addresses that start at the given ID value. If an ID attribute cannot be found along the path from the selection to the root node of the tree, a path is generated starting at the top of the tree.
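The ID-anchored path generation described above can be sketched with the standard Java DOM API. This is only an illustration under stated assumptions: the class `AddressSketch`, the attribute name `id`, and the `id(...)/name[n]` output syntax are all hypothetical, since the actual eTextReader address format is not shown in this document.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;

final class AddressSketch {
    private AddressSketch() {}

    // Parse an XML string into a DOM tree; failures are rethrown unchecked for brevity.
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        } catch (Exception ex) {
            throw new IllegalArgumentException("could not parse XML", ex);
        }
    }

    // 1-based position of an element among its same-named element siblings.
    static int position(Element e) {
        int pos = 1;
        for (Node s = e.getPreviousSibling(); s != null; s = s.getPreviousSibling()) {
            if (s.getNodeType() == Node.ELEMENT_NODE
                    && s.getNodeName().equals(e.getNodeName())) {
                pos++;
            }
        }
        return pos;
    }

    // Walk from the target towards the root. Stop at the first element that
    // carries an id attribute; if none is found, the path starts at the root.
    static String addressOf(Element target) {
        StringBuilder path = new StringBuilder();
        Node n = target;
        while (n != null && n.getNodeType() == Node.ELEMENT_NODE) {
            Element e = (Element) n;
            if (e.hasAttribute("id")) {
                return "id(" + e.getAttribute("id") + ")" + path;
            }
            path.insert(0, "/" + e.getNodeName() + "[" + position(e) + "]");
            n = e.getParentNode();
        }
        return path.toString();
    }
}
```

For the second `p` element in `<book><chapter id="c2"><p>one</p><p>two</p></chapter></book>`, this sketch yields `id(c2)/p[2]`, so edits outside that chapter leave the address intact. Without the `id` attribute, the path for the first `p` would instead begin at the root, as `/book[1]/chapter[1]/p[1]`.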
Currently, no effort is made to identify links that may have been broken. One possible way of doing this is to store a checksum of the selected content and to verify that this checksum is still correct when the annotation is loaded. Note that this only reveals when a link has been broken; it will likely not be of much use in locating the link's correct new position.
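A minimal sketch of such a check follows. It assumes CRC-32 as the checksum (a cryptographic hash would work equally well) and assumes that the value would be stored alongside the annotation, which the current schema has no column for. The class `AnchorChecksum` is hypothetical.

```java
import java.nio.charset.StandardCharsets;
import java.util.zip.CRC32;

final class AnchorChecksum {
    private AnchorChecksum() {}

    // CRC-32 of the selected text, encoded as UTF-8.
    static long of(String selectedText) {
        CRC32 crc = new CRC32();
        crc.update(selectedText.getBytes(StandardCharsets.UTF_8));
        return crc.getValue();
    }

    // True if the text now found at the stored address still matches
    // the checksum recorded when the annotation was created.
    static boolean stillValid(String currentText, long storedChecksum) {
        return of(currentText) == storedChecksum;
    }
}
```

At creation time the application would record `AnchorChecksum.of(selection)`; at load time it would resolve the address, extract the text found there, and call `stillValid` to decide whether to flag the annotation as possibly broken.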