DOCENG '17: ACM SYMPOSIUM ON DOCUMENT ENGINEERING 2017
PROGRAM FOR TUESDAY, SEPTEMBER 5TH

09:45-10:45 Session 3

Keynote talk: Theresa Zammit Lupi

10:45-11:15 Coffee Break
11:15-12:45 Session 4

Generation, Manipulation and Presentation

11:15
The RASH Javascript Editor (RAJE): a wordprocessor for writing Web-first scholarly articles
SPEAKER: unknown

ABSTRACT. The format most used for submitting and publishing papers in the academic domain is the Portable Document Format (PDF), since it is rendered in the same way regardless of the device used to visualise it. However, the PDF format has some important issues as well: it lacks interactivity, it is not machine-readable, it is seen as a monolithic artefact on the Web, and it does not offer acceptable accessibility to people with disabilities. In order to address these issues, some journals, conferences, and workshops have recently started to also accept HTML as a Web-first submission/publication format. However, most people are not able to produce a well-formed HTML5 article from scratch, and would thus need an appropriate interface, e.g. a wordprocessor, for creating such HTML-compliant scholarly articles.

So as to provide a solution to the aforementioned issue, in this paper we introduce the RASH Javascript Editor (a.k.a. RAJE), a multiplatform wordprocessor for writing scholarly articles natively in HTML. It generates documents compliant with the RASH format, a subset of HTML5 developed for sharing scholarly articles with appropriate Web technologies. In particular, RAJE allows authors to write research papers by means of a user-friendly interface that hides the complexities of HTML5. We also discuss the outcomes of a user study in which we asked several people from academia to write a scientific paper using RAJE.

11:45
The Fábulas Model for Authoring Web-based Children's eBooks
SPEAKER: unknown

ABSTRACT. Nowadays, tablets and smartphones are commonly used by children for both entertainment and education purposes. Interactive multimedia eBooks running on mobile devices allow a richer experience than traditional text-only books, being potentially more engaging and entertaining to readers. However, to exploit the most exciting features in these environments, authors are left alone, in the sense that there is no high-level (less technical) support: these features are usually only accessible through programming or some other technical skill. In this work, we aim to extract the main features of enhanced eBooks for children and introduce a model named Fábulas --- the Portuguese word for fables --- that allows authors to create interactive multimedia children's eBooks declaratively. The model was conceived by taking, as a starting point, a systematic analysis of the common concepts, focusing on identifying and categorizing recurring characteristics and pointing out functional and non-functional requirements that establish a strong orientation towards the set of desirable abstractions of an underlying model. Moreover, the paper presents a case study of the implementation of Fábulas on the Web, and discusses the authoring of a complete interactive story over it.

12:15
Effective Floating Strategies

ABSTRACT. This paper presents an extension to the general framework for globally optimized pagination described in Mittelbach (2016). The extended algorithm supports automatic placement of floats as part of the optimization. It uses a flexible constraint model that allows for the implementation of typical typographic rules that can be weighted against each other to support different application scenarios.

By "flexible" we mean that the rules of typographic presentation of the content of a document element are not fixed---but neither are they completely arbitrary; also, some of these rules are absolute whereas others are in the form of preferences.

It is easy to see that without restrictions the float placement possibilities grow exponentially if the number of floats has a linear relation to the document size. It is therefore important to restrict the objective function used for optimization in a way that the algorithm does not have to evaluate all theoretically possible placements while still being guaranteed to find an optimal solution.

Different objective functions are evaluated against typical typographic requirements in order to arrive at a system that is both rich in its expressiveness, modeling a large class of pagination applications, and capable of solving the optimization problem in an acceptable time frame for realistic input data.

Frank Mittelbach. 2016. A General Framework for Globally Optimized Pagination. In Proceedings of the 2016 ACM Symposium on Document Engineering (DocEng '16). ACM, New York, NY, USA, pages 11--20.
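The constraint model described above, in which absolute rules must always hold while preferences are weighted against each other, can be illustrated with a toy sketch. This is not the paper's algorithm; the rule names and weights below are hypothetical, and the real system optimizes all floats and page breaks globally rather than one float at a time.

```python
# Illustrative sketch of a weighted float-placement cost model.
# Absolute rules become hard constraints (infinite cost); preferences
# become weighted penalties traded off by the objective function.
WEIGHTS = {"not_on_reference_page": 10, "page_distance": 3}  # hypothetical

def placement_cost(ref_page, float_page):
    """Cost of placing a float on float_page for a reference on ref_page."""
    if float_page < ref_page:      # absolute rule: never before its reference
        return float("inf")
    cost = 0
    if float_page > ref_page:      # preference: same page as the reference
        cost += WEIGHTS["not_on_reference_page"]
    cost += WEIGHTS["page_distance"] * (float_page - ref_page)
    return cost

def best_placement(ref_page, max_pages):
    """Pick the cheapest feasible page for a single float."""
    return min(range(1, max_pages + 1),
               key=lambda p: placement_cost(ref_page, p))
```

Restricting the search, as the abstract notes, is what keeps such a model tractable: without constraints the number of joint placements grows exponentially in the number of floats.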

12:45-13:00 Session 5

BoF: How it Works

13:00-15:00 Lunch Break

Lunch/BoF

15:00-16:30 Session 6

Collections, Systems and Management

15:00
The Mitchell Library WordCloud: Beyond Boolean Search
SPEAKER: unknown

ABSTRACT. Libraries are increasingly offering on-line digital access to their collections. However, traditional search-based interfaces are restrictive and do not encourage the user to explore the collection in the way that a physical collection does. We present the Mitchell WordCloud, a novel on-line interface to the David Scott Mitchell collection of the State Library of New South Wales. Based on interface design principles for exploratory search, it presents the user with a word cloud derived from the collection and a list of titles. As the user drags words from the word cloud to tell the system what they like or dislike, the title list is reordered. The surrounding interface elements--image bar, time line and Dewey bar--provide complementary insights into the collection. The traditional vector space model for measuring text similarity was extended to take account of user dislikes. Testing the Mitchell WordCloud in user studies confirmed that the application is easy to use and encourages exploration.
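The abstract does not give the exact formulation of the extended similarity measure, but one plausible reading is to score each title by its similarity to the liked words minus its similarity to the disliked words. The sketch below assumes this subtraction scheme and a simple tf-idf document vector; both are illustrative assumptions, not the paper's definition.

```python
import math

def score(doc_vec, likes, dislikes):
    """Rank a title by cosine-style similarity to liked words minus
    similarity to disliked words (hypothetical extension of the vector
    space model). doc_vec: {term: weight}; likes/dislikes: sets of words
    the user dragged from the word cloud."""
    def sim(words):
        if not words:
            return 0.0
        num = sum(doc_vec.get(w, 0.0) for w in words)
        norm = math.sqrt(sum(v * v for v in doc_vec.values())) * math.sqrt(len(words))
        return num / norm if norm else 0.0
    return sim(likes) - sim(dislikes)
```

Dragging "whale" to the likes area would then push whaling titles up the list, while dragging it to the dislikes area would push them down, matching the reordering behaviour described above.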

15:30
Small-Term Distribution for Disk-Based Search
SPEAKER: unknown

ABSTRACT. A disk-based search system distributes a large index across multiple disks on one or more machines, where documents are typically assigned to disks at random in order to achieve load balancing. However, random distribution degrades clustering, which is required for efficient index compression. Using the GOV2 dataset, we demonstrate the effect of various ordering techniques on index compression, and then quantify the effect of various document distribution approaches on compression and load balancing.

We explore runtime performance by simulating a disk-based search system for a scaled-out 10xGOV2 index over ten disks using two standard approaches, document and term distribution, as well as a hybrid approach: small-term distribution. We find that small-term distribution has the best performance, even in the presence of list caching, and argue that this rarely discussed distribution approach can improve disk-based search performance for many real-world installations.
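One plausible reading of the hybrid "small-term distribution" is sketched below: terms with long posting lists are split across all disks (as in document distribution), while terms with short lists are assigned wholly to a single disk (as in term distribution). The threshold and round-robin assignment are illustrative assumptions, not the paper's exact method.

```python
def assign_postings(term_lists, n_disks, threshold):
    """Hybrid distribution sketch: long posting lists are striped over all
    disks; short ('small-term') lists go wholly to one disk, chosen
    round-robin here for simplicity. threshold is a hypothetical cutoff."""
    assignment = {}   # term -> list of (disk, postings-slice)
    rr = 0
    for term, postings in term_lists.items():
        if len(postings) >= threshold:
            # stripe the long list evenly over all disks
            assignment[term] = [(d, postings[d::n_disks]) for d in range(n_disks)]
        else:
            # keep the short list intact on a single disk
            assignment[term] = [(rr, postings)]
            rr = (rr + 1) % n_disks
    return assignment
```

Keeping short lists intact preserves their clustering (and hence compression), while striping only the long lists retains load balancing for the terms that dominate query cost.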

16:00
Maintaining Integrity and Non-Repudiation in Secure Offline Documents
SPEAKER: unknown

ABSTRACT. Securing sensitive digital documents (such as health records, legal reports, government documents, and financial assets) is a critical and challenging task. Unreliable Internet connections, viruses, and compromised file storage systems impose a significant risk on such documents and can compromise their integrity, especially when they are shared across domains in an offline fashion. In this paper, we present a new framework for maintaining integrity in offline documents and provide a non-repudiation security feature without relying on a central repository of certificates. This framework has been implemented as a plug-in for the Microsoft Word application. It is portable because the plug-in is attached to the document itself, and it is scalable because there is no fixed limit on the number of users who can collaborate in producing the document. Our framework provides integrity and non-repudiation guarantees for each change in the document's version history.
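The abstract does not specify the mechanism, but a hash chain over the version history is one common way to obtain the per-change integrity guarantee described: each revision's digest covers the previous digest, so tampering with any earlier version invalidates every later one. The sketch below shows only that chaining idea; the paper's plug-in would additionally need digital signatures for non-repudiation.

```python
import hashlib
import json

def version_digest(prev_digest, author, change):
    """Digest of one document revision, chained to its predecessor.
    Canonical JSON (sort_keys) makes the digest deterministic."""
    record = json.dumps(
        {"prev": prev_digest, "author": author, "change": change},
        sort_keys=True,
    ).encode()
    return hashlib.sha256(record).hexdigest()

# Build a small version history: each digest commits to all prior ones.
d0 = version_digest("", "alice", "initial draft")
d1 = version_digest(d0, "bob", "added section 2")
```

Because the chain travels with the document, a verifier can recompute it without contacting any central repository, matching the offline setting above.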

16:15
Distributing Text Mining tasks with librairy
SPEAKER: unknown

ABSTRACT. We present librairy, a novel architecture to store, process and analyze large collections of textual resources, integrating existing algorithms and tools into a common, distributed, high-performance workflow. Available text mining techniques can be incorporated into the framework as independent plug&play modules working in a collaborative manner. In the absence of a pre-defined flow, librairy leverages the aggregation of operations executed by different components in response to an emergent chain of events. Extensive use of Linked Data (LD) and Representational State Transfer (REST) principles is made to provide individually addressable resources from textual documents. We describe the architecture design and its implementation, and test its effectiveness in real-world scenarios such as collections of research papers, patents or ICT aids, with the objective of providing solutions for decision makers and experts in those domains. Major advantages of the framework and lessons learned from these experiments are reported.

16:30-17:00 Coffee Break
17:00-18:30 Session 7

Document Analysis: visual document analysis

17:00
Assessing Binarization Techniques for Document Images
SPEAKER: Rafael Lins

ABSTRACT. Image binarization is a technique widely used for documents, as monochromatic documents require far less storage space and network bandwidth for transmission than their color or even grayscale equivalents. Paper color, texture, aging, translucency, the kind and color of ink used in handwriting, the printing process, the digitization process, etc. are some of the factors that affect binarization. No algorithm is good enough to be a winner in the binarization of all kinds of documents. This paper presents a methodology to assess the performance of binarization algorithms for a wide variety of text documents, allowing a judicious quantitative choice of the best algorithms and their parameters.
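A quantitative assessment like the one described typically compares each algorithm's output against a ground-truth binarization using metrics such as the F-measure. The sketch below shows that single metric only, as an illustration of the kind of score such a methodology aggregates; the paper's actual measures and protocol are not given in the abstract.

```python
def binarization_fmeasure(result, ground_truth):
    """F-measure of a binarization result against ground truth, both as
    flat sequences of pixels with 0 = ink and 1 = background. Ink is the
    positive class, since text pixels are what binarization must preserve."""
    tp = sum(1 for r, g in zip(result, ground_truth) if r == 0 and g == 0)
    fp = sum(1 for r, g in zip(result, ground_truth) if r == 0 and g == 1)
    fn = sum(1 for r, g in zip(result, ground_truth) if r == 1 and g == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    return 2 * precision * recall / denom if denom else 0.0
```

Scoring every algorithm-parameter pair over a varied corpus with such metrics is what enables the "judicious quantitative choice" the abstract refers to.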

17:30
Baseline detection on Arabic Handwritten Documents
SPEAKER: unknown

ABSTRACT. Document processing comprises different steps depending on the nature of the documents. For text documents, and especially for handwritten documents, recognition of their contents is one of the main tasks. Handwritten Text Recognition (HTR) is the process of automatically recognizing the content of a handwritten text document. In document processing, the basic unit for the acquisition process is the page image, whilst the line image is the basic unit for the HTR process. This is a bottleneck which is holding back massive industrial document processing. Baseline detection can be used not only to segment page images into line images but also for many other document processing steps. The baseline detection problem can be formulated as a clustering problem over a set of interest points. In this work we study the use of an automatic baseline detection technique, based on interest point clustering, on Arabic handwritten documents. The experiments reveal that this technique provides promising results for this task.
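The formulation of baseline detection as clustering over interest points can be illustrated with a deliberately simple sketch: group points by vertical position and report each group's mean height as a baseline estimate. The gap-based grouping and the `y_gap` parameter are hypothetical simplifications; the paper's clustering technique is more sophisticated.

```python
def cluster_baselines(points, y_gap):
    """Toy baseline detection: sort interest points (x, y) by vertical
    position, start a new cluster whenever the vertical gap exceeds y_gap,
    and return each cluster's mean y as one baseline estimate."""
    pts = sorted(points, key=lambda p: p[1])
    clusters, current = [], [pts[0]]
    for p in pts[1:]:
        if p[1] - current[-1][1] > y_gap:
            clusters.append(current)   # gap too large: close this text line
            current = [p]
        else:
            current.append(p)
    clusters.append(current)
    return [sum(y for _, y in c) / len(c) for c in clusters]
```

Each detected baseline can then be used to cut the page image into the line images that HTR consumes, addressing the page-to-line bottleneck described above.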

17:45
High-Performance Preprocessing of Architectural Drawings for Legend Metadata Extraction via OCR
SPEAKER: unknown

ABSTRACT. This paper describes the results of an investigation into methods of preprocessing architectural plots to enable them to be processed very quickly via OCR, detecting the region containing the relevant metadata legend and obtaining it in machine-readable form for applications such as automated folding and filenaming. We show how a processing pipeline adapted to this type of content can vastly reduce processing time while maintaining acceptable accuracy. Initial results show a reduction in total processing time from 2-3 minutes to around 15 seconds for most documents encountered, with the folding orientation being correctly detected in 78% of cases and the legend region being completely detected in 60% of cases, high enough for the use-case at hand.

18:00
Automatic Page Turning for Musical Scores using Eye-Gaze Tracking
SPEAKER: unknown

ABSTRACT. Musical scores are available as digital documents through digital libraries, which are a great asset to aspiring musicians, providing immediate access to thousands of scores. The combination of digital scores and portable tablets allows musicians to download sheet music directly to tablet devices and play the music from the tablet, hence compacting large volumes of works into a single, portable device. However, since the tablet screen is typically smaller than printed sheet music, the score is displayed at a smaller scale, increasing the difficulty of reading the music. Adjusting the score to fit into the available screen space means that more frequent page turns are required. Page turning is made more complex when taking into account that music may be abbreviated through the use of repeat mark symbols and written directions. Such repeats allow for smaller printed books by avoiding printing sections of music that are the same. However, this may give rise to forward and backward page turns to sections of the music. Automated page turning would therefore be desirable. An ideal system would present the score as a flattened score, that is, displayed as it should be played rather than as it is written, in order to allow the performer to execute repeats in the music with little effort. The system should also follow the musician's pace, such that the score is displayed at the appropriate tempo, allowing for slower playing by musicians still learning their music and also for stylistic variations in tempo.

In this paper we document our investigation into digital sheet music representation, expanding the score and eliminating the need for backward navigation. Moreover, we propose the use of eye-gaze tracking to keep track of the performer's position on the score and hence automate page turning, paced according to the musician's performance. Our digital music display tool is divided into two parts. In the first part, a scanned image of the music is processed in order to represent it as a flattened score, displayed at a suitable size on the digital tablet. This requires pre-processing algorithms which segment the score into systems and then into bars, and symbol recognition to determine the instructions related to the musical flow. Domain knowledge is then used to flatten the score. In the second part, eye-gaze tracking is performed to determine the bar at which the musician is gazing and hence activate the page turn.
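The flattening step can be sketched as expanding repeat marks into playback order, so the performer only ever reads forward. The sketch below handles only simple start/end repeat pairs; real scores also have voltas and D.C./D.S. directions, and the bar representation here is a hypothetical simplification of the paper's system.

```python
def flatten_score(bars, repeats):
    """Expand repeat marks into playback ('flattened') order.
    bars: list of bar identifiers in written order.
    repeats: list of (start, end) bar-index pairs to be played twice."""
    repeat_ends = {end: start for start, end in repeats}
    playback, taken = [], set()
    i = 0
    while i < len(bars):
        playback.append(bars[i])
        if i in repeat_ends and i not in taken:
            taken.add(i)
            i = repeat_ends[i]   # jump back to the repeat start, once
        else:
            i += 1
    return playback
```

For example, bars a-b under a repeat followed by bar c flatten to a, b, a, b, c, which is exactly the order the eye-gaze tracker then follows bar by bar.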

In a pilot study used to evaluate the proposed system, we found that our proposed score flattening and eye-gaze page turning reduced the time spent navigating the page turns by more than 50% when compared to the MobileSheets and SheetMusic sheet music apps for tablets.