Registration
Welcome Note
Keynote talk: Theresa Zammit Lupi
Generation, Manipulation and Presentation
11:15 | The RASH Javascript Editor (RAJE): a wordprocessor for writing Web-first scholarly articles SPEAKER: unknown ABSTRACT. The most widely used format for submitting and publishing papers in the academic domain is the Portable Document Format (PDF), since it can be rendered in the same way regardless of the device used to view it. However, the PDF format has some important issues as well: it lacks interactivity, it is not machine-readable, it is seen as a monolithic artefact on the Web, and it does not offer acceptable accessibility to people with disabilities. In order to address these issues, some journals, conferences, and workshops have recently started to also accept HTML as a Web-first submission/publication format. However, most people are not able to produce a well-formed HTML5 article from scratch, and they thus need an appropriate interface, e.g. a wordprocessor, for creating such HTML-compliant scholarly articles. To provide a solution to this issue, in this paper we introduce the RASH Javascript Editor (a.k.a. RAJE), a multiplatform wordprocessor for writing scholarly articles natively in HTML. It generates documents compliant with the RASH format, a subset of HTML5 developed for sharing scholarly articles with appropriate Web technologies. In particular, RAJE allows authors to write research papers by means of a user-friendly interface that hides the complexities of HTML5. We also discuss the outcomes of a user study in which we asked several people from academia to write a scientific paper using RAJE. |
11:45 | The Fábulas Model for Authoring Web-based Children's eBooks SPEAKER: unknown ABSTRACT. Nowadays, tablets and smartphones are commonly used by children for both entertainment and educational purposes. Interactive multimedia eBooks running on mobile devices allow a richer experience when compared to traditional text-only books, being potentially more engaging and entertaining to readers. However, to explore the most exciting features in these environments, authors are left alone in the sense that there is no high-level (less technical) support, and these features are usually only accessible through programming or some other technical skill. In this work, we aim at extracting the main features of enhanced eBooks for children and introduce a model named Fábulas (the Portuguese word for fables) that allows authors to create interactive multimedia children's eBooks declaratively. The model was conceived by taking, as a starting point, a systematic analysis of the common concepts, with a focus on identifying and categorizing recurring characteristics and pointing out functional and non-functional requirements that establish a strong orientation towards the set of desirable abstractions of an underlying model. Moreover, the paper presents a case study of the implementation of Fábulas on the Web, and discusses the authoring of a complete interactive story with it. |
12:15 | Effective Floating Strategies SPEAKER: Frank Mittelbach ABSTRACT. This paper presents an extension to the general framework for globally optimized pagination described in Mittelbach (2016). The extended algorithm supports automatic placement of floats as part of the optimization. It uses a flexible constraint model that allows for the implementation of typical typographic rules that can be weighted against each other to support different application scenarios. By "flexible" we mean that the rules of typographic presentation of the content of a document element are not fixed, but neither are they completely arbitrary; also, some of these rules are absolute whereas others are in the form of preferences. It is easy to see that without restrictions the float placement possibilities grow exponentially if the number of floats has a linear relation to the document size. It is therefore important to restrict the objective function used for optimization in such a way that the algorithm does not have to evaluate all theoretically possible placements while still being guaranteed to find an optimal solution. Different objective functions are evaluated against typical typographic requirements in order to arrive at a system that is both rich in its expressiveness, modeling a large class of pagination applications, and capable of solving the optimization problem in an acceptable time frame for realistic input data. Frank Mittelbach. 2016. A General Framework for Globally Optimized Pagination. In Proceedings of the 2016 ACM Symposium on Document Engineering (DocEng '16). ACM, New York, NY, USA, pages 11-20. |
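The growth claim in the abstract above can be illustrated with a small counting sketch. This is a toy model and not the constraint model or algorithm of the paper: it assumes exactly one float is called out on each page, that a float may land on its callout page or any later page, and that floats keep their order. The function name `count_placements` is hypothetical.

```python
from functools import lru_cache


def count_placements(num_pages: int) -> int:
    """Count float placements in a toy model (an assumption, not the paper's).

    Float i is called out on page i (one float per page), may be placed on
    page i or any later page, and floats must appear in callout order.
    """

    @lru_cache(maxsize=None)
    def count(float_idx: int, min_page: int) -> int:
        # All floats placed: this is one complete placement.
        if float_idx > num_pages:
            return 1
        # A float cannot precede its callout or the previously placed float.
        first_page = max(float_idx, min_page)
        return sum(
            count(float_idx + 1, page)
            for page in range(first_page, num_pages + 1)
        )

    return count(1, 1)


if __name__ == "__main__":
    for pages in range(1, 13):
        print(pages, count_placements(pages))
```

Even in this restricted model the number of candidate placements quickly outgrows the page count, which is why an optimizer cannot simply enumerate all of them.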
Lunch/BoF
Collections, Systems and Management
Document Analysis: visual document analysis
17:00 | Assessing Binarization Techniques for Document Images SPEAKER: Rafael Lins ABSTRACT. Image binarization is a technique widely used for documents, as monochromatic documents require far less storage space and network bandwidth than their color or even grayscale equivalents. Paper color, texture, aging, translucency, the kind and color of ink used in handwriting, the printing process, the digitization process, etc. are some of the factors that affect binarization. No single algorithm is good enough to be the winner in the binarization of all kinds of documents. This paper presents a methodology to assess the performance of binarization algorithms for a wide variety of text documents, allowing a judicious quantitative choice of the best algorithms and their parameters. |
17:30 | Baseline detection on Arabic Handwritten Documents SPEAKER: unknown ABSTRACT. Document processing comprises different steps depending on the nature of the documents. For text documents, and especially for handwritten documents, recognition of their contents is one of the main tasks. Handwritten Text Recognition (HTR) is the process of automatically recognizing the content of a handwritten text document. In document processing, the basic unit for the acquisition process is the page image, whilst the line image is the basic unit for the HTR process. This is a bottleneck that is holding back large-scale industrial document processing. Baseline detection can be used not only to segment page images into line images but also for many other document processing steps. The baseline detection problem can be formulated as a clustering problem over a set of interest points. In this work we study the use of an automatic baseline detection technique, based on interest point clustering, in Arabic handwritten documents. The experiments reveal that this technique provides promising results for this task. |
17:45 | High-Performance Preprocessing of Architectural Drawings for Legend Metadata Extraction via OCR SPEAKER: unknown ABSTRACT. This paper describes the results of an investigation into methods of preprocessing architectural plots to enable them to be processed very quickly via OCR, detecting the region containing the relevant metadata legend and obtaining it in machine-readable form for, e.g., automated folding and file-naming applications. We show how a processing pipeline adapted to this type of content can vastly reduce processing time while maintaining acceptable accuracy. Initial results show a reduction in total processing time from 2-3 minutes to around 15 seconds for most documents encountered, with the folding orientation being correctly detected in 78% of cases and the legend region being completely detected in 60% of cases, high enough for the use case at hand. |
18:00 | Automatic Page Turning for Musical Scores using Eye-Gaze Tracking SPEAKER: unknown ABSTRACT. Musical scores are available as digital documents through digital libraries, which are a great asset to aspiring musicians, providing immediate access to thousands of scores. The combination of digital scores and portable tablets allows musicians to download sheet music directly to tablet devices and play the music from the tablet, hence compacting large volumes of works into a single, portable device. However, since the tablet screen is typically smaller than printed sheet music, the score is displayed at a smaller scale, increasing the difficulty of reading the music. Adjusting the score to fit into the available screen space implies that more frequent page turns are required. Page turning is made more complex when taking into account that music may be abbreviated through the use of repeat mark symbols and written directions. Such repeats allow for smaller printed books by avoiding the printing of sections of music that are the same. However, this may give rise to forward and backward page turns to sections of the music. Automated page turning would therefore be more desirable. An ideal system would present the score as a flattened score, that is, displayed as it should be played rather than as it is written, in order to allow the performer to execute repeats in the music with little effort. The system should also follow the musician’s pace such that the score is displayed at the appropriate tempo, allowing for slower playing in the case of musicians still learning their music and also for stylistic variations in tempo. In this paper we document our investigation into digital sheet music representation, expanding the score and eliminating the need for backward navigation. 
Moreover, we propose the use of eye-gaze tracking to keep track of the performer’s position on the score and hence automate the page turning which is paced according to the musician’s performance. Our digital music display tool is divided into two parts. In the first part, a scanned image of the music is processed in order to represent this as a flattened score, displaying this at a suitable size on the digital tablet. This requires pre-processing algorithms which segment the score into systems and then into bars, and symbol recognition to determine the instructions related to the musical flow. Domain knowledge is then used to flatten the score. In the second part, eye-gaze tracking is performed to determine the bar at which the musician is gazing and hence activate the page turn. In a pilot study used to evaluate the proposed system, we found that our proposed score flattening and eye-gaze page turning reduced the time spent navigating the page turns by more than 50% when compared to the MobileSheets and SheetMusic sheet music apps for tablets. |
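The idea of a flattened score described in the abstract above can be sketched in a few lines. This is an illustrative assumption only, not the authors' implementation: the `Bar` class and its field names are hypothetical, only simple repeat barlines played twice are handled, and alternative endings and written directions such as D.C. or D.S. are ignored.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Bar:
    """Hypothetical bar record, as might be produced by symbol recognition."""
    label: str
    repeat_start: bool = False  # bar opens a repeated section
    repeat_end: bool = False    # bar closes a repeated section


def flatten(bars: List[Bar]) -> List[str]:
    """Return bar labels in playing order, expanding simple repeats once."""
    flat: List[str] = []
    section_start = 0       # a repeat with no start mark returns to the first bar
    taken = set()           # repeat-end bars whose repeat was already played
    i = 0
    while i < len(bars):
        bar = bars[i]
        if bar.repeat_start:
            section_start = i
        flat.append(bar.label)
        if bar.repeat_end and i not in taken:
            taken.add(i)
            i = section_start   # jump back and play the section again
            continue
        i += 1
    return flat


if __name__ == "__main__":
    score = [
        Bar("A"),
        Bar("B", repeat_start=True),
        Bar("C", repeat_end=True),
        Bar("D"),
    ]
    print(flatten(score))
```

With the score written out in playing order, navigation only ever moves forward, which is the property the abstract relies on for automated page turning.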