A Lightweight and Efficient Mechanism for Fixing the Synchronization of Misaligned Subtitle Documents
SPEAKER: unknown
ABSTRACT. Online subtitle databases allow users to easily find subtitle documents in multiple languages for thousands of films and TV series episodes. However, getting the subtitle document that gives satisfactory synchronization on the first attempt is like hitting the jackpot. The truth is that this process often involves a lot of trial and error because multiple versions of subtitle documents have distinct synchronization references, given that they are targeted at variations of the same audiovisual content. Building on our previous efforts to address this problem, in this paper we formalize and validate a two-phase subtitle synchronization framework. The benefit over current approaches lies in the use of audio fingerprint annotations generated from the base audio signal as second-level synchronization anchors. This way, we allow the media player to fix dynamically, during playback, the most common cases of subtitle synchronization misalignment that compromise users’ watching experience. Results from our evaluation process indicate that our framework has minimal impact on existing subtitle documents and formats as well as on the playback performance.
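The idea of using anchors to correct cue timings during playback can be illustrated with a minimal Python sketch. It is not the framework described in the abstract: the function name `resync`, the anchor representation (pairs of a time in the subtitle timeline and the matching time observed in the played audio), and the rule of applying the offset of the nearest preceding anchor are all assumptions made for illustration.

```python
from bisect import bisect_right


def resync(cues, anchors):
    """Shift subtitle cues using the offset of the nearest preceding anchor.

    cues    -- list of (start, end) times in seconds, in the subtitle timeline
    anchors -- list of (subtitle_time, playback_time) pairs sorted by
               subtitle_time (hypothetical stand-in for fingerprint matches)
    """
    if not anchors:
        # No anchors available: leave the cues untouched.
        return list(cues)

    anchor_times = [subtitle_time for subtitle_time, _ in anchors]
    fixed = []
    for start, end in cues:
        # Index of the last anchor at or before the cue start;
        # cues before the first anchor fall back to the first anchor.
        i = max(bisect_right(anchor_times, start) - 1, 0)
        offset = anchors[i][1] - anchors[i][0]
        fixed.append((start + offset, end + offset))
    return fixed


if __name__ == "__main__":
    anchors = [(0.0, 0.0), (600.0, 602.5)]   # drift of 2.5 s after minute 10
    cues = [(10.0, 12.0), (700.0, 703.0)]
    print(resync(cues, anchors))
```

A piecewise-constant offset is the simplest choice here; a real player might instead interpolate between anchors or handle frame-rate differences, which the abstract does not specify.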
DocuGram: Turning Screen Recordings into Documents
SPEAKER: unknown
ABSTRACT. In this paper we describe DocuGram, a novel tool to capture and share documents originating from any application. As users scroll through pages of their document inside the native application (Word, Google Docs, web browser), the system captures and analyses in real time the rendered video frames and reconstitutes the original document pages into an easy-to-view HTML-based representation. In addition to detecting and regenerating the document pages, a DocuGram also includes the interactions users had over them, e.g. mouse motions and voice comments. A DocuGram allows users to flexibly share enhanced documents across applications.
An Exploratory Study on Managing and Searching for Documents in Software Engineering Environments
SPEAKER: unknown
ABSTRACT. A large number of documents is usually produced in the software industry. In this work, we conduct a qualitative study to explore the main practices and challenges related to managing these documents. The results of this study are based on interviews with 13 practitioners from nine companies. The main findings of the study are: (1) much data is stored in e-mails and in meeting protocols, (2) practitioners like wikis, (3) when searching for documents, practitioners would rather browse the structure than use the search function, and (4) searching for documents is still a challenge due to the low effectiveness of search functions and the scattering of documents over several locations and tools.
Mass Serialization Method for Document Encryption Policy Enforcement
SPEAKER: unknown
ABSTRACT. Analytics obtained during the creation of a database of mass serialized codes can also be used to help enforce encryption policy on documents. In this paper, we introduce a set of metrics that complement traditional NIST cryptography methods (four mass serialization metrics and one entropy metric), which in combination allow discrimination between encrypted and zipped files. We describe the use of these methods to identify a broad range of non-randomness in number sets, and apply them to a more mundane problem: automatic assessment of the encryption state of a corpus of documents.
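As a rough illustration of what an entropy metric over file contents can look like, the following Python sketch computes the Shannon entropy of a byte string. The abstract does not define its entropy metric, so the formula and the function name `byte_entropy` are assumptions, not the authors' method.

```python
import math
from collections import Counter


def byte_entropy(data: bytes) -> float:
    """Shannon entropy of a byte string, in bits per byte (0.0 to 8.0)."""
    if not data:
        return 0.0
    counts = Counter(data)          # frequency of each byte value
    total = len(data)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


if __name__ == "__main__":
    print(byte_entropy(b"aaaa"))             # single repeated symbol
    print(byte_entropy(bytes(range(256))))   # every byte value exactly once
```

Both compressed and encrypted data tend to score close to 8 bits per byte under this measure, which is consistent with the abstract combining an entropy metric with other metrics rather than relying on it alone.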
Generation of Search-able PDF of the Chemical Equations segmented from Document Images
SPEAKER: unknown
ABSTRACT. Scanned document images stored in PDF format are not searchable. OCR tries to remedy this by converting document images into editable and searchable data, but it has its own limitations in the presence of equations, both mathematical and chemical. OCR for mathematical equations is already a major research area and has produced successful results. However, chemical equation segmentation has been a less explored road. In this paper, we present a novel method for automated generation of searchable PDF format of segmented chemical equations from scanned document images by performing chemical symbol recognition and auto-correction of OCR output. We use an existing OCR system, pattern recognition techniques, contextual data analysis and a standard LaTeX package to generate the chemical equation in searchable PDF format. The effectiveness of the proposed method is verified through exhaustive testing on 234 document images.
A Multimodal Crowdsourcing Framework for Transcribing Historical Handwritten Documents
SPEAKER: unknown
ABSTRACT. Transcription of handwritten historical documents is one of the main topics in document analysis systems, for humanistic and cultural reasons. State-of-the-art handwritten text recognition systems make it possible to speed up the transcription task. Currently, this automatic transcription is far from perfect, and human expert revision is required in order to obtain the actual transcription. In this context, crowdsourcing has emerged as a powerful tool for massive transcription at a relatively low cost, since the supervision effort of professional transcribers may be dramatically reduced. However, current transcription crowdsourcing platforms are mainly limited to the use of non-mobile devices, since the use of keyboards in mobile devices is not friendly enough for most users. This work presents the alternative of using speech dictation of handwritten text lines as the transcription source in a crowdsourcing platform. The experiments explore how an initial handwritten text recognition hypothesis can be improved by using the contribution of speech recognition from several speakers, providing as a final result a better hypothesis to be amended by a professional transcriber with less effort.
Embedded Textual Content for Document Image Classification with Convolutional Neural Networks
SPEAKER: unknown
ABSTRACT. In this paper we introduce a novel document image classification method based on combined visual and textual information. The proposed algorithm's pipeline is inspired by those of other recent state-of-the-art methods which perform document image classification using Convolutional Neural Networks. The main addition of our work is the introduction of a preprocessing step embedding additional textual information into the processed document images. To do so we combine Optical Character Recognition and Natural Language Processing algorithms to extract and manipulate relevant text concepts from document images. Such textual information is then visually embedded within each document image to improve the classification results of a Convolutional Neural Network. Our experiments show that the overall document classification accuracy of a Convolutional Neural Network trained using these text-augmented document images is considerably higher than the one achieved by a similar model trained solely on classic document images, especially when different classes of documents share similar visual characteristics.