BIGMM 2020: IEEE INTERNATIONAL CONFERENCE ON MULTIMEDIA BIG DATA 2020
PROGRAM FOR SATURDAY, SEPTEMBER 26TH

09:30-10:00 Coffee Break
10:00-11:30 Session 13A: Novel Applications

Novel Applications

10:00
ComplexCTTP: Complexity Class Based Transcoding Time Prediction for Video Sequences Using Artificial Neural Network

ABSTRACT. HTTP Adaptive Streaming of video content is becoming an integral part of the Internet and accounts for the majority of today’s traffic. Although Internet bandwidth is constantly increasing, video compression technology plays an important role and the major challenge is to select and set up multiple video codecs, each with hundreds of transcoding parameters. Additionally, the transcoding speed depends directly on the selected transcoding parameters and the infrastructure used. Predicting transcoding time for multiple transcoding parameters with different codecs and processing units is a challenging task, as it depends on many factors. This paper provides a novel and considerably fast method for transcoding time prediction using video content classification and neural network prediction. Our artificial neural network (ANN) model predicts the transcoding times of video segments for state-of-the-art video codecs based on transcoding parameters and content complexity. We evaluated our method for two video codecs/implementations (AVC/x264 and HEVC/x265) as part of large-scale HTTP Adaptive Streaming services. The ANN model of our method is able to predict the transcoding time by minimizing the mean absolute error (MAE) to 1.37 and 2.67 for x264 and x265 codecs, respectively. For x264, this is an improvement of 22% compared to the state of the art.
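The mean absolute error (MAE) reported above is the average absolute difference between predicted and measured transcoding times. A minimal sketch of the metric follows; the sample values and the choice of seconds as the unit are illustrative assumptions, not data from the paper.

```python
def mean_absolute_error(actual, predicted):
    """Return the average of |actual - predicted| over paired samples."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


# Hypothetical transcoding times (assumed to be in seconds, illustrative only).
measured = [10.0, 12.5, 8.0, 15.0]
estimated = [11.0, 12.0, 9.5, 14.0]
mae = mean_absolute_error(measured, estimated)
```

A lower MAE means the predicted times are closer, on average, to the measured ones.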

10:20
A Weighted Text Representation framework for Sentiment Analysis of Medical Drug Reviews

ABSTRACT. The steady growth of the Internet has increased the amount of user-generated data on the web. Patients now commonly post reviews after taking medicines to express themselves and create public awareness. Sentiment analysis can contribute significantly to the medical field by analyzing these public reviews and studying the effectiveness or popularity of different medicines. Hence, in this paper, we propose an effective framework based on a weighted word representation technique that adds linguistic constraints to model contextually similar words. We also train six popular classifiers for this task, including SVM, Decision Tree, Random Forests, Naïve Bayes, and K-Nearest Neighbor. Extensive experiments were performed on a drug review dataset crawled from online pharmaceutical review websites. The results demonstrate that the proposed framework outperforms various state-of-the-art methods, achieving an accuracy of 94.6% and an F1 score of 90.2%. This shows that the framework effectively captures the sentiments people express about different drugs.

10:40
Strategies for Enhancing Training and Privacy in Blockchain Enabled Federated Learning

ABSTRACT. Several recent advances in Federated Learning have made it possible for researchers to train their models on private data present on contributing devices without compromising privacy. In this paradigm, each contributor's local updates are aggregated and averaged to update the global model. In this paper, we introduce a secure and decentralized training scheme for distributed data. To develop an efficient decentralized system, blockchain technology is introduced via Ethereum, which enables us to create a value-driven incentive mechanism. This encourages contributors to positively affect the learning of the global model. We provide an enhanced security mechanism by implementing differential privacy and homomorphic encryption. The performance of the global model is significantly boosted by implementing Elastic Weight Consolidation, which prevents catastrophic forgetting, a scenario where the model learns only from new data and forgets what it learned previously. This proves essential in distributed training, since the model is trained on a spectrum of data that is often present in clusters on each contributor's device. We introduce an innovative way of using hyperparameter optimization in federated learning with the help of Hyperopt and a deposit-based reward mechanism. Experiments verify the capability of the novel strategies incorporated in our system.
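The aggregation step described in this abstract, where contributors' local updates are averaged into a global model, can be sketched as follows. This is a minimal illustration only: weighting each contributor by its sample count is an assumption borrowed from the common federated averaging convention, and the sketch omits the blockchain, differential privacy, and encryption components. The function and parameter names are hypothetical.

```python
def federated_average(local_weights, sample_counts):
    """Combine per-contributor weight vectors into one global vector.

    Each contributor's vector is weighted by its share of the total
    number of training samples (an assumed weighting scheme).
    """
    if not local_weights or len(local_weights) != len(sample_counts):
        raise ValueError("need one sample count per weight vector")
    total = sum(sample_counts)
    if total <= 0:
        raise ValueError("total sample count must be positive")
    dim = len(local_weights[0])
    global_weights = [0.0] * dim
    for weights, count in zip(local_weights, sample_counts):
        if len(weights) != dim:
            raise ValueError("all weight vectors must have the same length")
        for i, w in enumerate(weights):
            global_weights[i] += w * count / total
    return global_weights


# Two hypothetical contributors holding 1 and 3 samples respectively.
averaged = federated_average([[1.0, 2.0], [3.0, 4.0]], [1, 3])
```

With equal sample counts this reduces to a plain arithmetic mean of the local weight vectors.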

11:00
Exploring Multi Feature Optimization for Summarizing Clinical Trial Descriptions

ABSTRACT. Documenting the Clinical Trial Descriptions of patients can help doctors target diagnostics and treatment plans, and the records can be used for future reference. However, with the rapid growth of the population, manually checking all previous files of a patient is not feasible. We address this challenge by providing summaries of clinical trial descriptions (i.e., their essential details). We present a framework for automatically summarizing Clinical Trial Descriptions that takes advantage of different features in semantic and syntactic space. We provide a detailed ablation study to show the contribution of each feature in our approach and release our code on GitHub.

11:15
Event Detection and Localization for Sparsely Populated Outdoor Environment using Seismic Sensor

ABSTRACT. The popularity of low-power sensor devices in security applications is increasing due to their simplicity, affordability, and suitability for undercover operations. Additionally, these sensors detect changes passively. In this paper, we present a framework to detect changes in the monitored area with the help of seismic sensors. We not only detect the target but also identify its location. Our framework is also capable of identifying the sensor closest to the target, which generalizes the operation to multiple sensors. We propose two methods based on non-linear regression and the properties of seismic waves. The event-detection module detects the target in the monitored area with 96% accuracy on average. The proposed approaches show average location estimation errors of 1.4 meters and 1.32 meters. In addition, 64.03% and 68.32% of test locations, respectively, show an error within 1.5 meters.

10:00-13:00 Session 13B: DHPYH

Digital Humanities into the Palm of Your Hand

11:50-13:00 Session 14A: BDH I

BDH I

11:50
DP-ANN: A new differential private Artificial Neural Network with Application on Health data

ABSTRACT. The privacy of individual data, especially health data, is sensitive and important. Privacy-preserving machine learning is emerging as one way to secure data while retaining its utility for creating knowledge. In this paper, we propose a differentially private artificial neural network (DP-ANN) and show its application to predicting the spread and the peak number of COVID-19 cases. We introduce three privacy-preserving ANN models that add Laplacian noise at different locations in the training model: the activation function, the loss function, and the weights, each separately. Results show that the DP-ANN model with a private activation function produces results similar to those of the base ANN model.
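One of the three variants mentioned here perturbs the weights with Laplacian noise. The sketch below shows the generic Laplace mechanism applied to a weight vector, assuming a noise scale of sensitivity divided by epsilon; the abstract does not give the paper's exact calibration, so the function names, parameters, and values are hypothetical.

```python
import random


def laplace_noise(scale, rng):
    """Draw one Laplace(0, scale) sample.

    The difference of two independent exponential variables with
    mean `scale` follows a Laplace distribution with that scale.
    """
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def privatize_weights(weights, sensitivity, epsilon, seed=None):
    """Add Laplace noise with scale sensitivity / epsilon to each weight."""
    if epsilon <= 0 or sensitivity <= 0:
        raise ValueError("sensitivity and epsilon must be positive")
    rng = random.Random(seed)
    scale = sensitivity / epsilon
    return [w + laplace_noise(scale, rng) for w in weights]


# Hypothetical weights and privacy parameters (illustrative only).
noisy = privatize_weights([0.1, -0.4, 0.7], sensitivity=1.0, epsilon=2.0, seed=0)
```

A smaller epsilon gives a larger noise scale, which strengthens privacy at the cost of model accuracy.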

12:10
Spread & Peak Prediction of Covid-19 using ANN and Regression

ABSTRACT. Covid-19, caused by the Severe Acute Respiratory Syndrome Coronavirus 2 (SARS-CoV-2) virus, has presented tough times for countries all over the world, with the number of cases and casualties running in the millions. While virologists and doctors have spent sleepless nights trying to come up with a potent vaccine, the work-life of government personnel, including administrative staff and hospital employees, has not been any easier. Amidst this turmoil, the common question crossing every mind concerns the statistics of this infection, including the expected number of infections and the timing of the peak. We try to answer these questions by analyzing the time series data of Covid-19 infections for certain hard-hit countries and states in India. A series of machine learning and deep learning models have been built to capture the infection distribution so that these models can predict the course of the infection in the near future. We also attempt to predict the time when active cases would cease to increase.

12:30
RDWT-SVD-FIREFLY BASED DUAL WATERMARKING TECHNIQUE FOR MEDICAL IMAGES

ABSTRACT. To improve the security of medical data, this paper uses a robust and secure dual watermarking technique in the RDWT-SVD domain. Dual watermarking is achieved by embedding a text watermark in a medical watermark image using DWT, which generates the final watermark. Before embedding, a turbo code is used to encode the text watermark. Further, the trade-off between imperceptibility and robustness is controlled in our scheme through firefly optimization. Experimental results indicate that the proposed method offers high imperceptibility and robustness, and has superior overall performance.

11:50-13:00 Session 14B: MAISG I

MAISG I

11:50
UNSUPERVISED FUZZY INFERENCE SYSTEM FOR SPEECH EMOTION RECOGNITION USING AUDIO & TEXT CUES

ABSTRACT. Speech Emotion Recognition (SER) is the technique of determining underlying emotions from speech samples. Text transcripts usually supplement vocal cues and contain additional information that boosts the SER process. In this paper, we develop an unsupervised Fuzzy Inference System (FIS) for SER that incorporates audio and text features. The extracted features are pitch, energy, and textual sentiment score. The proposed system is based on the Mamdani Fuzzy Inference model and is capable of determining four emotions: happy, sad, angry, and neutral. Our FIS has three variants, one for each sentiment lexicon (AFINN, SentiWordNet, and VADER) used to compute the textual sentiment score as the text feature. The main highlights of this work are: i) formulation of eleven novel fuzzy rules based on audio and text cues for SER; ii) comparative analysis of all variants of our proposed unsupervised FIS with five state-of-the-art supervised machine learning approaches for SER; iii) execution of both speaker-independent and speaker-dependent SER; iv) the finding that, in speaker-dependent SER, a few speakers receive higher accuracies than the others; and v) the ability of the proposed unsupervised FIS to handle multiple datasets without any training, whereas the supervised machine learning algorithms fail in cross-dataset evaluation. The experiments conducted on the SAVEE and RAVDESS speech datasets indicate that our FIS achieves higher accuracy and F1-scores than the other state-of-the-art methods.

12:10
Recent Developments in Generative Adversarial Networks: A Review

ABSTRACT. In recent times, Generative Adversarial Networks (GANs) have created a lot of buzz in the research community. GANs are formulated on the zero-sum game theory, where two neural nets compete against each other. The resultant deep model is capable of generating data similar to any data distribution provided. It utilizes the adversarial learning approach and is far more capable in learning features than the traditional machine learning models. This review focusses on the origin and evolution of GANs. Firstly, the traditional GAN is explored in terms of its structure and loss functions. Then come the common challenges of training GANs. Thirdly, the review dives into numerous GAN variants and explains their improvements. The review then lists the wide variety of applications and ends with the conclusion.

12:30
Open Domain Suggestion Mining using Fine-Grained Analysis

ABSTRACT. Suggestion mining tasks are often semantically complex and lack sophisticated methodologies that can be applied to real-world data. The presence of suggestions across a large diversity of domains and the absence of large, labelled, and balanced datasets render this task particularly challenging. In an attempt to overcome these challenges, we propose a two-tier pipeline that leverages Discourse Marker based oversampling and fine-grained suggestion mining techniques to retrieve suggestions from online forums. Through extensive comparison on a real-world open-domain suggestion dataset, we demonstrate how the oversampling technique combined with transformer-based fine-grained analysis can beat the state of the art. Additionally, we perform extensive quantitative and qualitative analysis to give construct validity to our proposed pipeline. Finally, we discuss the practical, computational, and reproducibility aspects of the deployment of our pipeline across the web.

13:00-14:00 Lunch Break
14:00-15:30 Session 15A: BDH II

BDH II

14:00
A robust medical image watermarking framework based on SVD and DE in Integer DCT domain

ABSTRACT. A secure medical image watermarking framework using SVD, DE, a step space-filling curve, and IntDCT is proposed. The proposed algorithm starts by partitioning the host medical image into 8×8 blocks, and IntDCT is then applied to each block. By collecting the integer DC coefficients, a low-resolution medical image is obtained, and SVD is applied to this low-resolution image. To balance imperceptibility and robustness, the popular differential evolution (DE) technique is used to search for the most suitable scaling factors. The singular values of the host image are modified with the singular values of the watermark to insert the watermark into the host image. Here, DE plays an important role in determining the most suitable scaling factors for the embedding process, balancing robustness against imperceptibility without degrading the quality of the medical images. To increase the security of the watermark image, it is first scrambled with the help of the step space-filling curve before the embedding process. The experimental observations and results demonstrate the efficiency of the proposed framework. It maintains the quality of the watermarked images, and the watermark can be extracted even from severely distorted images.

14:20
DRDNET: DIAGNOSIS OF DIABETIC RETINOPATHY USING CAPSULE NETWORK

ABSTRACT. Diabetic Retinopathy (DR) is a complication of diabetes that affects the human eye. It is caused by damage to the blood vessels of the light-sensitive tissue of the retina, and it is most frequent in patients who have had diabetes for more than ten years. The condition affects many individuals worldwide. However, the number of medical practitioners and tools needed for the detection of DR is too small to serve the mass population. In this paper, we propose DRDNet (Diabetic Retinopathy Diagnosis Network), a neural network framework based on capsule networks (CapsNets) for DR diagnosis. Experiments on a dataset of 1,265 images demonstrate that CapsNet shows better accuracy and convergence behavior on complex data than state-of-the-art techniques. The proposed DRDNet achieves an overall accuracy of 80.59% for five classes, compared to 75.83% for the closest competitor. We performed a study on a mixed dataset for two classes and found that the testing accuracy was 80.59%. We also trained a two-class model and tested it on other unseen datasets. Moreover, we observed that DRDNet has much higher confidence in its predicted probabilities than other state-of-the-art techniques.

14:40
A Dilated Convolutional Approach for Inflammatory Lesion Detection Using Multi-Scale Input Feature Fusion

ABSTRACT. The present manuscript proposes a novel CNN architecture to detect inflammatory lesion abnormality in Wireless Capsule Endoscopy (WCE) images. Such images encompass a wide range of lesions, and hence early diagnosis can be of vital importance. The proposed model learns the collective features of various inflammatory lesion subgroups and aggregates that information to solve a binary classification problem by distinguishing between normal and abnormal frames. The proposed model has one primary and three secondary branches. The primary branch resembles a generic CNN model with convolution and max-pooling layers, whereas the secondary branches consist of dilated convolution layers and max-pooling layers. The proposed model fuses the multi-scale input context at varying dilation rates with different levels of the primary branch. This enhances feature quality by merging dominant global features with the local input context at multiple scales without any loss of resolution. The performance of the proposed model has been assessed using various objective evaluation metrics. The preliminary experiments indicate that the proposed model outperforms state-of-the-art models with an accuracy of 97.9%, ROC-AUC of 1, and precision-recall AUC of 99.7.

14:00-15:30 Session 15B: MAISG II

MAISG II

14:00
Adversarial Machine Learning for Self Harm Disclosure Analysis

ABSTRACT. Adversarial Machine Learning has been gaining attention from the NLP community due to the low interpretability and low robustness of current state-of-the-art systems. In this work, we study the effect of various adversarial attacks on the detection of suicidal intent in a social media setting. Suicide ideation is a sensitive issue, and suicide is a leading cause of death. We show how various models are rendered useless after attacks and perform adversarial training using the most effective attacks to improve their robustness. We also conduct several experiments with the attacks to study their effect and propose an approach for adversarial training using Generative Adversarial Networks.

14:20
Deep Residual Neural Networks for Image in Audio Steganography

ABSTRACT. Steganography is the art of hiding a secret message inside a publicly visible carrier message. Ideally, it is done without modifying the carrier, and with minimal loss of information in the secret message. Recently, various deep learning based approaches to steganography have been applied to different message types. We propose a deep learning based technique to hide a source RGB image message inside finite length speech segments without perceptual loss. To achieve this, we train three neural networks; an encoding network to hide the message in the carrier, a decoding network to reconstruct the message from the carrier and an additional image enhancer network to further improve the reconstructed message. We also discuss future improvements to the algorithm proposed.

14:40
Analysing Emotions on Lecture Videos using CNN and HOG

ABSTRACT. Facial expressions play a vital role in recognizing emotions, in non-verbal communication, and in identifying people. In everyday life, emotion recognition from facial expressions is second in importance only to the tone of voice. It has enabled novel applications in Human-Computer Interaction (HCI) and in many other areas. Most recent research in this area focuses on Convolutional Neural Networks (CNN) for extracting features and making inferences from those features. In this paper, we use a CNN together with Histogram of Oriented Gradients (HOG) features for higher accuracy. We perform emotional analysis of lecturers from Impartus videos. By detecting their emotions from a sequence of videos during their coursework, we evaluate the feedback they are expected to receive at the end of their course.

15:00
Are Bots Humans? Analysis of Bot Users in 2019 Indian Lok Sabha Elections

ABSTRACT. Social media platforms have moved political and cultural conversations online, making them more accessible. The ability to post anonymously has allowed more people to participate fearlessly. However, this has also created an opportunity to spread misinformation and manipulative content. Political groups around the globe have used Bot accounts to help spread their preferred narrative online during elections. In the midst of the 2019 Indian Lok Sabha Elections, speculations were made about the presence of manual cyber-troops/IT Cells that operate fake accounts and push propaganda. Our findings suggest that a sizable portion of Bot accounts seem to be operated by humans in the background. These accounts have a very distinct usage pattern on Twitter compared to legitimate human users. Our experiments also show that only 1.3% of total interactions are directed from Humans to Bots, reflecting the Bot accounts' inability to integrate well into the online social network.

15:30-16:00 Coffee Break
16:00-17:00 Session 16: Grand Challenge

Grand Challenge

16:00
Attributional analysis of Multi-Modal Fake News Detection Models

ABSTRACT. Fake news detection is the task of identifying a particular news article as fake or real. In this paper, we propose and assess two approaches to multi-modal fake news detection. In the first approach, we fuse the textual and image modalities. The textual features are obtained from pre-trained language models such as BERT and SBERT, and the image features are extracted from ResNet18 pre-trained on the ImageNet dataset. In the second approach, we use Visual Attention for fake news detection. We test both strategies on the Gossipcop and Politifact datasets. Our experiments show that using the complete text of the article with the BERT model setting provides the best result. Further, we use Integrated Gradients to analyze our models by observing input attributions.

16:15
METOO BMGC: A MULTI-TASK MULTIMODAL FRAMEWORK FOR TWEET CLASSIFICATION BASED ON CNN

ABSTRACT. This paper describes the design of our solution submitted for the MeToo BigMM Grand Challenge (BMGC). The challenge involves building a multimodal framework for predicting linguistic aspects of tweets such as relevance, stance, hate speech, sarcasm, and dialogue acts. In total, there are 10 different categories to be predicted. For the challenge, we try several approaches based on Bidirectional LSTM with attention and Convolutional Neural Networks. We describe in detail our insights behind the various steps involved, from pre-processing to defining the model architecture. We further provide a detailed analysis of the results obtained from the challenge. Overall, our team ranked 4th on the final leaderboard of the challenge.

16:30
Multimodal Sentiment Analysis of #MeToo Tweets using Focal Loss

ABSTRACT. The #MeToo trend has led to people talking about personal experiences of harassment more openly. This work attempts to aggregate such experiences of sexual abuse to facilitate a better understanding of social media constructs and to bring about social change [1]. We propose an approach to multimodal sentiment analysis using deep neural networks that combines visual analysis and natural language processing. Our goal differs from the standard sentiment analysis goal of predicting whether a sentence expresses positive or negative sentiment; instead, we try to detect the stance of a person on the topic and deduce the emotions conveyed. We use a Multimodal Bi-transformer model [2], which combines image and text features to predict a tweet's stance and sentiments on the #MeToo campaign.

16:45
Stance Classification with Improved Elementary Classifiers Using Lemmatization (Grand Challenge)

ABSTRACT. Twitter, a microblogging and social networking service, gives us access to large-scale social data. In this report, we try to obtain the sentiments of tweets in the context of the #metoo movement. We try to derive a model that classifies tweets into categories of different linguistic aspects such as hate, sarcasm, allegations, and support/opposition. This helps analyze the tweets and flag them according to relevance. Here we explore elementary machine learning algorithms, namely multinomial naive Bayes and a random forest classifier, with ratio selection classification to improve stance classification at greater efficiency than baseline models.

17:00
Transfer Learning with Augmented Vocabulary for Tweet Classification

ABSTRACT. In this paper, we describe our experiments and the insights gathered while participating in the BigMM Grand Challenge 2020, where the task was to classify a set of tweets pertaining to the #MeToo movement into five linguistic aspects. We analyzed the data set and experimented with several approaches, including classical machine learning models as well as state-of-the-art deep learning architectures. We achieved our best results by applying transfer learning to a pre-trained ULMFiT model. Our best-performing approach ranked first on the leaderboard when the grand challenge finished.

17:15
MeToo: Sentiment Analysis using Neural Networks

ABSTRACT. In the contemporary world, wide-range online social data has become a valuable tool in studying global movements and their impact on people as well as the people’s impact on them. One such movement, #MeToo has led hitherto unknown victims of sexual assault to come out and openly share their personal experiences. While a lot of people have responded positively towards the movement, some have not. Our work attempts to study the data that we have gathered from Twitter and generate a better understanding of the MeToo movement’s impact on society and vice versa. We use three different model architectures, a Convolutional Neural Network, a Recurrent Neural Network and a BERT transformer to classify the collected tweets on the basis of their attitude towards the people who have come out and the movement in general. The source to our GitHub project can be found at: https://github.com/ahmadkhan242/MeToo-

17:00-17:45 Session 17: BDMAS

BDMAS

17:00
Analyzing Traffic Violations through e-challan System in Metropolitan Cities

ABSTRACT. Given that India is now moving towards automated solutions to curb traffic violations and road accidents, we focus our efforts on characterizing traffic violations in Indian cities. In this work, we present our characterization of traffic violation incidents via an automated e-challan (electronic traffic-violation receipt) issuance system established in Ahmedabad and New Delhi. To explore this, we describe a method to collect e-challans from the traffic police portals of Ahmedabad and New Delhi, and collect an exhaustive dataset of over 6 million e-challans. We analyze the incidents of traffic violations in these cities and the associated spatial-temporal patterns. Characterizing the prevalence of repeat violations and fine payment behavior, we find that 57% of unique vehicles in Ahmedabad are involved in repeat offenses. Temporal analysis of the data reveals that the number of e-challans issued per day has increased continuously over the years as the system has been improved. We find a significant difference in the number of e-challans issued during festivals. Spatial analysis reveals that different violation types are distributed differently, with certain unique hotspots. Finally, we also demonstrate how traffic violations can act as a proxy measure to analyze the efficacy of various government road-traffic regulations, such as the Motor Vehicles (Amendment) Act 2019. Our work suggests that high penalties may have an immediate impact on decreasing traffic violations in the short term, but the trend might not hold in the long run.

17:20
INFORMAL HUMAN SETTLEMENTS, BIG DATA AND SOCIAL THEORY IN MOROCCO

ABSTRACT. The paper adopts Actor-Network Theory (ANT), endorsed by sociologists Callon, Mitchell and Latour, to reveal how the Chicago School of economics has utilized big data, and policies based on abstract quantitative analysis of big data, to ‘measure’ the livelihood of the urban poor who remain outside the regulation of modern state authority in post-colonial nations. Specifically, by studying contemporary Morocco and the policies targeting the eradication of informal human settlements such as slums and shanties in order to eradicate poverty in cities, the paper utilizes Callon and Mitchell’s theory of Economization to understand how big data plays a transforming role in framing and in the politics of metrology, which is critical for the re-alignment of state and market power. International organizations and Hayekian economic experts have relied on abstract big data analysis to import results from one country to another, making themselves ‘policy experts’ on poverty and development with little recognition of the contingent socioeconomic realities of countries in the global south. Consequently, the paper deals with three of the given topics for the workshop: the socioeconomic implications of big data, the consequent urban planning focused on the urban poor’s living settlements, and the role of academic economics and the free market in re-instating the power of the state.