AAAI-14: TWENTY-EIGHTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE

PROGRAM FOR THURSDAY, JULY 31ST

Days:
previous day
all days

View: session overviewtalk overview

08:40-09:25 Session 38A: AAAI-14: What's Hot Session (ICML/RSS/ICLR)
Location: Room 301A
08:40
Challenges in Machine Learning
08:55
What's Hot in Robotics
09:10
What's Hot in Learning Representations
08:40-09:25 Session 38B: AAAI-14: Senior Member Session
Location: Room 301B
08:40
Automating Science: A Grand Challenge for AI

ABSTRACT. The emergence of “big data” – data that are far more voluminous, diverse, and inter-related than we know how to cope with – has resulted in the transformation of many historically data poor disciplines into increasingly data rich disciplines. Much attention has focused on the challenges of management, processing, and analysis of big data. Advances in computing, storage, and communication technologies make it possible to organize, annotate, link, share, and analyze big data, setting the stage, in many disciplines, for a transformation from descriptive science to predictive science. However, there is a huge gap between our ability to acquire and process data and our ability to make effective use of data to advance science. While we understand how to automate routine aspects of data management and analytics, humans are still largely responsible for most aspects of the scientific process: identifying gaps in the current state of knowledge; generating and prioritizing questions; designing studies; designing, prioritizing, planning, and executing experiments; interpreting results; forming hypotheses; drawing conclusions; replicating studies; validating claims; documenting studies; communicating results; reviewing results; and integrating results into the larger body of knowledge in a discipline. Accelerating the transformation of the once data poor, but increasingly data rich disciplines from descriptive sciences to predictive sciences calls for (i) Understanding, formalization, and information processing accounts of, the entire scientific process; (ii) Design, development, and evaluation of the computational artifacts (representations, processes) that embody such understanding; and (iii) Application of the resulting artifacts and systems to advance science (by augmenting individual or collective human efforts, or by fully automating science). Formalization and automation of (key aspects of) science presents a grand challenge for AI, with the attendant opportunities for foundational advances in AI. Against this background, this talk will highlight some AI challenges in automating science; review some developments that set the stage for automating science; briefly review some advances in AI (e.g., in computational discovery, active learning, causal inference, literature-based discovery, argumentation systems, scientific knowledge representation and scientific applications of semantic web that signal progress on many of the AI problems in automating science, and conclude with a vision of AI research aimed at making the most of data, big and small, to not only accelerate science, but also enable new kinds of science.

08:55
Themes and Progress in Computational Scientific Discovery
SPEAKER: Pat Langley

ABSTRACT. I propose to review themes and progress in computational discovery of scientific knowledge. Topics will include viewing discovery as search through a space of models, the role of knowledge in aiding this search, and a focus on explaining observatinons rather than merely summarizing data. I will contrast research in this tradition with the more recent work on 'data-intensive' science.

09:10
Task Learning: a Challenge Problem for Integrated Intelligent Agents
SPEAKER: John Laird

ABSTRACT. The goal of this presentation is to establish task learning as a challenge problem for AI. Our vision for a general task learning agent is one where a human interactively teaches the agent new tasks, using a combination of language, gestures, and demonstrations, with reference to the agent’s available background knowledge. As the task is defined, the agent asks questions about concepts it does not know, and about instructions that are not clear.The need for taskable agents is going to become critical as intelligent autonomous agents play increasing roles in our lives. It will be impossible to preprogram all the tasks that a human user will want an agent to perform. We will want to dynamically instruct our agents with new tasks and incrementally customize their behavior so that they do things the way we want them to be done. This presentation will build off of the results of a (proposed) NSF workshop on task learning being held in May that will bring together experts in autonomous agents, cognitive architecture, human-robot interaction, learning, and cognitive robots to establish a roadmap for research in task learning.

08:40-09:25 Session 38C: AAAI-14: Senior Member Session / IJCAI-JAIR 2014 Best Paper
Location: Room 303A
08:40
Senior Member: Tackling Real World Data Streams: a Call to Arms

ABSTRACT. Data stream mining has been a very active and successful research area in the last decade. It is best characterised as the online and resource-bounded processing and analysis of unbounded streams of data originating from non-stationary data sources. Still, when looking at real world data streams, the current state of the art in stream mining research is lacking in several regards. Shortcomings include preprocessing, labeling, scalability, evaluation, complex data, and a dearth of streaming algorithms for non-classification problems.

08:55
Senior Member: Computational Social Choice

ABSTRACT. Computational social choice is an expanding fields of research that merges classical disciplines like voting theory, political science, and economics with more modern ones like artificial intelligence and computational complexity. The main aim of this research area is to study AI settings where one needs to aggregate preferences coming from multiple sources, or agents, and to do so with computational concerns in mind. I plan to describe this very exciting and inter-disciplinary area of research, which in the past few years has been more and more present in large AI conférences like AAAI and IJCAI, as well as AAMAS and EC, by introducing the main lines of work and hinting at its potential for the future.

09:10
Wikipedia-based Semantic Interpretation for Natural Language Processing
08:40-09:25 Session 38D: AAAI-14: What's Hot Session (AIIDE/CP/QA)
Location: Room 303B
08:40
Challenges in Interactive Entertainment
SPEAKER: Kevin Dill
08:55
Challenges in Constraint Programming
09:10
Challenges Beyond Factoid Question Answering
SPEAKER: Ken Barker
09:00-10:00 Session 39: IAAI-14: Crowdsourcing
Location: Room 304A/B
09:00
Emerging: Robust Protection of Fisheries with COmPASS

ABSTRACT. Fish stocks around the world are in danger from ille- gal fishing. In collaboration with the U.S. Coast Guard (USCG), we work to defend fisheries from illegal fish- erman (henceforth called Lanchas) in the U.S. Gulf of Mexico. We have developed the COmPASS (Conserva- tive Online Patrol ASSistant) system to design USCG patrols against the Lanchas. In this application, we face a population of Lanchas with heterogeneous behavior who fish frequently. We have some data about these Lanchas, but not enough to fit a statistical model. Previ- ous security patrol assistants have focused on counter- terrorism in one-shot games where adversaries are as- sumed to be perfectly rational, and much less data about their behavior is available. COmPASS is novel because: (i) it emphasizes environmental crime; (ii) it is based on a repeated Stackelberg game; (iii) it allows for bounded rationality of the Lanchas and it offers a robust approach against the heterogeneity of the Lancha population; and (iv) it can learn from sparse Lancha data. We report the effectiveness of COmPASS in the Gulf in our numeri- cal experiments based on real fish data. The COmPASS system is to be tested by USCG.

09:20
Emerging: Swissnoise: Online Polls with Game-Theoretic Incentives

ABSTRACT. There is much interest in crowdsourcing information that is distributed among many individuals, such as the likelihood of future events, election outcomes, the quality of products, or the consequence of a decision. To obtain accurate outcomes, various game-theoretic incentive schemes have been proposed. However, only prediction markets have been tried in practice. In this paper, we describe an experimental platform, swissnoise, that compares prediction markets with peer prediction schemes developed in recent AI research. It shows that peer prediction schemes can achieve similar performance while being applicable to a much broader range of questions.

09:40
Emerging: Crowdsourcing for Multiple-Choice Question Answering

ABSTRACT. We leverage crowd wisdom for multiple-choice question answering, and employ lightweight machine learning techniques to improve the aggregation accuracy of crowdsourced answers to these questions. In order to develop more effective aggregation methods and evaluate them empirically, we developed and deployed a crowdsourced system for playing the "Who wants to be a millionaire?" quiz show.Analyzing our data (which consist of more than 200,000 answers), we find that by just going with the most selected answer in the aggregation, we can answer over 90% of the questions correctly, but the success rate of this technique plunges to 60% for the later/harder questions in the quiz show. To improve the success rates of these later/harder questions, we investigate novel weighted aggregation schemes for aggregating the answers obtained from the crowd.By using weights optimized for reliability of participants (derived from the participants' confidence), we show that we can pull up the accuracy rate for the harder questions by 15%, and to overall 95% average accuracy.Our results provide a good case for the benefits of applying machine learning techniques for building more accurate crowdsourced question answering systems.

09:25-09:55 Session 40A: AAAI-14: Novel Machine Learning Algorithms
Location: Room 301A
09:25
Exact Subspace Clustering in Linear Time
SPEAKER: unknown

ABSTRACT. Subspace clustering is an important unsupervised learning problem with wide applications in computer vision and data analysis. However, the state-of-the-art methods for this problem suffer from high time complexity---quadratic or cubic in $n$ (the number of data instances). In this paper we exploit a data selection algorithm to speedup computation and the robust principal component analysis to strengthen robustness. Accordingly, we devise a scalable and robust subspace clustering method which costs time only linear in $n$. We prove theoretically that under certain mild assumptions our method solves the subspace clustering problem exactly even for grossly corrupted data. Our algorithm is based on very simple ideas, yet it is the only linear time algorithm with noiseless or noisy recovery guarantee. Finally, empirical results verify our theoretical analysis.

09:40
Learning with Augmented Class by Exploiting Unlabeled Data
SPEAKER: unknown

ABSTRACT. In many real-world applications of learning, the environment is open and changes gradually, which requires the system to have the ability of detecting and adapting to the changes. Class-incremental learning (C-IL) is an important and practical problem where data from unseen augmented classes are fed, but has not been studied well in the past. In C-IL, the system should beware of predicting instances from augmented classes as a seen class, and thus faces the challenge that no such instances were shown in training. In this paper, we investigate tackling the challenge by using unlabeled data, which can be cheaply collected in many real-world applications. We propose the LACU framework as well as the LACU-SVM approach to learn the concept of seen classes while incorporating the structure presented in the unlabeled data, so that the misclassification risks among the seen classes as well as between the augmented and the seen classes are minimized simultaneously. Experiments on diverse datasets show the effectiveness of the proposed approach.

09:25-09:55 Session 40B: AAAI-14: Novel Machine Learning Algorithms
Location: Room 301B
09:25
Power Iterated Color Refinement
SPEAKER: unknown

ABSTRACT. Color refinement is a basic algorithmic routine for graph isomorphism testing and has recently been used for computing graph kernels as well as for lifting belief propagation and linear programming. So far, color refinement has been treated as a combinatorial problem. Instead, we treat it as a nonlinear continuous optimization problem and prove that it implements a conditional gradient optimizer that can be turned into graph clustering approaches using hashing and truncated power iterations. This shows that color refinement is easy to understand in terms of (local) random walks, easy to implement (matrix-vector multiplications) and is readily parallelizable. We support our theoretical results with experimental evidence on real-world graphs with millions of edges.

09:40
Convex Co-embedding
SPEAKER: unknown

ABSTRACT. We present a general framework for association learning where entities are embedded in a common latent semantic space to allow relatedness to be expressed by geometry---an approach that underlies the state of the art for link prediction, relation learning, multi-label tagging, relevance retrieval and ranking. Although current approaches rely on local training algorithms applied to non-convex formulations, we demonstrate how general convex relaxations can be easily achieved for entity embedding, both for the standard multi-linear and prototype-distance response models. We propose an incremental optimization strategy that exploits decomposition to allow scaling. An experimental evaluation reveals the advantages of tractable and repeatable global training in different case studies.

09:25-09:55 Session 40C: AAAI-14: Game Design/Intelligent Tutoring
Location: Room 302A
09:25
Synthesis of Geometry Proof Problems
SPEAKER: unknown

ABSTRACT. This paper presents an intelligent tutoring system, GeoTutor, for Euclidean Geometry that is automatically able to synthesize proof problems and their respective solutions given a geometric figure together with a set of properties true of it. GeoTutor can provide personalized practice problems that address student deficiencies in the subject matter.

09:40
Automatic Game Design via Mechanic Generation
SPEAKER: unknown

ABSTRACT. Game designs often center on the game mechanics - rules governing the logical evolution of the game. We seek to develop an intelligent system that generates computer games. As first steps towards this goal we present a composable and cross-domain representation for game mechanics that draws from AI planning action representations. We use a constraint solver to generate mechanics subject to design requirements on the form of those mechanics - what they do in the game. A planner takes a set of generated mechanics and tests whether those mechanics meet playability requirements - controlling how mechanics function in a game to affect player behavior. We demonstrate our system by modeling and generating mechanics in a role-playing game, platformer game, and combined role-playing-platformer game.

09:25-09:55 Session 40D: AAAI-14: Heuristic Search and Optimization / Planning and Scheduling
Location: Room 302B
09:25
Optimal Decomposition in Linear Constraint Systems
SPEAKER: unknown

ABSTRACT. Decomposition can be defined as a technique to obtain complete solutions by easy composition of partial solutions. Typically, these partial solutions are obtained by distributed and concurrent local problem solving without communication between the individual problem solvers. Constraint decomposition plays an important role in distributed databases, distributed scheduling and violation detection: Here, it enables conflict-free local decision making, while avoiding communication overloading. One of the main issues in decomposition is the loss of flexibility due to the composition technique used. Here, flexibility roughly refers to the freedom in choosing suitable values for the variables in order to satisfy the constraints. In this paper we concentrate on linear constraint systems and efficient decomposition techniques for these systems. Using a generalization of a flexibility metric developed for STNs, we show how an efficient decomposition technique for linear constraints can be derived that minimizes the loss of flexibility due to decomposition. As a by-product of our decomposition technique, we show that an intuitively attractive flexibility metric for linear constraint systems can be developed where decomposition does not incur any loss of flexibility.

09:40
Cached Iterative Weakening for Optimal Multi-Way Number Partitioning
SPEAKER: unknown

ABSTRACT. The NP-hard number-partitioning problem is to separate a multiset S of n positive integers into k subsets, such that the largest sum of the integers assigned to any subset is minimized. The classic application is scheduling a set of n jobs with different run times onto k identical machines such that the makespan, the time to complete the schedule, is minimized. We present a new algorithm, cached iterative weakening (CIW), for solving this problem optimally. It incorporates three ideas distinct from the previous state of the art: it explores the search space using iterative weakening instead of branch and bound; generates feasible subsets once and caches them instead of at each node of the search tree; and explores subsets in cardinality order instead of an arbitrary order. The previous state of the art is represented by three different algorithms depending on the values of n and k. We provide one algorithm which outperforms all previous algorithms for k >= 4. Our run times are up to two orders of magnitude faster.

09:25-09:55 Session 40E: AAAI-14: Machine Learning / Vision
Location: Room 303A
09:25
Supervised Hashing for Image Retrieval via Image Representation Learning
SPEAKER: unknown

ABSTRACT. Hashing is a popular approximate nearest neighbor search approach in large-scale image retrieval. Supervised hashing, which incorporates similarity/dissimilarity information on entity pairs to improve the quality of hashing function learning, has recently received increasing attention. However, in the existing supervised hashing methods for images, an input image is usually encoded by a vector of hand-crafted visual features. Such hand-crafted feature vectors do not necessary preserve the accurate semantic similarities of images pairs, which may often degrade the performance of hashing function learning. In this paper, we propose a supervised hashing method for image search, in which we automatically learn a good image representation tailored to hashing as well as a set of hash functions. The proposed method has two stages. In the first stage, given the pairwise similarity matrix $S$ on pairs of training images, we propose a scalable coordinate descent method to decompose $S$ into a product of $HH^T$ where $H$ is a matrix with each of its row being the approximate hash code associated to a training image. In the second stage, we propose to simultaneously learn a good feature representation for the input images as well as a set of hash functions, via a deep convolutional network tailored to the learned hash codes in $H$ or the discrete class labels of the images. Extensive empirical evaluations on three benchmark datasets with different kinds of images show that the proposed method has superior performance gains over several state-of-the-art supervised and unsupervised hashing methods.

09:40
Predicting Emotions in User-Generated Videos
SPEAKER: unknown

ABSTRACT. User-generated video collections are expanding rapidly in recent years, and systems for automatic analysis of these collections are in high demands. While extensive research efforts have been devoted to recognizing semantics like "birthday party" and "skiing", little attempts have been made to understand the emotions carried by the videos, e.g., "joy" and "sadness". In this paper, we propose a comprehensive computational framework for predicting emotions in user-generated videos. We first introduce a rigorously designed dataset collected from popular video-sharing websites with manual annotations, which can serve as a valuable benchmark for future research. A large set of features are extracted from this dataset, ranging from popular low-level visual descriptors, audio features, to high-level semantic attributes. Results of a comprehensive set of experiments indicate that combining multiple types of features---such as the joint use of the audio and visual clues---is important, and attribute features such as those containing sentiment-level semantics are very effective.

09:25-09:55 Session 40F: AAAI-14: Heuristic Search and Optimization
Location: Room 303B
09:25
Programming by Example using Least General Generalizations
SPEAKER: unknown

ABSTRACT. Programming-by-example (PBE) has recently seen important advances in the domain of text editing, but existing technology is restricted to transformations on relatively small unstructured text strings. In this paper we address structural/formatting transformations in richly formatted documents, using an approach based on the idea of least general generalizations from inductive inference, which avoids the scalability issues faced by state-of-the-art PBE methods. We describe a novel domain specific language (DSL) that can succinctly describe expressive transformations over XML structures (used for describing richly formatted content) and is equipped with a natural partial ordering between programs based on a subsumption relation. We then describe a synthesis algorithm that, given a set of input-output examples of XML structures, identifies a minimal DSL program that is consistent with the examples. We present experimental results over a benchmark of formatting tasks that we collected from online help forums, which show an average of 4.1 examples required for task completion in a few seconds.

09:40
Distribution-Aware Sampling and Weighted Model Counting for SAT
SPEAKER: unknown

ABSTRACT. Given a CNF formula and a weight for each assignment of values to variables, two natural problems are weighted model counting and distribution-aware sampling of satisfying assignments. Both problems have a wide variety of important applications. Due to the inherent complexity of the exact versions of the problems, interest has focused on solving them approximately. Prior work in this area scaled only to small problems in practice, or failed to provide strong theoretical guarantees, or employed a computationally-expensive maximum a poste- riori probability (MAP) oracle that assumes prior knowledge of a factored representation of the weight distribution. We present a novel approach that works with a black-box oracle for weights of assignments and requires only an {\NP}-oracle to solve both the counting and sampling problems. Our approach works under mild assumptions on the distribution of weights of satisfying assignments, provides strong theoretical guarantees, and scales to problems involving several thousand variables. We also show that the assumptions can be significantly relaxed if a factored representation of the weights is known.

09:55-10:15Coffee Break
10:15-11:15 Session 41A: AAAI-14 Invited Talk
Location: Hall 200A
10:15
Game Theory for Security: Key Algorithmic Principles, Deployed Applications, Research Challenges
SPEAKER: Milind Tambe

ABSTRACT. Security is a global concern, requiring efficient, randomized allocation and scheduling of limited security resources. To that end, we have used computational game theory to build decision aids for security agencies around the world. These decision aids are in use by agencies such as the US Coast Guard for protection of ports and ferry traffic, and the Federal Air Marshals Service and LAX police for protecting air traffic; our game-theoretic algorithms are also under evaluation for suppression of urban crime and for protection of wildlife and fisheries. I will overview my group's research in this growing area of security games.

10:15-11:45 Session 41B: IAAI-14: Transportation and Personalization
Location: Room 304A/B
10:15
Emerging: A Unified Framework for Augmented Reality and Knowledge-Based Systems in Maintaining Aircraft
SPEAKER: Geun-Sik Jo

ABSTRACT. Aircraft maintenance and training play one of the most important roles in ensuring flight safety. The maintenance process usually involves massive numbers of components and substantial procedural knowledge of maintenance procedures. Maintenance tasks require technicians to follow rigorous procedures to prevent operational errors in the maintenance process. In addition, the maintenance time is a cost-sensitive issue for airlines. This paper proposes intelligent augmented reality (IAR) system to minimize operation errors and time-related costs and help aircraft technicians cope with complex tasks by using an intuitive UI/UX interface for their maintenance tasks. The IAR system is composed mainly of three major modules: 1) the AR module 2) the knowledge-based system (KBS) module 3) a unified platform with an integrated UI/UX module between the AR and KBS modules. The AR module addresses vision-based tracking, annotation, and recognition. The KBS module deals with ontology-based resources and context management. Overall testing of the IAR system is conducted at Korea Air Lines (KAL) hangars. Tasks involving the removal and installation of pitch trimmers in landing gear are selected for benchmarking purposes, and according to the results, the proposed IAR system can help technicians to be more effective and accurate in performing their maintenance tasks.

10:35
Emerging: Optimizing a Start-Stop Controller Using Policy Search

ABSTRACT. We applied a policy search algorithm to the problem of optimizing a start-stop controller — a controller used in a car to turn off the vehicle's engine, and thus save energy, when the vehicle comes to a temporary halt. We were able to improve the existing policy by approximately 12% using real driver trace data. We also experimented with using multiple policies, and found that doing so could lead to a further 8% improvement if we could determine which policy to apply at each stop. The driver's behaviors before stopping were found to be uncorrelated with the policy that performed best; however, further experimentation showed that the driver's behavior during the stop may be more useful, suggesting a useful direction for adding complexity to the underlying start-stop policy.

10:55
Emerging: Advice Provision for Energy Saving in Automobile Climate Control Systems
SPEAKER: Amos Azaria

ABSTRACT. Reducing energy consumption of climate control systems is important in order to reduce human environmental footprint. The need to save energy becomes even greater when considering an electric car, since heavy use of the climate control system may exhaust the battery. In this paper we consider a method for an automated agent to provide advice to drivers which will motivate them to reduce the energy consumption of their climate control unit. Our approach takes into account both the energy consumption of the climate control system and the expected comfort level of the driver. We therefore build two models, one for assessing the energy consumption of the climate control system as a function of the system's settings, and the other, models human comfort level as a function of the climate control system's settings. Using these models, the agent provides advice to the driver considering how to set the climate control system. The agent advises settings which try to preserve a high level of comfort while consuming as little energy as possible. We empirically show that drivers equipped with our agent which provides them with advice significantly save energy as compared to drivers not equipped with our agent.

11:15
Emerging: StrokeBank: Automating Personalized Chinese Handwriting Generation
SPEAKER: Alfred Zong

ABSTRACT. Machine learning techniques have been successfully applied to Chinese character recognition; nonetheless, automatic generation of stylized Chinese handwriting remains a challenge. In this paper, we propose StrokeBank, a novel approach to automating personalized Chinese handwriting generation. We use a semi-supervised algorithm to construct a dictionary of component mappings from a small seeding set. Unlike previous work, our approach does not require human supervision in stroke extraction or knowledge of the structure of Chinese characters. This dictionary is used to generate handwriting that preserves stylistic variations, including cursiveness and spatial layout of strokes. We demonstrate the effectiveness of our model by a survey-based evaluation. The results show that our generated characters are nearly indistinguishable from ground truth handwritings.

11:20-12:20 Session 42A: AAAI-14: Novel Machine Learning Algorithms
Location: Room 301A
11:20
A Spatially Sensitive Kernel to Predict Cognitive Performance from Short-Term Changes in Neural Structure
SPEAKER: unknown

ABSTRACT. This paper introduces a novel framework for performing machine learning on longitudinal neuroimaging datasets. These datasets are characterized by their size, particularly their width (millions of features per input).

Specifically, we address the problem of detecting subtle, short-term changes in neural structure that are indicative of cognitive decline and correlate with risk factors for Alzheimer's disease. We introduce a new spatially-sensitive kernel that allows us to reason about individuals, as opposed to populations.

In doing so, this paper presents the first evidence demonstrating that very small changes in white matter structure over a two year period can predict change in cognitive function in healthy adults.

11:35
Online Classification Using a Voted RDA Method
SPEAKER: unknown

ABSTRACT. We propose a voted dual averaging method for online classification problems with explicit regularization. This method employs the update rule of the regularized dual averaging (RDA) method proposed by Xiao, but only on the subsequence of training examples where a classification error is made. We derive a bound on the number of mistakes made by this method on the training set, as well as its generalization error rate.We also introduce the concept of relative strength of regularization, and show how it affects the mistake bound and generalization performance. We examine the method using ℓ1-regularization on a large-scale natural language processing task, and obtained state-of-the-art classification performance with fairly sparse models.

11:50
Bagging by design (on the sub-optimality of bagging)
SPEAKER: unknown

ABSTRACT. Bagging (Breiman 1996) and its variants is one of the most popular methods in aggregating classifiers and regressors. Originally, its analysis assumed that the bootstraps are built from an unlimited, independent source of samples, therefore we call this form of bagging \emph{ideal-bagging}. However in the real world, base predictors are trained on data subsampled from a limited number of training samples and thus they behave very differently. We analyze the effect of intersections between bootstraps (obtained by subsampling) to train different base predictors. Most importantly, we provide an alternative subsampling method called \emph{design-bagging} based on a new construction of combinatorial designs, and prove it universally better than bagging. Methodologically, we succeed at this level of generality because we compare the bagging and design-bagging on their prediction accuracy each relative to the accuracy ideal-bagging.This can possibly find applications in more involved bagging-based ensemble methods. Our analytical results are backed up by experiments on classification and regression settings.

12:05
LASS: A Simple Assignment Model with Laplacian Smoothing
SPEAKER: unknown

ABSTRACT. We consider the problem of learning soft assignments of N items to K categories given two sources of information: an item-category similarity matrix, which encourages items to be assigned to categories they are similar to (and to not be assigned to categories they are dissimilar to), and an item-item similarity matrix, which encourages similar items to have similar assignments. We propose a simple quadratic programming model that captures this intuition. We give necessary conditions for its solution to be unique, define an out-of-sample mapping, and derive a simple, effective training algorithm based on the alternating direction method of multipliers. The model predicts reasonable assignments from even a few similarity values, and can be seen as a generalization of semisupervised learning. It is particularly useful when items naturally belong to multiple categories, as for example when annotating documents with keywords or pictures with tags, with partially tagged items, or when the categories have complex interrelations (e.g. hierarchical) that are unknown.

11:20-12:20 Session 42B: AAAI-14: NLP and Machine Learning
Location: Room 301B
11:20
Adaptive Multi-Compositionality for Recursive Neural Models with Applications to Sentiment Analysis
SPEAKER: unknown

ABSTRACT. Recursive neural models have achieved promising results in many natural language processing tasks. The main difference among these models lies in the composition function, i.e., how to obtain the vector representation for a phrase or sentence using the representations of words it contains. This paper introduces a novel Adaptive Multi-Compositionality (AdaMC) layer to recursive neural models. The basic idea is to use more than one composition functions and adaptively select them depending on the input vectors. We present a general framework to model each semantic composition as a distribution over these composition functions. The composition functions and parameters used for adaptive selection are learned jointly from data. We integrate AdaMC into existing recursive neural models and conduct extensive experiments on the Stanford Sentiment Treebank. The results illustrate that AdaMC significantly outperforms state-of-the-art sentiment classification methods. It helps push the best accuracy of sentence-level negative/positive classification from 85.4% up to 88.5%.

11:35
On Dataless Hierarchical Text Classification
SPEAKER: unknown

ABSTRACT. In this paper, we systematically study the problem of dataless hierarchical text classification. Unlike standard text classification schemes that rely on supervised training, dataless classification depends on understanding the labels of the sought after categories and requires no labeled data. Given a collection of text documents and a set of labels, we show that understanding the labels can be used to categorize the documents to the corresponding categories. This is done by embedding both labels and documents in a semantic space that allows one to compute meaningful semantic similarity between a document and a potential label. We show that this scheme can be used to support accurate multiclass classification without any supervision. We study several semantic representations and show how to improve the classification using bootstrapping methods. Our results show that bootstrapped dataless classification is competitive with supervised classification with thousands of labeled examples.

11:50
Instance-based Domain Adaptation in NLP via In-target-domain Logistic Approximation
SPEAKER: unknown

ABSTRACT. In the field of NLP, most of the existing domain adaptation studies belong to the feature-based adaptation, while the research of instance-based adaptation is very scarce. In this work, we propose a new instance-based adaptation model, called in-target-domain logistic approximation (ILA). In ILA, we adapt the source-domain data to the target domain by a logistic approximation. The normalized in-target-domain probability is assigned as an instance weight to each of the source-domain training data. An instance-weighted classification model is trained finally for the cross-domain classification problem. Compared to the previous techniques, ILA conducts instance adaptation in a dimensionality-reduced linear feature space to ensure efficiency in high-dimensional NLP tasks. The instance weights in ILA are learnt by leveraging the criteria of both maximum likelihood and minimum statistical distance. The empirical results on two NLP tasks including text categorization and sentiment classification show that our ILA model beats the state-of-the-art instance adaptation methods significantly, in cross-domain classification accuracy, parameter stability and computational efficiency.

12:05
Semi-supervised Matrix Completion for Cross-Lingual Text Classification
SPEAKER: unknown

ABSTRACT. Cross-lingual text classification is the task of assigning labels to a given document in a label-scarce target language by using a prediction model trained with labeled documents from a label-rich source language, which is popularly studied in the natural language processing area as it can largely decrease the expensive manual annotation effort in the target language. In this work, we proposed a novel semi-supervised representation learning approach to address this challenging task, which discovers interlingual features by simultaneously performing semi-supervised matrix completion. To evaluate the proposed learning technique, we conducted extensive experiments on eighteen cross language sentiment classification tasks with four different languages. The empirical results demonstrated the efficacy of our approach and outperformed the other comparison methods.

11:20-12:20 Session 42C: AAAI-14: Search and Constraint Satisfaction
Location: Room 302A
11:20
Maximum Satisfiability using core-guided MaxSAT Resolution
SPEAKER: unknown

ABSTRACT. Core-guided approaches to solving MaxSat have proved to be effective on industrial problems containing hard clauses and weighted soft clauses (weighted partial MaxSat or WPM). These approaches solve WPM problems by building a sequence of new WPM formulas, where in each formula a greater weight of soft clauses can be relaxed. Relaxation of the soft clauses is achieved via the addition of blocking variables to the soft clauses, along with constraints on these blocking variables. In this work we propose an alternative approach. Our approach also builds a sequence of new WPM formulas. However, these formulas are constructed using MaxSat resolution, a sound rule of inference for MaxSat. MaxSat resolution can in the worst case cause a quadratic blowup in the formula, so we propose a new compressed version of MaxSat resolution. Using compressed MaxSat resolution our new core-guided solver improves the state-of-the-art solving significantly more problems than other state-of-the-art solvers on the industrial benchmarks used in the 2013 MaxSat Solver Evaluation.

11:35
Adaptive Singleton-based Consistencies
SPEAKER: unknown

ABSTRACT. Singleton-based consistencies have been shown to dramatically improve the performance of constraint solvers on some difficult instances. However, they are in general too expensive to be applied exhaustively during the whole search. In this paper, we focus on partition-one-AC, a singleton-based consistency which, as opposed to singleton arc consistency, is able to prune values on all variables at each singleton test. We propose adaptive variants of partition-one-AC that do not necessarily run until having proved the fixpoint. The pruning can be weaker than the full version but the computational effort can be significantly reduced. Our experiments show that adaptive Partition-one-AC can obtain significant speedups over arc consistency and over the full version of partition-one-AC.

11:50
A Reasoner for the RCC-5 and RCC-8 Calculi Extended with Constants
SPEAKER: unknown

ABSTRACT. The problem of checking the consistency in qualitative calculi that contain both unknown and known entities (constants, i.e., real geometries) has recently appeared and has applications in many areas. Until now, all the approaches are theoretical and no implementation has been proposed. In this paper we present the first reasoner that takes as input RCC-5 or RCC-8 networks that involve entities with specific geometries and decides their consistency. We investigate the performance of the reasoner and contrary to lots of other works in this area we consider real datasets in our experimental analysis.

12:05
Fast Consistency Checking of Very Large Real-World RCC-8 Constraint Networks Using Graph Partitioning
SPEAKER: unknown

ABSTRACT. We present a new reasoner for RCC-8 constraint networks, called gp-rcc8, that is based on the patchwork property of path-consistent tractable RCC-8 networks and graph partitioning. We compare gp-rcc8 with state of the art reasoners that are based on constraint propagation and backtracking search as well as one that is based on graph partitioning and SAT solving. Our evaluation considers very large real-world RCC-8 networks and medium-sized synthetic ones, and shows that gp-rcc8 outperforms the other reasoners for these networks, while it is less efficient for smaller networks.

11:20-12:20 Session 42D: AAAI-14: AI and the Web/NLP and Text Mining
Location: Room 302B
11:20
Acquiring Comparative Commonsense Knowledge from the Web
SPEAKER: unknown

ABSTRACT. This paper presents a method for automatically constructing a large comparative commonsense knowledge base from Big Data. The resulting knowledge base is semantically refined and organized. Our method is based on linear optimization methods to clean and consolidate the noisy input knowledge, while also inferring new information. Our method achieves a high precision while maintaining good coverage.

11:35
Emotion Classification in Microblog Texts Using Class Sequential Rules
SPEAKER: unknown

ABSTRACT. This paper studies the problem of emotion classification in microblog texts. Given a microblog text which consists of several sentences, we classify its emotion as anger, disgust, fear, happiness, like, sadness or surprise if possible. Existing methods can be categorized as lexicon based methods or machine learning based methods. However, due to some intrinsic characteristics of the microblog texts, previous studies using these methods always get unsatisfactory results. This paper introduces a novel approach based on class sequential rules for emotion classification of microblog texts. The approach first obtains two potential emotion labels for each sentence in a microblog text by using an emotion lexicon and a machine learning approach respectively, and regards each microblog text as a data sequence. It then mines class sequential rules from the sequence set and finally derives new features from the mined rules for emotion classification of microblog texts. Experimental results on a Chinese benchmark dataset show the superior performance of the proposed approach.

11:50
SUIT: A Supervised User-Item based Topic model for Sentiment Analysis
SPEAKER: unknown

ABSTRACT. Topic models have been widely used for sentiment analysis. Previous studies shows that supervised topic model can better model the sentiment expressions. However most of existing topic methods only model the sentiment text information, but do not consider the user, who expressed the sentiment, and the item, which the sentiment is expressed on. Since different users may use different sentiment expressions for different items, we argue that it is better to incorporate the user and item information into the topic model for sentiment analysis. In this paper, we propose a novel Supervised User-Item based Topic model, called SUIT model, for sentiment analysis. It can simultaneously utilize the textual topic and latent user-item factors. Our proposed method uses the tensor outer product of text topic proportion vector, user latent factor and item latent factor to model the sentiment label generalization. Extensive experiments are conducted on two datasets: review dataset and microblog dataset. The results demonstrate the advantages of our model. It shows significant improvement compared with supervised topic models and collaborative filtering methods.

12:05
Where and Why Users "Check In"
SPEAKER: unknown

ABSTRACT. The emergence of location based social network (LBSN) services makes it possible to study individuals’ mobility patterns at a fine-grained level and to see how they are impacted by social factors. In this study we analyze the check-in patterns in LBSN and observe significant temporal clustering of check-in activities. We explore how self-reinforcing behaviors, social factors, and exogenous effects contribute to this clustering and introduce a framework to distinguish these effects at the level of individual check-ins for both users and venues. Using check-in data from three major cities, we show not only that our model can improve prediction of future check-ins, but also that disentangling of different factors allows us to infer meaningful properties of different venues.

11:20-12:20 Session 42E: AAAI-14: Computational Sustainability and AI
Location: Room 303A
11:20
Spatial Scan for Disease Mapping on a Mobile Population
SPEAKER: unknown

ABSTRACT. Spatial scan statistics is used for discovery of spatial regions with significantly higher scores according to some density measure. In disease surveillance, spatial scan is a standard tool to detect spatial regions whose population has significantly higher disease risk than the overall population. In this important application, called the disease mapping, current residence is typically used to define the location of individuals from the population. Considering the mobility of humans at various temporal and spatial scales, using only information about the current residence can be insufficient because it ignores a multitude of exposures that occur away from home or which had occurred at previous residences. In this paper, we propose a novel spatial scan statistic that allows disease mapping in a mobile population. We also propose a computationally efficient disease mapping algorithm that uses the proposed statistic to find the significant high-risk spatial regions. The experimental results demonstrate that the proposed algorithm is superior to the traditional disease mapping algorithms in discovering high-risk regions in mobile populations. Moreover, the algorithm is applicable on large populations and over dense spatial grids.

11:35
Rounded Dynamic Programming for Tree-Structured Stochastic Network Design
SPEAKER: unknown

ABSTRACT. We develop a fast approximation algorithm called rounded dynamic programming (RDP) for stochastic network design problems on directed trees. The underlying model describes phenomena that spread away from the root of a tree, for example, the spread of influence in a hierarchical organization or fish in a river network. Actions can be taken to intervene in the network—for some cost—to increase the probability of propagation along an edge. Our algorithm selects a set of actions to maximize the overall spread in the network under a limited budget. We prove that the algorithm is a fully polynomial-time approximation scheme (FPTAS), that is, it finds (1−ε)-optimal solutions in time polynomial in the input size and 1/ε. We apply the algorithm to an important motivating problem in Computational Sustainability: that of efficiently allocating funds to remove barriers in a river network so fish can reach greater portions of their native range. Our experiments show that our algorithm is able to produce near- optimal solutions much faster than an existing technique.

11:50
Efficient Buyer Groups for Prediction-of-Use Electricity Tariffs
SPEAKER: unknown

ABSTRACT. Current electricity tariffs do not reflect the real cost that customers incur to suppliers, as units are charged at the same rate, regardless of how predictable each customer's consumption is. A recent proposal to address this problem are prediction-of-use tariffs. In such tariffs, a customer is asked in advance to predict her future consumption, and is charged based both on her actual consumption and the deviation from her prediction. Prior work studied the cost game induced by a single such tariff, and showed consumers would have an incentive to minimize their risk, by joining together when buying electricity as a grand coalition. In this work we study the efficient (i.e. cost-minimizing) structure of buying groups for the more realistic setting when multiple, competing prediction-of-use tariffs are available. We propose a polynomial time algorithm to compute efficient buyer groups, and validate our approach experimentally, using a large-scale data set of domestic electricity consumers in the UK.

12:05
Effective Management of Electric Vehicle Storage using Smart Charging
SPEAKER: unknown

ABSTRACT. The growing Electric Vehicles' (EVs) popularity among commuters creates new challenges for the smart grid. The most important of them is the uncoordinated EV charging that substantially increases the energy demand peaks, putting the smart grid under constant strain. In order to cope with these peaks the grid needs extra infrastructure, a costly solution. We propose an Adaptive Management of EV Storage (AMEVS) algorithm, implemented through a learning agent that acts on behalf of individual EV owners and schedules EV charging over a weekly horizon. It accounts for individual preferences so that mobility service is not violated but also individual benefit is maximized. We observe that it reshapes the energy demand making it less volatile so that fewer resources are needed to cover peaks. It assumes Vehicle-to-Grid discharging when the customer has excess capacity. Our agent uses Reinforcement Learning trained on real world data to learn individual household consumption behavior and to schedule EV charging. Unlike previous work, AMEVS is a fully distributed approach. We show that AMEVS achieves significant reshaping of the energy demand curve and peak reduction, which is correlated with customer preferences regarding perceived utility of energy availability. Additionally, we show that the average and peak energy prices are reduced as a result of smarter energy use.

11:20-12:20 Session 42F: AAAI-14: Multiagent Systems / Game Theory and Economic Paradigms
Location: Room 303B
11:20
Multi-Organ Exchange: The Whole is Greater than the Sum of its Parts
SPEAKER: unknown

ABSTRACT. Kidney exchange, where candidates with organ failure trade incompatible but willing donors, is a life-saving alternative to the deceased donor waitlist, which has inadequate supply to meet demand. While fielded kidney exchanges see huge benefit from altruistic kidney donors (who give an organ without a paired needy candidate), a significantly higher medical risk to the donor deters similar altruism with livers. In this paper, we begin by proposing the idea of liver exchange, and show on demographically accurate data that vetted kidney exchange algorithms can be adapted to clear such an exchange at the nationwide level. We then explore cross-organ donation where kidneys and livers can be bartered for each other. We show theoretically that this multi-organ exchange provides linearly more transplants than running separate kidney and liver exchanges; this linear gain is a product of altruistic kidney donors creating chains that thread through the liver pool. We support this result experimentally on demographically accurate multi-organ exchanges. We conclude with thoughts regarding the fielding of a nationwide liver or joint liver-kidney exchange from a legal and computational point of view.

11:35
The Computational Rise and Fall of Fairness
SPEAKER: unknown

ABSTRACT. The fair division of indivisible goods has long been an important topic in economics and, more recently, computer science. We investigate the existence of envy-free allocations of indivisible goods, that is, allocations where each player values her own allocated set of goods at least as highly as any other player's allocated set of goods. Under additive valuations, we show that even when the number of goods is larger than the number of agents by a linear fraction, envy-free allocations are unlikely to exist. We then show that when the number of goods is larger by a logarithmic factor, such allocations exist with high probability. We support these results experimentally and show that the asymptotic behavior of the theory holds even when the number of goods and agents is quite small. We demonstrate that there is a sharp phase transition from nonexistence to existence of envy-free allocations, and that on average the computational problem is hardest at that transition.

11:50
Online (Budgeted) Social Choice
SPEAKER: unknown

ABSTRACT. We consider a classic social choice problem in an online setting. In each round, a decision maker observes a single agent's preferences over a set of $m$ candidates, and must choose whether to irrevocably add a candidate to a selection set of limited cardinality $k$. Each agent's (positional) score depends on the candidates in the set when he arrives, and the decision-maker's goal is to maximize average (over all agents) score.

We prove that no algorithm (even randomized) can achieve an approximation factor better than $O(\log\log m/ \log m)$. In contrast, if the agents arrive in random order, we present a $(1 - 1/e - o(1))$-approximate algorithm, matching a lower bound for the offline problem. We show that improved performance is possible for natural input distributions or scoring rules.

Finally, if the algorithm is permitted to revoke decisions at a fixed cost, we apply regret-minimization techniques to achieve approximation $1 - 1/e - o(1)$ even for arbitrary inputs.

12:05
Scalable Complex Contract Negotiation With Structured Search and Agenda Management
SPEAKER: unknown

ABSTRACT. A large number of interdependent issues in complex contract poses a challenge for current negotiation approaches, which becomes even more apparent when negotiation problems scale up. To address this challenge, we present a structured anytime search process with agenda management mechanism using a hierarchical negotiation model, where agents search at various levels during the negotiation with the guidance from a mediator agent. This structured negotiation process increases computational efficiency, making negotiations scalable for large number of interdependent issues. To validate the contributions of our approach, 1) we developed anytime tree search negotiation process with an agenda management mechanism using a hierarchical problem structure and constraint-based preference model for real-world applications; 2) we defined a scenario matrix to capture various characteristics of negotiation scenarios and developed a scenario generator that produces testing cases according to this matrix; and 3) we performed an extensive set of experiments to study the performance of this structured negotiation protocol and the influence of different scenario parameters, and investigated the Pareto efficiency and social welfare optimality of the negotiation outcomes.

12:20-13:30Lunch Break
12:30-13:00 Session 43: AAAI Community Meeting
Location: Room 301A
13:30-14:30 Session 44: AAAI-14 Invited Talk
Location: Hall 200A
13:30
30 Year of Probability in AI and Machine Learning
14:35-15:35 Session 45A: AAAI-14: Novel Machine Learning Algorithms
Location: Room 301A
14:35
Reconsidering Mutual Information Based Feature Selection: A Statistical Significance View
SPEAKER: unknown

ABSTRACT. Mutual information (MI) based approaches are an important feature selection paradigm. Although the stated goal of MI-based feature selection is to identify a subset of features that share the highest mutual information with the class variable, most current MI-based techniques are greedy methods that make use of low dimensional MI quantities. The reason for using low dimensional approximation has been mostly attributed to the difficulty associated with estimating the high dimensional MI from limited samples. In this paper, we argue a different viewpoint that, given a very large amount of data, the high dimensional MI objective is still problematic to be employed as a meaningful optimization criterion, due to its overfitting nature: the MI almost always increases as more features are added, thus leading to a trivial solution which includes all features. We propose a novel approach to the MI-based feature selection problem, in which the overfitting phenomenon is controlled rigourously by means of a statistical test. We develop local and global optimization algorithms for this new feature selection model, and demonstrate its effectiveness in the applications of explaining variables and objects.

14:50
Efficient Generalized Fused Lasso and Its Application to the Diagnosis of Alzheimer’s Disease
SPEAKER: unknown

ABSTRACT. Generalized Fused Lasso (GFL) penalizes variables with L1 norms both on variables and their pairwise differences. GFL is useful when applied to data where prior information expressed with a graph is available in the domain. However, the existing algorithms for GFL take high computational cost and do not scale to large size problems. In this paper, we propose a fast and scalable algorithm for GFL. Based on the fact that the fusion penalty is the Lov´asz extension of a cut function, we show that the key building block is equivalent to recursively solving graph cut problems. We then solve GFL efficiently via a parametric flow algorithm. Runtime comparison demonstrate a significant speed-up over existing algorithms for GFL. Benefited from the scalability of our algorithm, we propose to formulate the diagnosis of Alzheimer’s Disease as GFL. Experiments demonstrate that GFL seems to be a natural way to formulate such a problem. Not only is the diagnosis performance promising, but the selected critical voxels are well structured, consistent across tasks and in accordance with clinical prior knowledge.

15:05
The Role of Dimensionality Reduction in Linear Classification
SPEAKER: unknown

ABSTRACT. Dimensionality reduction (DR) is often used as a preprocessing step in classification, but usually one first fixes the DR mapping, possibly using label information, and then learns a classifier (a filter approach). Best performance would be obtained by optimizing the classification error jointly over DR mapping and classifier (a wrapper approach), but this is a difficult nonconvex problem, particularly with nonlinear DR. Using the method of auxiliary coordinates, we give a simple, efficient algorithm to train a combination of nonlinear DR and a classifier, and apply it to a RBF mapping with a linear SVM. This alternates steps where we train the RBF mapping and a linear SVM as usual regression and classification, respectively, with a closed-form step that coordinates both. The resulting nonlinear low-dimensional classifier achieves classification errors competitive with the state-of-the-art but is fast at training and testing, and allows the user to trade off runtime for classification accuracy easily. We then study the role of nonlinear DR in linear classification, and the interplay between the DR mapping, the number of latent dimensions and the number of classes. When trained jointly, the DR mapping takes an extreme role in eliminating variation: it tends to collapse classes in latent space, erasing all manifold structure, and lay out class centroids so they are linearly separable with maximum margin.

15:20
Deep Modeling of Group Preferences for Group-based Recommendation
SPEAKER: unknown

ABSTRACT. Nowadays, most recommender systems (RSs) mainly aim to suggest appropriate items for individuals. Due to the social nature of human beings, group activities have become an integral part of our daily life, thus motivating the study on group RS (GRS). However, most existing methods used by GRS make recommendations through aggregating individual ratings or individual predictive results rather than considering the collective features that govern user choices made within a group. As a result, such methods are heavily sensitive to data, hence they often fail to learn group preferences when the data are slightly inconsistent with predefined aggregation assumptions. To this end, we devise a novel GRS approach which accommodates both individual choices and group decisions in a joint model. More specifically, we propose a deep-architecture model built with a collective deep belief network and dual-wing restricted Boltzmann machine. With such a deep model, we can use high-level features, which are induced from lower-level features, to represent group preference so as to relieve the vulnerability of data. Finally, the experiments conducted on a real-world dataset prove the superiority of our deep model over other state-of-the-art methods.

14:35-15:35 Session 45B: AAAI-14: Humans and AI
Location: Room 301B
14:35
Dramatis: A Computational Model of Suspense
SPEAKER: unknown

ABSTRACT. We introduce Dramatis, a computational model of suspense based on a reformulation of a psychological definition of the suspense phenomenon. In this reformulation, suspense is correlated with the audience’s ability to generate a plan for the protagonist to avoid an impending negative outcome. Dramatis measures the suspense level by generating such a plan and determining its perceived likelihood of success. We report on three evaluations of Dramatis, including a comparison of Dramatis output to the suspense reported by human readers, as well as ablative tests of Dramatis components. In these studies, we found that Dramatis output corresponded to the suspense ratings given by human readers for stories in three separate domains.

14:50
Ordering Effects and Belief Adjustment in the Use of Comparison Shopping Agents
SPEAKER: unknown

ABSTRACT. The popularity of online shopping has contributed to the development of comparison shopping agents (CSAs) aiming to facilitate buyers' ability to compare prices of online stores for any desired product. Furthermore, the plethora of CSAs in today’s markets enables buyers to query more than a single CSA when shopping, thus expanding even further the list of sellers whose prices they obtain. This potentially decreases the chance of a purchase based on the prices outputted as a result of any single query, and consequently decreases each CSAs’ expected revenue per-query. Obviously, a CSA can improve its competence in such settings by acquiring more sellers’ prices, potentially resulting in a more attractive ``best price''. In this paper we suggest a complementary approach that improves the attractiveness of a CSA by presenting the prices to the user in a specific intelligent manner, which is based on known cognitive-biases. The advantage of this approach is its ability to affect the buyer’s tendency to terminate her search for a better price, hence avoid querying further CSAs, without having the CSA spend any of its resources on finding better prices to present. The effectiveness of our method is demonstrated using real data, collected from four CSAs for five products. Our experiments with people confirm that the suggested method effectively influence people in a way that is highly advantageous to the CSA.

15:05
A Strategy-Aware Technique for Learning Behaviors from Discrete Human Feedback
SPEAKER: unknown

ABSTRACT. This paper introduces two novel algorithms, SABL and I-SABL, for learning behaviors from human-provided rewards. The primary novelty of these algorithms is that instead of treating the feedback as a numeric reward signal, they interpret feedback as a form of discrete communication that depends on both the behavior the trainer is trying to teach and the teaching strategy used by the trainer. For example, some humans use a strategy where the lack of feedback may indicate whether the action was correct or incorrect, and interpreting this lack of feedback accurately can significantly improve learning speed. Results from user studies show that 1) humans use a variety of training strategies in practice, and 2) both algorithms can successfully learn a contextual bandit task faster than approaches that treat the feedback as numeric. Additionally, simulated trainers are employed to evaluate the algorithms in both contextual bandit and sequential decision-making domains with similar results.

15:20
Discovering Better AAAI Keywords via Clustering with Crowd-sourced Constraints
SPEAKER: unknown

ABSTRACT. Selecting good conference keywords is important because they often determine the composition of review committees and hence which papers are reviewed by whom. But presently conference keywords are generated in an ad-hoc manner by a small set of conference organizers. This approach is plainly not ideal. There is no guarantee, for example, that the generated keyword set aligns with what the community is actually working on and submitting to the conference in a given year. This is especially true in fast moving fields such as AI. The problem is exacerbated by the tendency of organizers to draw heavily on preceding years' keyword lists when generating a new set. Rather than a select few ordaining a keyword set that that represents AI at large, it would be preferable to generate these keywords more directly from the data, with input from research community members. To this end, we solicited feedback from seven AAAI PC members regarding a previously existing keyword set and used these 'crowd-sourced constraints' to inform a clustering over the abstracts of all submissions to AAAI 2013. We show that the keywords discovered via this data-driven, human-in-the-loop method are at least as preferred (by AAAI PC members) as 2013's manually generated set, and that they include categories previously overlooked by organizers. Many of the discovered terms were used for this year's conference.

14:35-15:35 Session 45C: AAAI-14: Game Theory and Economic Paradigms
Location: Room 302A
14:35
A Generalization of Probabilistic Serial to Randomized Social Choice
SPEAKER: unknown

ABSTRACT. The probabilistic serial (PS) rule is one of the most well-established and desirable rules for the random assignment problem. We present the egalitarian simultaneous reservation (ESR) social decision scheme — an extension of PS to the more general setting of randomized social choice. ESR also generalizes an egalitarian rule from the literature which is defined only for dichotomous preferences. We consider various desirable fairness, efficiency, and strategic properties of ESR and show that it compares favourably against other social decision schemes. Finally, we define a more general class of social decision schemes called Simultaneous Reservation (SR), that contains ESR as well as the serial dictatorship rules. We show that outcomes of SR characterize efficiency with respect to a natural refinement of stochastic dominance.

14:50
Biased Games
SPEAKER: unknown

ABSTRACT. We present a novel extension of normal form games that we call biased games. In these games, a player's utility is influenced by the distance between his mixed strategy and a given base strategy. We argue that biased games capture important aspects of the interaction between software agents. Our main result is that biased games satisfying certain mild conditions always admit an equilibrium. We also tackle the computation of equilibria in biased games.

15:05
Simultaneous Cake Cutting
SPEAKER: unknown

ABSTRACT. We introduce the simultaneous model for cake cutting (the fair allocation of a divisible good), in which agents simultaneously send messages containing a sketch of their preferences over the cake. We show that this model enables the computation of divisions that satisfy proportionality — a popular fairness notion — using a protocol that circumvents a standard lower bound via parallel information elicitation. Cake divisions satisfying another prominent fairness notion, envy-freeness, are impossible to compute in the simultaneous model, but such allocations admit arbitrarily good approximations.

15:20
Incomplete Preferences in Single-Peaked Electorates

ABSTRACT. Incomplete preferences are likely to arise in real-world preference aggregation and voting systems. This paper deals with determining whether an incomplete preference profile is single-peaked. This is essential information since many intractable voting problems become tractable for single-peaked profiles. We prove that for incomplete profiles the problem of determining single-peakedness is NP-complete. Despite this computational hardness result, we find four polynomial-time algorithms for reasonably restricted settings.

14:35-15:35 Session 45D: AAAI-14: Planning and Scheduling
Location: Room 302B
14:35
Solving Uncertain MDPs by Reusing State Information and Plans
SPEAKER: unknown

ABSTRACT. While Markov decision processes (MDPs) are powerful tools for modeling sequential decision making problems under uncertainty, they are sensitive to the accuracy of their parameters. MDPs with uncertainty in their parameters are called Uncertain MDPs. In this paper, we introduce a general framework that allows off-the-shelf MDP algorithms to solve Uncertain MDPs by planning based on currently available information and replan if and when the problem changes. We demonstrate the generality of this approach by showing that it can use the VI, TVI, ILAO*, LRTDP, and UCT algorithms to solve Uncertain MDPs. We experimentally show that our approach is typically faster than replanning from scratch and we also provide a way to estimate the amount of speedup based on the amount of information is reused.

14:50
Structured Possibilistic Planning using Decision Diagrams
SPEAKER: unknown

ABSTRACT. Qualitative Possibilistic Mixed-Observable MDPs ($\pi$-MOMDPs), generalizing $\pi$-MDPs and $\pi$-POMDPs, are well-suited models to planning under uncertainty with mixed-observability when transition, observation and reward functions are not precisely known. Functions defining the model as well as intermediate calculations are valued in a finite possibilistic scale $\mathcal{L}$, which induces a \emph{finite} belief state space under partial observability contrary to its probabilistic counterpart. In this paper, we propose the first study of factored $\pi$-MOMDP models in order to solve large structured planning problems under \emph{imprecise} uncertainty, or considered as qualitative approximations of probabilistic problems. Building upon the SPUDD algorithm for solving factored (probabilistic) MDPs, we conceived a symbolic algorithm named PPUDD for solving factored $\pi$-MOMDPs. Whereas SPUDD's decision diagrams' leaves may be as large as the state space since their values are real numbers aggregated through additions and multiplications, PPUDD's ones always remain in the finite scale $\echL$ via $\min$ and $\max$ operations only. Our experiments show that PPUDD's computation time is much lower than SPUDD, Symbolic-HSVI and APPL for possibilistic and probabilistic versions of the same benchmarks under either total or mixed observability, while providing high-quality policies.

15:05
A Simple Polynomial-Time Randomized Distributed Algorithm for Connected Row Convex Constraints
SPEAKER: unknown

ABSTRACT. In this paper, we describe a simple randomized algorithm that runs in polynomial time and solves connected row convex (CRC) constraints in a distributed setting. CRC constraints generalize many known tractable classes of constraints like 2-SAT and implicational constraints. They can model problems in many domains including temporal reasoning and geometric reasoning; and generally speaking, play the role of ``Gaussians'' in the logical world. Our simple randomized algorithm for solving them in distributed settings, therefore, has a number of important applications. We support our claims through empirical results. We also generalize our algorithm to tractable classes of tree convex constraints.

15:20
Chance-constrained Probabilistic Simple Temporal Problems
SPEAKER: unknown

ABSTRACT. Robust scheduling is essential to many autonomous systems and logistics tasks. Probabilistic formalisms quantify the risk of schedule failure, which is essential for mission critical applications. Probabilistic methods for solving temporal problems exist that attempt to minimize the probability of schedule failure. These methods are overly conservative, resulting in a loss in schedule utility. Chance constrained formalism address this problem by imposing bounds on risk, while maximizing utility subject to these risk bounds.

In this paper we present the probabilistic Simple Temporal Network (pSTN), a probabilistic formalism for representing temporal problems with bounded risk and a utility over event timing. We introduce a constrained optimisation algorithm for pSTNs that achieves compactness and efficiency through a problem encoding in terms of a parameterised STNU and its reformulation as a parameterised STN. We demonstrate through a car sharing application that our chance-constrained approach runs in the same time as the previous probabilistic approach, yields solutions with utility improvements of at least 5% over previous arts, while guaranteeing operation within the specified risk bound.

14:35-15:35 Session 45E: AAAI-14: Applications
Location: Room 303A
14:35
A Hybrid Grammar-Based Approach for Learning and Recognizing Natural Hand Gestures
SPEAKER: unknown

ABSTRACT. In this paper, we present an approach to learn structured models of gesture performances that allow for a compressed representation and robust recognition of natural iconic gestures. We analyze a dataset of iconic gestures and show how the proposed hybrid grammar formalism can generalize over both structural and feature-based variations among different gesture performances.

14:50
A Machine Learning Approach to Musically Meaningful Homogeneous Style Classification
SPEAKER: unknown

ABSTRACT. Recent literature has demonstrated the difficulty of classifying between composers who write in extremely similar styles (homogeneous style). Additionally, machine learning studies in this field have been exclusively of technical import with little musicological interpretability or significance. We present a supervised machine learning system which addresses the difficulty of differentiating between stylistically homogeneous composers using foundational elements of music, their complexity and interaction. Our work expands on previous style classification studies by developing more complex features as well as introducing a new class of musical features which focus on local irregularities within musical scores. We demonstrate the discriminative power of the system as applied to Haydn and Mozart's string quartets. Our results yield interpretable musicological conclusions about Haydn's and Mozart's stylistic differences while distinguishing between the composers with higher accuracy than previous studies in this domain.

15:05
Forecasting Potential Diabetes Complications
SPEAKER: unknown

ABSTRACT. Diabetes complications often afflict diabetes patients seriously: over 68% of diabetes-related mortality is caused by diabetes complications. In this paper, we study the problem of automatically diagnosing diabetes complications from patients’ lab test results. The objective problem has two main challenges: 1) feature sparseness: a patient only takes 1:26% lab tests on average, and 65:5% types of lab tests are taken by less than 10 patients; 2) knowledge skewness: it lacks comprehensive detailed domain knowledge of association between diabetes complications and lab tests. To address these challenges, we propose a novel probabilistic model called Sparse Factor Graph Model (SparseFGM). SparseFGM projects sparse features onto a lower-dimensional latent space, which alleviates the problem of sparseness. SparseFGM is also able to capture the associations between complications and lab tests, which help handle the knowledge skewness. We evaluate the proposed model on a large collections of real medical records. SparseFGM outperforms (+20% by F1) baselines significantly and gives detailed associations between diabetes complications and lab tests.

14:35-15:35 Session 45F: AAAI-14: Robotics
Location: Room 303B
14:35
Efficient Optimization for Autonomous Manipulation of Natural Objects
SPEAKER: unknown

ABSTRACT. Manipulating irregular natural objects, such as rocks, is an essential capability of robots operating in outdoor environments. Previous studies have shown that stable grasps for known, man-made, objects can usually be planned by using physics-based simulators. However, planning is an expensive process that requires simulation of hand and object trajectories in different configurations, and evaluating the outcome of each trajectory. This problem is particularly concerning when the objects are irregular and cluttered, because the space of stable grasps is significantly smaller, and more configurations need to be evaluated before finding a good one. We present a learning approach, based on template matching, for fast detection of a small initial set of potentially stable grasps in a cluttered scene, using depth features. The predicted best grasps are further optimized by fine-tunning the configuration of the hand in simulation. To reduce the computational cost of this last operation, we model the predicted outcomes of the grasps as a Gaussian Process, and use an entropy-search method in order to focus the optimization on regions where the best grasp configuration is most likely to be. This approach is tested on the challenging task of clearing piles of real, unknown, rock debris using an autonomous robot. Empirical results show a clear advantage of this approach.

14:50
Qualitative Planning with Quantitative Constraints for Online Learning of Robotic Behaviours
SPEAKER: unknown

ABSTRACT. This paper resolves previous problems in the Multi-Strategy architecture for online learning of robotic behaviours. The hybrid method includes a symbolic qualitative planner that constructs an approximate solution to a control problem. The approximate solution provides constraints for a numerical optimisation algorithm, which is used to refine the qualitative plan into an operational policy. Introducing quantitative constraints into the planner gives previously unachievable domain independent reasoning. The method is demonstrated on a multi-tracked robot intended for urban search and rescue.

15:05
Learning from Unscripted Deictic Gesture and Language for Human-Robot Interactions
SPEAKER: unknown

ABSTRACT. As robots become more ubiquitous, it is increasingly important for untrained users to be able to interact with them intuitively. In this work, we investigate how people refer to objects in the world during relatively unstructured communication with robots. We collect a corpus of interactions from users describing objects, which we use to train language and gesture models that allow our robot to determine what objects are being indicated. We introduce a temporal extension to state-of-the-art hierarchical matching pursuit features to support gesture understanding, and demonstrate that combining multiple communication modalities more effectively captures user intent than relying on a single type of input. Finally, we present initial interactions with a robot that uses the learned models to follow commands while continuing to learn from user input.

15:20
Minimising Undesired Task Costs in Multi-robot Task Allocation Problems with In-Schedule Dependencies
SPEAKER: unknown

ABSTRACT. In multi-robot task allocation problems with in-schedule dependencies, tasks with high costs have a large influence on the overall time for a robot team to complete all tasks. We reduce this influence by calculating a novel task cost dispersion value which measures robots' collective preference for each task. By modifying the winner determination phase of sequential single-item auctions, our approach inspects the bids for every task to identify tasks which robots collectively consider to be high cost and we ensure these tasks are allocated before other tasks. We show empirically this method reduces the overall time taken to complete all tasks.

15:35-16:00Coffee Break
15:35-16:30 Session 46: AAAI-14 Poster Session III:E
Location: Hall 200C
15:35
Chinese Zero Pronoun Resolution: An Unsupervised Approach Combining Ranking and Integer Linear Programming
SPEAKER: unknown

ABSTRACT. Compared to overt pronoun resolution, there is less work on the more challenging task of zero pronoun resolution. State-of-the-art approaches to zero pronoun resolution are supervised, requiring the availability of documents containing manually resolved zero pronouns. In contrast, we propose in this paper an unsupervised approach to this task. Underlying our approach is the novel idea of employing a model trained on manually resolved overt pronouns to resolve zero pronouns. Experimental results on the OntoNotes corpus are encouraging: our unsupervised model rivals its supervised counterparts in performance.

15:35
Contraction and Revision over DL-Lite TBoxes
SPEAKER: unknown

ABSTRACT. An essential task in managing DL ontologies is to deal with changes over the ontologies. In particular, outdated axioms have to be removed from the ontology and newly formed axioms have to be incorporated into the ontology. Such changes are formalised as the operations of contraction and revision in the literatures. The operations can be defined in various ways. To investigate properties of a defined operation, it is best to identify some postulates that completely characterise the operation such that on the one hand the operation satisfies the postulates and on the other hand it is the only operation that satisfies all the postulates. Such characterisation results have never been shown for contractions under DLs. In this paper, we define model-based contraction and revision for DL-Lite$_{core}$ TBoxes and provide characterisation results for both operations. As a first step for applying the operations in practice, we also provide tractable algorithms for both operations. Since DL semantics incurs infinite numbers of models for DL-Lite TBoxes, it is not feasible to develop algorithms involving DL models. The key to our operations and algorithms is the development of an alternative semantics called type semantics. Type semantics closely resembles the semantics underlays propositional logic, thus it is more succinct than DL semantics. Most importantly, given a finite signature, any DL-Lite$_{core}$ TBox has finite numbers of type models.

15:35
Optimal and Efficient Stochastic Motion Planning in Partially-Known Environments
SPEAKER: unknown

ABSTRACT. A framework capable of computing optimal control policies for a continuous system in the presence of both action and environment uncertainty is presented in this work. The framework decomposes the planning problem into two stages: an offline phase that reasons only over action uncertainty and an online phase that quickly reacts to the uncertain environment. Offline, a bounded-parameter Markov decision process (BMDP) is employed to model the evolution of the stochastic system over a discretization of the environment. Online, an optimal control policy over the BMDP is computed. Upon the discovery of an unknown environment feature during policy execution, the BMDP is updated and the optimal control policy is efficiently recomputed. Depending on the desired quality of the control policy, a suite of methods is presented to incorporate new information into the BMDP with varying degrees of detail online. Experiments confirm that the framework recomputes high-quality policies in seconds and is orders of magnitude faster than existing methods.

15:35
Natural Temporal Difference Learning
SPEAKER: unknown

ABSTRACT. In this paper we investigate the application of natural gradient descent to Bellman error based reinforcement learning algorithms. This combination is interesting because natural gradient descent is invariant to the parameterization of the value function. This invariance property means that natural gradient descent adapts its update directions to correct for poorly conditioned representations. We present and analyze quadratic and linear time natural temporal difference learning algorithms, and prove that they are covariant. We conclude with experiments which suggest that the natural algorithms can match or outperform their non-natural counterparts using linear function approximation, and drastically improve upon their non-natural counterparts when using non-linear function approximation.

15:35
Large Scale Analogical Reasoning
SPEAKER: unknown

ABSTRACT. It has been argued that one can use cognitive simulation of analogical processing to answer comparison questions. In the context of a knowledge base (KB) system, a comparison question takes the form: What are the similarities and/or differences between A and B?, where \concept{A} and \concept{B} are concepts in the KB. Previous attempts to use a general purpose analogical reasoner to answer this question revealed three major problems: (a) the system presented too much information in the answer and the salient similarity or difference was not highlighted (b) analogical inference found some incorrect differences (c) some expected similarities were not found. The primary cause of these problems was the lack of availability of a well-curated KB, and secondarily, there were also some algorithmic deficiencies. In this paper, we present an of comparison questions that is inspired by the general model of analogical reasoning, but is specific to the questions at hand. We also rely on a well-curated biology KB. We present numerous examples of answers produced by the system and empirical data on the quality of the answers to claim that we have addressed many of the problems faced in the previous system.

15:35
Fused Feature Representation Discovery for High-dimensional and Sparse Data
SPEAKER: unknown

ABSTRACT. The automatic discovery of a significant low-dimensional feature representation from given data set is a fundamental problem in machine learning. This paper specifically focuses on to develop feature representation discovery methods appropriate for high-dimensional and sparse data, which are remain a frontier, but are now becoming a highly important tool. We formulate our feature representation discovery problem as a variant of semi-supervised learning problem, namely, an optimization problem over unsupervised data whose objective is evaluating the impact of each feature with respect to modeling a target task according to the initial model constructed by using supervised data. The most notable characteristic of our method is that it offers feasible processing speed even if the numbers of data and features both exceed the billions, and successfully provides significantly small feature sets, i.e., less than 10, that can also offer improved performance comparing with those obtained with using the original feature sets. We demonstrate the effectiveness of our method on experiments of two well-known natural language processing tasks.

15:35
Q-intersection Algorithms for Constraint-Based Robust Parameter Estimation
SPEAKER: unknown

ABSTRACT. Given a set of axis-parallel n-dimensional boxes, the q-intersection is defined as the smallest box encompassing all the points that belong to at least q boxes. Computing the q-intersection is a combinatorial problem that allows us to handle robust parameter estimation with a numerical constraint programming approach. The q-intersection can be viewed as a filtering operator for soft constraints that model measurements subject to outliers. This paper highlights the equivalence of this operator with the search of q-cliques in a graph whose boxicity is bounded by the number of variables in the constraint network. We present a computational study of the q-intersection. We also propose a fast incomplete algorithm and a sophisticated exact q-intersection algorithm. First experimental results show that our exact algorithm outperforms the existing one while our heuristic performs an efficient filtering on hard problems.

15:35
CoreCluster: A Degeneracy Based Graph Clustering Framework
SPEAKER: unknown

ABSTRACT. Graph clustering or community detection constitutes an important task for investigating the internal structure of graphs, with a plethora of applications in several diverse domains. Traditional tools for graph clustering, such as spectral methods, typically suffer from high time and space complexity. In this article, we present CoreCluster, an efficient graph clustering framework based on the concept of graph degeneracy, that can be used along with any known graph clustering algorithm. Our approach capitalizes on processing the graph in a hierarchical manner provided by its core expansion sequence, an ordered partition of the graph into different levels according to the k-core decomposition. Such a partition provides a way to process the graph in an incremental manner that preserves its clustering structure, while making the execution of the chosen clustering algorithm much faster due to the smaller size of the graph’s partitions onto which the algorithm operates.

15:35
Tree-Based On-line Reinforcement Learning
SPEAKER: Andre Barreto

ABSTRACT. Fitted Q-iteration (FQI) stands out among reinforcement-learning algorithms for its flexibility and easy of use. FQI can be combined with any regression method, and this choice determines the algorithm's theoretical and computational properties. The combination of FQI with an ensemble of regression trees gives rises to an algorithm, FQIT, that is computationally efficient, scalable to high dimensional spaces, and robust to irrelevant variables, outliers, and noise. Despite its nice properties and good performance in practice, FQIT also has some limitations: the fact that an ensemble of trees must be constructed (or updated) at each iteration confines the algorithm to the batch scenario. This paper aims to address this specific issue. Based on a strategy recently proposed in the literature, called the stochastic-factorization trick, we propose a modification of FQIT that makes it fully incremental, and thus suitable for on-line learning. We call the resulting method tree-based stochastic factorization (TBSF). We derive an upper bound for the difference between the value functions computed by FQIT and TBSF, and also show in which circumstances the approximations coincide. A series of computational experiments is presented to illustrate the properties of TBSF and to show its usefulness in practice.

15:35
Local-To-Global Consistency Implies Tractability of Abduction
SPEAKER: Michał Wrona

ABSTRACT. Abduction is a form of nonmonotonic reasoning that looks for an explanation, built from a given set of hypotheses, for an observed manifestation according to some knowledge base. Following the concept behind the Schaefer's parametrization CSP(Gamma) of the Constraint Satisfaction Problem (CSP), we study here the complexity of the abduction problem Abduction(Gamma, Hyp, M) parametrized by certain (omega-categorical) infinite relational structures Gamma, Hyp, and M from which a knowledge base, hypotheses and a manifestation are built, respectively.

We say that Gamma has local-to-global consistency if there is k such that establishing strong k-consistency on an instance of CSP(Gamma) yields a globally consistent (whose every solution may be obtained straightforwardly from partial solutions) set of constraints. In this case CSP(Gamma) is solvable in polynomial time. Our main contribution is an algorithm that under some natural conditions decides Abduction(Gamma, Hyp, M) in P when Gamma has local-to-global consistency.

As we show in the number of examples, our approach offers an opportunity to consider abduction in the context of spatial and temporal reasoning (qualitative calculi such as Allen's interval algebra or RCC-5) and that our procedure solves some related abduction problems in polynomial time.

15:35
Pairwise-Covariance Linear Discriminant Analysis
SPEAKER: unknown

ABSTRACT. In machine learning, linear discriminant analysis (LDA) is a popular dimension reduction method. In this paper, we first pro- vide a new perspective of LDA from information theory per- spective. From this new perspective, we propose a new formula- tion of LDA, which uses the pairwise averaged class covariance instead of the globally averaged class covariance used in stan- dard LDA. This pairwise (averaged) covariance describes data distribution more accurately. The new perspective also provides a natural way to properly weight different pairwise distances, which emphasizes the pairs of class with small distances, and this leads to the proposed pairwise covariance properly weight- ed LDA (pcLDA). The kernel version of pcLDA is presented to handle nonlinear projections. Efficient algorithms are presented to efficiently compute the proposed models.

15:35
A Multiarmed Bandit Incentive Mechanism for Crowdsourcing Demand Response in Smart Grids
SPEAKER: unknown

ABSTRACT. Demand response is a critical part of renewable integration and energy cost reduction goals across the world. Motivated by the need to reduce costs arising from electricity shortage and renewable energy fluctuations, we propose a novel multiarmed bandit mechanism for demand response (MAB-MDR) which makes monetary offers to strategic consumers who have unknown response characteristics. Our work is inspired by connection to and intuition from crowdsourcing mechanisms. The proposed mechanism incorporates realistic features such as a time varying and quadratic cost function. The mechanism marries auctions, that allow users to report their preferences, with online algorithms, that allow distribution companies to learn user-specific parameters. We show that MAB-MDR is dominant strategy incentive compatible, individually rational, and achieves sublinear regret. Such mechanisms can be effectively deployed in smart grids using new information and control architecture innovations and lead to welcome savings in energy costs.

15:35
Strategyproof exchange with multiple private endowments
SPEAKER: unknown

ABSTRACT. We study a mechanism design problem for exchange economies where each agent is initially endowed with a set of indivisible goods and side payments are not allowed. We assume each agent can withhold some endowments, as well as misreport her preference. Under this assumption, strategyproofness requires that for each agent, reporting her true preference with revealing all her endowments is a dominant strategy, and thus implies individual rationality. Our objective in this paper is to analyze the effect of such private ownership in exchange economies with multiple endowments. As fundamental results, we first show that the revelation principle holds under a natural assumption and that strategyproofness and Pareto efficiency are incompatible even under the lexicographic preference domain. We then propose a class of exchange rules, each of which has a corresponding directed graph to prescribe possible trades, and provide a necessary and sufficient condition on the graph structure so that the rule satisfies strategy-proofness.

15:35
SOML: Sparse Online Metric Learning with Application to Image Retrieval
SPEAKER: unknown

ABSTRACT. Image similarity search plays a key role in many multimedia applications, where multimedia data (such as images and videos) are usually represented in high-dimensional feature space. In this paper, we propose a novel Sparse Online Metric Learning (SOML) scheme for learning sparse distance functions from large-scale high-dimensional data and explore its application to image retrieval. In contrast to many existing distance metric learning algorithms that are often designed for low-dimensional data, the proposed algorithms are able to learn sparse distance metrics from high-dimensional data in an efficient and scalable manner. Our experimental results show that the proposed method achieves better or at least comparable accuracy performance than the state-of-the-art non-sparse distance metric learning approaches, but enjoys a significant advantage in computational efficiency and sparsity, making it more practical for real-world applications.

15:35
Finding Median Point-Set Using Earth Mover's Distance
SPEAKER: unknown

ABSTRACT. In this paper, we study a prototype learning problem, called {\em Median Point-Set}, whose objective is to construct a prototype for a set of given point-sets so as to minimize the total earth mover's distances (EMD) between the prototype and the point-sets, where EMD between two point-sets is measured under affine transformation. For this problem, we present the first purely geometric approach. Comparing to existing graph-based approaches ({\em e.g.,} median graph, shock graph), our approach has several unique advantages: (1) No encoding and decoding procedures are needed to map between objects and graphs, and therefore avoid errors caused by information losing during the mappings; (2) Staying only in the geometric domain makes our approach computationally more efficient and robust to noise. As a key ingredient of our approach, we present the first quality guaranteed algorithm for minimizing EMD between two point-sets under affine transformation. We evaluate the performance of our technique for prototype reconstruction on a random dataset and a benchmark dataset, handwriting Chinese characters. Experiments suggest that our technique considerably outperforms the existing graph-based methods.

15:35
Intra-view and Inter-view Supervised Correlation Analysis for Multi-view Feature Learning
SPEAKER: unknown

ABSTRACT. The object always can be observed at multiple views, and multi-view feature learning is an attractive research topic with great practical success. Canonical correlation analysis (CCA) has become an important technique in multi-view learning, since it can fully utilize the inter-view correlation. In this paper, we mainly study the CCA based multi-view supervised feature learning technique where the labels of training samples are known. Several supervised CCA based multi-view methods have been presented, which focus on investigating the supervised correlation across different views. However, they take no account of the intra-view correlation between samples. Researchers have also introduced the discriminant analysis technique into multi-view feature learning, such as multi-view discriminant analysis (MvDA). But they ignore the canonical correlation within each view and between all views. In this paper, we propose a novel multi-view feature learning approach based on intra-view and inter-view supervised correlation analysis (I2SCA), which can explore the useful correlation information of samples within each view and between all views. The objective function of I2SCA is designed to simultaneously extract the discriminatingly correlated features from both inter-view and intra-view. It can obtain an analytical solution without iterative calculation. And we provide a kernelized extension of I2SCA to tackle the linearly inseparable problem in the original feature space. Three widely-used datasets are employed as test data. Experimental results demonstrate that our proposed approaches outperform several representative multi-view supervised feature learning methods.

15:35
Internally Stable Matchings and Exchanges
SPEAKER: unknown

ABSTRACT. Stability is a central concept in matching-based mechanism design. It imposes a fundamental requirement that no subset of agents could beneficially deviate from the prescribed outcome. However, deployment of stability in the current design of kidney exchange mechanisms presents at least two challenges. First, it reduces social welfare of the mechanism and sometimes prevent the mechanism from producing any outcome at all. Second, it sometimes incurs computation cost to clear the mechanism.

In this paper, we propose an alternative notion of stability. Our theoretical and experimental studies demonstrate that the new notion of stability addresses both challenges above and could be deployed in the current kidney exchange design.

15:35
Testable Implications of Linear Structural Equation Models
SPEAKER: unknown

ABSTRACT. In causal inference, all methods of model learning rely on testable implications, namely, properties of the joint distribution that are dictated by the model structure. These constraints, if not satisfied in the data, allow us to reject or modify the model. Most common methods of testing a linear structural equation model (SEM) rely on the likelihood ratio or chi-square test which simultaneously tests all of the restrictions implied by the model. Local constraints, on the other hand, offer increased power (Bollen and Pearl, 2013; McDonald, 2002) and, in the case of failure, provide the modeler with insight for revising the model specification. One strategy of uncovering local constraints in linear SEMs is to search for overidentified path coefficients. While these overidentifying constraints are well known, no method has been given for systematically discovering them. In this paper, we extend the half-trek criterion of (Foygel et al., 2012) to identify a larger set of structural coefficients and use it to systematically discover overidentifying constraints. Still open is the question of whether our algorithm is complete.

15:35
Solving the Inferential Frame Problem in the General Game Description Language
SPEAKER: unknown

ABSTRACT. The Game Description Language GDL is the standard input language for general game-playing systems. While players can gain a lot of traction by an efficient inference algorithm for GDL, state-of-the-art reasoners suffer from a variant of a classical KR problem, the inferential frame problem. We present a method by which general game players can transform any given game description into a representation that solves this problem. Our experimental results demonstrate that with the help of automatically generated domain knowledge, a significant speedup can thus be obtained for the majority of the game descriptions from the AAAI competition.

15:35
Monte Carlo Filtering using Kernel Embedding of Distributions
SPEAKER: unknown

ABSTRACT. Recent advances of kernel methods have yielded the framework for representing probabilities using a reproducing kernel Hilbert space, called kernel embedding of distributions. In this paper, we propose a Monte Carlo filtering algorithm based on kernel embeddings. The proposed method is applied to state-space models where sampling from the transition model is possible, while the observation model is to be learned from training samples without assuming a parametric model. We derive convergence rates for the sampling method introduced to the kernel embedding approach. Experimental results on synthetic models and a real vision-based robot localization problem confirm the effectiveness of the proposed approach.

15:35
User Intent Identification from Online Discussions using a Joint Aspect-Action Topic Model
SPEAKER: unknown

ABSTRACT. Online discussions are growing as a popular, effective and reliable source of information for users because of their liveliness, flexibility and up-to-date information. Online discussions are usually developed and advanced by groups of users with various backgrounds and intents. However because of the diversities in topics and issues discussed by the users, supervised methods are not able to accurately model such dynamic conditions. In this paper, we propose a novel unsupervised generative model to derive aspect-action pairs from online discussions. The proposed method simultaneously captures and models these two features with their relationships that exist in each thread. We assume that each user post is generated by a mixture of aspect and action topics. Therefore, we design a model that captures the latent factors that incorporates the aspect types and intended actions, which describe how users develop a topic in a discussion. In order to demonstrate the effectiveness of our approach, we empirically compare our model against the state of the art methods on large-scale discussion dataset, crawled from apple discussions with over 3.3 million user posts from 340k discussion threads.

15:35
Envy-Free Division of Sellable Goods
SPEAKER: unknown

ABSTRACT. We study the envy-free allocation of indivisible goods between two players. Our novel setting includes an option to sell each good for a fraction of the minimum value any player has for the good. To rigorously quantify the efficiency gain from selling, we reason about the price of envy-freeness of allocations of sellable goods — the ratio between the maximum social welfare and the social welfare of the best envy-free allocation. We show that envy-free allocations of sellable goods are significantly more efficient than their unsellable counterparts.

15:35
Adaptive Knowledge Transfer for Multiple Instance Learning in Image Classification
SPEAKER: unknown

ABSTRACT. Multiple Instance Learning (MIL) is a popular learning technique in various vision tasks including image classification. However, most existing MIL methods do not consider the problem of insufficient examples in the given target category. In this case, it is difficult for traditional MIL methods to build an accurate classifier due to the lack of training examples. Motivated by the empirical success of transfer learning, this paper proposes a novel approach of Adaptive Knowledge Transfer for Multiple Instance Learning (AKT-MIL) in image classification. The new method transfers cross-category knowledge from source categories under multiple instance setting for boosting the learning process. A unified learning framework with a data-dependent mixture model is designed to adaptively combine the transferred knowledge from sources with a weak classifier built in the target domain. Based on this framework, an iterative coordinate descent method with Constraint Concave-Convex Programming (CCCP) is proposed as the optimization procedure. An extensive set of experimental results demonstrate that the proposed AKT-MIL approach substantially outperforms several state-of-the-art algorithms on two benchmark datasets, especially in the scenario when very few training examples are available in the target domain.

15:35
PAC Rank Elicitation through Adaptive Sampling of Stochastic Pairwise Preferences
SPEAKER: unknown

ABSTRACT. We introduce the problem of PAC rank elicitation, which consists of sorting a given set of options based on adaptive sampling of stochastic pairwise preferences. More specifically, the goal is to predict a ranking that is sufficiently close to a target order with high probability. We instantiate this setting with combinations of two different distance measures and ranking procedures for determining the target order. For these instantiations, we devise efficient sampling strategies and analyze the corresponding sample complexity. We also present first experiments to illustrate the practical performance of our methods.

15:35
Parallel Restarted Search
SPEAKER: unknown

ABSTRACT. We consider the problem of parallelizing restarted backtrack search. With few notable exceptions, most commercial and academic constraint programming solvers do not learn no-goods during search. Depending on the branching heuristics used, this means that there are little to no side-effects between restarts, making them an excellent target for parallelization. We develop a simple technique for parallelizing restarted search deterministically and demonstrate experimentally that we can achieve near-linear speed-ups in practice.

15:35
Optimal Neighborhood Preserving Visualization by Maximum Satisfiability
SPEAKER: Kerstin Bunte

ABSTRACT. We present a novel approach to low-dimensional neighbor embedding for visualization: we formulate an information retrieval based neighborhood preservation cost function as Maximum satisfiability on a discretized output display, and maximize the number of clauses preserved. The method has a rigorous interpretation as optimal visualization for neighbor retrieval. Unlike previous low-dimensional neighbor embedding methods, our satisfiability formulation is guaranteed to yield a global optimum and does so reasonably fast. Unlike previous manifold learning methods yielding global optima of their cost functions, our cost function and method are designed for low-dimensional visualization where evaluation and minimization of visualization errors are crucial. Our method performs well in experiments, yielding clean embeddings of data sets where a state-of-the-art comparison method yields poor arrangements. In a real-world case study for semi-supervised WLAN positioning in buildings we outperform state-of-the-art methods, especially when having few measurements.

15:35
A Local Non-negative Pursuit Method for Intrinsic Manifold Structure Preservation
SPEAKER: unknown

ABSTRACT. The local neighborhood selection plays a crucial role for most representation based manifold learning algorithms. This paper reveals that an improper selection of neighborhood for learning representation will introduce negative components in the learnt representations. Importantly, the representations with negative components will affect the intrinsic manifold structure preservation. In this paper, a local non-negative pursuit (LNP) method is proposed for neighborhood selection and non-negative representations are learnt. Moreover, it is proved that the learnt representations are sparse and convex. Theoretical analysis and experimental results show that the proposed method achieves or outperforms the state-of-the-art results on various manifold learning problems.

15:35
Deep Salience: Visual Salience Modeling via Deep Belief Propagation
SPEAKER: Richard Jiang

ABSTRACT. Visual salience has been considered as an intriguing phenomenon observed in biologic neural systems. Numerous efforts have been made on mathematical modeling of visual salience using various feature contrasts, either locally or at global range. However, these algorithmic models treat this biologic phenomenon more like a mathematic problem and somehow ignores its biological instinct that visual salience arouses from the deep propagation of visual stimuli along the visual cortex. In this paper, we present a Deep Salience model that emulates this bio-inspired task, where a multi-layer successive Markov random fields (sMRF) is proposed to analyze the input image successively through its deep belief propagation. As its outcome, the foreground object can be automatically separated from the background in a fully unsupervised way. Experimental evaluation on benchmark datasets validated that our model can consistently outperform state-of-the-art salience models, yielding the highest recall rates, precision and F-measure scores in object detection. With this experimental validation, it is shown that the proposed bio-plausible deep belief network as an emulation of successive visual signal propagation along human visual cortex can functionally work well on solving real-world computational problems.

15:35
Context-aware Collaborative Topic Regression with Social Matrix Factorization for Recommender Systems
SPEAKER: unknown

ABSTRACT. Online social networking sites have become popular platforms on which users can link with each other and share information, not only basic rating information but also information such as contexts, social relationships, and item contents. However, as far as we know, no existing works systematically combine diverse types of information to build more accurate recommender systems. In this paper, we propose a novel context-aware hierarchical Bayesian method. First, we propose the use of spectral clustering for user-item subgrouping, so that users and items in similar contexts are grouped. We then propose a novel hierarchical Bayesian model that can make predictions for each user-item subgroup, our model incorporate not only topic modeling to mine item content but also social matrix factorization to handle ratings and social relationships. Experiments on an Epinions dataset show that our method significantly improves recommendation performance compared with six categories of state-of-the-art recommendation methods in terms of both prediction accuracy and recall. We have also conducted experiments to study the extent to which ratings, contexts, social relationships, and item contents contribute to recommendation performance in terms of prediction accuracy and recall.

15:35
Delivering Guaranteed Display Ads under Reach and Frequency Requirements
SPEAKER: unknown

ABSTRACT. We propose a new idea in the allocation and serving of online advertising. We show that by using predetermined fixed-length streams of ads (which we call patterns) to serve advertising, we can incorporate a variety of interesting features into the ad allocation optimization problem. In particular, our formulation optimizes for representativeness as well as user-level diversity and pacing of ads, under reach and frequency requirements. We show how the problem can be solved efficiently using a column generation scheme in which only a small set of best patterns are kept in the optimization problem. Our numerical tests show that with parallelization of the pattern generation process, the algorithm has a promising run time and memory usage.

15:35
Direct Semantic Analysis for Social Image Classification
SPEAKER: unknown

ABSTRACT. This paper presents a direct semantic analysis method for learning the correlation matrix between visual and textual words from socially tagged images. In the literature, to improve the traditional visual bag-of-words (BOW) representation, latent semantic analysis has been studied extensively for learning a compact visual representation, where each visual word may be related to multiple latent topics. However, these latent topics do not convey any true semantic information which can be understood by human. In fact, it remains a challenging problem how to recover the relationships between visual and textual words. Motivated by the recent advances in dealing with socially tagged images, we develop a direct semantic analysis method which can explicitly learn the correlation matrix between visual and textual words for social image classification. To this end, we formulate our direct semantic analysis from a graph-based learning viewpoint. Once the correlation matrix is learnt, we can readily first obtain a semantically refined visual BOW representation and then apply it to social image classification. Experimental results on two benchmark image datasets show the promising performance of the proposed method.

15:35
Double Configuration Checking in Stochastic Local Search for Satisfiability
SPEAKER: unknown

ABSTRACT. Stochastic local search (SLS) algorithms have shown effectiveness on satisfiable instances of the Boolean satisfiability (SAT) problem. However, their performance is still unsatisfactory on random k-SAT at the phase transition, which is of significance and is one of the most empirically hardest distributions of SAT instances. In this paper, we propose a new heuristic called DCCA, which combines two configuration checking (CC) strategies with different definitions of configuration in a novel way. We use the DCCA heuristic to design an efficient SLS solver for SAT dubbed DCCASat. The experiments show that the DCCASat solver significantly outperforms a number of state-of-the-art solvers on extensive random k-SAT benchmarks at the phase transition. Moreover, further empirical analyses on structured benchmarks indicate the robustness of DCCASat.

15:35
GenEth: A General Ethical Dilemma Analyzer
SPEAKER: unknown

ABSTRACT. We contend that ethically significant behavior of autonomous systems should be guided by explicit ethical principles determined through a consensus of ethicists. As it is likely that in many particular cases of ethical dilemmas ethicists agree on the ethically relevant features and the right course of action, generalization of such cases can be used to help discover principles needed for ethical guidance of the behavior of autonomous systems. Such principles help ensure the ethical behavior of complex and dynamic systems and further serve as a basis for justification of their actions as well as a control abstraction for managing unanticipated behavior. To provide assistance in developing ethical principles, we have developed GENETH, a general ethical dilemma analyzer that, through a dialog with ethicists, codifies ethical principles in any given domain. GENETH has been used to codify principles in a number of domains pertinent to the behavior of autonomous systems and these principles have been verified using an Ethical Turing Test.

16:30-17:30 Session 47: AAAI-14 Poster Session III:F
Location: Hall 200C
16:30
Mapping Users Across Networks by Manifold Alignment on Hypergraph
SPEAKER: unknown

ABSTRACT. Nowadays many people are members of multiple online social networks simultaneously, such as Facebook, Twitter and some other instant messaging circles. But these networks are usually isolated from each other. Mapping common users cross these social networks will be beneficial for cross network recommendation or expanding one’s social circle. Methods based on username comparison perform well on parts of users, however they can not work in the following situations: (a) users choose completely different usernames in different networks; (b) a unique username corresponds to different individuals. In this paper, we propose to utilize social structures to improve the mapping performance. Specifically, a novel subspace learning algorithm, Manifold Alignment on Hypergraph (MAH), is proposed. Different from traditional semi-supervised manifold alignment methods, we use hypergraph to model high-order relations here. For a target user in one network, the proposed algorithm ranks all users in the other network by their probabilities of being the corresponding user. Moreover, methods based on username comparison can be incorporated with our algorithm easily to further boost the mapping accuracy. In experiments, we use both simulation data and real world data to test the proposed method. Experiment results have demonstrated the effectiveness of our proposed algorithm in mapping users cross networks.

16:30
Worst-Case Solution Quality Analysis When Not Re-Expanding Nodes in Best-First Search
SPEAKER: unknown

ABSTRACT. The use of inconsistent heuristics with A* can result in increased runtime due to the need to re-expand nodes. Poor performance can also be seen with Weighted A* if nodes are re-expanded. While the negative impact of re-expansions can often be minimized by setting these algorithms to never expand nodes more than once, the result can be a lower solution quality. In this paper, we formally show that the loss in solution quality can be bounded based on the amount of inconsistency along optimal solution paths. This bound holds regardless of whether the heuristic is admissible or inadmissible, though if the heuristic is admissible the bound can be used to show that not re-expanding nodes can have at most a quadratic impact on the quality of solutions found when using A*. We then show that the bound is tight by describing a process for the construction of graphs for which a best-first search that does not re-expand nodes will find solutions whose quality is arbitrarily close to that given by the bound. Finally, we will use the bound to extend a known result regarding the solution quality of WA* when weighting a consistent heuristic, so that it applies to other types of heuristic weighting.

16:30
Coactive Learning for Locally Optimal Problem Solving
SPEAKER: unknown

ABSTRACT. Coactive learning is an online problem solving setting where the solutions provided by a solver are interactively improved by a domain expert, which in turn drives learning. In this paper we extend the study of coactive learning to problems where obtaining a global or near-optimal solution may be intractable or where an expert can only be expected to make small, local improvements to a candidate solution. The goal of learning in this new setting is to minimize the cost as measured by the expert effort over time. We first establish theoretical bounds on the average cost of the existing coactive Perceptron algorithm. In addition, we consider new online algorithms that use cost-sensitive and Passive-Aggressive (PA) updates, showing similar or improved theoretical bounds. We provide an empirical evaluation of the learners in 5 domains, which show that the Perceptron based algorithms are quite effective and that unlike the case for online classification, the PA algorithms do not yield significant performance gains.

16:30
Prediction of Helpful Reviews using Emotions Extraction
SPEAKER: unknown

ABSTRACT. Reviews keep playing an increasingly important role in the decision process of buying products and booking hotels. However, the large amount of available information can be confusing to users. A more succinct interface, gathering only the most helpful reviews, can reduce information processing time and save effort. To create such an interface in real time, we need reliable prediction algorithms to classify and predict new reviews which have not been voted but are potentially helpful. So far such helpfulness prediction algorithms have benefited from structural aspects, such as the length and readability score. Since emotional words are at the heart of our written communication and are powerful to trigger listeners’ attention, we believe that emotional words can serve as important parameters for predicting helpfulness of review text.

Using GALC, a general lexicon of emotional words associated with a model representing 20 different categories, we extracted the emotionality from the review text and applied supervised classification method to derive the emotion-based helpful review prediction. As the second contribution, we propose an evaluation framework comparing three different real-world datasets extracted from the most well-known product review websites. This framework shows that emotion-based methods are outperforming the structure-based approach, by up to 9%.

16:30
Supervised Scoring with Monotone Multidimensional Splines

ABSTRACT. Scoring involves the compression of a number of quantitative attributes into a single meaningful value. We consider the problem of how to generate scores in a setting where they should be weakly monotone (either non-increasing or non-decreasing) in their dimensions. Our approach allows an expert to score an arbitrary set of points to produce meaningful, continuous, monotone scores over the entire domain, while exactly interpolating through those inputs. In contrast, existing monotone interpolating methods only work in two dimensions and typically require exhaustive grid input. Our technique significantly lowers the bar to score creation, allowing domain experts to develop mathematically coherent scores. The method is used in practice to create the LEED Performance energy and water scores that gauge building sustainability.

16:30
On the Axiomatic Characterization of Runoff Voting Rules
SPEAKER: unknown

ABSTRACT. Runoff voting rules such as single transferable vote (STV) and Baldwin's rule are of particular interest in computational social choice due to their recursive nature and hardness of manipulation, as well as in (human) practice because they are relatively easy to understand. However, they are not known for their compliance with desirable axiomatic properties, which we attempt to rectify here. We characterize runoff rules that are based on scoring rules using two axioms: a weakening of local independence of irrelevant alternatives and a variant of population-consistency. We then show, as our main technical result, that STV is the only runoff scoring rule satisfying an independence-of-clones property. Furthermore, we provide axiomatizations of Baldwin's rule and Coombs' rule.

16:30
Huffman Coding for Storing Non-uniformly Distributed Messages in Networks of Neural Cliques
SPEAKER: unknown

ABSTRACT. Associative memories are data structures that allow retrieval of previously stored messages given part of their content. They thus behave similarly to human brain's memory that is capable for instance of retrieving the end of a song given its beginning. Among different families of associative memories, sparse ones are known to provide the best efficiency (ratio of the number of bits stored to that of bits used). Nevertheless, it is well known that non-uniformity of the stored messages can lead to dramatic decrease in performance. Recently, a new family of sparse associative memories achieving almost-optimal efficiency has been proposed. Their structure induces a direct mapping between input messages and stored patterns. In this work, we show the impact of non-uniformity on the performance of this recent model and we exploit the structure of the model to introduce several strategies to allow for efficient storage of non-uniform messages. We show that a technique based on Huffman coding is the most efficient.

16:30
Using Timed Game Automata to Synthesize Execution Strategies for Simple Temporal Networks with Uncertainty
SPEAKER: unknown

ABSTRACT. A Simple Temporal Network with Uncertainty (STNU) is a structure for representing and reasoning about temporal constraints in domains where some temporal durations are not controlled by the executor. The most important property of an STNU is whether it is dynamically controllable (DC); that is, whether there exists a strategy for executing the controllable time-points that guarantees that all constraints will be satisfied no matter how the uncontrollable durations turn out.

This paper provides a novel mapping from STNUs to Timed Game Automata (TGAs) that: (1) explicates the deep theoretical relationships between STNUs and TGAs; and (2) enables the memoryless strategies generated from the TGA to be transformed into equivalent STNU execution strategies that reduce the real-time computational burden for the executor. The paper formally proves that the STNU-to-TGA encoding properly captures the execution semantics of STNUs. It also provides experimental evidence of the proposed approaches, generating offline execution strategies for dynamically controllable STNUs encoded as TGAs.

16:30
Semantic Data Representation for Improving Tensor Factorization
SPEAKER: unknown

ABSTRACT. Predicting human activities is important for improving recommendation systems or analyzing social relationships among users. Those human activities are usually represented as multi-object relationships (e.g. user's tagging activity for items or user's tweeting activity at some location). Since multi-object relationships are naturally represented as a tensor, tensor factorization is becoming more important for predicting users' possible activities. However, the prediction accuracy of tensor factorization is weak for ambiguous and/or sparsely observed objects. Our solution, Semantic data Representation for Tensor Factorization (SRTF), tackles these problems by incorporating semantics into tensor factorization based on the following ideas: (1) it links objects to vocabularies/taxonomies and resolves the ambiguity caused by objects that can be used for multiple purposes. (2) it links objects to composite classes that merge classes in different kinds of vocabularies/taxonomies (e.g. classes in vocabularies for movie genres and those for directors) to avoid low prediction accuracy caused by rough-grained semantic space. (3) it lifts sparsely observed objects into their classes to solve the sparsity problem for rarely observed objects. Experiments show that SRTF achieves 10\% higher accuracy than current best methods.

16:30
Uncorrelated Multi-View Discrimination Dictionary Learning for Recognition
SPEAKER: unknown

ABSTRACT. Dictionary learning (DL) has now become an important feature learning technique that owns state-of-the-art recog-nition performance. Due to sparse characteristic of data in real-world applications, dictionary learning uses a set of learned dictionary bases to represent the linear decomposi-tion of a data point. Fisher discrimination DL (FDDL) is a representative supervised DL method, which constructs a structured dictionary whose atoms correspond to the class labels. Recent years have witnessed a growing interest in multi-view (more than two views) feature learning tech-niques. Although some multi-view (or multi-modal) DL methods have been presented, there still exists much room for improvement. How to enhance the total discriminability of dictionaries and reduce their redundancy is a crucial re-search topic. To boost the performance of multi-view dic-tionary learning technique, we propose an uncorrelated multi-view fisher discrimination DL (UMFDDL) approach for recognition. By making dictionary atoms correspond to the class labels such that the obtained reconstruction error is discriminative, UMFDDL aims to jointly learn multiple dic-tionaries with totally favorable discriminative power. Fur-thermore, we design the uncorrelated constraint for multi-view DL, so as to reduce the redundancy among dictionaries learned from different views. Experiments on several public datasets demonstrate the effectiveness of the proposed approach.

16:30
Exploiting Support Sets for Answer Set Programs with External Evaluations
SPEAKER: unknown

ABSTRACT. Answer set programs (ASP) with external evaluations are a declarative means to capture advanced applications. However, their evaluation can be expensive due to external source accesses. In this paper we consider hex-programs that provide external atoms as a bidirectional interface to external sources and present a novel evaluation method based on support sets, which informally are portions of the input to an external atom that will determine its output for any completion of the partial input. Support sets allow one to shortcut the external source access, which can be completely eliminated. This is particularly attractive if a compact representation of suitable support sets is efficiently constructible. We discuss some applications with this property, among them description logic programs over DL-Lite ontologies, and present experimental results showing that support sets can significantly improve efficiency.

16:30
Binary Aggregation by Selection of the Most Representative Voter
SPEAKER: unknown

ABSTRACT. In binary aggregation, a group of voters each express yes/no choices regarding a number of possibly correlated issues and we are asked to decide on a collective choice that accurately reflects the views of this group. A good collective choice will minimise the distance to each of the individual choices, but using such a distance-based aggregation rule is computationally intractable. Instead, we explore a class of low-complexity aggregation rules that select the most representative voter in any given situation and return that voter's choice as the collective outcome.

16:30
Preprocessing for Propositional Model Counting
SPEAKER: unknown

ABSTRACT. This paper is concerned with preprocessing techniques for propositional model counting. We have implemented a preprocessor which includes many elementary preprocessing techniques, including occurrence reduction, vivification, backbone identification, as well as equivalence, AND and XOR gate identification and replacement. We performed intensive experiments, using a huge number of benchmarks coming from a large number of families. Two approaches to model counting have been considered downstream: ”direct” model counting using Cachet and compilation-based model counting, based on the C2D compiler. The experimental results we have obtained show that our preprocessor is both efficient and robust.

16:30
Locality Preserving Projection for Domain Adaptation with Multi-Objective Learning
SPEAKER: unknown

ABSTRACT. In many practical cases, we need to generalize a model trained in a source domain to a new target domain.However, the distribution of these two domains may differ very significantly, especially sometimes some crucial target features may not have support in the source domain. This paper propose a novel locality preserving projection methods for domain adaptation task,which can find a linear mapping preserving the 'intrinsic structure' for both the source and the target domain. In this work, we first construct two graphs encoding the neighborhood information for the source domain and target domain separately. We then try to find linear projection coefficients which have the property of locality preserving for each graph.Instead of combing the two objective function under compatibility assumption and requiring the user to decide the importance of each objective function. We propose a multi-objective formulation for this problem and solve it simultaneously using pareto optimization.The pareto frontier captures all possible good linear projection vectors that are prefered by one or more objectives.The effectiveness of our approach is justified by both theoretical analysis and empirical results on real world data sets. The new feature representation shows better prediction accuracy as our experiment clearly demonstrated.

16:30
Evaluating Trauma Patients: Addressing Missing Covariates with Joint Optimization
SPEAKER: unknown

ABSTRACT. Missing values are a common problem when applying classification algorithms to real-world medical data. This is especially true for trauma patients, where clinical variables frequently go uncollected due to the severity and urgency of their condition. Standard approaches to handling missingness first learn a model to estimate missing data values, and subsequently train and evaluate a classifier using data imputed with this model. Recently, several works have demonstrated the benefits of jointly estimating the imputation model and classifier parameters. However existing approaches make assumptions that limit their utility with many real-world medical datasets, particularly that data elements are missing at random, which is often invalid. We present a novel approach to jointly learning the imputation model and classifier. Unlike existing methods, the proposed approach makes no assumptions about the missingness of the data, can be used with arbitrary probabilistic data models and classification loss functions, and can be used when both training and testing data have missing values. We investigate the approach's utility in the prediction of several patient outcomes in a large national registry of trauma patients, and find that it significantly outperforms standard sequential methods.

16:30
Qualitative Reasoning with Modelica Models
SPEAKER: unknown

ABSTRACT. Applications of qualitative reasoning to engineering design face a knowledge acquisition challenge. Designers are not fluent in qualitative modeling languages and techniques. To overcome this barrier, we perform qualitative simulation using models solely written in Modelica, a popular language among designers for modeling hybrid systems. This paper has two contributions: (1) a formalization of the relationship between the results of the Modelica and qualitative simulations for the same model along with a novel algorithm for computing the consequences of events in qualitative simulation, and (2) three classes of additional constraints that reduce the number of unrealizable trajectories when performing qualitative simulation with Modelica models. We support these contributions with examples and a case study that shows a reduction by a factor of six the size of the qualitative simulation.

16:30
Decentralized Multi-Agent Reinforcement Learning in Average-Reward Dynamic DCOPs
SPEAKER: unknown

ABSTRACT. Researchers have introduced the Dynamic Distributed Constraint Optimization Problem (Dynamic DCOP) formulation to model dynamically changing multi-agent coordination problems, where a dynamic DCOP is a sequence of (static canonical) DCOPs, each partially different from the DCOP preceding it. Existing work typically assumes that the problem in each time step is decoupled from the problems in other time steps, which might not hold in some applications. Therefore, in this paper, we make the following contributions: (i) We introduce a new model, called Markovian Dynamic DCOPs (MD-DCOPs), where the DCOP in the next time step is a function of the value assignments in the current time step; (ii) We introduce two distributed reinforcement learning algorithms, the Distributed RVI Q-learning algorithm and the Distributed R-learning algorithm, that balance exploration and exploitation to solve MD-DCOPs in an online manner; and (iii) We empirically evaluate them against an existing multi-arm bandit DCOP algorithm on dynamic DCOPs.

16:30
Hybrid Heterogeneous Transfer Learning through Deep Learning
SPEAKER: unknown

ABSTRACT. Most previous heterogeneous transfer learning methods learn a cross-domain feature mapping between heterogeneous feature spaces based on a few cross-domain instance-correspondences, and these corresponding instances are assumed to be representatives in the source and target domains respectively. However, in many real-world scenarios, this assumption may not hold. As a result, the feature mapping may not be constructed precisely or the mapped data from the source (or target) domain may suffer from a data or feature shift issue in the target (or source) domain. In this case, a classifier trained on the labeled transformed-source-domain data may not be useful for the target domain. In this paper, we present a new transfer learning framework called {\em Hybrid Heterogeneous Transfer Learning} (HHTL), which allows choosing the corresponding instances to be biased in either the source or target domain. Moreover, we propose a deep learning approach to learn better feature representations of each domain data to reduce the data shift issue, and a better feature mapping between cross-domain heterogeneous features for the accurate transfer simultaneously. Extensive experiments on several multilingual sentiment classification tasks verify the effectiveness of our proposed approach compared with the baseline methods.

16:30
Agent Behavior Prediction and Its Generalization Analysis
SPEAKER: unknown

ABSTRACT. Machine learning algorithms have been applied to predict agent behaviors in real-world dynamic systems, such as advertiser behaviors in sponsored search and worker behaviors in crowdsourcing, and the prediction models have been used for the optimization of these systems. Note that the behavior data in these systems are generated by \emph{live} agents: once the systems change due to the adoption of the prediction models learnt from the behavior data, agents will observe (directly or indirectly) and respond to these changes by changing their own behaviors accordingly. As a result, the behavior data will evolve and will not be identically and independently distributed, which poses great challenges to the theoretical analysis on the machine learning algorithms for behavior prediction. To tackle this challenge, in this paper, we propose to use \emph{Markov Chain in Random Environments} (MCRE) to describe the behavior data, and perform generalization analysis of the machine learning algorithms on its basis. However, since the one-step transition probability matrix of MCRE depends on both previous states and the random environment, conventional techniques for generalization analysis cannot be directly applied. To address this issue, we propose a novel technique that transforms the original MCRE into a higher-dimensional time-homogeneous Markov chain. The new Markov chain involves more variables but is more regular, and thus easier to deal with. We prove the convergence of the new Markov chain when time approaches infinity. Then we prove a generalization bound for the machine learning algorithms on the behavior data generated by the new Markov chain, which depends on both the Markovian parameters and the covering number of the function class compounded by the loss function for behavior prediction and the behavior prediction model. To the best of our knowledge, this is the first work that performs the generalization analysis on data generated by complex processes in real-world dynamic systems.

16:30
Efficiently Implementing GOLOG with Answer Set Programming
SPEAKER: Malcolm Ryan

ABSTRACT. In this paper we investigate four different approaches to encoding domain-dependent control knowledge for Answer-Set Planning. Starting with a standard implementation of the answer-set planning language B, we add control knowledge expressed in the GOLOG logic programming language. A naive encoding, following the original definitions of Reiter et al., is shown to scale poorly. We investigate three alternative codings based on the finite-state machine semantics of ConGOLOG. These perform better, although there is no clear winner. We discuss the pros and cons of each approach.

16:30
Partial Multi-View Clustering
SPEAKER: unknown

ABSTRACT. Real data are often with multiple modalities or coming from multiple channels, while multi-view clustering provides a natural formulation for generating clusters from such data. Previous studies assumed that each example appears in all views, or at least there is one view containing all examples. In real tasks, however, it is often the case that every view suffers from the missing of some data and therefore results in many partial examples, i.e., examples with some views missing. In this paper, we present possibly the first study on partial multi-view clustering. Our proposed approach, PVC, works by establishing a latent subspace where the instances corresponding to the same example in different views are close to each other, and the instances (belonging to different examples) in the same view are gathering smoothly. Experiments demonstrate the advantages of our proposed approach.

16:30
Potential-Aware Imperfect-Recall Abstraction with Earth Mover's Distance in Imperfect-Information Games
SPEAKER: unknown

ABSTRACT. There is often a large disparity between the size of a game we wish to solve and the size of the largest instances solvable by the best algorithms; for example, a popular variant of poker has about $10^{165}$ nodes in its game tree, while the currently best approximate equilibrium-finding algorithms scale to games with around $10^{12}$ nodes. In order to approximate equilibrium strategies in these games, the leading approach is to create a sufficiently small strategic approximation of the full game, called an abstraction, and to solve that smaller game instead. The leading abstraction algorithm for imperfect-information games generates abstractions that have imperfect recall and are distribution aware, using $k$-means with the earth mover's distance metric to cluster similar states together. A distribution-aware abstraction groups states together at a given round if their full distributions over future strength are similar (as opposed to, for example, just the expectation of their strength). The leading algorithm considers distributions over future strength at the final round of the game. However, one might benefit by considering the distribution over strength in all future rounds, not just the final round. An abstraction algorithm that takes all future rounds into account is called potential aware. We present the first algorithm for computing potential-aware imperfect-recall abstractions using earth mover's distance. Experiments on no-limit Texas Hold'em show that our algorithm improves performance over the previously best approach.

16:30
Finding the k-best Equivalence Classes of Bayesian Network Structures for Model Averaging
SPEAKER: unknown

ABSTRACT. In this paper we develop an algorithm to find the k-best equivalence classes of Bayesian networks. Our algorithm is capable of finding much more best DAGs than the previous algorithm that directly finds the k-best DAGs (Tian, He and Ram 2010). We demonstrate our algorithm in the task of Bayesian model averaging. Empirical results show that our algorithm significantly outperforms the k-best DAG algorithm in both time and space to achieve the same quality of approximation. Our algorithm goes beyond the maximum-a-posteriori (MAP) model by listing the most likely network structures and their relative likelihood and therefore has important applications in causal structure discovery.

16:30
Machine Translation with Real-time Web Search
SPEAKER: unknown

ABSTRACT. Contemporary machine translation systems usually rely on offline data retrieved from the web for individual model training, such as translation models and language models. Distinct from existing methods, we propose a novel approach that treats machine translation as a web search task and utilizes the web on the fly to acquire translation knowledge. This end-to-end approach takes advantage of fresh web search results that are capable of leveraging tremendous web knowledge to obtain phrase-level candidates on demand and then compose sentence-level translations. Experimental results show that our web-based machine translation method demonstrates very promising performance in leveraging fresh translation knowledge and making translation decisions. Furthermore, when combined with offline models, it significantly outperforms a state-of-the-art phrase-based statistical machine translation system.

16:30
Novel Density-based Clustering Algorithms for Uncertain Data
SPEAKER: unknown

ABSTRACT. Density-based techniques seem promising for handling data uncertainty in uncertain data clustering. Nevertheless, some issues have not been addressed well in existing algorithms. In this paper, we firstly propose a novel density-based uncertain data clustering algorithm, which improves upon existing algorithms from the following two aspects: (1) it employs an exact method to compute the probability that the distance between two uncertain objects is less than or equal to a boundary value, instead of the sampling-based method in previous work; (2) it introduces new definitions of core object probability and direct reachability probability, thus reducing the complexity and avoiding sampling. We then further improve the algorithm by using a novel assignment strategy to ensure that every object will be assigned to the most appropriate cluster. Experimental results show the superiority of our proposed algorithms over existing ones.

16:30
Feature Selection at the Discrete Limit
SPEAKER: unknown

ABSTRACT. Feature selection plays an important role in many machine learning and data mining applications. In this paper, we propose to use L2,p norm for feature selection with emphasis on small p. As p appoaches 0, feature selection becomes discrete feature selection problem.We provide two algorithms, proximal gradient algorithm and rank one update algorithm, which is more efficient at large regularization . Experiments on real life data sets show that features selected at small p consistently outperform features selected at p = 1, the standard L2,1 approach and other popular feature selection methods.

16:30
Dynamic Bayesian Probabilistic Matrix Factorization

ABSTRACT. Collaborative filtering algorithms generally rely on the assumption that user preference patterns remain stationary. However, real-world relational data are seldom stationary. User preference patterns may change over time, giving rise to the requirement of designing collaborative filtering systems capable of detecting and adapting to preference pattern shifts. Motivated by this observation, in this paper we propose a dynamic Bayesian probabilistic matrix factorization model, designed for modeling time-varying distributions. Formulation of our model is based on imposition of a dynamic hierarchical Dirichlet process (dHDP) prior over the space of probabilistic matrix factorization models to capture the time-evolving statistical properties of modeled sequential relational datasets. We develop a simple Markov Chain Monte Carlo sampler to perform inference. We present experimental results to demonstrate the superiority of our temporal model.

16:30
Proximal Iteratively Reweighted Algorithm with Multiple Splitting for Nonconvex Sparsity Optimization
SPEAKER: unknown

ABSTRACT. This paper proposes the Proximal Iteratively REweighted (PIRE) algorithm for solving a general problem, which involves a large body of nonconvex sparse and structured sparse related problems. Comparing with previous iterative solvers for nonconvex sparse problem, PIRE is much more general and efficient. The computational cost of PIRE in each iteration is usually as low as the state-of-the-art convex solvers. We further propose the PIRE algorithm with Parallel Splitting (PIRE-PS) and PIRE algorithm with Alternative Updating (PIRE-AU) to handle the multi-variable problems. In theory, we prove that our proposed methods converge and any limit solution is a stationary point. Extensive experiments on both synthesis and real data sets demonstrate that our methods achieve comparative learning performance, but are much more efficient, by comparing with previous nonconvex solvers.

16:30
Trust Prediction with Propagation and Similarity Regularization
SPEAKER: unknown

ABSTRACT. Online social networks have been used for a variety of rich activities in recent years, such as investigating potential employees or seeking recommendations of high quality services and service providers. In such activities trust is one of the most vital factors for users' decision-making. In the literature, the state-of-the-art trust prediction approaches either focus on dispositional bias and propagated trust value of the pair-wise trust relationship along a path or on the similarity of trust rating values. However, another factor, the distribution of trust ratings, also affects the trust between users. In addition, bias, propagated trust and similarity are of different types, but were treated the same. Therefore, how to utilize the factors needs further improvement. In this paper we propose a new trust prediction model based on trust decomposition and matrix factorization, considering all the above essential factors to predict the trust between two users who are not directly connected. In this model, we firstly decompose trust into biased trust and bias-reduced trust. Then based on bias-reduced trust ratings, matrix factorization with a similarity regularization term, which takes advantages of both users' rating habits and propagated trust, is proposed to predict missing trust values. In the end, the missing trust is recomposed with predicted trust values and bias. Experiments conducted on a real-world dataset illustrate significantly improved prediction accuracy over the state-of-the-art approaches.

16:30
Robust Multi-View Spectral Clustering via Low-Rank and Sparse Decomposition
SPEAKER: unknown

ABSTRACT. Multi-view clustering, which seeks a partition of the data in multiple views that often provide complementary information to each other, has received considerable attention in recent years. In real life clustering problems, the data in each view may have considerable noise. However, existing clustering methods blindly combine the information from multi-view data with possibly considerable noise, which often degrades their performance. In this paper, we propose a novel Markov chain method for \textit{Robust Multi-view Spectral Clustering} (RMSC). Our method has a flavor of low-rank and sparse decomposition, where we firstly construct a transition probability matrix from each single view, and then use these matrices to recover a shared low-rank transition probability matrix as a crucial input to the standard Markov chain method for clustering. The optimization problem of RMSC has a low-rank constraint on the transition probability matrix, and simultaneously a probabilistic simplex constraint on each of its rows. To solve this challenging optimization problem, we propose an optimization procedure based on the Augmented Lagrangian Multiplier scheme. Experimental results on various real world datasets show that the proposed method has superior performance over several state-of-the-art methods for multi-view clustering.

16:30
Similarity-Preserving Binary Signature for Linear Subspaces
SPEAKER: unknown

ABSTRACT. Linear subspace is an important representation for many kinds of real-world data in computer vision and pattern recognition, e.g. faces, motion videos, speech. In this paper, first we define pairwise angular similarity and angular distance for linear subspaces. The angular distance satisfies non-negativity, identity of indiscernibles, symmetry and triangle inequality, and thus it is a metric. Then we propose a method to compress linear subspaces into compact similarity-preserving binary signatures, between which the normalized Hamming distance is an unbiased estimator of the angular distance. We provide a lower bound on the length of binary signatures which suffices to guarantee a uniform distance-preservation within a set of subspaces. Experiments on face recognition demonstrate the effectiveness of this binary signature in terms of recognition accuracy, speed and storage requirement. The results show that, compared with the exact method, the approximation with binary signatures achieves an order of magnitude speed-up, while requiring significantly smaller amount of storage space, yet it still accurately preserves the similarity, and achieves high recognition accuracy comparable to the exact method in face recognition.

16:30
Leveraging Fee-Based, Imperfect Advisors in Human-Agent Games of Trust
SPEAKER: unknown

ABSTRACT. This paper explores whether the addition of costly, imperfect, and exploitable advisors to Berg's investment game enhances or detracts from investor performance in both one-shot and multi-round interactions. We then leverage our findings to develop an automated investor agent that performs as well as or better than humans in these games. To gather this data, we extended Berg's game and conducted a series of experiments using Amazon's Mechanical Turk to determine how humans behave in these potentially adversarial conditions. Our results indicate that, in games of short duration, advisors do not stimulate positive behavior and are not useful in providing actionable advice. In long-term interactions, however, advisors do stimulate positive behavior with significantly increased investments and returns. By modeling human behavior across several hundred participants, we were then able to develop agent strategies that maximized return on investment and performed as well as or significantly better than humans. In one-shot games, we identified an ideal investment value that, on average, resulted in positive returns as long as advisor exploitation was not allowed. For the multi-round games, our agents relied on the corrective presence of advisors to stimulate positive returns on maximum investment.

17:30-18:30 Session 48: AAAI-14 AI Video Competition Awards
Location: Main Hall, Foyer 4, Loggia
Disclaimer | Powered by EasyChair Smart Program