AAAI-14: TWENTY-EIGHTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE

# PROGRAM FOR WEDNESDAY, JULY 30TH

Days:
previous day
next day
all days
08:30-09:15 Session 25A: AAAI-14: What's Hot Session (RL/ECMLPKDD/RSS)
Location: Room 301A
 08:30 What's Hot in Reinforcement LearningSPEAKER: Volodymyr Mnih 08:45 What's Hot in Machine LearningSPEAKER: Animashree Anandkumar 09:00 Challenges in RoboticsSPEAKER: Stefanie Tellex
08:30-09:15 Session 25B: AAAI-14: Senior Member Session
Location: Room 301B
 08:30 Program Induction in the Real WorldSPEAKER: Stephen MuggletonABSTRACT. Much of the world’s population use computers for everyday tasks, but most fail to benefit from the power of computation due to their inability to program. Inductive Program Synthesis, the subject of a long tradition of ongoing aca- demic AI research1 [6, 12], has recently been incorporated for the first time into a mass market product in the form of the FlashFill subsystem found in Microsoft Excel 2013. FlashFill [3, 4] employs Formal Methods techniques to support inductive synthesis of string processing programs for spreadsheets, typ- ically from a single user-provided example. The ability to learn from a single example is typical of human learning [7], while being uncharacteristic of most forms of machine learning. Despite the power of FlashFill to support a natural interface for everyday inductive programming, its present implementation falls short of the early goals of AI areas such as Inductive Logic Programming (ILP) [12] in its lack of 1) Automatic Hierarchical Program Decomposition, 2) Lack of iteration and recursion in the synthesised program, 3) Inability to re-use parts of learned programs in new tasks, 4) Inability to provide simple human understandable feedback on the form of the learned program to the user. This presentation will survey recent progress in AI approaches to inductive program synthesis and will demonstrate how these techniques are able to overcome the shortcomings of FlashFill on string transformation tasks. 08:45 Implications of Algorithmic and High-Frequency TradingSPEAKER: Michael WellmanABSTRACT. Algorithms are taking over financial markets. AI researchers have a unique perspective to contribute in helping the world understand the implications of this development. 09:00 Spectral Learning Methods for Natural Language ProcessingSPEAKER: Lyle UngarABSTRACT. This talk will explain recent advances in spectral methods for latent variable models. We will focus on spectral methods for estimating HMMs, as this setting serves best for giving the intuition behind the algorithms and theorems about them. Spectral HMM methods form the basis for a variety of NLP applications, including spectral dependency parsing and latent variable PCFGs. We show that using such spectral methods to map from words to vectors provides features that improve performance in a number of natural language processing applications including named entity recognition (is a mention of "Paris" a place, person or perfume?) and parsing. Spectral HMM estimation methods can also be elegantly generalized to include continuous emissions, for applications including speech recognition, machine vision and robotics. Examples will be drawn primarily from NLP, but these techniques are increasing also used in robotics and machine vision.
08:30-09:15 Session 25C: AAAI-14: What's Hot Session (ICWSM/SoCS/CogSci)
Location: Room 303A
 08:30 What's Hot in Social MediaSPEAKER: Eytan Adar 08:45 Challenges in Combinatorial SearchSPEAKER: Nathan Sturtevant 09:00 Challenges in Cognitive ScienceSPEAKER: Vincent C. Mueller
08:30-09:15 Session 25D: AAAI-14: Senior Member Session
Location: Room 303B
09:00-10:00 Session 26: IAAI-14 Invited Talk
Location: Hall 200A
 09:00 The Conversational User InterfaceSPEAKER: Ron KaplanABSTRACT. Work on both the graphical user interface (GUI) and the conversational user interface (CUI) started at about the same time, about 40 years ago. The GUI was a lot easier to implement, and it made computing and information resources available to ordinary people. But over the years it has lost much of its simplicity and charm. The CUI has taken many more years to develop, requiring major scientific and engineering advances in speech, natural language processing, user-modeling, and reasoning, not to mention increases in cost-effective computation. But the infrastructure is now in place for the creation and widespread distribution of conversational interfaces. This talk describes some natural modes of conversational interaction and some of the supporting technologies that are now under development.
09:15-10:00 Session 27A: AAAI-14: Novel Machine Learning Algorithms
Location: Room 301A
 09:15 Kernelized Bayesian Transfer LearningSPEAKER: unknownABSTRACT. Transfer learning considers related but distinct tasks defined on heterogenous domains and tries to transfer knowledge between these tasks to improve generalization performance. It is particularly useful when we do not have sufficient amount of labeled training data in some tasks, which may be very costly, laborious, or even infeasible to obtain. Instead, learning the tasks jointly enables us to effectively increase the amount of labeled training data. In this paper, we formulate a kernelized Bayesian transfer learning framework that is a principled combination of kernel-based dimensionality reduction models with task-specific projection matrices to find a shared subspace and a coupled classification model for all of the tasks in this subspace. Our two main contributions are: (i) two novel probabilistic models for binary and multiclass classification, and (ii) very efficient variational approximation procedures for these models. We illustrate the generalization performance of our algorithms on two different applications. In computer vision experiments, our method outperforms the state-of-the-art algorithms on nine out of 12 benchmark supervised domain adaptation experiments defined on two object recognition data sets. In cancer biology experiments, we use our algorithm to predict mutation status of important cancer genes from gene expression profiles using two distinct cancer populations, namely, patient-derived primary tumor data and in-vitro-derived cancer cell line data. We show that we can increase our generalization performance on primary tumors using cell lines as an auxiliary data source. 09:30 Dropout Training for Support Vector MachinesSPEAKER: unknownABSTRACT. Dropout and other feature noising schemes have shown promising results in controlling over-fitting by artificially corrupting the training data. Though extensive theoretical and empirical studies have been performed for generalized linear models, little work has been done for support vector machines (SVMs), one of the most successful approaches for supervised learning. This paper presents dropout training for linear SVMs. To deal with the intractable expectation of the non-smooth hinge loss under corrupting distributions, we develop an iteratively re-weighted least square (IRLS) algorithm by exploring data augmentation techniques. Our algorithm iteratively minimizes the expectation of a re-weighted least square problem, where the re-weights have closed-form solutions. The similar ideas are applied to develop a new IRLS algorithm for the expected logistic loss under corrupting distributions. Our algorithms offer insights on the connection and difference between the hinge loss and logistic loss in dropout training. Empirical results on several real datasets demonstrate the effectiveness of dropout training on significantly boosting the classification accuracy of linear SVMs. 09:45 ReLISH: Reliable Label Inference via Smoothness HypothesisSPEAKER: unknownABSTRACT. The smoothness hypothesis is critical for graph-based semi-supervised learning. This paper defines local smoothness, based on which a new algorithm, Reliable Label Inference via Smoothness Hypothesis (ReLISH), is proposed. ReLISH has produced smoother labels than some existing methods for both labeled and unlabeled examples. Theoretical analyses demonstrate good stability and generalizability of ReLISH. Using real-world datasets, our empirical analyses reveal that ReLISH is promising for both transductive and inductive tasks, when compared with representative algorithms, including Harmonic Functions, Local and Global Consistency, Constraint Metric Learning, Linear Neighborhood Propagation, and Manifold Regularization.
09:15-10:00 Session 27B: AAAI-14: Knowledge Representation and Reasoning / Applications
Location: Room 301B
 09:15 Using Model-Based Diagnosis to Improve Software TestingSPEAKER: unknownABSTRACT. We propose a combination of AI techniques to improve software testing. When a test fails, a model-based diagnosis (MBD) algorithm is used to propose a set of possible explanations. We call these explanations "diagnoses". Then, a planning algorithm is used to suggest further tests to identify the correct diagnosis. A tester preforms these tests and reports their outcome back to the MBD algorithm, which uses this information to prune incorrect diagnoses. This iterative process continues until the correct diagnosis is returned. We call this testing paradigm Test, Diagnose and Plan (TDP). Several test planning algorithms are proposed to minimize the number of TDP iterations, and consequently the number of tests required until the correct diagnosis is found. Experimental results show that benefits of using an MDP-based planning algorithms over greedy test planning. 09:30 Pay-as-you-go OWL Query Answering Using a Triple StoreSPEAKER: unknownABSTRACT. We present an enhanced hybrid approach to OWL query answering that combines an RDF triple-store with a fully-fledged OWL reasoner in order to provide scalable pay as you go'' performance. The enhancements presented here include an extension to deal with arbitrary OWL ontologies, and several optimisations that significantly improve scalability. We have implemented these techniques in a prototype system, a preliminary evaluation of which has produced very encouraging results. 09:45 The Most Uncreative Examinee: A First Step toward Wide Coverage Natural Language Math Problem SolvingSPEAKER: unknownABSTRACT. We report on a project aiming at developing a system that solves a wide range of math problems written in natural language. In the system, formal analysis of natural language semantics is coupled with automated reasoning technologies including computer algebra, using logic as their common language. We have developed a prototype system that accepts as its input a linguistically annotated problem text. Using the prototype system as a reference point, we analyzed real university entrance examination problems from the viewpoint of end-to-end automated reasoning. Further, evaluation on entrance exam mock tests revealed that an optimistic estimate of the system’s performance already matches human averages on a few test sets.
09:15-10:00 Session 27C: AAAI-14: Heuristic Search and Optimization
Location: Room 302A
 09:15 Exponential Deepening A* for Real-Time Agent-Centered SearchSPEAKER: unknownABSTRACT. In the Real-Time Agent-Centered Search (RTACS) problem, an agent has to arrive at a goal location while acting and reasoning in the physical state space. Traditionally, RTACS problems are solved by propagating and updating heuristic values of states visited by the agent. In existing RTACS algorithms (e.g., the LRTA* family) the agent may revisit each state a many times causing the entire procedure to be quadratic in the state space. In this paper we study the Iterative Deepening (ID) approach for solving RTACS. We then present Exponential Deepening A* (EDA*), a RTACS algorithm where the threshold between successive Depth-First calls is increased exponentially. We prove that EDA* results in a linear worst case bound and support this experimentally by demonstrating up to 10x reduction over existing RTACS solvers WRT. states expanded and CPU runtime. 09:30 Identifying Hierarchies for Fast Optimal SearchSPEAKER: unknownABSTRACT. Search with Subgoal Graphs (Uras, Koenig, and Hernandez 2013) was a non-dominated optimal path-planning algorithm in the Grid-Based Path Planning Competitions 2012 and 2013. During a preprocessing phase, it computes a Simple Subgoal Graph, which is analogous to a visibility graph for continuous terrain, and then partitions the subgoals into global and local subgoals to obtain a Two-Level Subgoal Graph. During the path-planning phase, it performs an A* search that ignores local subgoals that are not relevant to the search, which significantly reduces the size of the graph being searched. In this paper, we generalize this partitioning process to any undirected graph and show that it can be recursively applied to generate more than two levels, which reduces the size of the graph being searched even further. We distinguish between basic partitioning, which only partitions the vertices into different levels, and advanced partitioning, which can also add new edges. We show that the construction of Simple Subgoal Graphs from grids and the construction of Two-Level Subgoal Graphs from Simple Subgoal Graphs are instances of generalized partitioning. We then report on experiments on Subgoal Graphs that demonstrate the effects of different types and levels of partitioning. We also report on experiments that demonstrate that our new N-Level Subgoal Graphs with several additional improvements achieve a better performance compared to Two-Level Subgoal graphs from (Uras, Koenig, and Hernandez 2013). 09:45 Elimination Ordering in Lifted First-Order Probabilistic InferenceSPEAKER: unknownABSTRACT. Various representations and inference methods have been proposed for lifted probabilistic inference in relational models. Many of these methods choose an order to eliminate (or branch on) the parametrized random variables. Similar to such methods for non-relational probabilistic inference, the order of elimination has a significant role in the performance of the algorithm. Since finding the best order is NP-complete even for non-relational models, heuristics have been proposed to find good orderings in the non-relational models. We show that these heuristics are inefficient for relational models, because they fail to consider the population sizes associated with logical variables in the parametrized random variable. In this paper, we extend existing heuristics for non-relational models and propose new heuristics for relational models. We evaluate the existing and new heuristics on a range of generated relational graphs.
09:15-10:00 Session 27D: AAAI-14: Cognitive Modeling
Location: Room 303A
 09:15 The Importance of Cognition and Affect for Artificially Intelligent Decision MakersSPEAKER: unknownABSTRACT. Agency – the capacity to plan and act – and experience – the capacity to sense and feel – are two critical aspects that determine whether people will perceive non-human entities, such as autonomous agents, to have a mind. There is evidence that the absence of either can reduce cooperation. We present an experiment that tests the necessity of both for cooperation with agents. In this experiment we manipulated people’s perceptions about the cognitive and affective abilities of agents, when engaging in the ultimatum game. The results indicated that people offered more money to agents that were perceived to make decisions according to their intentions, rather than randomly. Additionally, the results showed that people offered more money to agents that expressed emotion, when compared to agents that did not. We discuss the implications of this agency-experience theoretical framework for the design of artificially intelligent decision makers. 09:30 Efficient Codes for Inverse Dynamics during WalkingSPEAKER: unknownABSTRACT. Efficient codes have been used effectively in both computer science and neuroscience to better understand the information processing in visual and auditory encoding and discrimination tasks. In this paper, we explore the use of efficient codes for representing information relevant to human movements during locomotion. Specifically, we apply motion capture data to a physical model of the human skeleton to compute joint angles (inverse kinematics) and joint torques (inverse dynamics); then, by treating the resulting data as a regression problem, we investigate the effect of sparsity in mapping from angles to torques. The results of our investigation suggest that sparse codes can indeed represent salient features of both the kinematic and dynamic views of locomotion movements in humans. However, sparsity appears to be only one parameter in building a model of inverse dynamics; we also show that the "encoding" process benefits significantly by integrating with the "regression" process for this task. Finally, we use our results to argue that representations of movement are critical to modeling and understanding these movements. 09:45 Modeling Subjective Experience-based Learning under Uncertainty and FramesSPEAKER: unknownABSTRACT. In this paper we computationally examine how subjective experience may help or harm the decision maker's learning under uncertain outcomes, frames and their interactions. To model subjective experience, we propose the experienced-utility function'' based on a prospect theory (PT)-based parameterized subjective value function. Our analysis and simulations of two-armed bandit tasks present that the task domain (underlying outcome distributions) and framing (reference point selection) influence experienced utilities and in turn, the subjective discriminability'' of choices under uncertainty. Experiments demonstrate that subjective discriminability improves on objective discriminability by the use of the experienced-utility function with appropriate framing for a domain, and that bigger subjective discriminability leads to more optimal decisions in learning under uncertainty.
09:15-10:00 Session 27E: AAAI-14: Game Theory and Economic Paradigms / Game Playing
Location: Room 303B
 09:15 A Parameterized Complexity Analysis of Generalized CP-NetsSPEAKER: Andreas PfandlerABSTRACT. Generalized CP-nets (GCP-nets) allow a succinct representation of preferences over multi-attribute domains. As a consequence of their succinct representation, many GCP-net related tasks are computationally hard. Even finding the more preferable of two outcomes is PSPACE-complete. In this work, we employ the framework of parameterized complexity to achieve two goals: First, we want to gain a deeper understanding of the complexity of GCP-nets. Second, we search for efficient fixed-parameter tractable algorithms. 09:30 Evolutionary Dynamics of Q-Learning over the Sequence FormSPEAKER: unknownABSTRACT. Multi-agent learning is a challenging open task in artificial intelligence. It is known an interesting connection between multi-agent learning algorithms and evolutionary game theory, showing that the learning dynamics of some algorithms can be modeled as replicator dynamics with a mutation term. Inspired by the recent sequence-form replicator dynamics, we develop a new version of the Q-learning algorithm working on the sequence form of an extensive-form game allowing thus an exponential reduction of the dynamics length w.r.t. those of the normal form. The dynamics of the proposed algorithm can be modeled by using the sequence-form replicator dynamics with a mutation term. We show that, although sequence-form and normal-form replicator dynamics are realization equivalent, the Q-learning algorithm applied to the two forms have non-realization equivalent dynamics. Originally from the previous works on evolutionary game theory models form multi-agent learning, we produce an experimental evaluation to show the accuracy of the model. 09:45 Two Case Studies for Trading Multiple Indivisible Goods with IndifferencesSPEAKER: unknownABSTRACT. Designing mechanisms that satisfy individual rationality, Pareto efficiency, and strategyproofness is one of the most important problems in mechanism design. In this paper we investigate mechanism design for exchange models where each agent is initially endowed with a set of goods, each agent may have indifferences on distinct bundles of goods, and monetary transfers are not allowed. Sönmez (1999) showed that in such models, those three properties are not compatible in general. The impossibility, however, only holds under an assumption on preference domains. The purpose of this paper is to give a discussion on the compatibility of those three properties when the assumption does not hold. We first establish a preference domain called top-only preferences, which violates the assumption, and develop a class of exchange mechanisms satisfying all those properties. Each mechanism in the class utilizes one instance of mechanisms introduced by Saban and Sethuraman (2013). We also find a class of preference domains called m-chotomous preferences, where the assumption fails and those properties are incompatible.
10:00-10:20Coffee Break
10:20-11:50 Session 28A: AAAI-14: Novel Machine Learning Algorithms
Location: Room 301A
10:20-11:50 Session 28B: AAAI-14: NLP and Knowledge Representation / NLP and Vision
Location: Room 301B
 10:20 SenticNet 3: A Common and Common-Sense Knowledge Base for Cognition-Driven Sentiment AnalysisSPEAKER: Erik CambriaABSTRACT. SenticNet is a publicly available semantic and affective resource for concept-level opinion mining and sentiment analysis. Rather than using graph-mining and dimensionality-reduction techniques, SenticNet 3 makes use of energy flows' to connect various parts of extended common and common-sense knowledge representations to one another. SenticNet 3 models nuanced semantics and sentics (that is, the conceptual and affective information associated with multi-word natural language expressions), representing information with a symbolic opacity intermediate between that of neural networks and of typical symbolic systems. 10:35 Improving Domain-independent Cloud-based Speech Recognition with Domain-dependent Phonetic Post-processingSPEAKER: unknownABSTRACT. Automated speech recognition (ASR) technology has been developed to such a level that off-the-shelf distributed speech recognition services are available (free of cost) that allow researchers to integrate speech into their applications with little development effort or expert knowledge leading to better results compared with previously used open-source tools. Often, however, such services do not accept language models or grammars but process free speech from any domain. While results are very good given the enormous size of the search space, results frequently contain out-of-domain words or constructs that cannot be understood by subsequent domain-dependent natural language understanding (NLU) components. In this paper we present a versatile post-processing technique based on phonetic distance that integrates domain knowledge with open-domain ASR results, leading to improved ASR performance. Notably, our technique is able to make use of domain restrictions using various degrees of domain knowledge, ranging from pure vocabulary restrictions via grammars or N-grams to restrictions of the acceptable utterances. We present results for a variety of corpora (mainly from human-robot interaction) where our combined approach significantly outperforms Google ASR as well as a plain open-source ASR solution. 10:50 Knowledge Graph Embedding by Translating on HyperplanesSPEAKER: unknownABSTRACT. We deal with embedding a large scale knowledge graph composed of entities and relations into a continuous vector space. TransE is a promising method proposed recently which is very efficient while achieving the state-of-the-art predictive performance. We discuss about some mapping properties of relations which should be considered in embedding, such as symmetric, one-to-many, many-to-one, and many-to-many. We point out that TransE does not do well in dealing with these properties. Some complex models are capable to preserve these mapping properties but sacrificing the efficiency. To make a good trade-off between model capacity and efficiency, in this paper we propose a method where a relation is modeled as a hyperplane together with a translation operation on it. In this way, we can well preserve the above mapping properties of relations with almost the same model complexity of TransE. Meanwhile, as a practical knowledge graph is often far from completed, how to construct negative examples to reduce false negative labels in training is very important. Utilizing the one-to-many/many-to-one mapping property of a relation, we propose a simple trick to reduce the possibility of false negative labeling. We conduct extensive experiments on the tasks of link prediction, triplet classification and fact extraction on benchmark data sets from WordNet and Freebase. They show impressive improvements on predictive accuracy and also the capability to scale up. 11:05 Mind the Gap: Machine Translation by Minimizing the Semantic Gap in Embedding SpaceSPEAKER: Jiajun ZhangABSTRACT. The conventional statistical machine translation (SMT) methods perform the decoding process by compositing a set of the translation rules which have the highest probability. However, the probabilities of the translation rules are calculated only according to the cooccurrence statistics in the bilingual corpus rather than the semantic meaning similarity. In this paper, we propose a Recursive Neural Network (RNN) based model that converts each translation rule into a compact real-valued vector in the semantic embedding space and performs the decoding process by minimizing the semantic gap between the source language string and its translation candidates at each state in a bottom-up structure. The RNN-based translation model is trained using a max-margin objective function. Extensive experiments on Chinese-to-English translation show that our RNN-based model can significantly improve the translation quality by up to 1.68 BLEU score. 11:20 Hybrid Singular Value Thresholding for Tensor CompletionSPEAKER: unknownABSTRACT. In this paper, we study the low-rank tensor completion problem, where a high-order tensor with missing entries is given and the goal is to complete the tensor. We propose to minimize a new convex objective function, based on log sum of exponentials of nuclear norms, that promotes the low-rankness of unfolding matrices of the completed tensor. We show for the first time that the proximal operator to this objective function is readily computable through a hybrid singular value thresholding scheme. This leads to a new solution to high-order (low-rank) tensor completion via convex relaxation. We show that this convex relaxation and the resulting solution are much more effective than existing tensor completion methods (including those also based on minimizing ranks of unfolding matrices). The hybrid singular value thresholding scheme can be applied to any problem where the goal is to minimize the maximum rank of a set of low-rank matrices. 11:35 PREGO: An Action Language for Belief-Based Cognitive Robotics in Continuous DomainsSPEAKER: unknownABSTRACT. Cognitive robotics is often subject to the criticism that the proposals investigated in the literature are far removed from the kind of continuous uncertainty and noise seen in actual real-world robotics. This paper proposes a new language and an implemented system, called PREGO, based on the situation calculus, that is able to reason effectively about degrees of belief against noisy sensors and effectors in continuous domains. It embodies the representational richness of conventional logic-based action languages, such as context-sensitive successor state axioms, but is still shown to be efficient using a number of empirical evaluations. We believe that PREGO is a simple yet powerful dialect to explore real-time reactivity and an interesting bridge between logic and probability for cognitive robotics applications.
10:20-11:50 Session 28C: AAAI-14: Vision
Location: Room 302A
 10:20 Learning Low-Rank Representations with Classwise Block-Diagonal Structure for Robust Face RecognitionSPEAKER: unknownABSTRACT. Face recognition has been widely studied due to its importance in various applications. However, the case that both training images and testing images are corrupted is not well addressed. Motivated by the success of low-rank matrix recovery, we propose a novel semi-supervised low-rank matrix recovery algorithm for robust face recognition. The proposed method can learn robust discriminative representations for both training images and testing images simultaneously by exploiting the classwise block-diagonal structure. Specifically, low-rank matrix approximation can handle the possible contamination of data and the sparse noises. Moreover, the classwise block-diagonal structure is exploited to promote discrimination between different classes. The above issues are formulated into a unified objective function and we design an efficient optimization procedure based on augmented Lagrange multiplier method to solve it. Extensive experiments on three public databases are performed to validate the effectiveness of our approach. The strong identification capability of representations with block-diagonal structure is verified. 10:35 Semantic Segmentation Using Multiple Graphs with Block-Diagonal ConstraintsSPEAKER: unknownABSTRACT. In this paper we propose a novel method for image semantic segmentation using multiple graphs. The multi-view affinity graph is constructed by leveraging the consistency between semantic space and multiple visual spaces. With block-diagonal constraints, we enforce the affinity matrix to be sparse such that the pairwise potential for dissimilar superpixels is close to zero. By a divide-and-conquer strategy, the optimization for learning affinity matrix is decomposed into several subproblems that can be solved in parallel. Using the $neighborhood$ $relationship$ between superpixels and the $consistency$ between affinity matrix and label-confidence matrix, we infer the semantic label for each superpixel of unlabeled images by minimizing an objective whose closed form solution can be easily obtained. Experimental results on two real-world image datasets demonstrate the effectiveness of our method. 10:50 Locality-constrained Low-rank Coding for Image ClassificationSPEAKER: unknownABSTRACT. Low-rank coding (LRC), originated from matrix decomposition, is recently introduced into image classification. Following the standard bag-of-words (BOW) pipeline, coding the data matrix in the sense of low-rankness incorporates contextual information into the traditional BOW model and it can capture the dependency relationship among neighbor patches. This differs from the traditional sparse coding paradigms which encode patches independently. Current LRC-based methods use l1 norm to increase the discrimination and sparsity of the learned codes. However, such methods fail to consider the local manifold structure between data space and dictionary space. To solve this problem, we propose a locality-constrained low-rank coding (LCLR) algorithm for image representations. By using the geometric structure information as a regularization term, the obtained representations are more discriminative. In addition, we present a fast and stable online algorithm to solve the optimization problem. In the experiments, we evaluate LCLR on four benchmarks, including one face recognition dataset (extended Yale B), one handwritten digit recognition dataset (USPS), and two image datasets (Scene13 for scene recognition and Caltech101 for object recognition). Experimental results show that our approach outperforms many state-of-the-art algorithms even with a linear classifier. 11:05 Diagram Understanding in Geometry ProblemsSPEAKER: unknownABSTRACT. Automatically solving geometry questions is a long-standing AI problem. A geometry question typically includes a textual description accompanied by a diagram. The first step in solving geometry questions is diagram understanding, which consists of identifying visual elements in the diagram, their location, their geometric properties, and aligning them to corresponding textual descriptions. In this paper, we present a method for diagram understanding that identifies visual elements in a diagram while maximizing agreement between textual and visual data. We show that the method's objective function is submodular; thus we are able to introduce an efficient method for diagram understanding that is close to optimal. To empirically evaluate our method, we compile a new dataset of geometry questions (textual descriptions and diagrams) and compare with baselines that utilize standard vision techniques. Our experimental evaluation shows an F1 boost of more than 17\% in identifying visual elements and 25\% in aligning visual elements with their textual descriptions. 11:20 Latent Domains Modeling for Visual Domain AdaptationSPEAKER: unknownABSTRACT. To improve robustness to significant mismatches between source domain and target domain - arising from changes such as illumination, pose and image quality - domain adaptation is increasingly popular in computer vision. But most of methods assume that the source data is from single domain, or that multi-domain datasets provide the domain label for training instances. In practice, most datasets are mixtures of multiple latent domains, and difficult to manually provide the domain label of each data point. In this paper, we propose a model that automatically discovers latent domains in visual datasets. We first assume the visual images are sampled from multiple manifolds, each of which represents different domain, and which are represented by different subspaces. Using the neighborhood structure estimated from images belonging to the same category, we approximate the local linear invariant subspace for each image based on its local structure, eliminating the category-specific elements of the feature. Based on the effectiveness of this representation, we then propose a squared-loss mutual information based clustering model with category distribution prior in each domain to infer the domain assignment for images. In experiment, we test our approach on two common image datasets, the results show that our method outperforms the existing state-of-the-art methods, and also show the superiority of multiple latent domain discovery 11:35 Low-Rank Tensor Learning with Discriminant Analysis for Action Classification and Image RecoverySPEAKER: unknownABSTRACT. Tensor completion is an important topic in the area of image processing and computer vision research, which is generally built on extraction of the intrinsic structure of tensor data. However, tensor completion techniques are rarely used for action classification, which heavily relies on the extracted features of high-dimensional tensors as well. In this paper, we proposed a method for video based action classification via low-rank tensor completion. Since there could be distortion and corruption in the tensor, we projected the tensor into the subspace, which contains the invariant structure of the tensor and be used as the input of the classifier. The key point is that we aim to calculate the optimal projection matrices, which have the low-rank structure and are used to obtain the subspace. In order to integrate the useful supervisory information of data, we adopt the discriminant analysis criterion to learn the projection matrices and the augmented Lagrange multiplier (ALM) algorithm to solve the multi-variate optimization problem. We explained the properties of the projection matrices, which indicate the different meanings in the row space, column space and frame space, respectively. Experiments demonstrate that our method has obtained better accuracy compared with some other state-of-the-art low-rank tensor methods on MSR Hand Gesture 3D database and MSR Action 3D database.
10:20-11:50 Session 28D: AAAI-14: Planning and Scheduling
Location: Room 302B
10:20-11:50 Session 28E: AAAI-14: AI and the Web
Location: Room 303A
 10:20 Fraudulent Support Telephone Number Identification Based on Co-occurrence Information on the WebSPEAKER: unknownABSTRACT. "Fraudulent support phones" refers to the misleading telephone numbers placed on Web pages or other media that claim to provide services with which they are not associated. Most fraudulent support phone information is found on search engine result pages (SERPs), and such information substantially degrades the search engine user experience. In this paper, we propose an approach to identify fraudulent support telephone numbers on the Web based on the co-occurrence relations between telephone numbers that appear on SERPs. We start from a small set of seed official support phone numbers and seed fraudulent numbers. Then, we construct a co-occurrence graph according to the co-occurrence relationships of the telephone numbers that appear on Web pages. Additionally, we take the page layout information into consideration on the assumption that telephone numbers that appear in nearby page blocks should be regarded as more closely related. Finally, we develop a propagation algorithm to diffuse the trust scores of seed official support phone numbers and the distrust scores of the seed fraudulent numbers on the co-occurrence graph to detect additional fraudulent numbers. Experimental results based on over 1.5 million SERPs produced by a popular Chinese commercial search engine indicate that our approach outperforms TrustRank, Anti-TrustRank and Good-Bad Rank algorithms by achieving an AUC value of over 0.90. 10:35 Influence Maximization with Novelty Decay in Social NetworksSPEAKER: unknownABSTRACT. Influence maximization problem is to find a set of seed nodes in a social network such that their influence spread is maximized under certain propagation models. A few algorithms have been proposed for this problem. However, they have not considered the impact of novelty decay in influence propagation, i.e., repeated exposures will have diminishing influence on users. In this paper, we consider the problem of influence max- imization with novelty decay (IMND). We investigate the effect of novelty decay on influence propagation in real-life datasets and formulate the IMND problem. We further analyze its relevant properties and propose an influence estimation technique. We demonstrate perfor- mance of our algorithms over four social networks. 10:50 Online Social Spammer DetectionSPEAKER: unknownABSTRACT. The explosive use of social media also makes it a popular platform for malicious users, known as social spammers, to overwhelm normal users with unwanted content. One effective way for social spammer detection is to build a classifier based on content and social network information. However, social spammers are sophisticated and adaptable to game the system with fast evolving content and network patterns. First, social spammers continually change their spamming content patterns to avoid being detected. Second, reflexive reciprocity makes it easier for social spammers to establish social influence and pretend to be normal users by quickly accumulating a large number of human" friends. It is challenging for existing anti-spamming systems based on batch-mode learning to quickly respond to newly emerging patterns for effective social spammer detection. In this paper, we present a general optimization framework to collectively use content and network information for social spammer detection, and provide the solution for efficient online processing. Experimental results on Twitter datasets confirm the effectiveness and efficiency of the proposed framework. 11:05 Stochastic PrivacySPEAKER: unknownABSTRACT. Online services such as web search and e-commerce applications typically rely on the collection of data about users, including details of their activities on the web. Such personal data is used to enhance the quality of service via personalization of content and to maximize revenues via better targeting of advertisements and deeper engagement of users on sites. To date, service providers have largely followed the approach of either requiring or requesting consent for opting-in to share their data. Users may be willing to share private information in return for better quality of service or for incentives, or in return for assurances about the nature and extend of the logging of data. We introduce \emph{stochastic privacy}, a new approach to privacy centering on a simple concept: A guarantee is provided to users about the upper-bound on the probability that their personal data will be used. Such a probability, which we refer to as \emph{privacy risk}, can be assessed by users as a preference or communicated as a policy by a service provider. Service providers can work to personalize and to optimize revenues in accordance with preferences about privacy risk. We present procedures, proofs, and an overall system for maximizing the quality of services, while respecting bounds on allowable or communicated privacy risk. We demonstrate the methodology with a case study and evaluation of the procedures applied to web search personalization. We show how we can achieve near-optimal utility of accessing information with provable guarantees on the probability of sharing data. 11:20 k-CoRating: Filling up Data to Obtain Privacy and UtilitySPEAKER: unknownABSTRACT. For datasets in Collaborative Filtering (CF) recommendations, even if the identifier is deleted and some trivial perturbation operations are applied to ratings before they are released, there are research results claiming that the adversary could discriminate the individual's identity with a little bit of information. In this paper, we propose $k$-coRating, a novel privacy-preserving model, to retain data privacy by replacing some null ratings with significantly predicted scores. They do not only mask the original ratings such that a $k$-anonymity-like data privacy is preserved, but also enhance the data utility (measured by prediction accuracy in this paper), which shows that the traditional assumption that accuracy and privacy are two goals in conflict is not necessarily correct. We show that the optimal $k$-coRated mapping is an NP-hard problem and design a naive but efficient algorithm to achieve $k$-coRating. All claims are verified by experimental results. 11:35 Fast and Accurate Influence Maximization on Large Networks with Pruned Monte-Carlo SimulationsSPEAKER: unknownABSTRACT. Influence maximization is a problem to find small sets of highly influential individuals in a social network to maximize the spread of influence under stochastic cascade models of propagation. Although the problem has been well-studied, it is still highly challenging to find solutions of high quality in large-scale networks of the day. While Monte-Carlo-simulation-based methods produce nearly optimal solutions with a theoretical guarantee, they are prohibitively slow for large graphs. As a result, many heuristic methods without any theoretical guarantee have been developed, but all of them substantially compromise solution quality. To address this issue, we propose a new method for the influence maximization problem. Unlike other recent heuristic methods, the proposed method is a Monte-Carlo-simulation-based method, and thus it consistently produces solutions of high quality with the theoretical guarantee. On the other hand, unlike other previous Monte-Carlo-simulation-based methods, it runs as fast as other state-of-the-art methods, and can be applied to large networks of the day. Through our extensive experiments, we demonstrate the scalability and the solution quality of the proposed method.
10:20-11:50 Session 28F: AAAI-14: Game Theory and Economic Paradigms
Location: Room 303B
 10:20 Game-theoretic Resource Allocation for Protecting Large Public EventsSPEAKER: unknownABSTRACT. High profile large scale public events are attractive targets for terrorist attacks. The recent Boston Marathon bombings on April 15, 2013 have further emphasized the importance of protecting public events. The security challenge is exacerbated by the dynamic nature of such events: e.g., the impact of an attack at different locations changes over time as the Boston marathon participants and spectators move along the race track. In addition, the defender can relocate security resources among potential attack targets at any time and the attacker may act at any time during the event. This paper focuses on developing efficient patrolling algorithms for such dynamic domains with continuous strategy spaces for both the defender and the attacker. We aim at computing optimal pure defender strategies, since an attacker does not have an opportunity to learn and respond to mixed strategies due to the relative infrequency of such events. We propose SCOUT-A, which makes assumptions on relocation cost, exploits payoff representation and computes optimal solutions efficiently. We also propose SCOUT-C to compute the exact optimal defender strategy for general cases despite the continuous strategy spaces. SCOUT-C computes the optimal defender strategy by constructing an equivalent game with discrete defender strategy space, then solving the constructed game. Experimental results show that both SCOUT-A and SCOUT-C significantly outperform other existing strategies. 10:35 Fixing a Balanced Knockout TournamentSPEAKER: unknownABSTRACT. Balanced knockout tournaments are one of the most common formats for sports competitions, and are also used in elections and decision-making. We consider the computational problem of finding the optimal draw for a particular player in such a tournament. The problem has generated considerable research within AI in recent years. We prove that checking whether there exists a draw in which a player wins is NP-complete, thereby settling an outstanding open problem. Our main result has a number of interesting implications on related counting and approximation problems. We present a memoization-based algorithm for the problem that is faster than previous approaches. Moreover, we highlight two natural cases that can be solved in polynomial time. All of our results also hold for the more general problem of counting the number of draws in which a given player is the winner. 10:50 Lazy Defenders Are Almost Optimal Against Diligent AttackersSPEAKER: unknownABSTRACT. Most work building on the Stackelberg security games model assumes that the attacker can perfectly observe the defender's randomized assignment of resources to targets. This assumption has been challenged by recent papers, which designed tailor-made algorithms that compute optimal defender strategies for security games with limited surveillance. We analytically demonstrate that in zero-sum security games, lazy defenders, who simply keep optimizing against perfectly informed attackers, are almost optimal against diligent attackers, who go to the effort of gathering a reasonable number of observations. This result implies that, in some realistic situations, limited surveillance may not need to be explicitly addressed. 11:05 Preference Elicitation and Interview Minimization in Stable MatchingsSPEAKER: unknownABSTRACT. While stable matching problems are widely studied, little work has investigated schemes for effectively eliciting agent preferences using either preference (e.g., comparison) queries or interviews (to form such comparisons); and no work has addressed how to combine both. We develop a new model for representing and assessing agent preferences that accommodates both forms of information and (heuristically) minimizes the number of queries and interviews required to determine a stable matching. Our Refine-then-Interview (RtI) scheme uses coarse preference queries to refine knowledge of agent preferences and relies on interviews only to assess comparisons of relatively "close" options. Empirical results show RtI to compare favorably to a recent pure interview minimization algorithm, and that the number of interviews is generally independent of the size of the market. 11:20 Regret Transfer and Parameter OptimizationSPEAKER: unknownABSTRACT. Regret matching is a widely-used algorithm for learning how to act. We begin by proving that regrets on actions in one setting (game) can be transferred to warm start the regrets for solving a different setting with same structure but different payoffs that can be written as a function of parameters. We prove how this can be done by carefully discounting the prior regrets. This provides, to our knowledge, the first warm-starting method for no-regret learning. It also extends to warm-starting the widely-adopted counterfactual regret minimization (CFR) algorithm for large incomplete-information games; we show this experimentally as well. We then study optimizing a parameter vector for a player in a two-player zero-sum game (e.g., optimizing bet sizes to use in poker). We propose a custom gradient descent algorithm that provably finds a locally optimal parameter vector while leveraging our warm-start theory to significantly save regret-matching iterations at each step. It optimizes the parameter vector while simultaneously finding an equilibrium. We present experiments in no-limit Leduc Hold'em and no-limit Texas Hold'em to optimize bet sizing. This amounts to the first action abstraction algorithm (algorithm for selecting a small number of discrete actions to use from a continuum of actions---a key preprocessing step for solving large games using current equilibrium-finding algorithms) with convergence guarantees for extensive-form games. 11:35 On Detecting Nearly Structured Preference ProfilesSPEAKER: unknownABSTRACT. Structured preference domains, such as, e.g., the domains of single-peaked and single-crossing preferences, are known to admit efficient algorithms for many problems in computational social choice. Some of these algorithms extend to preferences that are close to having the respective structural property, i.e., can be made to enjoy this property by making minor changes to voters’ preferences, such as deleting a small number of voters or candidates. However, it has recently been shown that finding the optimal number of voters or candidates to delete in order to achieve the desired structural property is NP-hard for many such domains. In this paper, we show that these problems admit efficient approximation algorithms. Our results apply to all domains that can be characterized in terms of forbidden configurations; this includes, in particular, single-peaked and single-crossing elections. For a large range of scenarios, our approximation results are optimal under a plausible complexity-theoretic assumption. We also provide parameterized complexity results for this class of problems.
10:20-11:50 Session 28G: IAAI-14: Deployed Applications
Location: Room 304A/B
11:50-13:00Lunch Break
13:00-14:00 Session 29: AAAI-14 Presidential Address
Location: Hall 200A
 13:00 AI and AAAI: Fascinating Research and Engaged CommunitySPEAKER: Manuela VelosoABSTRACT. AI research and development are currently in great demand. As we experience an enormous increase in a wide variety of cyber-physical-social systems and the goals to make them increasingly intelligent, AI is needed in all its multiple and varied technical facets. The talk will make the case for the need of an increased integration of our worldwide research, development, and organizational efforts and initiatives. Highlights of the past, of the recent present, and proposed directions for the future of AAAI will be presented. The talk will include and acknowledge input from the AAAI office, members, and current and past leadership.
14:00-14:40 Session 30A: AAAI-14 Plenary Technical Session III
Location: Hall 200A
 14:00 Feature-Cost Sensitive Learning with Submodular Trees of ClassifiersSPEAKER: unknownABSTRACT. During the past decade, machine learning algorithms have become commonplace in large-scale real-world industrial applications. In these settings, the computation time to train and test machine learning algorithms is a key consideration. At training-time the algorithms must scale to very large data set sizes. At testing-time, the cost of feature extraction can dominate the CPU runtime. Recently, a promising method was proposed to account for the feature extraction cost at testing time, called Cost-sensitive Tree of Classifiers (CSTC). Although the CSTC problem is NP-hard, the authors suggest an approximation through a mixed-norm relaxation across many classifiers. This relaxation is slow to train and requires involved optimization hyperparameter tuning. We propose a different relaxation using approximate submodularity, called Approximately Submodular Tree of Classifiers (ASTC). ASTC is much simpler to implement, yields equivalent results but requires no optimization hyperparameter tuning and is up to two orders of magnitude faster to train. 14:30 A Computational Challenge Problem in Materials Discovery: Synthetic Problem Generator and Real-World DatasetsSPEAKER: unknownABSTRACT. Newly-discovered materials have been central to recent technological advances. They have contributed significantly to breakthroughs in electronics, renewable energy and green buildings, and overall, have promoted the advancement of global human welfare. Yet, only a fraction of all possible materials have been explored. Accelerating the pace of discovery of materials would foster technological innovations, and would potentially address pressing issues in sustainability, such as energy production or consumption. The bottleneck of this discovery cycle lies, however, in the analysis of the materials data. As materials scientists have recently devised techniques to efficiently create thousands of materials and experimentalists have developed new methods and tools to characterize these materials, the limiting factor has become the data analysis itself. Hence, the goal of this paper is to stimulate the development of new computational techniques for the analysis of materials data, by bringing together the complimentary expertise of materials scientists and computer scientists. In collaboration with two major research laboratories in materials science, we provide the first publicly available dataset for the phase map identification problem. In addition, we provide a parameterized synthetic data generator to assess the quality of proposed approaches, as well as tools for data visualization and solution evaluation. 15:00 Collaborative Models for Referring Expression Generation in Situated DialogueSPEAKER: unknownABSTRACT. In situated dialogue with artificial agents (e.g., robots), although a human and an agent are co-present, the agent's representation and the human's representation of the shared environment are significantly mismatched. Because of this misalignment, previous work has shown that when the agent applies traditional approaches to generate referring expressions to describe target objects, the intended objects often cannot be correctly identified by the human. To address this problem, motivated by collaborative behaviors in human referential communication, we have developed two collaborative models - an episodic model and an installment model - for referring expression generation. In both models, instead of generating a single referring expression to describe a target object as in the previous work, it generates multiple small expressions that lead to the target object with a goal to minimize the collaborative effort. In particular, our installment model incorporates human feedback in a reinforcement learning framework to learn the optimal generation strategies. Our empirical results have shown that the episodic model and the installment model outperform previous non-collaborative models with an absolute gain of 6% and 21% respectively. 15:30 On Boosting Sparse ParitiesSPEAKER: Lev ReyzinABSTRACT. While the ensemble method of boosting has been extensively studied, considerably less attention has been devoted to the task of designing good weak learning algorithms. In this paper we consider the problem of designing weak learners that are especially adept to the boosting procedure and specifically the AdaBoost algorithm. First we describe conditions desirable for a weak learning algorithm. We then propose using sparse parity functions as weak learners, which have many of our desired properties, as weak learners in boosting. Our experimental tests show the proposed weak learners to be competitive with the most widely used ones: decision stumps and pruned decision trees. 16:00 Regret-Based Multi-Agent Coordination with Uncertain Task RewardsSPEAKER: Feng WuABSTRACT. Many multi-agent coordination problems can be represented as DCOPs. Motivated by task allocation in disaster response, we extend standard DCOP models to consider uncertain task rewards where the outcome of completing a task depends on its current state, which is randomly drawn from unknown distributions. The goal of solving this problem is to find a solution for all agents that minimizes the overall worst-case loss. This is a challenging problem for centralized algorithms because the search space grows exponentially with the number of agents and is nontrivial for existing algorithms for standard DCOPs. To address this, we propose a novel decentralized algorithm that incorporates Max-Sum with iterative constraint generation to solve the problem by passing messages among agents. By so doing, our approach scales well and can solve instances of the task allocation problem with hundreds of agents and tasks. 16:30 How Long Will It Take? Accurate Prediction of Ontology Reasoning PerformanceSPEAKER: unknownABSTRACT. For expressive ontology languages such as OWL 2 DL, classification is a computationally expensive task---2\textsc{NExpTime}-complete in the worst case. Hence, it is highly desirable to be able to accurately estimate classification time, especially for large and complex ontologies. Recently, machine learning techniques have been successfully applied to predicting the reasoning \emph{hardness category} for a given (ontology, reasoner) pair. In this paper, we further develop predictive models to estimate actual classification time using regression techniques, with ontology metrics as features. Our large-scale experiments on 6 state-of-the-art OWL 2 DL reasoners and more than 450 significantly diverse ontologies demonstrate that the prediction models achieve high accuracy, good generalizability and statistical significance. Such prediction models have a wide range of applications. We demonstrate how they can be used to efficiently and accurately identify \emph{performance hotspots} in an large and complex ontology, an otherwise very time-consuming and resource-intensive task. 17:00 Avoiding Plagiarism in Markov Sequence GenerationSPEAKER: unknownABSTRACT. Markov processes are widely used to generate sequences that imitate a given style, using random walk. Random walk generates sequences by iteratively concatenating states to prefixes of length equal or less than the given Markov order. However, at higher orders, Markov chains tend to replicate chunks of the corpus with a size possibly higher than the order, a primary form of plagiarism. In fact, the Markov order defines a maximum length for training but not for generation. In the framework of constraint satisfaction (CSP), we introduce MaxOrder. This global constraint ensures that generated sequences do not include chunks larger than a given maximum order. We exhibit an automaton that recognises the solution set, with a size linear in the size of the corpus. We propose a linear-time procedure to generate this automaton from a corpus and a given max order. We then use this automaton to achieve generalised arc consistency for the MaxOrder constraint, holding on a sequence of size n, in O(n.T) time, where T is the size of the automaton. We illustrate our approach by generating text sequences from text corpora with a maximum order guarantee, effectively controlling plagiarism. 17:30 Fast Multi-Instance Multi-Label LearningSPEAKER: unknownABSTRACT. In multi-instance multi-label learning (MIML), one object is represented by multiple instances and simultaneously associated with multiple labels. Existing MIML approaches have been found useful in many applications; however, most of them can only handle moderate-sized data. To efficiently handle large data sets, we propose the MIMLfast approach, which first constructs a low-dimensional subspace shared by all labels, and then trains label specific linear models to optimize approximated ranking loss via stochastic gradient descent. Although the MIML problem is complicated, MIMLfast is able to achieve excellent performance by exploiting label relations with shared space and discovering sub-concepts for complicated labels. Experiments show that the performance of MIMLfast is highly competitive to state-of-the-art techniques, whereas its time cost is much less; particularly, on a data set with 30K bags and 270K instances, where none of existing approaches can return results in 24 hours, MIMLfast takes only 12 minutes. Moreover, our approach is able to identify the most representative instance for each label, and thus providing a chance to understand the relation between input patterns and output semantics. 18:00 Modeling and Mining Spatiotemporal Patterns of Infection Risk from Heterogeneous Data for Active Surveillance PlanningSPEAKER: unknownABSTRACT. Active surveillance is a desirable way to prevent the spread of infectious diseases in that it aims to timely discover individual incidences through an active searching for patients. However, in practice active surveillance is difficult to implement especially when monitoring space is large but the available resources are limited. Therefore, it is extremely important for public health authorities to know how to distribute their very sparse resources to high-priority regions so as to maximize the outcomes of active surveillance. In this paper, we raise the problem of active surveillance planning and provide an effective method to address it via modeling and mining spatiotemporal patterns of infection risk from heterogeneous data sources. Taking malaria as an example, we perform an empirical study on real-world data to validate our method and provide our new findings. 18:30 Scalable Sparse Covariance Estimation via Self-ConcordanceSPEAKER: unknownABSTRACT. We consider the class of convex minimization problems, composed of a self-concordant function, such as the logdet metric, a convex data fidelity term h() and, a regularizing -- possibly non-smooth -- function g(), accompanied with an easily computable proximity operator. These type of problems have recently attracted a great deal of interest, mainly due to their omnipresence in top-notch applications. Under this locally Lipschitz continuous gradient setting, we analyze the convergence behavior of proximal Newton schemes with the added twist of a probable presence of inexact evaluations; a scenario that has not been considered yet, to the best of our knowledge. By using standard convex tools combined with self-concordance machinery, we provide a concise convergence theory with attractive convergence rate guarantees, and enhance state-of-the-art optimization schemes to accommodate such developments. Experimental results on sparse covariance estimation show the merits of our algorithm, both in terms of recovery efficiency and complexity, rendering the proposed framework a suitable choice for such problems. 19:00 Spatio-Temporal Consistency as a Means to Identify Unlabeled Objects in a Continuous Data FieldSPEAKER: unknownABSTRACT. Mesoscale ocean eddies are a critical component of the Earth System as they dominate the ocean's kinetic energy and impact the global distribution of oceanic heat, salinity, momentum, and nutrients. Thus, accurately representing these dynamic features is critical for our planet's sustainability. The majority of methods that identify eddies from satellite observations analyze the data in a frame-by-frame basis despite the fact that eddies are dynamic objects that propagate across space and time. We introduce the notion of spatio-temporal consistency to identify eddies in a continuous spatio-temporal field, to simultaneously ensure that the features detected are both spatially consistent and temporally persistent. Our spatio-temporal consistency approach allows us to remove most of the expert criteria used in traditional methods and enables us to better render eddy dynamics by identifying smaller and longer lived eddies than existing methods. 19:30 Active Learning for Crowdsourcing Using Knowledge TransferSPEAKER: unknownABSTRACT. This paper studies the problems associated with active learning, in which multiple users with varying levels of expertise are available for labeling data. Annotations collected from different users may be noisy and unreliable, and the quality of labeled data needs to be maintained for data mining tasks. Previous solutions have included estimating individual user reliability based on existing knowledge in each task, but for this to be effective each task requires a large quantity of labeled data to provide accurate estimates. In practice, annotation budgets for a given target task are limited, so each example can be presented to only a few users, each of whom can only label a few examples. To overcome data scarcity we propose a new probabilistic model that transfers knowledge from abundant unlabeled data in auxiliary domains to help estimate labelers' expertise. Based on this model we present a novel active learning algorithm that: a) simultaneously selects the most informative example and b) queries its label from the labeler with the best expertise. Experiments on both text and image datasets demonstrate that our proposed method outperforms other state-of-the-art active learning methods. 20:00 Sequential Click Prediction for Sponsored Search with Recurrent Neural NetworksSPEAKER: unknownABSTRACT. Click prediction is one of the fundamental problems in sponsored search. Most of existing studies took advantage of machine learning approaches to predict ad click for each event of ad view independently. While these studies aimed at providing a stationary interpretation on ad clicks, they were lack of the capability to understand user clicks in a dynamic way. As observed in real-world sponsored search system, user's behavior on the ad yield high dependency on how the user previously behaved along the time, especially in terms of the queries user submitted, click / non-click on ads, dwell time on landing page, etc. Inspired by these observations, we introduce a novel framework based on Recurrent Neural Networks (RNN) to model user's sequential behaviors into the click prediction process. Compared to traditional methods, this framework aims at effective click prediction by leveraging not only user's stationary historical behaviors but the rich information and patterns implied by user's dynamic sequential behaviors. Large scale evaluations on the click-through logs from a commercial search engine demonstrate that our approach can significantly improve the click prediction accuracy, compared to other time-independent approaches. 20:30 Learning Parametric Models for Social Infectivity in Multi-dimensional Hawkes ProcessesSPEAKER: unknownABSTRACT. Efficient and effective learning of social infectivity is a critical challenge in modeling diffusion phenomenons in social networks and other applications. Existing methods require substantial amount of event cascades to guarantee the learning accuracy, while only time-invariant infectivity is considered. Our paper overcomes those two drawbacks by constructing a more compact model and parameterizing the infectivity using time-varying features, thus dramatically reduces the data requirement, and enable the learning of time-varying infectivity which also takes into account the underlying network topology. We replace the pairwise infectivity in the multidimensional Hawkes processes with linear combinations of those time-varying features, and optimize the associated coefficients with lasso regularization on coefficients. To efficiently solve the resulting optimization problem, we employ the technique of alternating direction method of multipliers, and under that framework update each coefficient independently, by optimizing a surrogate function which upper-bounds the original objective function. On both synthetic and real world data, the proposed method performs better than alternatives in terms of both recovering the hidden diffusion network and predicting the occurrence time of social events. 21:00 A Convex Formulation for Semi-supervised Multi-Label Feature SelectionSPEAKER: unknownABSTRACT. Explosive growth of multimedia data has brought challenge of how to efficiently browse, retrieve and organize these data. Under this circumstance, different approaches have been proposed to facilitate multimedia analysis. Several semi-supervised feature selection algorithms have been proposed to exploit both labeled and unlabeled data. However, they are implemented based on graphs, such that they cannot handle large-scale datasets. How to conduct semi-supervised feature selection on large-scale datasets has become a challenging research problem. Moreover, existing multi-label feature selection algorithms rely on eigen-decomposition with heavy computational burden, which further prevent current feature selection algorithms from being applied for big data. In this paper, we propose a novel semi-supervised multi-label feature selection for large-scale multimedia analysis. We evaluate performance of the proposed algorithm over five benchmark datasets and compare the results with state-of-the-art supervised and semi-supervised feature selection algorithms as well as baseline using all features. The experimental results demonstrate that our proposed algorithm consistently achieve superiors performances. 21:30 TacTex’13: A Champion Adaptive Power Trading AgentSPEAKER: unknownABSTRACT. Sustainable energy systems of the future will no longer be able to rely on the current paradigm that energy supply follows demand. Many of the renewable energy resources do not necessarily produce the energy when it is needed, and therefore there is a need for new market structures that motivate sustainable behaviors by participants. The Power Trading Agent Competition ($\powertac$) is a new annual competition that focuses on the design and operation of future retail power markets, specifically in smart grid environments with renewable energy production, smart metering, and autonomous agents acting on behalf of customers and retailers. It uses a rich, open-source simulation platform that is based on real-world data and state-of-the-art customer models. Its purpose is to help researchers understand the dynamics of customer and retailer decision-making, as well as the robustness of proposed market designs. This paper introduces OurAgent'13, the champion agent from the inaugural competition in 2013. OurAgent is an adaptive agent that learns and reacts to the environment in which it operates, by heavily relying on reinforcement-learning and prediction methods. This paper describes the constituent components of our agent and examines the success of the complete agent through analysis of competition results and subsequent controlled experiments.
14:00-15:30 Session 30B: IAAI-14: Interfaces and Patterns
Chair: Reid Smith
Location: Room 304A/B
 14:00 Emerging: A Smart Range Helping Cognitively-Impaired Persons CookingSPEAKER: Bruno BouchardABSTRACT. People suffering from a loss of autonomy caused by a cognitive deficit generally have to perform important daily tasks (such as cooking) using devices and appliances designed for healthy people, which do not take into consideration their cognitive impairment. Using these devices is risky and may lead to a tragedy (e.g. fire). A potential solution to this issue is to provide automated systems, which perform tasks on behalf of the patient. However, clinical studies have shown that encouraging users to maintain their autonomy greatly help to preserve health, dignity, and motivation. Therefore, we present in this paper a new smart range prototype allowing monitoring and guiding a cognitively-impaired user in the activity of preparing a meal. This new original prototype is capable of giving adapted prompting to the user in the completion of several recipes by exploiting load cells, heat sensors and electromagnetic contacts embedded in the range. We currently own a provisional patent on this new invention, and we completed a first experimental phase. 14:20 Emerging: A Speech-Driven Second Screen Application for TV Program DiscoverySPEAKER: Peter Z. YehABSTRACT. In this paper, we present a speech-driven second screen application for TV program discovery. We give an overview of the application and its architecture. We also present a user study along with a failure analysis. The results from the study are encouraging, and demonstrate our application's effectiveness in the target domain. We conclude with a discussion of follow-on efforts to further enhance our application. 14:40 Emerging: Clustering Species Accumulation Curves to Identify Skill Levels of Citizen Scientists Participating in the eBird ProjectSPEAKER: Jun YuABSTRACT. Although citizen science projects such as eBird can compile large volumes of data over broad spatial and temporal extents, the quality of this data is a concern due to differences in the skills of volunteers at identifying bird species. Species accumulation curves, which plot the number of unique species observed over time, are an effective way to quantify the skill level of an eBird participant. Intuitively, more skilled observers can identify a greater number of species per unit time than inexperienced birders, resulting in a steeper curve. We propose a mixture model for clustering species accumulation curves. These clusters enable the identification of distinct skill levels of eBird participants, which can then be used to build more accurate species distribution models and to develop automated data quality filters. 15:00 Emerging: Pattern Discovery in Protein Networks Reveals High-Confidence Predictions of Novel InteractionsSPEAKER: Hazem Radwan AhmedABSTRACT. Pattern discovery in protein interaction networks can reveal crucial biological knowledge on the inner workings of cellular machinery. Although far from complete, extracting meaningful patterns from proteomic networks is a non-trivial task due to their size-complexity. This paper proposes a computational framework to efficiently discover topologically-similar patterns from large proteomic networks using Particle Swarm Optimization (PSO). PSO is a robust and low-cost optimization technique that demonstrated to work effectively on the complex, mostly sparse proteomic networks. The resulting topologically-similar patterns of close proximity are utilized to systematically predict new high-confidence protein-protein interactions (PPIs). The proposed PSO-based PPI prediction method (3PI) managed to predict high-confidence PPIs, validated by more than one computational/experimental source, through a proposed PPI knowledge transfer process between topologically-similar interaction patterns of close proximity. In three case studies, over 50% of the predicted interactions for EFGR, ERBB2, ERBB3, GRB2 and UBC are overlapped with publically available interaction databases, ~80% of the predictions are found among the Top 1% results of another PPI prediction method and their genes are significantly co-expressed across different tissues. Moreover, the only single prediction example that did not overlap with any of our validation sources was recently experimentally supported by two PubMed publications.
14:45-15:30 Session 31A: AAAI-14: Machine Learning Applications
Location: Room 301A
 14:45 Accurate Integration of Aerosol Predictions by Smoothing on a ManifoldSPEAKER: unknownABSTRACT. Accurately measuring the aerosol optical depth (AOD) is essential for our understanding of the climate. Currently, AOD can be measured by (i) satellite instruments, which operate on a global scale but have limited accuracies; and (ii) ground-based instruments, which are more accurate but not widely available. Recent approaches focus on integrating measurements from these two sources to complement each other. In this paper, we further improve the prediction accuracy by using the observation that the AOD varies slowly in the spatial domain. Using a probabilistic approach, we impose this smoothness constraint by a Gaussian random field on the Earth's surface, which can be considered as a two-dimensional manifold. The proposed integration approach is computationally simple, and experimental results on both synthetic and real-world data sets show that it significantly outperforms the state-of-the-art. 15:00 Robust Distance Metric Learning in the Presence of Label NoiseSPEAKER: unknownABSTRACT. Many distance learning algorithms have been developed in recent years. However, few of them consider the problem when the class labels of training data are noisy, and this may lead to serious performance deterioration. In this paper, we present a robust distance learning method in the presence of label noise, by extending a previous non-parametric discriminative distance learning algorithm, i.e., Neighbourhood Components Analysis (NCA). Particularly, we model the conditional probability of each point’s true label over all the possible classe labels, and use it for a more robust estimation of intra-class scatter matrix. The model is then optimized within the EM framework. In addition, considering that the model tends to be complex under the situation of label noise, we propose to regularize its objective function to prevent overﬁtting. Our experiments on several UCI datasets and a real dataset with unknown noise patterns show that the proposed RNCA is more tolerant to class label noise compared to the original NCA method. 15:15 Predicting Postoperative Atrial Fibrillation from Independent ECG ComponentsSPEAKER: unknownABSTRACT. Postoperative atrial fibrillation (PAF) occurs in 10\% to 65\% of the patients undergoing cardiac surgery. It is associated with increased postoperative mortality and morbidity, and also results in longer and more expensive hospital stays. Accurately stratifying patients for PAF allows for the selective use of prophylactic therapies (e.g., amiodarone) to reduce this burden. Our proposed work addresses this need through the development of novel electrocardiographic (ECG) markers that can be easily deployed in a clinical setting to identify patients at risk of PAF. Specifically, we explore a novel eigen-decomposition approach that first partitions ECG signals into atrial and ventricular components by exploiting knowledge of the underlying cardiac cycle. We then quantify cardiac instability manifesting as probabilistic variations in atrial ECG morphology to assess the risk of PAF. When evaluated on a cohort of 385 patients undergoing cardiac surgery, our proposed approach based on an analysis of decoupled ECG components demonstrated substantial promise in identifying patients at risk of PAF and improved clinical models (both in terms of discrimination and reclassification) relative to the use of existing clinical metrics.
14:45-15:30 Session 31B: AAAI-14: Heuristic Search and Optimization / RL
Location: Room 301B
 14:45 Generalizing Policy Advice with Gaussian Process Bandits for Dynamic Skill ImprovementSPEAKER: unknownABSTRACT. We present a ping-pong-playing robot that learns to improve its swings with human advice. Our method learns a reward function over the joint space of task and policy parameters. This allows the robot to explore policy space more intelligently by leveraging active learning techniques to explore the reward surface in a way that trades off exploration vs. exploitation to maximize the total cumulative reward over time. Multimodal stochastic polices can also easily be learned with this approach when the reward function is multimodal in the policy parameters. We extend the recently-developed Gaussian Process Bandit Optimization framework to include advice from human domain experts. 15:00 Robust Bayesian Inverse Reinforcement Learning with Sparse Behavior NoiseSPEAKER: unknownABSTRACT. Inverse reinforcement learning (IRL) aims to recover the reward function underlying a Markov Decision Process from behaviors of experts in support of decision-making. Most recent work on IRL assumes the same level of trustworthiness of all expert behaviors, and frames IRL as a process of seeking reward function that makes those behaviors appear (near)-optimal. However, it is common in reality that noisy expert behaviors disobeying the optimal policy exist, which may degrade the IRL performance significantly. To address this issue, in this paper, we develop a robust IRL framework that can accurately estimate the reward function in the presence of behavior noise. In particular, we focus on a special type of behavior noise referred to as sparse noise due to its wide popularity in real-world behavior data. To model such noise, we introduce a novel latent variable characterizing the reliability of each expert action and use Laplace distribution as its prior. We then device an EM algorithm with a novel variational inference procedure in the E-step, which can automatically identify and remove behavior noise in reward learning. Experiments on both synthetic data and real vehicle routing data with noticeable behavior noise show significant improvement of our method over previous approaches in learning accuracy, and also demonstrate its power in de-noising behavior data. 15:15 Multiagent Metareasoning Through Organizational DesignSPEAKER: unknownABSTRACT. We formulate an approach to multiagent metareasoning that uses organizational design to focus each agent's reasoning on the aspects of their respective local problems to which they can make the most worthwhile contributions to joint behavior. By employing the decentralized Markov decision process framework, we characterize an organizational design problem that explicitly considers the quantitative impact that a design has on both the quality of the agents' behaviors and their reasoning costs. We describe an automated organizational design process that can approximately solve our organizational design problem via incremental search, and present techniques that efficiently estimate the incremental impact of a candidate organizational influence. Our empirical evaluation confirms that our process generates organizational designs that impart a desired metareasoning regime upon the agents.
14:45-15:30 Session 31C: AAAI-14: Novel Machine Learning Algorithms
Location: Room 302A
 14:45 Predicting the Hardness of Learning Bayesian NetworksSPEAKER: Brandon MaloneABSTRACT. There are various algorithms for finding a Bayesian network structure (BNS) that is optimal with respect to a given scoring function. No single algorithm dominates the others in speed, and given a problem instance, it is a priori unclear which algorithm will perform best and how fast it will solve the problem. Estimating the running times directly is extremely difficult as they are complicated functions of the instance. The main contribution of this paper is characterization of the empirical hardness of an instance for a given algorithm based on a novel collection of non-trivial, yet efficiently computable features. Our empirical results, based on the largest evaluation of state-of-the-art BNS learning algorithms to date, demonstrate that we can predict the runtimes to a reasonable degree of accuracy, and effectively select algorithms that perform well on a particular instance. Moreover, we also show how the results can be utilized in building a portfolio algorithm that combines several individual algorithms in an almost optimal manner. 15:00 Relational One-Class Classification: A Non-Parametric ApproachSPEAKER: unknownABSTRACT. One-class classification approaches have been proposed in literature to learn classifiers from examples of only one class. But these approaches are not directly applicable to relational domains due to their reliance on feature vectors or distance measure. We propose a non-parametric relational one-class classification approach based on first-order trees. We learn a tree-based distance measure that iteratively introduces new relational features to differentiate relational examples. We update the distance measure so as to maximize the one-class classification performance of our model. We also relate our model definition to existing work on combination functions and density estimation. We also experimentally show that our approach can discover relevant features for this task and outperform three baseline approaches. 15:15 User Group Oriented Temporal Dynamics ExplorationSPEAKER: unknownABSTRACT. Dynamic online content becomes the zeitgeist. Vibrant user groups are essential participants and promoters behind it. Studying temporal dynamics is a valuable means to uncover group characters. Since different user groups usually tend to have highly diverse interest and show distinct behavior patterns, they will exhibit varying temporal dynamics. However, most current work only use global trends of temporal topics and fail to distinguish such fine-grained patterns across groups. In this paper, we propose GrosToT (Group Specific Topics-over-Time), a unified probabilistic model which infers latent user groups and temporal topics. It can model group-specific temporal topic variation from social media stream. By leveraging the comprehensive group-specific temporal patterns, GrosToT significantly outperforms state-of-the-art dynamics modeling methods. Our proposed approach shows advantage not only in temporal dynamics modeling but also group content exploration. Based on GrosToT, we uncover the interplay between group interest and temporal dynamics. Specifically, we find that group's attention to their medium-interested topics are event-driven, showing rich bursts; while its engagement in group's dominating topics are interest-driven, remaining stable over time. The dynamics over different groups vary and reflect the groups' intention.
14:45-15:30 Session 31D: AAAI-14: Multiagent Systems
Location: Room 302B
14:45-15:30 Session 31E: AAAI-14: AI and the Web / Vision
Location: Room 303A
 14:45 Cross-View Feature Learning for Scalable Social Image AnalysisSPEAKER: unknownABSTRACT. Nowadays images on social networking websites (e.g., Flickr) are mostly accompanied with user-contributed tags, which help cast a new light on the conventional content-based image analysis tasks such as image classification and retrieval. In order to establish a scalable social image analysis system, two issues need to be considered: 1) Supervised learning is a futile task in modeling the enormous number of concepts in the world, whereas unsupervised approaches overcome this hurdle; 2) Algorithms are required to be both spatially and temporally efficient to handle large-scale datasets. In this paper, we propose a cross-view feature learning (CVFL) framework to handle the problem of social image analysis effectively and efficiently. Through explicitly modeling the relevance between image content and tags (which is empirically shown to be visually and semantically meaningful), CVFL yields more promising results than existing methods in the experiments. More importantly, being general and descriptive, CVFL and its variants can be readily applied to other large-scale multi-view tasks in unsupervised setting. 15:00 Unsupervised Alignment of Natural Language Instructions with Video SegmentsSPEAKER: unknownABSTRACT. We propose an unsupervised learning algorithm for automatically inferring the mappings between English nouns and corresponding video objects. Given a sequence of natural language instructions and an unaligned video recording, we simultaneously align each instruction to its corresponding video segment, and also align nouns in each instruction to their corresponding objects in video. While existing grounded language acquisition algorithms rely on pre-aligned supervised data (each sentence paired with corresponding image frame or video segment), our algorithm aims to automatically infer the alignment from the temporal structure of the video and parallel text instructions. We propose two generative models that are closely related to the HMM and IBM 1 word alignment models used in statistical machine translation. We evaluate our algorithm on videos of biological experiments performed in wetlabs, and demonstrate its capability of aligning video segments to text instructions and matching video objects to nouns in the absence of any direct supervision. 15:15 Experiments on Visual Information Extraction with the Faces of WikipediaSPEAKER: unknownABSTRACT. We present a series of visual information extraction experiments using the Faces ofWikipedia database - a new resource that we release into the public domain for both recognition and extraction research containing over 50,000 identities and 60,000 disambiguated images of faces. We compare different techniques for automatically extracting the faces corresponding to the subject of a Wikipedia biography within the images appearing on the page. Our top performing approach is based on probabilistic graphical models and uses the text of Wikipedia pages, similarities of faces as well as various other features of the document, meta-data and image files. Our method resolves the problem jointly for all detected faces on a page. While our experiments focus on extracting faces from Wikipedia biographies, our approach is easily adapted to other types of documents and multiple documents. We focus onWikipedia because the content is a Creative Commons resource and we provide our database to the community including registered faces, hand labeled and automated disambiguations, processed captions, meta data and evaluation protocols. Our best probabilistic extraction pipeline yields an expected average accuracy of 77% compared to image only and text only baselines which yield 66% and 63% respectively.
14:45-15:30 Session 31F: AAAI-14: Game Theory and Economic Paradigms
Location: Room 303B
15:30-16:00Coffee Break
15:30-16:15 Session 32: AAAI-14 Poster Session II:C (Also includes Plenary Session III Papers)
Location: Hall 200C
 15:30 Learning Concept Embeddings for Query Expansion by Quantum Entropy MinimizationSPEAKER: unknownABSTRACT. In web search, users queries are formulated using only few terms and term-matching retrieval functions could fail at retrieving relevant documents. Given a user query, the technique of query expansion (QE) consists in selecting related terms that could enhance the likelihood of retrieving relevant documents. Selecting such expansion terms is challenging and requires a computational framework capable of encoding complex semantic relationships. In this paper, we propose a novel method for learning, in a supervised way, semantic representations for words and phrases. By embedding queries and documents in special matrices, our model disposes of an increased representational power with respect to existing approaches adopting a vector representation. We show that our model produces high-quality query expansion terms. Our expansion increase IR mesures beyond expansion from current word-embeddings models and well-established traditional QE methods. 15:30 Learning Unknown Event ModelsSPEAKER: unknownABSTRACT. Agents with incomplete models of their environment are likely to be surprised. For agents in immense environments that defy complete modeling, this represents an opportunity to learn. We investigate approaches for situated agents to detect surprises, discriminate among different forms of surprise, and hypothesize new models for the unknown events that surprised them. We instantiate these approaches in a new goal reasoning agent (named FOOLMETWICE), investigate its performance in simulation studies, and show that it produces plans with significantly reduced execution cost when compared to not learning models for surprising events. 15:30 An Adversarial Interpretation of Information-Theoretic Bounded RationalitySPEAKER: unknownABSTRACT. Recently, there has been a growing interest in modelling planning with information constraints. Accordingly, an agent maximizes a regularized expected utility known as the free energy, where the regularizer is given by the information divergence from a prior to a posterior policy. While this approach can be justified in various ways, most importantly from statistical mechanics and information theory, it is still unclear how it relates to game theory. This connection has been suggested previously in work relating the free energy to risk-sensitive control and to extensive form games. In this work, we present an adversarial interpretation that is equivalent to the free energy optimization problem. The adversary can, by paying an exponential penalty, generate costs that diminish the decision maker's payoffs. It turns out that the optimal strategy of the adversary consists in choosing costs so as to render the decision maker indifferent among its choices, which is a definining property of a Nash equilibrium, thus tightening the connection between free energy optimization and game theory. 15:30 Flexible and Scalable Partially Observable Planning with Linear TranslationsSPEAKER: unknownABSTRACT. The problem of on-line planning in partially observable settings involves two problems: keeping track of beliefs about the environment, and selecting actions for achieving goals. While the two problems are computationally intractable in the worst case, significant progress has been achieved in recent years through the use of suitable reductions. In particular, the state-of-the-art CLG planner is based on a translation that maps deterministic partially observable problems into fully observable nondeterministic ones. The translation, which is quadratic in the number of problem fluents and gets rid of the belief tracking problem, is adequate for most benchmarks; it is in fact complete for problems that have width 1. The more recent K-replanner uses two translations that are linear, one for keeping track of beliefs and the other for selecting actions using off-the-shelf classical planners. As a result, the K-replanner scales up better but is not as general as CLG. In this work, we combine the benefits of these two approaches, the scope of the CLG planner and the efficiency of the K-replanner, by introducing a new planner, called LW1, that is based on a translation that is linear but which is complete for width-1 problems. The scope and performance of the new planner is evaluated by considering the existing benchmarks and new problems. 15:30 A Generalized Genetic Algorithm-Based Solver for Very Large Jigsaw Puzzles of Complex TypesSPEAKER: unknownABSTRACT. In this paper we introduce new types of square-piece jigsaw puzzles, where in addition to the unknown location and orientation of each piece, a piece might also need to be flipped. These puzzles, which are associated with a number of real world problems, are considerably harder, from a computational standpoint. Specifically, we present a novel generalized genetic algorithm (GA)-based solver that can handle puzzle pieces of unknown location and orientation (Type 2 puzzles), and puzzle pieces of unknown location, orientation and attitude (Type 4 puzzles). To the best of our knowledge, our solver provides a new state-of-the-art, solving previously attempted puzzles faster and far more accurately, solving puzzle sizes that have never been attempted before, and solving the newly introduced double-sided puzzle types automatically and effectively. This paper also presents, among other results, the most extensive set of experimental results compiled as of yet, on Type 2 puzzles. 15:30 The Fisher Market Game: Equilibrium and WelfareSPEAKER: unknownABSTRACT. The Fisher market model is one of the most fundamental resource allocation models in economics. In a Fisher market, the prices and allocations of goods are determined according to the preferences and budgets of buyers to clear the market. In a Fisher market game, however, buyers are strategic and report their preferences over goods; the market-clearing prices and allocations are then determined based on their reported preferences rather than their real preferences. We show that the Fisher market game always has a pure Nash equilibrium, for buyers with linear, Leontief, and Cobb-Douglas utility functions, which are three representative classes of utility functions in the important Constant Elasticity of Substitution (CES) family. Furthermore, to quantify the social efficiency, we prove Price of Anarchy bounds for the game when the utility functions of buyers fall into these three classes respectively. 15:30 Bounding the Support Size in Extensive Form Games with Imperfect InformationSPEAKER: unknownABSTRACT. It's a well known fact that in extensive form games with perfect information, there is a Nash equilibrium with support of size one. This doesn't hold for games with imperfect information, where the size of minimal support can be larger. We present a dependency between the level of uncertainty and the minimum support size. For many games, there is a big disproportion between the game uncertainty and the number of actions available. In Bayesian extensive games with perfect information, the only uncertainty is about the type of players. In card games, the uncertainty comes from dealing the deck. In these games, we can significantly reduce the support size. Our result applies to general-sum extensive form games with any finite number of players. 15:30 Point-Based POMDP Solving with Factored Value Function ApproximationSPEAKER: unknownABSTRACT. Partially observable Markov decision processes (POMDPs) provide a principled mathematical framework for modeling autonomous decision-making problems. POMDP solutions are often represented by a value function comprised of a set of vectors. In the case of factored models, the size of these vectors grows exponentially with the number of state factors, leading to scalability issues. We consider an approximate value function representation based on a linear combination of basis functions. In particular, we present a backup operator that can be used in any point-based POMDP solvers. Furthermore, we show how independence between observation factors can be exploited for large computational gains. We experimentally verify our contributions and show that they can improve point-based methods in policy quality and solution size. 15:30 A Propagator Design Framework for Constraints over SequencesSPEAKER: unknownABSTRACT. Constraints over variable sequences are ubiquitous and many of their propagators have been inspired by dynamic programming (DP). We propose a conceptual framework for designing such propagators: pruning rules are refined upon the application of transformation operators to a DP-style formulation of a constraint; a representation of the variable domains is picked; and a coordination of the pruning rules is picked. 15:30 Pathway Specification and Comparative Queries: A High Level Language with Petri Net SemanticsSPEAKER: unknownABSTRACT. Understanding biological pathways is an important activity in the biological domain for drug development. Due to the parallelism and complexity inherent in pathways, computer models that can answer queries about pathways are needed. A researcher may ask `what-if'' questions comparing alternate scenarios, that require deeper understanding of the underlying model. In this paper, we present overview of such a system we developed and an English-like high level language to expressed pathways and queries. Our language is inspired by high level action and query languages and it uses Petri Net execution semantics. 15:30 A Relevance-Based Compilation Method for Conformant Probabilistic PlanningSPEAKER: unknownABSTRACT. Conformant probabilistic planning (CPP) differs from conformant planning (CP) by two key elements: the initial belief state is probabilistic, and the conformant plan must achieve the goal with probability $\geq\theta$, for some $0<\theta\leq 1$. Taig and Brafman observed that one can reduce CPP to CP by finding a set of initial states whose probability $\geq\theta$, for which a conformant plan exists. Previous solvers based on this idea used the underlying planner to select this set of states and to plan for them simultaneously. Here, we suggest an alternative approach: Our planner starts with a separate preprocessing relevance analysis phase that determines a promising set of initial states on which to focus, and then calls an off-the-shelf conformant planner to solve the resulting problem. This approach has three major advantages. First, we can introduce specific and efficient relevance reasoning techniques for introducing the set of initial states, rather than depend on the heuristic function used by the planner. Second, we can benefit from various optimizations used by existing conformant planners which are unsound if applied to the original CPP. Finally, we have the freedom to select among different existing CP solvers. Consequently, the new planner dominates previous solvers on almost all domains and scales to instances that were not solved before. 15:30 Simpler Bounded Suboptimal SearchSPEAKER: unknownABSTRACT. It is commonly appreciated that solving search problems optimally can take too long. Bounded suboptimal search algorithms trade increased solution cost for reduced solving time. Explicit Estimation Search (EES) is a recent state-of-the-art algorithm specifically designed for bounded suboptimal search. Although it tends to expand fewer nodes than alternative algorithms, such as weighted A* (WA*), its per-node expansion overhead is much higher, causing it to sometimes take longer. In this paper, we present simplified variants of EES (SEES) and an earlier algorithm, A*epsilon (SA*epsilon), that use different implementations of the same motivating ideas to significantly reduce search overhead and implementation complexity. In an empirical evaluation, we find that SEES, like EES, outperforms classic bounded suboptimal search algorithms, such as WA*, on domains tested where distance-to-go estimates enable better search guidance. We also confirm that, while SEES and SA*epsilon expand roughly the same number of nodes as their progenitors, they solve problems significantly faster and are much easier to implement. This work widens the applicability of state-of the-art bounded suboptimal search by making it easier to deploy. 15:30 Improving Context and Category Matching for Entity SearchSPEAKER: unknownABSTRACT. Entity search is to retrieve a ranked list of named entities of target types to a given query. In this paper, we propose an approach of entity search by formalizing both context matching and category matching. In addition, we propose a result re-ranking strategy that can be easily adapted to achieve a hybrid of two context matching strategies. Experiments on the INEX 2009 entity ranking task show that the proposed approach achieves a significant improvement of the entity search performance (xinfAP from 0.27 to 0.39) over the existing solutions. 15:30 Data Quality in Ontology-based Data Access: The Case of ConsistencySPEAKER: unknownABSTRACT. Ontology-based data access (OBDA) is a new paradigm aiming at accessing and managing data by means of an ontology, i.e., a conceptual representation of the domain of interest in the underlying information system. In the last years, this new paradigm has been used for providing users with abstract (independent from technological and system-oriented aspects), effective, and reasoning-intensive mechanisms for querying the data residing at the information system sources. In this paper we argue that OBDA, besides querying data, provides the right principles for devising a formal approach to data quality. In particular, we concentrate on one of the most important dimensions considered both in the literature and in the practice of data quality, namely consistency. We define a general framework for data consistency in OBDA, and present algorithms and complexity analysis for several relevant tasks related to the problem of checking data quality under this dimension, both at the extensional level (content of the data sources), and at the intensional level (schema of the data sources). 15:30 Modal Ranking: A Uniquely Robust Voting RuleSPEAKER: unknownABSTRACT. Motivated by applications to crowdsourcing, we study voting rules that output a correct ranking of alternatives by quality from a large collection of noisy input rankings. We seek voting rules that are supremely robust to noise, in the sense of being correct in the face of any "reasonable" type of noise. We show that there is such a voting rule, which we call the modal ranking rule. Moreover, we establish that the modal ranking rule is the unique rule with the preceding robustness property within a large family of voting rules, which includes a slew of well-studied rules. 15:30 Recommendation by Mining Multiple User Behaviors with Group SparsitySPEAKER: unknownABSTRACT. Recently, some recommendation methods try to improve the prediction results by integrating information from user’s multiple types of behaviors. How to model the dependence and independence between different behaviors is critical for them. In this paper, we propose a novel recommendation model, the Group-Sparse Matrix Factorization (GSMF), which factorizes the rating matrices for multiple behaviors into the user and item latent factor space with group sparsity regularization. It can (1) select out the different subsets of latent factors for different behaviors, addressing that users’ decisions on different behaviors are determined by different sets of factors; (2) model the dependence and independence between behaviors by learning the shared and private factors for multiple behaviors automatically; (3) allow the shared factors between different behaviors to be different, instead of all the behaviors sharing the same set of factors. Experiments on the real-world dataset demonstrate that our model can integrate users’multiple types of behaviors into recommendation better, compared with other state-of-the-arts.
16:00-17:00 Session 33: IAAI-14: Panel Discussion
Location: Room 304A/B
 16:00 Panel: From Research to Deployed Applications: Crossing the Valley of DeathSPEAKER: Craig Knoblock
16:15-17:00 Session 34: AAAI-14 Poster Session II:D
Location: Hall 200C
 16:15 Compact Aspect Embedding For Diversified Query ExpansionSPEAKER: unknownABSTRACT. Diversified query expansion (DQE) based approaches aim to select a set of expansion terms with less redundancy among them while covering as many query aspects as possible. Recently they have experimentally demonstrate their effectiveness for the task of search result diversification. One challenge faced by existing DQE approaches is how to ensure the aspect coverage. In this paper, we propose a novel method for DQE, called compact aspect embedding, which exploits trace norm regularization to learn a low rank vector space for the query, with each eigenvector of the learnt vector space representing an aspect, and the absolute value of its corresponding eigenvalue representing the association strength of that aspect to the query. Meanwhile, each expansion term is mapped into the vector space as well. Based on this novel representation of the query aspects and expansion terms, we design a greedy selection strategy to choose a set of expansion terms to explicitly cover all possible aspects of the query. We test our method on several TREC diversification data sets, and show our method significantly outperforms the state-of-the-art approaches. 16:15 Learning Scripts as Hidden Markov ModelsSPEAKER: unknownABSTRACT. Scripts have been proposed to model the stereotypical event sequences found in narratives. They can be applied to make a variety of inferences including filling gaps in the narratives and resolving ambiguous references. This paper proposes the first formal framework for scripts based on Hidden Markov Models (HMMs). Our framework supports robust inference and learning algorithms, which are lacking in previous clustering models. We develop an algorithm for structure and parameter learning based on Expectation Maximization and evaluate it on a number of natural and synthetic datasets. The results show that our algorithm is superior to several informed baselines for predicting future events given some past history. 16:15 Using Response Functions to Measure Strategy StrengthSPEAKER: unknownABSTRACT. Extensive-form games are a powerful tool for representing complex multi-agent interactions. Nash equilibrium strategies are commonly used as a solution concept for extensive-form games, but many games are too large for the computation of Nash equilibria to be tractable. In these large games, exploitability has traditionally been used to measure deviation from Nash equilibrium, and thus strategies are aimed to achieve minimal exploitability. However, while exploitability measures a strategy's worst-case performance, it fails to capture how likely that worst-case is to be observed in practice. In fact, empirical evidence has shown that a less exploitable strategy can perform worse than a more exploitable strategy in one-on-one play against a variety of opponents. In this work, we propose a class of response functions that can be used to measure the strength of a strategy. We prove that standard no-regret algorithms can be used to learn optimal strategies for a scenario where the opponent uses one of these response functions. We demonstrate the effectiveness of this technique in Leduc poker against opponents that use the UCT Monte Carlo tree search algorithm. 16:15 Relaxation Search: a Simple Way of Managing Optional ClausesSPEAKER: unknownABSTRACT. A number of problems involve managing a set of optional clauses. For example, the soft clauses in a MaxSat formula are optional---they can be falsified for a cost. Similarly when computing a Minimum Correction Set for an unsatisfiable formula all clauses are optional---some can be falsified in order to make the remaining satisfiable. In both of these cases the task is to find a subset of the optional clauses that achieves some optimization criteria and is satisfiable. Relaxation search is a simple method of using a standard SAT solver to solve this task. Relaxation search is very easy to implement, sometimes requiring only a simple modification of the variable selection heuristic in the SAT solver. Furthermore, considerable flexibility and control can be achieved over the order in which subsets of optional clauses examined. We demonstrate how relaxation search can be used to solve MaxSat and to compute Minimum Correction Sets. In both cases relaxation search is able to achieve state-of-the-art performance and solve some instances other solvers are not able to solve. 16:15 Towards Scalable Exploration of Diagnoses in an Ontology StreamSPEAKER: Freddy LecueABSTRACT. Diagnosis, or the process of identifying the nature and cause of an anomaly in an ontology, has been largely studied by the Semantic Web community. In the context of ontology stream, diagnosis results are not captured by a unique fixed ontology but numerous time-evolving ontologies. Thus any anomaly can be diagnosed by a large number of different explanations depending on the version and evolution of the ontology. We address the problems of identifying, representing, exploiting and exploring the evolution of diagnoses representations. Our approach consists in a graph-based representation, which aims at (i) efficiently organizing and linking time-evolving diagnoses and (ii) being used for scalable exploration. The experiments have shown scalable diagnoses exploration in the context of real and live data from Dublin City. 16:15 Calibration-Free BCI Based ControlSPEAKER: unknownABSTRACT. Recent work has shown that it is possible to extract feedback information from EEG measurements of brain activity, such as error potentials, and use it to solve sequential tasks. As most Brain-Computer Interfaces, a calibration phase is required to build a decoder that translates raw EEG signals to understandable feedback signals. This paper proposes a method to solve sequential tasks based on feedback extracted from the brain without any calibration. The proposed method uses optimal policies to hallucinate the meaning of the EEG signals and select the target with the lowest expected error. Also, we use the task and symbol uncertainty as an exploration bonus for an active strategy to speed up the learning. We report online experiments where four users directly controlled an agent on a 2D grid world to reach a target without any previous calibration process. 16:15 Managing Change in Graph-structured Data Using Description LogicsSPEAKER: unknownABSTRACT. In this paper we consider the setting of graph-structured data that evolves as a result of operations carried out by users or applications. We study different reasoning problems, which range from ensuring the satisfaction of a given set of integrity constraints after a given sequence of updates, to deciding the (non-)existence of a sequence of actions that would take the data to an (un)desirable state, starting either from a specific data instance or from an incomplete description of it. We consider a simple action language in which actions are finite sequences of insertions and deletions of nodes and labels, and use Description Logics for describing integrity constraints and (partial) states of the data. We then formalize the data management problems mentioned above as a static verification problem and several planning problems. We provide algorithms and tight complexity bounds for the formalized problems, both for an expressive DL and for a variant of DL-Lite. 16:15 Joule Counting Correction for Electric Vehicles using Artificial Neural NetsSPEAKER: Michael TaylorABSTRACT. Estimating the remaining energy in high-capacity electric vehicle batteries is essential to safe and efficient operation. Accurate estimation remains a major challenge, however, because battery state cannot be observed directly. In this paper, we demonstrate a method for estimating battery remaining energy using real data collected from the Charge Car electric vehicle. This new method relies on energy integration as an initial estimation step, which is then corrected using a neural net that learns how error accumulates from recent charge/discharge cycles. In this way, the algorithm is able to adapt to nonlinearities and variations that are difficult to model or characterize. On the collected dataset, this method is demonstrated to be accurate to within 2.5% to 5% of battery remaining energy, which equates to approximately 1 to 2 miles of residual range for the Charge Car given its 10kWh battery pack. 16:15 Solving Zero-Sum Security Games in Discretized Spatio-Temporal DomainsSPEAKER: unknownABSTRACT. Security games model the problem of allocating multiple resources to defend multiple targets. In such settings, the defender's action space is usually exponential in the input size. Therefore, known general-purpose linear or mixed integer program formulations scale up exponentially. One method that has been widely deployed to address this computational issue is to instead compute the marginal probabilities, of which there are only polynomially many. In this paper, we address a class of problems that cannot be handled by previous approaches based on marginal probabilities. We consider security games in discretized spatio-temporal domains, in which the schedule set is so large that even the marginal probability formulation has exponential size. We develop novel algorithms under an oracle-based algorithmic framework and show that this framework allows us to efficiently compute Stackelberg mixed strategy when the problem allows a polynomial-time oracle. For the cases in which efficient oracles are difficult to find, we propose new direct algorithms or prove hardness results. All our algorithms are examined in experiments with realistic and artificial data. 16:15 Robust Non-Negative Dictionary LearningSPEAKER: unknownABSTRACT. Dictionary learning plays an important role in machine learning, where data vectors are modeled as a sparse linear combinations of basis factors (i.e., dictionary). However, how to conduct dictionary learning in noisy environment has not been well studied. Moreover, in practice, the dictionary (i.e., the lower rank approximation of the data matrix) and the sparse representations are required to be nonnegative, such as applications for image annotation, document summarization, microarray analysis. In this paper, we propose a new formulation for non-negative dictionary learning in noisy environment, where structure sparsity is enforced on sparse representation. The proposed new formulation is also robust for data with noises and outliers, due to a robust loss function used. We derive an efficient multiplicative updating algorithm to solve the optimization problem, where dictionary and sparse representation are updated iteratively. We prove the convergence and correctness of proposed algorithm rigorously. We show the differences of dictionary at different level of sparsity constraint. The proposed algorithm can be adapted for data clustering and semi-supervised learning purpose. Promising results in extensive experiments validate the effectiveness of proposed approach. 16:15 Doubly Regularized Portfolio with Risk MinimizationSPEAKER: unknownABSTRACT. Due to recent empirical success, machine learning algorithms have drawn sufficient attention and are becoming important analysis tools in financial industry. In particular, as the core engine of many financial services such as private wealth and pension fund management, portfolio management calls for the application of those novel algorithms. Most of portfolio allocation strategies do not account for costs from market frictions such as transaction costs and capital gain taxes, as the complexity of sensible cost models often causes the induced problem intractable. In this paper, we propose a doubly regularized sparse and consistent portfolio that provides a modest but effective solution to the above difficulty. Specifically, as all kinds of trading costs primarily root in large transaction volumes, to reduce volumes we synergistically combine two penalty terms with classic risk minimization models to ensure: (1) only a small set of assets are selected to invest in each period; (2) portfolios in subsequent trading periods are similar. To assess the new portfolio, we apply standard evaluation criteria and conduct extensive experiments on well-known benchmarks and market datasets. Compared to various state-of-the-art portfolios, the proposed portfolio demonstrates a superior performance of having both higher risk-adjusted returns and dramatically decreased transaction volumes. 16:15 Mixing-time Regularized Policy GradientSPEAKER: unknownABSTRACT. Policy gradient reinforcement learning (PGRL) methods have received substantial attention as a mean for seeking stochastic policies that maximize a cumulative reward. However, PGRL methods can often take a huge number of learning steps before it finds a reasonable stochastic policy. This learning speed depends on the mixing time of the Markov chains that are given by the policies that PGRL explores. In this paper, we give a new PGRL approach that regularizes the rule of updating the policy with the hitting time that bounds the mixing time. In particular, hitting-time regressions based on temporal-difference learning are proposed. The proposed approach will keep the Markov chain compact and can improve the learning efficiency. Numerical experiments show the proposed method outperforms the conventional policy gradient methods. 16:15 Generalized Higher-Order Tensor Decomposition via Parallel ADMMSPEAKER: unknownABSTRACT. Higher-order tensors are becoming prevalent in many scientific areas such as computer vision, social network analysis, data mining and neuroscience. Traditional tensor decomposition approaches face three major challenges: model selecting, gross corruptions, and computational efficiency. To address these problems, we first propose a parallel trace norm regularized tensor decomposition method, and formulate it as a convex optimization problem. This method does not require the rank of each model to be specified beforehand, and can automatically determine the number of factors in each mode through our optimization scheme. By considering the low-rank structure of the observed tensor, we analyze the equivalent relationship of the trace norm between a low-rank tensor and its core tensor. Then, we cast a non-convex tensor decomposition model into a weighted combination of multiple much smaller-scale matrix trace norm minimization. Finally, we develop two parallel alternating direction methods of multipliers (ADMM) to solve the proposed problems. Experimental results verify that our regularized formulation is reasonable, and our methods are very robust to noise or outliers. 16:15 Towards Topological-transformation Robust Shape Comparison: A Sparse Representation Based Manifold Embedding ApproachSPEAKER: unknownABSTRACT. Non-rigid shape comparison based on manifold embedding using Generalized Multidimensional Scaling (GMDS) has attracted a lot of attention for its high accuracy. However, this method requires that shape surface is not elastic. In other words, it is sensitive to topological transformations such as stretching and compressing. To tackle this problem, we propose a new approach that constructs a high-dimensional space to embed the manifolds of shapes, which could completely withstand rigid transformations and considerably tolerate topological transformations. Experiments on TOSCA shapes validate the proposed approach. 16:15 Propagating Regular Counting ConstraintsSPEAKER: unknownABSTRACT. Constraints over finite sequences of variables are ubiquitous in sequencing and timetabling. This led to general modeling techniques and generic propagators, often based on deterministic finite automata (DFA) and their extensions. We consider counter-DFAs (cDFA), which provide concise models for regular counting constraints, that is constraints over the number of times a regular-language pattern occurs in a sequence. We show how to enforce domain consistency in polynomial time for atmost and atleast regular counting constraints based on the frequent case of a cDFA with only accepting states and a single counter that can be increased by transitions. We also show that the satisfaction of exact regular counting constraints is NP-hard and that an incomplete propagator for exact regular counting constraints is faster and provides more pruning than the existing propagator from (Beldiceanu, Carlsson, and Petit 2004). Finally, by avoiding the unrolling of the cDFA used by COSTREGULAR, the space complexity reduces from O(n|\Sigma||Q|) to O(n(|\Sigma|+|Q|)), where \Sigma is the alphabet and Q the state set of the cDFA. 16:15 Manifold Spanning GraphsSPEAKER: unknownABSTRACT. Graph construction is the essential first step for nearly all manifold learning algorithms. While many applications assume that a simple k-nearest or epsilon-close neighbors graph will accurately model the topology of the underlying manifold, these methods often require expert tuning and may not produce high quality graphs. In this paper, the hyperparameter sensitivity of existing graph construction methods is demonstrated. We then present a new algorithm for unsupervised graph construction, based on minimal assumptions about the input data and its manifold structure. Notably, this method requires no hyperparameter tuning. 16:15 Tightening Bounds for Bayesian Network Structure LearningSPEAKER: unknownABSTRACT. A recent breadth-first branch and bound algorithm (BFBnB) for learning Bayesian network structures (Malone et al. 2011) uses two bounds to prune the search space for better efficiency; one is a lower bound calculated from pattern database heuristics, and the other is an upper bound obtained by a hill climbing search. Whenever the lower bound of a search path exceeds its upper bound, the path is guaranteed to lead to suboptimal solutions and is discarded immediately. This paper introduces methods for tightening the bounds. The lower bound is tightened by using more informed variable groupings in creating the pattern databases, and the upper bound is tightened using an anytime learning algorithm. Empirical results show that these bounds improve the efficiency of Bayesian network learning by two to three orders of magnitude. 16:15 Incentivizing High-Quality Content from Heterogeneous Users: On the Existence of Nash EquilibriumSPEAKER: unknownABSTRACT. In this paper, we study the existence of pure Nash equilibrium (PNE) for the mechanisms used in Internet services (e.g., online reviews and question-answer websites) to incentivize users to generate high-quality content. Most existing work assumes that users are homogeneous and have the same ability. However, real-world users are heterogeneous and their abilities can be very different from each other due to their diverse background, culture, and profession. In this work, we consider heterogeneous users with the following framework: (1) the users are heterogeneous and each of them has a private type indicating the best quality of the content she can generate; (2) there is a fixed amount of reward to allocate to the participated users. Under this framework, we study the existence of pure Nash equilibrium of several mechanisms composed by different allocation rules, action spaces, and information settings. We prove the existence of PNE for some mechanisms and the non-existence of PNE for some mechanisms. We also discuss how to find a PNE for those mechanisms with PNE either through a constructive way or a searching algorithm. 16:15 Voting with Rank Dependent Scoring RulesSPEAKER: unknownABSTRACT. Positional scoring rules in voting compute the score of an alternative by summing the scores for the alternative induced by every vote. This summation principle ensures that all votes contribute equally to the score of an alternative. We relax this assumption and, instead, aggregate scores by taking into account the rank of a score in the ordered list of scores obtained from the votes. This defines a new family of voting rules, rank-dependent scoring rules (RDSRs), based on ordered weighted average (OWA) operators, which include scoring rules, plurality, k-approval, and Olympic averages. We study some properties of these rules, and show, empirically, that certain RDSRs are less manipulable than Borda voting, across a variety of statistical cultures. 16:15 Latent Low-Rank Transfer Subspace Learning for Missing Modality RecognitionSPEAKER: unknownABSTRACT. We consider an interesting problem in this paper that using transfer learning in two directions to compensate missing knowledge from the target domain. Transfer learning is usually exploited as a powerful tool to mitigate the discrepancy between different databases for knowledge transfer; it can also be used for knowledge transfer between different modalities within one database. However, in either case, transfer learning will fail if the target data are missing. To overcome this, we consider knowledge transfer between different databases and modalities simultaneously in a single framework, where missing target data from one database are recovered to facilitate recognition task. We call this framework Latent Low-rank Bi-Directional Transfer Subspace Learning method (L2BTSL). First, we propose to use low-rank constraint as well as dictionary learning in a learned subspace to guide the knowledge transfer between and within different databases. Second, a latent factor is introduced to uncover the underlying structure of the missing target data. Third, bi-directional transfer learning is proposed to integrate auxiliary database for transfer learning with missing target data. Experimental results of multi-modalities knowledge transfer with missing target data demonstrate that our method can successfully inherit knowledge from the auxiliary database to complete the target domain, and therefore enhance the performance when recognizing data from the modality without any training data. 16:15 Diagnosing Analogue Linear Systems Using Dynamic Topological ReconfigurationSPEAKER: unknownABSTRACT. Fault diagnosis of analogue linear systems is a challenging task and no fully automated solution exists. Two challenges in this diagnosis task are the size of the search space that much be explored and the possibility of simulation instabilities introduced by particular fault classes. We study a novel algorithm that addresses both problems. This algorithm dynamically modifies the simulation model during diagnosis by pruning parametrized components that cause discontinuity in the model. We provide a theoretical framework for predicting the speedups, which depends on the topology of the model. We empirically validate the theoretical predictions through extensive experimentation on a benchmark of circuits. 16:15 Sparse Compositional Metric LearningSPEAKER: unknownABSTRACT. We propose a new approach for metric learning by framing it as learning a sparse combination of locally discriminative metrics that are inexpensive to generate from the training data. This flexible framework allows us to naturally derive formulations for global, multi-task and local metric learning. These new algorithms have several advantages over existing methods in the literature: a much smaller number of parameters to be estimated and a principled way to generalize learned metrics to new testing data points. To analyze the approach theoretically, we derive a generalization bound that justifies the sparse combination. Empirically, we evaluate our algorithms on several datasets against state-of-the-art metric learning methods. The results are consistent with our theoretical findings and demonstrate the superiority of our approach in terms of classification performance and scalability. 16:15 Role-Aware Conformity Influence Modeling and Analysis in Social NetworksSPEAKER: unknownABSTRACT. Conformity is the inclination of a person to be influenced by others. In this paper, we study how the conformity tendency of a person changes with her {\em role}, as defined by her structural properties in a social network. We first formalize conformity using a utility function based on the conformity theory from social psychology, and validate the proposed utility function by proving the existence of Nash Equilibria when all users in a network behave according to it. We then extend and incorporate the utility function into a probabilistic topic model, called the Role-Conformity Model (RCM), for modeling user behaviors under the effect of conformity. We apply the proposed RCM to several academic research networks, and discover that people with higher degree and lower clustering coefficient are more likely to conform to others. We also evaluate RCM through the task of word usage prediction in academic publications, and show significant improvements over the performance of baselines. 16:15 Planning as Model Checking in Hybrid DomainsSPEAKER: unknownABSTRACT. Planning in hybrid domains is an important and challenging task, and various planning algorithms have been proposed in the last years. From an abstract point of view, hybrid planning domains are based on hybrid automata, which have been studied intensively in the model checking community. In particular, powerful model checking algorithms and tools have emerged for this formalism. However, despite the quest for more scalable planning approaches, model checking algorithms have not been applied to planning in hybrid domains so far. In this paper, we make a first step in bridging the gap between these two worlds. We provide a formal translation scheme from PDDL+ to the standard formalism of hybrid automata, as a solid basis for using hybrid system model-checking tools for dealing with hybrid planning domains. As a case study, we use the SpaceEx model checker, showing how we can address PDDL+ domains that are out of the scope of state-of-the-art planners. 16:15 Computing General First-order Parallel and Prioritized CircumscriptionSPEAKER: unknownABSTRACT. This paper focuses on computing general first-order parallel and prioritized circumscription with varied constants. We propose polynomial translations from general first-order circumscription to first-order stable model semantics over arbitrary structures, including $Tr_v$ for parallel circumscription and $Tr^s_v$ for several parallel circumscriptions (further for prioritized circumscription). To improve the efficiency, we give an optimization, called $\Gamma_{\exists}$, to reduce auxiliary predicates in number and logic programs in size when eliminating existential quantifiers during the translations. Based on these results, a general first-order circumscription solver, named cfo2lp, is developed by calling answer set programming solvers. Using circuit diagnosis problem and extended stable marriage problem as benchmarks, we compare cfo2lp with a propositional circumscription solver circ2dlp on efficiency. Experimental results demonstrate that for problems represented by general first-order circumscription naturally and intuitively, cfo2lp can compute all solutions over finite structures. We also apply our approach to description logics with circumscription and repairs in inconsistent databases, which can be handled effectively. 16:15 A Joint Optimization Model for Image Summarization Based on Image Content and TagsSPEAKER: unknownABSTRACT. As an effective technology for navigating a large number of images, image summarization is becoming a promising task with the rapid development of image sharing sites and social networks. Most existing summarization approaches use the visual-based features for image representation without considering tag information. In this paper, we propose a novel framework, named JOINT, which employs both image content and tag information to summarize images. Our model generates the summary images which can best reconstruct the original collection. Based on the assumption that an image with representative content should also have typical tags, we introduce a similarity-inducing regularizer to our model. Furthermore, we impose the lasso penalty on the objective function to yield a concise summary set. Extensive experiments demonstrate our model outperforms the state-of-the-art approaches. 16:15 Pre-trained Multi-view Word Embedding using Two-side Neural NetworkSPEAKER: unknownABSTRACT. Word embedding aims to learn a continuous representation for each word. It attracts increasing attention due to its effectiveness in various tasks such as named entity recognition and language modeling. Most existing word embedding results are generally trained on one individual data source such as news pages or Wikipedia articles. However, when we apply them to other tasks such as web search, the performance suffers. To obtain a robust word embedding for different applications, multiple data sources could be leveraged. In this paper, we proposed a two-side multimodal neural network to learn a robust word embedding from multiple data sources including free text, user search queries and search click-through data. This framework takes the word embeddings learned from different data sources as pre-train, and then uses a two-side neural network to unify these embeddings. The pre-trained embeddings are obtained by adapting the recently proposed CBOW algorithm. Since the proposed neural network does not need to re-train word embeddings for a new task, it is highly scalable in real world problem solving. Besides, the network allows weighting different sources differently when applied to different application tasks. Experiments on two real-world applications including web search ranking and word similarity measuring show that our neural network with multiple sources outperforms state-of-the-art word embedding algorithm with each individual source. It also outperforms other competitive baselines using multiple sources. 16:15 Regret-based Optimization and Preference Elicitation for Stackelberg Security Games with UncertaintySPEAKER: unknownABSTRACT. Stackelberg security games (SSGs) have been deployed a number of real-world security domains. One key challenge in these applications is the assessment of attacker payoffs which may not be perfectly known. Previous work has studied SSGs with uncertain payoffs modeled by interval uncertainty and maximin-based robust optimization. In contrast, this paper is the first to propose the use of the less conservative minimax regret as a decision criterion for payoff-uncertain SSGs and to present several algorithms for computing minimax regret for such games. This paper also for the first time addresses the challenge of preference elicitation in SSGs, providing novel regret-based solution strategies. Experimental results validate the runtime performance and solution quality of our approaches. 16:15 Globally and Locally Consistent Unsupervised ProjectionSPEAKER: unknownABSTRACT. In this paper, we propose an unsupervised projection method for feature extraction to preserve both global and local consistencies of the input data in the projected space. Traditional unsupervised feature extraction methods, such as principal component analysis and locality preserving projections, can only explore either the global or local geometric structures of the input data, but not the both. In our new method, we introduce a new measurement using the neighborhood data variances to assess the data locality, by which we propose to learn an optimal projection by rewarding both the global and local structures of the input data. Moreover, to improve the robustness of the proposed learning model against outlier data samples and outlier features, which is of particular importance in unsupervised learning, we propose a new objective that simultaneously minimizes and maximizes (minmax) the L1-norm distances instead of the traditional squared L2-norm distances. Solving the formulated optimization problem is very challenging, because it minimizes and maximizes a number of non-smooth L1- norm terms at the same time. In this paper, as an important theoretical contribution, we propose a simple yet effective optimization method to solve the L1-norm minmax problem, and theoretically prove its convergence and correctness. To the best of our knowledge, our paper makes the first attempt to solve the general L1-norm minmax problem with orthogonal constraints. Extensive experiments have been performed on six benchmark data sets, where the promising results validate the proposed method. 16:15 Low-Rank Tensor Completion with Spatio-Temporal ConsistencySPEAKER: unknownABSTRACT. Video completion is a computer vision technique to recover the missing values in video sequences by filling the unknown regions with the known information. In recent research, tensor completion, a generalization of matrix completion for higher order data, emerges as a new solution to estimate the missing information in video with the assumption that the video frames are homogenous and correlated. However, each video clip often stores the heterogeneous episodes and the correlations among all video frames are not high. Thus, the regular tenor completion methods are not suitable to recover the video missing values in practical applications. To solve this problem, we propose a novel spatially-temporally consistent tensor completion method for recovering the video missing data. Instead of minimizing the average of the trace norms of all matrices unfolded along each mode in a tensor data, we introduce a new smoothness regularization along video time direction to utilize the temporal information between consecutive video frames. Meanwhile, we also minimize the trace norm of each individual video frame to employ the spatial correlations among pixels. Different to previous tensor completion approaches, our new method can keep the spatio-temporal consistency in video and do not assume the global correlation in video frames. Thus, the proposed method can be applied to the general and practical video completion applications. Our method shows promising results in all evaluations on 3D biomedical image sequence and video benchmark data sets. 16:15 Sample-Adaptive Multiple Kernel LearningSPEAKER: unknownABSTRACT. Existing multiple kernel learning (MKL) algorithms indiscriminately apply a same set of kernel combination weights to all samples. However, the utility of base kernels could vary across samples and a base kernel useful for one sample could become noisy for another. In this case, rigidly applying a same set of kernel combination weights could adversely affect the learning performance. To improve this situation, we propose a sample-adaptive MKL algorithm, in which base kernels are allowed to be adaptively switched on/off with respect to each sample. We achieve this goal by assigning a latent binary variable to each base kernel when it is applied to a sample. The kernel combination weights and the latent variables are jointly optimized via margin maximization principle. As demonstrated on five benchmark data sets, the proposed algorithm consistently outperforms the comparable ones in the literature. 16:15 Gradient Descent with Proximal Average for Nonconvex and Composite RegularizationSPEAKER: unknownABSTRACT. Sparse modeling has been highly successful on high-dimensional data. While a lot of interests have been on convex regularization, recent studies show that nonconvex regularizers can outperform their convex counterparts in many situations. However, the resulting nonconvex optimization problems are often challenging, especially for composite regularizers such as the nonconvex overlapping group lasso. In this paper, by using a recent tool known as the proximal average, we propose a novel proximal gradient descent method for a wide class of composite and nonconvex problems. Instead of directly solving the proximal step with a composite regularizer, we average the solutions from the proximal problems of the individual regularizers. This simple strategy has similar convergence guarantee as existing nonconvex optimization approaches,but its per-iteration complexity is much lower. Experimental results on synthetic and real-world data sets demonstrate the effectiveness and efficiency of the proposed optimization algorithm, and also the improved prediction performance resulting from the nonconvex regularizers.
17:00-18:00 Session 35: AAAI-14 Speed Dating
Chair: Toby Walsh
Location: Room 206A/B
18:00-19:15 Session 36: AAAI-14 Pasta Buffet (open to all)
Location: Hall 200C
19:30-22:30 Session 37: Fun and Games Night
Location: Room 206A/B