PROGRAM FOR THURSDAY, JULY 31ST
08:40 | Challenges in Machine Learning SPEAKER: Thorsten Joachims |
08:55 | What's Hot in Robotics SPEAKER: Ashutosh Saxena |
09:10 | What's Hot in Learning Representations SPEAKER: Aaron Courville |
08:40 | Automating Science: A Grand Challenge for AI SPEAKER: Vasant Honavar ABSTRACT. The emergence of “big data” – data that are far more voluminous, diverse, and inter-related than we know how to cope with – has resulted in the transformation of many historically data poor disciplines into increasingly data rich disciplines. Much attention has focused on the challenges of management, processing, and analysis of big data. Advances in computing, storage, and communication technologies make it possible to organize, annotate, link, share, and analyze big data, setting the stage, in many disciplines, for a transformation from descriptive science to predictive science. However, there is a huge gap between our ability to acquire and process data and our ability to make effective use of data to advance science. While we understand how to automate routine aspects of data management and analytics, humans are still largely responsible for most aspects of the scientific process: identifying gaps in the current state of knowledge; generating and prioritizing questions; designing studies; designing, prioritizing, planning, and executing experiments; interpreting results; forming hypotheses; drawing conclusions; replicating studies; validating claims; documenting studies; communicating results; reviewing results; and integrating results into the larger body of knowledge in a discipline. Accelerating the transformation of the once data poor, but increasingly data rich disciplines from descriptive sciences to predictive sciences calls for (i) Understanding, formalization, and information processing accounts of the entire scientific process; (ii) Design, development, and evaluation of the computational artifacts (representations, processes) that embody such understanding; and (iii) Application of the resulting artifacts and systems to advance science (by augmenting individual or collective human efforts, or by fully automating science). Formalization and automation of (key aspects of) science presents a grand challenge for AI, with the attendant opportunities for foundational advances in AI. Against this background, this talk will highlight some AI challenges in automating science; review some developments that set the stage for automating science; briefly review some advances in AI (e.g., in computational discovery, active learning, causal inference, literature-based discovery, argumentation systems, scientific knowledge representation, and scientific applications of the semantic web) that signal progress on many of the AI problems in automating science; and conclude with a vision of AI research aimed at making the most of data, big and small, to not only accelerate science, but also enable new kinds of science. |
08:55 | Themes and Progress in Computational Scientific Discovery SPEAKER: Pat Langley ABSTRACT. I propose to review themes and progress in computational discovery of scientific knowledge. Topics will include viewing discovery as search through a space of models, the role of knowledge in aiding this search, and a focus on explaining observations rather than merely summarizing data. I will contrast research in this tradition with the more recent work on 'data-intensive' science. |
09:10 | Task Learning: a Challenge Problem for Integrated Intelligent Agents SPEAKER: John Laird ABSTRACT. The goal of this presentation is to establish task learning as a challenge problem for AI. Our vision for a general task learning agent is one where a human interactively teaches the agent new tasks, using a combination of language, gestures, and demonstrations, with reference to the agent’s available background knowledge. As the task is defined, the agent asks questions about concepts it does not know, and about instructions that are not clear. The need for taskable agents is going to become critical as intelligent autonomous agents play increasing roles in our lives. It will be impossible to preprogram all the tasks that a human user will want an agent to perform. We will want to dynamically instruct our agents with new tasks and incrementally customize their behavior so that they do things the way we want them to be done. This presentation will build on the results of a (proposed) NSF workshop on task learning being held in May that will bring together experts in autonomous agents, cognitive architecture, human-robot interaction, learning, and cognitive robots to establish a roadmap for research in task learning. |
08:40 | Senior Member: Tackling Real World Data Streams: a Call to Arms SPEAKER: Bernhard Pfahringer ABSTRACT. Data stream mining has been a very active and successful research area in the last decade. It is best characterised as the online and resource-bounded processing and analysis of unbounded streams of data originating from non-stationary data sources. Still, when looking at real world data streams, the current state of the art in stream mining research is lacking in several regards. Shortcomings include preprocessing, labeling, scalability, evaluation, complex data, and a dearth of streaming algorithms for non-classification problems. |
08:55 | Senior Member: Computational Social Choice SPEAKER: Francesca Rossi ABSTRACT. Computational social choice is an expanding field of research that merges classical disciplines like voting theory, political science, and economics with more modern ones like artificial intelligence and computational complexity. The main aim of this research area is to study AI settings where one needs to aggregate preferences coming from multiple sources, or agents, and to do so with computational concerns in mind. I plan to describe this very exciting and inter-disciplinary area of research, which in the past few years has become more and more present in large AI conferences like AAAI and IJCAI, as well as AAMAS and EC, by introducing the main lines of work and hinting at its potential for the future. |
09:10 | Wikipedia-based Semantic Interpretation for Natural Language Processing SPEAKER: Evgeniy Gabrilovich |
08:40 | Challenges in Interactive Entertainment SPEAKER: Kevin Dill |
08:55 | Challenges in Constraint Programming SPEAKER: Pascal Van Hentenryck |
09:10 | Challenges Beyond Factoid Question Answering SPEAKER: Ken Barker |
09:00 | Emerging: Robust Protection of Fisheries with COmPASS SPEAKER: William Haskell ABSTRACT. Fish stocks around the world are in danger from illegal fishing. In collaboration with the U.S. Coast Guard (USCG), we work to defend fisheries from illegal fishermen (henceforth called Lanchas) in the U.S. Gulf of Mexico. We have developed the COmPASS (Conservative Online Patrol ASSistant) system to design USCG patrols against the Lanchas. In this application, we face a population of Lanchas with heterogeneous behavior who fish frequently. We have some data about these Lanchas, but not enough to fit a statistical model. Previous security patrol assistants have focused on counterterrorism in one-shot games where adversaries are assumed to be perfectly rational, and much less data about their behavior is available. COmPASS is novel because: (i) it emphasizes environmental crime; (ii) it is based on a repeated Stackelberg game; (iii) it allows for bounded rationality of the Lanchas and it offers a robust approach against the heterogeneity of the Lancha population; and (iv) it can learn from sparse Lancha data. We report the effectiveness of COmPASS in the Gulf in our numerical experiments based on real fish data. The COmPASS system is to be tested by the USCG. |
09:20 | Emerging: Swissnoise: Online Polls with Game-Theoretic Incentives SPEAKER: Florent Garcin ABSTRACT. There is much interest in crowdsourcing information that is distributed among many individuals, such as the likelihood of future events, election outcomes, the quality of products, or the consequence of a decision. To obtain accurate outcomes, various game-theoretic incentive schemes have been proposed. However, only prediction markets have been tried in practice. In this paper, we describe an experimental platform, swissnoise, that compares prediction markets with peer prediction schemes developed in recent AI research. It shows that peer prediction schemes can achieve similar performance while being applicable to a much broader range of questions. |
09:40 | Emerging: Crowdsourcing for Multiple-Choice Question Answering SPEAKER: Bahadir Ismail Aydin ABSTRACT. We leverage crowd wisdom for multiple-choice question answering, and employ lightweight machine learning techniques to improve the aggregation accuracy of crowdsourced answers to these questions. In order to develop more effective aggregation methods and evaluate them empirically, we developed and deployed a crowdsourced system for playing the "Who wants to be a millionaire?" quiz show. Analyzing our data (which consist of more than 200,000 answers), we find that by just going with the most selected answer in the aggregation, we can answer over 90% of the questions correctly, but the success rate of this technique plunges to 60% for the later/harder questions in the quiz show. To improve the success rates of these later/harder questions, we investigate novel weighted aggregation schemes for aggregating the answers obtained from the crowd. By using weights optimized for reliability of participants (derived from the participants' confidence), we show that we can pull up the accuracy rate for the harder questions by 15%, and to overall 95% average accuracy. Our results provide a good case for the benefits of applying machine learning techniques for building more accurate crowdsourced question answering systems. |
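As a rough illustration of the weighted aggregation idea in this abstract, the sketch below shows generic confidence-weighted plurality voting over crowd answers; the data format and function name are assumptions for illustration, and this is not the authors' optimized weighting scheme.

```python
# Illustrative sketch only: generic confidence-weighted plurality voting,
# not the authors' optimized aggregation scheme. The data format is assumed.
from collections import defaultdict

def aggregate_answers(responses):
    """responses: iterable of (participant_id, chosen_option, confidence in [0, 1])."""
    totals = defaultdict(float)
    for _, option, confidence in responses:
        totals[option] += confidence  # weight each vote by the reported confidence
    return max(totals, key=totals.get)

# "A" wins a raw majority vote, but the confidence-weighted vote picks "B".
votes = [(1, "A", 0.3), (2, "A", 0.2), (3, "A", 0.2), (4, "B", 0.9), (5, "B", 0.8)]
print(aggregate_answers(votes))  # -> B
```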
10:15 | Game Theory for Security: Key Algorithmic Principles, Deployed Applications, Research Challenges SPEAKER: Milind Tambe ABSTRACT. Security is a global concern, requiring efficient, randomized allocation and scheduling of limited security resources. To that end, we have used computational game theory to build decision aids for security agencies around the world. These decision aids are in use by agencies such as the US Coast Guard for protection of ports and ferry traffic, and the Federal Air Marshals Service and LAX police for protecting air traffic; our game-theoretic algorithms are also under evaluation for suppression of urban crime and for protection of wildlife and fisheries. I will overview my group's research in this growing area of security games. |
10:15 | Emerging: A Unified Framework for Augmented Reality and Knowledge-Based Systems in Maintaining Aircraft SPEAKER: Geun-Sik Jo ABSTRACT. Aircraft maintenance and training play one of the most important roles in ensuring flight safety. The maintenance process usually involves massive numbers of components and substantial procedural knowledge of maintenance procedures. Maintenance tasks require technicians to follow rigorous procedures to prevent operational errors in the maintenance process. In addition, the maintenance time is a cost-sensitive issue for airlines. This paper proposes an intelligent augmented reality (IAR) system to minimize operation errors and time-related costs and help aircraft technicians cope with complex tasks by using an intuitive UI/UX interface for their maintenance tasks. The IAR system is composed mainly of three major modules: 1) the AR module, 2) the knowledge-based system (KBS) module, and 3) a unified platform with an integrated UI/UX module between the AR and KBS modules. The AR module addresses vision-based tracking, annotation, and recognition. The KBS module deals with ontology-based resources and context management. Overall testing of the IAR system is conducted at Korea Air Lines (KAL) hangars. Tasks involving the removal and installation of pitch trimmers in landing gear are selected for benchmarking purposes, and according to the results, the proposed IAR system can help technicians to be more effective and accurate in performing their maintenance tasks. |
10:35 | Emerging: Optimizing a Start-Stop Controller Using Policy Search SPEAKER: Noel Hollingsworth ABSTRACT. We applied a policy search algorithm to the problem of optimizing a start-stop controller — a controller used in a car to turn off the vehicle's engine, and thus save energy, when the vehicle comes to a temporary halt. We were able to improve the existing policy by approximately 12% using real driver trace data. We also experimented with using multiple policies, and found that doing so could lead to a further 8% improvement if we could determine which policy to apply at each stop. The driver's behaviors before stopping were found to be uncorrelated with the policy that performed best; however, further experimentation showed that the driver's behavior during the stop may be more useful, suggesting a useful direction for adding complexity to the underlying start-stop policy. |
10:55 | Emerging: Advice Provision for Energy Saving in Automobile Climate Control Systems SPEAKER: Amos Azaria ABSTRACT. Reducing the energy consumption of climate control systems is important in order to reduce the human environmental footprint. The need to save energy becomes even greater when considering an electric car, since heavy use of the climate control system may exhaust the battery. In this paper we consider a method for an automated agent to provide advice to drivers that motivates them to reduce the energy consumption of their climate control unit. Our approach takes into account both the energy consumption of the climate control system and the expected comfort level of the driver. We therefore build two models: one assesses the energy consumption of the climate control system as a function of the system's settings, and the other models the human comfort level as a function of those settings. Using these models, the agent advises the driver on how to set the climate control system, suggesting settings that try to preserve a high level of comfort while consuming as little energy as possible. We empirically show that drivers equipped with our advice-providing agent save significantly more energy than drivers without it. |
11:15 | Emerging: StrokeBank: Automating Personalized Chinese Handwriting Generation SPEAKER: Alfred Zong ABSTRACT. Machine learning techniques have been successfully applied to Chinese character recognition; nonetheless, automatic generation of stylized Chinese handwriting remains a challenge. In this paper, we propose StrokeBank, a novel approach to automating personalized Chinese handwriting generation. We use a semi-supervised algorithm to construct a dictionary of component mappings from a small seeding set. Unlike previous work, our approach does not require human supervision in stroke extraction or knowledge of the structure of Chinese characters. This dictionary is used to generate handwriting that preserves stylistic variations, including cursiveness and spatial layout of strokes. We demonstrate the effectiveness of our model by a survey-based evaluation. The results show that our generated characters are nearly indistinguishable from ground-truth handwriting. |
13:30 | 30 Years of Probability in AI and Machine Learning SPEAKER: Padhraic Smyth |
14:35 | A Generalization of Probabilistic Serial to Randomized Social Choice SPEAKER: unknown ABSTRACT. The probabilistic serial (PS) rule is one of the most well-established and desirable rules for the random assignment problem. We present the egalitarian simultaneous reservation (ESR) social decision scheme — an extension of PS to the more general setting of randomized social choice. ESR also generalizes an egalitarian rule from the literature which is defined only for dichotomous preferences. We consider various desirable fairness, efficiency, and strategic properties of ESR and show that it compares favourably against other social decision schemes. Finally, we define a more general class of social decision schemes called Simultaneous Reservation (SR), which contains ESR as well as the serial dictatorship rules. We show that outcomes of SR characterize efficiency with respect to a natural refinement of stochastic dominance. |
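For readers unfamiliar with the baseline this work extends, here is a minimal sketch of the standard probabilistic serial ("simultaneous eating") rule for the assignment setting, assuming equally many agents and objects with complete strict preferences; it is the classic PS rule only, not the ESR or SR schemes proposed in the paper.

```python
# Minimal sketch of the standard probabilistic serial ("simultaneous eating")
# rule, assuming equally many agents and objects with complete strict
# preferences. This is the baseline PS rule, not the paper's ESR/SR schemes.
from collections import defaultdict

def probabilistic_serial(preferences):
    """preferences: dict agent -> list of objects, most preferred first.
    Returns dict (agent, object) -> probability of receiving that object."""
    agents = list(preferences)
    supply = {o: 1.0 for o in next(iter(preferences.values()))}
    shares = defaultdict(float)
    time_left = 1.0  # every agent eats at unit speed for one unit of time
    while time_left > 1e-9:
        # each agent eats its most preferred object that is not yet exhausted
        targets = {a: next(o for o in preferences[a] if supply[o] > 1e-9)
                   for a in agents}
        eaters = defaultdict(list)
        for a, o in targets.items():
            eaters[o].append(a)
        # the step lasts until some object is exhausted or time runs out
        step = min(time_left, min(supply[o] / len(eaters[o]) for o in eaters))
        for o, who in eaters.items():
            supply[o] -= step * len(who)
            for a in who:
                shares[(a, o)] += step
        time_left -= step
    return dict(shares)

prefs = {"a1": ["x", "y", "z"], "a2": ["x", "z", "y"], "a3": ["y", "x", "z"]}
# a1 and a2 split x equally; a3 eats most of y; the rest of z is shared.
print(probabilistic_serial(prefs))
```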
14:50 | Biased Games SPEAKER: unknown ABSTRACT. We present a novel extension of normal form games that we call biased games. In these games, a player's utility is influenced by the distance between his mixed strategy and a given base strategy. We argue that biased games capture important aspects of the interaction between software agents. Our main result is that biased games satisfying certain mild conditions always admit an equilibrium. We also tackle the computation of equilibria in biased games. |
15:05 | Simultaneous Cake Cutting SPEAKER: unknown ABSTRACT. We introduce the simultaneous model for cake cutting (the fair allocation of a divisible good), in which agents simultaneously send messages containing a sketch of their preferences over the cake. We show that this model enables the computation of divisions that satisfy proportionality — a popular fairness notion — using a protocol that circumvents a standard lower bound via parallel information elicitation. Cake divisions satisfying another prominent fairness notion, envy-freeness, are impossible to compute in the simultaneous model, but such allocations admit arbitrarily good approximations. |
15:20 | Incomplete Preferences in Single-Peaked Electorates SPEAKER: Martin Lackner ABSTRACT. Incomplete preferences are likely to arise in real-world preference aggregation and voting systems. This paper deals with determining whether an incomplete preference profile is single-peaked. This is essential information since many intractable voting problems become tractable for single-peaked profiles. We prove that for incomplete profiles the problem of determining single-peakedness is NP-complete. Despite this computational hardness result, we find four polynomial-time algorithms for reasonably restricted settings. |
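As context for the tractable special case behind this abstract, the sketch below (an illustration, not the paper's algorithms) tests whether complete votes are single-peaked with respect to a given axis, using the standard characterization that every top-k prefix of a single-peaked vote forms a contiguous interval of the axis; the paper's harder setting involves incomplete votes and no given axis.

```python
# Sketch: check single-peakedness of COMPLETE votes for a GIVEN axis, using
# the fact that a vote is single-peaked on an axis iff each top-k prefix of
# the vote is a contiguous interval of the axis. The paper's problem
# (incomplete votes, unknown axis) is NP-complete in general.

def is_single_peaked(vote, axis):
    """vote: candidates from most to least preferred; axis: left-to-right order."""
    position = {c: i for i, c in enumerate(axis)}
    seen = []
    for candidate in vote:
        seen.append(position[candidate])
        if max(seen) - min(seen) + 1 != len(seen):
            return False  # the current top-k set is not contiguous on the axis
    return True

axis = ["a", "b", "c", "d"]
print(is_single_peaked(["b", "c", "a", "d"], axis))  # True
print(is_single_peaked(["c", "a", "d", "b"], axis))  # False ({a, c} skips b)
```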
15:35 | Chinese Zero Pronoun Resolution: An Unsupervised Approach Combining Ranking and Integer Linear Programming SPEAKER: unknown ABSTRACT. Compared to overt pronoun resolution, there is less work on the more challenging task of zero pronoun resolution. State-of-the-art approaches to zero pronoun resolution are supervised, requiring the availability of documents containing manually resolved zero pronouns. In contrast, we propose in this paper an unsupervised approach to this task. Underlying our approach is the novel idea of employing a model trained on manually resolved overt pronouns to resolve zero pronouns. Experimental results on the OntoNotes corpus are encouraging: our unsupervised model rivals its supervised counterparts in performance. |
15:35 | Contraction and Revision over DL-Lite TBoxes SPEAKER: unknown ABSTRACT. An essential task in managing DL ontologies is to deal with changes over the ontologies. In particular, outdated axioms have to be removed from the ontology and newly formed axioms have to be incorporated into the ontology. Such changes are formalised as the operations of contraction and revision in the literature. The operations can be defined in various ways. To investigate properties of a defined operation, it is best to identify some postulates that completely characterise the operation, such that on the one hand the operation satisfies the postulates and on the other hand it is the only operation that satisfies all the postulates. Such characterisation results have never been shown for contractions under DLs. In this paper, we define model-based contraction and revision for DL-Lite$_{core}$ TBoxes and provide characterisation results for both operations. As a first step towards applying the operations in practice, we also provide tractable algorithms for both operations. Since DL semantics incurs infinitely many models for DL-Lite TBoxes, it is not feasible to develop algorithms involving DL models. The key to our operations and algorithms is the development of an alternative semantics called type semantics. Type semantics closely resembles the semantics underlying propositional logic, and is thus more succinct than DL semantics. Most importantly, given a finite signature, any DL-Lite$_{core}$ TBox has a finite number of type models. |
15:35 | Optimal and Efficient Stochastic Motion Planning in Partially-Known Environments SPEAKER: unknown ABSTRACT. A framework capable of computing optimal control policies for a continuous system in the presence of both action and environment uncertainty is presented in this work. The framework decomposes the planning problem into two stages: an offline phase that reasons only over action uncertainty and an online phase that quickly reacts to the uncertain environment. Offline, a bounded-parameter Markov decision process (BMDP) is employed to model the evolution of the stochastic system over a discretization of the environment. Online, an optimal control policy over the BMDP is computed. Upon the discovery of an unknown environment feature during policy execution, the BMDP is updated and the optimal control policy is efficiently recomputed. Depending on the desired quality of the control policy, a suite of methods is presented to incorporate new information into the BMDP with varying degrees of detail online. Experiments confirm that the framework recomputes high-quality policies in seconds and is orders of magnitude faster than existing methods. |
15:35 | Natural Temporal Difference Learning SPEAKER: unknown ABSTRACT. In this paper we investigate the application of natural gradient descent to Bellman error based reinforcement learning algorithms. This combination is interesting because natural gradient descent is invariant to the parameterization of the value function. This invariance property means that natural gradient descent adapts its update directions to correct for poorly conditioned representations. We present and analyze quadratic and linear time natural temporal difference learning algorithms, and prove that they are covariant. We conclude with experiments which suggest that the natural algorithms can match or outperform their non-natural counterparts using linear function approximation, and drastically improve upon their non-natural counterparts when using non-linear function approximation. |
15:35 | Large Scale Analogical Reasoning SPEAKER: unknown ABSTRACT. It has been argued that one can use cognitive simulation of analogical processing to answer comparison questions. In the context of a knowledge base (KB) system, a comparison question takes the form: What are the similarities and/or differences between A and B?, where A and B are concepts in the KB. Previous attempts to use a general purpose analogical reasoner to answer this question revealed three major problems: (a) the system presented too much information in the answer, and the salient similarity or difference was not highlighted; (b) analogical inference found some incorrect differences; (c) some expected similarities were not found. The primary cause of these problems was the lack of availability of a well-curated KB, and secondarily, there were also some algorithmic deficiencies. In this paper, we present a treatment of comparison questions that is inspired by the general model of analogical reasoning, but is specific to the questions at hand. We also rely on a well-curated biology KB. We present numerous examples of answers produced by the system and empirical data on the quality of the answers to claim that we have addressed many of the problems faced in the previous system. |
15:35 | Fused Feature Representation Discovery for High-dimensional and Sparse Data SPEAKER: unknown ABSTRACT. The automatic discovery of a significant low-dimensional feature representation from a given data set is a fundamental problem in machine learning. This paper specifically focuses on developing feature representation discovery methods appropriate for high-dimensional and sparse data, which remain a frontier but are now becoming a highly important tool. We formulate our feature representation discovery problem as a variant of the semi-supervised learning problem, namely, an optimization problem over unsupervised data whose objective is evaluating the impact of each feature with respect to modeling a target task according to an initial model constructed from supervised data. The most notable characteristic of our method is that it offers feasible processing speed even if the numbers of data points and features both exceed the billions, and it successfully provides significantly smaller feature sets, i.e., fewer than 10 features, that can also offer improved performance compared with the original feature sets. We demonstrate the effectiveness of our method in experiments on two well-known natural language processing tasks. |
15:35 | Q-intersection Algorithms for Constraint-Based Robust Parameter Estimation SPEAKER: unknown ABSTRACT. Given a set of axis-parallel n-dimensional boxes, the q-intersection is defined as the smallest box encompassing all the points that belong to at least q boxes. Computing the q-intersection is a combinatorial problem that allows us to handle robust parameter estimation with a numerical constraint programming approach. The q-intersection can be viewed as a filtering operator for soft constraints that model measurements subject to outliers. This paper highlights the equivalence of this operator with the search of q-cliques in a graph whose boxicity is bounded by the number of variables in the constraint network. We present a computational study of the q-intersection. We also propose a fast incomplete algorithm and a sophisticated exact q-intersection algorithm. First experimental results show that our exact algorithm outperforms the existing one while our heuristic performs an efficient filtering on hard problems. |
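To make the definition concrete, the sketch below computes the q-intersection in one dimension with a simple endpoint sweep; it only illustrates the definition and is not the paper's heuristic or exact n-dimensional algorithms.

```python
# One-dimensional illustration of the q-intersection definition: the smallest
# interval containing every point covered by at least q of the input
# intervals. This endpoint sweep is not the paper's n-dimensional algorithms.

def q_intersection_1d(intervals, q):
    """intervals: list of (lo, hi) closed intervals; returns (lo, hi) or None."""
    events = []
    for lo, hi in intervals:
        events.append((lo, +1))  # coverage rises when an interval opens
        events.append((hi, -1))  # coverage falls when an interval closes
    # at equal coordinates, process openings first (closed intervals touch)
    events.sort(key=lambda e: (e[0], -e[1]))
    coverage, left, right = 0, None, None
    for x, delta in events:
        was_covered = coverage >= q
        coverage += delta
        if coverage >= q and left is None:
            left = x              # first point whose coverage reaches q
        if was_covered and coverage < q:
            right = x             # latest exit from a coverage-q region
    return None if left is None else (left, right)

print(q_intersection_1d([(0, 2), (1, 3), (5, 6)], 2))  # (1, 2)
```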
15:35 | CoreCluster: A Degeneracy Based Graph Clustering Framework SPEAKER: unknown ABSTRACT. Graph clustering or community detection constitutes an important task for investigating the internal structure of graphs, with a plethora of applications in several diverse domains. Traditional tools for graph clustering, such as spectral methods, typically suffer from high time and space complexity. In this article, we present CoreCluster, an efficient graph clustering framework based on the concept of graph degeneracy, that can be used along with any known graph clustering algorithm. Our approach capitalizes on processing the graph in a hierarchical manner provided by its core expansion sequence, an ordered partition of the graph into different levels according to the k-core decomposition. Such a partition provides a way to process the graph in an incremental manner that preserves its clustering structure, while making the execution of the chosen clustering algorithm much faster due to the smaller size of the graph’s partitions onto which the algorithm operates. |
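For readers new to graph degeneracy, the snippet below shows the k-core decomposition and core expansion sequence that the framework builds on (networkx is an assumed dependency); the incremental coupling with a base clustering algorithm is the paper's contribution and is not reproduced here.

```python
# Sketch of the k-core decomposition underlying the core expansion sequence
# (networkx assumed as a dependency); CoreCluster's incremental coupling with
# a base clustering algorithm is not reproduced here.
import networkx as nx

G = nx.karate_club_graph()
core_number = nx.core_number(G)  # node -> largest k with the node in the k-core
max_k = max(core_number.values())

# Process the densest core first, then expand outward level by level.
for k in range(max_k, 0, -1):
    level = [n for n, c in core_number.items() if c == k]
    print(f"k={k}: {len(level)} nodes join the expansion at this level")
    # a chosen base clustering algorithm would be applied incrementally here
```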
15:35 | Tree-Based On-line Reinforcement Learning SPEAKER: Andre Barreto ABSTRACT. Fitted Q-iteration (FQI) stands out among reinforcement-learning algorithms for its flexibility and ease of use. FQI can be combined with any regression method, and this choice determines the algorithm's theoretical and computational properties. The combination of FQI with an ensemble of regression trees gives rise to an algorithm, FQIT, that is computationally efficient, scalable to high dimensional spaces, and robust to irrelevant variables, outliers, and noise. Despite its nice properties and good performance in practice, FQIT also has some limitations: the fact that an ensemble of trees must be constructed (or updated) at each iteration confines the algorithm to the batch scenario. This paper aims to address this specific issue. Based on a strategy recently proposed in the literature, called the stochastic-factorization trick, we propose a modification of FQIT that makes it fully incremental, and thus suitable for on-line learning. We call the resulting method tree-based stochastic factorization (TBSF). We derive an upper bound for the difference between the value functions computed by FQIT and TBSF, and also show in which circumstances the approximations coincide. A series of computational experiments is presented to illustrate the properties of TBSF and to show its usefulness in practice. |
15:35 | Local-To-Global Consistency Implies Tractability of Abduction SPEAKER: Michał Wrona ABSTRACT. Abduction is a form of nonmonotonic reasoning that looks for an explanation, built from a given set of hypotheses, for an observed manifestation according to some knowledge base. Following the concept behind Schaefer's parametrization CSP(Gamma) of the Constraint Satisfaction Problem (CSP), we study here the complexity of the abduction problem Abduction(Gamma, Hyp, M) parametrized by certain (omega-categorical) infinite relational structures Gamma, Hyp, and M from which a knowledge base, hypotheses and a manifestation are built, respectively. We say that Gamma has local-to-global consistency if there is k such that establishing strong k-consistency on an instance of CSP(Gamma) yields a globally consistent (whose every solution may be obtained straightforwardly from partial solutions) set of constraints. In this case CSP(Gamma) is solvable in polynomial time. Our main contribution is an algorithm that, under some natural conditions, decides Abduction(Gamma, Hyp, M) in P when Gamma has local-to-global consistency. As we show in a number of examples, our approach offers an opportunity to consider abduction in the context of spatial and temporal reasoning (qualitative calculi such as Allen's interval algebra or RCC-5), and our procedure solves some related abduction problems in polynomial time. |
15:35 | Pairwise-Covariance Linear Discriminant Analysis SPEAKER: unknown ABSTRACT. In machine learning, linear discriminant analysis (LDA) is a popular dimension reduction method. In this paper, we first provide a new information-theoretic perspective on LDA. From this new perspective, we propose a new formulation of LDA, which uses the pairwise averaged class covariance instead of the globally averaged class covariance used in standard LDA. This pairwise (averaged) covariance describes the data distribution more accurately. The new perspective also provides a natural way to properly weight different pairwise distances, which emphasizes the pairs of classes with small distances, and this leads to the proposed pairwise covariance properly weighted LDA (pcLDA). The kernel version of pcLDA is presented to handle nonlinear projections. Efficient algorithms are presented to compute the proposed models. |
15:35 | A Multiarmed Bandit Incentive Mechanism for Crowdsourcing Demand Response in Smart Grids SPEAKER: unknown ABSTRACT. Demand response is a critical part of renewable integration and energy cost reduction goals across the world. Motivated by the need to reduce costs arising from electricity shortage and renewable energy fluctuations, we propose a novel multiarmed bandit mechanism for demand response (MAB-MDR) which makes monetary offers to strategic consumers who have unknown response characteristics. Our work is inspired by connections to and intuition from crowdsourcing mechanisms. The proposed mechanism incorporates realistic features such as a time-varying and quadratic cost function. The mechanism marries auctions, which allow users to report their preferences, with online algorithms, which allow distribution companies to learn user-specific parameters. We show that MAB-MDR is dominant strategy incentive compatible, individually rational, and achieves sublinear regret. Such mechanisms can be effectively deployed in smart grids using new information and control architecture innovations and lead to welcome savings in energy costs. |
15:35 | Strategyproof exchange with multiple private endowments SPEAKER: unknown ABSTRACT. We study a mechanism design problem for exchange economies where each agent is initially endowed with a set of indivisible goods and side payments are not allowed. We assume each agent can withhold some endowments, as well as misreport her preference. Under this assumption, strategyproofness requires that for each agent, reporting her true preference with revealing all her endowments is a dominant strategy, and thus implies individual rationality. Our objective in this paper is to analyze the effect of such private ownership in exchange economies with multiple endowments. As fundamental results, we first show that the revelation principle holds under a natural assumption and that strategyproofness and Pareto efficiency are incompatible even under the lexicographic preference domain. We then propose a class of exchange rules, each of which has a corresponding directed graph to prescribe possible trades, and provide a necessary and sufficient condition on the graph structure so that the rule satisfies strategy-proofness. |
15:35 | SOML: Sparse Online Metric Learning with Application to Image Retrieval SPEAKER: unknown ABSTRACT. Image similarity search plays a key role in many multimedia applications, where multimedia data (such as images and videos) are usually represented in a high-dimensional feature space. In this paper, we propose a novel Sparse Online Metric Learning (SOML) scheme for learning sparse distance functions from large-scale high-dimensional data and explore its application to image retrieval. In contrast to many existing distance metric learning algorithms that are often designed for low-dimensional data, the proposed algorithms are able to learn sparse distance metrics from high-dimensional data in an efficient and scalable manner. Our experimental results show that the proposed method achieves accuracy better than or at least comparable to state-of-the-art non-sparse distance metric learning approaches, while enjoying a significant advantage in computational efficiency and sparsity, making it more practical for real-world applications. |
15:35 | Finding Median Point-Set Using Earth Mover's Distance SPEAKER: unknown ABSTRACT. In this paper, we study a prototype learning problem, called Median Point-Set, whose objective is to construct a prototype for a set of given point-sets so as to minimize the total earth mover's distance (EMD) between the prototype and the point-sets, where EMD between two point-sets is measured under affine transformation. For this problem, we present the first purely geometric approach. Compared to existing graph-based approaches (e.g., median graph, shock graph), our approach has several unique advantages: (1) no encoding and decoding procedures are needed to map between objects and graphs, which avoids errors caused by information loss during the mappings; (2) staying only in the geometric domain makes our approach computationally more efficient and robust to noise. As a key ingredient of our approach, we present the first quality-guaranteed algorithm for minimizing EMD between two point-sets under affine transformation. We evaluate the performance of our technique for prototype reconstruction on a random dataset and a benchmark dataset of handwritten Chinese characters. Experiments suggest that our technique considerably outperforms the existing graph-based methods. |
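As background on the distance itself, the sketch below computes plain EMD between two equal-size point sets with uniform weights, where it reduces to a minimum-cost perfect matching; SciPy is an assumed dependency, and the alignment under affine transformation that the paper optimizes is not included.

```python
# Background sketch: plain EMD between two equal-size, uniformly weighted
# point sets reduces to a minimum-cost perfect matching (Hungarian algorithm).
# SciPy is an assumed dependency; the paper's optimization over affine
# transformations is not included here.
import numpy as np
from scipy.optimize import linear_sum_assignment
from scipy.spatial.distance import cdist

def emd_uniform(points_a, points_b):
    """points_a, points_b: (n, d) arrays; returns the average ground distance
    under the optimal one-to-one matching."""
    cost = cdist(points_a, points_b)          # pairwise Euclidean distances
    rows, cols = linear_sum_assignment(cost)  # optimal assignment
    return cost[rows, cols].mean()

a = np.array([[0.0, 0.0], [1.0, 0.0]])
b = np.array([[0.0, 0.1], [1.0, -0.1]])
print(emd_uniform(a, b))  # 0.1
```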
15:35 | Intra-view and Inter-view Supervised Correlation Analysis for Multi-view Feature Learning SPEAKER: unknown ABSTRACT. Objects can typically be observed from multiple views, and multi-view feature learning is an attractive research topic with great practical success. Canonical correlation analysis (CCA) has become an important technique in multi-view learning, since it can fully utilize the inter-view correlation. In this paper, we mainly study CCA-based multi-view supervised feature learning, where the labels of training samples are known. Several supervised CCA-based multi-view methods have been presented, which focus on investigating the supervised correlation across different views. However, they take no account of the intra-view correlation between samples. Researchers have also introduced the discriminant analysis technique into multi-view feature learning, such as multi-view discriminant analysis (MvDA). But these methods ignore the canonical correlation within each view and between all views. In this paper, we propose a novel multi-view feature learning approach based on intra-view and inter-view supervised correlation analysis (I2SCA), which can explore the useful correlation information of samples within each view and between all views. The objective function of I2SCA is designed to simultaneously extract the discriminatingly correlated features from both the inter-view and intra-view perspectives. It admits an analytical solution without iterative calculation. We also provide a kernelized extension of I2SCA to tackle the linearly inseparable problem in the original feature space. Three widely-used datasets are employed as test data. Experimental results demonstrate that our proposed approaches outperform several representative multi-view supervised feature learning methods. |
15:35 | Internally Stable Matchings and Exchanges SPEAKER: unknown ABSTRACT. Stability is a central concept in matching-based mechanism design. It imposes a fundamental requirement that no subset of agents could beneficially deviate from the prescribed outcome. However, deployment of stability in the current design of kidney exchange mechanisms presents at least two challenges. First, it reduces the social welfare of the mechanism and sometimes prevents the mechanism from producing any outcome at all. Second, it sometimes incurs computational costs to clear the mechanism. In this paper, we propose an alternative notion of stability. Our theoretical and experimental studies demonstrate that the new notion of stability addresses both challenges above and could be deployed in the current kidney exchange design. |
15:35 | Testable Implications of Linear Structural Equation Models SPEAKER: unknown ABSTRACT. In causal inference, all methods of model learning rely on testable implications, namely, properties of the joint distribution that are dictated by the model structure. These constraints, if not satisfied in the data, allow us to reject or modify the model. Most common methods of testing a linear structural equation model (SEM) rely on the likelihood ratio or chi-square test which simultaneously tests all of the restrictions implied by the model. Local constraints, on the other hand, offer increased power (Bollen and Pearl, 2013; McDonald, 2002) and, in the case of failure, provide the modeler with insight for revising the model specification. One strategy of uncovering local constraints in linear SEMs is to search for overidentified path coefficients. While these overidentifying constraints are well known, no method has been given for systematically discovering them. In this paper, we extend the half-trek criterion of (Foygel et al., 2012) to identify a larger set of structural coefficients and use it to systematically discover overidentifying constraints. Still open is the question of whether our algorithm is complete. |
15:35 | Solving the Inferential Frame Problem in the General Game Description Language SPEAKER: unknown ABSTRACT. The Game Description Language GDL is the standard input language for general game-playing systems. While players can gain a lot of traction from an efficient inference algorithm for GDL, state-of-the-art reasoners suffer from a variant of a classical KR problem, the inferential frame problem. We present a method by which general game players can transform any given game description into a representation that solves this problem. Our experimental results demonstrate that with the help of automatically generated domain knowledge, a significant speedup can thus be obtained for the majority of the game descriptions from the AAAI competition. |
15:35 | Monte Carlo Filtering using Kernel Embedding of Distributions SPEAKER: unknown ABSTRACT. Recent advances of kernel methods have yielded the framework for representing probabilities using a reproducing kernel Hilbert space, called kernel embedding of distributions. In this paper, we propose a Monte Carlo filtering algorithm based on kernel embeddings. The proposed method is applied to state-space models where sampling from the transition model is possible, while the observation model is to be learned from training samples without assuming a parametric model. We derive convergence rates for the sampling method introduced to the kernel embedding approach. Experimental results on synthetic models and a real vision-based robot localization problem confirm the effectiveness of the proposed approach. |
15:35 | User Intent Identification from Online Discussions using a Joint Aspect-Action Topic Model SPEAKER: unknown ABSTRACT. Online discussions are growing as a popular, effective and reliable source of information for users because of their liveliness, flexibility and up-to-date information. Online discussions are usually developed and advanced by groups of users with various backgrounds and intents. However, because of the diversity of topics and issues discussed by the users, supervised methods are not able to accurately model such dynamic conditions. In this paper, we propose a novel unsupervised generative model to derive aspect-action pairs from online discussions. The proposed method simultaneously captures and models these two features together with the relationships that exist in each thread. We assume that each user post is generated by a mixture of aspect and action topics. Therefore, we design a model that captures the latent factors that incorporate the aspect types and intended actions, which describe how users develop a topic in a discussion. In order to demonstrate the effectiveness of our approach, we empirically compare our model against state-of-the-art methods on a large-scale discussion dataset crawled from Apple discussions, containing over 3.3 million user posts from 340k discussion threads. |
15:35 | Envy-Free Division of Sellable Goods SPEAKER: unknown ABSTRACT. We study the envy-free allocation of indivisible goods between two players. Our novel setting includes an option to sell each good for a fraction of the minimum value any player has for the good. To rigorously quantify the efficiency gain from selling, we reason about the price of envy-freeness of allocations of sellable goods — the ratio between the maximum social welfare and the social welfare of the best envy-free allocation. We show that envy-free allocations of sellable goods are significantly more efficient than their unsellable counterparts. |
15:35 | Adaptive Knowledge Transfer for Multiple Instance Learning in Image Classification SPEAKER: unknown ABSTRACT. Multiple Instance Learning (MIL) is a popular learning technique in various vision tasks including image classification. However, most existing MIL methods do not consider the problem of insufficient examples in the given target category. In this case, it is difficult for traditional MIL methods to build an accurate classifier due to the lack of training examples. Motivated by the empirical success of transfer learning, this paper proposes a novel approach of Adaptive Knowledge Transfer for Multiple Instance Learning (AKT-MIL) in image classification. The new method transfers cross-category knowledge from source categories under a multiple instance setting to boost the learning process. A unified learning framework with a data-dependent mixture model is designed to adaptively combine the transferred knowledge from sources with a weak classifier built in the target domain. Based on this framework, an iterative coordinate descent method with Constraint Concave-Convex Programming (CCCP) is proposed as the optimization procedure. An extensive set of experimental results demonstrates that the proposed AKT-MIL approach substantially outperforms several state-of-the-art algorithms on two benchmark datasets, especially in the scenario when very few training examples are available in the target domain. |
15:35 | PAC Rank Elicitation through Adaptive Sampling of Stochastic Pairwise Preferences SPEAKER: unknown ABSTRACT. We introduce the problem of PAC rank elicitation, which consists of sorting a given set of options based on adaptive sampling of stochastic pairwise preferences. More specifically, the goal is to predict a ranking that is sufficiently close to a target order with high probability. We instantiate this setting with combinations of two different distance measures and ranking procedures for determining the target order. For these instantiations, we devise efficient sampling strategies and analyze the corresponding sample complexity. We also present first experiments to illustrate the practical performance of our methods. |
15:35 | Parallel Restarted Search SPEAKER: unknown ABSTRACT. We consider the problem of parallelizing restarted backtrack search. With few notable exceptions, most commercial and academic constraint programming solvers do not learn no-goods during search. Depending on the branching heuristics used, this means that there are little to no side-effects between restarts, making them an excellent target for parallelization. We develop a simple technique for parallelizing restarted search deterministically and demonstrate experimentally that we can achieve near-linear speed-ups in practice. |
15:35 | Optimal Neighborhood Preserving Visualization by Maximum Satisfiability SPEAKER: Kerstin Bunte ABSTRACT. We present a novel approach to low-dimensional neighbor embedding for visualization: we formulate an information retrieval based neighborhood preservation cost function as Maximum satisfiability on a discretized output display, and maximize the number of clauses preserved. The method has a rigorous interpretation as optimal visualization for neighbor retrieval. Unlike previous low-dimensional neighbor embedding methods, our satisfiability formulation is guaranteed to yield a global optimum and does so reasonably fast. Unlike previous manifold learning methods yielding global optima of their cost functions, our cost function and method are designed for low-dimensional visualization where evaluation and minimization of visualization errors are crucial. Our method performs well in experiments, yielding clean embeddings of data sets where a state-of-the-art comparison method yields poor arrangements. In a real-world case study for semi-supervised WLAN positioning in buildings we outperform state-of-the-art methods, especially when having few measurements. |
15:35 | A Local Non-negative Pursuit Method for Intrinsic Manifold Structure Preservation SPEAKER: unknown ABSTRACT. The local neighborhood selection plays a crucial role for most representation based manifold learning algorithms. This paper reveals that an improper selection of neighborhood for learning representation will introduce negative components in the learnt representations. Importantly, the representations with negative components will affect the intrinsic manifold structure preservation. In this paper, a local non-negative pursuit (LNP) method is proposed for neighborhood selection and non-negative representations are learnt. Moreover, it is proved that the learnt representations are sparse and convex. Theoretical analysis and experimental results show that the proposed method achieves or outperforms the state-of-the-art results on various manifold learning problems. |
15:35 | Deep Salience: Visual Salience Modeling via Deep Belief Propagation SPEAKER: Richard Jiang ABSTRACT. Visual salience has been considered an intriguing phenomenon observed in biological neural systems. Numerous efforts have been made on mathematical modeling of visual salience using various feature contrasts, either locally or at global range. However, these algorithmic models treat this biological phenomenon more like a mathematical problem and somehow ignore its biological instinct that visual salience arises from the deep propagation of visual stimuli along the visual cortex. In this paper, we present a Deep Salience model that emulates this bio-inspired task, where a multi-layer successive Markov random field (sMRF) is proposed to analyze the input image successively through its deep belief propagation. As its outcome, the foreground object can be automatically separated from the background in a fully unsupervised way. Experimental evaluation on benchmark datasets validates that our model can consistently outperform state-of-the-art salience models, yielding the highest recall rates, precision and F-measure scores in object detection. With this experimental validation, it is shown that the proposed bio-plausible deep belief network, as an emulation of successive visual signal propagation along the human visual cortex, can functionally work well on solving real-world computational problems. |
15:35 | Context-aware Collaborative Topic Regression with Social Matrix Factorization for Recommender Systems SPEAKER: unknown ABSTRACT. Online social networking sites have become popular platforms on which users can link with each other and share information, not only basic rating information but also information such as contexts, social relationships, and item contents. However, as far as we know, no existing works systematically combine diverse types of information to build more accurate recommender systems. In this paper, we propose a novel context-aware hierarchical Bayesian method. First, we propose the use of spectral clustering for user-item subgrouping, so that users and items in similar contexts are grouped. We then propose a novel hierarchical Bayesian model that can make predictions for each user-item subgroup; our model incorporates not only topic modeling to mine item content but also social matrix factorization to handle ratings and social relationships. Experiments on an Epinions dataset show that our method significantly improves recommendation performance compared with six categories of state-of-the-art recommendation methods in terms of both prediction accuracy and recall. We have also conducted experiments to study the extent to which ratings, contexts, social relationships, and item contents contribute to recommendation performance in terms of prediction accuracy and recall. |
15:35 | Delivering Guaranteed Display Ads under Reach and Frequency Requirements SPEAKER: unknown ABSTRACT. We propose a new idea in the allocation and serving of online advertising. We show that by using predetermined fixed-length streams of ads (which we call patterns) to serve advertising, we can incorporate a variety of interesting features into the ad allocation optimization problem. In particular, our formulation optimizes for representativeness as well as user-level diversity and pacing of ads, under reach and frequency requirements. We show how the problem can be solved efficiently using a column generation scheme in which only a small set of best patterns are kept in the optimization problem. Our numerical tests show that with parallelization of the pattern generation process, the algorithm has a promising run time and memory usage. |
15:35 | Direct Semantic Analysis for Social Image Classification SPEAKER: unknown ABSTRACT. This paper presents a direct semantic analysis method for learning the correlation matrix between visual and textual words from socially tagged images. In the literature, to improve the traditional visual bag-of-words (BOW) representation, latent semantic analysis has been studied extensively for learning a compact visual representation, where each visual word may be related to multiple latent topics. However, these latent topics do not convey any true semantic information which can be understood by human. In fact, it remains a challenging problem how to recover the relationships between visual and textual words. Motivated by the recent advances in dealing with socially tagged images, we develop a direct semantic analysis method which can explicitly learn the correlation matrix between visual and textual words for social image classification. To this end, we formulate our direct semantic analysis from a graph-based learning viewpoint. Once the correlation matrix is learnt, we can readily first obtain a semantically refined visual BOW representation and then apply it to social image classification. Experimental results on two benchmark image datasets show the promising performance of the proposed method. |
15:35 | Double Configuration Checking in Stochastic Local Search for Satisfiability SPEAKER: unknown ABSTRACT. Stochastic local search (SLS) algorithms have shown effectiveness on satisfiable instances of the Boolean satisfiability (SAT) problem. However, their performance is still unsatisfactory on random k-SAT at the phase transition, which is of significance and is one of the most empirically hardest distributions of SAT instances. In this paper, we propose a new heuristic called DCCA, which combines two configuration checking (CC) strategies with different definitions of configuration in a novel way. We use the DCCA heuristic to design an efficient SLS solver for SAT dubbed DCCASat. The experiments show that the DCCASat solver significantly outperforms a number of state-of-the-art solvers on extensive random k-SAT benchmarks at the phase transition. Moreover, further empirical analyses on structured benchmarks indicate the robustness of DCCASat. |
15:35 | GenEth: A General Ethical Dilemma Analyzer SPEAKER: unknown ABSTRACT. We contend that ethically significant behavior of autonomous systems should be guided by explicit ethical principles determined through a consensus of ethicists. As it is likely that in many particular cases of ethical dilemmas ethicists agree on the ethically relevant features and the right course of action, generalization of such cases can be used to help discover principles needed for ethical guidance of the behavior of autonomous systems. Such principles help ensure the ethical behavior of complex and dynamic systems and further serve as a basis for justification of their actions as well as a control abstraction for managing unanticipated behavior. To provide assistance in developing ethical principles, we have developed GenEth, a general ethical dilemma analyzer that, through a dialog with ethicists, codifies ethical principles in any given domain. GenEth has been used to codify principles in a number of domains pertinent to the behavior of autonomous systems and these principles have been verified using an Ethical Turing Test. |
16:30 | Mapping Users Across Networks by Manifold Alignment on Hypergraph SPEAKER: unknown ABSTRACT. Nowadays many people are members of multiple online social networks simultaneously, such as Facebook, Twitter and some other instant messaging circles. But these networks are usually isolated from each other. Mapping common users across these social networks will be beneficial for cross-network recommendation or for expanding one’s social circle. Methods based on username comparison perform well for a portion of users; however, they cannot work in the following situations: (a) users choose completely different usernames in different networks; (b) a unique username corresponds to different individuals. In this paper, we propose to utilize social structures to improve the mapping performance. Specifically, a novel subspace learning algorithm, Manifold Alignment on Hypergraph (MAH), is proposed. Different from traditional semi-supervised manifold alignment methods, we use a hypergraph to model high-order relations here. For a target user in one network, the proposed algorithm ranks all users in the other network by their probabilities of being the corresponding user. Moreover, methods based on username comparison can easily be incorporated with our algorithm to further boost the mapping accuracy. In experiments, we use both simulation data and real-world data to test the proposed method. Experimental results demonstrate the effectiveness of our proposed algorithm in mapping users across networks. |
16:30 | Worst-Case Solution Quality Analysis When Not Re-Expanding Nodes in Best-First Search SPEAKER: unknown ABSTRACT. The use of inconsistent heuristics with A* can result in increased runtime due to the need to re-expand nodes. Poor performance can also be seen with Weighted A* if nodes are re-expanded. While the negative impact of re-expansions can often be minimized by setting these algorithms to never expand nodes more than once, the result can be a lower solution quality. In this paper, we formally show that the loss in solution quality can be bounded based on the amount of inconsistency along optimal solution paths. This bound holds regardless of whether the heuristic is admissible or inadmissible, though if the heuristic is admissible the bound can be used to show that not re-expanding nodes can have at most a quadratic impact on the quality of solutions found when using A*. We then show that the bound is tight by describing a process for the construction of graphs for which a best-first search that does not re-expand nodes will find solutions whose quality is arbitrarily close to that given by the bound. Finally, we will use the bound to extend a known result regarding the solution quality of WA* when weighting a consistent heuristic, so that it applies to other types of heuristic weighting. |
16:30 | Coactive Learning for Locally Optimal Problem Solving SPEAKER: unknown ABSTRACT. Coactive learning is an online problem solving setting where the solutions provided by a solver are interactively improved by a domain expert, which in turn drives learning. In this paper we extend the study of coactive learning to problems where obtaining a global or near-optimal solution may be intractable, or where an expert can only be expected to make small, local improvements to a candidate solution. The goal of learning in this new setting is to minimize the cost as measured by the expert effort over time. We first establish theoretical bounds on the average cost of the existing coactive Perceptron algorithm. In addition, we consider new online algorithms that use cost-sensitive and Passive-Aggressive (PA) updates, showing similar or improved theoretical bounds. We provide an empirical evaluation of the learners in five domains, which shows that the Perceptron-based algorithms are quite effective and that, unlike in online classification, the PA algorithms do not yield significant performance gains. |
16:30 | Prediction of Helpful Reviews using Emotions Extraction SPEAKER: unknown ABSTRACT. Reviews play an increasingly important role in the decision process of buying products and booking hotels. However, the large amount of available information can be confusing to users. A more succinct interface, gathering only the most helpful reviews, can reduce information processing time and save effort. To create such an interface in real time, we need reliable prediction algorithms to classify and predict new reviews that have not yet been voted on but are potentially helpful. So far, helpfulness prediction algorithms have relied on structural aspects of a review, such as its length and readability score. Since emotional words are at the heart of written communication and are powerful triggers of readers' attention, we believe that they can serve as important features for predicting the helpfulness of review text. Using GALC, a general lexicon of emotional words associated with a model of 20 different emotion categories, we extract the emotionality from the review text and apply a supervised classification method to derive an emotion-based helpful-review predictor. As a second contribution, we propose an evaluation framework comparing three real-world datasets extracted from well-known product review websites. This framework shows that the emotion-based methods outperform the structure-based approach by up to 9%. |
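At its core, the emotion-based pipeline described above amounts to counting lexicon hits per emotion category and feeding those counts to a standard classifier. The Python sketch below illustrates that idea under stated assumptions: the toy lexicon, reviews, and labels are hypothetical stand-ins (GALC itself defines 20 categories), and the abstract does not specify which supervised classifier is used, so logistic regression is chosen here purely for illustration.

# Minimal sketch (not the authors' exact pipeline): count emotion-lexicon
# hits per category and feed the counts to an off-the-shelf classifier.
# The tiny lexicon below is a hypothetical stand-in for GALC.
from collections import Counter
from sklearn.linear_model import LogisticRegression

EMOTION_LEXICON = {                      # hypothetical entries
    "joy": {"love", "delighted", "happy"},
    "anger": {"awful", "furious", "terrible"},
}
CATEGORIES = sorted(EMOTION_LEXICON)

def emotion_features(review_text):
    """Count lexicon hits per emotion category for one review."""
    tokens = review_text.lower().split()
    counts = Counter()
    for category, words in EMOTION_LEXICON.items():
        counts[category] = sum(tok in words for tok in tokens)
    return [counts[c] for c in CATEGORIES]

# toy training data: 1 = voted helpful, 0 = not helpful (hypothetical)
reviews = ["I love this hotel, delighted with the staff",
           "terrible room and awful service"]
labels = [1, 0]
X = [emotion_features(r) for r in reviews]
clf = LogisticRegression().fit(X, labels)
print(clf.predict([emotion_features("happy with the room")]))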
16:30 | Supervised Scoring with Monotone Multidimensional Splines SPEAKER: Abraham Othman ABSTRACT. Scoring involves the compression of a number of quantitative attributes into a single meaningful value. We consider the problem of how to generate scores in a setting where they should be weakly monotone (either non-increasing or non-decreasing) in their dimensions. Our approach allows an expert to score an arbitrary set of points to produce meaningful, continuous, monotone scores over the entire domain, while exactly interpolating through those inputs. In contrast, existing monotone interpolating methods only work in two dimensions and typically require exhaustive grid input. Our technique significantly lowers the bar to score creation, allowing domain experts to develop mathematically coherent scores. The method is used in practice to create the LEED Performance energy and water scores that gauge building sustainability. |
16:30 | On the Axiomatic Characterization of Runoff Voting Rules SPEAKER: unknown ABSTRACT. Runoff voting rules such as single transferable vote (STV) and Baldwin's rule are of particular interest in computational social choice due to their recursive nature and hardness of manipulation, as well as in (human) practice because they are relatively easy to understand. However, they are not known for their compliance with desirable axiomatic properties, which we attempt to rectify here. We characterize runoff rules that are based on scoring rules using two axioms: a weakening of local independence of irrelevant alternatives and a variant of population-consistency. We then show, as our main technical result, that STV is the only runoff scoring rule satisfying an independence-of-clones property. Furthermore, we provide axiomatizations of Baldwin's rule and Coombs' rule. |
16:30 | Huffman Coding for Storing Non-uniformly Distributed Messages in Networks of Neural Cliques SPEAKER: unknown ABSTRACT. Associative memories are data structures that allow retrieval of previously stored messages given part of their content. They thus behave similarly to the human brain's memory, which is capable, for instance, of retrieving the end of a song given its beginning. Among the different families of associative memories, sparse ones are known to provide the best efficiency (the ratio of the number of bits stored to the number of bits used). Nevertheless, it is well known that non-uniformity of the stored messages can lead to a dramatic decrease in performance. Recently, a new family of sparse associative memories achieving almost-optimal efficiency has been proposed. Their structure induces a direct mapping between input messages and stored patterns. In this work, we show the impact of non-uniformity on the performance of this recent model, and we exploit the structure of the model to introduce several strategies that allow for efficient storage of non-uniform messages. We show that a technique based on Huffman coding is the most efficient. |
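For readers unfamiliar with the compression step, the sketch below is plain, textbook Huffman coding: frequent symbols receive shorter codewords, which counteracts the non-uniformity of the stored messages. Only the coding component is shown; the mapping of codewords onto networks of neural cliques described in the paper is not reproduced here.

# Standard Huffman coding sketch: build a prefix code from symbol
# frequencies so that frequent symbols get shorter codewords.
import heapq
from collections import Counter

def huffman_code(symbols):
    """Return a {symbol: bitstring} prefix code for the given sequence."""
    freq = Counter(symbols)
    # heap items: (frequency, unique tie-breaker, {symbol: partial code})
    heap = [(f, i, {s: ""}) for i, (s, f) in enumerate(freq.items())]
    heapq.heapify(heap)
    tie = len(heap)
    while len(heap) > 1:
        f1, _, c1 = heapq.heappop(heap)
        f2, _, c2 = heapq.heappop(heap)
        merged = {s: "0" + code for s, code in c1.items()}
        merged.update({s: "1" + code for s, code in c2.items()})
        heapq.heappush(heap, (f1 + f2, tie, merged))
        tie += 1
    return heap[0][2]

print(huffman_code("aaaabbbccd"))   # frequent symbols receive shorter codewords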
16:30 | Using Timed Game Automata to Synthesize Execution Strategies for Simple Temporal Networks with Uncertainty SPEAKER: unknown ABSTRACT. A Simple Temporal Network with Uncertainty (STNU) is a structure for representing and reasoning about temporal constraints in domains where some temporal durations are not controlled by the executor. The most important property of an STNU is whether it is dynamically controllable (DC); that is, whether there exists a strategy for executing the controllable time-points that guarantees that all constraints will be satisfied no matter how the uncontrollable durations turn out. This paper provides a novel mapping from STNUs to Timed Game Automata (TGAs) that: (1) explicates the deep theoretical relationships between STNUs and TGAs; and (2) enables the memoryless strategies generated from the TGA to be transformed into equivalent STNU execution strategies that reduce the real-time computational burden for the executor. The paper formally proves that the STNU-to-TGA encoding properly captures the execution semantics of STNUs. It also provides experimental evidence of the proposed approaches, generating offline execution strategies for dynamically controllable STNUs encoded as TGAs. |
16:30 | Semantic Data Representation for Improving Tensor Factorization SPEAKER: unknown ABSTRACT. Predicting human activities is important for improving recommendation systems and analyzing social relationships among users. Such activities are usually represented as multi-object relationships (e.g. a user's tagging activity for items or a user's tweeting activity at some location). Since multi-object relationships are naturally represented as a tensor, tensor factorization is becoming more important for predicting users' possible activities. However, the prediction accuracy of tensor factorization is weak for ambiguous and/or sparsely observed objects. Our solution, Semantic data Representation for Tensor Factorization (SRTF), tackles these problems by incorporating semantics into tensor factorization based on the following ideas: (1) it links objects to vocabularies/taxonomies and resolves the ambiguity caused by objects that can be used for multiple purposes; (2) it links objects to composite classes that merge classes from different kinds of vocabularies/taxonomies (e.g. classes in vocabularies for movie genres and those for directors) to avoid the low prediction accuracy caused by a rough-grained semantic space; (3) it lifts sparsely observed objects into their classes to solve the sparsity problem for rarely observed objects. Experiments show that SRTF achieves 10% higher accuracy than the current best methods. |
16:30 | Uncorrelated Multi-View Discrimination Dictionary Learning for Recognition SPEAKER: unknown ABSTRACT. Dictionary learning (DL) has become an important feature learning technique with state-of-the-art recognition performance. Due to the sparse characteristic of data in real-world applications, dictionary learning uses a set of learned dictionary bases to represent the linear decomposition of a data point. Fisher discrimination DL (FDDL) is a representative supervised DL method, which constructs a structured dictionary whose atoms correspond to the class labels. Recent years have witnessed a growing interest in multi-view (more than two views) feature learning techniques. Although some multi-view (or multi-modal) DL methods have been presented, there still exists much room for improvement. How to enhance the total discriminability of dictionaries and reduce their redundancy is a crucial research topic. To boost the performance of multi-view dictionary learning, we propose an uncorrelated multi-view Fisher discrimination DL (UMFDDL) approach for recognition. By making dictionary atoms correspond to the class labels, so that the obtained reconstruction error is discriminative, UMFDDL aims to jointly learn multiple dictionaries with favorable discriminative power. Furthermore, we design an uncorrelated constraint for multi-view DL, so as to reduce the redundancy among dictionaries learned from different views. Experiments on several public datasets demonstrate the effectiveness of the proposed approach. |
16:30 | Exploiting Support Sets for Answer Set Programs with External Evaluations SPEAKER: unknown ABSTRACT. Answer set programs (ASP) with external evaluations are a declarative means to capture advanced applications. However, their evaluation can be expensive due to external source accesses. In this paper we consider hex-programs that provide external atoms as a bidirectional interface to external sources and present a novel evaluation method based on support sets, which informally are portions of the input to an external atom that will determine its output for any completion of the partial input. Support sets allow one to shortcut the external source access, which can be completely eliminated. This is particularly attractive if a compact representation of suitable support sets is efficiently constructible. We discuss some applications with this property, among them description logic programs over DL-Lite ontologies, and present experimental results showing that support sets can significantly improve efficiency. |
16:30 | Binary Aggregation by Selection of the Most Representative Voter SPEAKER: unknown ABSTRACT. In binary aggregation, a group of voters each express yes/no choices regarding a number of possibly correlated issues and we are asked to decide on a collective choice that accurately reflects the views of this group. A good collective choice will minimise the distance to each of the individual choices, but using such a distance-based aggregation rule is computationally intractable. Instead, we explore a class of low-complexity aggregation rules that select the most representative voter in any given situation and return that voter's choice as the collective outcome. |
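As an illustration of the general idea (and not necessarily one of the paper's specific rules), the sketch below instantiates a representative-voter rule by returning the ballot of the voter whose total Hamming distance to all other ballots is smallest; picking an existing ballot avoids searching the exponential space of possible collective outcomes.

# Illustrative representative-voter rule (not necessarily the paper's rule):
# return the ballot of the voter with the smallest total Hamming distance
# to all other ballots.
def hamming(a, b):
    return sum(x != y for x, y in zip(a, b))

def most_representative(ballots):
    """ballots: list of yes/no tuples, one per voter, one entry per issue."""
    return min(ballots, key=lambda b: sum(hamming(b, other) for other in ballots))

profile = [(1, 0, 1), (1, 1, 1), (0, 0, 1)]
print(most_representative(profile))   # -> (1, 0, 1), total Hamming distance 2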
16:30 | Preprocessing for Propositional Model Counting SPEAKER: unknown ABSTRACT. This paper is concerned with preprocessing techniques for propositional model counting. We have implemented a preprocessor that includes many elementary preprocessing techniques, including occurrence reduction, vivification, backbone identification, as well as equivalence, AND, and XOR gate identification and replacement. We performed intensive experiments using a large number of benchmarks from many families. Two approaches to model counting were considered downstream: "direct" model counting using Cachet, and compilation-based model counting based on the C2D compiler. The experimental results we have obtained show that our preprocessor is both efficient and robust. |
16:30 | Locality Preserving Projection for Domain Adaptation with Multi-Objective Learning SPEAKER: unknown ABSTRACT. In many practical cases, we need to generalize a model trained in a source domain to a new target domain. However, the distributions of these two domains may differ significantly; in particular, some crucial target features may have no support in the source domain. This paper proposes a novel locality preserving projection method for the domain adaptation task, which finds a linear mapping preserving the intrinsic structure of both the source and the target domain. We first construct two graphs encoding the neighborhood information for the source domain and the target domain separately. We then seek linear projection coefficients that have the locality preserving property for each graph. Instead of combining the two objective functions under a compatibility assumption and requiring the user to decide the importance of each, we propose a multi-objective formulation of this problem and solve it using Pareto optimization. The Pareto frontier captures all possible good linear projection vectors that are preferred by one or more objectives. The effectiveness of our approach is justified by both theoretical analysis and empirical results on real-world data sets; as our experiments clearly demonstrate, the new feature representation yields better prediction accuracy. |
16:30 | Evaluating Trauma Patients: Addressing Missing Covariates with Joint Optimization SPEAKER: unknown ABSTRACT. Missing values are a common problem when applying classification algorithms to real-world medical data. This is especially true for trauma patients, where clinical variables frequently go uncollected due to the severity and urgency of their condition. Standard approaches to handling missingness first learn a model to estimate missing data values, and subsequently train and evaluate a classifier using data imputed with this model. Recently, several works have demonstrated the benefits of jointly estimating the imputation model and classifier parameters. However, existing approaches make assumptions that limit their utility with many real-world medical datasets, particularly that data elements are missing at random, which is often invalid. We present a novel approach to jointly learning the imputation model and classifier. Unlike existing methods, the proposed approach makes no assumptions about the missingness of the data, can be used with arbitrary probabilistic data models and classification loss functions, and can be used when both training and testing data have missing values. We investigate the approach's utility in the prediction of several patient outcomes in a large national registry of trauma patients, and find that it significantly outperforms standard sequential methods. |
16:30 | Qualitative Reasoning with Modelica Models SPEAKER: unknown ABSTRACT. Applications of qualitative reasoning to engineering design face a knowledge acquisition challenge: designers are not fluent in qualitative modeling languages and techniques. To overcome this barrier, we perform qualitative simulation using models written solely in Modelica, a popular language among designers for modeling hybrid systems. This paper makes two contributions: (1) a formalization of the relationship between the results of Modelica and qualitative simulations for the same model, along with a novel algorithm for computing the consequences of events in qualitative simulation, and (2) three classes of additional constraints that reduce the number of unrealizable trajectories when performing qualitative simulation with Modelica models. We support these contributions with examples and a case study that shows a six-fold reduction in the size of the qualitative simulation. |
16:30 | Decentralized Multi-Agent Reinforcement Learning in Average-Reward Dynamic DCOPs SPEAKER: unknown ABSTRACT. Researchers have introduced the Dynamic Distributed Constraint Optimization Problem (Dynamic DCOP) formulation to model dynamically changing multi-agent coordination problems, where a dynamic DCOP is a sequence of (static canonical) DCOPs, each partially different from the DCOP preceding it. Existing work typically assumes that the problem in each time step is decoupled from the problems in other time steps, which might not hold in some applications. Therefore, in this paper, we make the following contributions: (i) We introduce a new model, called Markovian Dynamic DCOPs (MD-DCOPs), where the DCOP in the next time step is a function of the value assignments in the current time step; (ii) We introduce two distributed reinforcement learning algorithms, the Distributed RVI Q-learning algorithm and the Distributed R-learning algorithm, that balance exploration and exploitation to solve MD-DCOPs in an online manner; and (iii) We empirically evaluate them against an existing multi-arm bandit DCOP algorithm on dynamic DCOPs. |
16:30 | Hybrid Heterogeneous Transfer Learning through Deep Learning SPEAKER: unknown ABSTRACT. Most previous heterogeneous transfer learning methods learn a cross-domain feature mapping between heterogeneous feature spaces based on a few cross-domain instance correspondences, and these corresponding instances are assumed to be representative in the source and target domains respectively. However, in many real-world scenarios, this assumption may not hold. As a result, the feature mapping may not be constructed precisely, or the mapped data from the source (or target) domain may suffer from a data or feature shift in the target (or source) domain. In this case, a classifier trained on the labeled transformed source-domain data may not be useful for the target domain. In this paper, we present a new transfer learning framework called Hybrid Heterogeneous Transfer Learning (HHTL), which allows the corresponding instances to be biased in either the source or the target domain. Moreover, we propose a deep learning approach that simultaneously learns better feature representations of each domain's data to reduce the data shift, and a better feature mapping between cross-domain heterogeneous features for accurate transfer. Extensive experiments on several multilingual sentiment classification tasks verify the effectiveness of our proposed approach compared with the baseline methods. |
16:30 | Agent Behavior Prediction and Its Generalization Analysis SPEAKER: unknown ABSTRACT. Machine learning algorithms have been applied to predict agent behaviors in real-world dynamic systems, such as advertiser behaviors in sponsored search and worker behaviors in crowdsourcing, and the prediction models have been used for the optimization of these systems. Note that the behavior data in these systems are generated by live agents: once the systems change due to the adoption of the prediction models learnt from the behavior data, agents will observe (directly or indirectly) and respond to these changes by changing their own behaviors accordingly. As a result, the behavior data will evolve and will not be identically and independently distributed, which poses great challenges to the theoretical analysis of machine learning algorithms for behavior prediction. To tackle this challenge, we propose to use a Markov Chain in Random Environments (MCRE) to describe the behavior data, and to perform generalization analysis of the machine learning algorithms on this basis. However, since the one-step transition probability matrix of an MCRE depends on both previous states and the random environment, conventional techniques for generalization analysis cannot be directly applied. To address this issue, we propose a novel technique that transforms the original MCRE into a higher-dimensional time-homogeneous Markov chain. The new Markov chain involves more variables but is more regular, and thus easier to deal with. We prove the convergence of the new Markov chain as time approaches infinity. We then prove a generalization bound for machine learning algorithms on the behavior data generated by the new Markov chain, which depends on both the Markovian parameters and the covering number of the function class compounded by the loss function for behavior prediction and the behavior prediction model. To the best of our knowledge, this is the first work that performs generalization analysis on data generated by complex processes in real-world dynamic systems. |
16:30 | Efficiently Implementing GOLOG with Answer Set Programming SPEAKER: Malcolm Ryan ABSTRACT. In this paper we investigate four different approaches to encoding domain-dependent control knowledge for Answer-Set Planning. Starting with a standard implementation of the answer-set planning language B, we add control knowledge expressed in the GOLOG logic programming language. A naive encoding, following the original definitions of Reiter et al., is shown to scale poorly. We investigate three alternative codings based on the finite-state machine semantics of ConGOLOG. These perform better, although there is no clear winner. We discuss the pros and cons of each approach. |
16:30 | Partial Multi-View Clustering SPEAKER: unknown ABSTRACT. Real data often have multiple modalities or come from multiple channels, and multi-view clustering provides a natural formulation for generating clusters from such data. Previous studies assumed that each example appears in all views, or at least that there is one view containing all examples. In real tasks, however, it is often the case that every view suffers from missing data, resulting in many partial examples, i.e., examples with some views missing. In this paper, we present possibly the first study of partial multi-view clustering. Our proposed approach, PVC, works by establishing a latent subspace where the instances corresponding to the same example in different views are close to each other, and the instances (belonging to different examples) in the same view gather smoothly. Experiments demonstrate the advantages of our proposed approach. |
16:30 | Potential-Aware Imperfect-Recall Abstraction with Earth Mover's Distance in Imperfect-Information Games SPEAKER: unknown ABSTRACT. There is often a large disparity between the size of a game we wish to solve and the size of the largest instances solvable by the best algorithms; for example, a popular variant of poker has about 10^165 nodes in its game tree, while the currently best approximate equilibrium-finding algorithms scale to games with around 10^12 nodes. To approximate equilibrium strategies in such games, the leading approach is to create a sufficiently small strategic approximation of the full game, called an abstraction, and to solve that smaller game instead. The leading abstraction algorithm for imperfect-information games generates abstractions that have imperfect recall and are distribution aware, using k-means with the earth mover's distance metric to cluster similar states together. A distribution-aware abstraction groups states together at a given round if their full distributions over future strength are similar (as opposed to, for example, just the expectation of their strength). The leading algorithm considers distributions over future strength at the final round of the game. However, one might benefit from considering the distribution over strength in all future rounds, not just the final round. An abstraction algorithm that takes all future rounds into account is called potential aware. We present the first algorithm for computing potential-aware imperfect-recall abstractions using earth mover's distance. Experiments on no-limit Texas Hold'em show that our algorithm improves performance over the previously best approach. |
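The clustering primitive in this line of work is grouping states whose histograms over future strength are close in earth mover's distance. The sketch below shows that primitive on toy, hypothetical equity histograms, using SciPy's one-dimensional Wasserstein distance and a simple nearest-medoid assignment in place of the paper's k-means variant.

# Sketch of the clustering primitive: group hand-strength histograms by
# earth mover's distance. A greedy nearest-medoid assignment stands in for
# the paper's k-means variant; the toy histograms are hypothetical.
import numpy as np
from scipy.stats import wasserstein_distance

bins = np.linspace(0.0, 1.0, 5)             # equity buckets
hists = np.array([[0.7, 0.2, 0.1, 0.0, 0.0],  # one row per game state
                  [0.6, 0.3, 0.1, 0.0, 0.0],
                  [0.0, 0.0, 0.1, 0.3, 0.6],
                  [0.0, 0.0, 0.2, 0.3, 0.5]])

def emd(h1, h2):
    """1-D earth mover's distance between two histograms over the same bins."""
    return wasserstein_distance(bins, bins, h1, h2)

def assign_to_medoids(hists, medoid_idx):
    """Assign each histogram to its nearest medoid under EMD."""
    return [min(medoid_idx, key=lambda m: emd(h, hists[m])) for h in hists]

print(assign_to_medoids(hists, medoid_idx=[0, 2]))   # -> [0, 0, 2, 2]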
16:30 | Finding the k-best Equivalence Classes of Bayesian Network Structures for Model Averaging SPEAKER: unknown ABSTRACT. In this paper we develop an algorithm to find the k-best equivalence classes of Bayesian networks. Our algorithm is capable of finding many more best DAGs than the previous algorithm that directly finds the k-best DAGs (Tian, He and Ram 2010). We demonstrate our algorithm in the task of Bayesian model averaging. Empirical results show that our algorithm significantly outperforms the k-best DAG algorithm in both time and space for achieving the same quality of approximation. Our algorithm goes beyond the maximum-a-posteriori (MAP) model by listing the most likely network structures and their relative likelihoods, and therefore has important applications in causal structure discovery. |
16:30 | Machine Translation with Real-time Web Search SPEAKER: unknown ABSTRACT. Contemporary machine translation systems usually rely on offline data retrieved from the web for individual model training, such as translation models and language models. Distinct from existing methods, we propose a novel approach that treats machine translation as a web search task and utilizes the web on the fly to acquire translation knowledge. This end-to-end approach takes advantage of fresh web search results that are capable of leveraging tremendous web knowledge to obtain phrase-level candidates on demand and then compose sentence-level translations. Experimental results show that our web-based machine translation method demonstrates very promising performance in leveraging fresh translation knowledge and making translation decisions. Furthermore, when combined with offline models, it significantly outperforms a state-of-the-art phrase-based statistical machine translation system. |
16:30 | Novel Density-based Clustering Algorithms for Uncertain Data SPEAKER: unknown ABSTRACT. Density-based techniques seem promising for handling data uncertainty in uncertain data clustering. Nevertheless, some issues have not been addressed well in existing algorithms. In this paper, we first propose a novel density-based uncertain data clustering algorithm, which improves upon existing algorithms in two respects: (1) it employs an exact method to compute the probability that the distance between two uncertain objects is less than or equal to a boundary value, instead of the sampling-based method used in previous work; (2) it introduces new definitions of core object probability and direct reachability probability, thus reducing complexity and avoiding sampling. We then further improve the algorithm by using a novel assignment strategy that ensures every object is assigned to the most appropriate cluster. Experimental results show the superiority of our proposed algorithms over existing ones. |
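For context, the sampling-based baseline that the paper replaces with an exact computation can be sketched as follows: each uncertain object is represented by samples of its possible locations, and the probability that two objects lie within a boundary value eps of each other is estimated by Monte Carlo. The object distributions below are hypothetical.

# Sketch of the sampling-based baseline (the paper replaces this with an
# exact computation): estimate P(dist(X, Y) <= eps) for two uncertain
# objects, each given by samples of its possible locations.
import numpy as np

rng = np.random.default_rng(0)
obj_x = rng.normal(loc=[0.0, 0.0], scale=0.3, size=(500, 2))   # hypothetical
obj_y = rng.normal(loc=[0.5, 0.0], scale=0.3, size=(500, 2))

def prob_within(obj_a, obj_b, eps, n_pairs=10000):
    """Monte Carlo estimate of P(||A - B|| <= eps)."""
    a = obj_a[rng.integers(len(obj_a), size=n_pairs)]
    b = obj_b[rng.integers(len(obj_b), size=n_pairs)]
    return np.mean(np.linalg.norm(a - b, axis=1) <= eps)

print(prob_within(obj_x, obj_y, eps=0.5))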
16:30 | Feature Selection at the Discrete Limit SPEAKER: unknown ABSTRACT. Feature selection plays an important role in many machine learning and data mining applications. In this paper, we propose to use the L2,p norm for feature selection, with emphasis on small p. As p approaches 0, feature selection becomes the discrete feature selection problem. We provide two algorithms, a proximal gradient algorithm and a rank-one update algorithm; the latter is more efficient under large regularization. Experiments on real-life data sets show that features selected at small p consistently outperform features selected at p = 1 (the standard L2,1 approach) and by other popular feature selection methods. |
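For reference, the standard definition of the L2,p (pseudo-)norm over the rows w^i of a projection matrix W, as commonly used for row-sparse feature selection, is written below; as p tends to 0 the penalty counts the nonzero rows, which is the discrete limit referred to in the title. (The paper's exact objective may differ in its details.)

% Standard L_{2,p} definition over the d rows w^i of W:
\|W\|_{2,p} \;=\; \Bigl( \sum_{i=1}^{d} \|w^{i}\|_{2}^{\,p} \Bigr)^{1/p}, \qquad 0 < p \le 1,
% and the discrete limit: as p -> 0, the penalty counts the nonzero rows,
\lim_{p \to 0^{+}} \sum_{i=1}^{d} \|w^{i}\|_{2}^{\,p} \;=\; \bigl|\{\, i : w^{i} \neq 0 \,\}\bigr| .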
16:30 | Dynamic Bayesian Probabilistic Matrix Factorization SPEAKER: Sotirios Chatzis ABSTRACT. Collaborative filtering algorithms generally rely on the assumption that user preference patterns remain stationary. However, real-world relational data are seldom stationary. User preference patterns may change over time, giving rise to the requirement of designing collaborative filtering systems capable of detecting and adapting to preference pattern shifts. Motivated by this observation, in this paper we propose a dynamic Bayesian probabilistic matrix factorization model, designed for modeling time-varying distributions. Formulation of our model is based on imposition of a dynamic hierarchical Dirichlet process (dHDP) prior over the space of probabilistic matrix factorization models to capture the time-evolving statistical properties of modeled sequential relational datasets. We develop a simple Markov Chain Monte Carlo sampler to perform inference. We present experimental results to demonstrate the superiority of our temporal model. |
16:30 | Proximal Iteratively Reweighted Algorithm with Multiple Splitting for Nonconvex Sparsity Optimization SPEAKER: unknown ABSTRACT. This paper proposes the Proximal Iteratively REweighted (PIRE) algorithm for solving a general problem class that covers a large body of nonconvex sparse and structured-sparse problems. Compared with previous iterative solvers for nonconvex sparse problems, PIRE is much more general and efficient; its computational cost per iteration is usually as low as that of state-of-the-art convex solvers. We further propose PIRE with Parallel Splitting (PIRE-PS) and PIRE with Alternative Updating (PIRE-AU) to handle multi-variable problems. In theory, we prove that our proposed methods converge and that any limit solution is a stationary point. Extensive experiments on both synthetic and real data sets demonstrate that our methods achieve comparable learning performance but are much more efficient than previous nonconvex solvers. |
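The name already indicates the two ingredients: an iterative reweighting of the nonconvex penalty and a proximal (soft-thresholding) step on the resulting weighted l1 surrogate. The Python sketch below shows that generic scheme for an l_p penalty with 0 < p < 1; it is the common iteratively-reweighted construction and not necessarily PIRE's exact update rules, which also involve the splitting schemes mentioned above.

# Generic iteratively-reweighted proximal sketch for a nonconvex l_p penalty
# (0 < p < 1): reweight the penalty around the current iterate, then take a
# proximal (soft-thresholding) gradient step on the weighted l1 surrogate.
import numpy as np

def soft_threshold(z, t):
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def irl1_sparse_regression(A, b, lam=0.2, p=0.5, iters=100, eps=1e-3):
    """Approximately minimize 0.5*||Ax - b||^2 + lam * sum_i |x_i|^p."""
    step = 1.0 / np.linalg.norm(A, 2) ** 2          # 1 / Lipschitz constant
    x = np.zeros(A.shape[1])
    for _ in range(iters):
        w = p * (np.abs(x) + eps) ** (p - 1)        # per-coordinate reweighting
        grad = A.T @ (A @ x - b)
        x = soft_threshold(x - step * grad, step * lam * w)
    return x

A = np.random.default_rng(2).standard_normal((50, 100))
x_true = np.zeros(100); x_true[:5] = 3.0
x_hat = irl1_sparse_regression(A, A @ x_true)
print(np.nonzero(x_hat)[0][:10])                    # indices of selected features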
16:30 | Trust Prediction with Propagation and Similarity Regularization SPEAKER: unknown ABSTRACT. Online social networks have been used for a variety of rich activities in recent years, such as investigating potential employees or seeking recommendations of high-quality services and service providers. In such activities, trust is one of the most vital factors in users' decision-making. In the literature, state-of-the-art trust prediction approaches focus either on the dispositional bias and propagated trust value of pairwise trust relationships along a path, or on the similarity of trust rating values. However, another factor, the distribution of trust ratings, also affects the trust between users. In addition, bias, propagated trust, and similarity are factors of different types, yet they have been treated uniformly, so how to best utilize these factors needs further study. In this paper we propose a new trust prediction model based on trust decomposition and matrix factorization, which considers all of the above essential factors to predict the trust between two users who are not directly connected. In this model, we first decompose trust into biased trust and bias-reduced trust. Then, based on the bias-reduced trust ratings, matrix factorization with a similarity regularization term, which takes advantage of both users' rating habits and propagated trust, is used to predict missing trust values. Finally, the missing trust is recomposed from the predicted trust values and the bias. Experiments conducted on a real-world dataset show significantly improved prediction accuracy over state-of-the-art approaches. |
16:30 | Robust Multi-View Spectral Clustering via Low-Rank and Sparse Decomposition SPEAKER: unknown ABSTRACT. Multi-view clustering, which seeks a partition of data across multiple views that often provide complementary information to each other, has received considerable attention in recent years. In real-life clustering problems, the data in each view may contain considerable noise, yet existing clustering methods blindly combine the information from the multiple views, which often degrades their performance. In this paper, we propose a novel Markov chain method for Robust Multi-view Spectral Clustering (RMSC). Our method has a flavor of low-rank and sparse decomposition: we first construct a transition probability matrix from each single view, and then use these matrices to recover a shared low-rank transition probability matrix as the crucial input to the standard Markov chain method for clustering. The optimization problem of RMSC has a low-rank constraint on the transition probability matrix and, simultaneously, a probabilistic simplex constraint on each of its rows. To solve this challenging optimization problem, we propose an optimization procedure based on the Augmented Lagrangian Multiplier scheme. Experimental results on various real-world datasets show that the proposed method outperforms several state-of-the-art methods for multi-view clustering. |
16:30 | Similarity-Preserving Binary Signature for Linear Subspaces SPEAKER: unknown ABSTRACT. Linear subspace is an important representation for many kinds of real-world data in computer vision and pattern recognition, e.g. faces, motion videos, speech. In this paper, first we define pairwise angular similarity and angular distance for linear subspaces. The angular distance satisfies non-negativity, identity of indiscernibles, symmetry and triangle inequality, and thus it is a metric. Then we propose a method to compress linear subspaces into compact similarity-preserving binary signatures, between which the normalized Hamming distance is an unbiased estimator of the angular distance. We provide a lower bound on the length of binary signatures which suffices to guarantee a uniform distance-preservation within a set of subspaces. Experiments on face recognition demonstrate the effectiveness of this binary signature in terms of recognition accuracy, speed and storage requirement. The results show that, compared with the exact method, the approximation with binary signatures achieves an order of magnitude speed-up, while requiring significantly smaller amount of storage space, yet it still accurately preserves the similarity, and achieves high recognition accuracy comparable to the exact method in face recognition. |
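The estimator property mentioned in the abstract generalizes a classical fact about sign random projections for vectors: the probability that a random hyperplane separates x and y equals angle(x, y)/pi, so the normalized Hamming distance between their sign signatures is an unbiased estimator of the normalized angle. The sketch below demonstrates this vector case; the paper's contribution is the extension to linear subspaces, which is not reproduced here.

# Background sketch (vector case, not the paper's subspace construction):
# with sign random projections, the normalized Hamming distance between
# binary signatures is an unbiased estimator of angle(x, y) / pi.
import numpy as np

rng = np.random.default_rng(1)

def sign_signature(x, R):
    return (R @ x > 0).astype(np.uint8)      # one bit per random hyperplane

d, n_bits = 64, 4096
R = rng.standard_normal((n_bits, d))
x, y = rng.standard_normal(d), rng.standard_normal(d)

hamming = np.mean(sign_signature(x, R) != sign_signature(y, R))
angle = np.arccos(np.clip(x @ y / (np.linalg.norm(x) * np.linalg.norm(y)), -1, 1))
print(hamming, angle / np.pi)                # the two values should be close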
16:30 | Leveraging Fee-Based, Imperfect Advisors in Human-Agent Games of Trust SPEAKER: unknown ABSTRACT. This paper explores whether the addition of costly, imperfect, and exploitable advisors to Berg's investment game enhances or detracts from investor performance in both one-shot and multi-round interactions. We then leverage our findings to develop an automated investor agent that performs as well as or better than humans in these games. To gather this data, we extended Berg's game and conducted a series of experiments using Amazon's Mechanical Turk to determine how humans behave in these potentially adversarial conditions. Our results indicate that, in games of short duration, advisors do not stimulate positive behavior and are not useful in providing actionable advice. In long-term interactions, however, advisors do stimulate positive behavior with significantly increased investments and returns. By modeling human behavior across several hundred participants, we were then able to develop agent strategies that maximized return on investment and performed as well as or significantly better than humans. In one-shot games, we identified an ideal investment value that, on average, resulted in positive returns as long as advisor exploitation was not allowed. For the multi-round games, our agents relied on the corrective presence of advisors to stimulate positive returns on maximum investment. |