PROGRAM FOR TUESDAY, JULY 29TH
08:30 | AAAI Welcome and Award Presentations SPEAKER: Carla Brodley |
08:40 | IAAI Welcome and Awards SPEAKER: David Stracuzzi |
08:47 | IJCAI-JAIR Best Paper Award Presentation SPEAKER: Craig Boutilier |
08:50 | AAAI Special Awards and Honors SPEAKER: Henry Kautz |
09:00 | From Virtual Museums to Peacebuilding: Creating and Using Linked Knowledge SPEAKER: Craig Knoblock ABSTRACT. Companies such as Google and Microsoft are building web-scale linked knowledge bases for the purpose of indexing and searching the web, but these efforts do not address the problem of building accurate, fine-grained, deep knowledge bases for specific application domains. We are developing an integration framework, called Karma, which supports the rapid, end-to-end construction of such linked knowledge bases. In this talk I will describe machine-learning techniques for mapping new data sources to a domain model and linking the data across sources. I will also present several applications of this technology, including building virtual museums and integrating data sources for peacebuilding. |
10:20 | Parametrized Families of Hard Planning Problems from Phase Transitions SPEAKER: unknown ABSTRACT. There are two complementary ways to evaluate planning algorithms: performance on benchmark problems derived from real applications and analysis of performance on parametrized families of problems with known properties. Prior to this work, few means of generating parametrized families of hard planning problems were known. We generate hard planning problems from the solvable/unsolvable phase transition region of well-studied NP-complete problems that map naturally to navigation and scheduling, aspects common to many planning domains. Our results confirm exponential scaling of hardness with problem size, even at very small problem sizes. We observe significant differences between state-of-the-art planners on these problem families, enabling us to gain insight into the relative strengths and weaknesses of these planners. These families provide complementary test sets exhibiting properties not found in existing benchmarks. |
10:35 | Backdoors to Planning SPEAKER: Martin Kronegger ABSTRACT. Backdoors measure the distance to tractable fragments and have become an important tool to find fixed-parameter tractable (fpt) algorithms. Despite their success, backdoors have not been used for planning, a central problem in AI that has a high computational complexity. In this work, we introduce two notions of backdoors building upon the causal graph. We analyze the complexity of finding a small backdoor (detection) and using the backdoor to solve the problem (evaluation) in the light of planning with (un)bounded domain/plan length. For each setting we present either an fpt-result or rule out the existence thereof by showing parameterized intractability. In three cases we achieve the most desirable outcome: detection and evaluation are fpt. |
10:50 | Scheduling for Transfers in Pickup and Delivery Problems with Very Large Neighborhood Search SPEAKER: unknown ABSTRACT. In pickup and delivery problems (PDPs), vehicles pick up and deliver a set of items under various constraints. We extend the well-studied PDP by allowing vehicles to transfer items to and from one another. By scheduling transfers, the fleet of vehicles can deliver the items faster and at lower cost. We introduce the Very Large Neighborhood Search with Transfers (VLNS-T) algorithm to form schedules for PDPs with transfers. We show that the VLNS-T algorithm makes use of transfers to improve upon the best known solutions for selected benchmark problems, and demonstrate its effectiveness on real-world taxi data in New York City. |
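The destroy-and-repair structure of large neighborhood search is concrete enough to sketch. Below is a minimal generic LNS skeleton of the kind VLNS-T builds on; the `destroy`, `repair`, and `cost` callbacks are hypothetical placeholders standing in for the paper's PDP-specific moves (including transfer insertion), not its actual implementation:

```python
def large_neighborhood_search(schedule, cost, destroy, repair, iters=1000):
    """Generic (V)LNS skeleton: repeatedly unassign part of the schedule and
    reinsert it, keeping the candidate only if it improves the cost.
    `destroy`/`repair` are hypothetical problem-specific callbacks; in a
    PDP-with-transfers setting, `repair` could reinsert items via transfer
    points between vehicles."""
    best, best_cost = schedule, cost(schedule)
    for _ in range(iters):
        partial, removed = destroy(best)       # unassign some items
        candidate = repair(partial, removed)   # reinsert (possibly with transfers)
        c = cost(candidate)
        if c < best_cost:
            best, best_cost = candidate, c
    return best
```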
11:05 | A Scheduler for Actions with Iterated Durations SPEAKER: unknown ABSTRACT. A wide range of robotic missions contain actions that exhibit looping behavior. Examples of these actions include picking fruit in agriculture, pick-and-place tasks in manufacturing, or even search patterns in robotic search or survey missions. These looping actions often have a range of acceptable values for the number of loops and a preference function over them. For example, during robotic survey missions, the information gain is expected to increase with the number of loops in a search pattern. Since these looping actions also take time, which is typically bounded, there is a challenge of maximizing utility while respecting time constraints. In this paper, we introduce the Looping Temporal Problem with Preference (LTPP) as a formalism for encoding scheduling problems that contain looping actions. In addition, we introduce a scheduling algorithm for LTPPs, which leverages the structure of the problem to find the optimal solution efficiently. |
11:20 | Best Paper Nominee: Generalized Label Reduction for Merge-and-Shrink Heuristics SPEAKER: unknown ABSTRACT. Label reduction is a technique for simplifying families of labeled transition systems by dropping distinctions between certain transition labels. While label reduction is critical to the efficient computation of merge-and-shrink heuristics, current theory only permits reducing labels in a limited number of cases. We generalize this theory so that labels can be reduced in every intermediate abstraction of a merge-and-shrink tree. This is particularly important for efficiently computing merge-and-shrink abstractions based on non-linear merge strategies. As a case study, we implement a non-linear merge strategy based on the original work on merge-and-shrink heuristics in model checking by Dräger et al. |
10:20 | Evaluation and Deployment of a People-to-People Recommender in Online Dating SPEAKER: Alfred Krzywicki ABSTRACT. This paper reports on the successful deployment of a people-to-people recommender system in a large commercial online dating site. The deployment was the result of thorough evaluation and an online trial of a number of methods, including profile-based, collaborative filtering and hybrid algorithms. Results taken a few months after deployment show that key metrics generally hold their value or show an increase compared to the trial results, and that the recommender system delivered its projected benefits. |
10:50 | The Quest Draft: An Automated Course Allocation Algorithm SPEAKER: Richard Hoshino ABSTRACT. Course allocation is one of the most complex issues facing any university, due to the sensitive nature of deciding which subset of students should be granted seats in highly-popular (market-scarce) courses. In recent years, researchers have proposed numerous solutions, using techniques in integer programming, combinatorial auction design, and matching theory. In this paper, we present a four-part AI-based course allocation algorithm that was conceived by an undergraduate student, and recently implemented at a small Canadian liberal arts university. This new allocation process, which builds upon the Harvard Business School Draft, has received overwhelming support from students and faculty for its transparency, impartiality, and effectiveness. |
11:20 | THink: Inferring Cognitive Status from Subtle Behaviors SPEAKER: Randall Davis ABSTRACT. The Digital Clock Drawing Test is a fielded application that provides a major advance over existing neuropsychological testing technology. It captures and analyzes high precision information about both outcome and process, opening up the possibility of detecting subtle cognitive impairment even when test results appear superficially normal. We describe the design and development of the test, document the role of AI in its capabilities, and report on its use over the past seven years. We outline its potential implications for earlier detection and treatment of neurological disorders. We also set the work in the larger context of the THink project, which is exploring multiple approaches to determining cognitive status through the detection and analysis of subtle behaviors. |
11:35 | Challenges in Planning SPEAKER: Rao Kambhampati |
11:35 | What's Hot in Autonomous Agents SPEAKER: Noa Agmon |
11:35 | Syskill & Webert: Identifying Interesting Web Sites SPEAKER: Michael J. Pazzani |
11:35 | What's Hot in Cognitive Science SPEAKER: Mathias Scheutz |
13:00 | Behavioral Network Science SPEAKER: Michael Kearns ABSTRACT. For a number of years, we have been conducting human subject experiments on collective and individual behavior and performance in social networks. These experiments have investigated diverse competitive, cooperative and computational tasks that include graph coloring, voting, trading and viral marketing under a wide variety of network structures. In this talk I will survey these experiments and their findings, emphasizing the questions they raise for multi-agent systems, machine learning, and other disciplines. |
14:00 | Can Agent Development Affect Developer's Strategy? SPEAKER: unknown ABSTRACT. Peer Designed Agents (PDAs), computer agents developed by non-experts, are an emerging technology, widely advocated in recent literature for the purpose of replacing people in simulations and investigating human behavior. The technology's main premise is that strategies programmed into these agents reliably reflect, to some extent, the behavior used by the programmer in real life. In this paper we show that PDA development has an important side effect that has not been addressed to date: the process that merely attempts to capture one's strategy is also likely to affect the developer's strategy. The phenomenon is demonstrated experimentally using several performance measures. This result has many implications concerning the appropriate design of PDA-based simulations and the validity of using PDAs for studying individual decision making. Furthermore, we find that PDA development actually improved the developers' strategies according to all performance measures. PDA development can therefore be suggested as a means for improving people's problem-solving skills. |
14:30 | Active Learning with Model Selection SPEAKER: unknown ABSTRACT. Most work on active learning avoids the issue of model selection by training models of only one type (SVMs, boosted trees, etc.) using one pre-defined set of model hyperparameters. We propose an algorithm that actively samples data to simultaneously train a set of candidate models (different model types and/or different hyperparameters) and also to select the best model from this set of candidates. The algorithm actively samples points for training that are most likely to improve the accuracy of the more promising candidate models, and also samples points to use for model selection---all samples count against the same fixed labeling budget. This exposes a natural trade-off between the focused active sampling that is most effective for training models, and the unbiased uniform sampling that is better for model selection. We empirically demonstrate on six test problems that this algorithm is nearly as effective as an active learning oracle that knows the optimal model in advance. |
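The trade-off the abstract describes, focused sampling to train promising models versus unbiased sampling for model selection, can be sketched as a single query-selection routine. This is an illustrative sketch under assumed scikit-learn-style interfaces (`predict`), not the authors' algorithm:

```python
import numpy as np

def choose_query(pool_X, candidates, rng, p_select=0.2):
    """One labeling decision against a shared budget: with probability
    p_select, draw a uniform point whose label goes to an unbiased
    model-selection set; otherwise query where the candidate models
    disagree most, which mainly helps train the more promising models."""
    if rng.random() < p_select:
        return int(rng.integers(len(pool_X))), "selection"
    preds = np.stack([m.predict(pool_X) for m in candidates])
    disagreement = preds.std(axis=0)        # high where candidates conflict
    return int(np.argmax(disagreement)), "training"
```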
15:00 | Sketch Recognition with Natural Correction and Editing SPEAKER: unknown ABSTRACT. In this paper, we address the problem of sketch recognition. We systematically study how to incorporate users' natural correction and editing into isolated and full sketch recognition. This is a natural and necessary interaction in real systems such as Visio, where extremely similar shapes exist. First, a novel algorithm is proposed to mine the prior shape knowledge for three editing modes. Second, to differentiate visually similar shapes, a novel symbol recognition algorithm is introduced by leveraging the learnt shape knowledge. Then, a novel correction/editing detection algorithm is proposed to facilitate symbol recognition. Furthermore, both the symbol recognizer and the correction/editing detector are systematically incorporated into full sketch recognition. Finally, based on the proposed algorithms, a real-time sketch recognition system is built to recognize hand-drawn flowcharts/diagrams with flexible interactions. Extensive experiments on benchmark datasets show the effectiveness of the proposed algorithms. |
15:30 | Designing Fast Absorbing Markov Chains SPEAKER: unknown ABSTRACT. Markov Chains are a fundamental tool for the analysis of real world phenomena and randomized algorithms. Given a graph with some specified sink nodes and an initial probability distribution, we consider the problem of designing an absorbing Markov Chain that minimizes the time required to reach a sink node, by selecting transition probabilities subject to some natural regularity constraints. By exploiting the Markovian structure, we obtain closed form expressions for the objective function as well as its gradient, which can thus be evaluated efficiently without any simulation of the underlying process and fed to a gradient-based optimization package. For the special case of designing reversible Markov Chains, we show that the global optimum can be efficiently computed by exploiting convexity. We demonstrate how our method can be used to evaluate and design local search methods tailored for certain domains. |
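The closed-form objective the abstract mentions follows from standard absorbing-chain theory: with Q the transient-to-transient block of the transition matrix, the fundamental matrix N = (I - Q)^(-1) gives expected absorption times. A numpy sketch on a toy chain (the matrix below is illustrative, not from the paper):

```python
import numpy as np

# Illustrative 4-state chain: states 0-2 transient, state 3 absorbing (sink).
P = np.array([[0.5, 0.3, 0.1, 0.1],
              [0.2, 0.5, 0.2, 0.1],
              [0.1, 0.2, 0.5, 0.2],
              [0.0, 0.0, 0.0, 1.0]])

Q = P[:3, :3]                       # transitions among transient states
N = np.linalg.inv(np.eye(3) - Q)    # fundamental matrix N = (I - Q)^-1
t = N @ np.ones(3)                  # expected steps to absorption per start state

mu = np.array([0.6, 0.3, 0.1])      # initial distribution over transient states
print(mu @ t)                       # objective: expected time to reach the sink
```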
16:00 | On Hair Recognition in the Wild by Machine SPEAKER: unknown ABSTRACT. We present an algorithm for identity inference using only the information from the hair. Face recognition in the wild (i.e., unconstrained settings) is highly useful in a variety of applications, but performance suffers due to many factors, e.g., obscured face, lighting variation, extreme pose angle, and expression. It is well known that humans use hair information to guide identity decisions under many of these scenarios, due either to the consistent hair appearance of the same subject or to the obvious hair discrepancy between different subjects, but little work exists to replicate this intelligence artificially. We propose a learned hair matcher using shape, color, and texture features derived from localized patches through an AdaBoost technique with abstaining weak classifiers when features are not present in the given location. The proposed hair matcher achieves 71.53% accuracy on the LFW View 2 dataset. Hair also reduces the error of a COTS face matcher by 5.7% through simple score-level fusion. |
16:30 | How Do Your Friends on Social Media Disclose Your Emotions? SPEAKER: unknown ABSTRACT. Mining emotions hidden in images has attracted significant interest, in particular with the rapid development of social networks. The emotional impact is very important for understanding the intrinsic meanings of images. Although many studies have been done, most existing methods focus on image content but ignore the emotions of the user who has published the image. To understand the emotional impact of images, one interesting question is: How does social effect correlate with the emotion expressed in an image? Specifically, can we leverage friends' interactions (e.g., discussions) related to an image to help discover the emotions? In this paper, we formally define the problem and propose a novel emotion learning method that jointly models images posted by social users and comments added by friends. One advantage of the model is that it can distinguish comments that are closely related to the emotion expression of an image from irrelevant ones. Experiments on an open Flickr dataset show that the proposed model can significantly improve (+37.4% by F1) the accuracy of inferring emotions from images. More interestingly, we found that half of the improvement is due to interactions with the 1% of closest friends. |
17:00 | Automatic Construction and Natural-Language Description of Nonparametric Regression Models SPEAKER: unknown ABSTRACT. This paper presents the beginnings of an automatic statistician, focusing on regression problems. Our system explores an open-ended space of possible statistical models to discover a good explanation of the data, and then produces a detailed report with figures and natural-language text. Our approach treats unknown functions nonparametrically using Gaussian processes, which has two important consequences. First, Gaussian processes model functions in terms of high-level properties (e.g. smoothness, trends, periodicity, changepoints). Taken together with the compositional structure of our language of models, this allows us to automatically describe functions through a decomposition into additive parts. Second, the use of flexible nonparametric models and a rich language for composing them in an open-ended manner also results in state-of-the-art extrapolation performance evaluated over 13 real time series data sets from various domains. |
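The key idea, composing base kernels so each additive part maps to one clause of the natural-language description, can be illustrated with scikit-learn's GP kernels. A minimal sketch on synthetic data; the kernel below is one point in the open-ended space such a system searches, not the paper's actual grammar:

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, ExpSineSquared, WhiteKernel

# Toy series: trend + seasonality, mirroring the structure the system looks for.
X = np.linspace(0, 10, 200)[:, None]
y = (0.5 * X.ravel() + np.sin(2 * np.pi * X.ravel())
     + 0.1 * np.random.default_rng(0).normal(size=200))

# A compositional kernel: smooth trend (RBF) + periodic component + noise.
kernel = (RBF(length_scale=5.0)
          + ExpSineSquared(length_scale=1.0, periodicity=1.0)
          + WhiteKernel(noise_level=0.01))
gp = GaussianProcessRegressor(kernel=kernel, normalize_y=True).fit(X, y)

# Each additive term would map to one clause of the report, e.g.
# "a smooth trend plus a periodic component with period ~1".
print(gp.kernel_)   # hyperparameters fitted by marginal-likelihood maximization
```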
17:30 | Confident Reasoning on Raven’s Progressive Matrices Tests SPEAKER: unknown ABSTRACT. We report a novel approach to addressing the Raven’s Progressive Matrices (RPM) tests, one based upon purely visual representations. Our technique introduces the calculation of confidence in an answer and the automatic adjustment of level of resolution if that confidence is insufficient. We first describe the nature of the visual analogies found on the RPM. We then exhibit our algorithm and work through a detailed example. Finally, we present the performance of our algorithm on the four major variants of the RPM tests, illustrating the impact of confidence. This is the first such account of any computational model against the entirety of the Raven’s. |
18:00 | Learning Deep Representations for Graph Clustering SPEAKER: unknown ABSTRACT. Recently deep learning has been successfully adopted in many applications such as speech recognition, image classification, and natural language processing. In this work, we explore the possibility of employing deep learning in graph clustering. In particular, we propose a simple method that first learns a nonlinear embedding of the original graph with a stacked autoencoder, and then runs the k-means algorithm on the embedding to obtain the clustering result. We show that this simple method has a solid theoretical foundation, due to the equivalence between autoencoders and spectral clustering in terms of what they actually optimize. Then, we demonstrate that the proposed method is more efficient and flexible than spectral clustering. First, the computational complexity of the autoencoder is much lower than that of spectral clustering: the former can be linear in the number of nodes of a sparse graph, while the latter is super-quadratic due to its dependency on an eigenvalue decomposition. Second, when additional constraints such as sparsity are imposed, we can simply employ the sparse autoencoder developed in the deep learning literature, whereas it is not straightforward to implement a sparse spectral method. We have conducted comprehensive experiments to test the performance of the proposed method. The results on various graph datasets show that it can significantly outperform the conventional spectral clustering method. This clearly indicates the effectiveness of deep learning in graph clustering, and enriches our understanding of the power of deep learning in general. |
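A toy version of the pipeline is easy to write down: learn a nonlinear embedding of the adjacency matrix with an autoencoder, then cluster the embedding with k-means. The sketch below uses a single-layer autoencoder trained by plain gradient descent in place of the paper's stacked one:

```python
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)

# Toy graph: two 10-node cliques joined by a single edge (adjacency as input).
A = np.zeros((20, 20))
A[:10, :10] = 1.0
A[10:, 10:] = 1.0
A[9, 10] = A[10, 9] = 1.0
np.fill_diagonal(A, 0.0)

# Single-layer autoencoder (the paper stacks several): minimize
# ||sigmoid(A W1) W2 - A||^2 by plain gradient descent.
n, d = A.shape[0], 4
W1 = rng.normal(scale=0.1, size=(n, d))
W2 = rng.normal(scale=0.1, size=(d, n))
sigmoid = lambda x: 1.0 / (1.0 + np.exp(-x))

for _ in range(2000):
    H = sigmoid(A @ W1)                 # nonlinear embedding of each node
    R = H @ W2 - A                      # reconstruction error
    G_H = (R @ W2.T) * H * (1 - H)      # backprop through the sigmoid
    W2 -= 1e-2 * (H.T @ R)
    W1 -= 1e-2 * (A.T @ G_H)

labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(sigmoid(A @ W1))
print(labels)                           # the two cliques should get distinct labels
```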
18:30 | Capturing Difficulty Expressions in Student Online Q&A Discussions SPEAKER: unknown ABSTRACT. We introduce a new application of online dialogue analysis: supporting pedagogical assessment of online Q&A discussions. Extending the existing speech act framework, we capture common emotional expressions that often appear in student discussions, such as frustration and degree of certainty, and present a viable approach for their classification. We demonstrate how such dialogue information can be used in analyzing student discussions and identifying difficulties. In particular, the difficulty indicators are aligned with discussion patterns and student performance. We found that frustration occurs more frequently in longer discussions. Students who frequently express frustration tend to get lower grades than others. On the other hand, the frequency of high-certainty expressions is positively correlated with performance. We believe emotional and informational dialogue roles are important factors in explaining discussion development and student performance. We expect that such online dialogue analysis can become a powerful assessment tool for instructors and education researchers. |
19:00 | Quality-based Learning for Web Data Classification SPEAKER: Ou Wu ABSTRACT. The types of web data vary in terms of information quantity and quality. For example, some pages contain a large amount of text, whereas others contain very little; some web videos are in high resolution, whereas others are in low resolution. As a consequence, the quality of features extracted from different web data may also vary greatly. Existing learning algorithms for web data classification usually ignore these variations in information quality and quantity. In this paper, the information quantity and quality of web data are described by quality-related factors such as text length and image quantity, and a new learning method is proposed to train classifiers based on these factors. The method divides training data into subsets according to the clustering results of the quality-related factors and then trains classifiers using a multi-task learning strategy for each subset. Experimental results indicate that the quality-related factors are useful in web data classification, and that the proposed method outperforms conventional algorithms that do not consider information quantity and quality. |
19:30 | A Region-Based Model for Estimating Urban Air Pollution SPEAKER: unknown ABSTRACT. Air pollution has a direct impact on human health, and data-driven air quality models are useful for evaluating population exposure to air pollutants. In this paper, we propose a novel region-based Gaussian Process model for estimating urban air pollution dispersion, and apply it to a large dataset of ultrafine particle measurements collected from a network of trams monitoring ultrafine particle dispersion in the city of Zurich. We show that, compared to existing grid-based models, the region-based model produces better predictions across all aggregate time scales. The new model is appropriate for many useful applications such as anomaly detection, exposure assessment, and sensor optimization. |
14:00 | Deployed: Engineering Works Scheduling for Hong Kong’s Rail Network SPEAKER: Andy Hon Wai Chun ABSTRACT. This paper describes how AI is used to plan, schedule, and optimize nightly engineering works for both the commuter and rapid transit lines in Hong Kong. The MTR Corporation Limited operates and manages all the rail lines in Hong Kong. Its “Engineering Works and Traffic Information Management System” (ETMS) is a mission critical system that manages all information related to engineering works and their related track possessions and engineering train movements. The AI Engine described in this paper is a component of this ETMS. In Hong Kong, the maintenance, inspection, repair, or installation works along the rail lines are done during the very short non-traffic hours (NTH) of roughly 4 to 5 hours each night. These engineering works can be along the running tracks, track-side, tunnel, freight yards, sub-depots, depot maintenance tracks, etc. The proper scheduling of necessary engineering works is crucial to maintaining a reliable and safe train service during normal hours. The AI Engine optimizes resource allocation to maximize the number of engineering works that can be performed, while ensuring all safety, environment, and operational rules and constraints are met. The work described is part of a project to redesign and replace the existing ETMS, deployed in 2004, with an updated technology platform and modern IT architecture, to provide a more robust and scalable system that potentially can be deployed to other cities around the world. |
14:30 | Emerging: A Schedule Optimization Tool for Destructive and Non-Destructive Vehicle Tests SPEAKER: Jeremy Ludwig ABSTRACT. Whenever an auto manufacturer refreshes an existing car or truck model or builds a new one, the model will undergo hundreds if not thousands of tests before the factory line and tooling are finished and vehicle production begins. These tests are generally carried out on expensive, custom-made vehicles because the new factory lines for the model do not exist yet. The work presented in this paper describes how an existing intelligent scheduling software framework was modified to include domain-specific heuristics used in the vehicle test planning process. The result of this work is a prototype scheduling tool that optimizes the overall test schedule in order to complete the work in a given time window while minimizing the total number of vehicles required. Initial results show a reduction in required test vehicles compared to manual scheduling of the same tasks, as well as an increased capability to ask "what-if" questions to further improve the schedule. |
14:50 | Emerging: STREETS: Game-Theoretic Traffic Patrolling with Exploration and Exploitation SPEAKER: Matthew Brown ABSTRACT. To dissuade reckless driving and mitigate accidents, cities deploy resources to patrol roads. In this paper, we present STREETS, an application developed for the city of Singapore, which models the problem of computing randomized traffic patrol strategies as a defender-attacker Stackelberg game. Previous work on Stackelberg security games has focused extensively on counter-terrorism settings. STREETS moves beyond counter-terrorism and represents the first use of Stackelberg games for traffic patrolling, in the process providing a novel algorithm for solving such games that addresses three major challenges in modeling and scale-up. First, there exists a high degree of unpredictability in travel times through road networks, which we capture using a Markov Decision Process for planning the patrols of the defender (the police) in the game. Second, modeling all possible police patrols and their interactions with a large number of adversaries (drivers) introduces a significant scalability challenge. To address this challenge we apply a compact game representation in a novel fashion combined with adversary and state sampling. Third, patrol strategies must balance exploitation (minimizing violations) with exploration (maximizing omnipresence), a tradeoff we model by solving a bi-objective optimization problem. We present experimental results using real-world traffic data from Singapore. This work is done in collaboration with the Singapore Ministry of Home Affairs and is currently being evaluated by the Singapore Police Force. |
15:10 | Emerging: AI-MIX: Using Automated Planning to Steer Human Workers Towards Better Crowdsourced Plans SPEAKER: Lydia Manikonda ABSTRACT. One subclass of human computation applications is those directed at tasks that involve planning (e.g. tour planning) and scheduling (e.g. conference scheduling). Interestingly, work on these systems shows that even primitive forms of automated oversight of the human contributors help significantly improve the effectiveness of the humans/crowd. In this paper, we argue that the automated oversight used in these systems can be viewed as a primitive automated planner, and that there are several opportunities for more sophisticated automated planning in effectively steering the crowd. Straightforward adaptation of current planning technology is, however, hampered by the mismatch between the capabilities of human workers and automated planners. We identify and partially address two important challenges that need to be overcome before such adaptation of planning technology can occur: (1) interpreting inputs of the human workers (and the requester), and (2) steering or critiquing plans produced by the human workers, armed only with incomplete domain and preference models. To these ends, we describe the implementation of AI-MIX, a tour plan generation system that uses automated checks and alerts to improve the quality of plans created by human workers, and present a preliminary evaluation of the effectiveness of steering provided by automated planning. |
14:35 | Semantic Graph Construction for Weakly-Supervised Image Parsing SPEAKER: unknown ABSTRACT. We investigate weakly-supervised image parsing, i.e., assigning class labels to image regions using image-level labels only. Existing studies focus mainly on the formulation of the weakly-supervised learning problem, i.e., how to propagate class labels from images to regions given an affinity graph of regions. Notably, however, the affinity graph of regions, which is generally constructed in relatively simple settings in existing methods, is of crucial importance to parsing performance, because the weakly-supervised parsing problem cannot be solved within a single image and the affinity graph enables label propagation among multiple images. In order to embed more semantics into the affinity graph, we propose novel criteria that carefully exploit the weak supervision information, and develop two graphs: an L1 semantic graph and a k-NN semantic graph. Experimental results demonstrate that the proposed semantic graphs not only capture more semantic relevance, but also perform significantly better than conventional graphs in image parsing. |
14:50 | Efficient Object Detection via Adaptive Online Selection of Sensor-Array Elements SPEAKER: Matthai Philipose ABSTRACT. We examine how to use emerging far-infrared imager ensembles to detect certain objects of interest (e.g., faces, hands, people and animals) in synchronized RGB video streams at very low power. We formulate the problem as one of selecting subsets of sensing elements (among many thousand possibilities) from the ensembles for tests. The subset selection problem is naturally adaptive and online: testing certain elements early can obviate the need for testing many others later, and selection policies must be updated at inference time. We pose the ensemble sensor selection problem as a structured extension of test-cost-sensitive classification, propose a principled suite of techniques to exploit ensemble structure to speed up processing and show how to re-estimate policies fast. We estimate reductions in power consumption of roughly 50x relative to even highly optimized implementations of face detection, a canonical object-detection problem. We also illustrate the benefits of adaptivity and online estimation. |
15:05 | Grounding Acoustic Echoes In Single View Geometry Estimation SPEAKER: unknown ABSTRACT. Extracting 3D geometry plays an important part in scene understanding. Recently, robust visual descriptors based on structured prediction have been proposed for extracting the indoor scene layout from a passive agent's perspective, i.e., a single image. This robustness is mainly due to modeling the physical interaction of the underlying room geometry with the objects and humans present in the room. In this work we add physical constraints derived from acoustic echoes, generated by an audio source, to this visual model. Our audio-visual 3D geometry descriptor improves over state-of-the-art passive perception models, as shown by experiments. |
15:20 | Sub-Selective Quantization for Large-Scale Image Search SPEAKER: unknown ABSTRACT. Recently, with the explosive growth of visual content on the Internet, large-scale image search has attracted intensive attention. It has been shown that mapping high-dimensional image descriptors to compact binary codes can lead to considerable efficiency gains in both storage and similarity computation of images. However, most existing methods still suffer from expensive training devoted to large-scale binary code learning. To address this issue, we propose a sub-selection based matrix manipulation algorithm which can significantly reduce the computational cost of code learning. As case studies, we apply the sub-selection algorithm to two popular quantization techniques: PCA Quantization (PCAQ) and Iterative Quantization (ITQ). Crucially, we can justify the resulting sub-selective quantization by proving its theoretic properties. Extensive experiments are carried out on three image benchmarks with up to one million samples, corroborating the efficacy of the sub-selective quantization method in terms of image retrieval. |
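For reference, PCAQ itself takes only a few lines: project the centered descriptors onto the top principal directions and take the signs. The sub-selection idea would replace the full-data matrix products with row-sampled ones; the sketch below shows only the baseline quantizer, on random stand-in descriptors:

```python
import numpy as np

# PCA Quantization (PCAQ): sign of the projection onto top-c principal
# directions.  Sub-selection would compute the directions from a sampled
# subset of rows instead of all of Xc (sketch, not the paper's code).
rng = np.random.default_rng(0)
X = rng.normal(size=(10000, 128))               # stand-in image descriptors
Xc = X - X.mean(axis=0)
_, _, Vt = np.linalg.svd(Xc, full_matrices=False)
codes = (Xc @ Vt[:32].T > 0).astype(np.uint8)   # 32-bit binary codes
print(codes.shape)
```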
16:00 | The Global Literacy Project: Technology to Power Child-Driven Learning SPEAKER: Cynthia Breazeal ABSTRACT. Children are the most precious natural resource of any nation. Education opens the mind of a child to a potential lifetime of knowledge in all its varieties, personal growth, and critical and creative thinking. Yet, it is estimated that around 67 million children live in poor, remote areas where there is no access to schools and where everyone around them is illiterate. There are at least another 100 million children who live where schooling is so inadequate that they also fail to achieve an education in any meaningful manner. There are, and always will be, places in every country in the world where good schools will not exist and good teachers will not want to go. Even in developed countries such as the United States, literacy rates, especially in areas of poverty, are unacceptably low. And over 40 percent of preschool-aged children in the US are not enrolled in preschool. Too many children enter kindergarten not ready to learn, and too few ever catch up. We need a fundamentally different approach to this set of issues. Advances in new, affordable mobile computing technologies, the growing ubiquity of Internet connectivity with cloud computing, big data analytics, and even social robots allow us to explore fundamentally new ways of educating and promoting readiness skills of young children, even in these extreme contexts. They allow us to develop a new platform for global literacy: to support science, technology and content development, and to evaluate their impact on learning outcomes even in these most extreme contexts. I will present both the vision and the early initiatives and results to date of our multi-university team's work in pursuit of this provocative mission. This is a story of technological innovation, community, and the power of child-driven learning on a global scale. What we can learn from this endeavor has the potential to help us think differently about education and technology in both formal and informal learning environments, even the most extreme ones. As a novel platform, we invite participation of the global community to make a difference in the lives of children everywhere. |
18:15 | Supervised Transfer Sparse Coding SPEAKER: unknown ABSTRACT. A combination of sparse coding and transfer learning techniques was shown to be accurate and robust in classification tasks where training and testing objects share a feature space but are sampled from different underlying distributions, i.e., belong to different domains. The key assumption in such cases is that, in spite of the domain disparity, samples from different domains share some common hidden factors. Previous methods often assumed that all the objects in the target domain are unlabeled, and thus the training set solely comprised objects from the source domain. However, in real world applications, the target domain often has some labeled objects, or one can always manually label a small number of them. In this paper, we explore this possibility and show how a small amount of labeled data in the target domain can significantly improve the classification accuracy of state-of-the-art transfer sparse coding methods. We further propose a unified framework named Supervised Transfer Sparse Coding (STSC) which simultaneously optimizes sparse representation, domain transfer and supervised classification. Experimental results on three applications demonstrate that a little manual labeling followed by learning the model in a supervised fashion can significantly improve classification accuracy. |
18:15 | Non-Restarting SAT Solvers With Simple Preprocessing Efficiently Simulate Resolution SPEAKER: unknown ABSTRACT. Propositional satisfiability (SAT) solvers based on conflict directed clause learning (CDCL) implicitly produce resolution refutations of unsatisfiable formulas. The precise class of formulas for which they can produce polynomial size refutations has been the subject of several studies, with special focus on the clause learning aspect of these solvers. The results, however, either assume the use of non-standard and non-asserting learning schemes such as FirstNewCut, or rely on polynomially many restarts for simulating individual steps of a resolution refutation, or work with a theoretical model that significantly deviates from certain key aspects of all modern CDCL solvers such as learning only one asserting clause from each conflict and other techniques such as conflict guided backjumping and clause minimization. We study non-restarting CDCL solvers that learn only one asserting clause per conflict and show that, with simple preprocessing that depends only on the number of variables of the input formula, such solvers can polynomially simulate resolution. |
18:15 | Learning to Recognize Novel Objects in One Shot through Human-Robot Interactions in Natural Language Dialogues SPEAKER: unknown ABSTRACT. Being able to quickly and naturally teach robots new knowledge is critical for many future open-world human-robot interaction scenarios. In this paper we present a novel approach to using natural language context for one-shot learning of visual objects, where the robot is immediately able to recognize the described object. We describe the architectural components and demonstrate the proposed approach on a robotic platform in a proof-of-concept evaluation. |
18:15 | Online Portfolio Selection with Group Sparsity SPEAKER: unknown ABSTRACT. In portfolio selection, it often might be preferable to focus on a few top performing industries/sectors to beat the market. These top performing sectors, however, might change over time. In this paper, we propose an online portfolio selection algorithm that can take advantage of sector information through the use of a group sparsity inducing regularizer while making lazy updates to the portfolio. The lazy updates prevent changing one's portfolio too often, which otherwise might incur huge transaction costs. The proposed formulation is not straightforward to solve due to the presence of non-smooth functions along with the constraint that the portfolios have to lie within a probability simplex. We propose an efficient primal-dual based alternating direction method of multipliers algorithm and demonstrate its effectiveness for the problem of online portfolio selection with sector information. We show that our algorithm O-LUGS has sub-linear regret w.r.t. the best fixed and best shifting solution in hindsight. We successfully establish the robustness and scalability of O-LUGS by performing extensive experiments on two real-world datasets. |
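The simplex constraint the abstract mentions is typically handled with a Euclidean projection step. Below is the standard O(n log n) projection (Duchi et al., 2008), a common subroutine in such portfolio updates, though not necessarily the exact one used here:

```python
import numpy as np

def project_simplex(v):
    """Euclidean projection of v onto {x : x >= 0, sum(x) = 1}
    (Duchi et al., 2008); keeps portfolio weights nonnegative
    and summing to one after each update."""
    u = np.sort(v)[::-1]                      # sort descending
    css = np.cumsum(u)
    rho = np.nonzero(u * np.arange(1, len(v) + 1) > (css - 1))[0][-1]
    theta = (css[rho] - 1.0) / (rho + 1.0)
    return np.maximum(v - theta, 0.0)

w = project_simplex(np.array([0.8, 0.6, -0.2, 0.1]))
print(w, w.sum())   # [0.6, 0.4, 0.0, 0.0], sums to 1
```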
18:15 | Non-convex Feature Learning via ℓp,∞ Operator SPEAKER: unknown ABSTRACT. We present a feature selection method for solving a sparse regularization problem with a composite regularization of the ℓp norm and the ℓ∞ norm. We use a proximal gradient method to solve this ℓp,∞ operator problem, in which a simple but efficient algorithm is designed to minimize a relatively simple objective function containing a vector ℓ2 norm and an ℓ∞ norm. The proposed method brings some insight into solving sparsity-favoring norms, and extensive experiments are conducted to characterize the effect of varying p and to compare with other approaches on real-world multi-class and multi-label datasets. |
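The general proximal gradient scheme referenced in the abstract alternates a gradient step on the smooth part with a prox step on the non-smooth regularizer. A generic sketch, with the ℓ1 prox (soft-thresholding) plugged in purely as a runnable stand-in for the paper's composite ℓp,∞ prox:

```python
import numpy as np

def proximal_gradient(grad_f, prox_g, x0, step=0.1, iters=500):
    """Generic proximal gradient: x <- prox_g(x - step * grad_f(x)).
    The paper applies this scheme with the prox of its ell_{p,inf}
    regularizer; the lasso example below is only an illustration."""
    x = x0.copy()
    for _ in range(iters):
        x = prox_g(x - step * grad_f(x), step)
    return x

# Example: lasso  min 0.5*||Ax - b||^2 + lam*||x||_1
rng = np.random.default_rng(0)
A, b, lam = rng.normal(size=(50, 20)), rng.normal(size=50), 0.5
soft = lambda v, t: np.sign(v) * np.maximum(np.abs(v) - lam * t, 0.0)
x = proximal_gradient(lambda x: A.T @ (A @ x - b), soft, np.zeros(20),
                      step=1.0 / np.linalg.norm(A, 2) ** 2)
print(np.sum(x != 0), "nonzeros of 20")
```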
18:15 | An Agent-Based Model Studying the Acquisition of a Language System of Logical Constructions SPEAKER: Josefina Sierra-Santibanez ABSTRACT. This paper presents an agent-based model that studies the emergence and evolution of a language system of logical constructions, i.e. a vocabulary and a set of grammatical constructions that allow expressing logical combinations of categories. The model assumes the agents have a common vocabulary for basic categories, the ability to construct logical combinations of categories using Boolean functions, and some general purpose cognitive capacities for invention, adoption, induction and adaptation. But it does not assume the agents have a vocabulary for Boolean functions nor grammatical constructions for expressing such logical combinations of categories through language. The results of the experiments we have performed show that a language system of logical constructions emerges as a result of a process of self-organisation of the individual agents' interactions when these agents adapt their preferences for vocabulary and grammatical constructions to those they observe are used more often by the rest of the population, and that such language system is transmitted from one generation to the next. |
18:15 | Adaptation Guided Case Base Maintenance SPEAKER: unknown ABSTRACT. In case-based reasoning (CBR), problems are solved by retrieving prior cases and adapting their solutions to fit new problems. Controlling the growth of the case base is a fundamental problem in CBR. Much research on case-base maintenance has developed methods aimed at compacting case bases while maintaining system competence, by deleting cases whose absence is considered least likely to degrade the system's problem-solving, given static case adaptation knowledge. This paper proposes adaptation-guided case-base maintenance (AGCBM), a case-base maintenance approach exploiting the ability to dynamically generate new adaptation knowledge from cases. In AGCBM, case retention decisions are based both on cases' value as base cases for solving problems and on their value for generating new adaptation rules, in turn increasing the problem-solving value of other cases in the case base. The paper tests the method on numerical prediction tasks (case-based regression) in which rules are generated automatically using the case difference heuristic. Tests on four sample domains compare accuracy with five candidate case-base maintenance methods, for varying case-base densities. AGCBM outperformed the alternatives in all domains, with the benefit most substantial at the greatest amounts of compression. |
18:15 | Abduction Framework for Repairing Incomplete EL Ontologies: Complexity Results and Algorithms SPEAKER: unknown ABSTRACT. In this paper we consider the problem of repairing missing is-a relations in ontologies. We formalize the problem as a generalized TBox abduction problem (GTAP). Based on this abduction framework, we present complexity results for the existence, relevance and necessity decision problems for the GTAP, with and without specific preference relations, for ontologies that can be represented using a member of the EL family of description logics. Further, we present an algorithm for finding solutions, a system, and experiments. |
18:15 | Saturated Path-Constrained MDP: Planning under Uncertainty and Deterministic Model-Checking Constraints SPEAKER: unknown ABSTRACT. In many probabilistic planning scenarios, a system's behavior needs to not only maximize the expected utility but also obey certain restrictions. This paper presents Saturated Path-Constrained Markov Decision Processes (SPC MDPs), a new MDP type for planning under uncertainty with deterministic model-checking constraints, e.g., "state s must be visited before s'", "the system must end up in s", or "the system must never enter s". We present a mathematical analysis of SPC MDPs, showing that although SPC MDPs generally have no optimal policies, every instance of this class has an epsilon-optimal randomized policy for any epsilon > 0. We propose a dynamic programming-based algorithm for finding such policies, and empirically demonstrate this algorithm to be orders of magnitude faster than its next-best alternative. |
18:15 | State Aggregation in Monte Carlo Tree Search SPEAKER: unknown ABSTRACT. Monte Carlo tree search (MCTS) is a popular class of algorithms for online decision making in large Markov decision processes (MDPs). The effectiveness of these algorithms, however, often deteriorates for MDPs with high stochastic branching factors. In this paper, we study state aggregation as a way of reducing stochastic branching in tree search. Prior work has studied formal properties of MDP state aggregation in the context of dynamic programming and reinforcement learning, but little attention has been paid to state aggregation in MCTS. Our main contribution is to establish basic results about the optimality-preserving properties of state aggregation for search trees. We then apply these results to show that popular MCTS algorithms such as UCT and sparse sampling can employ fairly coarse state aggregation schemes while retaining their theoretical properties. As a proof of concept, we experimentally confirm that state aggregation in MCTS improves finite-sample performance. |
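The core idea, sharing search statistics among states that an aggregation function maps together, amounts to keying UCT's counts by phi(state) rather than the exact state. A sketch with a hypothetical aggregation function `phi` (not the paper's algorithm, just the mechanism it studies):

```python
import math
from collections import defaultdict

# UCT statistics keyed by (phi(state), action): states that phi maps
# together share visit counts and value estimates, reducing the effective
# stochastic branching factor.
N = defaultdict(int)      # visits per (aggregate, action)
Q = defaultdict(float)    # mean return per (aggregate, action)

def uct_select(state, actions, phi, c=1.4):
    agg = phi(state)
    total = sum(N[(agg, a)] for a in actions) + 1
    def score(a):
        n = N[(agg, a)]
        if n == 0:
            return float("inf")        # try unvisited actions first
        return Q[(agg, a)] + c * math.sqrt(math.log(total) / n)
    return max(actions, key=score)

def update(state, action, ret, phi):
    key = (phi(state), action)
    N[key] += 1
    Q[key] += (ret - Q[key]) / N[key]  # incremental mean of returns
```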
18:15 | A Constructive Argumentation Framework SPEAKER: unknown ABSTRACT. Dung's argumentation framework is an abstract framework based on a set of arguments and a binary attack relation defined over the set. One instantiation, among many others, of Dung's framework consists of constructing the arguments from a set of propositional logic formulas; an argument is then seen as a reason for or against the truth of a particular statement. Despite its advantages, the argumentation approach to inconsistency handling also has important shortcomings. More precisely, in some applications one is interested not only in the conclusions supported by the arguments but also in precise explanations of those conclusions. We show that an argumentation framework applied to classical logic formulas is not suitable for this problem. On the other hand, intuitionistic logic appears to be a natural alternative candidate logic (instead of classical logic) for instantiating Dung's framework. We develop a constructive argumentation framework, and show that intuitionistic logic offers desirable properties of the arguments. We also provide a characterization of the arguments in this setting in terms of minimal inconsistent subsets when intuitionistic logic is embedded in the modal logic S4. |
18:15 | Privacy and Regression Model Preserved Learning SPEAKER: unknown ABSTRACT. Sensitive data such as medical records and business reports usually contain valuable information that can be used to build prediction models. However, designing learning models by directly using sensitive data might result in severe privacy and copyright issues. In this paper, we propose a novel matrix completion based framework that handles two challenging issues simultaneously: i) recovering missing and noisy sensitive data, and ii) preserving the privacy of the sensitive data during the learning process. In particular, the proposed framework is able to mask the sensitive data while ensuring that the transformed data are still usable for training regression models. We show that two key properties, namely model preserving and privacy preserving, are satisfied by the transformed data obtained from the proposed framework. In model preserving, we guarantee that the linear regression model built from the masked data perfectly approximates the regression model learned from the original data. In privacy preserving, we ensure that the original sensitive data cannot be recovered, since the transformation procedure is irreversible. Given these two characteristics, the transformed data can be safely released to any learners for designing prediction models without revealing any private content. Our empirical studies with a synthesized dataset and multiple sensitive benchmark datasets verify our theoretical claims as well as the effectiveness of the proposed framework. |
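One concrete illustration of the model-preserving property: rotating the sample rows by an orthogonal matrix leaves the least-squares solution unchanged while scrambling which row belongs to which record. This is a toy demonstration of the property only; unlike the paper's transformation, an orthogonal rotation is reversible, so it is not privacy preserving in their sense:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5))
y = X @ rng.normal(size=5) + 0.1 * rng.normal(size=100)

# Rotate samples by a random orthogonal matrix G (QR of a Gaussian matrix).
G, _ = np.linalg.qr(rng.normal(size=(100, 100)))
Xm, ym = G @ X, G @ y               # masked rows no longer match any record

beta  = np.linalg.lstsq(X,  y,  rcond=None)[0]
betam = np.linalg.lstsq(Xm, ym, rcond=None)[0]
print(np.allclose(beta, betam))     # True: the regression model is preserved
```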
18:15 | Mechanism Design for Mobile Geo-Location Advertising SPEAKER: unknown ABSTRACT. Mobile geo-location advertising, where mobile ads are targeted based on a user's location, has been identified as a key growth factor for the mobile market. As with online advertising, a crucial ingredient for its success is the development of effective economic mechanisms. An important difference is that mobile ads are shown sequentially over time, and information about the user can be learned based on their movements. Furthermore, ads need to be shown selectively to prevent ad fatigue. To this end, we introduce, for the first time, a user model and suitable economic mechanisms which take these factors into account. Specifically, we design two truthful mechanisms which produce an advertisement plan based on the user's movements. One mechanism is allocatively efficient, but requires exponential compute time in the worst case. The other requires polynomial time, but is not allocatively efficient. Finally, we experimentally evaluate the trade-off between compute time and efficiency of our mechanisms. |
18:15 | Learning Relative Similarity by Stochastic Dual Coordinate Ascent SPEAKER: unknown ABSTRACT. Learning relative similarity from pairwise instances is an important problem in machine learning and is potentially useful for many applications, such as image and text retrieval. Despite being studied for years, existing methods based on Stochastic Gradient Descent (SGD) techniques generally suffer from slow convergence. In this paper, we investigate the application of the Stochastic Dual Coordinate Ascent (SDCA) technique to the optimization task of relative similarity learning, extending it from vector to matrix parameters. Theoretically, we prove an optimal linear convergence rate for the proposed SDCA algorithm, beating the well-known sublinear convergence rate of the previous best metric learning algorithms. Empirically, we conduct extensive experiments on both standard and large-scale datasets to validate the effectiveness of the proposed algorithm for retrieval tasks. |
18:15 | Approximate Equilibrium and Incentivizing Social Coordination SPEAKER: unknown ABSTRACT. We study techniques to incentivize self-interested agents to form socially desirable solutions in scenarios where they benefit from mutual coordination. Towards this end, we consider coordination games where agents have different intrinsic preferences but stand to gain if others choose the same strategy as them. For non-trivial versions of our game, stable solutions like Nash Equilibrium may not exist, or may be socially inefficient even when they do exist. This motivates us to focus on designing efficient algorithms to compute (almost) stable solutions like Approximate Equilibrium that can be realized if agents are provided some additional incentives. Alternatively, approximate stability corresponds to the addition of a switching cost that agents have to pay in order to deviate. Our results apply in many settings like adoption of new products, project selection, and group formation, where a central authority can direct agents towards a strategy but agents may defect if they have better alternatives. We show that for any given instance, we can either compute a high quality approximate equilibrium or a near-optimal solution that can be stabilized by providing a small fraction of the social welfare to all players. Our results imply that little influence is necessary in order to ensure that selfish players coordinate and form socially efficient solutions. |
18:15 | New Models for Competitive Contagion SPEAKER: unknown ABSTRACT. In this paper, we introduce and examine two new and natural models for competitive contagion in networks, a game-theoretic generalization of the viral marketing problem. In our setting, firms compete to maximize their market share in a network of consumers whose adoption decisions are stochastically determined by the choices of their neighbors. Building on the switching-selecting framework introduced by Goyal and Kearns (2012), we first introduce a new model in which the payoff to firms comprises not only the number of vertices that adopt their (competing) technologies, but also the network connectivity among those nodes. For a general class of stochastic dynamics driving the local adoption process, we derive upper bounds for (1) the (pure strategy) Price of Anarchy (PoA), which measures the inefficiency of resource use at equilibrium, and (2) the Budget Multiplier, which captures the extent to which the network amplifies the imbalances in the firms' initial budgets. These bounds depend on the firm budgets and the maximum degree of the network, but no other structural properties. In addition, we give general conditions under which the PoA and the Budget Multiplier can be unbounded. We also introduce a model in which budgeting decisions are endogenous, rather than externally given as is typical in the viral marketing problem. In this setting, the firms are allowed to choose the number of seeds to initially infect (at a fixed cost per seed), as well as which nodes to select as seeds. In sharp contrast to the results of Goyal and Kearns (2012), we show that for almost any local adoption dynamics, there exists a family of graphs for which the PoA and Budget Multiplier are unbounded. |
18:15 | DJAO: A Communication-Constrained DCOP Algorithm that Combines Features of ADOPT and Action-GDL SPEAKER: unknown ABSTRACT. In this paper we propose a novel DCOP algorithm, called DJAO, that efficiently finds solutions with low communication overhead; the algorithm can be used for optimal and bounded approximate solutions by appropriately setting the error bounds. Our approach builds on the distributed junction trees used in Action-GDL to represent independence relations among variables. We construct an AND/OR search space based on these junction trees. This new type of search space results in higher degrees for each OR node, consequently yielding a more efficient search graph in distributed settings. DJAO uses a branch-and-bound search algorithm to find solutions distributedly within this search graph. We introduce heuristics to compute the upper and lower bound estimates that the search starts with, which are integral to our approach for reducing communication overhead. We empirically evaluate our approach in various settings. |
18:15 | Cross-lingual Knowledge Validation Based Taxonomy Derivation from Heterogeneous Online Wikis SPEAKER: unknown ABSTRACT. Creating knowledge bases from crowd-sourced wikis, like Wikipedia, has attracted significant research interest in the field of the intelligent Web. However, the derived taxonomies usually contain many mistakenly imported taxonomic relations, due to the difference between user-generated subsumption relations and semantic taxonomic relations. Current approaches to solving the problem still suffer from the following issues: (i) heuristic-based methods rely strongly on specific language-dependent rules, and (ii) corpus-based methods depend on a large-scale, high-quality corpus, which is often unavailable. In this paper, we formulate cross-lingual taxonomy derivation as the problem of cross-lingual taxonomic relation prediction. We investigate different linguistic heuristics and language-independent features, and propose a cross-lingual knowledge validation based dynamic adaptive boosting model to iteratively reinforce the performance of taxonomic relation prediction. The proposed approach successfully overcomes the above issues, and experiments show that it significantly outperforms state-of-the-art comparison methods. |
18:15 | Sparse Learning for Stochastic Composite Optimization SPEAKER: unknown ABSTRACT. In this paper, we focus on sparse learning for Stochastic Composite Optimization (SCO). Many algorithms have been proposed for SCO, and they have recently reached the optimal convergence rate O(1/T). However, the sparsity of the solutions obtained by existing methods is unsatisfactory, mainly for two reasons: (1) the average of the intermediate solutions is taken as the final solution, and (2) the magnitude of the sparse regularizer is reduced over the iterations. In order to improve the sparsity pattern of the solutions, we propose a simple but effective stochastic optimization scheme that adds a novel sparse online-to-batch conversion to traditional algorithms for SCO. Two specific approaches are discussed in this paper to reveal the power of our scheme. The theoretical analysis shows that our scheme can find a solution with a better sparsity pattern without affecting the current optimal convergence rate. Experimental results on both synthetic and real-world datasets show that the proposed methods have clearly superior sparse recovery ability and a convergence rate comparable to the state-of-the-art algorithms for SCO. |
18:15 | Signed Laplacian Embedding for Supervised Dimension Reduction SPEAKER: unknown ABSTRACT. Manifold learning is a powerful tool for solving nonlinear dimension reduction problems. By assuming that high-dimensional data usually lie on a low-dimensional manifold, many algorithms have been proposed. However, most algorithms simply adopt the traditional graph Laplacian to encode the data locality, so their discriminative ability is limited and the embedding results are not always suitable for subsequent classification. Instead, this paper deploys the signed graph Laplacian and proposes Signed Laplacian Embedding (SLE) for supervised dimension reduction. By exploiting the label information, SLE comprehensively transfers the discrimination carried by the original data to the embedded low-dimensional space. Without perturbing the discrimination structure, SLE also retains the locality. Theoretically, we prove the immersion property by computing the rank of the projection, and relate SLE to existing algorithms within the patch-alignment framework. Thorough empirical studies on synthetic and real datasets demonstrate the effectiveness of SLE. |
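A signed graph Laplacian differs from the ordinary one in that the degree matrix uses absolute edge weights, which keeps it positive semidefinite even with negative (inter-class) edges. The following small construction is illustrative only, not the paper's full SLE objective:

    import numpy as np

    def signed_laplacian(W):
        """L = D - W with D the diagonal of absolute row sums; L stays
        positive semidefinite even when W contains negative weights."""
        D = np.diag(np.abs(W).sum(axis=1))
        return D - W

    # +1 edges link same-class neighbors, -1 edges link different classes
    W = np.array([[ 0.,  1., -1.],
                  [ 1.,  0., -1.],
                  [-1., -1.,  0.]])
    L = signed_laplacian(W)
    vals, vecs = np.linalg.eigh(L)
    Y = vecs[:, :2]    # a 2-d spectral embedding of the three points
    print(vals)        # all eigenvalues are nonnegative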
18:15 | On the Challenges of Physical Implementations of RBMs SPEAKER: unknown ABSTRACT. Restricted Boltzmann machines (RBMs) are powerful machine learning models, but learning and some kinds of inference in the model require sampling-based approximations, which, in classical digital computers, are implemented using expensive MCMC. Physical computation offers the opportunity to reduce the cost of sampling by building physical systems whose natural dynamics correspond to drawing samples from the desired RBM distribution. Such a system avoids the burn-in and mixing cost of a Markov chain. However, hardware implementations of this variety usually entail limitations such as low precision and limited range of the parameters, as well as restrictions on the size and topology of the RBM. We conduct software simulations to determine how harmful each of these restrictions is. Our simulations are designed to reproduce aspects of the D-Wave Two computer, but the issues we investigate arise in most forms of physical computation. Our findings suggest that designers of new physical computing hardware and algorithms for physical computers should concentrate their efforts on overcoming the limitations imposed by the topology restrictions of currently existing physical computers. |
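Low precision and limited parameter range can be emulated in software by clipping and quantizing the couplings before sampling. The sketch below shows that idea under assumed names; it is not the authors' simulator:

    import numpy as np

    def quantize(theta, bits, lo=-1.0, hi=1.0):
        """Clip parameters to [lo, hi] and round them onto a uniform grid
        with 2**bits levels, mimicking limited-precision hardware."""
        levels = 2 ** bits - 1
        clipped = np.clip(theta, lo, hi)
        step = (hi - lo) / levels
        return lo + np.round((clipped - lo) / step) * step

    rng = np.random.default_rng(0)
    W = rng.normal(scale=0.5, size=(4, 4))        # RBM coupling weights
    print(np.abs(W - quantize(W, bits=4)).max())  # worst-case distortion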
18:15 | On Computing Optimal Strategies in Open List Proportional Representation: the Two Parties Case SPEAKER: unknown ABSTRACT. Open list proportional representation is an election mechanism used in many elections, including the 2012 Hong Kong Legislative Council Geographical Constituencies election. Assuming that there are just two parties in the election, and that the number of votes a list receives is the sum of the votes its candidates would receive if each of them ran alone, we formulate the election as a mostly zero-sum game. We show that while the game always has a pure Nash equilibrium, computing one is NP-hard. |
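For concreteness: the Hong Kong geographical constituencies allocate seats to lists by the largest-remainder method with the Hare quota (an assumption about the setting, since the abstract does not fix the allocation rule). The strategic question is how a party should split its supporters across lists; a minimal allocation sketch:

    def largest_remainder(votes, seats):
        """Hare-quota largest-remainder allocation: each list first gets
        floor(votes / quota) seats; leftover seats go to the lists with
        the largest remainders."""
        quota = sum(votes) / seats
        alloc = [int(v // quota) for v in votes]
        by_remainder = sorted(range(len(votes)),
                              key=lambda i: votes[i] % quota, reverse=True)
        for i in by_remainder[:seats - sum(alloc)]:
            alloc[i] += 1
        return alloc

    # one party splitting 60,000 supporters across two lists vs. a rival
    print(largest_remainder([40000, 31000, 29000], seats=5))  # [2, 2, 1]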
18:15 | ARIA: Asymmetry Resistant Instance Alignment SPEAKER: unknown ABSTRACT. This paper studies the problem of instance alignment between knowledge bases (KBs). Existing approaches exploit the "symmetry" of structure and information across KBs and therefore suffer in the presence of asymmetry, which is frequent because KBs are built independently. Specifically, we observe three types of asymmetry: concept, feature, and structural asymmetry. The goal of this paper is to identify key techniques for overcoming each type of asymmetry, and to build them into a framework that aligns instances robustly in the presence of asymmetry. In particular, we propose the Asymmetry-Resistant Instance Alignment framework (ARIA), which implements a two-phase blocking method accounting for concept and feature asymmetry, together with a novel similarity measure that overcomes structural asymmetry. Our evaluation shows that ARIA outperforms state-of-the-art methods in both response time and accuracy, improving precision by 18% and recall by 2% when matching large-scale real-life KBs. |
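Blocking is the standard device for making instance alignment scale: rather than scoring every cross-KB pair, instances are grouped by shared keys and only within-block pairs are compared. A minimal token-based sketch (illustrative, not ARIA's two-phase method):

    from collections import defaultdict

    def block_by_tokens(instances):
        """Index instances by the tokens of their labels; candidate pairs
        are generated only inside blocks, avoiding the full cross product."""
        blocks = defaultdict(set)
        for ident, label in instances.items():
            for token in label.lower().split():
                blocks[token].add(ident)
        return blocks

    kb1 = {"a1": "Barack Obama", "a2": "Paris France"}
    kb2 = {"b1": "Obama Barack H.", "b2": "Paris Texas"}
    blocks = block_by_tokens({**kb1, **kb2})
    candidates = {(i, j) for ids in blocks.values()
                  for i in ids for j in ids
                  if i.startswith("a") and j.startswith("b")}
    print(sorted(candidates))   # [('a1', 'b1'), ('a2', 'b2')]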
18:15 | Large-Scale Supervised Multimodal Hashing with Semantic Correlation Maximization SPEAKER: unknown ABSTRACT. Due to its low storage cost and fast query speed, hashing has been widely adopted for similarity search in multimedia data. In particular, more and more attention has been paid to multimodal hashing for search in multimedia data with multiple modalities, such as images with tags. Typically, supervised information in the form of semantic labels is also available for the data points in many real applications. Hence, many supervised multimodal hashing (SMH) methods have been proposed to exploit such labels and further improve search accuracy. However, the training time complexity of most existing SMH methods is too high, which makes them unscalable to large datasets. In this paper, a novel SMH method, called semantic correlation maximization (SCM), is proposed to seamlessly integrate semantic labels into the hash-learning procedure for large-scale data modeling. Experimental results on two real-world datasets show that SCM significantly outperforms state-of-the-art SMH methods in terms of both accuracy and scalability. |
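To make the idea of label-aligned hashing concrete, here is a simplified spectral-relaxation sketch: build a semantic similarity matrix from label vectors and pick linear projections whose sign codes correlate with it. This is a hedged simplification, not the authors' exact SCM algorithm; all names are illustrative.

    import numpy as np

    rng = np.random.default_rng(0)
    n, d, c, b = 200, 16, 5, 8                 # points, features, labels, bits
    X = rng.standard_normal((n, d))
    L = (rng.random((n, c)) < 0.2).astype(float)   # binary label matrix
    Ln = L / np.maximum(np.linalg.norm(L, axis=1, keepdims=True), 1e-12)
    S = 2 * Ln @ Ln.T - 1                      # semantic similarity in [-1, 1]

    # relaxed objective: maximize tr(W^T X^T S X W) s.t. W^T W = I,
    # solved by the top-b eigenvectors of the symmetric matrix X^T S X
    M = X.T @ S @ X
    vals, vecs = np.linalg.eigh(M)
    W = vecs[:, -b:]
    codes = np.sign(X @ W)                     # binary codes in {-1, +1}
    print(codes.shape)                         # (200, 8)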
18:15 | Learning Word Representation Considering Proximity and Ambiguity SPEAKER: unknown ABSTRACT. Distributed representations of words (aka word embeddings) have proven helpful in solving NLP tasks, and training them with neural networks has received much attention of late. In particular, the most recent work on word embedding, the Continuous Bag-of-Words (CBOW) and Continuous Skip-gram (Skip-gram) models proposed by Google, shows very impressive results by significantly speeding up training, enabling word representation learning from very large-scale data. However, neither CBOW nor Skip-gram pays enough attention to word proximity from a modeling perspective or to word ambiguity from a linguistic perspective. In this paper, we propose Proximity-Ambiguity Sensitive (PAS) models (i.e., PAS CBOW and PAS Skip-gram) that produce high-quality distributed representations of words while accounting for both proximity and ambiguity. From the modeling perspective, we introduce proximity weights, which are learned in PAS CBOW and used in PAS Skip-gram. By better modeling word proximity, we reveal the real strength of pooling-structured neural networks in word representation learning; the proximity-sensitive pooling layer can also be applied to other neural network applications that employ pooling layers. From the linguistic perspective, we train multiple representation vectors per word, each corresponding to a particular sense of the word. Using PAS models, we achieved a maximum accuracy increase of 16.9% over state-of-the-art models on the word representation test set. |
18:15 | Echo-State Conditional Restricted Boltzmann Machines SPEAKER: Sotirios Chatzis ABSTRACT. Restricted Boltzmann machines (RBMs) are a powerful generative modeling technique, based on a complex graphical model of hidden (latent) variables. Conditional RBMs (CRBMs) are an extension of RBMs tailored to modeling temporal data. A drawback of CRBMs is that they consider only linear temporal dependencies, which limits their capability to capture complex temporal structure; they also require many variables to model long temporal dependencies, which makes them prone to overfitting. To resolve these issues, in this paper we propose the echo-state CRBM (ES-CRBM): our model uses an echo-state network reservoir in the context of CRBMs to efficiently capture long and complex temporal dynamics, with far fewer trainable parameters than conventional CRBMs. In addition, we introduce an (implicit) mixture of ES-CRBM experts (im-ES-CRBM) to further enhance the capabilities of the ES-CRBM model. The im-ES-CRBM allows for better modeling of temporal observations that comprise a number of latent or observable subpatterns alternating in a dynamic fashion, and it also supports sequence segmentation within our framework. We apply our methods to sequential data modeling and classification experiments on public datasets. As we show, our approach outperforms both existing RBM-based approaches and related state-of-the-art methods, such as conditional random fields. |
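The reservoir part is standard echo-state machinery: a fixed random recurrent map whose spectral radius is scaled below one so that the influence of past inputs fades, yielding rich temporal features with no recurrent training. A minimal sketch of the reservoir alone (not the full ES-CRBM):

    import numpy as np

    rng = np.random.default_rng(0)
    n_res, n_in = 100, 3
    W_res = rng.standard_normal((n_res, n_res))
    W_res *= 0.9 / np.abs(np.linalg.eigvals(W_res)).max()  # spectral radius 0.9
    W_in = rng.standard_normal((n_res, n_in))

    def run_reservoir(inputs):
        """Drive the fixed reservoir with an input sequence; the returned
        states serve as untrained temporal features for a downstream model."""
        x = np.zeros(n_res)
        states = []
        for u in inputs:
            x = np.tanh(W_res @ x + W_in @ u)
            states.append(x)
        return np.array(states)

    states = run_reservoir(rng.standard_normal((50, n_in)))
    print(states.shape)   # (50, 100)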
18:15 | Adding Local Exploration to Greedy Best-First Search for Satisficing Planning SPEAKER: unknown ABSTRACT. Greedy Best-First Search (GBFS) is a powerful algorithm at the heart of many state-of-the-art satisficing planners. One major weakness of GBFS is its behavior in so-called uninformative heuristic regions (UHRs), parts of the search space in which no heuristic provides guidance towards states with improved heuristic values. In such regions, GBFS degenerates into an inefficient breadth-first-type search. This work analyzes the problem of UHRs in planning in detail and proposes a two-level search framework as a solution. In Greedy Best-First Search with Local Exploration (GBFS-LE), a local exploration is started from within a global GBFS whenever the search seems stuck in a UHR. Two local exploration strategies are developed and evaluated experimentally: Local GBFS (LS) and Local Random Walk Search (LRW). The two new planners LAMA-LS and LAMA-LRW integrate these strategies into the GBFS component of LAMA-2011. Both yield clear improvements in terms of both coverage and search time on standard International Planning Competition benchmarks, especially for domains that are proven to have large or unbounded UHRs. |
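A sketch of the two-level scheme: global GBFS expands nodes by heuristic value and, after a stretch of expansions with no heuristic improvement, launches a bounded local episode (here, random walks) from the current node. This is illustrative pseudocode-style Python with assumed names, not the LAMA integration:

    import heapq, itertools, random

    def local_explore(s, h, successors, walks=20, length=10):
        """Random-walk exploration: return the best state found by short
        walks from s, hoping to escape the uninformative region."""
        best = s
        for _ in range(walks):
            cur = s
            for _ in range(length):
                nxt = list(successors(cur))
                if not nxt:
                    break
                cur = random.choice(nxt)
                if h(cur) < h(best):
                    best = cur
        return best

    def gbfs_le(start, h, successors, is_goal, stall_limit=1000):
        """Greedy best-first search that triggers a local exploration
        episode after stall_limit expansions without improving h."""
        tie = itertools.count()          # tie-breaker so states never compare
        open_list = [(h(start), next(tie), start)]
        best_h, stalled, seen = h(start), 0, {start}
        while open_list:
            hs, _, s = heapq.heappop(open_list)
            if is_goal(s):
                return s
            stalled = 0 if hs < best_h else stalled + 1
            best_h = min(best_h, hs)
            if stalled > stall_limit:
                s = local_explore(s, h, successors)
                stalled = 0
            for t in successors(s):
                if t not in seen:
                    seen.add(t)
                    heapq.heappush(open_list, (h(t), next(tie), t))
        return None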
18:15 | On the Incompatibility of Efficiency and Strategyproofness in Randomized Social Choice SPEAKER: unknown ABSTRACT. Efficiency (no agent can be made better off without making another one worse off) and strategyproofness (no agent can obtain a more preferred outcome by misrepresenting his preferences) are two cornerstones of economics and are ubiquitous in important areas such as voting, auctions, and matching markets. Within the context of randomized social choice, Bogomolnaia and Moulin have shown that two particular notions of efficiency and strategyproofness based on stochastic dominance are incompatible. However, there are various other ways of lifting preferences over alternatives to preferences over lotteries besides stochastic dominance. In this paper, we give an overview of common preference extensions, propose two new ones, and show that the above-mentioned incompatibility extends to various other notions of strategyproofness and efficiency. |
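Stochastic dominance lifts an agent's ranking of alternatives to a partial order over lotteries: p dominates q if, for every prefix of the ranking from most to least preferred, p places at least as much probability mass on that prefix. A minimal check:

    def sd_dominates(p, q, ranking):
        """True iff lottery p stochastically dominates lottery q with
        respect to `ranking` (alternatives from most to least preferred)."""
        cum_p = cum_q = 0.0
        for a in ranking:
            cum_p += p.get(a, 0.0)
            cum_q += q.get(a, 0.0)
            if cum_p < cum_q:
                return False
        return True

    ranking = ["a", "b", "c"]
    p = {"a": 0.5, "b": 0.5}
    q = {"a": 0.5, "c": 0.5}
    print(sd_dominates(p, q, ranking))   # True
    print(sd_dominates(q, p, ranking))   # False: the order is only partial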
18:15 | Ranking Tweets by Labeled and Collaboratively Selected Pairs with Transitive Closure SPEAKER: unknown ABSTRACT. Tweet ranking is important for information acquisition in microblogs. Owing to content sparsity and the scarcity of labeled data, semi-supervised learning methods are well suited to exploiting the unlabeled data. However, most previous semi-supervised learning methods ignore the pair-conflict problem: newly selected unlabeled data may conflict with the labeled and previously selected data, and learning performance suffers considerably if the training data contains many conflicting pairs. In this paper, we propose a new collaborative semi-supervised SVM ranking model (CSR-TC) that takes order conflicts into account. Unlabeled data is selected based on a dynamically maintained transitive closure graph to avoid pair conflicts. We also investigate two feature views, intrinsic and content-relevant features, for the proposed model. Extensive experiments are conducted on the TREC Microblogging corpus. The results demonstrate that our proposed method achieves significant improvements over several state-of-the-art models. |
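The conflict test reduces to reachability in a preference graph: accepting the pair "a ranks above b" is inconsistent if b already reaches a through the transitive closure of previously accepted pairs. A minimal sketch with illustrative names:

    from collections import defaultdict

    class PreferenceGraph:
        """Accept 'a > b' pairs only when they do not contradict the
        transitive closure of previously accepted pairs."""
        def __init__(self):
            self.succ = defaultdict(set)

        def reaches(self, u, v):
            """Depth-first reachability from u to v."""
            stack, seen = [u], {u}
            while stack:
                x = stack.pop()
                if x == v:
                    return True
                for y in self.succ[x] - seen:
                    seen.add(y)
                    stack.append(y)
            return False

        def add_pair(self, a, b):
            if self.reaches(b, a):     # adding a > b would create a cycle
                return False
            self.succ[a].add(b)
            return True

    g = PreferenceGraph()
    print(g.add_pair("t1", "t2"), g.add_pair("t2", "t3"), g.add_pair("t3", "t1"))
    # True True False: t3 > t1 conflicts with the chain t1 > t2 > t3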
18:15 | Learning Temporal Dynamics of Behavior Propagation in Social Networks SPEAKER: unknown ABSTRACT. Social influence has been widely invoked to explain people's cascade behaviors and has been exploited in many applications. However, little existing work has studied in detail the direct, microscopic, and temporal impact of social influence on people's behaviors. In this paper we investigate behavior propagation based on social influence and its temporal dynamics over continuous time. We formalize the static behavior models BP and IBP, and the discrete DBP and DIBP models. We introduce continuous-temporal functions (CTFs) to model the fully continuous variation of social influence over time. Building on these, we propose the continuous-temporal interest-aware behavior propagation model, CIBP, and present an effective inference algorithm. Experimental studies on real-world datasets evaluate the family of behavior propagation models (BPMs) and demonstrate the effectiveness of our proposed models. |
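The abstract does not fix the functional form of the CTFs; a common choice for influence that fades over continuous time is an exponential kernel. The sketch below is a hedged illustration with assumed names, not the paper's CIBP model:

    import math

    def influence(strength, elapsed, decay=0.1):
        """A simple continuous-temporal function: influence exerted by an
        exposure decays exponentially with the time elapsed since it."""
        return strength * math.exp(-decay * elapsed)

    # probability a user adopts a behavior given timed exposures from friends
    exposures = [(0.8, 2.0), (0.5, 10.0)]   # (influence strength, hours ago)
    hazard = sum(influence(s, dt) for s, dt in exposures)
    p_adopt = 1.0 - math.exp(-hazard)
    print(round(p_adopt, 3))                # older exposures contribute less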
18:15 | On the Structure of Synergies in Cooperative Games SPEAKER: unknown ABSTRACT. We investigate synergy, or the lack thereof, between agents in cooperative games, building on the popular notion of the Shapley value. We think of a pair of agents as synergistic (resp., antagonistic) if the Shapley value of one agent is higher (resp., lower) when the other agent participates in a joint effort than when the other agent does not participate. Our main theoretical result is that any graph specifying synergistic and antagonistic pairs can arise, even from a restricted class of cooperative games. We also study the computational complexity of determining whether a given pair of agents is synergistic. Finally, we use the concepts developed in the paper to uncover the structure of synergies in two real-world organizations, the European Union and the International Monetary Fund. |
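Synergy as defined here compares an agent's Shapley value with and without the other agent present. For tiny games the Shapley value can be computed exactly by averaging marginal contributions over all orderings, as in this sketch:

    from itertools import permutations

    def shapley(players, v):
        """Exact Shapley values by averaging marginal contributions over
        all player orderings (exponential; fine only for small games)."""
        phi = {p: 0.0 for p in players}
        orders = list(permutations(players))
        for order in orders:
            coalition = frozenset()
            for p in order:
                phi[p] += v(coalition | {p}) - v(coalition)
                coalition = coalition | {p}
        return {p: phi[p] / len(orders) for p in players}

    def v(S):   # a toy game in which agents 1 and 2 are complementary
        return 1.0 if {1, 2} <= S else 0.0

    with_2 = shapley([1, 2, 3], v)[1]      # 0.5
    without_2 = shapley([1, 3], v)[1]      # 0.0
    print(with_2 > without_2)              # True: 1 and 2 are synergistic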
18:15 | Multilabel Classification with Label Correlations and Missing Labels SPEAKER: unknown ABSTRACT. Many real-world applications involve multilabel classification, in which the labels can have strong inter-dependencies and some of them may even be missing. Existing multilabel algorithms are unable to deal with both issues simultaneously. In this paper, we propose a probabilistic model that can automatically learn and exploit multilabel correlations. By integrating out the missing information, it also provides a disciplined approach to the handling of missing labels. The inference procedure is simple, and the optimization subproblems are convex. Experiments on a number of real-world data sets with both complete and missing labels demonstrate that the proposed algorithm can consistently outperform the state-of-the-art multilabel classification algorithms. |
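A standard way to handle missing labels, which the "integrating out" view generalizes, is to mask them out of the training objective rather than treat them as negatives. Below is a minimal masked logistic loss, an illustrative baseline rather than the paper's probabilistic model:

    import numpy as np

    def masked_logistic_loss(scores, labels, observed):
        """Mean logistic loss over observed entries only; `labels` take
        values in {-1, +1} and `observed` is a 0/1 mask of known labels."""
        losses = np.log1p(np.exp(-labels * scores))
        return (losses * observed).sum() / max(observed.sum(), 1)

    scores = np.array([[2.0, -1.0, 0.3]])
    labels = np.array([[1.0, -1.0, 1.0]])
    observed = np.array([[1.0, 1.0, 0.0]])   # the third label is missing
    print(masked_logistic_loss(scores, labels, observed))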