Creating High Level Content Descriptors for Recommender Systems Datasets
ABSTRACT. Information Retrieval and Recommender Systems have frequently been evaluated using indexes based on variants and extensions of precision-like measures. Likewise, approaches for diversity evaluation have been proposed. However, these measures are usually defined in terms of a set of high level content descriptors known as information nuggets, which are hard to obtain. We propose a method to create these nuggets using social tags, providing datasets with annotations to evaluate content diversity in recommender systems. Since recommending items to a target user is analogous to searching for documents given a query, this method might be extended to Information Retrieval.
A Development Methodologies Recommender System Based on Knowledge from the Software Industry
ABSTRACT. Software development methodologies are fundamental for the proper development of software projects. However, these methodologies are not always used in the most appropriate way, nor are the most pertinent ones always selected given the resources available, which can result in projects that fail or are aborted during execution. In fact, several reports show a high rate of failure in the development and implementation of software projects, especially large ones. The type of development methodology chosen is one of the important factors that differentiate successful from unsuccessful projects. For example, it has been reported that projects that use agile practices have a higher success rate than those that use other development methodologies. However, agile methodologies are not necessarily appropriate for every type of project. Given the above, we developed a prototype recommender system for development methodologies that guides developers with little experience or knowledge in selecting the methodologies that are most appropriate according to different criteria. These criteria were obtained from bibliographic sources and then validated and complemented through an empirical study in which surveys and interviews were conducted with IT professionals at companies that develop software in Chile.
A Compact Memory-based Index for Spatial Keyword Query Resolution
ABSTRACT. Spatial keyword queries are widely used to provide innovative search services, such as retrieving the nearest restaurant offering a desired service or searching for people who write about a topic in a particular location. Behind these services, geo-textual indexes play a leading role in resolving such queries efficiently. Existing approaches combine spatial and text indexing schemes that are based primarily on secondary storage, so their performance is mainly affected by I/O costs. To overcome this limitation, we propose a new compact memory-based index that enhances a balanced KD-Tree with keyword information encoded in the form of highly compressed bitmaps. This index is, to the best of our knowledge, the first approach based on (memory-friendly) compact data structures. We also design an in-memory algorithm that efficiently resolves the well-known Top-k Spatial Keyword Query (TkSKQ), i.e., it retrieves the k nearest objects that are described by a set of keywords. Experiments on several real-world datasets show that our proposal outperforms the state of the art in both space requirements (27% in comparison) and runtime (12.5 times faster).
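To make the TkSKQ definition concrete, the following is a minimal brute-force sketch in Python. It is not the compact KD-Tree index or the algorithm proposed in the paper; it only illustrates what the query returns. The function name `top_k_spatial_keyword`, the tuple layout of the objects, the use of Euclidean distance, and the assumption that an object must contain all query keywords are illustrative choices that do not come from the abstract.

```python
import heapq
import math


def top_k_spatial_keyword(objects, query_point, keywords, k):
    """Return the names of the k nearest objects that contain every keyword.

    objects: iterable of (name, (x, y), set_of_terms) tuples (hypothetical layout).
    Assumption: an object matches only if it is described by ALL query keywords.
    """
    wanted = set(keywords)
    # Keep only the objects whose terms cover the query keywords,
    # paired with their Euclidean distance to the query point.
    matches = (
        (math.dist(query_point, location), name)
        for name, location, terms in objects
        if wanted <= set(terms)
    )
    # nsmallest scans all matches once; an index would prune this search.
    return [name for _, name in heapq.nsmallest(k, matches)]


places = [
    ("cafe_a", (0.0, 1.0), {"coffee", "wifi"}),
    ("cafe_b", (2.0, 2.0), {"coffee"}),
    ("bar_c", (0.5, 0.5), {"beer", "wifi"}),
    ("cafe_d", (3.0, 0.0), {"coffee", "wifi"}),
]
result = top_k_spatial_keyword(places, (0.0, 0.0), ["coffee", "wifi"], 2)
print(result)
```

A linear scan like this touches every object on every query, which is the cost a geo-textual index is meant to avoid.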
Extending the CMHD Compact Data Structure to Compute Aggregations over Data Warehouses
ABSTRACT. Compact data structures store data in compact form without losing the ability to query it in that form. In this paper, we present algorithms that extend the functionality of the compact data structure CMHD (Compact representation of Multidimensional data on Hierarchical Domains). CMHD supports the computation of aggregate queries with the SUM function on multidimensional matrices. We present algorithms to implement the remaining aggregate functions, i.e., MIN, MAX, COUNT and AVG.
Moreover, we use the CMHD compact data structure to represent Data Warehouses (DWs) compactly. A DW is a subject-oriented, integrated, non-volatile and historical collection of data, organized to support the decision-making process. The most common queries over DWs are aggregate queries, i.e., queries that group data and compute an aggregate function over the groups. Improving the efficiency of query processing in DWs is a very important issue, and various efforts have been made in that direction, such as maintaining and updating materialized views of the data. In this paper, we represent DWs in the CMHD compact data structure and query them efficiently. Experimental results over DWs with synthetic data show that a compact representation of DWs achieves better performance in processing aggregate queries.
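As a reminder of what an aggregate query computes, here is a minimal Python sketch that groups rows by one dimension and evaluates SUM, MIN, MAX, COUNT and AVG per group. It works on plain dictionaries and does not reproduce CMHD or its algorithms; the function name `aggregate` and the `region`/`sales` fields are hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def aggregate(rows, group_key, measure):
    """Group rows by group_key and compute the five aggregates of measure."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[group_key]].append(row[measure])
    return {
        key: {
            "SUM": sum(values),
            "MIN": min(values),
            "MAX": max(values),
            "COUNT": len(values),
            # AVG is derived from SUM and COUNT.
            "AVG": sum(values) / len(values),
        }
        for key, values in groups.items()
    }


# Hypothetical fact rows with one dimension and one measure.
facts = [
    {"region": "north", "sales": 10},
    {"region": "north", "sales": 30},
    {"region": "south", "sales": 5},
]
summary = aggregate(facts, "region", "sales")
print(summary["north"])
```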
Heuristic parametrization of anisotropic diffusion filtering
ABSTRACT. Evolutionary computation methods search a space of candidate solutions for optimal values. In this work, we use evolutionary methods to set the optimal parameters of the anisotropic diffusion filter for images degraded with additive noise. The experiments show the potential of evolutionary methods to optimize the parametrization, with filter performance evaluated on a set of images degraded under controlled conditions and with a known ground truth.
Algorithms for the Unrelated Parallel Machine Scheduling Problem with Sequence Dependent Setup Times
ABSTRACT. This work studies the unrelated parallel machine scheduling problem with sequence dependent setup times (UPMSP). The objective considered is the minimization of the maximum completion time of the schedule, also known as the makespan. The problem is relevant in practice, since it arises widely in industrial settings, and in theory, because it belongs to the NP-Hard class. Several algorithms were implemented to find good solutions for the UPMSP and their results were compared. The F&O heuristic, with the VNS metaheuristic generating the initial solution, presented the best results and, in some instances, found better results than the best-known results available.
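The makespan objective can be sketched in a few lines of Python. This is only an evaluation function for a given schedule, not the F&O heuristic or the VNS metaheuristic. The names `makespan`, `proc` and `setup` are hypothetical, and the sketch assumes that processing times depend on the machine, that setup times depend on the machine and on the pair of consecutive jobs, and that the first job on a machine needs no setup.

```python
def makespan(schedule, proc, setup):
    """Maximum completion time over all machines.

    schedule[m]    : ordered list of jobs assigned to machine m
    proc[m][j]     : processing time of job j on machine m (unrelated machines)
    setup[m][i][j] : setup time on machine m when job j directly follows job i
    Assumption: no setup is needed before the first job on a machine.
    """
    finish_times = []
    for machine, jobs in enumerate(schedule):
        time = 0
        previous = None
        for job in jobs:
            if previous is not None:
                time += setup[machine][previous][job]
            time += proc[machine][job]
            previous = job
        finish_times.append(time)
    return max(finish_times, default=0)


# Toy instance: 2 machines, 3 jobs (all numbers are made up).
proc = [[3, 2, 4], [2, 5, 1]]
setup = [
    [[0, 1, 2], [1, 0, 1], [2, 1, 0]],
    [[0, 2, 1], [2, 0, 3], [1, 3, 0]],
]
first = makespan([[0, 1], [2]], proc, setup)
second = makespan([[1], [0, 2]], proc, setup)
print(first, second)
```

Comparing the two schedules shows why both the assignment and the order of jobs matter: the same three jobs give different makespans depending on where and in which sequence they run.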
CHAVE: Resource Consolidation with High Availability on Virtualized Environments
ABSTRACT. To meet the growing demand for business continuity, organizations are increasingly adopting cloud computing platforms to host their critical services. However, the availability rates established in service level agreements (SLAs) by cloud service providers (CSPs) do not always meet their demand for high availability (HA). Services replicated in a multi-AZ architecture result in high costs due to the inherent increase in the load on the physical servers, which in turn raises energy consumption. In this context, virtual machine (VM) consolidation stands out as an energy efficiency strategy based on virtual resource scheduling, which reduces energy consumption and improves the organization of fragmented resources. However, when consolidation is applied in conjunction with an HA mechanism, there is a risk of violating affinity (VM-server) and anti-affinity (VM-AZ) constraints, thereby violating SLA requirements. Thus, CHAVE presents an on-demand HA mechanism based on multi-AZ replication, which simultaneously performs a VM consolidation strategy isolated for each AZ, considering its inherent constraints. Numerical results from real-trace driven simulations show that CHAVE meets 20% of HA requests with energy consumption similar to that of a CSP that does not apply consolidation with replication. Additionally, CHAVE does not cause any SLA violations such as overcommitting or rejection of critical requests.