Tags: feature relevance analysis, interpretable models, least squares
Abstract:
Linear least squares is one of the most widely used regression methods across the sciences. Its simplicity makes it usable when data is scarce, and it appeals to practitioners who want to gain insight into a problem by inspecting the values of the learnt parameters. In this paper we propose a variant of the linear least squares model that lets practitioners partition the input features into groups of variables that they require to contribute similarly to the final result. We formally show that the new formulation is not convex and provide two alternative methods to solve the problem: an approximate method based on alternating least squares, and an exact method based on a reformulation of the problem as an exponential number of sub-problems, the best of whose solutions is guaranteed to be optimal. We prove the correctness of the exact method and compare the two approaches, showing that, when the number of partitions is small, the exact method provides better results in a fraction of the time required by alternating least squares.
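To make the alternating idea concrete, the following is a minimal sketch and not the method of the paper. The abstract does not state the model, so the sketch assumes one plausible formalization: each feature i belongs to a group g(i), carries a within-group weight alpha_i, and each group k has a single coefficient beta_k, giving the prediction sum_i beta_{g(i)} * alpha_i * x_i. The names `solve_least_squares`, `predict` and `partitioned_als` are hypothetical, and any sign or non-negativity constraint the actual formulation may place on the weights is not enforced here.

```python
import random

def solve_least_squares(A, y, ridge=1e-9):
    """Minimize ||A w - y||^2 via the normal equations and Gauss-Jordan elimination."""
    n = len(A[0])
    # Augmented system [A^T A + ridge*I | A^T y]; the tiny ridge guards against singularity.
    G = [[sum(r[i] * r[j] for r in A) + (ridge if i == j else 0.0) for j in range(n)]
         + [sum(r[i] * t for r, t in zip(A, y))] for i in range(n)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(G[r][c]))  # partial pivoting
        G[c], G[p] = G[p], G[c]
        for r in range(n):
            if r != c:
                f = G[r][c] / G[c][c]
                G[r] = [a - f * b for a, b in zip(G[r], G[c])]
    return [G[i][n] / G[i][i] for i in range(n)]

def predict(X, groups, alpha, beta):
    # Assumed model: feature i contributes beta[group of i] * alpha[i] * x_i.
    return [sum(beta[g] * a * x for x, a, g in zip(row, alpha, groups)) for row in X]

def partitioned_als(X, y, groups, n_groups, iters=5):
    sizes = [groups.count(k) for k in range(n_groups)]
    alpha = [1.0 / sizes[g] for g in groups]  # start from uniform within-group weights
    history = []
    for _ in range(iters):
        # beta-step: with alpha fixed, each group collapses to one synthetic feature.
        Z = [[sum(a * x for x, a, g in zip(row, alpha, groups) if g == k)
              for k in range(n_groups)] for row in X]
        beta = solve_least_squares(Z, y)
        # alpha-step: with beta fixed, the model is linear in alpha.
        W = [[beta[g] * x for x, g in zip(row, groups)] for row in X]
        alpha = solve_least_squares(W, y)
        # Rescale so each group's weights sum to 1; beta absorbs the scale,
        # so the predictions are unchanged.
        for k in range(n_groups):
            s = sum(a for a, g in zip(alpha, groups) if g == k)
            if abs(s) > 1e-12:
                alpha = [a / s if g == k else a for a, g in zip(alpha, groups)]
                beta[k] *= s
        residuals = [p - t for p, t in zip(predict(X, groups, alpha, beta), y)]
        history.append(sum(r * r for r in residuals))
    return alpha, beta, history

# Noise-free synthetic data (illustrative values only, not from the paper).
rng = random.Random(0)
groups = [0, 0, 1, 1]
true_alpha, true_beta = [0.25, 0.75, 0.5, 0.5], [2.0, -3.0]
X = [[rng.uniform(-1, 1) for _ in groups] for _ in range(20)]
y = predict(X, groups, true_alpha, true_beta)
alpha, beta, history = partitioned_als(X, y, groups, n_groups=2)
```

Each half-step is an ordinary least squares problem, but the joint problem is bilinear in the two sets of parameters, which is consistent with the non-convexity stated in the abstract. On noise-free data such as the above the alternation is only a sanity check; on noisy data it carries no guarantee of reaching the global optimum.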