SDA 2020: Symbolic Data Analysis Workshop 2020, Caserta, Italy, June 11-12, 2020

Submission link: https://easychair.org/conferences/?conf=sda2020

Abstract registration deadline: January 31, 2020

Submission deadline: March 31, 2020

## What is Symbolic Data Analysis?

Data Science, considered as a science by itself, is, in general terms, the extraction of knowledge from data. Symbolic data analysis (SDA) gives a new way of thinking in Data Science by extending the standard input to a set of classes of individual entities. Hence, classes of a given population are considered to be units of a higher-level population to be studied. Such classes often represent the real units of interest. In order to take variability between the members of each class into account, classes are described by intervals, distributions, sets of categories or numbers (sometimes weighted), and the like. In that way, we obtain new kinds of data, called ‘symbolic’ as they cannot be reduced to numbers without losing much information. The first step in SDA is to build the symbolic data table, where the rows are classes and the variables can take symbolic values. The second step is to study and extract new knowledge from these new kinds of data by at least an extension of Computer Statistics and Data Mining to symbolic data. SDA is a new paradigm which opens up a vast domain of research and applications by giving complementary results to classical methods applied to standard data. SDA also gives answers to big data and complex data challenges, as big data can be reduced and summarized by classes, and as complex data with multiple unstructured data tables and unpaired variables can be transformed into a structured data table with paired symbolic-valued variables.

Diday, E. (2016). Thinking by classes in data science: the symbolic data analysis paradigm. WIREs Computational Statistics, 8, 172–205. doi: 10.1002/wics.1384
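The first step described above, building a symbolic data table from individual-level records, can be illustrated with a minimal Python sketch. The records, the class labels, and the variable names `height_interval` and `colour_distribution` are hypothetical and chosen only for illustration. The choice of a min-max interval and relative frequencies is one simple way to summarize a class, not the only one.

```python
from collections import Counter, defaultdict

# Hypothetical individual-level records: (class label, numeric value, category).
records = [
    ("species_A", 4.0, "red"),
    ("species_A", 6.0, "red"),
    ("species_A", 5.0, "blue"),
    ("species_B", 1.5, "blue"),
    ("species_B", 2.5, "blue"),
]


def build_symbolic_table(rows):
    """Aggregate individuals into classes described by symbolic values."""
    groups = defaultdict(list)
    for label, value, category in rows:
        groups[label].append((value, category))

    table = {}
    for label, members in groups.items():
        values = [v for v, _ in members]
        counts = Counter(c for _, c in members)
        n = len(members)
        table[label] = {
            # Interval-valued variable: range of the numeric values in the class.
            "height_interval": (min(values), max(values)),
            # Distribution-valued variable: relative frequency of each category.
            "colour_distribution": {c: k / n for c, k in counts.items()},
        }
    return table


symbolic_table = build_symbolic_table(records)
for label, description in symbolic_table.items():
    print(label, description)
```

Each row of the resulting table is a class rather than an individual, and each cell holds an interval or a distribution instead of a single number.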

Symbolic Data Analysis (SDA) provides a framework for the representation and analysis of data that comprehends inherent variability. While in Data Mining and classical Statistics the data to be analyzed usually presents one single value for each variable, that is no longer the case when the entities under analysis are not single elements, but groups gathered on the basis of some given criteria. Then, for each variable, variability inherent to each group should be taken into account. Also, when analyzing concepts, such as botanic species, disease descriptions, car models, and so on, data entail intrinsic variability, which should be explicitly considered. To this purpose, new variable types have been introduced, whose realizations are not single real values or categories, but sets, intervals, or, more generally, distributions over a given domain. SDA provides methods for the (multivariate) analysis of such data, where the variability expressed in the data representation is taken into account, using various approaches.

Brito, P. (2014). Symbolic data analysis: Another look at the interaction of data mining and statistics. WIREs Data Mining and Knowledge Discovery, 4(4), 281–295. doi: 10.1002/widm.1133
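To make the idea of analyzing interval-valued data concrete, the following minimal sketch compares two intervals directly instead of reducing each to a single number. It assumes the Hausdorff distance between closed intervals as the dissimilarity. That choice is made here purely for illustration and is not named in the text above. The function names and the example intervals are hypothetical.

```python
def hausdorff_distance(a, b):
    """Hausdorff distance between two closed intervals given as (low, high)."""
    (a_lo, a_hi), (b_lo, b_hi) = a, b
    return max(abs(a_lo - b_lo), abs(a_hi - b_hi))


def midpoint_distance(a, b):
    """Distance between interval midpoints, which ignores within-group spread."""
    return abs((a[0] + a[1]) / 2 - (b[0] + b[1]) / 2)


# Two hypothetical groups with the same midpoint but different variability.
narrow = (4.0, 6.0)
wide = (2.0, 8.0)

print(midpoint_distance(narrow, wide))   # midpoints coincide
print(hausdorff_distance(narrow, wide))  # the difference in spread is detected
```

The two intervals share a midpoint, so a comparison based on single values treats the groups as identical, while the interval-based distance reflects their different variability.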

## Submission Guidelines

All papers must be original and not simultaneously submitted to another journal or conference. The following paper categories are welcome:

**TBA**

## List of Topics

- TBA

## Committees

### Program Committee

- Antonio Irpino (Unicampania)
- TBA

### Organizing Committee

- Antonio Irpino (Unicampania)
- Antonio Balzanella (Unicampania)
- TBA

## Venue

The workshop will be held in Caserta, Italy.

## Contact

All questions about submissions should be emailed to antonio.irpino@unicampania.it.