Séminaire de Probabilités et Statistique
lundi 26 novembre 2012 à 15:00 - UM2 - Bât 09 - Salle 331 (3ème étage)
Chritophe Biernacki (Université Lille 1 / INRIA)
Model-based clustering for multivariate partial ranking data
We propose the first model-based clustering algorithm dedicated to multivariate partial ranking data. This is an extension of the Insertion Sorting Rank (ISR) model for ranking data, which is a meaningful and effective model obtained by modelling the ranking generating process assumed to be a sorting algorithm. The heterogeneity of the rank population is modelled by a mixture of ISR, whereas conditional independence assumption allows the extension to multivariate ranking. Maximum likelihood estimation is performed through a SEM-Gibbs algorithm, and partial rankings are considered as missing data, what allows to simulate them during the estimation process. After having validated the estimation algorithm on simulations, three real datasets are studied: the 1980 American Psychological Association (APA) presidential election votes, the results of French students to a general knowledge test and the votes of the European countries to the Eurovision song contest. For each application, the proposed model shows relevant adequacy and leads to significant interpretation. In particular, regional alliances between European countries are exhibited in the Eurovision contest, which are often suspected but never proved.