(Psycho-)Analysis of Benchmark Experiments: A Formal Framework for Investigating the Relationship between Data Sets and Learning Algorithms

Lecturer: 
Manuel J. A. Eugster
Event type: 
HIIT seminar
Event time: 
2012-10-22 13:15 to 14:00
Place: 
Lecture Hall T2, Konemiehentie 2
Description: 

Abstract:
It is common knowledge that the performance of different learning algorithms depends on certain characteristics of the data, such as dimensionality, linear separability or sample size. However, formally investigating this relationship in an objective and reproducible way is not trivial.

This talk proposes a new formal framework for describing the relationship between data set characteristics and the performance of different learning algorithms. The framework combines three ingredients: benchmark experiments, a formal description of data set characteristics by means of statistical and information-theoretic measures, and recursive partitioning of Bradley-Terry models for comparing the algorithms' performances.
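
As an illustration of the Bradley-Terry ingredient only, the following minimal Python sketch fits a plain Bradley-Terry model to pairwise comparison counts using the standard minorization-maximization update. It is not the speaker's implementation and leaves out the recursive partitioning step entirely. The function name `fit_bradley_terry`, the three unnamed algorithms, and the win counts are hypothetical and chosen purely for illustration.

```python
def fit_bradley_terry(wins, iterations=200):
    """Estimate Bradley-Terry strengths from a matrix of pairwise wins.

    wins[i][j] is the number of comparisons (e.g. data sets or resamples)
    in which algorithm i performed better than algorithm j.
    Assumes every algorithm has at least one win and one comparison.
    """
    n = len(wins)
    strengths = [1.0 / n] * n
    for _ in range(iterations):
        updated = []
        for i in range(n):
            total_wins = sum(wins[i][j] for j in range(n) if j != i)
            # Each pair contributes its number of comparisons, weighted
            # by the current combined strength of the two algorithms.
            denom = sum(
                (wins[i][j] + wins[j][i]) / (strengths[i] + strengths[j])
                for j in range(n)
                if j != i
            )
            updated.append(total_wins / denom)
        # Normalize so the strengths sum to one.
        norm = sum(updated)
        strengths = [s / norm for s in updated]
    return strengths


# Hypothetical win counts for three algorithms, ten comparisons per pair.
wins = [
    [0, 7, 9],
    [3, 0, 6],
    [1, 4, 0],
]
strengths = fit_bradley_terry(wins)
print(strengths)
```

Under the model, the estimated probability that algorithm i outperforms algorithm j is `strengths[i] / (strengths[i] + strengths[j])`. A recursive partitioning approach would, roughly speaking, fit such models separately within subgroups of data sets defined by their characteristics.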

The main advantages of this framework are its objectivity and reproducibility as well as its flexibility for detecting potentially complex combinations of data set characteristics related to systematic differences in the algorithms' performances.

Bio: Manuel Eugster is a postdoctoral researcher at Helsinki Institute for Information Technology HIIT, in the Statistical Machine Learning and Bioinformatics research group. He graduated from the Ludwig-Maximilians-University Munich with a doctoral degree in Statistics in March 2011. The work presented here is part of his PhD thesis, in which he developed a statistically sound framework to compare learning algorithms.

Host: Sohan Seth


Last updated on 16 Oct 2012 by Sohan Seth - Page created on 16 Oct 2012 by Sohan Seth