HIIT seminars in spring 2007 will be held in hall **B222** of Exactum,
on Fridays starting at 10:15 a.m. Coffee is available from 10:00.
Fri Mar 2
Arto Klami
Generative models that discover dependencies between data sets
Abstract:
We study a kind of data fusion problem where the aim is to find
dependencies between two (or in general more) data sets with
co-occurring paired samples. The underlying motivation is that if
several measurements have been tailored to measure the same phenomenon
from different views, then what the measurements have in common
is interesting. Variation that occurs in only one of the data sets is
assumed to be noise in this context, and when working with small data
sets we want to avoid modeling it.
Traditionally, dependencies are sought by explicitly maximizing a dependency measure, such as correlation or mutual information. Such approaches, however, are known to overfit severely, and consequently only rather simple models can be used. Recently a probabilistic interpretation of canonical correlation analysis (CCA) was presented, opening the way for more robust methods for the same task. In this talk I will present an extended version of the original generative model of CCA, describe a clustering model that uses the same underlying principle, and discuss a fully Bayesian variant of CCA.
Last updated on 26 Feb 2007 by Teija Kujala - Page created on 2 Mar 2007 by Teija Kujala