SULRSL

Supervised Unsupervised Learning and Relevant Subtask Learning (SULRSL) is a project of the Statistical Machine Learning and Bioinformatics group, funded by the Academy of Finland.

The project develops statistical machine learning methods that extract, from high-dimensional data sets, the regularities that are relevant to the analyst. We infer relevance from auxiliary information that comes with the data, such as class labels paired with input samples. In tasks like discriminative visualization, discriminative clustering, or discriminative feature extraction, the labels guide unsupervised analysis of the features; we call such tasks supervised unsupervised learning.

A common problem in both standard supervised learning and the newer supervised unsupervised learning is having too little labeled training data. The problem is particularly hard for the high-dimensional data in genome-wide studies of modern bioinformatics, but it also appears in image classification from few examples, retrieval of relevant texts, and similar tasks.

Fortunately, the world is full of potentially related "background" data sets: in bioinformatics, for instance, there are databases full of data measured for different tasks, conditions, or contexts; for texts there is the web. Our second research problem is whether we can solve the small-data problem by using these partially relevant data sets to build a better class-discriminative model for the test data.

To address this second problem, we have recently introduced a learning problem called relevant subtask learning. Relevant subtask learning makes use of other, potentially related "background" data sets: it simultaneously learns from a small data set and retrieves useful information from the other data sets. This scenario is a special kind of multi-task learning problem: in contrast to typical multi-task learning, our problem is fundamentally asymmetric and more structured.

More information about the project and its publications is available on the group page.


Last updated on 28 May 2008 by Antti Ajanki - Page created on 20 May 2008 by Antti Ajanki