A General Framework for Domain Adaptation in a Good Similarity-Based Projection Space

Lecturer : 
Emilie Morvant
Event type: 
HIIT seminar
Event time: 
2012-03-05 13:15 to 14:00
Place: 
Lecture hall T2, ICS department
Description: 

In the past few years, machine learning researchers have formalized the crucial scenario of domain adaptation as learning from a source distribution and evaluating on a somewhat different target distribution. Two settings are usually considered: in the first, labeled data are drawn only from the source sample, a setting often called unsupervised domain adaptation; in the second, labeled data are available in both the source and target samples, which corresponds to semi-supervised domain adaptation. We consider the challenging unsupervised case and extend our framework to the semi-supervised case by using a few target labels. From the theoretical standpoint of Ben-David et al. [1], a classifier has better generalization guarantees when the two marginal distributions are close.

We study a new direction based on a recent framework of Balcan et al. [2] that allows learning linear classifiers in an explicit projection space built from similarity functions, which need be neither symmetric nor positive semi-definite. We propose a general method for learning a low-error classifier on target data with generalization guarantees (in the robustness sense of Xu and Mannor [3]), and we improve its efficiency with an iterative procedure that reweights the similarity function - while remaining compatible with the framework of Balcan et al. [2] - so as to bring the two distributions closer in a new projection space. The hyperparameters and the quality of the reweighting are controlled by a reverse validation procedure. Our approach relies on a linear programming formulation and yields good adaptation performance with very sparse models. Finally, we evaluate our method on a synthetic problem and on a real image annotation task.
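As a rough illustration of the projection-space idea, the sketch below maps each point to its vector of similarities to a set of landmark points and then learns a sparse linear classifier by linear programming (L1-regularised hinge loss). This is only a minimal sketch, not the authors' implementation: the Gaussian-style similarity, the choice of training points as landmarks, and the regularisation value are arbitrary assumptions, and the reweighting and reverse-validation steps of the talk are omitted entirely.

```python
import numpy as np
from scipy.optimize import linprog

def similarity_projection(X, landmarks, sim):
    """Map each point x to (sim(x, l_1), ..., sim(x, l_d)) -- an explicit
    projection space in the spirit of Balcan et al.; sim need not be
    symmetric or positive semi-definite."""
    return np.array([[sim(x, l) for l in landmarks] for x in X])

def learn_sparse_linear(Phi, y, lam=0.01):
    """Hinge-loss classifier with L1 regularisation, posed as a linear program:
        min  lam * ||alpha||_1 + sum_i xi_i
        s.t. y_i * (alpha . phi_i) >= 1 - xi_i,   xi_i >= 0.
    alpha is split into nonnegative parts a+ and a- to keep the program
    linear; the L1 term is what encourages very sparse models."""
    n, d = Phi.shape
    c = np.concatenate([lam * np.ones(2 * d), np.ones(n)])
    # y_i*(a+ - a-).phi_i + xi_i >= 1  <=>  -y_i*phi_i.a+ + y_i*phi_i.a- - xi_i <= -1
    A_ub = np.hstack([-(y[:, None] * Phi), y[:, None] * Phi, -np.eye(n)])
    b_ub = -np.ones(n)
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=(0, None), method="highs")
    return res.x[:d] - res.x[d : 2 * d]

# Toy demo: two well-separated classes, training points reused as landmarks,
# and a Gaussian-like similarity (one arbitrary choice among many).
X = np.array([[2.0, 2.0], [2.5, 1.5], [1.5, 2.5],
              [-2.0, -2.0], [-2.5, -1.5], [-1.5, -2.5]])
y = np.array([1, 1, 1, -1, -1, -1])
sim = lambda a, b: np.exp(-np.sum((a - b) ** 2))
Phi = similarity_projection(X, X, sim)
alpha = learn_sparse_linear(Phi, y)
pred = np.sign(Phi @ alpha)
```

In the actual method, the similarity would additionally be reweighted at each iteration to bring the source and target distributions closer in the projection space.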

Part of this work was published in Proc. of ICDM 2011. An author manuscript is available here: http://hal.archives-ouvertes.fr/index.php?halsid=fpps8ig19b0vo8ampct290dep6&view_this_doc=hal-00629207&version=1

[1] S. Ben-David, J. Blitzer, K. Crammer, A. Kulesza, F. Pereira, and J.W. Vaughan. A theory of learning from different domains. Machine Learning Journal, 79(1-2):151-175, 2010.

[2] M.-F. Balcan, A. Blum, and N. Srebro. Improved guarantees for learning via similarity functions. In Proceedings of COLT, 2008.

[3] H. Xu and S. Mannor. Robustness and generalization. In Proceedings of COLT, 2010.


Last updated on 6 Feb 2012 by Sohan Seth - Page created on 6 Feb 2012 by Sohan Seth