Multi-source Probabilistic Inference

The Multi-source Probabilistic Inference group develops probabilistic machine learning models and inference techniques for analyzing and understanding complex heterogeneous data collections. For most data analysis tasks it is beneficial to jointly analyze all available data, but often the different data sources are not directly commensurable. For example, a data scientist studying the demographics of a neighborhood might have static spatial information about the buildings, dynamic group-level information on public transportation, large collections of time-stamped and user-specific social media content both as text and as images, and perhaps even some interview questionnaires. All of these sources provide information on the demographics, but standard modeling tools offer little help in combining them into an overall picture.

The goal of this research group is to overcome the theoretical and practical challenges involved in integrating such heterogeneous data sources, by building statistical models for various types of data and especially hierarchical models for their joint analysis, even in cases where there are no obvious ways of linking the sources with each other. We also develop statistical machine learning methods more generally, focusing on computationally efficient approximate inference, transfer learning, domain adaptation, and other techniques crucial for learning complex models (including deep neural networks) from limited data collections.

The group is part of the Finnish Center for Artificial Intelligence (FCAI).

Open positions

There are currently no open calls, but we are constantly looking for talented postdoctoral researchers and PhD students to join the group. At both levels we consider candidates with an existing machine learning research track record as well as those switching from other computational fields (physics, statistics, mathematics, economics, ...) to machine learning and artificial intelligence. Contact the group leader by email if interested.

Research themes

  • Multi-view learning, data integration, cross-domain object matching
  • Approximate Bayesian inference: MCMC, variational approximation
  • Nonparametric Bayesian modeling
  • Scalable probabilistic models, probabilistic programming


Projects

  • Traces of Information: Intelligence from fragmented data (Academy of Finland, 2013-2019)
  • RAB-ML: Robust Automatic Bayesian Machine Learning (Academy of Finland, 2018-2019, with Aki Vehtari and Antti Honkela)
  • SPA: Scalable Probabilistic Analytics (Tekes, 2016-2018, with Petri Myllymäki and Teemu Roos)
  • ILCIS: Improved Learning by Combining Information Sources (Xerox Foundation, 2013-2015)


Members

  • Arto Klami, PhD, Assistant Professor, Academy Research Fellow
  • Aditya Jitta, Doctoral student
  • Joseph Sakaya, Doctoral student
  • Jarkko Lagus, Doctoral student
  • Krista Longi, Doctoral student
  • Chang Rajani, Research assistant

Alumni (MSc+)

  • Liye He, MSc
  • Johannes Sirola, MSc

Selected recent publications

  • Semi-supervised convolutional neural networks for identifying Wi-Fi interference sources. Krista Longi, Teemu Pulkkinen and Arto Klami. In Proceedings of Asian Conference on Machine Learning (ACML), 2017. [pdf]
  • Importance sampled stochastic optimization for variational inference. Joseph Sakaya and Arto Klami. In Proceedings of Uncertainty in Artificial Intelligence (UAI), 2017. [pdf, code]
  • Partially hidden Markov models for privacy-preserving modeling of indoor trajectories. Aditya Jitta and Arto Klami. Neurocomputing, 2017. [doi]
  • Probabilistic size-constrained microclustering. Arto Klami and Aditya Jitta. In Proceedings of Uncertainty in Artificial Intelligence (UAI), 2016. [pdf]
  • Using regression makes extraction of shared variation in multiple datasets easy. Jussi Korpela, Andreas Henelius, Lauri Ahonen, Arto Klami, and Kai Puolamäki. Data Mining and Knowledge Discovery, 2016. [html]
  • Towards brain-activity-controlled information retrieval: Decoding image relevance from MEG signals. Jukka-Pekka Kauppi, Melih Kandemir, Veli-Matti Saarinen, Lotta Hirvenkari, Lauri Parkkonen, Arto Klami, Riitta Hari, and Samuel Kaski. Neuroimage, 2015. [doi]
  • Group factor analysis. Arto Klami, Seppo Virtanen, Eemeli Leppäaho, and Samuel Kaski. IEEE Transactions on Neural Networks and Learning Systems, 2015. [preprint]
  • Latent-feature regression for multivariate count data. Arto Klami, Johannes Sirola, Lauri Väre, Abhishek Tripathi, and Frederic Roulland. In Proceedings of Artificial Intelligence and Statistics, 2015. [pdf]
  • Group-sparse embeddings in collective matrix factorization. Arto Klami, Guillaume Bouchard, and Abhishek Tripathi. In Proceedings of International Conference on Learning Representations, 2014.
  • Bayesian object matching. Arto Klami. Machine Learning, 92(2):225-250, 2013. [doi:10.1007/s10994-013-5357-4, preprint]
  • Bayesian canonical correlation analysis. Arto Klami, Seppo Virtanen, and Samuel Kaski. Journal of Machine Learning Research, 14:965-1003, 2013. [pdf]

Last updated on 21 Mar 2018 by Arto Klami - Page created on 21 Jan 2014 by Arto Klami