Zhirong Yang, Professor of Computer Science, Aalto University
Learning Data Representation by Large-Scale Neighbor Embedding
Abstract: Machine learning, the state of the art in data science, increasingly influences our lives. Encoding data in a suitable vector space is the fundamental starting point for machine learning. A good vector coding should respect the relations among the data items. However, conventional methods that preserve pairwise or higher-order relationships are very slow and can consequently handle only small-scale data sets. We have been developing a family of unsupervised methods called large-scale Neighbor Embedding (NE), which substantially accelerate the vector coding. Our method can thus learn low-dimensional vector representations for mega-scale data according to their neighborhoods in the original space. With our efficient algorithms and a wealth of neighborhood information, large-scale Neighbor Embedding significantly outperforms small-scale NE and many other existing approaches for learning data representation. Besides generic feature extraction, our work also delivers two important tools as special cases of Neighbor Embedding, for data visualization and cluster analysis. These tools scale up both applications by an order of magnitude and make visualization and clustering of current-sized data sets feasible for interactive use. Because neighborhood information is naturally and massively available in many areas, our method has wide applications as a critical component in scientific research, next-generation DNA sequence analysis, natural language processing, educational cloud, financial data analysis, market studies, etc.
Machine Learning Coffee seminars are weekly seminars held jointly by Aalto University and the University of Helsinki. The seminars aim to bring together people from different fields of science who are interested in machine learning. Talks begin at 9:15 am; porridge and coffee are served from 9:00 am.
Next talks:
* We'll have a summer break and continue on September 4, 2017 *
Welcome!
Last updated on 16 May 2017 by Noora Suominen de Rios - Page created on 16 May 2017 by Noora Suominen de Rios