Machine Learning Coffee seminar: "Machine Learning using Unreliable Components: From Matrix Operations to Neural Networks and Stochastic Gradient Descent"

Submitted by troos on April 13, 2018 - 18:50

Lecturer:

Sanghamitra Dutta

Event type:

HIIT seminar

Event time:

2018-04-16 09:15 to 10:00

Place:

Aalto University, Konemiehentie 2, seminar room T6

Web page:

Machine Learning Coffee seminar

Description:

Sanghamitra Dutta, Carnegie Mellon University

Machine Learning using Unreliable Components: From Matrix Operations to Neural Networks and Stochastic Gradient Descent

Reliable computation at scale is one key challenge in large-scale machine learning today. Unreliability in computation can manifest itself in many forms, e.g., (i) "straggling" of a few slow processing nodes, which can delay the entire computation, as in synchronous gradient descent; (ii) processor failures; (iii) "soft errors," which are undetected errors where nodes can produce garbage outputs. My focus is on the problem of training using unreliable nodes.

First, I will introduce the problem of training model-parallel neural networks in the presence of soft errors. This problem was in fact the motivation of von Neumann's 1956 study, which started the field of computing using unreliable components. We propose "CodeNet", a unified, error-correction-coding-based strategy that is woven into the linear algebraic operations of neural network training to provide resilience to errors in every operation during training. I will also survey some of the notable results in the emerging area of "coded computing," including my own work on matrix-vector and matrix-matrix products, that outperform classical results in fault-tolerant computing by arbitrarily large factors in expected time. Next, I will discuss the error-runtime trade-offs of various data-parallel approaches to training machine learning models in the presence of stragglers, in particular synchronous and asynchronous variants of SGD. Finally, I will discuss some open problems in this exciting and interdisciplinary area.
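
As a rough illustration of the general "coded computing" idea for matrix-vector products (a textbook-style sketch, not the CodeNet strategy or any result from the talk), the matrix can be split into two row blocks plus one parity block, so that the product can be recovered from any two of three workers and a single straggler or failed node does not delay the result. The function names `matvec`, `encode`, and `decode`, and the choice of a simple sum parity, are assumptions made for this example.

```python
from itertools import combinations


def matvec(rows, x):
    """Plain matrix-vector product on lists of rows."""
    return [sum(a * b for a, b in zip(row, x)) for row in rows]


def encode(A):
    """Split A into two row blocks and add a parity block A1 + A2.

    Assumes A has an even number of rows. Worker 0 gets A1,
    worker 1 gets A2, worker 2 gets the parity block.
    """
    half = len(A) // 2
    A1, A2 = A[:half], A[half:]
    parity = [[a + b for a, b in zip(r1, r2)] for r1, r2 in zip(A1, A2)]
    return {0: A1, 1: A2, 2: parity}


def decode(results):
    """Recover A @ x from the partial products of any two workers.

    `results` maps worker id -> partial product for the workers that
    finished; the missing block is rebuilt from the parity result.
    """
    if len(results) < 2:
        raise ValueError("need results from at least two workers")
    if 0 in results and 1 in results:
        top, bottom = results[0], results[1]
    elif 0 in results:
        top = results[0]
        bottom = [p - t for p, t in zip(results[2], top)]
    else:
        bottom = results[1]
        top = [p - b for p, b in zip(results[2], bottom)]
    return top + bottom


A = [[1, 2], [3, 4], [5, 6], [7, 8]]
x = [1, -1]
tasks = encode(A)
expected = matvec(A, x)

# Whichever single worker straggles, the other two are enough.
for finished in combinations(tasks, 2):
    partial = {w: matvec(tasks[w], x) for w in finished}
    assert decode(partial) == expected
```

The cost of this toy scheme is 50% extra computation in exchange for tolerating one slow or failed worker; plain replication would need twice the work to give the same guarantee for either block.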

Parts of this work have been accepted at AISTATS 2018 and ISIT 2018.

Machine Learning Coffee seminars are weekly seminars held jointly by Aalto University and the University of Helsinki. The seminars aim to gather people from different fields of science with an interest in machine learning. Talks begin at 9:15 am, and porridge and coffee are served from 9:00 am.

Welcome!

Last updated on 13 Apr 2018 by Teemu Roos - Page created on 13 Apr 2018 by Teemu Roos

HIIT

Otaniemi

T building: Aalto University, Otaniemi campus, Computer Science building, Konemiehentie 2, 02150 Espoo.

Kumpula

Exactum building: University of Helsinki, Kumpula campus, Gustaf Hällströmin katu 2b, 00560 Helsinki
