Title: Knowledge Discovery and Management in Life Sciences
Time: July 8, 2009 at 10.15
Place: C222, Exactum building, Kumpula Campus, University of Helsinki
Presenter: Dr. Fazel Famili
Knowledge Discovery Group
Institute for Information Technology
National Research Council of Canada
Ottawa, Canada
Abstract:
Many applications of knowledge discovery have emerged to deal with analyzing real-world data. Particular examples are related to life sciences, physical systems (sensor-based systems) and financial data. One of the more complex of these examples is the life sciences domain, where one tries to analyze and integrate large amounts of high-throughput genomics and proteomics data obtained from either single-time-point or time-series applications. As in many other domains, various methods have been developed in life sciences, and many data mining tools (commercial and non-commercial) have been introduced. These applications have all contributed to tasks such as: (i) identification of certain genes or protein functions, and (ii) understanding the molecular mechanisms of certain species and their associated biological pathways. One question remains, however: with this wealth of newly discovered or existing knowledge, what is the best way to manage all this knowledge once it is validated? This has been one of the motivations behind several data mining research projects that we have initiated. Here, in addition to searching for patterns in genomics and proteomics data, we have been working on identifying proper ways to represent, structure, and distribute all forms of knowledge.
This talk consists of two parts. In part one, we provide an overview of knowledge discovery focusing on life sciences and describe the main motivations for developing and applying knowledge discovery methods to analyze complex biological data. We also briefly describe a few of our case studies in which we have analyzed high-throughput biological data using unsupervised or supervised machine learning techniques. These are cases in which real biological data sets (obtained from public or private sources) have been analyzed and studied for tasks such as gene function identification and gene response analysis. In part two, we describe how discovered and validated knowledge could be structured into knowledge bases, where it can be integrated with other forms of knowledge for dissemination to multiple users. We conclude our talk with some lessons learned and the research directions that we are currently pursuing.
Short Bio: Dr. A. Famili is a Senior Research Scientist, Group Leader for the Knowledge Discovery Group and a leading data mining expert working at the Institute for Information Technology (IIT) of the National Research Council of Canada (NRC), where he has been for the last 24 years. Prior to joining NRC, he worked in industry for 3 years.
Dr. Famili has been actively involved in the fields of Artificial Intelligence, Data Mining and Bioinformatics and in the successful application of these technologies. He has a strong data mining and bioinformatics team within IIT that is currently engaged in unique research and development in data mining for genomics, proteomics and health care. His research has been on data mining, machine learning and bioinformatics and their applications to real-world problems in various data-rich environments, such as life sciences. Dr. Famili has edited two books, published over 50 articles in the area of data mining and AI, and holds a US data mining patent. He has organized many workshops, has been involved in a number of data mining and AI conferences, and has extensive collaborations with a number of institutes in Canada, Europe and South America. He is also on the editorial board of four scientific journals and is an adjunct professor at SITE (School of Information Technology and Engineering) and the Institute of System Biology at the University of Ottawa.
Last updated on 2 Jul 2009 by Visa Noronen - Page created on 8 Jul 2009 by Visa Noronen