To perform such tasks, a profile of the user's interests must be created. This tutorial focuses on how user profiles are learned and represented, how user feedback is collected, and how information sources are represented. It first reviews findings from several decades of information retrieval research, with an emphasis on approaches to information filtering and classification. Next, machine learning approaches to classification are described, including decision trees, nearest neighbor algorithms, Bayesian classifiers, and neural networks, along with how they may be used to learn user profiles. The relationship between machine learning and classic approaches from information retrieval is then discussed. Finally, recent developments are described, such as collaborative filtering, efficient rule learners, combining multiple models, weighted majority algorithms, and infinite attribute models.
The technology will be illustrated with examples from a variety of
information agents including LIRA, NewsWeeder, WebWatcher, WebDoggie,
InfoFinder, Inquery, Letizia, firefly, Syskill & Webert,
DICA and the Remembrance Agent.
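
As a rough illustration of what learning a user profile from feedback can look like, the following minimal sketch trains a word-based naive Bayesian classifier on a few rated documents and scores new text by the log-odds that the user would like it. It is not taken from any of the agents named above. The function names (train_profile, interest_score), the sample ratings, and the add-one smoothing are all assumptions made for this example.

```python
import math
from collections import Counter


def train_profile(rated_docs):
    """Build a profile from (text, liked) pairs, where liked is True or False."""
    word_counts = {True: Counter(), False: Counter()}
    doc_counts = Counter()
    for text, liked in rated_docs:
        doc_counts[liked] += 1
        word_counts[liked].update(text.lower().split())
    vocab = set(word_counts[True]) | set(word_counts[False])
    return word_counts, doc_counts, vocab


def interest_score(profile, text):
    """Return log P(liked | text) - log P(disliked | text) under naive Bayes.

    A positive score means the text looks more like the liked documents.
    """
    word_counts, doc_counts, vocab = profile
    total_docs = sum(doc_counts.values())
    score = 0.0
    for label, sign in ((True, 1.0), (False, -1.0)):
        # Class prior, with add-one smoothing (an assumption of this sketch).
        log_p = math.log((doc_counts[label] + 1) / (total_docs + 2))
        total_words = sum(word_counts[label].values())
        for word in text.lower().split():
            if word in vocab:  # words never seen in training are ignored
                count = word_counts[label][word]
                log_p += math.log((count + 1) / (total_words + len(vocab)))
        score += sign * log_p
    return score


# Hypothetical feedback: True marks a page the user rated as interesting.
feedback = [
    ("machine learning classifiers for text", True),
    ("neural networks and decision trees", True),
    ("celebrity gossip and fashion news", False),
    ("sports scores and celebrity news", False),
]
profile = train_profile(feedback)

for page in ("learning decision trees", "celebrity fashion"):
    print(page, interest_score(profile, page))
```

Ranking unseen pages by this score is one simple way a profile could drive filtering. Systems in practice differ in how they select features and gather ratings.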
Prerequisite Knowledge
This tutorial is intended for practitioners and researchers interested
in applying machine learning and information retrieval algorithms to
the classification and ranking of information on the Internet. Basic
familiarity with mathematics and probability is assumed.
About the Lecturers
Michael Pazzani
received an M.S. degree in computer science specializing in Natural
Language Processing in 1980, and a Ph.D. in computer science
specializing in Machine Learning from UCLA in 1987. He is now a
professor and department chair of Information and Computer Science at
the University of California, Irvine. He has been active in Machine
Learning research for the past decade, with numerous publications at
IJCAI, AAAI, Cognitive Science, and the International Machine Learning
Conferences.