Machine Learning fall 2020/21

Date:

11 & 12, 18 & 19 November 2020

Time:

10.00 - 16.00 h

Location:

Utrecht

Lecturer:

Prof. Inneke Van Nieuwenhuyse (Hasselt University) & Dr. Rui Jorge Almeida (Maastricht University)

Days:

4

ECTS:

1 (participating only) – 4 (participating + passing the assignment)

Course fee:

Free for TRAIL/Beta/OML members; others please contact the TRAIL office.

Registration:

= This course is fully booked =

This course was originally scheduled for April & May 2020. Because of the coronavirus it has been rescheduled to November. All participants of the April & May edition have been informed by e-mail.

For a place on the waiting list, please send an e-mail to info@rstrail.nl.

Objectives:

The objective of this course is to provide knowledge of machine learning models and techniques. Both fundamentals and practical applications are discussed.

After successful completion of this course, students will be able to:

  • Implement different machine learning models.
  • Understand how data can be used to provide new insights into problems.
  • Compare different techniques and algorithms in advanced machine learning.
  • Understand how to choose a model to describe a particular type of data or problem.
  • Evaluate machine learning models in practice.
  • Understand the mathematics necessary for constructing novel machine learning solutions.
  • Design and implement various machine learning algorithms in a range of real-world applications.

Course description:

This course introduces the fundamental methods of machine learning and statistical pattern recognition. It covers both the theoretical foundations and the implementation of machine learning in the data mining context. We will analyze data to create predictive and prescriptive models with (un)supervised machine learning methods, such as regression, clustering, tree-based methods, ensemble methods, support vector machines, (deep) neural networks, and Gaussian processes.

This course will introduce the end-to-end process of investigating data through machine learning methodology. The goal is either to generate preliminary insights in an area where little was known beforehand, or to predict future observations accurately. This includes methods to extract and identify useful features that best represent your data.

The sessions will focus on theoretical aspects of machine learning methods and algorithms, but also on hands-on experience using a suitable programming language. The course will not focus on particular applications; that is the objective of the optional project, where you can apply the foundations provided to an application from your own scientific area. Examples of machine learning applications in the operations management and logistics field are provided during the sessions.

Assignment:

A number of mandatory hand-in assignments are provided in each session. Students will work on the assignments during the meetings and can complete them at home. An optional project is available in which you are required to use the topics discussed in this course to analyze data for an application from your own scientific area. The objective of this final project is to explore new research in machine learning. A starting point could be replicating a paper, adding your own meaningful analysis, comparing it with other papers or applying the methods to a completely new application. There are two deliverables: a project proposal and a final report.

Program:

  • Day 1: Introduction to machine learning concepts. Tree-based methods and ensembles. Assessing model accuracy and model interpretability.
  • Day 2: Support vector machines. Resampling methods. Model selection and regularization. Clustering.
  • Day 3: Neural networks. Convolutional and recurrent neural networks.
  • Day 4: Gaussian process regression and classification. Introduction to Bayesian optimization.

Course material:

The material for this course includes selected papers and chapters from:

  • Hastie, T., Tibshirani, R., Friedman, J. (2001). The Elements of Statistical Learning. New York, NY, USA: Springer New York Inc. ISBN: 978-0-387-84858-7.
  • James, G., Witten, D., Hastie, T., Tibshirani, R. (2013). An Introduction to Statistical Learning: with Applications in R. New York: Springer. ISBN: 978-1461471370.
  • Goodfellow, I., Bengio, Y., Courville, A. (2016). Deep Learning. MIT Press. ISBN: 978-0-262-035613. www.deeplearningbook.org.
  • Murphy, K.P. (2012). Machine Learning: A Probabilistic Perspective. MIT Press. ISBN: 978-0-262-01802-9.
  • Han, J., Pei, J., Kamber, M. (2011). Data Mining: Concepts and Techniques. Elsevier. ISBN: 978-1-55860-901-3.
  • Rasmussen, C.E., Williams, C.K.I. (2005). Gaussian Processes for Machine Learning (Adaptive Computation and Machine Learning). The MIT Press. www.gaussianprocess.org/gpml/

A selected list of papers will be made available at the beginning of the course.

Prerequisites:

Students need to have a solid background in statistics, probability theory, linear algebra, continuous mathematics, multivariate calculus and multivariate probability theory. In addition, students should have a solid foundation in a programming language such as Matlab, Python or R, using procedural, functional or object-oriented paradigms. Students can use their language of choice, but they should ensure that the methods discussed in the sessions are available in their preferred language. For neural networks, Python is highly recommended. For Gaussian processes, Matlab is highly recommended. Students are encouraged to follow online courses in Python and Matlab as preparation. Students are not required to have any prior knowledge of machine learning.
