Machine Learning spring 2020/21

Date:

10, 17, 24 February & 2 March 2021

Time:

10.00 - 16.00 h

Location:

Online

Lecturer:

Prof. Inneke Van Nieuwenhuyse (Hasselt University) & Dr. Rui Jorge Almeida (Maastricht University)

Days:

ECTS:

2 (participating only) – 4 (participating + passing the assignment)

Course fee:

free for TRAIL/Beta/OML members, others please contact the TRAIL office

Registration:

This course was originally scheduled for June 2020. Because of the COVID-19 measures it was rescheduled to February/March 2021. All participants of the June edition have been informed by e-mail and have first choice to participate.

The February edition of this course is fully booked!
There is also a very long waiting list for this course.

Objectives:

The objective of this course is to provide knowledge of machine learning models and techniques. Both fundamentals and practical applications are discussed.

After a successful completion of this course, students will be able to:
- Implement different machine learning models.
- Understand how data can be used to provide new insights into problems.
- Compare different techniques and algorithms in advanced machine learning.
- Understand how to choose a model to describe a particular type of data or problem.
- Evaluate machine learning models in practice.
- Understand the mathematics necessary for constructing novel machine learning solutions.
- Design and implement various machine learning algorithms in a range of real-world applications.

Course description:

This course introduces the fundamental methods of machine learning and statistical pattern recognition. It covers both the theoretical foundations and the implementation of machine learning in the data mining context. We will analyze data to create predictive and prescriptive models with (un)supervised machine learning methods, such as regression, clustering, tree-based methods, ensemble methods, (deep) neural networks, and Gaussian processes.
This course will introduce the end-to-end process of investigating data through machine learning methodology. The goal is either to generate preliminary insights in an area where little was known beforehand, or to predict future observations accurately. This includes methods to extract and identify useful features that best represent your data.
The sessions will focus on theoretical aspects of machine learning methods and algorithms, but also on hands-on experience using a suitable programming language. The course will not focus on particular applications. That is the objective of the optional project, where you can use the foundations provided for an application from your own scientific area. Examples of machine learning applications in the operations management and logistics field are provided during the sessions.

Assignment:

A number of mandatory assignments are provided in each session. Students will work on the assignments before the meetings and can complete them at home. An optional project is available in which you use the topics discussed in this course to analyze data for an application from your own scientific area. The objective of this final project is to explore new research in machine learning. A starting point could be replicating a paper, adding your own meaningful analysis, comparing it with other papers or applying the methods to a completely new application. There are two deliverables: a project proposal and a final report.

Program:

Day 1: Introduction to machine learning concepts. Tree-based methods and ensembles. Assessing model accuracy and model interpretability. (Instructor: Rui Jorge Almeida, Maastricht University)

Day 2: Gaussian process regression. Introduction to Bayesian optimization. (Instructor: Inneke Van Nieuwenhuyse, Hasselt University)

Day 3: Neural networks. Feedforward neural networks, hidden layers, back-propagation, bagging and regularization. (Instructor: Rui Jorge Almeida, Maastricht University)

Day 4: Unsupervised learning. Clustering, association rule learning, dimension reduction. (Instructor: Rui Jorge Almeida, Maastricht University)

General information:
- Attendance means that you join the meeting and participate in the discussions, using your camera and microphone throughout the meeting.
- There is no physical location for the meetings. These meetings will take place via Zoom or Google Meet. Links will be sent by email.
- To accommodate the online format, we will be using a system similar to a ‘flipped classroom’.
- Preparation is necessary for all sessions. The topics, reading material and other necessary information are included in the session exercises, which are sent to the participants by email about one week before class.

Literature:

Methodology:

Course material:

The material for this course includes selected papers and chapters from:
Hastie, T., Tibshirani, R., Friedman, J. (2001). The Elements of Statistical Learning. New York, NY, USA: Springer New York Inc. ISBN: 978-0-387-84858-7.
James, G., Witten, D., Hastie, T., Tibshirani, R. (2013). An Introduction to Statistical Learning: with Applications in R. New York: Springer. ISBN: 978-1461471370.
Goodfellow, I., Bengio, Y., Courville, A. (2016). Deep Learning. MIT Press. ISBN: 978-0-262-035613. www.deeplearningbook.org
Murphy, K.P. (2012). Machine Learning: A Probabilistic Perspective. MIT Press. ISBN: 978-0-262-01802-9.
Han, J., Pei, J., Kamber, M. (2011). Data Mining: Concepts and Techniques. Elsevier. ISBN: 978-1-55860-901-3.
Rasmussen, C.E., Williams, C.K.I. (2005). Gaussian Processes for Machine Learning (Adaptive Computation and Machine Learning). The MIT Press. www.gaussianprocess.org/gpml/
Forrester, A., Sobester, A., Keane, A. (2008). Engineering Design via Surrogate Modelling: A Practical Guide. John Wiley & Sons.
Gramacy, R.B. (2020). Surrogates: Gaussian Process Modeling, Design, and Optimization for the Applied Sciences. CRC Press.

A selected list of papers will be made available throughout the course, by email.

Prerequisites:

Students need to have a solid background in statistics, probability theory, linear algebra, continuous mathematics, multivariate calculus and multivariate probability theory. In addition, students should have a solid foundation in a programming language such as Matlab, Python or R, using procedural, functional or object-oriented paradigms. Students can use their language of choice, but they should ensure that the methods discussed in the sessions are available in their preferred language. For neural networks, Python is highly recommended. For Gaussian processes, Matlab is highly recommended. Students are encouraged to follow online courses in Python and Matlab as preparation. Students are not required to have any prior knowledge of machine learning.
