This course is designed as an applied introduction to Machine Learning (ML) techniques, with exercises in R and Python to run various ML algorithms.
Dr Joanna Dipnall is an applied statistician with particular interests in advanced statistical methods and machine and deep learning techniques. She completed her Honours in Econometrics at Monash University and her PhD with IMPACT SRC, School of Medicine, Deakin University. Joanna works extensively with registry and linked medical data and collaborates with the Faculty of IT at Monash to supervise Masters and PhD students integrating AI within health research. Joanna teaches within the Monash Biostatistics Unit and is the Unit Coordinator for the Monash Masters of Health Data Analytics course. She has taught advanced statistical methods for many years at universities and for ACSPRI.
Machine Learning techniques are becoming increasingly popular across areas of research, from computer science to various disciplines of medicine. This branch of artificial intelligence concerns algorithms that learn from data to carry out specific tasks, with success judged by performance measures. This is an introductory applied course, with exercises in R and Python to run various ML algorithms.
Classification, prediction and model selection issues will be discussed. Detailed notes with worked examples and references will be provided as a basis for both the lecture and hands-on computing aspects of the course.
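As a rough illustration of what learning from data against a task and a performance measure can look like, the following minimal Python sketch fits a one-feature threshold rule (a "decision stump") to training data and scores it on held-out data. It is not taken from the course materials: the toy data and the function names fit_stump, predict and accuracy are all assumptions made for this example, and only the Python standard library is used.

```python
# Illustrative sketch only: the data and all function names below are
# hypothetical and are not part of the course materials.

def fit_stump(xs, ys):
    """Learn a threshold rule that predicts 1 when x > threshold."""
    best_threshold, best_correct = None, -1
    values = sorted(set(xs))
    # Candidate thresholds are midpoints between adjacent observed values.
    candidates = [(a + b) / 2 for a, b in zip(values, values[1:])]
    for t in candidates:
        # Count training observations the rule would classify correctly.
        correct = sum((x > t) == bool(y) for x, y in zip(xs, ys))
        if correct > best_correct:
            best_threshold, best_correct = t, correct
    return best_threshold

def predict(threshold, xs):
    """Apply the learned rule to new feature values."""
    return [1 if x > threshold else 0 for x in xs]

def accuracy(y_true, y_pred):
    """Performance measure: proportion of correct predictions."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

# Hypothetical training and test sets (one feature, binary label).
train_x = [1.0, 2.0, 3.0, 4.0, 6.0, 7.0, 8.0, 9.0]
train_y = [0, 0, 0, 0, 1, 1, 1, 1]
test_x = [2.5, 4.5, 5.5, 7.5]
test_y = [0, 0, 1, 1]

threshold = fit_stump(train_x, train_y)
test_accuracy = accuracy(test_y, predict(threshold, test_x))
print(f"threshold={threshold}, test accuracy={test_accuracy:.2f}")
```

Here the task is binary classification, the learning step is the search over thresholds, and the performance measure is accuracy on data the rule has not seen. Evaluating on held-out data rather than on the training data is the basic idea behind the model selection issues mentioned above.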
This course focuses primarily on the application of specific ML techniques rather than on the complex mathematics behind the ML algorithms. Some of the uses of ML techniques in publications will be discussed at the end of the course.
This course is broken up into the following sections:
Part I: Fundamentals of Machine Learning
Part II: Machine Learning Techniques and Work Flow
Part III: Decision Trees & Random Forests
Part IV: Boosted Regression
Part V: Support Vector Machines
Part VI: Machine Learning Techniques in Publications
Participants will be given time to do some ML exercises on their own to practise what they have learned. Exercises and solutions will be provided in both R and Python.
This workshop will take place in a classroom. You will need to bring your own laptop with R and/or Python installed.
This course assumes that participants have:
(1) sound familiarity with at least one of the two software packages, R and Python
(2) sufficient understanding of statistics to be able to comprehend the material covered in the course outline, such as a basic grounding in multiple regression (e.g., linear, logistic, Poisson) and in dimension reduction and clustering techniques (e.g., principal components analysis, k-means clustering)
(3) access to R and/or Python
(4) some experience in using Microsoft Word and Excel or their equivalent
(5) experience using a text editor such as Notepad.
Course notes will be supplied.
No specific references are suggested, but a number will be supplied with the course notes.