Data Analysis, Graphics and Visualisation Using R

BYO laptops are required for this course.

 

This course is intended for data analysts and quantitative methods lecturers who currently use proprietary data analysis software (e.g. SPSS, SAS or Stata), and for early career researchers aiming to establish a solid foundation for lifelong skill development. The emphasis of the course is on hands-on analysis, graphical display, and interpretation of data. An important aim is to give a strong sense of the new vistas, especially in data visualisation, to which R gives access.

 

R is the leading tool for statistics, data analysis, machine learning and statistical graphics. It is supported by an active community of thousands of developers and contributors, and more than 2 million users. It has become the environment of choice for the implementation of new techniques, with over 2000 add-on packages, and more added every day, covering the methods of every discipline from anthropology to zoology. The powerful and innovative graphics abilities available in R include the provision of well-designed publication-quality plots that can include mathematical symbols and formulae.

 

The course shows how to use R for the methods covered in the ACSPRI course 'Fundamentals of Multiple Regression'.  It assumes that participants have an understanding of regression techniques at least to the level provided by that course. Additionally, it will give an introduction to R abilities that, relative to what was available a few years ago, are spectacular.  These include:

 

1) dynamic displays of 3-dimensional data;

2) the creation of Hans Rosling style Motion Charts (see http://www.gapminder.org/upload-data/motion-chart/);

3) the overlaying of plots on to Google Maps or Google Earth displays, with the ability to manipulate the resulting display dynamically.

 

Concepts and understanding that are important for the use of R will be introduced in the context of data exploration and regression calculations. While a major emphasis will be on the use of R for regression, the introduction that it provides to the R system will be helpful in getting started on the use of R more generally.

 

Intending participants are encouraged to work through the introductory notes on the R system that are noted below. There will be some limited use of the graphical user interface provided by the R Commander package for R. Most use of R will, however, be from the command line, using the highly attractive RStudio integrated development environment.

 

Notes will be provided.  The Maindonald and Braun text noted below covers a substantial part of the course content, and will be useful for supplementary reading.  Arrangements will be made for course participants to purchase this text at a discounted cost.

 

Data will be provided. Participants who can provide their own data in advance of the course will, if the data are suitable for the methods covered in the course, have the opportunity to analyse their own data and discuss the output.

 

For information on relevant components of the R system, and on
preparation for this course, go to:

https://maths-people.anu.edu.au/~johnm/r-courseprep.html

 
Level 3 - runs over 5 days
Instructor: 

Following a first in Mathematics at Auckland University and a variety of teaching and lecturing positions, John Maindonald settled down to working with other researchers as a quantitative problem solver. Until his move from New Zealand to Australia in 1996, much of his work was in plant, fruit, insect and other pest research, with industrial consulting as a sideline. He took up a position at The Australian National University (ANU) in 1998. At ANU he has relished the stimulus of working with biologists (including molecular biologists), ecologists, epidemiologists, public health researchers, demographers, computer scientists, numerical analysts, machine learners, an economic historian, forensic linguists, and a lively group of statisticians. He is the author of a book on Statistical Computation. He is the senior author of "Data Analysis and Graphics Using R". This example-based exposition of practical approaches to data analysis, now into its third edition, has sold more than 10,000 copies. Now in semi-retirement, he does occasional consulting, and fronts workshops on the use of the open source R system for scientific and statistical applications and for graphics.

Course dates: Monday 7 July 2014 - Friday 11 July 2014
Course status: Course completed (no new applicants)
Week: Week 2
Recommended Background: 

 

Knowledge of the principles of multiple regression at a level comparable to that provided by the ‘Fundamentals of Multiple Regression’ course. Previous experience of data analysis using SPSS or SAS or Stata or R, or another system with comparable abilities. Participants must be comfortable with typing commands at the command line.

Recommended Texts: 

Maindonald, J.H. and Braun, W.J. Data Analysis and Graphics Using R: An Example-Based Approach. Cambridge University Press, 2010.


Course fees
Member: $1,800
Non Member: $3,230
Full time student Member: $1,800