Using R for Practical Research and Data Visualisation

This course is intended for users of applied data analysis. It examines questions from public policy, politics and industry using real data, including voter surveys; housing, unemployment and prison data; and surveys on water use in Bangladesh. The unit will help build participants' ability to undertake basic statistical analysis (including means, confidence intervals and linear regression) in R and to create publication-standard graphs of the results. The end result will be analyses of data that are more professional and easier to understand.

 
Level 2 - runs over 5 days
Instructor: 

Dr Shaun Ratcliff is a political scientist, survey researcher and applied data scientist.

He is the Principal at Accent Research, where he works with clients on complex social and political research, studying how the public thinks and behaves, what influences their beliefs and actions, and ways to engage with them.

He was previously Director of Data Science at YouGov and, before that, a Lecturer at the US Studies Centre at the University of Sydney, where he remains an Honorary Associate and continues to teach data science.

He has a PhD in Political Science from Monash University.

Course dates: Monday 6 July 2015 - Friday 10 July 2015
Course status: Course completed (no new applicants)
Week: 
Week 2
About this course: 

R is open source and free; no licences are required. R is flexible, powerful and intuitive, and it is excellent for data visualisation. As it is open source, R has thousands of developers in leading universities, corporate research labs and other institutions across the world. This means its capabilities tend to exceed those of competing software, with new packages added all the time. As there is no licence, you can take it with you wherever you go, and you don't have to change software when you change employers. As a result, R has become increasingly popular for academic research, economic analysis and public policy development. This trend is only likely to continue. Becoming conversant in R will help build your personal capabilities and employment opportunities by making you a more flexible worker, capable of undertaking analysis many other researchers and analysts cannot.

 

No prior experience with R, regression or any sophisticated quantitative methods is required for this course. Participants should use data in their occupations (or in their studies, if they are students) and understand some of the basics (for instance, what the mean, the median and the standard deviation are).

 

This is a course for subject matter experts who want to use more quantitative analysis in their work. By the end of the week you will be able to better conduct basic descriptive analysis and simple linear regression in R, and will be able to graph that analysis in a way that looks professional and is easy to understand.

Course syllabus: 

Day 1: Getting started

R is excellent for conducting simple yet effective analyses of data. In particular, it is useful in plotting descriptive data such as trends in unemployment, oil prices and crime rates. We will look at plotting data to provide you with methods you can use in your work or your research.

The course starts with instructions on how to access data in R, calculate descriptive statistics and plot the results. We will then cover graphing means and variance so you can better understand the structure of your data. The first day of the unit will also cover how to plot the trends of multiple indicators (for instance, several economic indicators such as unemployment, inflation, consumer confidence, oil prices, building starts, job vacancies and business investment) in a way that looks professional and sophisticated, with just a few lines of code.
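As a rough illustration of this kind of first-day workflow, the sketch below computes descriptive statistics and plots two indicators on one chart. It is not taken from the course materials: it assumes base R only and uses the built-in `longley` dataset (US macroeconomic data, 1947 to 1962) as a stand-in for the course data. The choice of indicators and the indexing to the first year are illustrative assumptions.

```r
# Built-in US macroeconomic data, 1947-1962 (a stand-in for the course data)
data(longley)

# Pick two indicators to describe and plot (an illustrative choice)
indicators <- longley[, c("Unemployed", "GNP")]

# Mean, median and standard deviation for each indicator
summary_stats <- sapply(indicators, function(x) {
  c(mean = mean(x), median = median(x), sd = sd(x))
})
print(round(summary_stats, 1))

# Index each series to its first year (1947 = 100) so both fit on one axis
indexed <- sapply(indicators, function(x) 100 * x / x[1])

# Plot both trends on a single chart
line_cols <- c("firebrick", "steelblue")
matplot(longley$Year, indexed, type = "l", lty = 1, lwd = 2, col = line_cols,
        xlab = "Year", ylab = "Index (1947 = 100)",
        main = "Unemployment and GNP, 1947-1962")
legend("topleft", legend = colnames(indexed), col = line_cols,
       lty = 1, lwd = 2, bty = "n")
```

Indexing to a common base year is one simple way to compare series measured in different units; plotting on separate panels would be a reasonable alternative.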

Day 2: Graphing probabilities

Starting with the descriptive statistics above, how can we add to this analysis? It would help to understand how much certainty we have in these estimates. This can be done by calculating confidence intervals for them. On the second day we will estimate the confidence intervals and probabilities for real-life problems and graph the results. You will learn how to create plots with point estimates and 95% confidence intervals that communicate your findings, making it easier to publish your work in a way your audience will clearly understand.
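A minimal sketch of a point estimate with a 95% confidence interval, plotted by group, might look like the following. It assumes base R and the built-in `swiss` dataset (fertility and socio-economic indicators for Swiss provinces in 1888) as a stand-in for the course data. The helper `mean_ci` and the grouping by majority religion are hypothetical, introduced only for illustration, and the interval is the standard t-based interval for a mean.

```r
# Built-in data on 47 Swiss provinces, 1888 (a stand-in for the course data)
data(swiss)

# Hypothetical helper: mean with a t-based confidence interval
mean_ci <- function(x, level = 0.95) {
  n <- length(x)
  se <- sd(x) / sqrt(n)                          # standard error of the mean
  t_crit <- qt(1 - (1 - level) / 2, df = n - 1)  # two-sided critical value
  c(estimate = mean(x),
    lower = mean(x) - t_crit * se,
    upper = mean(x) + t_crit * se)
}

# Illustrative grouping: provinces by majority religion
majority <- ifelse(swiss$Catholic > 50, "Majority Catholic", "Majority Protestant")

# One row per group: estimate, lower bound, upper bound
ci_by_group <- t(sapply(split(swiss$Fertility, majority), mean_ci))
print(round(ci_by_group, 2))

# Point estimates with vertical lines for the 95% intervals
x_pos <- seq_len(nrow(ci_by_group))
plot(x_pos, ci_by_group[, "estimate"], pch = 19, xaxt = "n",
     xlim = c(0.5, nrow(ci_by_group) + 0.5), ylim = range(ci_by_group),
     xlab = "", ylab = "Mean fertility measure (95% CI)")
axis(1, at = x_pos, labels = rownames(ci_by_group))
segments(x_pos, ci_by_group[, "lower"], x_pos, ci_by_group[, "upper"])
```

The same interval for a single group can be checked against `t.test(x)$conf.int`.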

Day 3: Understanding elections and well-switching in Bangladesh

Some problems we may want to explore in the course of our research or work have multiple dimensions. For instance, politics can be seen as having two dimensions (economic and social issues). To explore these dimensions we will create scatter plots, which help reveal the structure of our data. On the third day of the course, we will also explore some more complex issues, such as voting behaviour among different demographic groups and surveys of community behaviour relating to access to clean water in Bangladesh.
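As a hedged sketch of a two-dimensional scatter plot with points coloured by group, the example below again uses the built-in `swiss` dataset in place of the course's survey data. The variables, the grouping by majority religion and the colours are illustrative assumptions rather than the course's own example.

```r
# Built-in data on 47 Swiss provinces, 1888 (a stand-in for the course data)
data(swiss)

# Illustrative grouping: provinces by majority religion
majority <- ifelse(swiss$Catholic > 50, "Majority Catholic", "Majority Protestant")
point_cols <- c("Majority Catholic" = "firebrick",
                "Majority Protestant" = "steelblue")

# Scatter plot of two dimensions, coloured by group
plot(swiss$Education, swiss$Fertility, pch = 19, col = point_cols[majority],
     xlab = "Education (% of draftees beyond primary school)",
     ylab = "Fertility (standardised measure)",
     main = "Fertility and education by majority religion")
legend("topright", legend = names(point_cols), col = point_cols,
       pch = 19, bty = "n")

# Correlation between the two dimensions within each group
group_cor <- sapply(split(swiss, majority),
                    function(d) cor(d$Education, d$Fertility))
print(round(group_cor, 2))
```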

Day 4: Regression

Sometimes you need to do more than look at the descriptive data. There may be confounding factors, such as the effects of economic, political and demographic factors on policy outcomes. We can control for these and learn far more from our data using simple linear regression. On day four we will look at fitting linear regression models to the data examined in the previous sessions, to provide a more thorough analysis of those problems. We will also look at graphing the regression coefficients to make our results clearly understandable.
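The sketch below shows one way to fit a linear regression with control variables and graph the coefficients with their 95% confidence intervals. It assumes base R (`lm`, `coef`, `confint`) and the built-in `swiss` dataset as a stand-in for the course data; the model specification is an illustrative assumption, not the course's own.

```r
# Built-in data on 47 Swiss provinces, 1888 (a stand-in for the course data)
data(swiss)

# Linear regression of fertility on education, controlling for two other factors
fit <- lm(Fertility ~ Education + Agriculture + Catholic, data = swiss)
print(summary(fit))

# Slope estimates and their 95% confidence intervals (intercept dropped)
coefs <- coef(fit)[-1]
ci <- confint(fit, level = 0.95)[-1, , drop = FALSE]

# Coefficient plot: a point for each estimate, a horizontal line for its interval
y_pos <- seq_along(coefs)
plot(coefs, y_pos, pch = 19, yaxt = "n",
     xlim = range(ci, 0), ylim = c(0.5, length(coefs) + 0.5),
     xlab = "Coefficient estimate (95% CI)", ylab = "")
axis(2, at = y_pos, labels = names(coefs), las = 1)
segments(ci[, 1], y_pos, ci[, 2], y_pos)
abline(v = 0, lty = 2)  # reference line at zero
```

An interval that does not cross the dashed zero line corresponds to a coefficient that is statistically distinguishable from zero at the 5% level.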

Day 5: Bringing it all together

On the final day we will explore some slightly more complex regression models and look at graphing estimates and predictions from the model outputs. One-on-one consultations will be undertaken in the afternoon to go over specific parts of the course participants want further advice on.
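Graphing predictions from a fitted model could be sketched as follows. This assumes base R's `predict` with `interval = "confidence"` and the built-in `swiss` dataset as a stand-in for the course data; the single-predictor model and the grid of 50 values are illustrative choices.

```r
# Built-in data on 47 Swiss provinces, 1888 (a stand-in for the course data)
data(swiss)

# A simple model to predict from (illustrative specification)
fit <- lm(Fertility ~ Education, data = swiss)

# Grid of education values spanning the observed range
grid <- data.frame(Education = seq(min(swiss$Education), max(swiss$Education),
                                   length.out = 50))

# Predicted values with 95% confidence intervals for the fitted line
pred <- predict(fit, newdata = grid, interval = "confidence", level = 0.95)

# Observed data, fitted line and interval bounds
plot(swiss$Education, swiss$Fertility, pch = 19, col = "grey50",
     xlab = "Education (% of draftees beyond primary school)",
     ylab = "Fertility (standardised measure)")
lines(grid$Education, pred[, "fit"], lwd = 2)
lines(grid$Education, pred[, "lwr"], lty = 2)
lines(grid$Education, pred[, "upr"], lty = 2)
```

Using `interval = "prediction"` instead would show the wider range expected for individual observations rather than for the fitted mean.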

 

Course format: 

This course will be held in a classroom. Participants will need a laptop with R installed. ACSPRI staff and the course instructor will be able to help with installation in the weeks leading up to the course.

Recommended Background: 

The ACSPRI course Fundamentals of Statistics, or a similar basic understanding of statistical analysis.

Course fees
Member: 
$1,870
Non Member: 
$3,485
Full time student Member: 
$1,870
Program: 
Winter Program 2015
Notes: 

Data and course notes will be provided.