Introduction to R and the Tidyverse

These are materials from a workshop I taught in May of 2019. The goal was to introduce R and RStudio, basic data manipulation with the tidyverse, and then provide an overview of literate programming for report generation (automated plus manual).

Introduction to R & RStudio

We started, as usual, by getting to know R and RStudio: how to install them, the panes of the RStudio IDE, installing and updating packages, CRAN repositories, and compiling packages from source. We then read in data files in various formats and ways.
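As a rough illustration of the kind of steps this session covered, here is a minimal sketch in base R. The file contents, column names, and temporary paths are invented for the example and are not taken from the workshop materials; the install and update calls are commented out so the block runs without a network connection.

```r
# One-time setup from CRAN (commented out so the example runs offline):
# install.packages("tidyverse")
# update.packages(ask = FALSE)

# Hypothetical data written to a temporary CSV file, purely for illustration
csv_path <- tempfile(fileext = ".csv")
writeLines(c("id,name,score", "1,Ana,90.5", "2,Ben,78"), csv_path)

# Read a comma-separated file
dat <- read.csv(csv_path, stringsAsFactors = FALSE)
str(dat)

# Write the same data as tab-separated text and read it back a second way
tsv_path <- tempfile(fileext = ".tsv")
write.table(dat, tsv_path, sep = "\t", row.names = FALSE, quote = FALSE)
dat_tsv <- read.delim(tsv_path, stringsAsFactors = FALSE)
```

Other formats (Excel, SPSS, Stata, and so on) typically need an add-on package such as readxl or haven; which ones the workshop actually used is not stated here.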

Introduction to the tidyverse

“Yet far too much handcrafted work — what data scientists call “data wrangling,” “data munging” and “data janitor work” — is still required. Data scientists, according to interviews and expert estimates, spend from 50 percent to 80 percent of their time mired in this more mundane labor of collecting and preparing unruly digital data, before it can be explored for useful nuggets.”

As the quote underscores, cleaning data takes up the bulk of our time, so it is well worth learning how to handle tasks you will likely face weekly, if not daily. In this session we covered some basic dplyr and tidyr verbs. I had also included some lubridate material, but we had to skip that part because of time constraints.
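To give a flavor of what "basic dplyr and tidyr verbs" look like in practice, here is a small sketch. The data frame is a toy example made up for illustration, not workshop data. It also assumes reasonably recent package versions: pivot_longer() needs tidyr 1.0.0 or later and the .groups argument needs dplyr 1.0.0 or later, both of which postdate the workshop, so the original materials may well have used gather() instead.

```r
library(dplyr)
library(tidyr)

# Hypothetical toy data: one row per student, one column per exam
scores <- data.frame(
  student = c("A", "B", "C", "D"),
  section = c("x", "x", "y", "y"),
  midterm = c(70, 85, 90, 60),
  final   = c(80, 75, 95, 65),
  stringsAsFactors = FALSE
)

# tidyr: reshape from wide to long, giving one row per student-exam pair
scores_long <- scores %>%
  pivot_longer(cols = c(midterm, final),
               names_to = "exam", values_to = "score")

# dplyr: keep some rows, add a column, then summarise within groups
section_means <- scores_long %>%
  filter(score >= 65) %>%               # drop scores below 65
  mutate(score_pct = score / 100) %>%   # derived column
  group_by(section) %>%
  summarise(mean_score = mean(score), n = n(), .groups = "drop")

print(section_means)
```

Each verb takes a data frame and returns a data frame, which is what lets the steps chain together with the pipe.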

Graphics with ggplot2

In the final session of the workshop we spent a little time going over the basics of ggplot2. Another half-day would have been useful, because we did not get very far, as is obvious from the material in this module.
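For readers who have not seen ggplot2 before, a basic plot might look something like the sketch below. It uses the mtcars dataset that ships with R rather than anything from the workshop, and the labels and theme are arbitrary choices for the example.

```r
library(ggplot2)

# Map variables to aesthetics, then add a geometry layer, labels, and a theme
p <- ggplot(mtcars, aes(x = wt, y = mpg, colour = factor(cyl))) +
  geom_point(size = 2) +
  labs(
    x = "Weight (1000 lbs)",
    y = "Miles per gallon",
    colour = "Cylinders",
    title = "Fuel economy versus weight"
  ) +
  theme_minimal()

# Printing the object draws the plot on the active graphics device
print(p)
```

The plot is built up by adding layers with +, so swapping geom_point() for another geometry or adding a facet is a one-line change.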