Importing data is the first big hurdle to working with data in R. Most people are used to double-clicking their data file and having software like Excel open it.
In R it takes some thought and deliberation. While it’s now possible to use the Import Dataset button in RStudio, we’re going to do it the proper way with a command.
Thanks to members of the community, there are many packages that let R import many types of data.
The repo for this class is on GitHub, but it can easily be downloaded to your desktop with the following commands:
If you get an error that ‘git2r’ is not available, then run install.packages("git2r"). If that still doesn’t work, you might need to run brew install libgit2 from your terminal (brew is the Homebrew package manager, which runs on Mac rather than PC).
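Before installing anything, it can help to check whether the package is already there. The following is only a minimal sketch: needs_install() is a hypothetical helper name made up for this illustration, and the install line is left commented out so nothing is downloaded by accident.

```r
# Hypothetical helper (not part of the class materials): returns TRUE
# when a package cannot be found in the current library paths
needs_install <- function(pkg) {
  !requireNamespace(pkg, quietly = TRUE)
}

if (needs_install("git2r")) {
  # Uncomment the next line to install from CRAN (needs an internet connection)
  # install.packages("git2r")
  message("git2r is not installed yet")
} else {
  message("git2r is already installed")
}
```

If the message says the package is missing, uncomment the install line and run the block again.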
It’s possible to import data like CSVs and tab-delimited formats in Base R, but the way it does so makes more sense to statisticians than to journalists.
read.csv() is the Base R function to read in CSVs, but in versions of R before 4.0.0 its default is to treat strings, like names and addresses, as factors. In those versions you have to pass it the argument stringsAsFactors=FALSE to make it work like you need it to.
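As a rough illustration, here is a small sketch of read.csv() with the argument set explicitly, which gives the same result whichever default your version of R uses. The file contents and the names csv_path and people are made up for this example; the CSV is written to a temporary file so the block runs on its own.

```r
# Write a tiny made-up CSV to a temporary file so the example is self-contained
csv_path <- tempfile(fileext = ".csv")
writeLines(c("name,address,age",
             "Ana Ruiz,12 Main St,34",
             "Bo Chen,9 Elm Ave,51"), csv_path)

# Ask read.csv() explicitly to keep text columns as character strings
people <- read.csv(csv_path, stringsAsFactors = FALSE)

# Inspect the column types: name and address should be character, not factor
str(people)
```

Keeping names and addresses as plain text means you can clean them with string functions without first converting them back from factors.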
We’re going to go right into using packages that import data faster and require very little adjustment.
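This passage does not name a specific package, so the sketch below is an assumption: it uses read_csv() from the readr package, which is part of the tidyverse. The file contents and the names csv_path and people are invented for illustration.

```r
# Assumption: readr is installed, for example via install.packages("tidyverse")
library(readr)

# Write a tiny made-up CSV to a temporary file so the example is self-contained
csv_path <- tempfile(fileext = ".csv")
writeLines(c("name,address,age",
             "Ana Ruiz,12 Main St,34",
             "Bo Chen,9 Elm Ave,51"), csv_path)

# read_csv() keeps text columns as character strings without extra arguments
people <- read_csv(csv_path)

# Printing the result shows each column's type under its name
people
```

Note the underscore: read_csv() comes from readr, while read.csv() with a period is the Base R function.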
Some tips when importing data into R:
Here are some links on importing other types of data that we won’t be able to get into in this class.
Links to exercises that test what you’ve learned are spread throughout this section.
It’s possible to run these files locally to test yourself if you’ve downloaded the files for the chapter as instructed above.
Make sure your project directory is correct and then run these lines in the console:
install.packages("learnr")
install.packages("rmarkdown")
install.packages("tidyverse")
© Copyright 2018, Andrew Ba Tran