Introduction to R
What is R?
According to the R Project website (http://www.r-project.org/):
"R is a language and environment for statistical computing and graphics."
"R provides a wide variety of statistical (linear and nonlinear modelling, classical statistical tests, time-series analysis, classification, clustering, ...) and graphical techniques, and is highly extensible."
"One of R's strengths is the ease with which well-designed publication-quality plots can be produced, including mathematical symbols and formulae where needed."
Obtaining a copy of the R application
Download a copy of the most recent version of R from the R Project site: http://www.r-project.org/
The website will require you to choose a 'CRAN Mirror'. The idea is to find the location geographically closest to you.
Launching R Programs
In R you can enter each line of code at the prompt in a step-by-step approach. You may also save R programs as simple text files to open in a separate window so that you can enter multiple lines of code at once and save your commands.
Here is an example data set you may save on your computer:
Here is an example program:
#Read data file into R as a vector
#Calculate the sample mean
The # symbol indicates a programmer's comment. R ignores this text when the program runs. You can copy and paste this program into the R command line, either line by line or all at once. You may also source this program from where it is saved on your computer as shown below.
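As a minimal sketch of the kind of program the two comments above describe, the following writes a few made-up numbers to a temporary file, reads them back as a vector with scan(), and computes the sample mean. The file and its values are hypothetical stand-ins, not the course's example data set.

```r
# Hypothetical data: six made-up values written to a temporary file
# (a stand-in for a data file saved on your computer).
data_file <- tempfile(fileext = ".txt")
writeLines(c("2.5 3.1 4.7", "5.0 6.2 8.5"), data_file)

# Read data file into R as a vector
x <- scan(data_file)

# Calculate the sample mean
mean(x)
```

With your own data, you would replace data_file with the name of the file you saved.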
It is often useful to set a working directory so that file names without a pathname will refer to files in that directory on your system. The command getwd() prints your working directory to the screen. The command setwd("/pathname") sets the R working directory. On a Mac, your pathname is shown at the bottom of your Finder window (for example, /Users/Username/Documents/...). On Windows, a pathname looks like "C:/Users/Username/Documents/...".
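The two commands can be tried together as in the sketch below. It assumes nothing about your folders: tempdir() is used only so the example runs on any system, and the file name wd_demo.txt is made up for illustration.

```r
# Print the current working directory and remember it.
old_dir <- getwd()
print(old_dir)

# Switch to another directory. tempdir() is used here only so the
# example runs anywhere; in practice you would use your own path,
# as in setwd("/pathname").
setwd(tempdir())

# File names without a pathname now refer to the new working directory.
writeLines("1 2 3", "wd_demo.txt")
found <- file.exists(file.path(getwd(), "wd_demo.txt"))
print(found)

# Restore the original working directory.
setwd(old_dir)
```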
One nice feature of the step-by-step command lines in R is that you may scroll through previous commands using the Up and Down arrow keys. Here are a couple of other handy commands that you can use in R:
### to read the commands from a source file directly and display the output in the R console, instead of entering them line by line or copying the source file, invoke at the command line:
> source("intro.R", echo=TRUE)
#### to read the commands from a source file directly and to save the output as a text file named "example1.txt"
> source("intro_file.R", echo=TRUE)
#### Within the intro_file.R program, commands such as sink() redirect all subsequent R output to a file
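The contents of intro_file.R are not shown here, so the following is only a sketch of how sourcing a script and redirecting its output can fit together. It uses sink(), the function discussed later on this page. The script and output file names below are hypothetical and are created in a temporary directory so the example is self-contained.

```r
# Hypothetical stand-ins for intro_file.R and example1.txt.
script_file <- file.path(tempdir(), "intro_file_demo.R")
out_file    <- file.path(tempdir(), "example1_demo.txt")

# Write a small script. deparse() turns the path into a properly
# quoted and escaped string literal inside the script.
writeLines(c(
  paste0("sink(", deparse(out_file), ")"),  # redirect output to the file
  "x <- c(2, 4, 6, 8)",
  "mean(x)",
  "sink()"                                  # send output back to the console
), script_file)

# Run the script; echo = TRUE shows each command as it is evaluated.
source(script_file, echo = TRUE)

# Look at what was saved in the output file.
saved <- readLines(out_file)
print(saved)
```

Calling sink() with no arguments at the end of the script matters: without it, later output would keep going to the file instead of the console.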
Here are the data files and programs to practice the above commands:
Depending on the course, datasets are either presented within the context of the lesson or within a datasets folder. Common file extensions for data files include ".dat", ".csv", and ".txt". You must download the data from your course website. Canvas provides instructions on how to save a file for Windows users or Mac users. A Save dialog box will be displayed and allow you to save the data file to the location you choose on your computer.
There are a number of ways to read data into your R session. Two popular commands used in the examples presented here are read.table and scan. A nice summary of inputting data into R may be found at:
How to Input data into R. UCLA: Academic Technology Services, Statistical Consulting Group. from http://www.ats.ucla.edu/stat/r/faq/inputdata_R.htm (accessed November 10, 2010).
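To make the difference between the two commands concrete, here is a small sketch. The data file is hypothetical (three made-up rows written to a temporary file); with course data you would pass the name of the file you downloaded instead.

```r
# Hypothetical data file: a whitespace-delimited table with a header row.
table_file <- tempfile(fileext = ".txt")
writeLines(c("id height weight",
             "1 64.5 130",
             "2 70.1 165",
             "3 68.0 150"), table_file)

# read.table() returns a data frame; header = TRUE takes the column
# names from the first line of the file.
dat <- read.table(table_file, header = TRUE)
str(dat)
mean(dat$height)

# scan() returns a plain vector of values; skip = 1 skips the header line.
all_values <- scan(table_file, skip = 1)
length(all_values)
```

read.table is usually the better fit for data arranged in rows and columns, while scan is convenient when the file is just a list of numbers.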
Install a Development Environment
The development environment is the application that you will use to open, edit, and execute R programs. If you already have a favorite development environment, you can check whether it is compatible with R (many of them are). If you don't, we recommend one called RStudio.
- You need to have R installed first (see above)
- RStudio can be downloaded from: http://www.rstudio.com/products/rstudio/download/
- Select the “installer” link that corresponds to your operating system (e.g., Windows, macOS).
If you need help understanding a command or its syntax, type either ?command or help(command), and R will display the help available on that topic. For instance, here is the help page for read.table from the command ?read.table:
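The help commands above can be tried as in the following sketch. The use of args() is an addition not mentioned in the text; it is included only as a quick way to see a function's arguments without opening the full help page.

```r
# help("read.table") looks up the same topic as ?read.table.
# Assigning the result means nothing is displayed until it is printed.
h <- help("read.table")

# In an interactive session, printing the object opens the help page.
if (interactive()) print(h)

# args() shows the argument list of a function.
fn_args <- args(read.table)
print(fn_args)
```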
Please note: Certain functions in R may NOT run the same way on all platforms (e.g., Windows, Mac, Linux). For example, you may get an error with the sink() function on your PC, depending on how the read/write permissions are set up. It should not be a problem on Linux or a Mac, or if you run your R programs in a Terminal. In the Windows version of R, you can delete or comment out the sink() function and save your output by clicking on File/Save to file. You can also explore other options and functions, such as file() and capture.output(), or copy and paste your output from the R console window into a separate file.
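As one possible alternative to sink(), the sketch below uses capture.output() to collect printed output and save it to a text file. The output file name is hypothetical and is placed in a temporary directory, which should be writable on most systems.

```r
# Some made-up values whose printed summary we want to keep.
x <- c(2, 4, 6, 8)

# capture.output() returns what would have been printed
# as a character vector, one element per line.
captured <- capture.output(summary(x))

# Write the captured lines to a text file.
out_file <- file.path(tempdir(), "output_demo.txt")
writeLines(captured, out_file)

# capture.output() can also write directly to a file through its
# file argument; append = TRUE adds to the existing contents.
capture.output(mean(x), file = out_file, append = TRUE)

# Check what was saved.
saved <- readLines(out_file)
print(saved)
```

Unlike sink(), this approach redirects only the output of the expressions you pass in, so there is nothing to switch back afterwards.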
The Department of Statistics offers two 1-credit online courses: STAT 484: Topics in R: Statistical Language and STAT 485 - Intermediate Topics in R Statistical Language. These courses are a good step towards building a solid foundation in using R. In addition, you may also find the following references handy:
- The R Project Homepage
- R Tutorial - web site at Clarkson University Department of Mathematics
- An Introduction to R - written by W. N. Venables, D. M. Smith, and the R Development Core Team
- Short Reference Card - written by Tom Short, EPRI PEAC
- R Seek helps you find the R function you require
- R Graph Gallery contains numerous examples of graphs and code made within R
- DataCamp offers a free Introduction to R course and many additional courses with subscription.
- Lynda.psu.edu includes two courses involving R, Up and Running with R and R Statistics Essential Training