Common Tasks in R

Installation, Configuration and Environment

.libPaths() displays the location of package libraries

Vectors and Matrices

base::rep() replicates elements of vectors and lists.

rep(5, 3) #returns 5, 5, 5
rep(c(1,2),2) #returns 1 2 1 2
rep(c(1, 2), each=2) #returns 1 1 2 2
rep(1:3, 3:1) # 1, 1, 1, 2, 2, 3

base::seq() generates regular sequences. Very flexible with many options. Typical usage includes seq(from, to), seq(from, to, by= ), seq(from, to, length.out= ), seq(along.with= ), seq(from), seq(length.out= ).

seq(0, 1, length.out=11)
seq(stats::rnorm(20))
seq(1, 9, by = 2) # match
seq(1, 9, by = pi)# stay below
seq(1, 6, by = 3)
seq(1.575, 5.125, by=0.05)
seq(17) # same as 1:17
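
A related point that the seq() examples above hint at: seq(x) behaves differently when x has length one, so base::seq_along() is often the safer choice for building indices. The following is a minimal sketch using only base R; the object names (x, single, empty) are made up for illustration.

```r
x <- c(10, 20, 30)
idx <- seq_along(x)              # 1 2 3: one index per element of x

single <- 5
fromValue <- seq(single)         # 1 2 3 4 5: a single number is treated as a count
fromLength <- seq_along(single)  # 1: always based on the length of the vector

empty <- numeric(0)
countsDown <- 1:length(empty)    # 1 0: counts down, usually not what you want
safe <- seq_along(empty)         # integer(0): a loop over this runs zero times
```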

base::vector() produces a vector of the given length and mode. The atomic modes are ‘logical’, ‘integer’, ‘numeric’, ‘complex’, ‘character’ and ‘raw’. Mode can also be ‘list’.

X = vector(mode='list', length=10000) #creates list with 10000 NULL elements
X = vector(mode='numeric', length=5) #creates numeric vector of 5 zeros

base::matrix() creates a 2d matrix (see also base::array())

X = matrix(data=NA, nrow=2, ncol=2, dimnames=list(c('row1', 'row2'), c('col1', 'col2'))) #creates a 2x2 matrix filled with NA

Generic Variable Manipulation

Calculate new variable within subjects (e.g., standardize startle within subject)

Use the plyr package. In this example, we use the baseball dataset. In this dataset, each baseball player has n rows for each of the n years they played ball. There is a year variable which indicates the calendar year (e.g. 1991) for each row. To transform calendar year to career year (cyear; i.e. the number of years since the player started playing), for each player, do the following:

baseball = ddply(.data=baseball, .variables=c('id'), .fun=transform, cyear = year - min(year) + 1)

NOTES: transform() is a function in base R. id is a unique identifier (e.g., SubID) for each baseball player. More detail on this example can be found in the published article on plyr from the plyr website

Recoding variable values (NEED)

Factor Manipulation

Create a factor

base::factor() creates a factor variable from text or numeric variable

d$AFactor = factor(d$BevGroup, levels = c('no-alcohol', 'placebo', 'alcohol')) #create factor AFactor from variable with text data labels
d$AFactor = factor(d$BevGroup, levels = c(1,2,3), labels = c('no-alcohol', 'placebo', 'alcohol')) #create factor AFactor from variable with numeric entries 1=no-alcohol, 2=placebo, 3=alcohol

Display levels of a factor

base::levels() sets or displays the levels of a factor.

levels(BevGroup) #displays the levels of the BevGroup factor
levels(BevGroup) = c('no-alcohol', 'placebo', 'alcohol') #set levels of BevGroup as indicated. NOTE: this is not recommended because it is error prone; use revalue()

Change labels of factor levels

plyr::revalue() changes the values of specific levels of the factor, without respect to their order

d$BevGroup = revalue(d$BevGroup, c("Alcohol"="Alc", "No-Alcohol"="NoAlc"))

Reorder factor levels (NEED)
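
One possible sketch for this entry, using only base R: re-specify the full level order with factor(), or move a single level to the front with stats::relevel(). The example factor is made up for illustration.

```r
BevGroup <- factor(c('placebo', 'alcohol', 'no-alcohol', 'alcohol'))
levels(BevGroup)  # default order is alphabetical

# Re-specify the complete order of the levels (the data values are unchanged)
BevGroup2 <- factor(BevGroup, levels = c('no-alcohol', 'placebo', 'alcohol'))

# Move one level to the first (reference) position, keeping the rest in order
BevGroup3 <- relevel(BevGroup, ref = 'placebo')
```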

Set contrasts for a factor (NEED)
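
One possible sketch for this entry, using stats::contrasts() with the built-in contrast generators. The example factor and the custom contrast names (AlcVsRest, PlaVsNo) are made up for illustration.

```r
BevGroup <- factor(c('no-alcohol', 'placebo', 'alcohol', 'placebo'),
                   levels = c('no-alcohol', 'placebo', 'alcohol'))

contrasts(BevGroup)                      # display the codes currently in use
contrasts(BevGroup) <- contr.sum(3)      # sum-to-zero (effects) codes
contrasts(BevGroup) <- contr.helmert(3)  # Helmert codes

# Custom codes: one column per contrast, one row per level (in level order)
contrasts(BevGroup) <- cbind(AlcVsRest = c(-1, -1, 2),
                             PlaVsNo   = c(-1, 1, 0))
codes <- contrasts(BevGroup)
```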

Date/Time Manipulations

Use as.POSIXct() to convert a text date to a POSIXct calendar date. This allows for standard use of a date object.

Date = '10/12/2016 14:32:10' #timezone is 'America/Chicago', which determines CST and CDT by date but probably presents a problem for the ambiguous times during switch days

(t = as.POSIXct(x=Date, format='%m/%d/%Y %H:%M:%S', tz='America/Chicago'))

Info on specification of format string can be found in strptime()

To change a POSIXct date object to an integer (i.e. unix timestamp)

as.numeric(t)

To see the attributes of a POSIXct date object

attributes(t)

To change the timezone of a POSIXct date object (for display only; it doesn't fundamentally change the date/time, which is still the same moment in time)

attributes(t)$tzone = 'UTC'

Note that this doesn't change the date itself; it only changes how it is displayed. For example, the Unix timestamp returned by as.numeric(t) is unchanged by the timezone change.
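
The display-only nature of the timezone attribute can be checked directly. This is a minimal sketch in base R, assuming the 'America/Chicago' and 'UTC' zones are available on the system; stampBefore and stampAfter are illustrative names.

```r
t <- as.POSIXct('10/12/2016 14:32:10', format = '%m/%d/%Y %H:%M:%S',
                tz = 'America/Chicago')
stampBefore <- as.numeric(t)   # Unix timestamp (seconds since 1970-01-01 UTC)

attr(t, 'tzone') <- 'UTC'      # change how the time is displayed
stampAfter <- as.numeric(t)    # same moment in time, so same timestamp

format(t, '%Y-%m-%d %H:%M:%S') # "2016-10-12 19:32:10" (Chicago was on CDT, UTC-5)
```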

If you want to force the time zone to change without updating the clock time, use force_tz() from the lubridate package. Note that this will change the moment in time, but it can be useful when functions default to giving you UTC and your input was in another time zone (e.g., when converting from Excel).

NewTime = force_tz(OldTime, tz='America/Chicago')

This wikipedia page is useful for finding valid timezone info

The Epoch Converter is also a useful web resource

Using Dataframes

Opening dataframes from various sources

lmSupport::lm.readDat() loads tab delimited text in Curtin lab format

d = lm.readDat('Data.dat')
d = lm.readDat('Data.dat', SubID = 'ID')

utils::read.table() to read text data

d = read.table('Prison.dat', header=TRUE)
d = read.table('clipboard', header=TRUE) #read data via the clipboard (e.g., from Excel)
d = read.table('SampleData.dat', header=TRUE) #read .dat data file

foreign::read.spss() loads SPSS data files

d = read.spss('Prison.sav', to.data.frame=TRUE)

base::scan() reads input typed at the keyboard directly into a vector (not a data frame). Separate entries by space. Enter a blank line to terminate input.

dData = scan()

R.matlab::readMat() and writeMat() are used to read and write Matlab MAT files.

d = readMat('X.mat')
d= data.frame(d)

base::file.choose() is used to bring up dialog box to select filename and path

d= read.table(file.choose(), header=TRUE)

'clipboard' is used in various functions to read from the clipboard rather than a file.

d = read.delim('clipboard') #after copying Excel data to clipboard

Creating a new dataframe

base::data.frame() creates a dataframe from vectors

d= data.frame(X=seq(2,10,2), Y=(1:5), Z=c(1,3,6,10,12))#define vectors named X, Y, and Z
d= data.frame(BevGroup, Sex, Age) #use previously defined vectors.

Saving dataframe

lmSupport::lm.writeDat() writes a data frame to tab-delimited text file using Curtin lab defaults

lm.writeDat(d, 'Data.dat')

utils::write.table() writes a dataframe to a text file.

write.table(d, file='c:\\Data.dat', sep='\t') #use sep = '\t' to write as tab-delimited (non-default option)
write.table(d, file='c:\\Data.dat', sep='\t', row.names=FALSE) #use row.names=FALSE to not write case/row.names (i.e., if you want to use data later in SPSS)

Display data and properties of dataframes

car::some() displays 10 (by default) randomly selected participants from the data frame

some(d)
some(d,20) #passing second argument allows display of more (or less) cases

utils::head() displays the first n rows of the dataframe

head(d)
head(d,20) #passing second argument allows display of more (or less) cases

utils::tail() displays the last n rows of the dataframe

tail(d)
tail(d,20) #passing second argument allows display of more (or less) cases

utils::View() displays dataframe in a crude spreadsheet. [see also relimp::showData()]

View(d)

base::rownames() sets the row names for a dataframe

rownames(d) = as.character(d$SubID) #set row names to the SubIDs
row.names(d) #print row names for d

base::dim() provides the dimensions (# of rows and columns) of the dataframe

dim(d)

base::nrow() provides the # of rows (observations) of the dataframe

nrow(d)

base::ncol() provides the # of columns (variables) of the dataframe

ncol(d)

utils::str() compactly displays the structure of an object

str(d)

base::names() gets or sets the names of an object.

names(d) #displays the variable names of the dataframe
names(d) = c('VarName1', 'VarName2', 'VarName3') #set names of three variables in d
names(d)[3] = 'VarName3' #set name of third variable to 'VarName3'

Indexing

base::which() returns indices for specific cases based on variables in dataframe

which(d$Age> 21) #return indices for cases based on Age
which(d$BevGroup == 'alcohol') #return indices for cases where factor BevGroup = alcohol (level label)

stats::na.omit() selects subset of non-missing cases in dataframe

dNew = na.omit(d)

stats::complete.cases() returns a logical vector indicating which cases are complete, i.e., have no missing values.

complete.cases(d$X1, d$X2)
d = d[complete.cases(d$X1, d$X2),] #keep only cases with no missing values on X1 and X2
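
The difference between checking every variable and checking only selected variables can be seen on a small made-up dataframe. This is a minimal sketch in base R; the data and the names keep and dSub are for illustration only.

```r
d <- data.frame(X1 = c(1, NA, 3, 4),
                X2 = c(10, 20, NA, 40),
                X3 = c(NA, 1, 1, 1))

complete.cases(d)                   # all variables: FALSE FALSE FALSE TRUE
keep <- complete.cases(d$X1, d$X2)  # only X1 and X2: TRUE FALSE FALSE TRUE
dSub <- d[keep, ]                   # rows 1 and 4; row 1 still has NA on X3
```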

car::whichNames() returns indices of specific row names in dataframe

whichNames(c('1001', '2022'), d) #returns indices of SubIDs 1001 and 2022 (assuming SubIDs are row names)

Get row name of specific indices

rownames(d)[10] #returns row name of case 10
rownames(d)[1:10] #returns row names of first 10 cases

Selecting a single variable in a dataframe.

d$MyVariable
d[1] #select variable in first column of dataframe

Selecting cases in a dataframe.

d[10, ] #select case 10 in d
d[7:10, ] #select cases 7-10 in d
d[c(7,10,11), ] #select cases 7, 10, 11 in d

Dataframe Manipulation

Aggregating data

Use the plyr package. In this example, we use the baseball dataset. In this dataset, each baseball player has n rows for each of the n years they played ball. To make an aggregate data file that includes the mean and max number of runs across years for each player, use the following code:

NewData = ddply(.data=baseball, .variables=c('id'), .fun=summarise, MeanRuns = mean(r), MaxRuns = max(r))

Convert dataframe from LONG to WIDE format

Use dcast() from reshape2 package

The dcast formula has the following format: x_variable + x_2 ~ y_variable + y_2 ~ z_variable ~ … The order of the variables makes a difference. The first varies slowest, and the last fastest.

If the combination of variables you supply does not uniquely identify one row in the original data set, you will need to supply an aggregating function, fun.aggregate=. You would typically use fun.aggregate=mean.

First, let's create a sample dataframe in LONG format for this example.

dLong = read.table(header=TRUE, text='
SubID sex condition measurement
1 M control 7.9
1 M cond1 12.3
1 M cond2 10.7
2 F control 6.3
2 F cond1 10.6
2 F cond2 11.1
3 F control 9.5
3 F cond1 13.1
3 F cond2 13.8
4 M control 11.5
4 M cond1 13.4
4 M cond2 12.9
')

Then use dcast as follows:

dWide = dcast(data=dLong, formula= SubID + sex ~ condition, value.var='measurement')

Convert dataframe from WIDE to LONG format

Use melt() from the reshape2 package. You need to specify:

id.vars: the variables that will not be split apart on melt
measure.vars: the variates of the within subject variable
variable.name: the name of the within subject variable
value.name: the name of the dependent variable

First, let's make a WIDE example dataframe

dWide <- read.table(header=TRUE, text='
SubID sex control cond1 cond2
1 M 7.9 12.3 10.7
2 F 6.3 10.6 11.1
3 F 9.5 13.1 13.8
4 M 11.5 13.4 12.9
')

Then use melt():

dLong <- melt(dWide, id.vars=c('SubID', 'sex'), measure.vars=c('control', 'cond1', 'cond2'), variable.name='condition', value.name='DV')

Summary/Descriptive Statistics

Calculate summary statistics separately on every subject in dataframe

Use the plyr package. In this example, we use the baseball dataset. In this dataset, each baseball player has n rows for each of the n years they played ball. There is a year variable which indicates the calendar year (e.g. 1991) for each year played for each player. To calculate the last year each player played ball, choose one of these options:

Option 1 uses summarise to return one row per subject in a new dataframe, LastYears:

LastYears = ddply(.data=baseball, .variables=c('id'), .fun=summarise, MaxYear = max(year))

Option 2 uses transform to return max year in every existing row in the original dataframe (same value for every row for the same player)

baseball = ddply(.data=baseball, .variables=c('id'), .fun=transform, MaxYear = max(year))

Option 3 uses our own user-defined function to demo the situation where you need a new or more complex function that doesn't already exist (of course, this one does exist):

NewMax = function(x) {max(x, na.rm=TRUE)}
LastYears = ddply(.data=baseball, .variables=c('id'), .fun=summarise, MaxYear = NewMax(year))

Programming and Debugging (DRAFT)

Conditional branching

if() is not vectorized: if (x < 0) -x else x

ifelse() is vectorized
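
The contrast between the two can be sketched as follows (base R only; absScalar is a made-up helper name). Note that in recent versions of R (4.2 and later, as far as we know) passing a condition of length greater than one to if() is an error rather than a warning.

```r
x <- c(-2, 0, 3)

# ifelse() evaluates the test for every element and returns a vector
absVec <- ifelse(x < 0, -x, x)   # 2 0 3

# if() expects a single TRUE/FALSE, so use it on one value at a time
absScalar <- function(v) if (v < 0) -v else v
absScalar(-2)                    # 2
absLoop <- sapply(x, absScalar)  # 2 0 3
```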

Logical operators:

!x
x & y
x && y
x | y
x || y
xor(x, y)

Comparison operators:

== != > < >= <=
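
A short sketch of how the single and double forms differ (base R only; the vectors are made up). The single forms work element by element, while && and || look at a single value and stop evaluating as soon as the result is known, which makes them the usual choice inside if().

```r
a <- c(TRUE, TRUE, FALSE)
b <- c(TRUE, FALSE, FALSE)

andVec <- a & b      # TRUE FALSE FALSE
orVec  <- a | b      # TRUE TRUE FALSE
xorVec <- xor(a, b)  # FALSE TRUE FALSE

# && short-circuits: the right side is never evaluated when the left is FALSE,
# so the comparison with NULL below does not cause a problem
x <- NULL
ok <- !is.null(x) && x > 0   # FALSE
```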

Functions

return()

Control structures

switch()

convert2meters <- function(x, units=c("inches", "feet", "yards", "miles")) {
   units <- match.arg(units)  # validate units; defaults to the first choice
   switch(units,
       inches = x * 0.0254,
       feet = x * 0.3048,
       yards = x * 0.9144,
       miles = x * 1609.344)
}

for()

f = 1 #start the running product at 1 (starting at 0 would leave f at 0)
for (i in 1:10) {f = f * i} #f is now 10! = 3628800

while()

fact3 <- function(x){
   # check length first so the remaining tests only ever see a single value
   if ((length(x) != 1) || (!is.numeric(x))
        || (x != floor(x)) || (x < 0))
       stop("x must be a non-negative integer")
   i <- f <- 1  # initialize
   while (i <= x) {
       f <- f * i  # accumulate product
       i <- i + 1  # increment counter
       }
   f  # return result
}

repeat

fact4 <- function(x) {
   # check length first so the remaining tests only ever see a single value
   if ((length(x) != 1) || (!is.numeric(x))
        || (x != floor(x)) || (x < 0))
       stop("x must be a non-negative integer")
   i <- f <- 1  # initialize
   repeat {
       f <- f * i  # accumulate product
       i <- i + 1  # increment counter
       if (i > x) break  # termination test
   }
   f  # return result
}

recursion

fact5 <- function(x){

   if (x <= 1) 1  # termination condition
   else x * fact5(x - 1)  # recursive call

}

Function definitions (NEED)
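
One possible sketch for this entry: a function is created with function(), arguments can be given default values, and the value of the last expression evaluated is returned (return() is only needed to exit early). The function name zScore and its arguments are made up for illustration.

```r
zScore <- function(x, na.rm = TRUE) {
  m <- mean(x, na.rm = na.rm)  # na.rm defaults to TRUE unless the caller overrides it
  s <- sd(x, na.rm = na.rm)
  (x - m) / s                  # last expression is the return value
}

z1 <- zScore(c(1, 2, 3))       # -1 0 1
z2 <- zScore(c(1, 2, 3, NA))   # -1 0 1 NA (missing value ignored when computing m and s)
```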

Debugging

browser()
debug() & undebug()

system.time()

debugger() with options(error=dump.frames) & options(error=NULL)

Rprof w/tempfile() & unlink() & summaryRprof()
traceback()
str()
class()
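
A minimal sketch of two of the tools listed above, system.time() and debug()/undebug(), using a made-up function (slowSum). It only sets and clears the debug flag; in an interactive session, calling the function while the flag is set would step through it line by line.

```r
slowSum <- function(n) {
  total <- 0
  for (i in seq_len(n)) total <- total + i
  total
}

# system.time() reports user, system, and elapsed time for an expression
timing <- system.time(result <- slowSum(1e5))
timing['elapsed']

debug(slowSum)                   # flag the function for line-by-line stepping
flagged <- isdebugged(slowSum)   # TRUE
undebug(slowSum)                 # clear the flag
cleared <- !isdebugged(slowSum)  # TRUE
```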

Iterative Procedures

Calculate summary statistics separately on every subject in dataframe

Use the plyr package. In this example, we use the baseball dataset. In this dataset, each baseball player has n rows for each of the n years they played ball. There is a year variable which indicates the calendar year (e.g. 1991) for each year played for each player. To calculate the last year each player played ball, choose one of these options:

Option 1 uses summarise to return one row per subject in a new dataframe, LastYears:

LastYears = ddply(.data=baseball, .variables=c('id'), .fun=summarise, MaxYear = max(year))

Option 2 uses transform to return max year in every existing row in the original dataframe (same value for every row for the same player)

baseball = ddply(.data=baseball, .variables=c('id'), .fun=transform, MaxYear = max(year))

Option 3 uses our own user-defined function to demo the situation where you need a new or more complex function that doesn't already exist (of course, this one does exist):

NewMax = function(x) {max(x, na.rm=TRUE)}
LastYears = ddply(.data=baseball, .variables=c('id'), .fun=summarise, MaxYear = NewMax(year))

Create and save (to pdf) individual subject plots

In this example, we use the baseball dataset. In this dataset, each baseball player has n rows for each of the n years they played ball. To make subject-by-subject plots of runs (r) by at bats (ab), do the following:

xlim = range(baseball$ab) #common x-axis range across players
ylim = range(baseball$r) #common y-axis range across players
MakePlot = function(df)
{
plot(df$ab, df$r, xlim = xlim, ylim = ylim, xlab = 'at bat', ylab = 'runs')
title(df$id[1])
}
pdf('c:\\paths.pdf', width = 8, height = 4) #print to pdf
d_ply(.data=baseball, .variables=c('id'), .fun=failwith(NA, MakePlot), .print=TRUE)
dev.off() #turn off output to pdf

Estimate linear models on individual subjects

In this example, we use the baseball dataset. In this dataset, each baseball player has n rows for each of the n years they played ball. To estimate a linear model for each player regressing runs (r) on at bats (ab) and save the models in a list, do the following:

DoLM = function(df) {lm(r ~ ab, data=df)}
Models = dlply(.data=baseball, .variables=c('id'), .fun=DoLM)

NOTES: Defined DoLM outside of plyr call to demonstrate this functionality. Could include multiple lines of code in DoLM if needed.

Extract parameters from list of linear models with plyr

In this example, we use the baseball dataset. We first make a list of simple linear models within subject, as in the previous example

DoLM = function(df) {lm(r ~ ab, data=df)}
Models = dlply(.data=baseball, .variables=c('id'), .fun=DoLM)

To extract the parameters (and model r-squared) and save as variables in a data frame, do the following:

rsq = function(x) summary(x)$r.squared
Parameters <- ldply(Models, function(x) c(coef(x), rsquare = rsq(x)))

John Curtin’s R Reference Card

R Installation and Workspace

utils::install.packages() installs the package or packages listed. Must load package (using library()) after installation.

install.packages('car', dependencies = TRUE)

base::library() loads a package into the workspace for use or lists available packages

library(car)
library() #lists all installed packages

base::detach('package:car') removes the package from the workspace.

detach('package:car')

base::search() returns list of attached packages and dataframes.

search()

base::ls() returns names of objects in workspace.

ls()

base::rm() removes an object from the workspace.

rm(dData) #removes dataframe dData
rm(dData, mLM) #removes dData and mLM
rm(list = ls()) #removes all objects in workspace (with no warnings).

base::options() get or set options for R

options(digits=4) #sets digits options to 4
options() #returns all options
options('digits') #returns option setting for digits
names(options()) #returns the names of all options

base::source() accept input from the named file. Used typically to load function libraries that are not in packages.

source('P:\\Methods\\Statistics\\R\\functions\\CurtinGLM.R')

utils::str() returns the structure of an object.

str(dData) #return structure of the dataframe dData

base::class() returns the class of an object. Useful for debugging.

class(dData)
class(mLM)

utils::methods() List all available methods for an S3 generic function, or all methods for a class.

methods(lm)

utils::data() loads specified data sets, or list the available data sets.

data() #lists available data sets
data(USArrests) #loads USArrests data.frame

grDevices::graphics.off() closes all graphic devices.

grDevices::dev.off() closes current graphic device.

ctrl-L clears the console.

rm(list = ls()) clears the workspace

Help

The CRAN Task Views webpage provides overviews on various topics in R.

utils::help() provides help on functions.  ? is a shortcut.

help(lm)
?lm
help('for')
help(package = Hmisc) #help on a package

utils::apropos() finds functions or other objects by partial name.

apropos('lm')
apropos('log')

utils::help.search() provides a broader search of a topic in all installed packages.  ?? is a shortcut.

help.search('log')
??'linear model'

base::args() displays the argument names and corresponding default values of a function.

args(lm)

utils::RSiteSearch() provides a search of websites, mailing lists, etc.

RSiteSearch('loglinear', 'functions')

General Useful Functions

base::sign returns a vector with the signs of the corresponding elements in its single argument (the sign of a real number is 1, 0, or -1 if the number is positive, zero, or negative, respectively).

sign(c(-2, -1, 0, 1, 2))

base::abs returns a vector with the absolute values of elements in its single argument.

abs(c(-2, 0, 2))

base::identical(x,y) compare R objects ‘x’ and ‘y’ and tests for equality. Helpful when comparing vectors where == would return a vector but identical returns single TRUE or FALSE.

identical(1, 1)
identical(c(1,2), c(1,3))
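
A short sketch of how identical() differs from == and from all.equal() (base R only; the values are made up). identical() is strict about type and exact value, so all.equal() is usually the better choice for comparing computed decimals.

```r
x <- c(1, 2)
y <- c(1, 3)

elementwise <- x == y        # TRUE FALSE: one result per element
same <- identical(x, y)      # FALSE: a single result for the whole object

typeStrict <- identical(1L, 1)                  # FALSE: integer vs double
exact <- identical(0.1 + 0.2, 0.3)              # FALSE: floating point rounding
approx <- isTRUE(all.equal(0.1 + 0.2, 0.3))     # TRUE: equal within a small tolerance
```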

base::is.element(x,y) tests ‘x %in% y’ and returns logical vector that is length of x.

is.element(c(1,2), c(1,3,5,7))

base::is.na() indicates which elements are missing.

is.na(c(1,NA,3))

base::unique() returns a vector, data frame, or array with duplicate elements/rows removed.

base::sort(x, decreasing = FALSE, index.return = FALSE, …) sorts/orders a vector or factor (partially) into ascending (or descending) order. For ordering along more than one variable, (e.g., for sorting data frames), see order(). index.return=TRUE will return indices for the new sorted vector

sort(c(2,1,5))
sort(c(2,1,5), index.return=TRUE)
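
For sorting a dataframe on more than one variable, order() returns row indices that can be used in the row position of the index. This is a minimal sketch with made-up data; dSorted and ix are illustrative names.

```r
d <- data.frame(SubID = c(3, 1, 2, 4),
                Sex = c('M', 'F', 'M', 'F'),
                Age = c(25, 30, 22, 19))

# Sort by Sex (ascending), then by Age (descending) within Sex
dSorted <- d[order(d$Sex, -d$Age), ]

# For a single vector, the indices from sort() match those from order()
ix <- sort(d$Age, index.return = TRUE)$ix
```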

MASS::fractions() finds rational approximations to the components of a real numeric object.

fractions(c(.5, .33333333))

Dataframes

Indexing and manipulating

utils::fix() allows simple editing of an existing dataframe via a crude text editor.

fix(dData)

Creating a new dataframe with a subset of variables.

dNew = dData[,c('SubID', 'BevGroup', 'Sex', 'FPS1')]

Creating a new dataframe w/o specific row #s.

dNew = dData[-c(1:5,10),] #remove rows 1-5, 10

Creating a new dataframe based on values of a variable.

dNew = dData[dData$Age > 21,] #Select participants with Age > 21
dNew = dData[dData$Age > mean(dData$Age),] #Select participants with Age > mean Age

Remove a variable from dataframe.

dData$SubID <- NULL

Create a data frame from all combinations of the supplied vectors or factors

expand.grid(c('control', 'placebo', 'alcohol'), c('word first', 'color first'))

     Var1        Var2
1 control  word first
2 placebo  word first
3 alcohol  word first
4 control color first
5 placebo color first
6 alcohol color first

Reshaping dataframe from Wide to Long format.

dLong <- melt(dWide, id.vars = c("SubID", "Sex", "Alcohol", "Baseline"), variable.name = "Condition", value.name = "Startle")

1st argument = Wide format data frame
2nd argument = id.vars = Variables you want to keep as they are (repeated on each row) in the new data frame
3rd argument = variable.name = Name of the new column that identifies which original column each value came from
4th argument = value.name = Name of the new column that holds the values

Reshaping dataframe from Long to Wide format.

dWide <- dcast(dLong, SubID + Sex + Alcohol + Baseline ~ Condition, value.var = "Startle")

1st argument = Long format data frame
2nd argument (left of the ~) = Variables that stay as they are, identifying each row.
2nd argument (right of the ~) = Variables whose values become the new columns in wide format.
3rd argument = value.var = Name of the column holding the values that fill the new wide columns.

NOTE: Need to add information on merging dataframes
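
As a starting point for that note, base::merge() joins two dataframes on one or more shared key variables. This is a minimal sketch with made-up data; dDemo, dTask, and the result names are for illustration only.

```r
dDemo <- data.frame(SubID = c(1, 2, 3), Sex = c('M', 'F', 'F'))
dTask <- data.frame(SubID = c(2, 3, 4), Startle = c(10.6, 13.1, 13.4))

dInner <- merge(dDemo, dTask, by = 'SubID')                # only SubIDs in both (2, 3)
dAll   <- merge(dDemo, dTask, by = 'SubID', all = TRUE)    # every SubID; NA where missing
dLeft  <- merge(dDemo, dTask, by = 'SubID', all.x = TRUE)  # every SubID in dDemo
```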

Working with Variables (DRAFT)

General manipulations

base::cbind() combines vectors together as columns

test = cbind(dData$X1, dData$X2, dData$X3) #sets test to three columns from dData

String Pattern Matching

search for a string pattern
  • foo <- c('a', 'b', 'c') #create variable with a vector of strings
  • grep('a', foo, value=FALSE) #returns the indices of all elements of foo that match 'a'
  • grep('a', foo, value=TRUE) #returns the strings in foo that match 'a'
  • grepl('a', foo) #returns a logical vector (TRUE or FALSE for every element of foo) indicating a match to 'a'
search for a string pattern with wildcards
  • foo <- c('abc', 'def', 'ghi') #create variable with a vector of strings
  • grep(glob2rx('*e*'), foo) #returns the indices of all elements of foo containing the string 'e'

Quantitative Variable manipulations

base::scale() mean centers and/or standardizes (sd scales) the columns of a matrix. It returns a matrix, so you must index the first column when working with a variable in a data.frame.

dData$cX1 = scale(dData$X1, center=TRUE, scale=FALSE)[,1] #mean center X1
dData$zX1 = scale(dData$X1, center=TRUE, scale=TRUE)[,1] #standardize X1
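
The reason for the column index can be seen by inspecting what scale() returns. A minimal sketch in base R with a made-up vector:

```r
X1 <- c(2, 4, 6, 8)

m <- scale(X1, center = TRUE, scale = FALSE)
is.matrix(m)                # TRUE: a 4 x 1 matrix, not a plain vector
dim(m)                      # 4 1

cX1 <- m[, 1]               # first column as a vector: -3 -1 1 3
zX1 <- scale(X1)[, 1]       # centered and divided by the sd
```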

car::recode() recodes a numeric vector, character vector, or factor according to simple recode specifications.

dData$NewX1 = recode(dData$X1, 'lo:50=1; 51:hi=2')
dData$NewX2 = recode(dData$X2, 'c(1,2)="A"; else="B"')

base::rowMeans() is used to create a mean across a row (i.e., across variables in a data.frame). It takes a single matrix or dataframe, so select the variables first. See also colMeans(), rowSums(), & colSums().

dData$MeanX123 = rowMeans(dData[, c('X1', 'X2', 'X3')])

Summary Statistics (DRAFT)

base::print() is used to print the object to the screen. It is a generic function whose method depends on the object

print(c(1.234, 2.3456, 3),digits=2)

base::summary() is used to summarize an argument. It is a generic function whose method depends on the object.

summary(dData) #provide summary statistics for variables in a dataframe
summary(mLM) #provide summary statistics for a linear model object

psych::describe() provides many common summary statistics for variables in a dataframe.

describe(dData)

descriptives by group

describeBy(dData, group,…) #psych::describeBy() replaces the older describe.by()

base::table() provides cross tabs of counts for factors

table(Sex)
table (BevGroup, Sex)

xtabs

base::apply() returns a vector or array or list of values obtained by applying a function to margins of an array or data.frame. See also lapply(), sapply(), & tapply()

apply(dData, 1, mean) #mean across each row in dData
apply(dData, 2, sum) #sum down each column in dData
apply(dData, 1, function(x) 7*mean(x, na.rm=TRUE)) #using anon function across rows in dData
apply(dData, 1, make.scale) #applying user-defined function named make.scale()
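
A small worked sketch of apply() alongside sapply() and tapply(), using a made-up numeric dataframe and grouping vector (base R only):

```r
d <- data.frame(X1 = c(1, 2, 3), X2 = c(4, 5, 6), X3 = c(7, 8, NA))

# Extra arguments (here na.rm) are passed on to the applied function
rowM <- apply(d, 1, mean, na.rm = TRUE)   # row means: 4 5 4.5
colM <- sapply(d, mean, na.rm = TRUE)     # column means: X1 = 2, X2 = 5, X3 = 7.5

# tapply() applies a function to one variable within levels of a grouping variable
group <- c('a', 'b', 'a')
grpM <- tapply(d$X1, group, mean)         # a = 2, b = 2
```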

Bivariate Statistics (DRAFT)

Correlation

stats::cor()

cor(D1)
cor(D1[,c(“mp1_con”, “mp1_nem”)])

stats::cor.test()

psych::corr.test()

Hmisc::rcorr.adjust()

psych::fisherz()

psych::fisherz2r()

psych::r.test()

psychometric::CIr()

CIr(.5, n=100)
CIr(.3, n=100, level= 0.99)
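
For reference, a confidence interval for a correlation can also be sketched in base R with the standard Fisher z approach. This is an illustration under the assumption that CIr() follows the same textbook method, not a copy of its implementation; ciR is a made-up function name.

```r
ciR <- function(r, n, level = 0.95) {
  z <- atanh(r)                        # Fisher r-to-z transformation
  se <- 1 / sqrt(n - 3)                # standard error of z
  crit <- qnorm(1 - (1 - level) / 2)   # two-sided critical value
  tanh(c(lower = z - crit * se, upper = z + crit * se))  # back-transform to r
}

ci95 <- ciR(.5, n = 100)
ci99 <- ciR(.3, n = 100, level = 0.99)
```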

stats::p.adjust()

corpcor::cor2pcor()

Means comparison

stats::t.test()

Linear Models (DRAFT)

car::bcPower()
MASS::boxcox()
car::boxTidwell()
car::ncvTest()
car::qqPlot()
car::crPlots()
car::spreadLevelPlot()
model.matrix (~ type, data=dData)

Graphing (DRAFT)

There are many sample figures and additional resources for Graphing in R in our Wiki. In addition, the CRAN Graphics Task View provides a nice overview of graphing in R.

Options

graphics::par() sets and returns display (and many other) options for R. Type help(par) for detailed information on all parameters.

par() #returns names and values of all current options
par(cex.lab=1.5, cex.axis=1.2, lwd=2)

mfrow is used to produce multipanel figures filled by row (see also mfcol).

par(mfrow = c(2,2)) #set options to produce a four panel figure filled topleft, top right, bottom left, bottom right.
par(mfrow = c(1,2)) #set options to produce two panel horizontal orientation
par(mfrow = c(2,1)) #set options to produce two panel vertical orientation

Colors

palette()

rainbow()

gray()

colors()

High-level plotting functions

graphics::plot()

plot(aex_tot ~ sss_tot, xlab="MPQ Negative Emotionality", ylab="Anger Expression")

hist()

hist(Data$aex_tot, main="Anger Expression")

Low-level plotting functions

graphics::abline()

abline(lm(aex_tot ~ sss_tot), col="red", lwd=4)
abline(h=mean(D1$aex_tot), col="blue", lwd=4)

graphics::lines()

graphics::points()

graphics::axis()

graphics::legend()

text()

polygon()

curve()

arrows()

See also p.arrows() in sfsmisc package

Other useful graphing functions

identify()

identify(mp1_con,sss_tot)
identify(D1$mp1_con, D1$sss_tot, labels=row.names(D1))

density()

plot(density(Data$aex_tot), main="Anger Expression")

jitter()
Create multi-panel plot

locator()

plot(allEffects(mLM), ask=FALSE)

Mapping

https://rstudio.github.io/leaflet/

Building Packages

Read “Writing R Extensions” for more information. http://cran.r-project.org/doc/manuals/R-exts.html#Top

Read this for putting packages on CRAN http://cran.r-project.org/doc/manuals/R-exts.html#Submitting-a-package-to-CRAN

Using devtools to check and build package

  • Choose version number for release. See: [1]
  • Make sure package is up to date on Sourceforge
  • export package to C:\RBuild\lmSupport\
  • library(devtools)
  • set working directory in RStudio to package folder (e.g., C:\RBuild\lmSupport\)
  • check(document=FALSE)
  • build()
  • Upload to CRAN here: http://cran.r-project.org/submit.html

Other Notes

  • To install the tar in R, use install.packages('P:/Methods/R/lmSupport/lmSupport_2.9.8.tar.gz', repos=NULL, type='source')

Other R Reference Cards

R Reference Card by Tom Short

Regression Reference Card by Vito Ricci

Short R Reference Card by Jonathan Baron