Writing output to file

Remember that when writing to external files, you need to pay attention to the class of your object and what kind of output file you want to create.

Data frame to .csv:
A .csv file is a text file that can be read by a word processor or spreadsheet software. Use write.csv().

mydat <- data.frame(x=1:10, y=rnorm(10))  # rnorm(x) would look for an x in the workspace, not the new column
write.csv(mydat, file="mydat.csv")
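
By default write.csv() also writes the row names as an unnamed first column. Here is a minimal sketch, assuming you do not want that column and would like to check the file by reading it back. The temporary file path is only for illustration; use your own file name in practice.

```r
# Illustrative round trip; tempfile() is used so no real file is overwritten
mydat <- data.frame(x = 1:10, y = rnorm(10))
csvfile <- tempfile(fileext = ".csv")

write.csv(mydat, file = csvfile, row.names = FALSE)  # omit the row-name column
mydat2 <- read.csv(csvfile)                          # read it back as a data frame

names(mydat2)                  # column names of the file that was read back
all.equal(mydat$y, mydat2$y)   # compares within a tolerance, not exactly
```

Reading the file back right away is a cheap way to confirm that it contains what you think it does.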

Data frame (and other objects) to .Rdata:
An .Rdata file is a binary file specific to R. This is good for saving intermediate steps in data analysis that can be read in directly as R objects.

mydat <- data.frame(x=rnorm(10), y=rnorm(10))
a <- letters[1:10]
save(mydat, a, file="myobjects.Rdata")
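
Objects stored with save() come back with load(), under the names they had when they were saved. A short sketch of the round trip, again using a temporary file only for illustration:

```r
mydat <- data.frame(x = rnorm(10), y = rnorm(10))
a <- letters[1:10]
rdfile <- tempfile(fileext = ".Rdata")   # illustrative path

save(mydat, a, file = rdfile)   # save() stores the named objects
mydat_saved <- mydat            # keep a copy for comparison
rm(mydat, a)                    # remove the originals from the workspace

loaded <- load(rdfile)          # restores the objects under their original names
loaded                          # names of the objects that were restored
identical(mydat, mydat_saved)   # the restored data frame matches the copy
```

Note that load() will silently overwrite any object in your workspace that has the same name as one in the file.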

Lines of information to text file
If you want to save lines of output to a text file, use cat(). cat() takes character strings, which you can assemble using paste(). Each call writes its arguments to the file and, by default, overwrites whatever is already there, so if you want to add a line on to existing lines make sure to set append=TRUE. cat() does not add a line break for you, so end each line with "\n".

x <- 244
line <- paste("The value of x is:", x, "\n", sep=" ")
cat(line, file="myoutput.txt")
cat(format(Sys.time()), "\n", file="myoutput.txt", append=TRUE)  # format() gives a readable date-time
cat("That is all. The end\n", file="myoutput.txt", append=TRUE)
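
To check what ended up in the file, you can read it back with readLines(). A small sketch, assuming a temporary file rather than myoutput.txt:

```r
outfile <- tempfile(fileext = ".txt")   # illustrative path
x <- 244

cat("The value of x is:", x, "\n", file = outfile)             # creates or overwrites the file
cat(format(Sys.time()), "\n", file = outfile, append = TRUE)   # adds a second line
cat("That is all. The end\n", file = outfile, append = TRUE)   # adds a third line

out <- readLines(outfile)   # one element per line in the file
out
```

If you leave out append=TRUE on the later calls, only the last line survives.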

Dumping output to a text file
Sometimes you just want all of the output, say from a linear model summary, to be saved in a text file.

xy <- data.frame(x=rnorm(10), y=rnorm(10))
lm.xy <- lm(y ~ x, data=xy)

sink("myoutput.txt") # opens connection to file
print(summary(lm.xy)) # prints regression output
sink() ## directs output back to console (screen)

sink("filename") opens a connection to a file. Anything you print() after that will be sent to your file. Call sink() again, with no arguments, to send output back to the console.
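
If you only need the printed output of one expression, capture.output() (in the utils package, which is loaded by default) may be simpler, since there is no connection to forget to close. A sketch, with a temporary file standing in for your own file name:

```r
xy <- data.frame(x = rnorm(10), y = rnorm(10))
lm.xy <- lm(y ~ x, data = xy)
outfile <- tempfile(fileext = ".txt")   # illustrative path

# Writes what print(summary(lm.xy)) would show on screen to the file
capture.output(summary(lm.xy), file = outfile)

# The sink() approach, appending to the same file for comparison
sink(outfile, append = TRUE)
print(coef(lm.xy))
sink()

out <- readLines(outfile)   # read the file back to check it
```

If a script stops with an error between sink("file") and sink(), output keeps going to the file; calling sink() at the console restores it.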

Print plots to pdf
This is similar to the sink() example. Use pdf() to open a connection to a pdf file (or a “graphical device” of type pdf), then make your plots, then turn off the device.

xy <- data.frame(x=rnorm(10), y=rnorm(10))
pdf("xyplot.pdf") # opens pdf device
plot(xy$x, xy$y) # plots to file, nothing to screen
dev.off() # must turn off device or pdf won't open
plot(xy$x, xy$y) # plots to screen
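
Every plot made while the pdf device is open goes to a new page of the same file. A brief sketch, assuming a temporary file and a page size chosen only for illustration:

```r
xy <- data.frame(x = rnorm(10), y = rnorm(10))
pdffile <- tempfile(fileext = ".pdf")   # illustrative path

pdf(pdffile, width = 6, height = 4)     # page size in inches
plot(xy$x, xy$y)                        # page 1
hist(xy$y)                              # page 2
dev.off()                               # close the device so the file is complete

file.exists(pdffile)                    # check that the file was written
```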

Put some thought into the storage of your data and your output and your script. Think about opening the project a year from now and being able to remember what you did, with enough information to be able to replicate it.

  • Always save a copy of your data in text files (.csv). These are the most robust to archiving. No matter how file formats change, you can always open a text file.
  • Also save a copy of all of your output as pdfs or txt files. This is what you will refer to when you are revising your manuscript, etc. Even if your script is working, it's always a good idea to save the output so you know what you got.
  • Of course save a clean, commented, working copy of your script. Clean it up, remove extraneous objects and code, and run it from source after restarting R. Save all of these items in a clean folder with only the essential elements, inputs and outputs.

Reproducible results!

Matching quantitative values

3) Some of you have emailed me regarding matching on numeric values. In general, it is not a good idea to try to use == on quantitative numeric (non-integer) values.

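This is because non-integer numeric values are stored as double-precision floating-point numbers, and most decimal fractions cannot be represented exactly in binary. A value that prints as 5.1 may actually be stored as something like 5.1000000000000005, which will not match when you compare it to 5.1. The discrepancy is tiny (doubles carry roughly 15 to 17 significant decimal digits, while R prints only 7 by default), but == tests for exact equality. It's perfectly fine to use == for discrete values (like index numbers, which are integers, or factors or characters), but do not use it for quantitative values or you may be surprised.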

Here is an example of where this might come up.

x <- seq( from=0, to=50, by=0.1)
which(x==5.5) # works
which(x==5.1) # doesn't work
which(x==5.3) # doesn't work
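
To see what is going on, you can print a stored value with more digits than R shows by default. A small sketch using the same x; all.equal() compares within a tolerance rather than exactly:

```r
x <- seq(from = 0, to = 50, by = 0.1)

print(x[52])                    # default display, 7 significant digits
print(x[52], digits = 17)       # shows more of the digits actually stored
x[52] == 5.1                    # exact comparison fails for this element
x[52] - 5.1                     # tiny but nonzero difference
isTRUE(all.equal(x[52], 5.1))   # equal within all.equal()'s tolerance
```

all.equal() returns a description of the difference rather than FALSE when the values differ, which is why it is wrapped in isTRUE().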

Instead, use:

which(x>5.0 & x<5.2 ) # Better way

Of course it may return more than one index. You can just take the mean index:
which( x>4.9 & x<5.3 )
[1] 51 52 53

ii <- which( x>4.9 & x<5.3 )
mean(ii)
[1] 52

Of course, rather than using fixed numbers, you may want to have a guess for x called P_x and test in a small neighborhood around that point. So if your delta is 0.1, you could code it as:

ii <- which((x>P_x-delta) & (x<P_x+delta))
mean(ii)

If you are concerned about getting an even number of indices (resulting in a mean ending in .5), just round down:

ii <- floor(mean(ii))

Then the coordinates of your guess, for example, would be:

P_x <- x[ii]
P_fx <- fx[ii]

If you create a vector of derivatives, you can use ii to index that as well:

dx <- derivative[ii] # if you store your vector of derivatives in "derivative"
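
Putting the pieces together, here is a sketch of the whole procedure. The vectors fx and derivative are made up for illustration (x squared and its derivative), and delta is set to 0.15 rather than 0.1 so that the edges of the window fall between grid points instead of on them. The last line shows an alternative, which.min(), that picks the nearest grid point directly.

```r
x  <- seq(from = 0, to = 50, by = 0.1)
fx <- x^2              # hypothetical function values, for illustration only
derivative <- 2 * x    # hypothetical derivative values

P_x   <- 5.1           # guess
delta <- 0.15          # half-width of the window (assumed value)

ii <- which((x > P_x - delta) & (x < P_x + delta))   # indices near the guess
ii <- floor(mean(ii))                                # single middle index

P_x  <- x[ii]            # grid value at that index
P_fx <- fx[ii]           # function value there
dx   <- derivative[ii]   # derivative there

jj <- which.min(abs(x - 5.1))   # alternative: index of the nearest grid point
```

A window whose edges land exactly on grid values brings the floating point problem back, because whether an edge point is included depends on the last bits of the stored numbers.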

Marguerite