Sweaving through your PhD: learning LaTeX in RStudio

I’m approaching thesis write-up mania.

Actually, I’ve been approaching it for a while, but with the mania also comes self-delusion. I figured that a very productive way to put off writing would be to learn LaTeX, a document preparation system that has many advantages over word processors for scientific publications. In particular, snippets of R code can be integrated to provide self-updating outputs in the document (magic!). I told myself that this would undoubtedly save me time later on. For once, this procrastination actually worked out pretty well.

So, I’m going to share with you a quick guide on how I’ve been using a tool called Sweave in RStudio (.Rnw files) to seamlessly mesh R outputs with my thesis chapters and scientific manuscripts. Here are just a few of the important benefits:

  • Changing the layout is super easy. Need to change the margins? Sorted. Want all your figures at the end of the document instead of spread throughout? Done. Changed your mind? Simpler still.
  • Fewer copying errors. Instead of copying the R² value for a given model by hand, just pop in the R code that computes it, and you’re done. This also means that as your dataset and models change, you can just recompile the document to reflect the newest version of your analyses.
  • Cross-referencing tables, figures and sections is simple. The numbering is automatic, so you’ll never accidentally end up with duplicate figures.
  • Easy version control. Git works really well with .Rnw and .tex files, while version control of Word documents so far seems impossible. So, if you’ve scrapped a bit of text that now would make your introduction flow perfectly, it’s no problem.

The disadvantages:

  • Formatting tables can be a chore. I have found it pretty difficult to get tables looking the way I want them, but I’m getting there.
  • Difficult to get comments. Many people prefer to comment on Word documents. This means that to get comments from my co-authors, I have to compile the PDF from my .Rnw file, then convert the PDF to Word (which always has some hideous formatting errors associated with it). Then, I have to go back into my .Rnw file and copy in all the comments from the Word file. And of course, there’s never just one round of comments…

Overall, I think the advantages far outweigh the disadvantages. But my PhD is pretty data heavy, so the ability to integrate with R code is really important to me. If you’ve got 17 different supervisors, it may not be the way to go, but for me, it’s been a delight.

Jump to:
Starting a new document
The Preamble
Begin the document
Fonts
Sections
Bullet points and lists
Including R code
Including ready-made figures
Cross-referencing
Using R code in-text
Finishing the document
My favourite package of the moment
Additional help

Starting a new document

I’ve been using Sweave for no other reason than it’s the default in RStudio. So, to begin:

  • Open an Sweave document in RStudio. It will look like this:
\documentclass{article}
\begin{document}
\SweaveOpts{concordance=TRUE}
\end{document}
  • Press ‘Compile PDF’ to create the pdf of the document.
  • If this fails, you may need to direct RStudio to the tools needed, by running the following code in your R console:
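Sys.setenv(
  # Add MiKTeX to the end of the existing PATH instead of overwriting it
  PATH = paste(Sys.getenv("PATH"),
               "C:\\Program Files\\MiKTeX 2.9\\miktex\\bin\\", sep = ";"),
  TEXINPUTS = "C:\\Users\\Name\\Documents\\Latex",
  BIBINPUTS = "C:\\Users\\Name\\Documents\\BibTex",
  BSTINPUTS = "C:\\Users\\Name\\Documents\\Bibliography_style")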

This points to your:

  • TeX programme (I’ve used MiKTeX),
  • .tex files (wherever you’re stashing your .Rnw files)
  • BibTeX files. These are your references. Your reference software should be able to output the library in BibTeX format.
  • Bibliography style. You can use a style associated with a package, or you can download a specific style for the journal format you’re after.
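
To actually use those reference files, a document needs citations in the text and two bibliography commands near the end. Here's a minimal sketch, assuming natbib is loaded with the round option, a made-up library file called library.bib sitting in your BIBINPUTS folder, and a made-up entry with the key smith2015:

```latex
% In the text: \citet gives "Smith (2015)", \citep gives "(Smith, 2015)"
\citet{smith2015} showed this first, and others agree \citep{smith2015}.

% Just before \end{document}:
\bibliographystyle{plainnat} % an author-year style that ships with natbib
\bibliography{library}       % library.bib, written without the extension
```

Swap plainnat for a journal-specific .bst file if you've downloaded one into your BSTINPUTS folder.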

The Preamble

  • This goes before your actual text (so before \begin{document})
  • This is where you specify:
    • the document format (e.g. paper and font size)
    • what packages to use
    • the metadata (e.g. your author list and title).

So, my preamble might be:

\documentclass[10pt]{article}
\usepackage[round,sort]{natbib} % make the in-text citations with a round bracket, and sort multiple citations into the order of the reference list
\usepackage{authblk} % allows pretty formatting of author lists
\usepackage[capitalise]{cleveref}  % when you reference a table or figure, automatically format as "Table 1" rather than "table 1" or "1".
\title{The joy of LaTeX}
\author[1,2]{Adriana De Palma}
\author[1,3]{Matthew Dray}
\affil[1]{The Rostrum Blog}
\affil[2]{Imperial College London}
\affil[3]{Cardiff University}
\date{\today} %automatically writes in today's date
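
The preamble above only sets the font size. As a sketch of the other layout settings mentioned (paper size and margins), the lines below use the geometry package, which isn't part of my preamble above; the a4paper, 12pt and 2.5cm values are just for illustration:

```latex
\documentclass[a4paper,12pt]{article} % A4 paper, 12pt base font
\usepackage[margin=2.5cm]{geometry}   % the same margin on all four sides
```

Changing the margins later is then a matter of editing one number and recompiling.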

Begin the document

  • Begin the document after the preamble with \begin{document}
  • Note that in LaTeX, ‘commands’ (like \begin) require a backslash to identify them as code rather than text.
  • Write whatever you want as you would in a normal text editor.
\begin{document}
\maketitle
\SweaveOpts{concordance = TRUE}

Fonts

There are a few formatting quirks to learn:

  • To italicise, use \textit{my text to be italicised}
  • To embolden, use \textbf{my text to be emboldened}

Sections

  • LaTeX has a very pretty default style.
  • If you want subtitles and subsubtitles, all you need to do is this:
\section{My section name}
\subsection{My subsection name}
\subsubsection{My subsubsection name}

Bullet points and lists

I really like LaTeX because:
\begin{itemize}
\item I can make pretty lists
\item It's really this simple
\end{itemize}
  • If you want a numbered list, just swap out itemize for enumerate.
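
For instance, the numbered version might look like this (the list items are made up):

```latex
My plan for today:
\begin{enumerate}
\item Write the introduction
\item Reward myself with tea
\end{enumerate}
```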

Including R code

  • You can easily add R code into your file with code chunks:
<<echo = FALSE, include = FALSE, fig = TRUE>>=
data<-rnorm(1000, mean = 0, sd = 1)
hist(data, main = "")
@
  • Place the R code inside a “code chunk”, beginning with <<>>= and ending with @.
  • Here are the meanings for the code chunk options:
    • echo = FALSE: don’t print the actual R code, just the output
    • include = FALSE: don’t insert the output into the final document automatically. Because fig = TRUE, the figure is still saved as a separate PDF file.
    • fig = TRUE: specify that the output is a figure.

This code is very simple, but it doesn’t number or caption the figure. To do that, wrap the chunk in a figure environment and set include = TRUE:

\begin{figure}
\begin{center}
<<echo = FALSE, fig = TRUE, include = TRUE>>=
data<-rnorm(1000, mean = 0, sd = 1)
hist(data, main = "")
@
\end{center}
\caption{Histogram of random data}
\label{fig:one}
\end{figure}
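
The include = FALSE option from the first chunk comes in handy when you want the figure to appear somewhere other than where the code sits. Here's a sketch, assuming the file is saved as thesis.Rnw (a made-up name) and the chunk is given the label hist. With Sweave's default settings the figure should then be written to thesis-hist.pdf, which you can place by hand:

```latex
<<hist, echo = FALSE, fig = TRUE, include = FALSE>>=
hist(rnorm(1000), main = "")
@

% ...later in the document, wherever the figure should go:
\begin{figure}
\begin{center}
\includegraphics[width=0.7\textwidth]{thesis-hist}
\end{center}
\caption{Histogram placed by hand}
\label{fig:byhand}
\end{figure}
```

If your file has a different name, the figure file name changes with it, so check the folder for the PDF that Sweave actually produced.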

Including ready-made figures

  • If your figures are in a different folder to your .Rnw file, then point to your graphics folder in your preamble:
\graphicspath{{C:/path/figs/}}
  • Now add the figure into your document:
\begin{figure}
\begin{center}
\includegraphics{myfig}
\end{center}
\caption{A figure of something cool}
\label{fig:cool} %this line allows you to reference the figure in text
\end{figure}
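
Ready-made figures are often too big for the page. A small sketch of scaling one with the width option of \includegraphics (the 0.8 is an arbitrary choice, and the label is made up so it doesn't clash with the one above):

```latex
\begin{figure}
\begin{center}
\includegraphics[width=0.8\textwidth]{myfig} % 80% of the text width
\end{center}
\caption{A figure of something cool, scaled to fit}
\label{fig:coolscaled}
\end{figure}
```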

Cross-referencing

  • In the section above, we’ve labelled our figures (\label{fig:one} and \label{fig:cool}).
  • We can use these labels to reference the figures.
  • If the order of the figures changes, the figure numbers will automatically reorder as well (so no more finding out that you’ve accidentally referred to figure 6 instead of 2 because you changed the layout).
  • Reference things using the command \cref{label}.
  • You don’t need to write that it’s a table or figure, because \cref automatically generates the correct text.
The results really were awesome (see \cref{fig:one}).
Our analyses revealed some amazing results, shown in \cref{fig:cool}.
  • You can label (and therefore reference) most things, including code chunks, tables, figures and sections (very useful when referring to an Appendix section for instance) as long as you specify a label:
\section{The start}
\label{sec:beginning}
I really like \cref{sec:beginning}.
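
Tables can be labelled and referenced in the same way. Here's a sketch that assumes you have the xtable R package installed (it isn't used elsewhere in this post) and borrows R's built-in cars dataset as stand-in data; results = tex tells Sweave to pass the generated table code straight through to LaTeX:

```latex
<<echo = FALSE, results = tex>>=
library(xtable)
# head(cars) is just example data; swap in your own data frame
print(xtable(head(cars), caption = "The first few rows of the cars data",
             label = "tab:cars"))
@
The raw data are shown in \cref{tab:cars}.
```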

Using R code in-text

  • Do R calculations or print values in text using the \Sexpr command:
Our data ranged from \Sexpr{round(min(data))} to \Sexpr{round(max(data))} (see \cref{fig:one}).
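
The same trick works for model outputs like R². A minimal sketch, again using R's built-in cars dataset as stand-in data (fit and r2 are names I've made up): do the calculation in a hidden chunk, then drop the stored value into the sentence.

```latex
<<model, echo = FALSE>>=
fit <- lm(dist ~ speed, data = cars) # cars is a dataset built into R
r2 <- summary(fit)$r.squared         # store the value for use in the text
@
Speed explained \Sexpr{round(100 * r2)}\% of the variation in stopping
distance ($R^2$ = \Sexpr{round(r2, 2)}).
```

Doing the calculation in a chunk keeps the \Sexpr calls short, and the numbers update whenever the data or model change.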

Finishing the document

  • Once you’ve finished writing your document, don’t forget to sign off:
\end{document}
  • Every single time you \begin something (e.g. \begin{center}, \begin{table}, \begin{figure}), you have to \end it.
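
To see how the pieces fit together in order, here's a cut-down sketch of a complete .Rnw file built from the snippets above (one author, one section and one figure, rather than everything in this post):

```latex
\documentclass{article}
\usepackage[capitalise]{cleveref}
\title{The joy of LaTeX}
\author{Adriana De Palma}
\date{\today}

\begin{document}
\maketitle
\SweaveOpts{concordance = TRUE}

\section{Results}
\label{sec:results}

\begin{figure}
\begin{center}
<<echo = FALSE, fig = TRUE>>=
data <- rnorm(1000, mean = 0, sd = 1)
hist(data, main = "")
@
\end{center}
\caption{Histogram of random data}
\label{fig:one}
\end{figure}

Our data ranged from \Sexpr{round(min(data))} to
\Sexpr{round(max(data))} (see \cref{fig:one}).

\end{document}
```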

Here’s the output when you have all this code in an .Rnw file in RStudio and you press the ‘Compile PDF’ button.

My favourite package of the moment

So, I think these are the basics to get started with. Like I said, there are loads of amazing packages out there. My favourite so far is endfloat, which automatically moves all your tables and figures to the end of the document when preparing it for publication. Just add this code to your preamble:

\usepackage[tablesfirst]{endfloat}

Then, when you want it as a chapter, just remove that line of the preamble and they’re back where you originally placed them.

Additional help

Well, this was a very short example, but there’s a wealth of amazing help files out there: the wikibook is particularly good. So have a gander, and if you’re willing to spend a little time now getting to grips with it, Sweave in RStudio could save you quite a bit of time in the long run. Anyway, happy writing all!

3 responses to “Sweaving through your PhD: learning LaTeX in RStudio”

  1. This is really interesting! But do you have to use Sweave as your editor? Is there integration with other latex editors, or is it the point that you write exclusively in Sweave? I’ve recently moved to Scrivener for the actual writing of the manuscript (and I LOVE it; it can output latex code but you compile with another program e.g. texmaker), but the ability to integrate my workflow with Sweave would be interesting. Thanks!

    • I actually use RStudio as my editor. Not particularly pretty, but it does the job. I’d be interested to move over to something like Overleaf (www.overleaf.com) for more collaborative projects, but I haven’t explored editors that much, to be honest. I’ll definitely have a look at Scrivener! Thanks!

  2. Have a look at knitr. I think it is replacing Sweave as the de facto standard. There is a lot of carry-over from Sweave, so there is little learning curve.
