Cat at a laptop: Scientific writing in R Markdown

How I felt when first trying to work in R Markdown.

Writing can be scary. It can be scary for everyone, not just us scientists. But whether or not we enjoy it, or think we’re good at it, it’s probably the best tool for communicating our findings. So removing as much pain from the process as possible is key.

That’s why I’ve started using R Markdown for writing.

If you’re like me, the worst part about writing scientific papers is formatting. I hate it. I hate getting bogged down in font size, citation style, line numbers–all that stuff. Not only does it take me forever to get just right, but it gives me so much room to mess up stuff that isn’t based in content. If I’m spending time fighting with format, that’s time away from thinking about stuff that really matters. And the idea of switching between different journals’ format style makes me want to cry. R Markdown made worrying about that a thing of the past.

But perhaps even better than the formatting convenience, R Markdown makes collaboration so much easier. This is especially true when you pair RStudio with your GitHub account. All changes and any additional referenced files stay neatly connected, and any code output included in your paper is already sitting in your paper.

So, I’ve switched to writing in R Markdown. I’ve always worked in either Word or Google Docs, and I still will if I’m writing something that isn’t going to require a lot of coordinating; but for big projects, I’m moving on up. I’m ready to get productive.

When I first tried this new step in my workflow, I felt less than skilled. I have experience in RStudio and Markdown, but when learning anything new I feel like a cat trying to type. So here are some important tips I’ve collected from my first time through the process to hopefully make it easier.

  1. Define and fill the space R will reference when filling in format details. Three dashes (---) start and end this referential space (the YAML header), so write any parameter you want to fill, followed by a colon and the content you want associated with it (title: Scientific Writing in R Markdown). When you create a new .rmd file, this is already started for you. Some parameters, like abstracts or authors, require a few extra characters. You’ll also need to specify which output you want (a specific journal, Word doc, html, pdf, etc.). If you want to cite in a specific journal’s style, you can look up the csl (Citation Style Language) file for that journal. To knit to a journal’s format, you’ll also need to install and load the rticles package, which provides journal templates for your .rmd. After you finish the referential section, begin writing your paper below the closing three dashes.
  2. Know and use your syntax. Writing in R Markdown means you’re writing in plain text as opposed to rich text. Rich text is what you’re working with in Word: all the formatting options you can see in the GUI, like italics, fonts, and colors. Plain text, which is just the text characters, is what you’ll use whenever you’re working in R. In order to get things like italics, numbered lists, or bold, you need to use certain syntax. The rich text formatting will appear after you knit. Once you get used to this, it’s a snap (here’s a handy guide to syntax). Plus, it’s one less thing to distract you when you’re trying to focus on content and ideas.
  3. Understand citations. Probably my single favorite thing about R Markdown is the ease with which I can include citations. It took me a minute to figure out the steps, but now that I have, I never want to type out a citation or use a Word plugin again. All you have to do is export whichever papers you could possibly want to cite from your reference manager (I use Mendeley) into a .bib file. Notice what your citation key is. Mendeley automatically formats your key as author and year (Lemon2018). After you create the file, make sure the bibliography field in your .rmd header points to your new .bib file. If you know your citation key, all you need to do to add a parenthetical citation is type [@key]. For example, you might type: “A cat likes to be scratched behind its ears [@Lemon2018]”. This will automatically add the full reference at the end of the document. If you want to include multiple citations in one parenthetical, simply separate the keys with a semicolon [@Lemon2017; @Lemon2014].
  4. Code! Don’t forget you’re writing in RStudio, so being able to code directly is a huge advantage of working in R Markdown. You can include any figures or tables you would make in RStudio; just insert a new chunk. For tables, I recommend the kable function in the knitr package, which creates an attractive table from a data frame you already have. Just be sure to include “echo=FALSE” in your chunk options so you only see the outputs of your code (“include=FALSE” would hide the outputs too). Here’s a video that shows side-by-side screens of coding/writing in Markdown and how the code will look after knitting.
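
Putting the four tips together, a minimal .rmd file might look something like the sketch below. The file names references.bib and journal-style.csl, the author name, the chunk label, and the little data frame are placeholders made up for illustration (both files would need to exist next to the .rmd for it to knit); the citation keys are the ones from the citation tip above.

````markdown
---
title: "Scientific Writing in R Markdown"
author: "Your Name"
abstract: |
  One or two sentences summarizing the paper.
output: word_document
bibliography: references.bib
csl: journal-style.csl
---

# Introduction

Plain text turns into rich text after knitting: *italics* and **bold**.

1. First item in a numbered list
2. Second item in a numbered list

A cat likes to be scratched behind its ears [@Lemon2018].
Several keys can share one parenthetical [@Lemon2017; @Lemon2014].

```{r example-table, echo=FALSE}
# echo=FALSE hides this code but keeps the table it prints
example_df <- data.frame(
  group = c("A", "B"),   # made-up values, for layout only
  n     = c(10, 12)
)
knitr::kable(example_df, caption = "A made-up example table")
```

# References
````

Knitting this should give a Word document with the formatted text, the table without its code, and the cited references listed under the final heading. Swapping word_document for a template from rticles is how you would target a journal format instead.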

For me, it was a steep learning curve to make the transition from rich text programs to R Markdown. In this post, I included some introductory tips for making the switch. There are lots of more advanced options with R Markdown, but for this post I wanted to focus on the challenges that I struggled with while writing my first paper in an .rmd file. This doesn’t include steps that I found intuitive, or questions that are associated with learning to code in R, or tricks that are so advanced that I didn’t run into them. But I found the answers to most of my questions by scouring the web, so even if I didn’t answer something here, the answer is probably out there. Hopefully, the tips I devised can help an intermediate R coder get the most out of their work with R Markdown.


Invasive species versus Californian Natives Competition Trial Update – The Purpose

Salvia columbariae, Phacelia tanacetifolia, and Plantago insularis are key phytometers (plants that indicate ecosystem conditions) in the San Joaquin Desert of California. As the highly invasive exotic Bromus madritensis colonizes this non-native environment, it lacks the environmental suppressors and competitors it faces in its native habitat. This leads native Californian desert ecosystems to shift to a new model where native plants are excluded due to competitive disadvantages. Decreases in native biodiversity are directly correlated with the health of an ecosystem, its ecosystem services, its resiliency to climate change, and its resources. For these reasons, identifying methods of restoration ecology is crucial.

Using my three-factor greenhouse competition trials (ambient light vs shaded conditions; low vs high B. madritensis density; native seeds at 6 levels of density: 0, 3, 6, 9, 15, or 30 natives), I aim to identify what density of native species must be present in a pot with a surface area of 153 cm² to outcompete an exotic one. I have previously run an experiment to identify optimal density in pots of the same surface area using each of the native species in monoculture, implementing the same light versus shade conditions with a total of 365 replicates. I will assess whether I can compare the optimal density in monoculture to that of a polyculture mix with the invader present. If my data reveal an optimal density using these methods, I hope to further my research and apply my findings to population ecology by estimating the population metrics required for large-scale ecosystem restoration, and to contribute them to field work.

My experiment currently contains 200 pots, 100 of which are shaded by a bamboo structure I suspended. Germination has begun, yet it is still difficult to differentiate among species this early on. As predicted, the shaded individuals have demonstrated leggy growth as they reach towards the light source, yet leaf production seems to be possibly higher in the shaded pots than in the ones experiencing ambient light. It appears that the shaded pots have a higher germination and growth rate (measured by number of individuals and number of leaves per pot). Is it possible that the shade-preferring B. madritensis is facilitating growth through positive density dependence? Am I witnessing an Allee effect in the form of environmental conditioning? Or is the answer as simple as light levels in the shaded conditions being sufficient for the natives as well as B. madritensis? Using germination per species per pot, leaves per species, and finally above-ground biomass at the end of my experiment, I will continually assess success across the factors and levels I have designed and implemented, and I hope to reach a successful conclusion.

Impact of Red Brome and Drought on 3 Native Californian Plants

I am currently running an experiment to observe how an invasion of red brome impacts the growth and success of 3 native Californian plants (Plantago insularis, Phacelia tanacetifolia, Salvia columbariae) across 5 different watering regimes. These watering regimes simulate conditions from extreme drought to very wet years. The experiment utilizes a total of 300 pots with 10 replicates per treatment; 100 pots are being used for each species, with 50 of those pots containing brome and 50 lacking brome.
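
As a quick sanity check on the pot counts, the design can be laid out in R with expand.grid. This is only a sketch: the watering-regime labels are placeholders (the five regimes are not named here), and the column names are made up for illustration.

```r
# Enumerate every pot in the design: 3 species x 5 watering regimes
# x brome present/absent x 10 replicates.
design <- expand.grid(
  species   = c("Plantago insularis", "Phacelia tanacetifolia", "Salvia columbariae"),
  watering  = paste0("regime_", 1:5),   # placeholder labels for the 5 regimes
  brome     = c("present", "absent"),
  replicate = 1:10,
  stringsAsFactors = FALSE
)

total_pots        <- nrow(design)                         # all pots
pots_per_species  <- table(design$species)                # pots for each species
pots_by_brome     <- table(design$species, design$brome)  # with/without brome

print(total_pots)
print(pots_per_species)
print(pots_by_brome)
```

The counts should match the numbers in the paragraph: 300 pots in total, 100 per species, and 50 with and 50 without brome for each species. The same data frame could double as a blank data sheet for recording the measurements.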

As the experiment progresses, 4 measurements will be obtained:

  1. Germination Success
  2. Establishment by 5 weeks
  3. Final Census
  4. Total Biomass production at the conclusion of the experiment


Californian native versus invasive species trial

My native versus exotic competition experiment is all set up in the greenhouse, so I’m just waiting on germination now. I have planted additive densities of 0, 3, 6, 9, 15, and 25 natives with brome at high (10 seeds) and low (5 seeds) densities, 10 reps per treatment, at ambient light versus shaded conditions for a total of 200 pots. I hung up a bamboo structure to provide shade and imitate shrub presence for half of the pots, and hung it in a way that makes it easy to suspend for pot censusing. Here are photos of what it all looks like.

Does enemy release help explain Echinocystis lobata invasion in Poland?

The enemy release hypothesis (ERH) of plant invasion asserts that translocation to novel communities allows exotic plants to escape population controls imposed by natural enemies in native communities. The ERH predicts that 1) invader densities are greater in non-native communities than native communities, and 2) natural enemies impose strong negative effects on invader abundance in the native range but not in the non-native range. These predictions are straightforward, but testing them involves conducting parallel vegetation surveys and enemy exclusion experiments in both the native and non-native ranges of invaders. Due to logistic challenges, very few studies have done this.

As part of an international team of collaborators from the USA, Canada, and Poland, we are explicitly testing the predictions above with respect to the prickly cucumber, Echinocystis lobata (fruit pictured below). This climbing vine is native to North America but invasive in Poland, where it can dominate local communities and extirpate native competitors.

So far, our surveys indicate that E. lobata is much more abundant in Poland than anywhere examined in N. America, and that E. lobata plants are larger and more fecund in Poland than in N. America. It also seems that physical defenses aimed at protecting seeds from generalist granivores are present at much higher frequencies in Poland than in N. America, which is very cool! We look forward to results from enemy exclusion experiments.

We’ll keep you posted!

Jacob L.
Nick F.
Mario Z.