Cat at a laptop: Scientific writing in R Markdown

How I felt when first trying to work in R Markdown.

Writing can be scary. Writing can be scary for everyone, not just us scientists. But whether or not we enjoy it, or think we’re good at it, it’s probably the best tool for communicating our findings. So removing as much pain from the process as possible is key.

That’s why I’ve started using R Markdown for writing.

If you’re like me, the worst part about writing scientific papers is formatting. I hate it. I hate getting bogged down in font size, citation style, line numbers–all that stuff. Not only does it take me forever to get just right, but it gives me so much room to mess up stuff that isn’t based in content. If I’m spending time fighting with format, that’s time away from thinking about stuff that really matters. And the idea of switching between different journals’ format style makes me want to cry. R Markdown made worrying about that a thing of the past.

But perhaps even better than the formatting convenience R Markdown provides, it makes collaboration so much easier. This is especially true when you pair RStudio with your GitHub account. All changes and any additional files you reference stay neatly connected, and any code output you want in your paper is generated right inside the document.

So, I’ve switched to writing in R Markdown. I’ve always worked in either Word or Google Docs, and I still will if I’m writing something that isn’t going to require a lot of coordinating; but for big projects, I’m moving on up. I’m ready to get productive.

When I first tried this new step in my workflow, I felt less than skilled. I have experience in RStudio and Markdown, but when learning anything new I feel like a cat trying to type. So here are some important tips I’ve collected from my first time through the process to hopefully make it easier.

  1. Define and fill the space R will reference when filling in format details. Three dashes (---) start and end this referential space (the YAML header), so write any parameter you want to fill followed by a colon and the content you want associated with it (title: Scientific Writing in R Markdown). When you create a new .rmd file, this is already started for you. Some parameters, like abstracts or authors, require a few extra characters. You’ll also need to specify which output you want (a specific journal, Word doc, HTML, PDF, etc.). If you want your citations in a specific journal’s style, you can look up different CSL (Citation Style Language) files to reference journals here. To knit to a journal’s layout, you’ll also need to install and load the rticles package, which lets you reference different journal format styles so your .rmd can knit to that format. After you finish the referential section, begin writing your paper below the closing three dashes.
  2. Know and use your syntax. Writing in R Markdown means you’re writing in plain text as opposed to rich text. Rich text is when you’re writing with all the formatting options you can see in the GUI: italics, fonts, colors. This is what you’re working with when you’re in Word. Plain text, which is just the text characters, is what you’ll use whenever you’re working in R. In order to get things like italics, numbered lists, or bold, you need to use certain syntax. The rich text formatting will appear after you knit. Once you get used to this, it’s a snap (here’s a handy guide to syntax). Plus, it’s one less thing to distract you when you’re trying to focus on content and ideas.
  3. Understand citations. Probably my single favorite thing about R Markdown is the ease with which I can include citations. It took me a minute to figure out the steps, but now that I have, I never want to type out a citation or use a Word plugin again. All you have to do is export whichever papers you could possibly want to cite from your reference manager (I use Mendeley) into a .bib file. Notice what your citation key is. Mendeley automatically formats your key as author and year (Lemon2018, which you cite as @Lemon2018). After you create this file, make sure the bibliography parameter at the top of your .rmd points to your new .bib file. If you know your citation key, all you need to do to add a parenthetical citation is put the key in square brackets: [@key]. For example, you might type: “A cat likes to be scratched behind its ears [@Lemon2018]”. When you knit, the full reference is added automatically to the reference list at the end of the document. If you want to include multiple citations in one parenthetical, simply separate the keys with a semicolon [@Lemon2017; @Lemon2014].
  4. Code! Don’t forget you’re writing in RStudio, so being able to code directly is a huge advantage of working in R Markdown. You can include any figures or tables you would in RStudio; just insert a new chunk. For tables, I recommend the kable function in the knitr package, which creates an attractive table from a data frame you already have. Just be sure to put “echo=FALSE” in the chunk header so only the outputs of your code appear in the knitted paper (“include=FALSE” would hide the outputs as well as the code). Here’s a video that shows side-by-side screens of coding/writing in Markdown and how the code will look after knitting.
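
Putting the four tips together, a minimal .rmd might look like the sketch below. It is only an illustration: the author name, the file names references.bib and journal-style.csl, and the chunk label are hypothetical placeholders, and the citation keys are the example keys from tip 3, which would need to exist in your own .bib file. The table uses R’s built-in mtcars data so the chunk has something to print.

````markdown
---
title: "Scientific Writing in R Markdown"
author: "A. Author"             # hypothetical author name
output: word_document           # or html_document, pdf_document
bibliography: references.bib    # hypothetical name for the .bib exported from your reference manager
csl: journal-style.csl          # hypothetical name for a downloaded CSL style file
---

# Introduction

Plain text with *italics* and **bold** only turns into rich text after knitting.

A cat likes to be scratched behind its ears [@Lemon2018].
Several sources can share one parenthetical [@Lemon2017; @Lemon2014].

```{r example-table, echo=FALSE}
# echo=FALSE hides this code in the knitted paper and keeps only its output
knitr::kable(head(mtcars), caption = "First rows of the built-in mtcars data")
```

# References
````

The reference list is placed at the end of the knitted document, which is why the last heading is left empty. To knit to a journal template from rticles instead, the output line would name one of that package’s formats (for example, rticles::elsevier_article) in place of word_document.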

For me, it was a steep learning curve to make the transition from rich text programs to R Markdown. In this post, I included some introductory tips for making that switch. There are lots of more advanced options with R Markdown, but for this post I wanted to focus on the challenges that I struggled with while writing my first paper in an .rmd file. This doesn’t include steps that I found intuitive, questions that are associated with learning to code in R, or tricks so advanced that I didn’t run into them. But I found the answers to most of my questions by scouring the web, so even if I didn’t answer something here, the answer is probably out there. Hopefully, the tips I’ve collected can help an intermediate R coder get the most out of their work with R Markdown.


Selecting a journal for submission, case study: Journal of Arid Environments is ‘hot’ for facilitation

We are currently working on a manuscript exploring the importance of microenvironmental conditions versus seed source for desert annual plants. Plant facilitation is a central tenet of the paper; however, we are more focussed on plant-seed/seedling interactions and less on plant-plant interactions. There are some confirmatory findings, i.e. that positive interactions are likely species-specific and that microenvironmental differences are important, but there are also some novel findings (teaser so you read the paper). An exceptional collaborator did this research as part of her honours thesis project, and it is absolutely publishable and technically correct. This study adopted a similar protocol to a recent contribution from the ecoblender team in Austral Ecology but with different species and a different purpose (and in fact, it predates this publication and was the pilot for the protocol). However, it is sometimes a challenge to publish a good idea demonstrated empirically when the study has mixed results, uses a single protocol (i.e. controlled conditions and not field), repeats tests from previously published similar research, or is limited in its capacity to explore the full range of variation or to use extensive sample sizes. I think this study is great, and it is so tempting to overinterpret because the idea is so attractive and I like it. Nonetheless, it is prudent to select an appropriate framing of the problem and matching journals for submission. In discussing the writing, we are also concurrently considering the outlet.
Here is the workflow we used in selecting the journal.
Generalized journal-choice workflow
1. Write first-second-third draft.
2. Edit, repeat, and begin discussions on relationship to larger literature landscape and ideas.
3. Make a list of top journals that fit the scope of study to test hypothesis.
4. Check each journal for contemporary papers on topic to ensure that we are correct in estimate of fit/niche.
5. Check lit cited of current ms to see if certain journals are cited more frequently. Add to list and explore/rule out journals that we may cite frequently for big, specific ideas that are likely beyond our reach.
6. Make a list of journals entitled ‘journal pipeline’ recognizing and reminding ourselves that rejection is part of the process and beneficial. Remind again 🙂
7. Select journal.
8. Check lit cited within manuscript for journal citation matching patterns. Rule of thumb: a good fit should have a few key papers cited from that journal. The rationale is NOT to ingratiate ourselves with editors, but to ensure that the current research offering matches previous/related research. Some editors do, however, check the lit cited of submissions and, if there is not a single citation to a previous publication in that journal, may consider rejecting offerings that are outside her/his primary research expertise.
Disclaimer: I am not a fan of ratcheting from higher-tier journals to lower. This wastes the time of all participants in the peer review process. Sometimes, however, this is a disservice to my junior collaborators, as we end up in lower-tier placements but waste less time. It is an efficiency-impact trade-off, but it is difficult to predict handling times by perceived impact of journal. I also strongly advocate for OA journals, and this also sometimes leads to non-ISI placements. I do recognize that we each have different career needs, but I am confident that strong work, regardless of journal, can be found online easily now and will capture interest.
Case study
Linking back to the preamble that got me thinking about our collective workflow, which always includes discussion within the team, we generated a short list of three journals to consider.
PLOSONE
Journal of Plant Ecology
Journal of Arid Environments
I have enjoyed many, many papers from all of these journals. A cursory search of the lit cited, online offerings, and discussion indicates that all three are viable with some caveats.
PLOSONE – High impact, great visibility, open access, and reviewed for technically correct designs. However, it is our collective opinion that this could be a stretch. There are many general plant facilitation papers, but we have a narrower scope. Whilst reviewing for technical correctness only and not impact, PLOSONE is nonetheless very reductionistic in its reviews of experiments, results, and analyses. I have had perfectly appropriate, well-designed experiments rejected, though never for impact reasons. There is no perfect experiment, but PLOSONE is handling a very, very high number of experiments and thus seeks substantive experimental designs.
Journal of Plant Ecology – A solid, mid-tier ecology journal. Interesting papers on facilitation. More emphasis on ecology than we necessarily tackle in this particular ms, and we are also focussed on plant-seed interactions. Seeds are the key life-stage in this study.
Journal of Arid Environments – I have read many papers over the years and always enjoyed them. Sometimes less ecological and lower impact relative to the previous two options.
How to decide – In summary, all three are certainly viable, although the probabilities associated with both acceptance rate and handling time are difficult to estimate. We decided to examine the following questions explicitly to move forward, and in doing so, found the perfect fit (and a surprise too).
1. In PLOSONE are there a few seed biology/ecology papers or ecotype/reciprocal common garden papers that are comparable in sample size and number of species tested?
2. In JoPE, are there any seed biology/seed ecotype papers or is it more plant focussed?
3. In J of Arid Envts, are there a few plant facilitation papers or seed ones?
No other reason than assuming it was less ecological and more broad.  There were many perfect papers related to our topic and design in the Journal of Arid Environments!
Sample connectance publications 
Journal of Arid Environments is a great fit for this paper. Concerns include the lowest IF of the three, its non-OA status, and handling times. We will keep you posted, but I thought it would be interesting to share how we approached submission of an interesting, well-executed experiment that is a mix of confirmatory and insightful findings.