In January 2019, Google updated its terms of service and essentially removed free access to Google Maps in R. This means that you’ll need to purchase access to the relevant APIs (API is compu-speak for Application Programming Interface) through your Google account to use these features in R.
So do you need it?
If you’re interested in mapping in R, you basically need it. There are some mapping packages that let you avoid Google products altogether (Leaflet is a great example). But given the glorious customization and overwhelming ubiquity of ggmap, this API key is essential for reproducible science in ecology-related fields. When I first encountered the problem, troubleshooting was a nightmare: everyone used ggmap, and even those who didn’t still used Google Maps as a source for their base maps. Not. Fun.
Luckily, it’s relatively cheap at $2/month for the first 100,000 static maps in each month (dynamic maps, street maps, embed advance, and dynamic street maps cost more, but we aren’t likely using these tools in our work). Even luckier, there’s a $200 credit/month for the first year of use!
It’s a bit confusing to navigate the Google Cloud Console if you’re trying to figure it out solo (and scary considering you’re paying for something), but the actual steps are easy and quick. There are two main steps to the process: 1) Get an API key and 2) Show R your API key. There are just a few mini-steps in between.
Select a Project. If you don’t have one, create one. It won’t matter later.
Enter your billing information.
Copy your API key. Consider pasting it into a .txt file on your machine for safe keeping.
Show R your API key:
In your R console, enter this code:
library(ggmap)
register_google(key = "YOUR_API_KEY")
Run this code for every new session you need to map in, and you’re ready to go!
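If you would rather not paste the key into every script, one option is to keep it in an environment variable and read it at the start of each session. This is only a sketch: the variable name GOOGLE_MAPS_API_KEY and the helper get_maps_key() are my own placeholders, not something ggmap requires.

```r
# Hypothetical helper: read the key from an environment variable.
# GOOGLE_MAPS_API_KEY is an assumed name; use whatever name you like.
get_maps_key <- function(var = "GOOGLE_MAPS_API_KEY") {
  key <- Sys.getenv(var, unset = "")
  if (!nzchar(key)) {
    stop("No API key found in environment variable ", var)
  }
  key
}

# Register only when ggmap is installed and a key is actually set,
# so the script does not fail on a machine without either.
if (requireNamespace("ggmap", quietly = TRUE) &&
    nzchar(Sys.getenv("GOOGLE_MAPS_API_KEY"))) {
  ggmap::register_google(key = get_maps_key())
}
```

This also keeps the key out of any script you push to a shared repository.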
a daRk tuRn
R is popular among scientists (especially ecology/conservation scientists) because of its power. But it’s basically essential for scientists because it’s free. In a field where funding is scarce and costs are high, R has been a blessing for open science and has seriously moved the discipline forward. But part of what makes R powerful is that it’s not entirely autonomous; it relies in large part on monolithic companies like Google to up the ante. It may not be a very expensive fee, but it is yet another barrier for researchers and open science. Hopefully someday we can return to a free, open-access Google Maps. After all, open science benefits scientists, the general public, and corporations–even Google.
I had a great time last week representing the Lortie lab and discussing work from my research practicum with Rhonda Lenton, president and vice-chancellor of York University. Aside from gaining wonderful feedback from her, which will definitely assist me in my future endeavours, I was thrilled to present the importance and impact of biology research to her as well as to the many other influential attendees. I am excited and eager to experience the opportunities that lie ahead.
For the bird-cactus double mutualism project, we had planned on observing two study species: Cylindropuntia acanthocarpa (Buckhorn Cholla) and Opuntia basilaris var. basilaris (Beavertail Cactus). We also needed 3 size classes (small, medium, and large) in which to bin the cacti. This would impact our sample size and equipment list. That being said, the best-laid plans of mice and men (and grad students) often go awry. I’d only briefly visited our study site the summer before I’d officially started at York, so we knew we would need to revisit Sunset Cove to do some preliminary exploration before getting into the trenches and collecting end-game data. On getting to the site, it was immediately apparent that we would need to examine our plants more closely; there was nearly no beavertail in sight. So we altered the protocol and added Cylindropuntia echinocarpa (Silver Cholla) into the mix. The goal? Determine the location, size, size variability, and health of each species. We want a tall-ish species (so pollinators and frugivores will be interested) with plenty of variability in size, enough individuals to manipulate conditions, and healthy enough that we can expect some flowers and fruit later on. And, for fun, we took a quick look at shrubs to see if they’re associated with cacti in any respect (I don’t go into that here, but the data is available on GitHub).
Where are the cacti?
Let’s make a quick map and take a look at the individual cacti sampled. For C. acanthocarpa, we were easily able to sample every 5 meters along 5 transects that were spaced 5 meters apart (n=105). C. echinocarpa, however, was more sparsely distributed. So, after doing our first two transects 5 meters apart, we realized we needed to increase the distance between transects to 10 meters. We also weren’t able to get a cactus sample at every 5 meters, so we sampled 9 transects in total (n=98). The least common species, however, was Opuntia basilaris, which was so rare that transects were ineffective, so we instead unsystematically searched the entire site, only to find a paltry number of individuals (n=26).
Based on the proposed protocol, we need 150 individuals of each study species to replicate each combination of variables 10 times. Ideally, the individuals manipulated in the flowering season will not be resampled in the fruiting season, as our manipulation of the flowers in April may impact the number of fruits in August. This means that C. acanthocarpa is a solid study species option. C. echinocarpa is certainly possible, but not as dominant as its cousin, and O. basilaris is out of the question.
How big are the cacti?
We’ve seen the distribution of cacti, but the size of the cacti is what’s really important for this study. We need to know if the sizes are variable enough to split into 3 size classes (small, medium, and large). We also need a general idea of their height to consider if pollinating and frugivorous birds will engage with the flowers and fruits of the cacti at all. The three species did indeed have significantly different mean heights (Kruskal-Wallis test, p < 0.0001, df = 52, χ² = 151.52), with means of 1.04 m, 0.55 m, and 0.17 m for Cylindropuntia acanthocarpa, Cylindropuntia echinocarpa, and Opuntia basilaris, respectively.
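For anyone who wants to run the same kind of comparison, here is a rough sketch of a Kruskal-Wallis test in base R. The heights below are simulated stand-ins, not our field data; only the sample sizes (105, 98, 26) and the approximate means come from this post, and the standard deviations are made up for illustration.

```r
set.seed(1)

# Simulated heights in meters; the real measurements are not reproduced here.
cacti <- data.frame(
  species = factor(rep(c("Buckhorn Cholla", "Silver Cholla", "Beavertail"),
                       times = c(105, 98, 26))),
  height_m = c(rnorm(105, mean = 1.04, sd = 0.25),
               rnorm(98,  mean = 0.55, sd = 0.15),
               rnorm(26,  mean = 0.17, sd = 0.05))
)

# Rank-based test for a difference in height among the three species.
kw <- kruskal.test(height_m ~ species, data = cacti)
kw

# Mean height per species, for reporting alongside the test.
aggregate(height_m ~ species, data = cacti, FUN = mean)
```

With the grouping variable on the right-hand side of the formula, the degrees of freedom reported by kruskal.test() are the number of groups minus one.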
How should we bin the cacti?
One important variable of our project is size classes within a species: small, medium, and large. Because height is what may influence pollination and frugivory, we will use the “z-axis” that we measured as the factor for size. Each size class must contain enough individuals for replication. We need to decide how to bin the size classes: either we can use natural breaks present in the data, or we can create equally-sized bins for the study species. Let’s examine each species’ size distribution and base our decisions about size class breaks on that.
None of the species have distributions with natural breaks (see density plots), and, especially for our two Cylindropuntia species, we can see that there are even distances between quartiles (see boxplots). For these reasons, I propose an equal-size binning method to determine size class.
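As a sketch of what equal-size binning could look like in R, the snippet below splits heights into three classes at the 1/3 and 2/3 quantiles, so each class holds roughly the same number of individuals. That reading of “equal-size” is an assumption on my part, and the heights are simulated rather than taken from our data sheet.

```r
set.seed(42)

# Simulated z-axis heights in cm for one species (placeholder values).
z_cm <- round(rnorm(105, mean = 104, sd = 30))

# Break points at the minimum, the two tertiles, and the maximum.
breaks <- quantile(z_cm, probs = c(0, 1/3, 2/3, 1))

# Assign each cactus to a size class; include.lowest keeps the shortest plant.
size_class <- cut(z_cm, breaks = breaks,
                  labels = c("small", "medium", "large"),
                  include.lowest = TRUE)

breaks
table(size_class)
```

If equal-width bins were preferred instead, passing breaks = 3 to cut() would divide the height range into three intervals of the same width.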
Size-classes of cacti
But what exactly are the equal size classes for each species?
Buckhorn Cholla: 86cm – 152cm
Silver Cholla: 46cm – 72cm
Beavertail: 16cm – 22cm
We can see that Buckhorn Cholla (C. acanthocarpa) has the widest size classes, followed by Silver Cholla, and then Beavertail. Having wide classes may translate more clearly to birds, and therefore be a suitable metric to see if bird visitation is influenced by cactus size.
Health of cacti
Another important factor to consider when exploring potential study species is their overall health. After all, are these individuals even capable of flowering and fruiting? To measure health, we created a health index based on the Wind Wolves Bakersfield Cactus Report, which classifies each individual’s health on a discrete scale of 1-5 (1 being the least healthy, and 5 being the healthiest). We considered overall paddle/branch death, as well as scarification and rot.
We can see that the Cylindropuntia species are healthier than their Opuntia counterparts. The question is, will an unhealthy population still flower/fruit as much as a healthy population? Perhaps, but this is not the question of my project.
Who is America’s Next Cactus Superstar?
Considering its abundance, size, and health, Opuntia basilaris is not a realistic contender as a study species. It is likely to be overlooked by birds, may not bloom or fruit due to poor health, and is in small supply. Therefore I must remove it from the running. Both Cylindropuntia species are healthy. Silver Cholla, however, is still less dominant than Buckhorn Cholla, is smaller overall, and doesn’t have the width of size classes that Buckhorn Cholla does. While these traits do not mean the Silver Cholla could not be a viable study species, I propose that focusing more on Buckhorn Cholla by deepening the methods of observation (i.e., joy sampling: stationary versus mobile count data, and increased hours of focal observation) will be more beneficial to answering my study questions than a comparative study between cacti species would.
The reproductive ecology of cacti is not well-studied. A small side project of mine is to determine the pollinator guild of buckhorn cholla at Sunset Cove, Mojave Desert, and with which plant species, if any, it shares pollinators. The genera Opuntia and Cylindropuntia are known to be insect-pollinated, but I am curious which of the more than 659 species of bees in the Mojave Desert are pollinators.
As visitation does not necessarily lead to pollination, I removed the pollen loads from 22 bee visitors I caught during in situ observation periods. I also removed stigmas from the cholla to quantify heterospecific pollen deposition, i.e., evidence of pollinator sharing. Pollen ID is not an easy task, so I have developed a workflow to make it more streamlined.
Prep a reference collection:
Create a reference collection by removing pollen from the anthers of several flowers of every species blooming in the area. Store in ethanol.
Mount and stain the pollen with fuchsin jelly.
Image each species of pollen grain at 3 magnifications. Measure the length and width of about 10 grains per species. I calibrated Lumenera’s Infinity Analyze software using a stage micrometer to make this really quick.
Make a reference document to consult. I use a Word doc where every page is a species. Add in the photos at several magnifications, the mean size, and any notes.
To go through the stigma or bee pollen load samples, I use my Canon EOS 60D DSLR with a 60mm macro lens pointed confocally into a light microscope at 100x. I use the remote shooting utility from Canon to control the camera with my computer and display the view on a second monitor.
I designate each coverslip on the slide as a zone and do 8 transects through each, counting the grains. Each line in my spreadsheet is a transect, each column is a species. I use 5 columns for buckhorn so I never have to count very high.
I don’t count damaged grains, or grains in air bubbles.
Each slide gets its own folder. I take photos of each heterospecific grain with the file name as the zone + transect + species, which is simple using the photo utility. Knowing where the grain is on the slide and what its surroundings are will be helpful if you need to find it again.
The species can be tentative for now so don’t get too bogged down.
Take photos of unknowns when first encountered and assign them morphospecies ID. I put these in a separate folder as a reference.
Some species are easy to ID. Quite a few are not. The more grains you see the easier it is to spot the differences.
To help ID, we can take a page from entomologists. Sort the photos by their tentative IDs, putting each species in a folder so they are visible all at once (do a bulk rename to append the folder name first). It is difficult to compare grains unless they are side by side, which isn’t realistic with one microscope.
Sort until each folder contains identical grains, then assign them a species from the reference collection. Or assign them to a species group for species that are virtually identical (likely Asteraceae!). Assign any remaining to morphospecies. Update the datasheet with the corrections.
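Once the datasheet is corrected, the transect tallies from the counting steps can be summarised per zone. Below is a minimal sketch in base R. The column names (buckhorn_1 to buckhorn_5, species_A, morpho_1) and all counts are invented to mirror the layout I described, with one row per transect and five columns for buckhorn.

```r
# Hypothetical tally sheet: one row per transect, one column per species.
counts <- data.frame(
  zone       = c(1, 1, 2, 2),
  transect   = c(1, 2, 1, 2),
  buckhorn_1 = c(20, 18, 25, 20),
  buckhorn_2 = c(12,  0,  7,  3),
  buckhorn_3 = c( 0,  0,  4,  0),
  buckhorn_4 = c( 0,  0,  0,  0),
  buckhorn_5 = c( 0,  0,  0,  0),
  species_A  = c( 2,  0,  1,  0),
  morpho_1   = c( 0,  1,  0,  0)
)

# Collapse the five buckhorn columns into one conspecific count.
buckhorn_cols <- grep("^buckhorn_", names(counts), value = TRUE)
counts$buckhorn <- rowSums(counts[, buckhorn_cols, drop = FALSE])

# Everything that is not an ID column or buckhorn is heterospecific pollen.
hetero_cols <- setdiff(names(counts),
                       c("zone", "transect", buckhorn_cols, "buckhorn"))
counts$heterospecific <- rowSums(counts[, hetero_cols, drop = FALSE])

# Totals per zone and the proportion of heterospecific grains.
by_zone <- aggregate(cbind(buckhorn, heterospecific) ~ zone,
                     data = counts, FUN = sum)
by_zone$prop_hetero <- by_zone$heterospecific /
  (by_zone$buckhorn + by_zone$heterospecific)
by_zone
```

Keeping the raw per-transect rows and deriving the totals in code means a later ID correction only has to be made in one place.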
After a 2-month growing and censusing period, followed by harvesting, drying, and a biomass census, I have concluded my 200-pot competition series.
During this period, I obtained a photometer to measure light levels and did two light censuses, both for the overall pot and below the canopy. I am hoping that these light measures will provide quantifiable insight into the effect light has on growth. I hypothesize that plants receiving ambient light will yield greater mean biomass per species, while those in shade conditions (to mimic shrub presence) will have a greater mean height due to leggy growth.
I wanted to quantify the growth of my plants through several metrics, and therefore chose to obtain both height and leaf measurements for each species from each pot. In order to acquire these measurements, I implemented a new censusing technique for my second and final census. In this census I counted the number of individuals of each species per pot. Following this, I took the tallest individual of each species and recorded its height along with its number of leaves. This way, following the harvest and mechanical oven drying period, I would be able to compare the biomass of the plant with its height and leaf count. This would allow me to evaluate plant growth using two separate dimensions: plant height along with number of leaves versus plant biomass.
After using a mechanical drying oven set to 62 degrees Fahrenheit for 48 hours, I used a precision scale to obtain the biomass of each plant.
The experiment planning, seed counting, pot filling, plant censusing, harvesting, and biomass analysis processing were extensive processes. I am extraordinarily grateful to Dr. Christopher Lortie, Dr. Jacob Lucero, master’s graduate Jenna Braun, research practicum student Anuja, and Economics and Finance student Denis Karasik for their time, efforts, and immense assistance with running this experiment.
Statistical analyses for all of the results are still in progress, and I am eager to see the conclusion my experiment comes to.
I want my final paper to be useful for and applicable to restoration ecology. This led me to ask what data I should collect for my second census. My germination rates are up, and all four species are present, so would relying on number of individuals and biomass of each species per pot be enough data? I decided that since I am using light as a limiting factor I must include height in my data; the plants may have somewhat similar biomass, but if that is due to leggy growth in the shaded pots, then it will be important to note that although biomass was similar, resource allocation was not equal. Are great numbers of leggy, weak, and nutrient-deficient plants with few leaves better for ecosystems than fewer, shorter, but thicker and leafier plants? I measured the number of individuals per species per pot, along with the height and number of leaves of the tallest member of each species per pot. I have yet to analyze these numbers, but did notice trends when doing the census!
Side note: I conducted a germination experiment in the greenhouse prior to using these seeds, and have let them grow out. My Phacelia tanacetifolia is growing a beautiful flower!
How I felt when first trying to work in R Markdown.
Writing can be scary. Writing can be scary for everyone, not just us scientists. But whether or not we enjoy it, or think we’re good at it, it’s probably the best tool for communicating our findings. So removing as much pain from the process as possible is key.
That’s why I’ve started using R Markdown for writing.
If you’re like me, the worst part about writing scientific papers is formatting. I hate it. I hate getting bogged down in font size, citation style, line numbers–all that stuff. Not only does it take me forever to get just right, but it gives me so much room to mess up stuff that isn’t based in content. If I’m spending time fighting with format, that’s time away from thinking about stuff that really matters. And the idea of switching between different journals’ format style makes me want to cry. R Markdown made worrying about that a thing of the past.
But perhaps even better than the formatting convenience R Markdown provides, it makes collaboration so much easier. This is especially true when you pair RStudio with your GitHub account. All changes and any additional files referenced are neatly connected, and any code printout included in your paper is already sitting in your paper.
So, I’ve switched to writing in R Markdown. I’ve always worked in either Word or Google Docs, and I still will if I’m writing something that isn’t going to require a lot of coordinating; but for big projects, I’m moving on up. I’m ready to get productive.
When I first tried this new step in my workflow, I felt less than skilled. I have experience in RStudio and Markdown, but when learning anything new I feel like a cat trying to type. So here are some important tips I’ve collected from my first time through the process to hopefully make it easier.
Define and fill the space R will reference when filling in format details. Three dashes (---) start and end this referential space (the YAML header), so write any parameter you want to fill followed by a colon and the content you want associated with it (title: Scientific Writing in R Markdown). When you create a new .rmd file, this is already started for you. Some parameters, like abstracts or authors, require a few extra characters. You’ll also need to include which output you want (a specific journal, Word doc, html, pdf, etc.). If you want to format in a specific journal style, you can look up different CSL (Citation Style Language) codes to reference journals here. You’ll also need to install and load the rticles package. The rticles package allows you to reference different journal format styles so your .rmd can knit to that format style. After you finish the referential section, begin writing your paper below the closing three dashes.
Know and use your syntax. Writing in R Markdown means you’re writing in plain text as opposed to rich text. Rich text is when you’re writing but you have all these different formatting options–italics, font, colors–all the formatting options you can see in the GUI. This is what you’re working with when you’re in Word. Plain text, which is just the text characters, is what you’ll use whenever you’re working in R. In order to get things like italics, or numbered lists, or bold, you need to use certain syntax. The rich text formatting will appear after you knit. Once you get used to this, it’s a snap (here’s a handy guide to syntax). Plus, it’s one less thing to distract you when you’re trying to focus on content and ideas.
Understand citations. Probably my single favorite thing about R Markdown is the ease with which I can include citations. It took me a minute to figure out the steps, but once I did, I never wanted to type out a citation or use a Word plugin again. All you have to do is export whichever papers you could possibly want to cite from your reference manager (I use Mendeley) into a .bib file. Notice what your citation key is. Mendeley automatically formats your key to be author and year (@Lemon2018). After you create the .bib file, make sure the bibliography field in your .rmd points to it. If you know your citation key, all you need to do to add a parenthetical citation is include [@author]. For example, you might type: “A cat likes to be scratched behind its ears [@Lemon2018]”. This will automatically populate the entire citation at the end of the document. If you want to include multiple citations in one parenthetical, simply separate the keys with a semicolon [@Lemon2017;@Lemon2014].
Code! Don’t forget you’re writing in RStudio, so being able to directly code is a huge advantage of working in R Markdown. You can include any figures or tables you would in RStudio; just insert a new chunk. For tables, I recommend the kable function in the knitr package, which creates an attractive table from a dataframe you already have. Just be sure to include echo=FALSE in the options at the beginning of your chunk so readers see only the outputs of your code and not the code itself (include=FALSE would hide the outputs as well). Here’s a video that shows side-by-side screens of coding/writing in Markdown and how the code will look after knitting.
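To pull those tips together, here is a small sketch that writes a skeleton .rmd file from R, with a header, a little syntax, a citation, and a table chunk. Everything specific in it is a placeholder I made up for illustration: the file names references.bib and ecology.csl, the author field, and the chunk label. The citation key @Lemon2018 is just the example key from above.

```r
# Three backticks, built with strrep() so this example can sit inside a fence.
fence <- strrep("`", 3)

rmd_lines <- c(
  "---",
  "title: \"Scientific Writing in R Markdown\"",
  "author: \"Your Name\"",
  "output: word_document",
  "bibliography: references.bib",  # placeholder .bib exported from Mendeley
  "csl: ecology.csl",              # placeholder journal citation style
  "---",
  "",
  "# Introduction",
  "",
  "A cat likes to be *scratched* behind its **ears** [@Lemon2018].",
  "",
  "1. First numbered item",
  "2. Second numbered item",
  "",
  paste0(fence, "{r demo-table, echo=FALSE}"),  # echo=FALSE hides the code only
  "knitr::kable(head(mtcars[, 1:3]))",
  fence
)

# Write the skeleton to a temporary folder so nothing in your project changes.
out_file <- file.path(tempdir(), "paper_skeleton.Rmd")
writeLines(rmd_lines, out_file)
```

Knitting this file would still require the .bib and .csl files to exist next to it, so treat it as a starting template rather than a finished document.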
For me, it was a steep learning curve to make the transition from rich text programs to R Markdown. In this post, I included some introductory tips for making the switch. There are lots of more advanced options with R Markdown, but here I wanted to focus on the challenges that I struggled with while writing my first paper in an .rmd file. This doesn’t include steps that I found intuitive, or questions that are associated with learning to code in R, or tricks that are so advanced that I didn’t run into them. But I found the answers to most of my questions by scouring the web, so even if I didn’t answer something here, the answer is probably out there. Hopefully, the tips I devised can help an intermediate R coder get the most out of their work with R Markdown.
Salvia columbariae, Phacelia tanacetifolia, and Plantago insularis are key phytometers (plants that indicate ecosystem conditions) in the San Joaquin Desert of California. As the highly invasive exotic Bromus madritensis colonizes this non-native environment, it lacks the environmental suppressors and competitors it faces in its native habitat. This leads native Californian desert ecosystems to shift to a new model where native plants are excluded due to competitive disadvantages. Decreases in native biodiversity are directly correlated with the health of an ecosystem, its ecosystem services, its resiliency to climate change, and its resources; for these reasons, identifying methods of restoration ecology is crucial.
Using my 3-factor greenhouse competition trials (ambient light vs. shaded conditions; low vs. high B. madritensis density; native seeds at 6 levels of density: 0, 3, 6, 9, 15, or 30 natives), I aim to identify what density of native species must be present in a pot with a surface area of 153 cm² to outcompete an exotic one. I have previously run an experiment to identify optimal density in pots of the same surface area using each of the native species in monoculture, implementing the same light versus shade conditions with a total of 365 replicates. I will assess whether I am able to compare these differences in optimal monoculture density to a polyculture mix with invader presence. If my data reveal an optimal density using these methods, I hope to further my research and apply my findings to population ecology by estimating the population metrics required to apply this to ecosystems for large-scale restoration, and to contribute it towards field work.
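For readers who like to see a design laid out, the three factors above can be crossed in a couple of lines of R. This is just a sketch of the treatment combinations; the column names are labels I chose here, and it says nothing about how pots were actually allocated to each combination.

```r
# Cross the three factors described above into one row per combination.
design <- expand.grid(
  light          = c("ambient", "shade"),
  brome_density  = c("low", "high"),
  native_density = c(0, 3, 6, 9, 15, 30),
  stringsAsFactors = FALSE
)

# 2 light levels x 2 brome densities x 6 native densities = 24 combinations.
nrow(design)
head(design)
```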
My experiment currently contains 200 pots, 100 of which are shaded by a bamboo structure I suspended. Germination has begun, yet it is still difficult to differentiate among species this early on. As predicted, the shaded individuals have demonstrated leggy growth as they reach towards the light source, yet leaf production seems to be, if anything, higher in the shaded pots than in the ones experiencing ambient light. It appears that the shaded pots have a higher germination and growth rate (measured by number of individuals and number of leaves per pot). Is it possible that the shade-preferring B. madritensis is facilitating growth through positive density dependence? Am I witnessing an Allee effect in the form of environmental conditioning? Or is the answer as simple as light levels in the shaded conditions being sufficient for the natives as well as B. madritensis? Using the metrics of germination of each species per pot, leaves per species, and, finally, above-ground biomass at the end of my experiment, I will continually assess success across the different factors and levels I have designed and implemented, and I hope to reach a successful conclusion.