Collaborative and open science writing

As we collectively move to platforms that support better reproducibility and open science, a few challenges persist, reference management among them. LaTeX with BibTeX is great, but at times team members are less interested in reproducibility and more in simply sharing libraries. We recently faced this challenge while collaboratively writing a very long white paper: each of us worked in a different reference-management ecosystem despite using GitHub for version control and collaboration on the writing.

Here are some resources to support a decision; the research is anecdotal, but it points in similar directions.

A list with satisfaction scores.

RefWorks, EasyBib, EndNote, and Mendeley look promising.

A good contrast here, including discussion of Zotero.

A GradHacker review of the offerings here.

Writing collaboratively in Google Docs? Use Paperpile.

Writing in RStudio? Use Zotero.

Summary

There are great lists of pros and cons out there. Based on them, my key criteria are (a) cloud storage, (b) easy use within RStudio, and (c) the ability to share a library with collaborators for a given paper.

The three main competitors seem to be RefWorks, Mendeley, and Zotero.
Now I need to give them a head-to-head test shortly.
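
For criteria (b) and (c), one workable pattern is a shared BibTeX file exported from Zotero (e.g. via the Better BibTeX plugin) that everyone reads from R. Below is a minimal sketch using the RefManageR package; the file name refs.bib and the entry key are hypothetical placeholders, not a tested pipeline.

```r
# Minimal sketch: read a shared BibTeX library (e.g. exported from Zotero
# with the Better BibTeX plugin) and cite it from R.
# "refs.bib" and the entry key below are hypothetical placeholders.
library(RefManageR)  # install.packages("RefManageR")

bib <- ReadBib("refs.bib")   # load the shared library
length(bib)                  # how many entries made it over
Citet(bib, "smith2016")      # in-text citation by key
PrintBibliography(bib)       # formatted reference list
```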

 

Seven steps for turning camera trap photos into useful datasets: manual processing workflow

Seven magical steps into a dataframe.


By Nargol Ghazian

This is a summary of the protocol I have been using for the past few months to process all the amazing camera trap photos from the Mojave National Preserve and the Carrizo Plain National Monument. After reading a few papers on camera trap processing and exploring the camtrapR package, I concluded that the best approach is to create your datasets manually, since no program can reliably detect the animals for you. This method also ensures that you obtain the best dataset for the statistical analysis you wish to perform. This seven-step guide should give you a quick rundown on how to get started with processing and maintaining a good workflow.

 

  1. Start by naming your columns as below. These headings are best suited to this project, but yours can include more columns.
1. Year: We are working on the 2017 images.
2. Region: MNP is Mojave and CNM is Carrizo.
3. Site: Mojave or Carrizo.
4. Calendar date: The date the picture was taken, in dd-mm-yy. I like to do the pictures belonging to the same date for each photo rep in order. If the date is wrong, don't worry too much; just enter them all under the last date of the particular week you are working on.
5. Microsite: Carrizo is shrub or open, with 3 weeks for each. Mojave is buckhorn or Larrea, also with 3 weeks for each.
6. Day: This goes in a 1, 2, 3, ..., n order.
7. Rep: The camera trap station. There are 10 stations per microsite. For example, you might have four pictures for the same day in station #2 of open, so you would write 2 four times (2, 2, 2, 2), each corresponding to an image.
8. Photo rep: A continuous number starting at 1 and running until you have finished processing all the pictures for the particular site.
9. Animal: The animal in a hit photo. The most common are rat, rabbit, squirrel, fox, lizard, and sometimes bird. There are times when you might have to guess; if it's really hard, write 'unidentifiable'. If it's a false hit, leave it blank.
10. Animal.capture: Binary; 0 = false hit, 1 = animal present.
11. Time block: Look at the timestamp. Was the photo taken at night, noon, afternoon, or in the morning? If the timestamp is wrong, guess based on the darkness or lightness.
12. Night.day: Based on whether it's dark or light.
13. Actual time: The actual time written on the photo. Let's hope it's the correct timestamp!
14. Observations: If you see absolutely anything interesting in the photo, note it! Otherwise leave this column blank. I usually write 'x2' or 'x3' if there is more than one animal in the photo. Sometimes I write 'eyes visible' if it's dark and you can only tell the animal is present from its shining eyes (usually rats).
15. Temp of positive: Noted on the picture in Fahrenheit or Celsius. Whatever unit is shown, note it in your metadata. If you have been working with one unit and a certain photo rep shows a different one, just use a converter to convert to the units you have been using for that photo rep.
16. Week: Either 1, 2, or 3, since there are only 3 weeks per microsite. This column is super important because sometimes the datestamps are wrong, but at least the week of sampling is correct.

*Note: The only time we actually fill in anything for columns 9 and 11-15 is when we have a “hit” and there is an actual animal.

  2. Since each row corresponds to a particular image, I liked to start with at least 100,000 rows just so I don't have to go back and paste more later. You can delete the extra rows when you're done processing. For year, region, site, and microsite, write the correct label in the cell right below the column heading. Select the cell, copy it with Ctrl+C, then specify the range you want by typing it with a colon in the Name Box (the area below the clipboard in Excel). For example, if you are in column B, it would be written as B2:B100000; then press Enter and paste. (See the R sketch after this list for a scripted alternative.)
  3. You can use the same method for the date. I input the dates as I go along. For example, if the photos start on 2017-05-22, I paste 100,000 cells of that date. Obviously this is far too many, so once all the images for that particular date are finished, I copy from the last cell and enter the new date for up to 100,000 cells, and so on until the week is finished and the extra cells can be deleted.
  4. Days work exactly like calendar dates.
  5. When it comes to rep, select all the images belonging to a particular date in your particular station of camera trap files and click Properties. This tells you how many images you have for that date, in that station. Use the method above to paste as many cells as needed for that station. Keep a calculator handy, because you will need to determine your ending cell by adding the number of images to the current cell.
  6. Photo rep is just a continuous number, as mentioned in the chart: the running total of all images processed for either Carrizo or Mojave. Use Excel's Fill > Series to do this step.
  7. If you come across a cool photo, make sure to copy and paste it into a different folder. This includes two animals together, animals fighting, clear photos of animals, etc. (use your judgement). Do not cut and paste!
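
If you would rather script the skeleton than paste cells in Excel, a minimal R sketch of the same idea follows; every label and count here is a hypothetical placeholder, so swap in your own sites and stations.

```r
# Build the spreadsheet skeleton in R instead of pasting cells in Excel.
# All labels and counts below are hypothetical placeholders.
n <- 100  # number of images for one date at one station

cams <- data.frame(
  year           = rep(2017, n),
  region         = rep("MNP", n),
  site           = rep("Mojave", n),
  calendar.date  = rep("22-05-17", n),
  microsite      = rep("larrea", n),
  day            = rep(1, n),
  rep            = rep(2, n),   # camera trap station number
  photo.rep      = seq_len(n),  # continuous photo counter
  animal.capture = 0            # default to false hit; overwrite on hits
)

write.csv(cams, "camtrap-2017-mnp.csv", row.names = FALSE)
```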

 

 

How to download data from @HOBODataLoggers using a Mac #openscience #opendata @apple

Connecting most peripherals to a Mac is typically a snap. However, about two years ago, updates to OSX introduced challenges in connecting Onset HOBO micro data loggers to initialize them and then download stored data. I decided to finally work through these challenges instead of switching machines. This may seem trivial, but it was a bit finicky, so here are the steps quickly summarized.

Configurations: any version of OSX 10.8 and higher likely needs these steps, particularly if your Onset product uses a serial port for communication. The steps listed below were developed on a late-2012 iMac running OSX 10.13.1 (High Sierra).

Steps to connect

  1. Install Java Version 7 or higher (I installed Version 8 Build 151), because this is a dependency in a later step.
  2. Reboot.
  3. You will need a USB-to-serial adapter for older Onset stations. The one provided with the loggers is manufactured by Tripp Lite and titled Keyspan High Speed USB to Serial Adapter (USA-19HS). This step was a nightmare: you must install the correct driver and reboot. I tried earlier versions of the adapter and failed. Here is the manufacturer support page with driver downloads. Continue forward with the steps, but if you have everything plugged in and ready to go and cannot see the logger in the HOBOware app, return to the web, search for the beta version of the driver, and install it. I had to do this, and I used driver version 4.0b4 (beta), not the version 3 from the site. Here is a good explanation (you need to go to the Tripp Lite overview page and search for the first three characters of the unit, i.e. 'USA'). If you do any updates to OSX, you also have to reinstall the driver.
  4. Install the HOBOware Free version (I prefer it over the Pro version provided with the loggers because it is updated more frequently and seems to run more smoothly). I used version 3.7.13 here.
  5. Connect the adapter cable to the station, then to the USB-to-serial adapter and the Mac, and launch HOBOware. You are now ready to either initialize loggers or download data from loggers you have already deployed.
  6. Initialize via Device > Launch. This is pretty self-explanatory.
  7. Downloading was a bit less obvious to me. Use Device > Readout and allow the readout to complete (a progress bar pops up).
  8. Once complete, the data are read into the current instance of HOBOware as an unsaved project. I recommend plotting right away (see below).
  9. You can now use File > Save Datafile to locally save a file ending in .dtf. This is not ideal for me. Save Project saves the data, thumbnail, details, and plots for future reading by the app. Again, I do not prefer this because I work in R. See the final step if you want to work outside the Onset ecosystem.
  10. If you prefer to save as .csv, first select the plot option after readout, because this writes the workspace to .csv. Then go to File > Export Data Table, and you have a nice clean dataframe for additional wrangling. Before you move on to the next logger, ensure you close the project, because it can retain a former instance of plots and data table even after you do a readout from a new unit.

Now you are ready to explore some microclimate data for your sites!
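
Once the .csv is exported, a first look in R can be as simple as the sketch below. The file name and column positions are assumptions on my part; HOBOware export headers vary with the logger model and export settings, so adjust accordingly.

```r
# Minimal sketch: read a HOBOware .csv export and plot temperature over time.
# "hobo-export.csv", the skipped title row, and the column names are
# assumptions; HOBOware headers vary by logger model and export settings.
hobo <- read.csv("hobo-export.csv", skip = 1, stringsAsFactors = FALSE)
names(hobo)[1:3] <- c("obs", "datetime", "temp.c")

# Parse the timestamp column (format assumed from a typical export).
hobo$datetime <- as.POSIXct(hobo$datetime, format = "%m/%d/%y %I:%M:%S %p")

plot(temp.c ~ datetime, data = hobo, type = "l",
     xlab = "Time", ylab = "Temperature (C)")
```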

Field sample processing

This fall I have been processing the insect and pollen samples that I collected this spring during my fieldwork in the Mojave Desert. The insects were primarily caught using pan traps and were transferred into 90% isopropyl alcohol for preservation. With the help of our lab's two undergraduate practicum students, Shobika and Shima, we are gradually getting them nicely organized into collection boxes.

I pinned many, many bees and wasps when I worked on a pollinator census during my undergrad in West Hamilton. These are the steps I use for processing insect samples:

  • Remove insects from alcohol.
  • Give the bees a rinse in water to fluff out their body hairs (this step works variably well; we may need to give some of the larger specimens a spa day in the future).
  • Gently dry with a paper towel; this causes the wings to uncurl. Wing venation is very important for identification.
  • Under a dissecting microscope, pin from top to bottom through the upper right-hand side of the insect's thorax into a Styrofoam block. You want the insect to be completely horizontal.
  • Gently uncurl the legs from the body and unfurl 1 antenna.
  • Affix an insect identification label underneath the insect with the text readable from the left side of the insect. These labels should have date and location of collection, unique identifier and the name of the collector.
  • Place into foam lined box.
  • Repeat!
  • Very small insects get pointed rather than pinned: the right side of the thorax is glued to a triangle cut out of cardstock, and the triangle's base is pinned instead.

I have also been mounting pollen samples whenever I can squeeze in the time. I collected stigmas in the field and have been storing them in small ethanol-filled tubes.

Process:

  • Let the slide warmer heat up.
  • Using a transfer pipette, remove the pollen-ethanol suspension and transfer it drop by drop onto the warm slide, letting the alcohol evaporate and ensuring it does not run over the edges.
  • Place the stigma onto the slide.
  • Rub the inside of the centrifuge tube that stored the sample with a bit of fuchsin jelly and place it onto the slide as well. Cut out 2 more small cubes of jelly and place them over the drop locations. Cover with a slide cover and leave on the warmer to melt the jelly. Label the slide.

For a different experiment that I have not yet processed, I will put the tubes into a centrifuge, spin them down, and pipette out the pellet to save time and labour. Quite a few tubes from the current experiment are extremely small, and I am concerned about their ability to hold up under centrifugal force. I need a less labour-intensive process for making slides in my upcoming field season. I can think of two main options right now: use sturdy tubes that I can centrifuge, or collect into small tubes without adding ethanol and mount each evening while at the research station. The latter would cut down the need to let the alcohol evaporate.

Cam trapping: ten simple steps to process all those amazing photos

Workflow

  1. Read some cam trap papers.
  2. Check the camtrapR package and see what it does, to decide if it suits your specific needs.
  3. Open the folder for each site, each day, each rep. Do a folder 'Get Info' to count the total number of pics. These are your 'reps' within reps, i.e. literally the total number of snapshots (or use the command line, or the R sketch after this list, to get directory info for all your photo data).
  4. I would honestly just paste 0's all the way down, because many will be 'false hits'.
  5. Then open them all up and scroll through.
  6. Every time there is a positive hit, overwrite the 0 in the 'animal.capture' vector and record what it is in the 'animal' vector.
  7. Also copy all positive-hit photos into a separate folder for additional analyses. Use a folder structure or ID system that keeps track of the place and time each photo was from. For instance, have a folder titled positive-hits for each site, day, location, and rep, or aggregate everything into a single positive-hits folder but use a mechanism to ensure we know where and when each photo was taken. Do not cut and paste; copy. This is a backup mechanism for additional analyses and sharing data.
  8. We also want to know when animals are most active, or not; hence, check the timestamps and paste down that column too. Actual time is ideal, but morning, afternoon, or night is absolutely adequate and more rapid if we cannot automate the scraping with an R package.
  9. If the timestamp is incorrect, do a light-dark assessment to code night or day; this is a very rapid process.
  10. Record observations if there is more than one animal or if the same animal was recaptured from a previous instance. Record anything of ecological note to calibrate the quantitative data and link photo-capture processing to data mapping and translation. The goal is to accurately map photos onto numbers that represent the dynamics of the system under study.
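
As a scripted alternative to 'Get Info' in step 3, something like the following R sketch can count the snapshots per folder and pre-fill the zeros from step 4; the directory layout shown is a hypothetical example.

```r
# Count snapshots per rep folder and pre-fill animal.capture with 0's.
# The layout photos/<site>/<day>/<rep>/*.JPG is a hypothetical example.
reps   <- list.dirs("photos", recursive = TRUE)
counts <- sapply(reps, function(d)
  length(list.files(d, pattern = "\\.jpg$", ignore.case = TRUE)))
counts <- counts[counts > 0]  # keep only folders that contain photos

# One row per snapshot, defaulting to a false hit (step 4).
photos <- data.frame(
  folder         = rep(names(counts), counts),
  photo.rep      = sequence(counts),
  animal.capture = 0,
  animal         = NA_character_
)
head(photos)
```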

Outcomes

The goal is to have both an evidence folder of positive hits and a dataframe that can then be wrangled to estimate the relative efficacy of sampling, the frequencies of different animals, spatiotemporal dynamics, and differences between structured treatments in the implementation of trapping.

Meta-data for manual processing spreadsheet workflow

Attribute refers to the column headers.

attribute: description
year: we have many years for Carrizo (evil laugh), so it is good to list the year here in a vector
region: MNP for Mojave, CNM for Carrizo
site: if you have more than one site, put the name of the site
calendar date: dd-mm-yy
microsite: larrea, buckhorn, ephedra, or open, depending on the region
day: the census day; 1, 2, 3, up to however many days sampled
rep: if more than one rep per day
photo rep: just cut and paste up to the total number of photos each cam took on one day; could be 10 to 10,000
animal.capture: binary; 0 = false hit, 1 = animal present
animal: list the animal as 'none' if a false hit, or the animal name if one was there
timeblock: as with animal telemetry work, morning, afternoon, and night is usually sufficient (6 am to noon, noon to 6 pm, then night)
night.day: a backup if timestamps are incorrect; just code by night and day using light and darkness in the photos. very quick
filename.ID.positive.hits: optional, depending on your filing system; copy all positive hits to a separate folder and, somehow, keep track of positive locations and times for subsequent analyses
observations: record anything ecological that pops, such as another animal in the photo OR the same animal repeatedly recaptured
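
Once a sheet following this metadata is filled in, a quick sanity check in R might look like the sketch below; the file name is a hypothetical placeholder.

```r
# Minimal sanity check of a processed sheet against the metadata above.
# "camtrap-2017.csv" is a hypothetical file name.
d <- read.csv("camtrap-2017.csv", stringsAsFactors = FALSE)

str(d)                   # do the columns match the metadata?
table(d$animal.capture)  # should contain only 0's and 1's
table(d$microsite)       # larrea, buckhorn, ephedra, open

# Positive hits broken down by animal and time of day.
with(d[d$animal.capture == 1, ], table(animal, timeblock))
```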

CSEE 2017 Highlights

This year the ecoblender lab attended CSEE 2017. The conference was great and covered four days of talks, workshops, and networking events. I attended a free workshop that taught some basics of mapping spatial data and the different R packages to use. There was also a wide range of talks, most of which seemed interdisciplinary, including discussions of uncertainty in ecology, estimating the value of natural resources, and developing models of habitat selection. Here are some of the highlights I took away from the conference:

Modelling:

There was discussion of the use and power of mechanistic vs. phenomenological models. This topic is discussed often in ecology (see some of that discourse here), but the two can be defined as:

mechanistic: includes a process (physical, biological, chemical, etc.) that can be predicted and described.

phenomenological: a correlative model that describes trends in the associated data but not the mechanism linking them.

The discussion mostly described the relationship between phenomenological and mechanistic models not as a binary but rather as a gradient of models that describe varying amounts of a particular system. However, it did touch on models such as GARP and MaxEnt that are often used for habitat selection or SDMs but neglect the mechanism driving species occurrence. Two techniques I would like to learn more about are line-search MCMC and HMSC, a newly developed method for fitting joint species distribution models.

Camera traps:

There was also a morning session that described the benefits of and tools for using camera traps. These sessions are always great, as they give a chance to see some wildlife without disturbance. Topics focused on deer overabundance harming caribou populations, how wildlife bridges do not increase predation through the prey-trap hypothesis, and techniques for using wildlife cameras or drones. One particularly interesting talk used call-backs played when a camera was triggered to see how animals respond to noises such as humans talking or a mating call.

One of the more useful things I took away from the session is how to estimate animal abundance and movement when the animals in your camera traps are unmarked. One technique used Bayesian modelling and was found to be equivalent to genetic surveys of animal fur for estimating animal abundance. This is in contrast to the more common spatial capture-recapture (SCR) methods that either mark individuals or supplement camera trap data with other surveys. I also discovered the eMammal project at the Smithsonian, an open-access project for the management and storage of camera trap data.

Ecology and climate change:

Climate change, as always, was a big topic at these conferences. There was a good meta-analysis out of the Vellend lab showing that artificial warming of plant communities does not result in significant species loss. However, there was evidence that changes in precipitation do significantly impact plant communities. The results are very preliminary, but I look forward to seeing more about them in the future. I also liked a talk, now a paper in Nature, that models networks in the context of climate change. The punchline is that species composition in communities depends on dispersal, and high dispersal rates can maintain network structure even though members of the community may change.

I presented results from our upcoming paper modelling positive interactions in desert ecosystems.

Overall, I learned a lot from the CSEE 2017 conference and thought it was a healthy balance of size and events. Victoria was also a great city and made hosting the conference very easy. Next year it will be in the GTA, and I plan on connecting with the organizing committee to potentially host an R workshop at the beginning of the conference. Until then!

Ecoblender hosting a workshop: An Introduction to R and Generalized Linear Models

Full details are provided here:
https://afilazzola.github.io//YorkU.GLM.2017-04-28/

General Information

The purpose of this workshop is to provide tools for a new/novice analyst to more effectively and efficiently analyse their data in R. This hands-on workshop will introduce the basic concepts of R and use of generalized linear models in R to describe patterns. Participants will be encouraged to help one another and to apply what they have learned to their own problems.
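
For a taste of the material, here is a minimal example of the kind of model the workshop covers: a Poisson GLM fit in R to simulated count data (the data are simulated purely for illustration).

```r
# A taste of the workshop material: a Poisson GLM on simulated count data.
# The data below are simulated purely for illustration.
set.seed(42)
visits <- data.frame(
  treatment = rep(c("shrub", "open"), each = 50),
  count     = c(rpois(50, lambda = 6), rpois(50, lambda = 3))
)

m <- glm(count ~ treatment, family = poisson, data = visits)
summary(m)  # coefficients are on the log scale for a Poisson GLM
```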

Who: The course is aimed at R beginners and novice to intermediate analysts. You do not need to have any previous knowledge of the tools that will be presented at the workshop.

Where: 88 Pond Road, York University. Room 2114 DB (TEL). Google maps

Requirements: Participants should bring a laptop with a Mac, Linux, or Windows operating system (not a tablet, Chromebook, etc.) with administrative privileges. If you want to work along during the tutorial, you must have RStudio installed on your own computer. However, you are still welcome to attend without it, because all examples will be presented via a projector in the classroom. Coffee and cookies will be provided for free.