Seven steps for turning camera trap photos into useful datasets: manual processing workflow

Seven magical steps into a dataframe.


By Nargol Ghazian

This is a summary of the protocol I have been using for the past few months to process all the amazing camera trap photos from the Mojave National Preserve and the Carrizo Plain National Monument. After reading a few papers on cam trap processing and exploring the camtrapR package, I concluded that the best approach is to create your datasets manually, as no program I found is able to automatically detect animals for you. This method also ensures that you obtain the best dataset for the statistical analysis you wish to perform. This seven-step guide should give you a quick rundown on how to get started with processing and maintaining a good workflow.

 

  1. Start by naming your columns as below. These headings are best suited for this project but yours can include more columns.
     1. Year: We are working on the 2017 images.
     2. Region: MNP is Mojave and CNM is Carrizo.
     3. Site: Mojave or Carrizo.
     4. Calendar date: The date the picture was taken in dd-mm-yr. I like to do the pictures belonging to the same date for each photo rep in order. If the date is wrong, don’t worry too much; just enter them all under the last date of the particular week you are working on.
     5. Microsite: Carrizo is shrub or open, 3 weeks for each. Mojave is Buckhorn or Larrea, also 3 weeks for each.
     6. Day: This goes in a 1,2,3..n order.
     7. Rep: This refers to the camera trap station. There are 10 stations per microsite. For example, you might have four pictures for the same day in station #2 of open, so you would write 2 four times: 2,2,2,2, each corresponding to an image.
     8. Photo Rep: A continuous number starting at 1 and continuing until you’ve finished processing all your pictures for the particular site.
     9. Animal: The animal in a hit photo. The most common are rat, rabbit, squirrel, fox, lizard and sometimes bird. There are times where you might have to guess. If it’s really hard then write ‘unidentifiable’. If it’s a false hit leave it blank.
     10. Animal.Capture: Binary; 0 = false hit, 1 = animal present.
     11. Time block: Look at the timestamp. Was the photo taken at night, at noon, in the afternoon or in the morning? If the timestamp is wrong, guess based on the darkness or lightness.
     12. Night.Day: Based on whether it’s dark or light.
     13. Actual time: Actual time written on the photo. Let’s hope it’s the correct timestamp :)
     14. Observations: If you see absolutely anything interesting in the photo, note it! Otherwise leave this column blank. I usually write ‘x2’ or ‘x3’ if there is more than one animal in the photo. Sometimes I write ‘eyes visible’ if it’s dark and you can only tell the presence of the animal from its shining eyes (rats usually).
     15. Temp of positive: This is noted on the picture in Fahrenheit or Celsius. Whatever unit is shown, note it in your meta-data. If you’ve been working with one unit, and a certain photo rep has a different one, just use a converter to convert to the units you’ve been using for that particular photo rep.
     16. Week: This is either 1, 2 or 3 since there are only 3 weeks per microsite. This column is super important because sometimes the datestamps are wrong but at least the week of sampling is correct.

*Note: The only time we actually fill in anything for columns 9 and 11-15 is when we have a “hit” and there is an actual animal.

  2. Since each row corresponds to a particular image, I like to start with at least 100,000 rows just so I don’t have to go back and paste more. You can delete the extra rows when you’re done processing. For year, region, site and microsite, write the correct label in the cell right below your column heading. Select the cell and copy it with control+C, then type the range of cells you want to fill, using a colon, in the box below the clipboard in Excel (the Name Box). For example, if you are in column B, it would be written as B2:B100000; press enter to select the range, then paste.
  3. You can use the above method for the date as well. I input the date as I go along. For example, if the photos start on 2017-05-22 I paste 100 thousand of that date. Obviously this is far too many, so once all the images for that particular date are finished, I control+C from the last cell and enter the new day for up to 100 thousand, and so on until I am finished for the week and can delete the extra cells.
  4. Days work exactly like calendar dates.
  5. When it comes to rep, select all the images belonging to a particular date in your particular station of cam trap files and click properties. This tells you how many images you have for that date, in that station. Use the method discussed above to paste as many cells as needed for that station. Keep a calculator handy because you will need to determine your ending cell by adding the number of images to the current cell you’re on.
  6. Photo rep is just a continuous number, as already mentioned in the chart. It’s basically the total of all the images processed for either Carrizo or Mojave. Use the ‘fill’ button in Excel for numbers in a series to do this step.
  7. If you come across a cool photo, make sure to copy and paste it in a different folder. This includes two animals together, animals fighting, clear photos of animals etc. (use your judgement). Do not cut and paste!
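
If you would rather script the fill-down steps above (constant columns, day, rep and photo rep) than do them with copy-paste and a calculator, here is a minimal Python sketch. It is not part of the protocol I actually used. The function names, the lower-case column names and the image counts in the example are all hypothetical. It assumes you have already counted the images per date and station from the folder properties, that day counts the dates in the order you list them, and that every row starts as a false hit (0) until you score it.

```python
import csv
import io

# Column order follows the list in step 1; names are lower-cased for scripting.
COLUMNS = ["year", "region", "site", "calendar_date", "microsite", "day", "rep",
           "photo_rep", "animal", "animal_capture", "time_block", "night_day",
           "actual_time", "observations", "temp_of_positive", "week"]


def build_rows(year, region, site, microsite, week, images_per_date,
               first_photo_rep=1):
    """Return one mostly blank row (a dict) per image.

    images_per_date maps a date string to {station number: image count}.
    """
    rows = []
    photo_rep = first_photo_rep
    # Assumption: 'day' simply counts the dates in the order they are given.
    for day, (date, counts) in enumerate(images_per_date.items(), start=1):
        for rep in sorted(counts):
            # Repeat the station number once per image, e.g. 2, 2, 2, 2.
            for _ in range(counts[rep]):
                row = dict.fromkeys(COLUMNS, "")  # hit-only columns stay blank
                row.update(year=year, region=region, site=site,
                           calendar_date=date, microsite=microsite, day=day,
                           rep=rep, photo_rep=photo_rep, animal_capture=0,
                           week=week)
                rows.append(row)
                photo_rep += 1  # continuous count over every image
    return rows


def fahrenheit_to_celsius(temp_f):
    """Convert a temperature stamp so one photo rep uses a single unit."""
    return (temp_f - 32) * 5 / 9


def to_csv_text(rows):
    """Write the rows as CSV text that can be opened in Excel."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=COLUMNS)
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# Made-up counts: 3 images at station 1 and 4 at station 2 on the first date.
counts = {"2017-05-22": {1: 3, 2: 4}, "2017-05-23": {2: 1}}
rows = build_rows(2017, "CNM", "Carrizo", "open", 1, counts)
print(to_csv_text(rows))
```

The printed text can be saved as a .csv file and opened in Excel, after which the hit columns are filled in by hand exactly as described above.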

 

 

Determining Regional Gradient

Ephedra regional gradient

My biggest project examines positive interactions along a regional gradient of continentality. The immediate question, though, is what is continentality? What abiotic and biotic variables change along this gradient in addition to plant-plant interactions? When we initially constructed this gradient, the two main considerations were aridity and cold stress. For plants in the deserts of California these are two very important considerations. After two years of conducting this experiment, I had very different climate profiles between the seasons. The most striking result was the difference in my plant phytometers between the two seasons. In the 2015-2016 growing season, the majority of my plants were present in the San Joaquin Desert. This desert is generally colder and wetter than the more continental Mojave Desert to the east. However, in 2016-2017 the San Joaquin Desert sites had few plants of my chosen phytometer relative to the abundant Mojave Desert sites. All my plants were present at all my sites at some point, suggesting that this gradient shifts with inter-annual variability. Let’s take a look at what some of that looks like:

San Joaquin Desert year

The 2015-2016 growing season (shown in black) had similar temperatures on average relative to the 2016-2017 growing season (in grey). The precipitation patterns, though, were different between years. Precipitation across these sites forms a parabola with distance from the ocean. Sites closest to the ocean and most inland have the highest precipitation, while sites in the middle receive the least. Overall the 2016-2017 season saw significantly more rainfall. Sites in the 2015-2016 season were extremely arid. For instance, Barstow and my site along Hwy40 saw as little as 30 mm of rainfall. The low abundance of my phytometer in the Mojave sites for that season is therefore likely because of low rainfall amounts. However, the San Joaquin sites had similar rainfall between years, so why were there so few plants in 2016-2017? I believe this has to do with the cold stress factor:

Precipitation in mm (black) and temperature in °C (red) during the 2015-2016 growing season for the San Joaquin Desert (top) and Mojave Desert (bottom).

Precipitation in mm (black) and temperature in °C (red) during the 2016-2017 growing season for the San Joaquin Desert (top) and Mojave Desert (bottom).

Mojave Desert year

Both of these seasons had similar precipitation and temperature patterns. The patterns were also similar between the two deserts, but the noticeable difference that I believe contributed to low plant abundance in the San Joaquin in 2016-2017 is temperature. The year before had warmer temperatures from January onward, which is a key period for plant development. In January 2017, following the majority of rainfall, there was a long freeze period of approximately 5 days, followed by another cold period with freezing temperatures at the end of February. This period was much warmer in 2016, which is why I believe cold stress negatively affected plants in the San Joaquin Desert in 2017. On the other hand, the Mojave saw significantly more precipitation and cooler temperatures, which together contributed to greater plant abundance.

Slicing through this climate data was interesting and challenging because of all the different ways to summarize variables. Using season means collapses a significant amount of the information and can make conclusions more difficult to derive. I am primed and excited now to dig into the plant responses!
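
To make the point about season means concrete, here is a small Python sketch. The two temperature series are invented for illustration and are not my station data. The function names and the 0 °C threshold are also assumptions. It compares a plain mean with the longest run of consecutive freezing days, a summary that keeps the kind of freeze event described above.

```python
def season_mean(values):
    """Average of all daily values; collapses the season to one number."""
    return sum(values) / len(values)


def longest_freeze_run(daily_min_temps_c, threshold_c=0.0):
    """Length of the longest run of consecutive days at or below threshold."""
    longest = 0
    current = 0
    for temp in daily_min_temps_c:
        if temp <= threshold_c:
            current += 1
            longest = max(longest, current)
        else:
            current = 0  # a warmer day breaks the run
    return longest


# Invented daily minimum temperatures (°C), chosen to share the same mean.
mild_season = [4, 4, 4, 4, 4, 4, 4, 4, 4, 4]
freeze_season = [9, 9, -2, -2, -2, -2, -2, 9, 9, 14]

print(season_mean(mild_season), longest_freeze_run(mild_season))
print(season_mean(freeze_season), longest_freeze_run(freeze_season))
```

Both invented series have a mean of 4 °C, but only the second contains a five-day freeze, so a seasonal mean alone would not separate them.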

 

Rules-of-thumb for collaboration

Rules-of-thumb for reuse of data and plots
1. If you use unpublished data from someone else, even if they are done with it, invite them to be a co-author.
2. If you use a published dataset, at the minimum contact authors, and depending on the purpose of the reuse, consider inviting them to become a co-author. Check licensing.
3. If you use plots initiated by another but in a significantly different way/for a novel purpose, invite them to be a co-author (within a reasonable timeframe).
4. If you reuse the experimental plots for the exact same purpose, offer the person that set it up ‘right of first refusal’ as first author (within a fair period of time such as 1-2 years, see next rule).
5. If adding the same data to an experiment, first authorship can shift to more recent researchers that do significant work because the purpose shifts from short to long-term ecology.  Prof Turkington (my PhD mentor) used this model for his Kluane plots.  He surveyed for many years and always invited primary researchers to be co-authors but not first.  They often declined after a few years.
6. Set a reasonable authorship embargo to give researchers that have graduated/changed focus of profession a generous chance to be first authors on papers.  This can vary from 8 months to a year or more depending on how critical it is to share the research publicly.  Development pressures, climate change, and extinctions wait for no one sadly.
Rules-of-thumb for collaborative writing
1. Write first draft.
2. Share this draft with all potential first authors so that they can see what they would be joining.
3. Offer co-authorship to everyone that appropriately contributed at this juncture and populate the authorship list as firmly as possible.
4. Potential co-authors are invited to refuse authorship but err on the side of generosity with invitations.
5. Do revisions in serial, not parallel. The story and flow get unduly challenging for everyone when track changes are layered.

How to use colour in manuscripts


I thought this was a very helpful guide on using colour in figures. There are a few rules, but one comment from the whole document stands out. “If colour serves a purpose, but something other than colour would do the job better, avoid using it”.

http://www.perceptualedge.com/articles/visual_business_intelligence/rules_for_using_color.pdf

Here are the simple rules:

1. If you want different objects of the same color in a table or graph to look
the same, make sure that the background—the color that surrounds
them—is consistent.

2. If you want objects in a table or graph to be easily seen, use a background
color that contrasts sufficiently with the object.

3. Use color only when needed to serve a particular communication goal.

4. Use different colors only when they correspond to differences of meaning
in the data.

5.  Use soft, natural colors to display most information and bright and/or dark
colors to highlight information that requires greater attention.

6.  When using color to encode a sequential range of quantitative values,
stick with a single hue (or a small set of closely related hues) and vary
intensity from pale colors for low values to increasingly darker and brighter
colors for high values.

7.  Non-data components of tables and graphs should be displayed just
visibly enough to perform their role, but no more so, for excessive salience
could cause them to distract attention from the data.

8.  To guarantee that most people who are colorblind can distinguish groups
of data that are color coded, avoid using a combination of red and green in
the same display.

9. Avoid using visual effects in graphs.


Filazzola 2016 updates

It’s 2016, whoa!

I enjoyed a solid break at the end of December but am looking forward to getting back into the grind. There are some crucial things I would like to tackle before my progress reports:

  1. RDM paper – I have finalized the statistics and need to get back to the writing. I am reading the literature to get a better idea of how to properly structure an animal distributions paper, something I’m not accustomed to.
  2. Field work – Gradient – I’m very excited for this experiment. It has been raining in California so I am hoping for some heavy germination. The rain patterns have somewhat followed my gradient pattern as well, with my more coastal sites seeing higher precipitation. About time the rain came.
  3. Field work – Exclosures – I am reconducting my exclosure experiment and building 60 new plots. Recreating the experiment I did two years ago in the worst drought, now in heavy rains, will be the perfect contrast. It shall be interesting to see which way the trends flip.
  4. Other stuff? – There is a lot, but the two primary things I need to get done are finalizing the Ecography paper before the due date and coming up with a workflow for the facilitation ecologists who want to write a synthesis paper.

Overall, I am pleased with the way my winter is shaping up. Cali is a lot colder than it has been in the past, but lots of rain makes for a happy ecologist! I am also going to try to shoot a video short during my time in the field. Stay tuned for all the gory details!

Nomenclature for pollinator and animal stills and videos

Given that an increasing proportion of Big Data globally is not in alphanumeric formats, nomenclature with appropriate semantic tags is critical to enable retrieval and use of videos and pictures. In ecology, animal cameras are a popular tool to ‘trap’ or capture vertebrates in the field. In our lab, we use these extensively for this purpose but also use small, HD cameras to record pollinators and flying invertebrates (description of methodology: From birds to bees: applying video observation techniques to invertebrate pollinators). However, as we explore sharing these data online via YouTube or as data packages with new data journals, we have come to realize that even with appropriate, well-articulated meta-data, file names are important and must be descriptive. Zooniverse as a platform has many animal-cam projects now and you are typically logged in to collaborate, so perhaps it is less critical there. In all other domains, however, it is likely useful to have file names that support rapid assessment of ecological context (maybe even for the kitten clips too).


We are discussing this format now but would love additional input on nomenclature – i.e. from systematic nomenclature domains.

year-site-microsite-rep.extension seems like a reasonable starting point. There are many other options though.
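
As a rough sketch of how such names could be built and read back in Python, assuming the year-site-microsite-rep.extension pattern above: the function names and the example values (CNM, shrub, station 2) are only illustrative, and we have not settled on this format. Because the hyphen is the separator, the sketch rejects site or microsite labels that contain one.

```python
from pathlib import PurePath


def build_name(year, site, microsite, rep, extension):
    """Assemble a file name of the form year-site-microsite-rep.extension."""
    for part in (site, microsite):
        if "-" in part:
            # A hyphen inside a label would make the name impossible to split.
            raise ValueError(f"label must not contain a hyphen: {part!r}")
    return f"{year}-{site}-{microsite}-{rep}.{extension.lstrip('.')}"


def parse_name(filename):
    """Split a file name built by build_name back into its four tags."""
    path = PurePath(filename)
    year, site, microsite, rep = path.stem.split("-")
    return {"year": int(year), "site": site, "microsite": microsite,
            "rep": int(rep), "extension": path.suffix.lstrip(".")}


name = build_name(2017, "CNM", "shrub", 2, "jpg")
print(name)
print(parse_name(name))
```

Keeping the separator out of the labels is what makes the name machine-readable; a different separator would work just as well as long as it is used consistently.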


Open Science Framework

Yesterday we had a Skype training session with Courtney Soderberg from the Open Science Framework (www.osf.io). We have been debating incorporating this system into our collaboration model to help organize research projects within the lab, as well as distribute materials and manage group work within a classroom setting. We are going to pilot this platform with the Experimental Design course this semester and from there determine whether or not it would be useful for our collaborative research projects, as we already use several platforms for this (Google Drive, Dropbox, figshare, slideshare, etc).

Here are some of the main take-away points from the meeting:

PROS:

  • All components of research project available in one place
  • Advanced versioning allows you to easily track what has been added by whom, as well as jump back to older versions of the same document(s).
  • Can link to other platforms that sync with each other (Google, Dropbox, figshare, Github, and several more)
  • Can edit text-based files within the program
  • Privacy settings by component (i.e. can allow certain collaborators to see some sections such as data, but not others)
  • Can make certain components public, and share widely
  • Useful for group work within a classroom setting to monitor progress and individual student contributions/workload

CONS:

  • Learning curve of using a new platform
  • Component organization may be too complicated (especially for students to learn quickly and make use of)
  • Cannot edit non-text files within, must edit externally and re-upload (word doc, powerpoint, etc)
  • Many features not entirely novel–we have seen this before on other platforms so may not be worth the effort to switch over.

Screenshot of the main project page within the OSF website.

 

Prelims & progress report experience

I think I have come a long way through the graduate program, which has become particularly evident during my progress reports. My project has evolved from MSc to PhD and there have been numerous modifications or alterations to the experimental design. Only two of my originally proposed experiments from my first year in the program have remained consistent. I don’t think this is a surprise, and it was actually mentioned during my PhD preliminary examination. That stage was a good learning experience and, besides developing my thesis, also taught me a few things about progress reports:

  • Committee: It is important to pick your committee early. It requires minimal effort and it is very helpful to bring other professors into the loop as soon as possible. Don’t pick easy professors either. You are doing yourself a disservice by trying to squeeze through your degree with minimal effort. Of course this doesn’t mean pick a professor that has any prejudice towards you.
  • Supplementary files: Some topics are hard to digest even if you explain them well, and a short presentation may not allow enough time to delve into specifics. This is why prepping some supplemental figures or slides that better explain complex topics is good preparation for the inevitable questions on them. Some popular choices are species ranges, theories or any supplemental work not present in your thesis.
  • Know the literature: Typically you will have read many papers throughout your graduate career. It is extremely difficult though to remember the author, year, or even title name especially when put under pressure. Instead, make a table or reference guide that quickly sums up most of the key papers you have read. Don’t get caught explaining “some paper, by that person”.
  • Demonstrate potential: Often your committee is less interested in the specifics of what soil mixture to use, but rather the overall purpose. Is it novel? Is it solving a question? Your project is very transient and will change often. What is more important is that your project is novel, pushing research forward and broad enough to cover your degree (MSc or PhD).
  • Be cool: You are the expert on your topic. You have likely read the most about it, conducted the most experiments and know all the little details. Do not get overwhelmed. Often your committee doesn’t know the answer to the question, but instead is trying to push you to see what answer you come up with. If you develop a rational explanation, most likely you’re doing fine.

Conceptual classroom series

Conferences 2016

2015 was a good year for conferences. The Ecoblender lab attended the California Native Plant Society, Society for Ecological Restoration, the San Joaquin Valley Ecological Conference and the Centennial for ESA. Lots of great ideas from these conferences are being reflected in upcoming projects within our lab. What about 2016 though?

Unfortunately, both CNPS and SER operate biennially, which means we will have to wait until 2017 to attend again. The San Joaquin conference is a really fun one-day conference in March that the Ecoblender lab will likely attend again. ESA, in Fort Lauderdale, is on the maybe list. Some other potential options?


Why @the_zooniverse and #openscience sharing platforms are also tools for social restoration

Person in Room with 500 Monitors --- Image by © Louie Psihoyos/CORBIS


Zooniverse is primarily a research platform with a big citizen science component. Reasons to consider putting image data you collected on it (even smaller datasets or image/video libraries) include the following:
1. It raises your public reputation and gets your name out there. This is really important as you sometimes get surprise funding and collaborations too from places you might not expect.
2. It makes the project and the funding from agencies look great, and it is really another form of publication/dissemination. It is also another financial return on all those cameras and all your time. REMEMBER – many of these tools also have an interaction component to engage people and not just flip through pics.
3. We can engage with a wider audience and get local stakeholders to see the photos, see the odd animal, and begin to care more and appreciate their system from a very different perspective. Seeing those pics at ground level, often at night, is a total window into a world that ranchers, the oil staff, the managers, and even many biologists do not get. I see this as an important social-restoration tool in that it promotes looking at and paying attention to plants and animals within the system. I feel the exact same way about the pollinator videos.
Not only do natural systems need restoration but people need their social perspective on the ecology of systems restored.
4. This system (and flickr) provides free cloud storage of ALL your photo data for us, i.e. this is our cloud backup.
5. We can show students what we do and teach them or volunteers to help.  Of course, it is best if each PI does it, but the students or techs can also look at other things for us at some point. Windiness, total number of plants, total number of false hits for us, etc… or do other things we have not imagined yet – filter some, change contrast, etc.  that is the point of these tools.
In summary, I see all these outreach efforts (flickr, youtube, blog, twitter, etc) as a really important process of ecology, restoration, communication, and doing some social good.
These ‘snapshots’ are also an opportunity to illuminate natural history for individuals to see the system you care about.