A two-day work session at UCSB was extremely informative, covering a wide range of topics in programming for ecology. The course was divided into four components: the Bash command line, an introduction to RStudio, GitHub, and data manipulation in RStudio. I especially liked that the course took a more abstract approach without going through statistics; rather, it focused on data manipulation for the day-to-day ecologist. One of the more unexpected things I learned was R Markdown and how to develop websites using it with GitHub. All of these tools can significantly help with collaboration. Although most of what I had been doing in R is not wrong, it may be difficult for a collaborator to pick up my code and start using it. I think this course really helps me bridge that gap, and it is something I am going to push forward on. Gone are the days of sharing Word documents with six versions of the same figure.
All of the course materials can be found on the website here! I would recommend that anyone even slightly interested in the topics above go through them. Below are some highlights that I believe deserve a little extra attention.
One thing that was only lightly covered in the short three-hour time frame we had is the power of the Bash shell. The Bash shell (the command line) is Neo from The Matrix: all the rules are off and the boundaries are endless. It has happened to me before, on simple tasks, that files or hard drives were written off as corrupt, yet all the files were still there. The Bash shell has allowed me to see what my OS restricts. This unrestricted access, combined with programming, can allow tasks to be carried out that otherwise are not possible in real time, or at all. To bring in another movie reference, “with great power comes great responsibility”. Despite the overwhelming power of the Bash shell, it is easy to do things wrong… very wrong. In that way it may be intimidating to users, because there is no undo or recycling bin. Still, it is a very powerful tool for the ecologist who wants to do something on their computer, but can’t.
Base vs dplyr
Why is dplyr better than base R? I haven’t quite found out whether dplyr is better, but I have noticed that it is easier to understand when sharing with collaborators. Nesting functions within functions may make sense to you, but to others it can look like a disaster. Will I switch over to dplyr? Maybe. It does mean learning a bunch more commands, and most are about the same character length as their base equivalents. However, collaboration is everything, and seeing subset(subset(subset… may scare a few people off.
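To make the comparison concrete, here is a minimal sketch on a made-up data frame. The `surveys` table, its column names, and its values are invented for illustration and are not taken from the course materials. It assumes the dplyr package is installed.

```r
library(dplyr)

# Toy data, invented for illustration only.
surveys <- data.frame(
  species = c("A", "A", "B", "B", "A"),
  site    = c("north", "south", "north", "north", "north"),
  year    = c(2015, 2016, 2016, 2017, 2017),
  count   = c(12, 7, 3, 9, 15)
)

# Base R: each subset() wraps the previous one, so it reads inside out.
base_result <- subset(
  subset(subset(surveys, species == "A"), site == "north"),
  year >= 2016
)

# dplyr: the same three filters, read top to bottom.
dplyr_result <- surveys %>%
  filter(species == "A") %>%
  filter(site == "north") %>%
  filter(year >= 2016)
```

Both versions keep the same single row. To be fair to base R, the nesting is avoidable there too, since `subset(surveys, species == "A" & site == "north" & year >= 2016)` does the same job in one call. The difference is mostly in how the steps read to someone else.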
Such an unexpected surprise! I really like R Markdown and how easy it is to generate a quality website with little code. The best part is that it still makes it easy to link to CSS or HTML files. Nothing is perfect, and it unfortunately means learning another series of commands and codes for something that already exists. However, it does tie in better with R scripting. This allows for the development of pages that are half website, half experiment results, which can be used as a blog post, shared with others, etc. The course taught us a lot about it and I’m already forgetting much of what I heard, but I will begin to incorporate as much as I can.