Experimentation workflow

For science and business alike, workflows are necessary for structuring large projects with multiple collaborators. In the big data world, data management tools are becoming extremely popular, and even necessary, when building large databases. Workflows have also been described for conducting scientific experiments, but I don’t think I have ever seriously contemplated the structure behind them. I have always done things a certain way when conducting an experiment, but is it the right way? I decided to write out my mental workflow for how I envision the experimentation process, all the way from conception to publication.

Design phase – I never realized how much work is involved in this stage. Often we view the publication process or the field experiment as requiring more effort, but I would argue this step is the most limiting. It is a lot of work to generate research questions, understand the system and then determine how to test it effectively. The design is also restricted by many constraints, such as permissions, available resources, timing with other responsibilities and, for the ecologist, weather. This stage is also where the most dialogue and the most revisions to the design occur. Certain limitations may end up sending the whole experiment back to the idea phase. During this phase, the purpose of the experiment and the method of implementation need to be clearly laid out, because this is when they are easiest to change, and changes later on may not be possible.

Implementation phase – Assuming that the design phase was completed effectively, this stage will involve the least mental effort but the most labour. Unexpected challenges will still occur and will require immediate restructuring that could not have been planned for originally. It is also important to build the dataset while collecting, to ensure that the little details are recorded.
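As a minimal sketch of what recording details during collection could look like, the Python snippet below checks each field record as it is entered and lists anything that is missing. The column names (`site`, `date`, `observer`, `count`), the function `validate_record` and the example record are all hypothetical and only for illustration; a real datasheet would have its own columns and rules.

```python
from datetime import date

# Hypothetical columns for a field datasheet; adjust to the actual experiment.
REQUIRED_FIELDS = ("site", "date", "observer", "count")


def validate_record(record):
    """Return a list of problems found in one field record (empty if none)."""
    problems = []
    # Flag any required field that is absent or left blank.
    for field in REQUIRED_FIELDS:
        value = record.get(field)
        if value is None or value == "":
            problems.append(f"missing {field}")
    # A simple sanity check: counts of organisms cannot be negative.
    count = record.get("count")
    if isinstance(count, int) and count < 0:
        problems.append("count is negative")
    return problems


# Made-up example record with the observer left blank.
record = {"site": "A1", "date": date(2020, 6, 1), "observer": "", "count": 12}
print(validate_record(record))
```

Running a check like this at the end of each field day, rather than weeks later, makes it more likely that a missing detail can still be recovered from memory or field notes.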

Manuscript phase – This phase brings the whole experiment together. The results are generated first so they can be shared with collaborators. Ideally, if the experiment went as planned, the statistical tests are already known and directly test the predictions. Further statistical tests can be added following discussions with others who may offer more insight into the design. At this stage the manuscript might be mostly written, particularly the methods, results, hypotheses and predictions. A meeting with collaborators might then be necessary to discuss how to structure the remaining parts of the manuscript. The paper is then revised, finalized, submitted for publication and likely revised a few more times.

Products – Finally, once the paper is accepted for publication, three products should emerge. First, the publication, which unfortunately is the most important of the three, particularly for early career researchers. Second, all the data collected from the experiment should be published online in an appropriate archive. Determining the right repository for the type of data is also something that should be discussed with collaborators. Lastly, the statistical workflow should be published online and shared. This is the least commonly shared product, but it is extremely important. Primarily, it helps the scientific community better understand what you did, which increases transparency, and it can benefit another ecologist wanting to conduct similar analyses. This is simplest with R, where code can be shared on GitHub and a URL embedded into the manuscript post-acceptance.