Tutorial: Workflows for reproducible research in computational neuroscience
Andrew Davison
Unité de Neurosciences, Information et Complexité (UNIC), Centre National de la Recherche Scientifique, Gif sur Yvette, France. http://andrewdavison.info
Notes for a tutorial given at CNS 2012.
Version 0.3
July 23, 2012
Abstract
Reliably repeating previous experiments, one of the cornerstones of the scientific method, ought to be easy in computational neuroscience: computers are deterministic and do not suffer from the inter-subject and trial-to-trial variability that makes reproduction of biological experiments so challenging. In general, however, it is not at all easy, especially when running someone else’s code, or when months or years have elapsed since the original experiment.
The failure to routinely achieve replicability in computational neuroscience (and probably in computational science in general; see Donoho et al., 2009 [1]) has important implications for both the credibility of the field and its rate of progress, since reuse of existing code is fundamental to good software engineering. For individual researchers, as the example of ModelDB has shown, sharing reliable code enhances reputation and leads to increased impact.
In this tutorial we will identify the reasons for the difficulties often encountered in reproducing computational experiments, and present some best practices for making our work more reliable and more easily reproducible by ourselves and others, without adding a huge burden to either our day-to-day research or the publication process.
We will then cover a number of tools that can facilitate a reproducible workflow and allow tracking the provenance of results from a published article back through intermediate analysis stages to the original models and simulations. The tools that will be covered include Git, Mercurial, Sumatra and VisTrails.
Contents
- Why reproducible research?
- Best practices for reproducible research
- Version control
  - Basic ideas
  - Examples of version control systems
  - The importance of tracking projects, not individual files
  - Advantages of formal version control systems
  - Installing Mercurial
  - Creating a repository
  - Adding files to the repository
  - Committing changes
  - Viewing the history of changes
  - Seeing what’s changed
  - Switching between versions
  - Giving informative names to versions
  - Recap #1
  - Making backups
  - Working on multiple computers
  - Collaborating with others
  - Recap #2
  - A comparison of Git and Mercurial
  - A comparison of Subversion and Mercurial
  - Graphical tools
  - Web-based tools
- Testing
- Provenance tracking
- Conclusions
[1] Donoho, D.L., Maleki, A., Rahman, I.U., Shahram, M. and Stodden, V. (2009) 15 Years of Reproducible Research in Computational Harmonic Analysis. Computing in Science and Engineering 11:8-18. doi:10.1109/MCSE.2009.15
Licence
This document is licensed under a Creative Commons Attribution 3.0 licence. You are free to copy, adapt or reuse these notes, provided you give attribution to the author and include a link to this web page.
Sources
https://bitbucket.org/apdavison/reproducible_research_cns - feel free to fork the repository!