Version control

Versioning and archiving of processing routines

  • Version control has a liberating effect on software development.
  • Version control is an essential prerequisite
    to unequivocally identify the version of a routine employed.
  • Distributed version control systems make version control easier to adopt
    and reduce external dependencies.
  • For a larger project, a clear workflow should be established
    and followed consistently.
  • Besides automated tests, version control has the largest impact
    on how we program.

Why versioning?

  • Software is always subject to change – that's why it is called “soft”.
  • Routines for data processing and analysis develop, often together with the understanding of the task.
  • With version control, reverting to the last version that “just worked” is an easy task.

What should be versioned?

  • Basically everything, from a script for data processing to the final manuscript/thesis/project report

What are prerequisites for ubiquitous versioning?

  • Open file formats allowing for easy versioning (ideally text formats)
  • Minimal infrastructure requirements, as met by a distributed version control system
  • Possibility to share versioned data with others (via simple-to-use web interfaces)
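
Why text formats are easy to version can be illustrated with a line-based comparison. The following is a minimal sketch using Python's standard difflib module. The file name and the parameter values are made up for illustration, and git uses its own diff implementation, so this only shows the principle.

```python
import difflib

# Two hypothetical versions of a plain-text parameter file, one line each.
old = ["threshold = 0.5\n", "window = 11\n", "order = 3\n"]
new = ["threshold = 0.5\n", "window = 15\n", "order = 3\n"]

# A unified diff lists exactly the lines that differ, plus some context.
diff = list(
    difflib.unified_diff(
        old, new, fromfile="parameters.txt (old)", tofile="parameters.txt (new)"
    )
)

# Keep only the changed lines, skipping the two file header lines.
changed = [
    line
    for line in diff
    if line.startswith(("+", "-")) and not line.startswith(("+++", "---"))
]

print("".join(diff))
```

A single changed parameter shows up as one removed and one added line. A binary format would typically only reveal that the file as a whole has changed.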

Beware: Versioning without version numbers is only half the battle.
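
One way to make use of version numbers is to store the number in a single place and record it together with every result. The sketch below assumes a hypothetical routine subtract_baseline, a made-up version number, and a hypothetical helper process_with_record. None of these names come from a real package.

```python
from datetime import datetime, timezone

# Hypothetical version number, kept in a single place and increased
# with every release of the processing routines.
__version__ = "0.3.1"


def subtract_baseline(values):
    """Hypothetical processing routine: subtract the mean from all values."""
    mean = sum(values) / len(values)
    return [value - mean for value in values]


def process_with_record(routine, values, version=__version__):
    """Apply a routine and return the result together with a record
    stating which routine in which version produced it."""
    result = routine(values)
    record = {
        "routine": routine.__name__,
        "version": version,
        "timestamp": datetime.now(timezone.utc).isoformat(),
    }
    return result, record


result, record = process_with_record(subtract_baseline, [1.0, 2.0, 3.0])
print(record["routine"], record["version"])
```

Stored alongside the processed data, such a record states which version of which routine was employed, provided the version number is increased whenever the routine changes.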

Which tool to use

Note: This section is clearly opinionated. Other tools are definitely available, but this is the tool the author recommends from personal experience.

git – Everything is local
Widespread distributed version control system with excellent support
Largest development and code-sharing online platform, with free accounts

For local installations, and given a minimum of infrastructure, Gitea (“Git with a cup of tea”) is highly recommended as a fast and lightweight alternative to services such as GitLab. Gitea describes itself as a “painless self-hosted Git service”, and this really is true.

