Managing Research Software Projects: Glossary

Key Points

Introduction
  • Requirements for small research software projects are typically emergent.

  • Research software is ‘good enough’ when people other than its authors can use it with confidence and extend it with reasonable effort.

Organize Deliberately
  • Name each project component (code, data, metadata, etc.) according to its content or purpose.

  • Group similar project files into dedicated directories.

  • Take advantage of widely used project organization conventions unless there is a compelling reason not to.

Use Version Control
  • Use version control for everything created manually, not just code.

  • Create a new branch for each feature.

  • Only use that branch for that feature.

  • Merge and delete the branch when the feature is complete.

Host Your Project Responsibly
  • Projects can be hosted on university servers, on personal domains, or on public forges.

  • Rules regarding intellectual property and storage of sensitive information apply no matter where code and data are hosted.

Automate Frequent Tasks
  • Use a build manager to manage repetitive tasks.

  • Make build files explain themselves.

  • Use checklists for tasks that have to be done repeatedly, but can’t be done by a computer.

  • Have new contributors go through checklists to look for omissions and inaccuracies.

Make the Software Robust
  • Robust software is software that works for people you’ve never met on machines you’ve never heard of.

  • Every project should have a README that briefly explains its purpose and dependencies.

  • Every program should be able to print a short usage message.

  • Every program should be able to log its actions.

  • Use semantic versioning (major.minor.patch) to identify software versions.

  • Make old versions of software available.

  • Do not require root permissions or other special privileges to install software.

  • Do not use hard-coded paths in software.

  • Provide command-line parameters for commonly changed options.

  • Provide hierarchical configuration files for all options.

  • Do not invent your own syntax for configuration files.

  • Every application should have a small sanity-testing test suite.

  • Always produce identical results for particular inputs.
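
Several of the points above can be sketched in one small program. The example below is a minimal illustration, not a prescribed layout: the program name `simulate`, the version string, and the options `--output`, `--seed`, and `--count` are all hypothetical. It assumes Python's standard `argparse`, `logging`, and `random` modules.

```python
import argparse
import logging
import random

__version__ = "0.1.0"  # hypothetical semantic version: major.minor.patch


def build_parser():
    """Build a parser whose --help output doubles as the usage message."""
    parser = argparse.ArgumentParser(
        prog="simulate",  # hypothetical program name
        description="Draw reproducible random samples.",
    )
    # The output location is an option rather than a hard-coded path.
    parser.add_argument("--output", default="results.txt",
                        help="file to write results to")
    parser.add_argument("--seed", type=int, default=0,
                        help="random seed, so equal inputs give equal outputs")
    parser.add_argument("--count", type=int, default=3,
                        help="number of samples to draw")
    parser.add_argument("--version", action="version",
                        version=f"%(prog)s {__version__}")
    return parser


def draw_samples(seed, count):
    """Return `count` pseudo-random numbers determined entirely by `seed`."""
    rng = random.Random(seed)  # private generator: no hidden global state
    return [rng.random() for _ in range(count)]


def main(argv=None):
    """Parse options, log each action, and write the samples to a file."""
    args = build_parser().parse_args(argv)
    logging.basicConfig(level=logging.INFO)
    logging.info("drawing %d samples with seed %d", args.count, args.seed)
    samples = draw_samples(args.seed, args.count)
    with open(args.output, "w") as writer:
        for value in samples:
            writer.write(f"{value}\n")
    logging.info("wrote %s", args.output)
    return samples
```

Calling `main(["--seed", "42", "--count", "5"])` twice writes the same five numbers both times, and `main(["--help"])` prints the usage message and exits. A real script would normally call `main()` under an `if __name__ == "__main__":` guard.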

Issue Trackers
  • An issue-tracking system is a shared to-do list for a project.

  • Every issue has a few mandatory fields to help with searching, and free-form text for details.

  • Every issue is in a particular state.

  • A project can define a workflow by specifying who can change tickets’ states when.

  • Use tickets to prioritize work: what needs to be done now, what can be deferred until later.

Choose a License
  • People who incorporate GPL’d software into their own software must also release their software under the GPL; most other open licenses do not require this.

  • The Creative Commons family of licenses allows people to mix and match requirements and restrictions on attribution, creation of derivative works, further sharing, and commercialization.

  • People who are not lawyers should not try to write licenses from scratch.

Distribute Your Software
  • Every software ecosystem now includes a package manager to track and update dependencies.

  • Describe software versions in precise machine-readable form to take advantage of these.

  • Always set up development machines using the package description in order to ensure it’s up to date.

Code Review
  • Code review is the most cost-effective known way to find bugs.

  • Use pre-commit review.

  • Review code after all mechanical checks have passed.

  • Keep changes short enough to be reviewed in less than an hour.

  • Use code review to share knowledge within the team.

  • Pair programming is the practice of having two people share one computer while writing code.

  • One person is the driver (typing) while the other is the navigator (watching and commenting).

  • Pairs should switch roles periodically.

  • Pair programming helps spread knowledge.

  • Pair programming is particularly useful for onboarding and difficult tasks.

Test All The Things
  • Write tests to define explicit tolerances.

  • Use a unit testing framework to write and run tests.

  • Isolate tests.
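
As one possible illustration of these three points, the sketch below uses Python's standard `unittest` framework. The function `mean` and the tolerance of 1e-9 are assumptions chosen for the example, not values taken from the lesson.

```python
import math
import unittest


def mean(values):
    """Hypothetical function under test: arithmetic mean of a sequence."""
    return sum(values) / len(values)


class TestMean(unittest.TestCase):
    def setUp(self):
        # A fresh fixture is built before every test, so no test can
        # see changes made by another one (isolation).
        self.values = [0.1, 0.2, 0.3]

    def test_mean_within_tolerance(self):
        # Floating-point results are compared with an explicit relative
        # tolerance rather than with ==.
        self.assertTrue(math.isclose(mean(self.values), 0.2, rel_tol=1e-9))

    def test_fixture_is_not_shared(self):
        # Mutating the fixture here does not affect the other test.
        self.values.append(100.0)
        self.assertEqual(len(self.values), 4)
```

Saved in a file such as `test_mean.py` (a hypothetical name), the tests can be run with `python -m unittest test_mean`.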

Continuous Integration
  • Continuous integration rebuilds and/or re-tests software every time something changes.

  • Use continuous integration to check changes before they are inspected.

  • Check style as well as correctness.

  • Adopt a widely used predefined coding style rather than inventing one of your own.

Agile Development
  • Agile development is a software development process based on short iterations and rapid feedback.

  • Key feedback loops are pair programming, test-driven development, continuous integration, and stand-up meetings.

  • Agile works well for exploratory projects.

  • Agile depends on high-bandwidth communication between developers and users, and on developers wanting to be empowered.

Sturdy Development
  • Sturdy development is a planning-based software development process suited to larger teams and more mature projects.

  • Key features are division of labor, effort estimation, and long-term scheduling.

  • Sturdy development is most suitable for large projects with well-defined goals.

Test-Driven Development
  • Test-driven development (TDD) is the practice of writing tests before writing code.

  • Writing tests first helps clarify the intent and interface of the code to be written.

  • Empirical evidence for TDD’s benefits is unclear, but many programmers find it very useful.

Compromise
  • Technical debt must often be accrued to get a project off the ground, but must eventually be paid down for a project to grow.

  • Use rapid iterative development to bring a project up to a minimally viable standard.

  • Software engineering best practices are not a goal unto themselves, but means to an end.

  • Ensure the scientific core of your project is sound. Otherwise, fix bugs when they come along, and then get back to science.

Build a Community
  • The two biggest factors affecting participation in open projects are ease of setup and warmth of response to first contribution.

  • Specify a contributor code of conduct for your project.

  • Use one primary channel for communication.

  • Avoid common communication pitfalls.

Mentor
  • Power is always distributed unequally in groups - the only question is whether that’s explicit or not.

  • Be clear about what you have delegated and what ‘done’ looks like.

  • Make conditions and criteria for performance reviews explicit.

  • Accept the same kind of feedback that you want to give.

  • Crunch mode doesn’t work.

Building an Organization
  • Every project has a governance structure, either officially or unofficially.

  • Give people pathways to contribution.

Marketing
  • Marketing is the craft of seeing the value of your research from someone else’s point of view.

  • Your papers are your advertisements - your grant proposals are your product.

  • Check your project’s social media profile regularly.

Conclusion
  • Managing a research software project is a learnable skill like any other.

Glossary

feature branch workflow
A style of using version control in which a new branch is created for each feature from a main development branch, and then merged into the development branch when the feature is done.