2019-05-12

Version Numbers

A version number is used to identify different versions, revisions or editions of an artifact. It is common to start counting from 0 and then continue with 1, 2, 3 and so on. No implied meaning should be assigned to any specific number; they are all equally good at identifying a version of the artifact.

A universal version number is open for branching at any existing version: there is always at least one free version number to pick. There are many ways to achieve this, but a simple variant needs only one extra rule: trailing zeros are free. You can always add ".0" to any version and it will still be the same version, e.g. 2, 2.0, 2.0.0 and 2.0.0.0 are all the same version. When we want to branch, we add one or more zeros to designate the head of the branch. Then we increase the last zero by one to get the first version on the branch.

   0 - 1 - 2 - 3 - 4 - 5 - (main line)
            \
      (2.0)  \- 2.1 - 2.2 - 2.3 - (1st branch of v. 2)
              \
      (2.0.0)  \- 2.0.1 - 2.0.2 - 2.0.3 - (2nd branch of v. 2)
                   \
      (2.0.1.0)     \- 2.0.1.1 - (1st branch of v. 2.0.1)

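It is easy to see that if you limit the number of dots in your version numbers, they are not universal: a version that already has the maximum number of dots cannot take another zero, so there is no free number to branch to.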
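The trailing-zeros rule and the branching step can be sketched in a few lines of Python. This is only an illustration under my own assumptions: the function names (parse, normalize, same_version, first_on_new_branch) are hypothetical, and the set of versions already taken is assumed to be known to the caller.

```python
from typing import Iterable, Tuple

Version = Tuple[int, ...]


def parse(text: str) -> Version:
    """Parse a dotted version string such as '2.0.1' into (2, 0, 1)."""
    return tuple(int(part) for part in text.split("."))


def normalize(version: Version) -> Version:
    """Drop trailing zeros but keep at least one number: (2, 0, 0) -> (2,)."""
    parts = list(version)
    while len(parts) > 1 and parts[-1] == 0:
        parts.pop()
    return tuple(parts)


def same_version(a: str, b: str) -> bool:
    """Two version strings are the same version if they differ only in trailing zeros."""
    return normalize(parse(a)) == normalize(parse(b))


def first_on_new_branch(base: str, taken: Iterable[str]) -> str:
    """Append zeros to `base` until increasing the last zero gives an unused version."""
    used = {normalize(parse(v)) for v in taken}
    head = parse(base)
    while True:
        head = head + (0,)            # head of a candidate branch, e.g. 2.0
        candidate = head[:-1] + (1,)  # first version on that branch, e.g. 2.1
        if candidate not in used:
            return ".".join(str(n) for n in candidate)
```

With the main line 0 to 5 taken, first_on_new_branch("2", ...) gives 2.1. Once 2.1 is taken it gives 2.0.1, which matches the diagram above.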

In software it is common to track the version of different aspects of the artifact using dot notation and branches, e.g. 2.4, where the first number tracks the version of the interface and the second tracks the version of the implementation of that interface. Semantic Versioning[1] is one example of a versioning scheme that does this. Another case is when development takes place in multiple branches in parallel, for example when an old release is maintained at the same time as new functionality is added to a future release. Then you need a number to separate the branches.

   0.0           1.0 ---------> 2.0     3.0
      \           ^ \              \     ^
       \          |  \              \    |
        \- 0.1 - 0.2  \- 1.1 - 1.2   \- 2.1 - 2.2


Sometimes it is necessary to depart from the main line and branch off a separate version of the artifact. Semantic Versioning lets you add an extra number to indicate that it is a patched version, e.g. 2.4.1. This has its limitations though. It is, for example, not possible to make changes to the interface, as the interface version number is fixed. The insight that leads to a solution is that semantic version numbers always come in pairs. Add a pair of numbers instead of just one, e.g. 2.4.0.1. The first is for interface changes and the second for implementation changes, just like on the main line.
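A minimal sketch of the pair idea, in Python. The helper names branch_off and bump are hypothetical, and resetting the implementation number to 0 on an interface change is my assumption, taken from how the main line goes from 0.2 to 1.0 in the diagram above.

```python
def branch_off(version: str) -> str:
    """Open a branch by appending a fresh (interface, implementation) pair."""
    return version + ".0.0"


def bump(version: str, interface_changed: bool) -> str:
    """Bump the last (interface, implementation) pair of a version."""
    parts = [int(p) for p in version.split(".")]
    if len(parts) % 2 != 0:
        raise ValueError("expected an even number of components, e.g. 2.4 or 2.4.0.1")
    if interface_changed:
        parts[-2] += 1  # interface number of the last pair
        parts[-1] = 0   # assumption: implementation number restarts at 0
    else:
        parts[-1] += 1  # implementation-only change
    return ".".join(str(n) for n in parts)
```

Branching off 2.4 gives the head 2.4.0.0, an implementation fix on the branch gives 2.4.0.1, and a later interface change on the same branch gives 2.4.1.0.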

A few rules I try to follow for version numbers are...
  • All artifacts should, as far as possible, be able to tell you which version they are.
  • Change happens all the time. Don't assume you know beforehand when an artifact is free of errors and will no longer change.
  • Let the version numbers map to the branching strategy.
  • Use version numbers that are universal (see above).
  • Only change the version number when the artifact changes.
  • Don't use the version number for other things like review status, approval status or quality status.

The first version with a new major number, e.g. 1.0, 2.0 and so on, is not the end goal but the start of something new. If you treat it as the end goal you create a lot of practical problems for yourself. It is likely that 1.0 is less than perfect and you need to release a bugfix. You could call it 2.0 to replace 1.0 as the new perfect version with the high-quality moniker .0, but it is better to reserve the first number for interface or other major changes, where it serves a better purpose. So you settle for 1.1. Now what do you call the pre-release versions leading up to 2.0? Not 1.2, because that is the second bugfix release of 1.0. And you do want to give those internal pre-release artifacts a version number, so that test results, reviews and other feedback can be related to something traceable. Therefore it is better to start development at version 2.0, tag the first few versions in the 2.x series as experimental pre-releases, and later decide which version of the 2.x series is good enough to qualify as the first usable version.

Release Candidate is also a useful concept, where you build an artifact that you hope will be good enough to later be released, but you don't know for sure until you have tested it. I am against rebuilding a good release candidate just to change the version number, therefore the RC designator will always be part of the version number, e.g. 2.0.12, which is the 12th release candidate of version 2.0.

It is usually a mistake to use development versions and unstable pre-releases à la Semantic Versioning. Say you have decided to use something like Semantic Versioning to communicate with your users, but then say you won't follow it for 0.x versions. Then what's the point? It is likely that you will be stuck on 0.x for many years and your users will struggle to understand what will break in the next release. Use your versioning scheme right from the start, if only for the purpose of learning about its consequences. If you are too lazy to figure out whether the interface has changed in the last iteration, just increase the major number so as not to give your users any unpleasant surprises. The urge not to waste numbers is a delusion. You will not run out of numbers and used numbers don't pollute.

To reiterate, a specific version number, say 2.0, is never a goal in itself. The purpose of the version number is to communicate what it supersedes and how extensive the changes are. The version number you eventually end up with is dictated by the changes you make and how often you release.

Another common misconception is to insist that reviewed and approved documents should have the version number 1.0, 2.0, 3.0 and so on. The version number gets overloaded with meaning when its main purpose should be to uniquely identify the document and simplify traceability. Also, formal review is just one method among others to get feedback for improvement, and if you link it to the version number it will suppress other types of feedback. One of the problems that this x.0 rule causes is that documents never get reviewed, because the author thinks there are still too many loose ends for the document to be worthy of a 1.0. Reviews are good for finding errors and spreading knowledge in the organization, and a process that puts obstacles in the way of reviews is suboptimal. Another problem with x.0 is that you create artificial barriers to improving x.0 documents when you find errors, because it costs too much effort to get them to the next y.0 version. There are very few organizations that can afford such rigor in the process. Most organizations I have worked with would fare much better with a process that prioritizes small and cheap iterative corrections to documents and only in rare cases requires that documents be reviewed and approved before development can proceed. What I see in practice is that a 0.6 or a 2.1 never prevents development from proceeding, so why pretend to have a rigorous process when it only causes problems and discomfort? Track the quality status of a document with its own properties, e.g. as additional notes in the document changelog, so that both 0.7 and 2.3 can be reviewed as well as approved.
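One way to keep quality status out of the version number is to store it as notes on the changelog entry. The Python sketch below is only an illustration: the ChangelogEntry type, its field names, the note strings "reviewed" and "approved", and the sample entries are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class ChangelogEntry:
    version: str  # identifies the document revision, nothing more
    summary: str  # what changed in this revision
    notes: List[str] = field(default_factory=list)  # quality status lives here


def find_entry(changelog: List[ChangelogEntry], version: str) -> Optional[ChangelogEntry]:
    """Return the changelog entry for `version`, or None if there is none."""
    for entry in changelog:
        if entry.version == version:
            return entry
    return None


def is_approved(changelog: List[ChangelogEntry], version: str) -> bool:
    """A version is approved if its changelog entry carries the note 'approved'."""
    entry = find_entry(changelog, version)
    return entry is not None and "approved" in entry.notes


# Hypothetical sample data: any version can be reviewed and approved.
changelog = [
    ChangelogEntry("0.6", "First draft"),
    ChangelogEntry("0.7", "Corrections after review", notes=["reviewed", "approved"]),
    ChangelogEntry("2.3", "Updated for new interface", notes=["reviewed", "approved"]),
]
```

Here 0.7 and 2.3 are both approved without either of them having to be renamed to an x.0 version.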

It is better to let the version number of documents follow the branching of the artifact they describe. This means you don't need to maintain a cross reference matrix for traceability. Documents with version 2.x contain changes that are valid for artifact version 2.y. If you need separate documentation for each minor version, then documents with version 2.1.x belong to artifact 2.1. You choose the granularity that is suitable for your circumstances. With documents in source control the most significant numbers in the version can be tracked globally by the branch. The least significant number can be assigned automatically, e.g. using the global Subversion repo revision or by counting the number of commits of the document in the branch.
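As a rough sketch in Python, the document version can be composed from the branch's version and an automatically assigned counter. The function names are hypothetical, and the counter (a repository revision or a commit count) is assumed to be supplied by the caller from whatever source control system is in use.

```python
def document_version(branch_version: str, revision: int) -> str:
    """Compose a document version from the branch version and a counter.

    `branch_version` holds the most significant numbers, tracked by the branch.
    `revision` is the automatically assigned least significant number.
    """
    if revision < 0:
        raise ValueError("revision must not be negative")
    return f"{branch_version}.{revision}"


def belongs_to(doc_version: str, artifact_branch: str) -> bool:
    """True if the document version starts with all numbers of the artifact branch."""
    doc = doc_version.split(".")
    branch = artifact_branch.split(".")
    return len(doc) > len(branch) and doc[: len(branch)] == branch
```

A document on the 2.1 branch with counter 57 becomes 2.1.57, and belongs_to("2.1.57", "2.1") holds without any cross reference matrix. Comparing whole numbers rather than string prefixes keeps 2.10.3 from being mistaken for a 2.1 document.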

Another reason to prefer the branch-centric versioning scheme over the review-centric one is the way it changes how you think about your work. The artifact is never finished, only continuously improved.

A third superficial concern regarding version numbers is that users are somehow entitled to get all consecutive versions and that they will be confused if there is a jump in minor or major version between releases. If the user has version 2.3, they expect to get 2.4 next. To get 2.7 instead would cause severe psychological trauma. This is simply not true. Don't let this concern prevent you from tracking changes, be they internal or external, with a universal version number. The people who insist on this anti-pattern are typically also the ones who want to haphazardly jump to 10.0 just to catch up with competitors.

For practical advice on how to assign versions to software builds, see [2].

[1] https://semver.org/
[2] https://embeddedartistry.com/blog/2016/10/27/giving-you-build-a-version

