Software and code quality

Software and code quality matters because it directly affects maintainability, and therefore the cost of developing software systems and of extending them with new features.

Measuring and evaluating software and code quality objectively is a difficult task, as the following quote humorously illustrates:

The only valid measurement of code quality is WTFs/minute
Thom Holwerda

This implies that the only way to measure software quality is a subjective, individual and broad process. This is certainly true to some extent, but we believe that it is possible to apply concrete practices and methods to ensure good software quality.

Since software quality is a topic that a whole series of books [1] could be dedicated to, I just want to give a high-level overview in this post.

Definition of software quality

If we want to evaluate software quality, it basically boils down to the question of how well the software meets the particular requirements that were specified for it.

The requirements can be divided into two groups:

  • Functional requirements
  • Non-functional requirements

Software quality characteristics

Some of the most important characteristics of software quality [2] are:

  • Functional correctness
  • Reliability
  • Usability
    • Adequacy
    • Learnability
    • Robustness
  • Maintainability
    • Readability
    • Evolvability
    • Testability
  • Efficiency
  • Portability
  • Security

Functional correctness is related to the functional requirements. All other aspects tend to be more related to the non-functional requirements.

Functional correctness

Software is considered correct when it meets all of its functional requirements.

Reliability

[latex]Reliability = Correctness + Availability[/latex]

The availability of a software system is the percentage of time during which the system is fully operational.

Availability ([latex]A[/latex]) is defined as:

[latex]A = \frac{MTBF}{MTBF + MTTR}[/latex]

([latex]MTBF[/latex]: mean time between failures, [latex]MTTR[/latex]: mean time to repair)
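
As a worked example with assumed figures (they are not taken from a real system): a system that runs on average 990 hours between failures and needs 10 hours to be repaired has an availability of [latex]A = \frac{990}{990 + 10} = 0.99[/latex], i.e. it is fully operational 99% of the time. Halving the repair time to 5 hours raises this to [latex]A = \frac{990}{995} \approx 0.995[/latex], which shows that availability can be improved by repairing faster as well as by failing less often.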

Usability

Usability describes how easy a piece of software is to use and is defined by adequacy, learnability and robustness.

Adequacy means that

  1. the user input
  2. the program execution
  3. the result of the program execution

are reasonable for the given task.

The learnability of software is influenced by its documentation and by its user interface, which should make the functionality efficient to use.

Software is considered robust if the consequences of a defect are inversely proportional to the probability that the defect occurs.

Maintainability

Software is considered maintainable if

  • defects and their cause can be isolated easily
  • defects can be fixed easily
  • changes can be made easily
  • extensions can be made easily

Evolvability is defined as the capability of a system to be extended with new features.

Therefore maintainability is closely related to readability, evolvability, and testability of the code.

Efficiency

Efficient software makes careful use of resources such as CPU, RAM or external (secondary) storage. Efficiency is not to be confused with product efficiency, which relates to the investment of time and money in a software system.

Portability

Portability describes how easy it is to port the program to another platform or device.

Software is considered portable if the changes required to run it on another device or operating system are less expensive than a reimplementation.

Security

Security is partly a matter of requirements, since security concerns should be part of the functional requirements for the system in question. Beyond that, security is affected by defects and design flaws, which can be minimized by good code quality and software architecture. Tools such as penetration testing tools can help to find security issues.

Here are some aspects that have to be considered for good security:

  • Authorization
  • Encryption
  • Handling passwords
  • Session management
  • Validation
  • Preventing SQL injection

Clean Code Developer

Clean Code Developer (CCD) is an initiative that promotes software quality and the cultivation of higher professionalism within the software industry.

CCD defines four main values:

  • Evolvability
  • Correctness
  • Product efficiency
  • Continuous improvement

From these values the CCD initiative derives a hierarchy of principles, patterns and tools that support the values and that are divided into different skill levels.

Evolvability and correctness have already been defined above.

Product efficiency

Product efficiency is influenced by the time and cost of developing a software system. A software system is considered (product) efficient when the investment of time and money is reasonably low. Product efficiency should not be confused with efficiency as defined earlier.

Continuous improvement

Continuous improvement is not a software quality characteristic. However, in order to keep the software quality high, it is important for software development teams to conduct reviews and retrospectives.

Maintainability is crucial

While all the characteristics described above are important, maintainability is crucial. If the code base is messy, it will significantly slow down the development process. Introducing new features will be expensive because they are hard to weave in. Throwing more staff at the project will only make the mess grow and productivity drop further. This scenario is described in Robert C. Martin’s book Clean Code. He calls it “the total cost of owning a mess”. The cost is that productivity tends towards zero over time.


Methods and practices

Software engineering practices and methods have great influence on the level of software quality. Some of the most important are:

  • Agile software development
  • Domain driven design
  • Type driven development
  • Iterative software development
  • Code reviews
  • TDD
  • Refactoring
  • Automated testing
  • Continuous integration
  • Issue tracking
  • Source code conventions
  • Penetration testing

Design principles

Here is a set of design principles that are closely related to high maintainability:

  • Don’t repeat yourself (DRY)
  • Keep it simple stupid (KISS)
  • SOLID principles
    • Single responsibility principle (SRP)
    • Open closed principle (OCP)
    • Liskov substitution principle (LSP)
    • Interface segregation principle (ISP)
    • Dependency inversion principle (DIP)
  • Separation of concerns (SoC)
  • Favor composition over inheritance (FCoI)
  • You ain’t gonna need it (YAGNI)

Metrics

In order to find a starting point for improvement of quality and maintainability of the code, it can be valuable to examine some code metrics.

The most important code metrics are:

  • Lack of cohesion of methods (LCOM)
  • Afferent and efferent coupling (Ca, Ce)
  • Cyclomatic complexity (CC)
  • Lines of code (LOC)
  • Number of methods / fields per class (NbMethods, NbFields)
  • Abstractness (A)
  • Instability (I)
  • Distance from the main sequence (D)

There are several tools for deriving those metrics like NDepend (.NET) or Sonar (Java).
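
Two of the metrics above are simple enough to compute by hand. The following is a minimal sketch, assuming the usual definitions by Robert C. Martin: instability I = Ce / (Ca + Ce) and the normalized distance from the main sequence D = |A + I - 1|. The class name `Metrics`, the method names and the sample numbers are made up for illustration, and returning 0 for a package without any coupling is a convention chosen here, not part of the definition.

```csharp
using System;

// Hypothetical package: 3 incoming dependencies (Ca), 1 outgoing dependency (Ce),
// and half of its types are abstract (A = 0.5).
double instability = Metrics.Instability(afferent: 3, efferent: 1);
double distance = Metrics.DistanceFromMainSequence(abstractness: 0.5, instability: instability);
Console.WriteLine($"I = {instability}, D = {distance}");

public static class Metrics
{
    // I = Ce / (Ca + Ce): 0 means maximally stable, 1 means maximally unstable.
    // Assumption: a package with no coupling at all is reported as 0.
    public static double Instability(int afferent, int efferent) =>
        afferent + efferent == 0 ? 0.0 : (double)efferent / (afferent + efferent);

    // D = |A + I - 1|: 0 means the package lies on the main sequence.
    public static double DistanceFromMainSequence(double abstractness, double instability) =>
        Math.Abs(abstractness + instability - 1.0);
}
```

For the sample package both I and D come out at 0.25. Tools like the ones mentioned above compute these values for whole code bases, so a sketch like this is mainly useful for understanding what the reported numbers mean.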

Software Tests

For good software quality it is essential to have a good testing strategy which should include:

  • Unit tests
  • Integration tests
  • System tests
  • Regression tests

There are tools that help with testing like

  • Unit testing frameworks
  • Mocking frameworks
  • Test coverage tools

Additional concerns

Make invalid state unrepresentable

Making invalid state unrepresentable is a very valuable way to make software maintainable, testable and robust. Here is an example of a poor type design:

public class Contact
{
  public string FirstName { get; set; }
  public string MiddleName { get; set; }
  public string LastName { get; set; }

  public string EmailAddress { get; set; }
  public bool IsEmailVerified { get; set; }
}

If this were a domain model from the business layer, it would have several problems. It is not clear

  • which values are optional / required
  • what the constraints are
  • which fields are linked together
  • what the domain logic is

Possible invalid states include: all fields are NULL, EmailAddress is NULL while IsEmailVerified is true, or the last name is missing.

These problems can be solved by a good design of the types, which will improve the software quality a great deal.
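
One possible redesign of the Contact type could look like the sketch below. It is only an illustration under a few assumptions: first and last name are taken to be required and the middle name optional, the email check is deliberately naive, and all type names other than Contact (PersonalName, EmailAddress, EmailContact, UnverifiedEmail, VerifiedEmail) are invented for this example. It uses C# records and nullable reference types.

```csharp
#nullable enable
using System;

// Usage: a contact always has a name and exactly one email state.
var contact = new Contact(
    new PersonalName("Ada", null, "Lovelace"),
    new UnverifiedEmail(EmailAddress.Parse("ada@example.com")));

// Verification replaces the whole email state instead of flipping a flag.
if (contact.Email is UnverifiedEmail pending)
{
    contact = contact with { Email = pending.Verify(DateTime.UtcNow) };
}
Console.WriteLine(contact.Email is VerifiedEmail); // True

// An address can only be created through Parse, so it is never empty.
public sealed record EmailAddress
{
    public string Value { get; }
    private EmailAddress(string value) => Value = value;

    public static EmailAddress Parse(string input)
    {
        // Deliberately naive check; real validation rules are out of scope here.
        if (string.IsNullOrWhiteSpace(input) || !input.Contains("@"))
            throw new ArgumentException("Not a plausible email address.", nameof(input));
        return new EmailAddress(input.Trim());
    }
}

// The constructor rejects empty required parts; the middle name is explicitly optional.
public sealed record PersonalName
{
    public string FirstName { get; }
    public string? MiddleName { get; }
    public string LastName { get; }

    public PersonalName(string firstName, string? middleName, string lastName)
    {
        FirstName = Require(firstName, nameof(firstName));
        MiddleName = middleName;
        LastName = Require(lastName, nameof(lastName));
    }

    private static string Require(string value, string name) =>
        string.IsNullOrWhiteSpace(value)
            ? throw new ArgumentException("Value is required.", name)
            : value;
}

// The verified flag and the address are linked: "verified" cannot exist without an address.
public abstract record EmailContact(EmailAddress Address);

public sealed record UnverifiedEmail(EmailAddress Address) : EmailContact(Address)
{
    public VerifiedEmail Verify(DateTime verifiedAt) => new VerifiedEmail(Address, verifiedAt);
}

public sealed record VerifiedEmail(EmailAddress Address, DateTime VerifiedAt) : EmailContact(Address);

public sealed record Contact(PersonalName Name, EmailContact Email);
```

The state "email verified but no address" can no longer be written down, and an empty last name is rejected at construction time. Note that nullable reference types only produce compiler warnings, so passing null for Name or Email is discouraged by this design rather than strictly prevented.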

The problem with NULL

Even though most software developers have grown used to it, NULL is a problem. In this post the author describes how NULL

  • subverts types
  • is sloppy
  • is a special case
  • makes poor APIs
  • exacerbates poor language decisions
  • is difficult to debug
  • is non-composable

Software quality can be improved by choosing a language that avoids NULL, such as F# or Haskell.
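
In a language that does have NULL, a similar effect can be approximated with an explicit option type. The following is a minimal sketch in C#; `Option<T>`, `Lookup` and `Describe` are hypothetical names written for this example and are not part of the .NET base class library. The caller has to handle the missing case because there is no way to get at the value without going through `Match`.

```csharp
#nullable enable
using System;
using System.Collections.Generic;

var phoneBook = new Dictionary<string, string> { ["Ada"] = "555-0100" };

Console.WriteLine(Describe(Lookup(phoneBook, "Ada"))); // number: 555-0100
Console.WriteLine(Describe(Lookup(phoneBook, "Bob"))); // no number on file

// A missing entry is part of the return type instead of a hidden NULL.
static Option<string> Lookup(IReadOnlyDictionary<string, string> book, string name) =>
    book.TryGetValue(name, out var number) ? Option<string>.Some(number) : Option<string>.None;

static string Describe(Option<string> number) =>
    number.Match(some: n => $"number: {n}", none: () => "no number on file");

// Hypothetical minimal option type: either holds a value or holds nothing.
public readonly struct Option<T>
{
    private readonly T _value;
    private readonly bool _hasValue;

    private Option(T value)
    {
        _value = value;
        _hasValue = true;
    }

    // The default struct value has _hasValue == false and therefore represents "nothing".
    public static Option<T> None => default;

    public static Option<T> Some(T value) =>
        value is null ? throw new ArgumentNullException(nameof(value)) : new Option<T>(value);

    // The only way to read the value forces the caller to supply both branches.
    public TResult Match<TResult>(Func<T, TResult> some, Func<TResult> none) =>
        _hasValue ? some(_value) : none();
}
```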

The problem with state

Managing state can become complex very easily. When passing a mutable object to a function, it can potentially be changed by that function. The function signature might or might not imply this. Anyway, there is no way to be sure that the object won’t be changed. If the object is passed to many functions from many different modules, it will become very hard to keep track.

If state is mutated from different threads, it becomes even more difficult. Using locks is only a poor workaround that results in more complex and less maintainable code.

Here immutability comes to the rescue. Some languages, such as F#, Haskell, Clojure and other functional programming languages, support immutable data structures by default.

With immutable data structures it is much easier to reason about the correctness of the code. Because of referential transparency there can’t be implicit side effects [3]. The order of function calls doesn’t matter and the code becomes much more predictable. Fewer tests are needed because functions can be tested in isolation. We do not have to test a function in the context of all possible system states, which results in better product efficiency.

Also concurrency is much simpler because we do not have to worry about concurrent updates.
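
Languages without immutability by default can still opt into it. Here is a small sketch in C#, assuming a made-up `Account` record and `Deposit` function: the record's properties cannot be reassigned after construction, and `ImmutableList<T>` from System.Collections.Immutable returns a new list on every change instead of modifying the existing one.

```csharp
using System;
using System.Collections.Immutable;

var original = new Account("ACC-1", 100m, ImmutableList.Create("opened"));
var updated = Deposit(original, 50m);

// The function could not change the object it was given.
Console.WriteLine(original.Balance);       // 100
Console.WriteLine(updated.Balance);        // 150
Console.WriteLine(original.History.Count); // 1
Console.WriteLine(updated.History.Count);  // 2

// Returns a modified copy; the signature makes the "change" explicit.
static Account Deposit(Account account, decimal amount) =>
    account with
    {
        Balance = account.Balance + amount,
        History = account.History.Add($"deposit {amount}")
    };

// Hypothetical domain type: positional record properties are init-only.
public sealed record Account(string Id, decimal Balance, ImmutableList<string> History);
```

Because `Deposit` cannot alter its argument, it can be tested with a single input and output pair, and the same `Account` value can be shared between threads without locks.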

The programming language

Also the choice of programming language can affect the quality of the software. The following language features can improve code quality and product efficiency:

  • A good type system
  • Type inference
  • Concise syntax
  • Pattern matching
  • Immutability by default
  • Structural equality by default
  • Scriptability
  • Ability for rapid prototyping

A closer look at dependencies

Visualizations of the dependencies between types in a software system can be a good indicator of quality in addition to code metrics. In this article the author presents some interesting examples.

Here are two graphs of the two libraries SpecFlow and TickSpec, which both have a similarly sized feature set.

SpecFlow:
[Figure: SpecFlow dependency graph] [4]

TickSpec:
[Figure: TickSpec dependency graph] [5]

The author argues that the complexity of the code is closely related to the programming language. TickSpec was implemented in F# and shows a clear and clean dependency graph. SpecFlow, on the other hand, is a well-designed library in C#, but might still be more difficult to maintain because of its more complex dependencies.

Conclusion

Software quality is an important but complex concern.

You can see this post as a broad summary of the topic. It can serve as a starting point, a checklist, or a guideline when you think about software quality.

Resources


The header image by Luis Llerena is licensed under a UNSPLASH LICENSE. The picture was modified to fit this article.


  1. Clean Code 
  2. ISO/IEC 9126 Software engineering — Product quality 
  3. Note that side effects in this context refers to the object(s) passed to a function. This disregards other side effects like database updates, console outputs, sending emails, etc. 
  4. The SpecFlow dependency graph by Scott Wlaschin is licensed under CC-BY-NC
  5. The TickSpec dependency graph by Scott Wlaschin is licensed under CC-BY-NC
