Complexity in Software

When it comes to complexity, every other piece of software ever written sits somewhere between Hello World and Google. What makes one piece of software more complex than another? Lines of code might be the obvious answer, especially considering the link above, but I believe the causation runs the other way: complexity increases the number of lines of code, not the reverse.

Complexity grows with the number of components that need to play well together. Web applications need a database, a back end, and a front end. The front end is rendered and asks the back end for data; the back end retrieves the data from the database, formats it, and sends it to the front end. Let’s think of software in terms of MVC. Even though the front end and the back end each seem to use their own variation of MVC, for our purposes we’ll consider the database to be the model, the front end to be the view, and the back end to be the controller. Most features involve changing all three components, and each one needs to work well on its own and together with the others for the feature to work. This adds complexity, not only because there’s code whose sole purpose is to communicate with the other layers, but also because you need to test all three components together.

That applies to all software in the same category, though, and some pieces of software are inherently more complex than others. Take the side project I’m currently working on. It is, at its core, a very CRUDy application. Users are presented with things, they can save said things, and the things they saved are presented back to them. Simple, right? Well, sure, but that’s just the techy elevator pitch. The things need to be presented nicely, they need to be laid out in a meaningful way, the pages need to make sense, the things need to have other metadata associated with them, the things need to be imported from 3rd parties, and there needs to be a way to keep the things up to date. The list goes on, and on, and on.

The number of features in a piece of software is an indicator of complexity, but it’s not enough on its own, since features differ in complexity. A list of featured items is not as complex as a recommendation engine. For the former, data needs to be retrieved from the database, aggregated, and displayed. A recommendation engine, assuming one is written from scratch, requires usage metrics to be saved and an algorithm that determines what people are interested in, all of which takes a lot of planning.

If lines of code are caused by complexity, and features don’t correlate 1:1 with complexity, then what determines complexity in a piece of software? Requirements do. Features have multiple requirements. Generally speaking, the more complex a feature is, the more requirements it has and/or the more complex each requirement is.

Let’s use the two features above as an example. Both display a set of things to the user. The first, a list of featured things, might have the following requirements:

  • Featured things must refresh every 10 minutes.
  • Featured things are things that are saved most recently by users.
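
To get a feel for how small this feature is in code, here is a minimal sketch. It assumes SQLite and a hypothetical `saves` table with a `saved_at` column; the names `most_recently_saved` and `FeaturedCache` are made up for illustration and are not from my project. One query covers the "most recently saved" requirement, and a small in-memory cache covers the 10-minute refresh.

```python
import sqlite3
import time

# Hypothetical schema: one row per save, with a timestamp.
SCHEMA = "CREATE TABLE saves (user_id INTEGER, thing_id INTEGER, saved_at INTEGER)"

REFRESH_SECONDS = 10 * 60  # "Featured things must refresh every 10 minutes."


def most_recently_saved(conn, limit=5):
    """Return thing ids ordered by their latest save time, newest first."""
    rows = conn.execute(
        "SELECT thing_id FROM saves "
        "GROUP BY thing_id "
        "ORDER BY MAX(saved_at) DESC, thing_id "
        "LIMIT ?",
        (limit,),
    ).fetchall()
    return [row[0] for row in rows]


class FeaturedCache:
    """Serve the featured list from memory and rebuild it every 10 minutes."""

    def __init__(self, conn, clock=time.monotonic):
        self.conn = conn
        self.clock = clock  # injectable so the refresh logic can be tested
        self._items = None
        self._loaded_at = None

    def get(self):
        now = self.clock()
        stale = self._loaded_at is None or now - self._loaded_at >= REFRESH_SECONDS
        if stale:
            self._items = most_recently_saved(self.conn)
            self._loaded_at = now
        return self._items
```

A real application would probably handle the refresh with a scheduled job or the caching layer it already has, but the shape stays the same: one read query plus a timer.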

Now let’s look at the recommendation engine’s requirements:

  • Each user sees a recommendation based on their own activity.
  • Similar things are determined based on what users save together.

They both have two requirements, so how can we say that one is more complex than the other? Because requirements don’t map directly to code. The second requirement of the recommendation engine alone involves creating new tables, writing code to save user activity, and writing a relatively complex query to determine which items are saved together. It takes more granular requirements and subtasks to surface these details.
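
Here is a rough sketch of what "saved together" could turn into, again assuming SQLite and a hypothetical `saves` table. The functions `record_save` and `recommend` and the scoring rule (count how often a candidate thing appears alongside things the user already saved) are my own simplifications for illustration, not a description of a real recommendation engine.

```python
import sqlite3

# Hypothetical activity table: who saved what.
SCHEMA = """
CREATE TABLE saves (
    user_id  INTEGER NOT NULL,
    thing_id INTEGER NOT NULL,
    PRIMARY KEY (user_id, thing_id)
);
"""


def record_save(conn, user_id, thing_id):
    """Store user activity; saving the same thing twice is a no-op."""
    conn.execute(
        "INSERT OR IGNORE INTO saves (user_id, thing_id) VALUES (?, ?)",
        (user_id, thing_id),
    )


def recommend(conn, user_id, limit=5):
    """Recommend things that other users saved together with this user's things."""
    rows = conn.execute(
        """
        SELECT other.thing_id, COUNT(*) AS score
        FROM saves AS mine
        JOIN saves AS peer
          ON peer.thing_id = mine.thing_id AND peer.user_id != mine.user_id
        JOIN saves AS other
          ON other.user_id = peer.user_id
        WHERE mine.user_id = ?
          AND other.thing_id NOT IN (SELECT thing_id FROM saves WHERE user_id = ?)
        GROUP BY other.thing_id
        ORDER BY score DESC, other.thing_id
        LIMIT ?
        """,
        (user_id, user_id, limit),
    ).fetchall()
    return [row[0] for row in rows]
```

Even this toy version needs a write path, a table with a uniqueness rule, and a three-way self-join, and it still ignores things like popularity bias and users with no activity. None of that is visible in the two-line requirement list.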

Once we get to very granular details like “create users table”, “create API endpoint to sign in”, and “create signin page”, we have a pretty good idea of what the complexity will be. But almost no one bothers to create such granular subtasks beforehand, and that’s why a developer who sets out to build a “simple” application underestimates the complexity involved. This is also what makes estimating so hard, but that’s a topic for another day.