Consulting

Why We Code Review Before We Estimate

Mercedes Bernard — Tandem Alum

We don’t always build greenfield products. We love working with clients who have big dreams for their current systems. Before we start working with an existing application, we like to have the opportunity to audit its current state and do a code review. There’s so much to be learned from what’s already there and more information always leads to more accurate estimates.

In this post, I outline a variety of things we look at during a code review, what they tell us about the application, and how that may affect our prioritization decisions, recommendations, and/or estimates.

Benefits of robust code review

When we have a holistic understanding of your application, we’re able to provide informed recommendations of the technical areas you should invest in and which areas should be deprecated. Technical debt is a fact of software life, but we can help you know where it’s worth paying down and where it’s a sunk cost.

Everything that we look at during a code review is in service of making sound recommendations driven by your business goals while also providing more accurate estimates of the effort it will take to deliver on those goals. We review many different aspects of your application, but we don’t judge what we find. Our code reviews are an opportunity to learn about your application’s history, which helps us prepare for your application’s future.

Does it run?

Before we begin our code review, the first thing we try is to get the application running on our local machines. If we’re unable to get the application running within an hour or two there may be a bigger issue lurking. We don’t want to spend valuable code review time debugging setup issues. We’ll reach out to your internal team but if no one is able to run the code, we start having larger conversations about stabilizing the code base before estimating new features for an unstable application.

Language/framework audit

Once we have the code base running, we start by auditing the language and framework version(s) of the application to determine its current support status. Running old versions of languages and frameworks can open your application up to unpatched security vulnerabilities. Language and framework maintainers will deprecate older versions and no longer release updates after a certain date. Using out-of-date versions may also limit our ability to deliver on your dreams if we’re constrained by a lack of features in legacy versions.

Infrastructure audit

Following an audit of the backbone of the application, we look at the supporting infrastructure to audit the versions, support, and necessary dependencies. We pay special attention to the database (the powerhouse of an application) but also look at other infrastructure you may have such as your storage services, caches, message brokers, etc. As before, we are auditing version support and whether or not the infrastructure is currently supported.

We are also making note of any interesting dependencies and considerations to be aware of with the chosen infrastructure. For example, if we discover an Oracle database hooked up to the Rails app, we may do a little extra research to document any limitations since Oracle isn’t a standard choice for Rails.

Finally, we pay attention to how the infrastructure is configured to note any architectural considerations we have when estimating and planning your project.

Environment audit

The next step in our code review is to learn about the various configured environments for your application and their automated continuous integration and continuous delivery (CI/CD) pipelines or lack thereof. Understanding the environments and how they fit together is key to planning developer workflows, as well as determining quality assurance (QA) and user acceptance testing (UAT) processes.

If there are no automated CI/CD pipelines configured, we plan to budget time for setting those up. Automated CI/CD pipelines increase developer productivity, decrease regressions, increase code quality, and shorten time to market by making it easier to deploy code early and often.

Data model and database design

Then we look at the database and the relationships between the data models. We pay attention to how normalized the data is, or how limited the data redundancy is. We also pay attention to what data constraints and validations are present. A strong combination of normalization and constraints protect the data from anomalies and the application from bugs due to out of sync relationships.

A database that is very denormalized or lacks constraints informs us to budget time for remediating these concerns or for writing extra validation and error handling into the application code to prevent dirty data that could cause hard to find application bugs.

Automated static analysis

After gaining an understanding of how the pieces of an application fit together, we dig into the details of the code itself. We begin by using a variety of tools for automated static analysis. Static analysis focuses on the structure of the code and established software practices. We look at an array of metrics within the code base to give us a high-level understanding of how easy it will be to work with and how ready it is for extending.

Lines of code

The simplest metric to measure is the size of the code base. Size is helpful to consider when estimating so we can account for the time it will take to gain context and understand the various parts of the application.

Next, we take a look at class and method length. Long classes and methods take longer to understand and can be harder to refactor since side effects are more likely present in long methods.

Duplication

We can also run checks to see how DRY (Don’t Repeat Yourself) code is. Code that is DRY means that code is not duplicated in multiple places within the application. This minimizes the number of places where a change would need to be made, which limits the number of opportunities to introduce bugs into the code.

Opportunities for refactoring

Then we measure complexity and churn. On their own, neither one of those metrics tells us much but taken together, they can tell us where the most appropriate candidates for refactoring are based on how likely it is that a regression will be accidentally introduced.

Test coverage

The last metric that we use static analysis tools to measure is test coverage. Test coverage is a quick measure for us to determine our confidence in changing the code base.

Taken together, these numbers can help us decide how stable we feel the code is and how quickly we’ll be able to move within the code base, adding and refactoring functionality.

Automated security audit

While a full security audit is often out of scope for this type of code review, there are some static analysis tools we can use that can identify known security vulnerabilities. Using a tool like this can help us plan for remediating any high priority vulnerabilities in the application.

Human analysis

Static analysis is great for quickly understanding the structure of a code base, but there are some cases where we can’t substitute the keen eye of an experienced engineer. We like to examine the code to see how closely it follows established conventions and idioms of it’s framework.

Conventions

Code that is highly idiomatic makes it easier to onboard engineers because they can quickly navigate the code base knowing where to look for files and functionality based on past experience.

Dead code

We also like to see how much of the code in an application is unused. Unused code, or dead code, can affect the overall health of a code base. We may be investing time stabilizing dead areas when we should remove dead code so that we have more time to devote to the lively areas of the application.

Dependencies

Then we examine the major dependencies in an application to check that they are actively maintained. If we find dependencies that are not actively maintained, we may budget time for replacing those dependencies with modern, maintained alternatives so that the project can continue to be upgraded and features continually added.

Documentation

The last bit that we like to rely on human analysis for is reading the project documentation. If a project is not well documented, we may take that as a risk that there are potentially unknown features and functionality to be prepared for.

Dynamic analysis

Finally, we like to run the application locally and use performance profiling tools to determine if there are any risks to your scaling goals that should be addressed earlier rather than later. If we notice poor performance in an important area of the application, we may recommend addressing that early in order to be able to meet the load demands once the project is launched.

Mercedes Bernard