You now have the data to trace the spatial movements of programmers within the code. In Chapter 2, Code as a Crime Scene, we pointed out the importance of combining that data with a complexity dimension. Let’s see where the complexity is hiding in this code.
We’re going to use lines of code as a proxy for software complexity. Lines of code is a terrible metric, but the other ones are just as bad. (See the research by Herraiz and Hassan in Making Software [OW10] for a comparison of complexity metrics.) Using lines of code at least gives us some advantages:
It’s fast and simple. More elaborate metrics need to understand the language they’re processing. That means they need to parse the code, which may take some time. Counting lines of code gives us a comparable approximation of complexity much faster.
It’s language-neutral. Language neutrality is the main reason I prefer lines of code. In today’s polyglot systems, sophisticated metrics lose their meaning. As we start to parse individual language constructs, we lose the opportunity for cross-language comparisons. For example, web applications often combine HTML, CSS, and JavaScript in addition to a server-side technology, such as Java, C#, or Clojure. A language-neutral metric lets us get a holistic picture of all these parts, no matter what language they’re written in.
We can always turn to language-specific techniques later to get more details on hotspots. Similarly, we can use any metric to represent complexity. For now, let’s summarize the lines of code in our system.
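To make the "fast, simple, and language-neutral" point concrete, here is a minimal Python sketch of a naive line counter. It is an illustration only, not how cloc works: the function names `count_nonblank_lines` and `lines_per_file` are hypothetical, and the sketch cannot tell comments from code because it knows nothing about the language it reads.

```python
from pathlib import Path


def count_nonblank_lines(path):
    """Count the lines in a file that contain a non-whitespace character."""
    # errors="replace" keeps the count going on files with odd encodings.
    with open(path, encoding="utf-8", errors="replace") as handle:
        return sum(1 for line in handle if line.strip())


def lines_per_file(root, suffixes):
    """Map every file under root with a matching suffix to its line count.

    The same counting rule applies to every file, whatever the language,
    which is what makes the numbers comparable across a polyglot system.
    """
    counts = {}
    for path in sorted(Path(root).rglob("*")):
        if path.is_file() and path.suffix in suffixes:
            counts[str(path)] = count_nonblank_lines(path)
    return counts
```

Calling `lines_per_file(".", {".clj", ".js", ".css"})` would count Clojure, JavaScript, and CSS files with one rule. No parser is involved, which is why the approach stays fast.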
Many tools count lines of code. My favorite is cloc. It’s free and easy to use. You can get a copy of cloc on SourceForge.[11]
With cloc installed, let’s put it to work:
    prompt> cloc ./ --unix --by-file --csv --quiet

    language,filename,blank,comment,code
    Clojure,./src/code_maat/analysis/logical_coupling.clj,23,14,145
    Clojure,./test/code_maat/end_to_end/scenario_tests.clj,23,19,117
    Clojure,./src/code_maat/analysis/churn.clj,14,11,99
    Clojure,./src/code_maat/app/app.clj,13,6,94
    Clojure,./test/code_maat/analysis/logical_coupling_test.clj,15,5,89
    ...
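Here we told cloc to count all files under the current directory (./), which is the code-maat directory. We also specified that we want statistics --by-file (the alternative is a summary) and --csv output. The --quiet flag suppresses progress messages so that only the report is printed, and --unix tells cloc to run in UNIX mode instead of autodetecting the operating system. As you can see in the following figure, cloc does a good job of detecting the programming language the code is written in.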

Based on language, cloc separates lines containing comments from real code. We don’t want blank lines or comments in our analysis.
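Since only the code column matters for our analysis, the CSV report can be reduced to a mapping from file name to lines of code. The following Python sketch shows one way to do that. The function name `code_lines_by_file` is hypothetical, the sample rows are copied from the output above, and skipping rows without a file name is a defensive assumption about summary rows that a report may contain.

```python
import csv

# Sample rows taken from the cloc output shown earlier.
SAMPLE = """language,filename,blank,comment,code
Clojure,./src/code_maat/analysis/logical_coupling.clj,23,14,145
Clojure,./test/code_maat/end_to_end/scenario_tests.clj,23,19,117
Clojure,./src/code_maat/analysis/churn.clj,14,11,99
"""


def code_lines_by_file(cloc_csv_text):
    """Return {filename: lines of code}, ignoring blanks and comments."""
    # Drop empty lines first so the header row is always the first record.
    lines = [line for line in cloc_csv_text.splitlines() if line.strip()]
    result = {}
    for row in csv.DictReader(lines):
        filename = row.get("filename")
        code = row.get("code")
        # Assumption: rows without a file name or a numeric code count
        # (for example a totals row) are not per-file data, so skip them.
        if not filename or not code or not code.isdigit():
            continue
        result[filename] = int(code)
    return result


if __name__ == "__main__":
    for name, loc in code_lines_by_file(SAMPLE).items():
        print(loc, name)
```

To run the sketch on a real report, redirect the cloc output to a file (the name loc.csv is just an example), read that file as text, and pass its contents to `code_lines_by_file`.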