Your Code as a Crime Scene

Visualize Hotspots

Large-scale systems will have massive amounts of analysis data. Even if Code Maat identifies the hotspots, it will still be hard to compare subsystems against each other or detect other trends, such as clusters of volatile modules. We need more help.

Visualizations are powerful when you have to make sense of large data sets. The human brain is an amazing pattern-matching machine, and the amount of visual information we’re able to process is astonishing. Let’s tap into all that brain power.

Use Circle Packing for Large Systems

We haven’t identified the hotspots in Hibernate yet. But let’s sneak ahead and see where we’re heading. Here’s how our Hibernate data looks in an enclosure diagram (a visualization form that works well for large systems):

Look at all those nested circles. Enclosure diagrams are based on a geometric layout algorithm called circle packing. Each circle represents a part of the system. The more complex a module, as measured by lines of code, the larger the circle. And the more effort we spend on a module, as measured by its number of revisions, the more intense its color.
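To make that mapping concrete, here is a minimal sketch of how per-module data could be nested into the kind of tree a circle-packing layout consumes. It is an illustration under stated assumptions, not the tooling used to produce the figure: the function name `build_tree`, the keys `size` and `weight`, and the sample rows are all hypothetical. Each leaf carries its lines of code (driving circle size) and its revisions normalized against the most-changed module (driving color intensity).

```python
def build_tree(hotspots):
    """Nest (module, revisions, code) rows into a tree keyed by path segments.

    Assumed leaf layout (hypothetical, for illustration only):
      size   -> lines of code, intended to drive the circle size
      weight -> revisions / max revisions, intended to drive color intensity
    """
    root = {"name": "root", "children": []}
    if not hotspots:
        return root
    max_revs = max(revs for _, revs, _ in hotspots)
    for module, revs, code in hotspots:
        node = root
        parts = module.split("/")
        # Walk (or create) one enclosing node per directory segment.
        for part in parts[:-1]:
            child = next(
                (c for c in node["children"]
                 if c["name"] == part and "children" in c),
                None,
            )
            if child is None:
                child = {"name": part, "children": []}
                node["children"].append(child)
            node = child
        # The file itself becomes a leaf circle.
        node["children"].append(
            {"name": parts[-1], "size": code, "weight": revs / max_revs}
        )
    return root


# Made-up rows in the order (module, revisions, lines of code).
rows = [("core/a.py", 10, 120), ("core/util/b.py", 5, 40), ("build.cfg", 2, 8)]
tree = build_tree(rows)
```

Directories become enclosing circles and files become leaves, which is what produces the nested look. How `size` translates into a radius is left to the layout algorithm.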

Even if you don’t know anything about Hibernate, the visualization gives you an entry point into understanding the system. In the preceding figure, you can see both the good and the fragile parts of the codebase. And that’s even before you actually look at the code. Can you think of a better starting point as you enter a large-scale project? Let’s see how you collect and interpret all that information.

Mining Hibernate

The steps used to mine Hibernate are identical to the ones you learned earlier in Chapter 3, Creating an Offender Profile.

This time, we use the size of each module, measured in lines of code, as a proxy for complexity. We determine the code size with cloc:

prompt> cloc ./ --unix --by-file --csv --quiet --report-file=hib_lines.csv

The change frequencies of the modules are used to represent effort. These are calculated with Code Maat:

prompt> maat -l hib_evo.log -c git -a revisions > hib_freqs.csv

Combining the two views gives you the now-familiar overlap between complexity and effort—the hotspots:

	prompt> python scripts/merge_comp_freqs.py hib_freqs.csv hib_lines.csv
	module,revisions,code
	build.gradle,79,402
	hibernate-core/.../persister/entity/AbstractEntityPersister.java,44,3983
	hibernate-core/.../cfg/Configuration.java,40,2673
	hibernate-core/.../internal/SessionImpl.java,39,2097
	hibernate-core/.../internal/SessionFactoryImpl.java,34,1384
	...

The results we just got form the basis of the visualization in the preceding figure; it’s just another view of the same data.
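If you’re curious what such a merge involves, the following is a rough sketch of joining the two data sets on module name. It is not the source of `merge_comp_freqs.py`. The function name `merge_hotspots` is hypothetical, the sample data is made up, and the column names (`entity` and `n-revs` for the change frequencies, `filename` and `code` for the line counts) are assumptions about the two CSV files that you should check against your own output.

```python
import csv
import io


def merge_hotspots(freqs_csv, lines_csv):
    """Join change frequencies with line counts, most-revised modules first.

    Assumed columns (verify against your own files):
      freqs_csv: entity, n-revs
      lines_csv: filename, code (other columns are ignored)
    """
    sizes = {}
    for row in csv.DictReader(io.StringIO(lines_csv)):
        name = row["filename"]
        # Assumption: size paths may carry a leading "./" that the
        # version-control paths lack, so normalize before joining.
        if name.startswith("./"):
            name = name[2:]
        sizes[name] = int(row["code"])

    merged = []
    for row in csv.DictReader(io.StringIO(freqs_csv)):
        module = row["entity"]
        # Modules without a size (for example, files deleted since)
        # are skipped in this sketch.
        if module in sizes:
            merged.append((module, int(row["n-revs"]), sizes[module]))

    merged.sort(key=lambda r: r[1], reverse=True)
    return merged


# Made-up input for illustration only.
freqs = "entity,n-revs\nsrc/b.py,3\nsrc/a.py,10\nsrc/gone.py,7\n"
lines = (
    "language,filename,blank,comment,code\n"
    "Python,./src/a.py,5,2,120\n"
    "Python,./src/b.py,1,0,40\n"
)
for module, revisions, code in merge_hotspots(freqs, lines):
    print(f"{module},{revisions},{code}")
```

Sorting by revisions puts the modules with the most effort at the top, which matches the ordering in the listing above.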
