Table of Contents for
Your Code as a Crime Scene


Your Code as a Crime Scene by Adam Tornhill. Published by Pragmatic Bookshelf, 2015.
  1. Title Page
  2. Your Code as a Crime Scene
  3. Your Code as a Crime Scene
  4. For the Best Reading Experience...
  5. Table of Contents
  6. Early praise for Your Code as a Crime Scene
  7. Foreword by Michael Feathers
  8. Acknowledgments
  9. Chapter 1: Welcome!
  10. About This Book
  11. Optimize for Understanding
  12. How to Read This Book
  13. Toward a New Approach
  14. Get Your Investigative Tools
  15. Part 1: Evolving Software
  16. Chapter 2: Code as a Crime Scene
  17. Meet the Problems of Scale
  18. Get a Crash Course in Offender Profiling
  19. Profiling the Ripper
  20. Apply Geographical Offender Profiling to Code
  21. Learn from the Spatial Movement of Programmers
  22. Find Your Own Hotspots
  23. Chapter 3: Creating an Offender Profile
  24. Mining Evolutionary Data
  25. Automated Mining with Code Maat
  26. Add the Complexity Dimension
  27. Merge Complexity and Effort
  28. Limitations of the Hotspot Criteria
  29. Use Hotspots as a Guide
  30. Dig Deeper
  31. Chapter 4: Analyze Hotspots in Large-Scale Systems
  32. Analyze a Large Codebase
  33. Visualize Hotspots
  34. Explore the Visualization
  35. Study the Distribution of Hotspots
  36. Differentiate Between True Problems and False Positives
  37. Chapter 5: Judge Hotspots with the Power of Names
  38. Know the Cognitive Advantages of Good Names
  39. Investigate a Hotspot by Its Name
  40. Understand the Limitations of Heuristics
  41. Chapter 6: Calculate Complexity Trends from Your Code’s Shape
  42. Complexity by the Visual Shape of Programs
  43. Learn About the Negative Space in Code
  44. Analyze Complexity Trends in Hotspots
  45. Evaluate the Growth Patterns
  46. From Individual Hotspots to Architectures
  47. Part 2: Dissect Your Architecture
  48. Chapter 7: Treat Your Code As a Cooperative Witness
  49. Know How Your Brain Deceives You
  50. Learn the Modus Operandi of a Code Change
  51. Use Temporal Coupling to Reduce Bias
  52. Prepare to Analyze Temporal Coupling
  53. Chapter 8: Detect Architectural Decay
  54. Support Your Redesigns with Data
  55. Analyze Temporal Coupling
  56. Catch Architectural Decay
  57. React to Structural Trends
  58. Scale to System Architectures
  59. Chapter 9: Build a Safety Net for Your Architecture
  60. Know What’s in an Architecture
  61. Analyze the Evolution on a System Level
  62. Differentiate Between the Level of Tests
  63. Create a Safety Net for Your Automated Tests
  64. Know the Costs of Automation Gone Wrong
  65. Chapter 10: Use Beauty as a Guiding Principle
  66. Learn Why Attractiveness Matters
  67. Write Beautiful Code
  68. Avoid Surprises in Your Architecture
  69. Analyze Layered Architectures
  70. Find Surprising Change Patterns
  71. Expand Your Analyses
  72. Part 3: Master the Social Aspects of Code
  73. Chapter 11: Norms, Groups, and False Serial Killers
  74. Learn Why the Right People Don’t Speak Up
  75. Understand Pluralistic Ignorance
  76. Witness Groupthink in Action
  77. Discover Your Team’s Modus Operandi
  78. Mine Organizational Metrics from Code
  79. Chapter 12: Discover Organizational Metrics in Your Codebase
  80. Let’s Work in the Communication Business
  81. Find the Social Problems of Scale
  82. Measure Temporal Coupling over Organizational Boundaries
  83. Evaluate Communication Costs
  84. Take It Step by Step
  85. Chapter 13: Build a Knowledge Map of Your System
  86. Know Your Knowledge Distribution
  87. Grow Your Mental Maps
  88. Investigate Knowledge in the Scala Repository
  89. Visualize Knowledge Loss
  90. Get More Details with Code Churn
  91. Chapter 14: Dive Deeper with Code Churn
  92. Cure the Disease, Not the Symptoms
  93. Discover Your Process Loss from Code
  94. Investigate the Disposal Sites of Killers and Code
  95. Predict Defects
  96. Time to Move On
  97. Chapter 15: Toward the Future
  98. Let Your Questions Guide Your Analysis
  99. Take Other Approaches
  100. Let’s Look into the Future
  101. Write to Evolve
  102. Appendix 1: Refactoring Hotspots
  103. Refactor Guided by Names
  104. Bibliography
  105. You May Be Interested In…

Analyze Complexity Trends in Hotspots

In a healthy codebase, you can add new features with successively less effort. Unfortunately, the reverse is often true: new features add complexity to an already tricky design. Eventually, the system breaks down, and development slows to a crawl.

This phenomenon was identified and formalized by Manny Lehman[22] in a set of laws on software evolution. In his law of increasing complexity, Lehman states that “as an evolving program is continually changed, its complexity, reflecting deteriorating structure, increases unless work is done to maintain or reduce it.” (See On Understanding Laws, Evolution, and Conservation in the Large-Program Life Cycle [Leh80].)

You already know about hotspot analyses to identify these “deteriorating structures” so that you can react and reduce complexity. But how do you know whether you’re improving the code over time or just contributing to the grand decline? Let’s see how to uncover complexity trends in our programs.

Use Indentation to Analyze Complexity Trends

An indentation analysis is fast and simple. That means it scales to a range of revisions without eating up your precious time. Of course, you may well wonder if different indentation styles could affect the results. Let’s look into that.

This chapter has its theoretical foundations in the study Reading Beside the Lines: Indentation as a Proxy for Complexity Metric, presented at the 16th IEEE International Conference on Program Comprehension (ICPC 2008) [HGH08]. That research evaluated indentation-based complexity in 278 projects and found that indentation is relatively uniform and regular. The study also suggests that deviating indentations don’t affect the results much.

The explanation is also the reason the technique works in the first place: indentation improves readability. It aligns closely with underlying coding constructs. We don’t just indent random chunks of code (unless we’re competing in the International Obfuscated C Code Contest).[23]

Similarly, it doesn’t really matter if we indent two or four spaces. However, a change in indentation style midway through the analysis could disturb your results. For example, running an auto-indent program on your codebase would wreck its history and show an incorrect complexity trend. If you are in that situation, you can’t compare revisions made before and after the change in indentation practices.
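To make the mechanics concrete, here’s a minimal sketch of an indentation-based measure. This is my own illustration, not the book’s actual tooling; note that the indent width is just a parameter, echoing the point that the exact style matters less than its consistency. It reports the same statistics we’ll meet later in the analysis output: number of lines, total indentation, mean, and standard deviation.

```python
import statistics

def indentation_complexity(source, spaces_per_level=4):
    """Estimate complexity from indentation: one unit per nesting level.

    A tab counts as one level; spaces_per_level (an assumption here)
    should match the codebase's style. Blank lines are ignored.
    """
    depths = []
    for line in source.splitlines():
        if not line.strip():          # skip blank lines
            continue
        indent = line[:len(line) - len(line.lstrip())]
        depths.append(indent.count('\t') + indent.count(' ') // spaces_per_level)
    if not depths:
        return {'n': 0, 'total': 0, 'mean': 0.0, 'sd': 0.0}
    return {
        'n': len(depths),                               # non-blank lines
        'total': sum(depths),                           # total indentation
        'mean': round(statistics.mean(depths), 2),      # average per line
        'sd': round(statistics.pstdev(depths), 2),      # spread of complexity
    }
```

Running it on a three-line snippet with nesting depths 0, 1, and 2 yields n=3, total=3, mean=1.0, sd=0.82.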

Even if individual indentation styles don’t affect the analysis results as much as we might think, it’s still a good idea to keep a uniform style: consistency keeps the measurements comparable over time. With that sorted out, let’s move on to an actual analysis.

Focus on a Range of Revisions

You’ve already seen how to analyze a single revision. Now we want to:

  1. Take a range of revisions for a specific module.

  2. Calculate the indentation complexity of the module as it occurred in each revision.

  3. Output the results revision by revision for further analysis.

With version-control systems, we can roll back to historical versions of our code and run complexity analyses on them. For example, in git we look at historical versions with the show command.
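The interaction with git can be sketched as follows. This is a minimal illustration assuming a local clone; the function names are mine, not taken from the book’s script.

```python
import subprocess

def file_at_revision(repo_path, rev, file_path):
    """Return the file's contents as of a given revision, via `git show rev:path`."""
    return subprocess.run(
        ['git', '-C', repo_path, 'show', f'{rev}:{file_path}'],
        capture_output=True, text=True, check=True).stdout

def revisions_for(repo_path, start, end, file_path):
    """Commit hashes in start..end that touched file_path, oldest first."""
    out = subprocess.run(
        ['git', '-C', repo_path, 'rev-list', '--reverse',
         f'{start}..{end}', '--', file_path],
        capture_output=True, text=True, check=True).stdout
    return out.split()
```

A trend script then loops over `revisions_for(...)`, feeds each snapshot from `file_at_revision` to the complexity measurement, and prints one CSV row per revision.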

The recipe for a trend analysis is pretty straightforward, although it requires some interaction with the version-control system. Since this book isn’t about git or version-control systems in general, we’re going to skip over the actual implementation details and just use the script already in your scripts directory. Don’t worry: I’ll walk you through the main steps so that you can perform your own analysis on your code.

Discover the Trend

In your cloned Hibernate Git repository, type the following into the command prompt (and remember to reference your own scripts path) to run git_complexity_trend.py:

 
prompt> python scripts/git_complexity_trend.py \
  --start ccc087b --end 46c962e \
  --file hibernate-core/src/main/java/org/hibernate/cfg/Configuration.java
rev,n,total,mean,sd
e75b8a7,3080,7610,2.47,1.76
23a6280,3092,7649,2.47,1.76
8991100,3100,7658,2.47,1.76
8373871,3101,7658,2.47,1.76
...

This looks cryptic at first. What just happened is that we specified a range of revisions with the --start and --end flags. Their arguments delimit our analysis period, as we see in the following image.

images/Chp7_RevisionRangeLog.png

After that, you gave the name of the --file to analyze. In this case, we focus on our suspect, Configuration.java.

The analysis generates CSV output similar to the file you got during the earlier single-module analysis. The difference is that here we get the complexity statistics for each historical revision of the code. The first column contains the git commit hash that identifies each revision. Let’s visualize the result to discover trends.

Visualize the Complexity Trend

Spreadsheets are excellent for visualizing CSV files. Just save the CSV output into a file and import it into Excel, OpenOffice, or a similar application of your choice.
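If you prefer scripting over a spreadsheet, the CSV output is trivial to parse with the standard library. Here’s a sketch (my own helper, not part of the book’s scripts) that splits the output into per-column lists, ready to feed to whatever charting tool you like:

```python
import csv
from io import StringIO

def parse_trend(csv_text):
    """Split the trend CSV (rev,n,total,mean,sd) into per-column lists."""
    rows = list(csv.DictReader(StringIO(csv_text)))
    return {
        'rev':   [r['rev'] for r in rows],          # commit hashes
        'n':     [int(r['n']) for r in rows],       # lines per revision
        'total': [int(r['total']) for r in rows],   # total complexity
        'mean':  [float(r['mean']) for r in rows],  # average per line
        'sd':    [float(r['sd']) for r in rows],    # spread per line
    }
```

Plot `total` against the revision index to see the overall growth curve, and `mean` and `sd` alongside it to judge how that growth is distributed.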

Let’s look at the total complexity growth first. That would be the total column.

images/Chp7_ConfigTotalTrend.png

As you can see in the image, Configuration.java accumulated complexity over time.

This growth can occur in two basic ways:

  1. New code is added to the module.

  2. Existing code is replaced by more complex code.

images/Chp7_ConfigSdTrend.png

Case 2 is particularly worrisome—that’s the “deteriorating structure” Lehman’s law warned us about. We calculated the standard deviation (in the sd column) to differentiate between these two cases. Let’s see how it looks.

The standard deviation decreases. That means the lines are becoming more alike in terms of complexity, which is probably a good thing. If you look at the mean, you see that it, too, decreases. Let’s see what that means for your programs.
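To build intuition for why the standard deviation separates the two growth cases, here’s a small self-contained illustration. The snippets are hypothetical, and the depth function is my own sketch using the population standard deviation:

```python
import statistics

def depths(source, spaces_per_level=4):
    """Indentation depth of each non-blank line, in levels (spaces only)."""
    return [(len(l) - len(l.lstrip(' '))) // spaces_per_level
            for l in source.splitlines() if l.strip()]

# Case 1: growth by adding code at the same nesting level.
flat = "a()\nb()\nc()\nd()\n"
# Case 2: growth by replacing code with deeper nesting.
nested = "a()\nif x:\n    if y:\n        d()\n"

print(statistics.pstdev(depths(flat)))    # 0.0: every line equally complex
print(statistics.pstdev(depths(nested)))  # higher: complexity concentrates in a few lines
```

Both snippets have four lines, but only the nested one drives the standard deviation up, which is the signature of Lehman’s “deteriorating structure.”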