Your Code as a Crime Scene by Adam Tornhill. Published by Pragmatic Bookshelf, 2015.

Find the Social Problems of Scale

In the first parts of this book, we discussed large codebases and how we fail to get a holistic view of them. We just can’t keep it all in a single brain. We recognize when we suffer from quality problems or when the work takes longer than we’d expect it to, but we don’t know why.

The reasons go beyond technical difficulties and include an organizational component as well. On many projects, the organizational aspects alone determine success or failure. Let’s understand them better.

Know the Difference Between Open-Source and Proprietary Software

So far we’ve used real-world examples for all our analyses. The problems we’ve uncovered are all genuine. But when it comes to the people side, it gets harder to rely on open-source examples because the projects don’t have a traditional corporate organization.

Open-source projects are self-selected communities, which creates different motivational forces for the developers. In addition, open-source projects tend to have relatively flat and simple communication models. As a result, research on the subject has found that Brooks’s law doesn’t hold up as well in open source: the more developers involved in an open-source project, the more likely that the project will succeed (source: Brooks’ versus Linus’ law: an empirical test of open source projects [SEKH09]).

However, there are other aspects to consider. In a study on Linux, researchers found that “many developers changing code may have a detrimental effect on the system’s security” (source: Secure open source collaboration: an empirical study of Linus’ law [MW09]). More specifically, modules changed by more than nine developers are sixteen times more likely to contain security flaws. The result means that open source cannot evade human nature; we pay a price for parallel development in that setting, too.

Anyway, we’ll need to pretend a little in the following case studies. We need to pretend that the open-source projects are developed by a traditional organization. The intent is to show you how the analyses work so that you can use them on your own systems. Proprietary or not, the analyses are the same, but the conclusions may vary. With that in mind, let’s get started!

Understand How Hotspots Attract Multiple Authors

Adding more people to a project isn’t necessarily bad as long as we can divide our work in a meaningful way. The problems start when our architecture fails to sustain all developers.

We touched on the problem as we investigated hotspots. Hotspots frequently arise from code that accumulates responsibilities. That means programmers working on independent features are forced into the same part of the code. (Hotspots are the traffic jams of the software world.)

[Figure: images/Chp12_MultipleAuthors.png]

When multiple programmers make changes in parallel to the same piece of code, things often go wrong. We risk conflicting changes and inconsistencies, and we fail to build mental models of the volatile code.

If we want to work effectively on a larger scale, we need to ensure a certain isolation. Here’s how you find out whether your code provides it.

Analyze Your Code for Multiple Authors

As you can see in the following figure, each commit contains information about the programmer who made the change. Just as we calculated modification frequencies to determine hotspots, let’s now calculate author frequencies of our modules.

[Figure: images/Chp12_AuthorsGitLog.png]

In this case study, we’ll move back to Hibernate because the project has many active contributors. You can reuse your hib_evo.log log file if you still have it. Otherwise, just create a new one, as we did back in Generate a Version-Control Log.

Use the authors analysis to discover the modules that are shared between multiple programmers:

 
prompt> maat -l hib_evo.log -c git -a authors
entity,n-authors,n-revs
../persister/entity/AbstractEntityPersister.java,14,44
libraries.gradle,11,28
../internal/SessionImpl.java,10,39
../loader/Loader.java,10,23
../mapping/Table.java,9,28
...

The results show all modules in Hibernate, sorted by their number of authors. The interesting information is in the n-authors column, which shows the number of programmers who have committed changes to the module.
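To make the idea behind the authors analysis concrete, here is a minimal Python sketch that counts distinct authors and revisions per file. It is not Code Maat's implementation. It assumes a log in which each commit starts with a header line shaped like `[hash] author date subject`, followed by tab-separated numstat lines (added, deleted, path). That layout and the function name `author_frequencies` are assumptions made for illustration.

```python
import re
from collections import defaultdict

# Assumed header shape: "[<hash>] <author> <YYYY-MM-DD> <subject>"
HEADER = re.compile(
    r"^\[(?P<rev>[0-9a-f]+)\]\s+(?P<author>.+?)\s+\d{4}-\d{2}-\d{2}\b"
)


def author_frequencies(log_text):
    """Return (entity, n_authors, n_revs) rows, most-shared files first."""
    authors = defaultdict(set)    # path -> distinct author names
    revisions = defaultdict(set)  # path -> distinct commit hashes
    rev = author = None

    for line in log_text.splitlines():
        header = HEADER.match(line)
        if header:
            # A new commit starts; remember who made it.
            rev, author = header["rev"], header["author"]
            continue
        # Assumed numstat shape: "<added>\t<deleted>\t<path>"
        parts = line.split("\t")
        if len(parts) == 3 and rev is not None:
            path = parts[2]
            authors[path].add(author)
            revisions[path].add(rev)

    rows = [(p, len(authors[p]), len(revisions[p])) for p in authors]
    # Sort by author count, then revision count, then name for stability.
    return sorted(rows, key=lambda r: (-r[1], -r[2], r[0]))
```

Counting names in a set means an author who commits twenty times to a file still counts once, which is the distinction between the n-authors and n-revs columns.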

As you see, the AbstractEntityPersister.java class is shared between fourteen different authors. That may be a problem. Let’s see why.

Learn the Value of Organizational Metrics

In an impressive research effort, a team of researchers investigated one of the largest pieces of software ever written: Windows Vista. The project was investigated for the links between product quality and organizational structure. (Read about the research in The Influence of Organizational Structure on Software Quality [NMB08].) The researchers found that organizational metrics outperform traditional measures, such as code complexity or code coverage. In fact, the organizational structure of the programmers who create the software is a better predictor of defects than any property of the code itself!

One of these super-metrics was the number of programmers who worked on each component. The more parallel work, the more defects in that code. This is similar to the analysis you just performed on Hibernate. Let’s dig deeper.
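As a first step in that direction, the following Python sketch filters the CSV produced by an authors analysis down to the modules with the most contributors. The function name `heavily_shared` and the cutoff are hypothetical. The value 9 is borrowed from the Linux study cited earlier purely as an example; it has not been validated for Hibernate or for your own codebase.

```python
import csv
import io


def heavily_shared(csv_text, max_authors=9):
    """Return (entity, n_authors) pairs with more than max_authors authors.

    Expects the header used in the output above: entity,n-authors,n-revs.
    The default cutoff is an illustrative assumption, not a proven limit.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    return [
        (row["entity"], int(row["n-authors"]))
        for row in reader
        if int(row["n-authors"]) > max_authors
    ]
```

Treat the result as a list of candidates to inspect, not as a verdict: a file with many authors may simply be a stable configuration file that everyone touches now and then.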