Differentiate Between the Level of Tests

In Code Maat, the partitioning between tests and application code isn't a stable architectural boundary: you identified a temporal coupling of 80 percent between them, which means they change together most of the time.

But our current analysis has a limitation: Code Maat has both unit tests and system-level tests, and so far we've grouped them all together. Let's see what happens when we separate the different types of tests.

If you look into the folder test/code_maat of your Code Maat repository, you’ll find four folders, shown in the following figure. Each of them contains a particular suite of test cases. Let’s analyze them by their individual boundaries.

images/Chp9_TestSuitesInMaat.png

Open a text editor, enter the following mapping, and save it as maat_src_test_boundaries.txt:

src/code_maat => Code
test/code_maat/analysis => Analysis Test
test/code_maat/dataset => Dataset Test
test/code_maat/end_to_end => End to end Tests
test/code_maat/parsers => Parsers Test

With the individual test groups defined, launch a coupling analysis:

prompt> maat -l maat_evo.log -c git -a coupling \
    -g maat_src_test_boundaries.txt
entity,coupled,degree,average-revs
Code,End to end Tests,42,50
Analysis Test,Code,42,49
Code,Parsers Test,41,49

Joe asks:
Code Coverage? Seriously, Is It Any Good?

Code coverage is a simple technique to gain feedback. However, I don’t bother with analyzing coverage until I’ve finished the initial version of a module. But then it gets interesting. The feedback you get is based on your understanding of the application code you just wrote. Perhaps there’s a function that isn’t covered or a branch in the logic that’s never taken?

To get the most out of this measure, try to analyze the cause behind low coverage. Sometimes it’s okay to leave it as is, but more often you’ll find that you’ve overlooked some aspect of the solution.

The specific coverage figure you get is secondary; while it's possible to write large programs with full coverage, it's not an end in itself, nor is it meaningful as a general recommendation. It's just a number. The value you get from code coverage comes from the implicit code review you perform when you study uncovered lines.

Finally—and this is a double-edged sword—code coverage can be used for gamification. I’ve seen teams and developers compete with code coverage high scores. To a certain degree this is good. I found it useful when introducing test automation and getting people on a team to pay attention to tests. Who knew automated tests could bring out the competitiveness in us?

These results give us a more detailed view:

  • Analysis Test and Parsers Test contain unit tests. These tests change together with the application code in about 40 percent of all commits. That’s a reasonable number. Together with the coverage results we saw earlier, it means we keep the tests alive, yet manage to avoid having them change too frequently. A higher coupling would be a warning sign that the tests depend on implementation details, and that we’re testing the code on the wrong level. Again, there are no right or wrong numbers; it all depends on your test strategy. For example, if you use test-driven development, you should expect a higher degree of coupling to your unit tests.

  • Dataset Test was excluded by Code Maat because its coupling result was below the default threshold of interest. (You can fine-tune these parameters; see Code Maat's documentation[26] and the example after this list.)

  • End to end Tests define system-level tests. These change together with the application code in about 40 percent of all commits, roughly as often as the unit tests. That's a suspiciously high number: we'd expect the higher-level tests to be more stable and have fewer reasons to change. Our data indicates otherwise. Is there a problem?
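
If you want weakly coupled parts such as Dataset Test to show up anyway, you can lower Code Maat's coupling threshold. Treat the following as a sketch rather than a recipe: --min-coupling is the option name I'd expect from Code Maat's command-line help, so verify it against the version you run:

prompt> maat -l maat_evo.log -c git -a coupling \
    -g maat_src_test_boundaries.txt \
    --min-coupling 20

Lowering the threshold trades precision for recall; the extra rows are leads to investigate, not findings.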

Encapsulate Test Data

It turns out there’s a reason that almost every second change to the application code affects the system-level tests, too. And, unfortunately for me as the programmer responsible, it’s not a good reason. So, let me point this out so you can avoid the same problem in your own codebase.

The system tests in Code Maat are based on detailed test data. Most of that data is collected from real-world systems. I did a lot of experimentation with different data formats during the early development of Code Maat. Each time I changed my mind about the data format, the system tests had to be modified, too.

So, why not choose the right test data from the beginning?

That would be great, wouldn’t it? Unfortunately, you’re not likely to get there. To a large degree, programming is problem-solving. And as the following figure illustrates, human problem-solving requires a certain degree of experimentation.

images/Chp6_GreenoKintschModel.png

The preceding figure presents a model from educational psychology. (See Understanding and solving word arithmetic problems [KG85].) We programmers face much the same challenges as educators: we have to communicate knowledge to the programmers who come to the code after we've left. That knowledge is built through an iterative process between two mental models:

  • The situation model contains everything you know about the problem, together with your existing knowledge and problem-solving strategies.

  • The system model is a precise specification of the solution—in this case, your code.

You start with an incomplete understanding of the problem. As you express that knowledge in code, you get feedback. That feedback grows your situation model, which in turn makes you improve the system model. It means that human problem-solving is inherently iterative. You learn by doing. It also means that you don’t know up front where your code ends up.

Remember those architectural principles we talked about earlier in this chapter? This is where they help. Different parts of software perform different tasks, but we need consistency to efficiently understand the code. Architecture specifies that consistency.

This model of problem solving lets us define what makes a good design: one where your two mental models are closely aligned. That kind of design is easier to understand because you can readily switch between the problem and the solution.

The take-away is that your test data has to be encapsulated just like any other implementation detail. Test data is knowledge, and we know that in a well-designed system, we don’t repeat ourselves.
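
To make that concrete, here is a minimal sketch of encapsulated test data. It's written in Python purely for illustration (Code Maat itself is Clojure), the names and sample paths are hypothetical, and it isn't the solution this book develops next; the point is only that a single module owns the raw format:

# test_data.py: the only place that knows how the raw VCS log looks.
# Hypothetical illustration; the exact log format is an assumption.

def commit(rev, date, author, *files):
    """Build one commit entry in the assumed log format."""
    header = "--{}--{}--{}".format(rev, date, author)
    rows = ["1\t0\t{}".format(name) for name in files]
    return "\n".join([header] + rows)

def single_module_log():
    """Canned input shared by the end-to-end tests."""
    return "\n\n".join([
        commit("a1b2c3", "2015-01-20", "apt", "src/code_maat/core.clj"),
        commit("d4e5f6", "2015-01-21", "apt",
               "src/code_maat/core.clj",
               "test/code_maat/end_to_end/scenario_test.clj"),
    ])

An end-to-end test now calls single_module_log() instead of embedding the raw text, so when the data format changes you update commit and leave the tests alone.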

Violating the Don’t Repeat Yourself (DRY) principle with respect to test data is a common source of failure in test-automation projects. The problem is sneaky because it manifests itself slowly over time. We can prevent this, though. Let’s see how.