Late Binding: cleancode

Behavior is the mirror in which everyone shows their image. - Johann Wolfgang von Goethe

RSpec is a Behaviour-Driven Development tool for Ruby programmers. Created shortly after JBehave in 2005, its raison d'être has been writing human-readable specifications in the BDD tradition - accessible to not just developers but also business users, analysts, and testers. Over the years, though, that idea of accessibility to an audience beyond just developers seems to have been lost on many. I suspect that's because many view RSpec as little more than a testing tool with a DSL rather than the BDD tool that it's meant to be.

This narrower developer-focused view leads to dubious guidelines like the ones documented on Better Specs. One of the very first ones there is about how to describe your methods.

How to describe your methods

The only guideline needed on that topic is: Don't do it. Describe behaviors, not methods.

Instead, you get guidelines like: Be clear about what method you are describing. For instance, use the Ruby documentation convention of . (or ::) when referring to a class method's name and # when referring to an instance method's name.

Below is an example of a supposedly good practice according to that guideline.


# (supposedly) good
describe Lemon
  describe "#yellow?" do
    context "when ripe" do
      it "returns true"
    end
  end
end

# vs (supposedly) bad
describe Lemon
  it "is yellow if ripe"
end

When trying to reason about good practice, I find it always instructive to go back to the origins of an idea. Here too, Dan North's article introducing BDD offers great insight. Some quotes:

[...] when they wrote the method name in the language of the business domain, the generated documents made sense to business users, analysts, and testers.

Notice how, in contrast to the above example, the emphasis is on the language of the business domain (e.g. Lemon is yellow if ripe), rather than the language of code (e.g. Lemon#yellow?, when ripe, returns true) that would make little sense to someone other than a developer.

This sentence template – The class should do something – means you can only define a test for the current class.

Notice how the reference is to a class, not a method. The class, not its methods, should do something. Why should anyone care about whether Lemon#yellow? returns true rather than whether Lemon is yellow?

If you find yourself writing a test whose name doesn’t fit this template, it suggests the behaviour may belong elsewhere.

Consider this Active Record test (assuming Lemon is an Active Record model).


describe Lemon
  describe ".ripe" do
    it "returns only ripe lemons"
  end
  describe ".rotten" do
    it "returns only rotten lemons"
  end
end

This test just can't be written in the form 'Lemon should do something' following the suggested template. Because after all, we aren't really describing a Lemon here. What we're describing is the equivalent of a Lemon Repository (or Lemon Basket, perhaps, in the UbiquitousLanguage of the domain) or Data Access Object or whatever floats your goat when it comes to persistence. Active Record sadly conflates a model's class and its persistence mechanism, even though they're two clearly distinct concepts with distinct concerns. If the two concepts weren't conflated, you'd write tests that don't need to describe methods anymore but can describe the behavior/responsibilities of a class, as below:


describe LemonBasket
  it "should provide only ripe lemons"
  it "should provide only rotten lemons"
end

Another way to think about this is to remember that a class is an object, and class methods are, therefore, really instance methods of the (singleton) class object. So, if you include descriptions of .class_methods and #instance_methods in the description of a class, you're really mixing the descriptions of two different classes (pun intended) of objects. Consider this sample from RSpec's own documentation:


RSpec.describe Widget, :type => :model do
  it "has none to begin with" do
    expect(Widget.count).to eq 0
  end

  it "has one after adding one" do
    Widget.create
    expect(Widget.count).to eq 1
  end
end

That description just doesn't make any sense as soon as you try to read it as sentences that'd appear in RSpec output formatted as documentation.

Widget
 has none to begin with
 has one after adding one

That doesn't make sense because what you're really describing is conceptually something like a WidgetWarehouse.

WidgetWarehouse
 has no widgets to begin with
 has one widget after adding (storing?) one

With Active Record, the Widget class above doubles up as a singleton WidgetWarehouse (WidgetDAO/WidgetRepository/whatever). But instead of letting that incidental implementation detail drive the structure of your tests, you could recognize that they're essentially two distinct concepts/concerns, and describe them in independent tests rather than first mixing them up in the same class test and then trying to distinguish them through syntactic conventions.

Moreover, method descriptions aren't conducive to true BDD or TDD style in general. The behaviors of an object are known before the tests, but the structure (method names, signatures, etc.) quite often emerges. For instance, the behavior WidgetWarehouse has one widget after adding one is known at the time of writing the test, but the method that'd provide that behavior isn't (.add or .store? - I shouldn't have to make up my mind before writing the first test.).

Using cutesy syntactic conventions to describe classes based on their structure allows, even encourages, you to sidestep such considerations and suppresses the feedback generation mechanism that's at the heart of TDD and BDD. Focusing on and describing behavior forces you to think more deeply about the behavior and where it belongs.

There are burdens which are bad and blameworthy, and these it is our duty at once to cast away. - James Hamilton

A team-mate recently sought help over email while debugging a piece of JavaScript code that he was trying to get working. He had a working reference implementation in Python, but the JS equivalent didn't seem to be working. Below are the Python and JS snippets he shared, asking if anybody could spot the difference.

Python code:


self.Loan_amount/((1.0-(1.0/(1+(self.Interest_rate/(self.Payment_peryear*100)))**(self.Payment_peryear*self.Payment_period)))/(self.Interest_rate/(self.Payment_peryear*100)))

JS code:

var paymentPerYear = totalPayments($scope.loan.interestRepaymentTerm);


parseInt($scope.loan.loanAmount) / ((1.0 - (1.0 / Math.pow(1 + (parseFloat($scope.loan.interest) / (paymentPerYear * 100)), (paymentPerYear * parseInt($scope.loan.tenor))))) / (parseFloat($scope.loan.interest) / (paymentPerYear * 100)));

It took just a glance for me to realize that spotting the difference in the pieces of code as they were written was gonna take me dangerously close to the limits on my capacity for processing information, what with upto 7 - 2 = 5 levels of nesting in the code. So, instead of trying to rise up to the challenge, I decided to begin by lowering the challenge for myself (a strategy I've found generally useful in software development).

The code and the situation here, even though almost trivial, present an instructive example of how programmers often fail to consider - to their own and their teams' detriment - the effects of cognitive load when both reading and writing code. At such a small scale, such failures may seem like minor condonable infractions, but they add up superlinearly to the difficulty of understanding, and resultantly to that of debugging and modifying code. So, let's look at some of the problems.

First, for the JS code:

The last line is the meat of the solution, and it's supposed to be a simple mathematical formula to calculate equated periodic installments. However, the formula, which is the signal here, seems to be somewhat drowning in the noise of incidental and irrelevant details like $scope, parseInt and parseFloat.
The names are a bit misleading. paymentPerYear seems to indicate it stands for an amount, whereas it really is meant to be the number of paymentsPerYear (note the plural). interest should actually be interestRate.

Now, for the Python code:

The only noise is the self references. While I really appreciate the beauty and the power of the self in Python, I feel it often hurts readability by just the slightest amount.
Thelackofanyspacinginthecodemakesithardertocomprehend.
The inconsistent naming forces you to do a double-take every once in a while. You see the first variable, Loan_amount, and tentatively assume the naming convention to be 'snake case starting with an uppercase first letter'. You see the second variable, Interest_rate, and it kinda confirms your assumption. And then you see... wait a minute, Payment_peryear. Double-take... Payment_per_year. Ah, that wasn't too bad. Except that now you know you can't just cruise along at a comfortable consistent speed on this road - you gotta be prepared to backtrack every once in a while, or simply slow down throughout.

Now, coming to spotting the difference. When diffing as well as debugging, I find it useful to strip out all the noise so you can focus on the signal and analyze it. When diffing, the commonalities are the noise, while the differences are the signal. In order to strip out the commonalities though, you first need to play them up and make them obvious. So, here's what I did to make the commonalities (or is it the differences?) stand out:

insert some spacing for readability
get rid of the following extraneous things: self, $scope.loan., parseInt and parseFloat
factor out some duplication into a variable: r = interestRate / (paymentPerYear * 100)
factor out a computation into a variable: n = paymentPerYear * tenor
rename loanAmount to P

And here's what I got:

P / ((1.0 - (1.0 / ((1 + r) ** n))) / r) # Python

P / ((1.0 - (1.0 / Math.pow(1 + r, n))) / r) // JS

Or better yet (just the Python version):

(P * r) / (1.0 - (1.0 / ((1 + r) ** n))) # Python

In both cases, I see a somewhat convoluted form of the formula for periodic amortization payment. I can't spot anything but syntactical differences between the two versions, and the only clue about what could be wrong with the JS code I have is a few guesses. But that's beside the point.

The point is, I had to expend effort I think was undue to transform both the versions of the code into a form that wouldn't strain my brain before I could say anything about them. The code would've turned out much better if the original author(s) had paid attention to minimizing the cognitive load for themselves and future readers and maintainers. With upto six levels of nesting (requiring the building up and subsequent winding down of a stack that deep in your mind) and the problems listed above, and having seen what the mathematical formula actually is, it would almost seem as if the original version was set up to obfuscate the solution rather than reveal it. This last version, with two-thirds the levels of nesting, is a direct and concise transliteration of the mathematical formula into programming language syntax, making it so transparent and easy to test that it highly minimizes, even nearly obviates, the possibility of bugs and the need to debug. As a bonus, one wouldn't need to seek help to spot the differences between versions in two different programming languages.

In his excellent book "Specification by Example", Gojko Adzic says this about examples used to illustrate specifications:

With traditional specifications, examples appear and disappear several times in the software development process. Everyone [e.g.Business analysts, developers and testers] invents their own examples, but there’s nothing to ensure that these examples are even consistent, let alone complete. In software development, this is why the end result is often different from what was expected at the beginning. To avoid this, we have to prevent misinterpretation between different roles and maintain one source of truth.

I think this description very nearly fits what most often happens with program source code as well. Every reader/maintainer ends up creating a less noisy representation of the code in their head while trying to understand it, and most often, in the absence of MercilessRefactoring, there it disappears into a black hole, only to be recreated the next time the code is read. So, the burden of the cognitive load is created once when writing the code, but is borne by everyone who has to read the code every time they do it... "create once carry forever", anyone?

We commonly subject our code to load tests to understand how much load they can handle. How about if our code could undergo some form of cognitive load tests so we could gauge how much cognitive load it would impose? Now, I wish "cognitive load testing" were a thing.

Late Binding

HighlightJS

Wednesday, August 29, 2018

Emphasize behavior over structure

How to describe your methods

Monday, February 8, 2016

Cognitive load testing, anyone?