                    JOURNAL FOR WEDNESDAY 10TH APRIL, 2019
______________________________________________________________________________

SUBJECT: Idly pondering what if…
   DATE: Wed 10 Apr 15:47:45 BST 2019

Yesterday I saw a commit on the golang-checkins mailing list that made me
pause and ponder for a while. It concerned a test used in an ‘if’ statement.

The ‘if’ statement is one that programmers use regularly when coding. It is
used to control the conditional execution of code. A simple example is:


  if x > 0 {
    foo()
  }


In this example if x is greater than 0 then we call foo. There are other forms
such as ‘if…else’ and ‘if…else if…else’. However, what I want to focus on is
the testing expression.

The commit that made me pause and ponder was titled:


  strings: use Go style character range comparison in ToUpper/ToLower


This was the original code:


  if c >= 'a' && c <= 'z' {
    c -= 'a' - 'A'
  }


Which was changed to:


  if 'a' <= c && c <= 'z' {
    c -= 'a' - 'A'
  }


The first way of writing the ‘if’ statement is how I would normally write it.
It’s how you would say it: if c is greater than or equal to 'a' and c is less
than or equal to 'z', do something. This seems very natural and is easy to
comprehend.

The second form needs some additional thinking before you can be sure it is
correct. However, I can see the merit of having the “c” variable in the middle
and the comparison values 'a' and 'z' on the outside. Visually it’s saying
“c” is between 'a' and 'z'.
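
As a quick sanity check, here is a minimal sketch showing that the two ways of
writing the test accept exactly the same characters. The function names
(lowerNatural, lowerVisual, formsAgree) are made up for this illustration and
do not come from the Go commit.

```go
package main

import "fmt"

// lowerNatural uses the "spoken" ordering: the variable comes first in
// both comparisons. (Hypothetical name, for illustration only.)
func lowerNatural(c byte) bool {
	return c >= 'a' && c <= 'z'
}

// lowerVisual uses the "number line" ordering: lo <= c && c <= hi, so the
// variable sits between the two limits. (Hypothetical name.)
func lowerVisual(c byte) bool {
	return 'a' <= c && c <= 'z'
}

// formsAgree reports whether both forms give the same answer for every
// possible byte value.
func formsAgree() bool {
	for i := 0; i < 256; i++ {
		c := byte(i)
		if lowerNatural(c) != lowerVisual(c) {
			return false
		}
	}
	return true
}

func main() {
	fmt.Println(formsAgree()) // prints "true"
}
```

So the change is purely one of style: only the order of the operands in the
first comparison differs, with the operator flipped to match.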

What would the if statement be to test for “c” being outside the range 'a' to
'z'? Normally I would write:


  if c < 'a' || c > 'z' {

  }


However, to get the visual style we would write:


  if c < 'a' || 'z' < c {

  }


Here we have “c” on the outside and the comparison values in the middle.
Visually we are saying “c” is outside of the range 'a' to 'z'. This one is
harder to comprehend. What is “ 'z' < c ” actually testing for? What are the
limits being tested here?
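
One way to answer that is to note that “ 'z' < c ” is just “ c > 'z' ” with
the operands swapped, so the upper limit is still 'z'. The sketch below, again
using made-up names (inRange, outsideNatural, outsideVisual,
outsideFormsAgree), checks that both outside-the-range forms agree with each
other and are the exact negation of the in-range test.

```go
package main

import "fmt"

// inRange is the "visual" in-range test: lo <= c && c <= hi.
// (All names in this sketch are hypothetical.)
func inRange(c byte) bool {
	return 'a' <= c && c <= 'z'
}

// outsideNatural keeps the variable first in both comparisons.
func outsideNatural(c byte) bool {
	return c < 'a' || c > 'z'
}

// outsideVisual puts the variable on the outside and the limits in the
// middle: c < lo || hi < c.
func outsideVisual(c byte) bool {
	return c < 'a' || 'z' < c
}

// outsideFormsAgree reports whether, for every byte value, the two
// outside-the-range forms match each other and equal !inRange(c).
func outsideFormsAgree() bool {
	for i := 0; i < 256; i++ {
		c := byte(i)
		if outsideNatural(c) != outsideVisual(c) {
			return false
		}
		if outsideVisual(c) != !inRange(c) {
			return false
		}
	}
	return true
}

func main() {
	fmt.Println(outsideFormsAgree()) // prints "true"
	// 'z' itself is inside the range; '{' is the next character after 'z'.
	fmt.Println(outsideVisual('z'), outsideVisual('{')) // prints "false true"
}
```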

I did some digging in the Go source tree and found a lot of tests using the
form ‘lo <= x && x <= hi’ and only a few of the form ‘x < lo || hi < x’.

I did some searching around and found conventions for comments, indenting,
line lengths and naming — but nothing on ‘if’ statements. So I wonder where
this came from? The only vague reference I found was on Wikipedia[1]:


  Early toolsmiths writing in C under Unix began developing idioms at a rapid
  rate to classify characters into different types. For example, in the ASCII
  character set, the following test identifies a letter:

  if (('A' <= c && c <= 'Z') || ('a' <= c && c <= 'z'))


However, no citation is provided for this information. So, based on the
above, I went looking for some old Unix source code. I downloaded the source
for Unix 6th edition and started looking for myself.

I found both the ‘x >= lo && x <= hi’ and ‘lo <= x && x <= hi’ styles in the
Unix source code. I also found the ‘x < lo || x > hi’ style in use, but not
the ‘x < lo || hi < x’ style. As well as in ‘if’ statements, the styles were
also used in ‘while’ loops and ‘switch’ statements.

So it looks like this style of writing an ‘if’ statement, when testing for a
range, has a very long, old history associated with it. I’m still not sure if
I like the ‘lo <= x && x <= hi’ and ‘x < lo || hi < x’ styles though :P

--
Diddymus

  [1] Wikipedia, C character classification:
      https://en.wikipedia.org/wiki/C_character_classification

