Up to Main Index                           Up to Journal for October, 2012

                   JOURNAL FOR THURSDAY 4TH OCTOBER, 2012
______________________________________________________________________________

SUBJECT: Tabs + Variable width characters (in a monospaced font)
   DATE: Thu Oct  4 21:49:26 BST 2012

This post has taken three days so far. It was initially going to be a post for
the 2nd, then the 3rd and now it's the 4th of October :(

The text folding code has now been rewritten and is much cleaner and performs
better than the last iteration. All of the previous corner cases have been
covered and everything seems to be working correctly. As long as you don't use
tabs in the text yet... or some 'weird' Unicode characters ...

Tabs are currently not catered for because they are of 'unknown' width. By tab
I mean the typical tab character produced by the normal tab key: a horizontal
tab, which is escaped as \t and is 0x09 in both ASCII and Unicode. There are
also vertical tabs, which are not discussed here.

Why is the width of a tab 'unknown'? Well, the traditional or de facto size of
a tab is the width of eight characters in a monospaced font[1]. Some people,
mostly programmers who use tabs for indenting lines of code, say eight is too
large and it should be four. Others say it should be two, and others still say
you should only use spaces anyway. If any two of these different camps cross
paths there is usually a great gnashing of teeth.

So let's take a small function and see how much it matters. First a width of
eight, then four, lastly a width of two:


  func Colorize(in string) (out string) {
          if strings.Index(in, "]") != -1 {
                  for color, code := range colorTable {
                          in = strings.Replace(in, color, code, -1)
                  }
          }
          return in
  }

  func Colorize(in string) (out string) {
      if strings.Index(in, "]") != -1 {
          for color, code := range colorTable {
              in = strings.Replace(in, color, code, -1)
          }
      }
      return in
  }

  func Colorize(in string) (out string) {
    if strings.Index(in, "]") != -1 {
      for color, code := range colorTable {
        in = strings.Replace(in, color, code, -1)
      }
    }
    return in
  }


Personally I feel that a width of eight characters is too much. However, if
you look at the Linux coding style document it says:

  Tabs are 8 characters, and thus indentations are also 8 characters.
  There are heretic movements that try to make indentations 4 (or even 2!)
  characters deep, and that is akin to trying to define the value of PI to
  be 3.

The only other people I know who use eight characters are those who don't know
how to configure their editor :)

A width of four I find more readable and I notice that Brian Kernighan &
Dennis Ritchie use a width of four for all the code in 'The C Programming
Language' (2nd edition) - but I don't know if this was their preferred style
or if it was for layout reasons in the book. Four is also a common size on
Windows and Macs.

That leaves a width of just two. It's commonly used in shell scripts and
apparently Ruby.

I personally use a tab width of two. I have done for thirty years and I don't
think it will change anytime soon.

So now you know all about tabs and why, due to their 'unknown' size, I've
ignored them for now.
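
Once a width has been picked, expanding tabs is mostly bookkeeping: a tab moves
to the next tab stop rather than adding a fixed number of spaces. What follows
is a minimal sketch in Go, not the folding code itself. The name expandTabs is
made up for illustration, and it assumes the width is greater than zero and
that every other character occupies exactly one cell:

```go
package main

import (
	"fmt"
	"strings"
)

// expandTabs replaces each horizontal tab in a single line with spaces,
// advancing to the next multiple of width (the next tab stop).
// Assumptions: width > 0, and every non-tab rune is one cell wide.
func expandTabs(line string, width int) string {
	var b strings.Builder
	col := 0 // current column, counted in cells
	for _, r := range line {
		if r != '\t' {
			b.WriteRune(r)
			col++
			continue
		}
		n := width - col%width // spaces needed to reach the next tab stop
		b.WriteString(strings.Repeat(" ", n))
		col += n
	}
	return b.String()
}

func main() {
	// Show the same line at the three widths discussed above.
	for _, w := range []int{8, 4, 2} {
		fmt.Printf("%d: %q\n", w, expandTabs("\treturn in", w))
	}
}
```

With a width of four, "a\tb" becomes "a   b": the tab only adds three spaces
because the 'a' already occupies the first column.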

What about the 'weird' Unicode characters? Well even when using a monospaced
font some Unicode characters take up more space than a single character cell.


  Japanese for love: 愛
    Character cells: 12


Now, depending on your font and the size you are viewing it at, the Japanese
character may only span the '1' or it may span the '2' as well. Yes, it's font
dependent - well, sometimes. This would mean that the folding routine would
have to know what font the CLIENT was using in order to fold correctly.
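
As a rough sketch only, one font-independent guess is to count characters from
some East Asian scripts as two cells and everything else as one. The names
cellWidth and wide are hypothetical, and the choice of scripts is an
assumption rather than a rule taken from any standard, so treat the result as
an estimate:

```go
package main

import (
	"fmt"
	"unicode"
)

// wide lists scripts whose characters are assumed to take two cells.
// This is a heuristic: halfwidth katakana, for instance, are miscounted,
// and zero-width combining characters are not handled at all.
var wide = []*unicode.RangeTable{
	unicode.Han,
	unicode.Hiragana,
	unicode.Katakana,
	unicode.Hangul,
}

// cellWidth estimates how many character cells s occupies in a
// monospaced font: two per rune from the wide scripts, one otherwise.
func cellWidth(s string) int {
	n := 0
	for _, r := range s {
		if unicode.In(r, wide...) {
			n += 2
		} else {
			n++
		}
	}
	return n
}

func main() {
	for _, s := range []string{"love", "愛"} {
		fmt.Printf("%q: %d cells\n", s, cellWidth(s))
	}
}
```

A folding routine could use an estimate like this in place of a plain
character count, at the cost of being wrong whenever the client's font
disagrees.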

--
Diddymus

  [1] For the curious the traditional or de facto size of a vertical tab is
      six lines.

