JOURNAL FOR THURSDAY 4TH APRIL, 2019
______________________________________________________________________________
SUBJECT: 16 years that are gone forever and I’ll never have again
DATE: Thu 4 Apr 21:55:08 BST 2019
Title inspired by Guns N’ Roses “14 years” ;)
Monday morning I found an email in my spam folder. It was sent late Sunday
night, and contained a scan of a letter. My employment was terminated and I
wasn’t to finish serving my notice period. An impersonal email, no call, no
thank you for your service, no heads-up, nothing. So much for 16 years’ service
and loyally seeing it through till the very bitter end instead of bailing. I
am now unemployed and find myself in a strange limbo between jobs.
But enough of that!
While rewriting my static site generator I’ve been slowly fixing up entries to
this journal, and the website in general, as I spot things. It was during one
round of such changes that I noticed something odd. Some of the glyphs in
older journal entries looked different. They were being rendered using a
different font because they don’t exist in Roboto Mono. Odd.
Way back in 2016 I switched the site to use the then new Go Mono font. Then on
a whim I changed to Roboto Mono back in April/May 2018. I say on a whim because
I’m not sure why I made the change. I suspect it was a performance thing.
Anyway, Roboto has a lot fewer glyphs in it than Go. I’m still not 100% happy
with either font. However, not finding a suitable replacement, I’ve switched
back to the Go font for now. The question is: are all of the glyphs I’ve used
now covered? How would I find out?
First step is to get a list of all the glyphs in each of the two fonts. Easy
enough:
$ fc-match --format='%{file}\n%{charset}\n' 'Go Mono'
/usr/share/fonts/fonts-go/Go-Mono.ttf
20-7e a0-17f 192 1fa-1ff 218-21b 2c6-2c7 2c9 2d8-2dd 384-38a 38c 38e-3a1
3a3-3ce 400-45f 490-491 1e80-1e85 1ef2-1ef3 2013-2015 2017-201e 2020-2022
2026 2030 2032-2033 2039-203a 203c 203e 2044 207f 20a3-20a4 20a7 20ac 2105
2113 2116 2122 2126 212e 215b-215e 2190-2195 21a8 2202 2206 220f 2211-2212
2215 2219-221a 221e-221f 2229 222b 2248 2260-2261 2264-2265 2302 2310
2320-2321 2500 2502 250c 2510 2514 2518 251c 2524 252c 2534 253c 2550-256c
2580 2584 2588 258c 2590-2593 25a0-25a1 25aa-25ac 25b2 25ba 25bc 25c4
25ca-25cb 25cf 25d8-25d9 25e6 263a-263c 2640 2642 2660 2663 2665-2666
266a-266b f800 fb01-fb02 fffd
$ fc-match --format='%{file}\n%{charset}\n' 'Roboto Mono'
/home/rolfea/.fonts/RobotoMono-Regular.ttf
20-7e a0-17f 192 1a0-1a1 1af-1b0 1f0 1fa-1ff 218-21b 237 259 2bc 2c6-2c7 2c9
2d8-2dd 2f3 300-301 303 309 30f 323 384-38a 38c 38e-3a1 3a3-3ce 3d1-3d2 3d6
400-486 488-513 1e00-1e01 1e3e-1e3f 1e80-1e85 1ea0-1ef9 1f4d 2000-200b
2013-2015 2017-201e 2020-2022 2025-2026 2030 2032-2033 2039-203a 203c 2044
2074 207f 20a3-20a4 20a7 20ab-20ac 2105 2113 2116 2122 2126 212e 215b-215e
2202 2206 220f 2211-2212 221a 221e 222b 2248 2260 2264-2265 25ca f6c3 feff
fffc-fffd
I then mixed in a little ed-fu. I really like regular expressions and because
of that I'm probably one of the few people who actually like the ed editor[1].
I ended up with glyphs.ed as:
# Delete file name from 1st line
1d
# Split ranges onto separate lines
,s/\s/\
/g
# Turn single value X into a range X-X
,s/^\([0-9a-f]\+\)$/&-&/
# Add leading ‘ {0x’
,s/^/ {0x/
# Change the ‘-’ in the range to ‘, 0x’
,s/-/, 0x/
# Add trailing ‘},’
,s/$/},/
# Insert ‘{’ and the ASCII range at start of file
1i
{
{0x00, 0x1f}, // Added ASCII range
.
# Add ‘}’ at the end of the file
$a
}
.
# Write out to a temporary file
w temp.txt
# Quit ed
q
Scripting ed with this file produced a table of start and end code points
that I could paste into a Go slice of structs describing the font:
$ cat glyphs.ed | ed <(fc-match --format='%{file}\n%{charset}\n' 'Go Mono')
$ cat temp.txt
{
{0x00, 0x1f}, // Added ASCII range
{0x20, 0x7e},
{0xa0, 0x17f},
{0x192, 0x192},
//
// Table truncated for brevity
//
{0xf800, 0xf800},
{0xfb01, 0xfb02},
{0xfffd, 0xfffd},
}
I then wrote some quick and nasty Go code that simply loads a file, converts
the data to a []rune and then checks if each rune is in the generated table:
package main

import (
	"bytes"
	"fmt"
	"io/ioutil"
	"os"
	"strconv"
)

// glyphs lists the inclusive code point ranges covered by the font.
var glyphs = []struct {
	from, to rune
}{
	{0x00, 0x1f}, // Added ASCII range
	{0x20, 0x7e},
	{0xa0, 0x17f},
	{0x192, 0x192},
	//
	// Table truncated for brevity
	//
	{0xf800, 0xf800},
	{0xfb01, 0xfb02},
	{0xfffd, 0xfffd},
}

func main() {
	if len(os.Args) != 2 {
		fmt.Println("usage: glyphs <file>")
		return
	}

	data, err := ioutil.ReadFile(os.Args[1])
	if err != nil {
		fmt.Println(err)
		return
	}

	// Collect each distinct rune that no range in the table covers.
	notFound := make(map[rune]struct{})
nextr:
	for _, r := range bytes.Runes(data) {
		for _, p := range glyphs {
			if r >= p.from && r <= p.to {
				continue nextr
			}
		}
		notFound[r] = struct{}{}
	}

	if len(notFound) > 0 {
		fmt.Printf("File: %s\nGlyphs not in font:\n", os.Args[1])
		for r := range notFound {
			fmt.Printf(" %-8U %12v %-4q\n", r, utf8(r), r)
		}
	}
}

// utf8 returns the UTF-8 encoding of r as hex bytes, left-padded with ".."
// placeholders so that every rune takes up four columns.
func utf8(r rune) string {
	b := []byte(string(r))
	s := bytes.Repeat([]byte(" .."), 4-len(b))
	for _, c := range b {
		s = append(s, ' ')
		if c < 16 {
			s = append(s, '0')
		}
		s = append(s, strconv.FormatInt(int64(c), 16)...)
	}
	return string(s)
}
Like I said, quick and nasty hack :) Running it produces:
$ find ./public -name "*.txt" | xargs -n1 ./glyphs
File: ./public/journal/2016/10/16.txt
Glyphs not in font:
U+82B3 .. e8 8a b3 '芳'
U+0301 .. .. cc 81 '́'
File: ./public/journal/2016/11/11.txt
Glyphs not in font:
U+2420 .. e2 90 a0 '␠'
File: ./public/journal/2012/10/4.txt
Glyphs not in font:
U+611B .. e6 84 9b '愛'
File: ./public/journal/2012/11/26.txt
Glyphs not in font:
U+0336 .. .. cc b6 '̶'
File: ./public/journal/2013/4/11.txt
Glyphs not in font:
U+23CE .. e2 8f 8e '⏎'
I could then edit the files as necessary and remove or replace any offending
glyphs.
All in all, a nice quiet afternoon just hacking around :)
--
Diddymus
[1] I've written about ed before: ../../2016/6/8.html