JOURNAL FOR THURSDAY 4TH APRIL, 2019
______________________________________________________________________________

SUBJECT: 16 years that are gone forever and I’ll never have again
   DATE: Thu 4 Apr 21:55:08 BST 2019

Title inspired by Guns N’ Roses “14 years” ;)

Monday morning I found an email in my spam folder. It was sent late Sunday
night, and contained a scan of a letter. My employment was terminated and I
wasn’t to finish serving my notice period. An impersonal email, no call, no
thank you for your service, no heads-up, nothing. So much for 16 years’
service and loyally seeing it through till the very bitter end instead of
bailing. I am now unemployed and find myself in a strange limbo between jobs.

But enough of that!

While rewriting my static site generator I’ve been slowly fixing up entries to
this journal, and the website in general, as I spot things. It was during one
round of such changes that I noticed something odd. Some of the glyphs for
older journal entries appeared different. They were being rendered using a
different font as they didn’t appear in Roboto Mono. Odd.

Way back in 2016 I switched the site to use the then-new Go Mono font. Then on
a whim I changed to Roboto Mono back in April/May 2018. I say a whim because
I’m not sure why I made the change. I suspect it was a performance thing.
Anyway, Roboto has a lot fewer glyphs in it than Go.

I’m still not 100% happy with either font. However, not finding a suitable
replacement, I’ve switched back to the Go font for now. The question is: are
all of the glyphs I’ve used now covered? How would I find out?

First step is to get a list of all the glyphs in each of the two fonts.
Easy enough:

  $ fc-match --format='%{file}\n%{charset}\n' 'Go Mono'
  /usr/share/fonts/fonts-go/Go-Mono.ttf
  20-7e a0-17f 192 1fa-1ff 218-21b 2c6-2c7 2c9 2d8-2dd 384-38a 38c 38e-3a1
  3a3-3ce 400-45f 490-491 1e80-1e85 1ef2-1ef3 2013-2015 2017-201e 2020-2022
  2026 2030 2032-2033 2039-203a 203c 203e 2044 207f 20a3-20a4 20a7 20ac 2105
  2113 2116 2122 2126 212e 215b-215e 2190-2195 21a8 2202 2206 220f 2211-2212
  2215 2219-221a 221e-221f 2229 222b 2248 2260-2261 2264-2265 2302 2310
  2320-2321 2500 2502 250c 2510 2514 2518 251c 2524 252c 2534 253c 2550-256c
  2580 2584 2588 258c 2590-2593 25a0-25a1 25aa-25ac 25b2 25ba 25bc 25c4
  25ca-25cb 25cf 25d8-25d9 25e6 263a-263c 2640 2642 2660 2663 2665-2666
  266a-266b f800 fb01-fb02 fffd

  $ fc-match --format='%{file}\n%{charset}\n' 'Roboto Mono'
  /home/rolfea/.fonts/RobotoMono-Regular.ttf
  20-7e a0-17f 192 1a0-1a1 1af-1b0 1f0 1fa-1ff 218-21b 237 259 2bc 2c6-2c7
  2c9 2d8-2dd 2f3 300-301 303 309 30f 323 384-38a 38c 38e-3a1 3a3-3ce
  3d1-3d2 3d6 400-486 488-513 1e00-1e01 1e3e-1e3f 1e80-1e85 1ea0-1ef9 1f4d
  2000-200b 2013-2015 2017-201e 2020-2022 2025-2026 2030 2032-2033 2039-203a
  203c 2044 2074 207f 20a3-20a4 20a7 20ab-20ac 2105 2113 2116 2122 2126 212e
  215b-215e 2202 2206 220f 2211-2212 221a 221e 222b 2248 2260 2264-2265 25ca
  f6c3 feff fffc-fffd

I then mixed in a little ed-fu. I really like regular expressions and because
of that I'm probably one of the few people who actually like the ed
editor[1]. I ended up with glyphs.ed as:

  # Delete file name from 1st line
  1d
  # Split ranges onto separate lines
  ,s/\s/\
  /g
  # Turn single value X into a range X-X
  ,s/^\([0-9a-f]\+\)$/&-&/
  # add leading ‘ {0x’
  ,s/^/ {0x/
  # Change the ‘-’ in the range to ‘, 0x’
  ,s/-/, 0x/
  # Add trailing ‘},’
  ,s/$/},/
  # Insert ‘{’ and the ASCII range at start of file
  1i
  {
   {0x00, 0x1f}, // Added ASCII range
  .
  # Add ‘}’ at the end of the file
  $a
  }
  .
  # Write out to a temporary file
  w temp.txt
  # Quit ed
  q

Scripting ed with this file produced something I could use in a struct with
start and end ranges for a font:

  $ cat glyphs.ed | ed <(fc-match --format='%{file}\n%{charset}\n' 'Go Mono')
  $ cat temp.txt
  {
   {0x00, 0x1f}, // Added ASCII range
   {0x20, 0x7e},
   {0xa0, 0x17f},
   {0x192, 0x192},
   //
   // Table truncated for brevity
   //
   {0xf800, 0xf800},
   {0xfb01, 0xfb02},
   {0xfffd, 0xfffd},
  }

I then wrote some quick and nasty Go code that simply loads a file, converts
the data to a []rune and then checks if each rune is in the generated table:

  package main

  import (
    "bytes"
    "fmt"
    "io/ioutil"
    "os"
    "strconv"
  )

  // glyphs is the table generated by glyphs.ed: inclusive ranges of the
  // runes covered by the font.
  var glyphs = []struct {
    from, to rune
  }{
    {0x00, 0x1f}, // Added ASCII range
    {0x20, 0x7e},
    {0xa0, 0x17f},
    {0x192, 0x192},
    //
    // Table truncated for brevity
    //
    {0xf800, 0xf800},
    {0xfb01, 0xfb02},
    {0xfffd, 0xfffd},
  }

  func main() {
    data, err := ioutil.ReadFile(os.Args[1])
    if err != nil {
      fmt.Println(err)
      return
    }

    // Set of runes that fall outside every range in the table.
    notFound := make(map[rune]struct{})

  nextr:
    for _, r := range bytes.Runes(data) {
      for _, p := range glyphs {
        if r >= p.from && r <= p.to {
          continue nextr
        }
      }
      notFound[r] = struct{}{}
    }

    if len(notFound) > 0 {
      fmt.Printf("File: %s\nGlyphs not in font:\n", os.Args[1])
      for r := range notFound {
        fmt.Printf(" %-8U %12v %-4q\n", r, utf8(r), r)
      }
    }
  }

  // utf8 returns the UTF-8 encoding of r as space separated hex bytes,
  // padded on the left with " .." for each of the four bytes not used.
  func utf8(r rune) string {
    b := []byte(string(r))
    s := bytes.Repeat([]byte(" .."), 4-len(b))
    for _, b := range b {
      s = append(s, ' ')
      if b < 16 {
        s = append(s, '0')
      }
      s = append(s, strconv.FormatInt(int64(b), 16)...)
    }
    return string(s)
  }

Like I said, quick and nasty hack :) Running it produces:

  $ find ./public -name "*.txt" | xargs -n1 ./glyphs
  File: ./public/journal/2016/10/16.txt
  Glyphs not in font:
   U+82B3    .. e8 8a b3 '芳'
   U+0301    .. .. cc 81 '́'
  File: ./public/journal/2016/11/11.txt
  Glyphs not in font:
   U+2420    .. e2 90 a0 '␠'
  File: ./public/journal/2012/10/4.txt
  Glyphs not in font:
   U+611B    .. e6 84 9b '愛'
  File: ./public/journal/2012/11/26.txt
  Glyphs not in font:
   U+0336    .. .. cc b6 '̶'
  File: ./public/journal/2013/4/11.txt
  Glyphs not in font:
   U+23CE    .. e2 8f 8e '⏎'

I could then edit the files as necessary and remove or replace any offending
glyphs.

All in all, a nice quiet afternoon just hacking around :)

--
Diddymus

  [1] I've written about ed before: ../../2016/6/8.html
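
As an aside, the charset-to-table conversion done with ed above could probably
also be done at run time in Go, which would avoid pasting a generated table
into the source. What follows is only a rough sketch under that assumption,
not the code used for this entry: the names span, parseCharset and covered are
made up for illustration, and the charset string is just the first few values
of the Go Mono output shown earlier.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// span is an inclusive range of code points (hypothetical helper type).
type span struct{ from, to rune }

// parseCharset converts fontconfig %{charset} text such as
// "20-7e a0-17f 192" into spans. A single value X becomes the range X-X,
// mirroring what the ed script does.
func parseCharset(s string) ([]span, error) {
	var spans []span
	for _, f := range strings.Fields(s) {
		parts := strings.SplitN(f, "-", 2)
		lo, hi := parts[0], parts[len(parts)-1]
		from, err := strconv.ParseInt(lo, 16, 32)
		if err != nil {
			return nil, fmt.Errorf("bad range %q: %w", f, err)
		}
		to, err := strconv.ParseInt(hi, 16, 32)
		if err != nil {
			return nil, fmt.Errorf("bad range %q: %w", f, err)
		}
		spans = append(spans, span{rune(from), rune(to)})
	}
	return spans, nil
}

// covered reports whether r falls inside any of the spans.
func covered(spans []span, r rune) bool {
	for _, p := range spans {
		if r >= p.from && r <= p.to {
			return true
		}
	}
	return false
}

func main() {
	// Assumption: only the start of the Go Mono charset, for illustration.
	spans, err := parseCharset("20-7e a0-17f 192 1fa-1ff")
	if err != nil {
		fmt.Println(err)
		return
	}
	for _, r := range "aé芳" {
		fmt.Printf("%U covered: %v\n", r, covered(spans, r))
	}
}
```

Feeding it the full second line of the fc-match output, rather than the
shortened string, should give the same ranges as the generated table, minus
the manually added 0x00-0x1f control range.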