JOURNAL FOR THURSDAY 31ST AUGUST, 2023
______________________________________________________________________________

SUBJECT: What’s my line again?
DATE: Thu 31 Aug 20:47:05 BST 2023

With Mere v0.0.7 out I’ve been casting my eye over the todo list and one item
in particular has been nagging me for ages. Debugging and errors. Having a
language that blows up with an underlying stack trace is not good. It should
provide helpful error messages and say where the error occurred. Which brings
me nicely to today’s topic.

Tracking line numbers is hard. Don’t think so? Take this simple little
program:

     1  trace true
     2  call test 3
     3  dump
     4
     5  // test function
     6  test: func limit
     7    for x = 1; x <= limit; x++
     8      if x == 1
     9        println "one"
    10      elif x == 2
    11        println "two"
    12      else
    13        println "many"
    14      fi
    15    next
    16  endfunc

To compile this into something that can be executed it needs to go through a
number of compile phases. Mere has some debugging flags that let me capture
the output from the different phases. This is after the ‘Rewrite2’ phase
where the code is still kind of readable, if you can read reverse Polish
notation:

     1  … true trace ;
     2  … test 3 call ;
     3  dump ;
     4  ;
     5  ;
     6  guard∙0 goto ;
     7  test: … ·test·limit func ;
     8  x 1 = ;
     9  loop0∙cond goto ;
    10  test·loop0∙start: x 1 == ! is ;
    11  if1∙0 goto ;
    12  … "one" println ;
    13  if∙1 goto ;
    14  test·if1∙0: x 2 == ! is ;
    15  if1∙1 goto ;
    16  … "two" println ;
    17  if∙1 goto ;
    18  test·if1∙1: ;
    19  … "many" println ;
    20  test·if1∙2: test·if∙1: ;
    21  x ++ ;
    22  test·loop0∙cond: x limit <= is ;
    23  loop0∙start goto ;
    24  … endfunc ;
    25  guard∙0: ;

We have now gone from 16 lines of code to 25. Which lines were the original
source code? Here is the problem: how do you keep track of the original
source code lines when the lines are rewritten and new lines added? A
‘conventional’ compiler would generate an abstract syntax tree, or AST, and
each node would have a field for such information.
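
For comparison, here is a minimal sketch of the conventional approach, where each AST node carries its own line number and anything created by a rewrite copies it. This is not Mere's code: the `Node` class, the `lower_for` function and the node kinds are all hypothetical names made up for illustration.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical AST node: `line` is the 1-based source line it came from."""
    kind: str
    line: int
    children: list[Node] = field(default_factory=list)


def lower_for(loop: Node) -> list[Node]:
    """Flatten a `for` node into a goto-style sequence.

    Expects children [init, cond, step, *body]. The two goto nodes are
    invented by the rewrite, so they copy the line of the `for` itself.
    """
    init, cond, step, *body = loop.children
    return [
        init,
        Node("goto-cond", loop.line),
        *body,
        step,
        cond,
        Node("goto-start", loop.line),
    ]


# A `for` on source line 7 with a single body statement on line 9.
loop = Node("for", 7, [
    Node("init", 7), Node("cond", 7), Node("step", 7),
    Node("println", 9),
])
lowered = lower_for(loop)
for n in lowered:
    print(n.line, n.kind)
```

However many nodes the rewrite adds or reorders, each one still knows which source line it belongs to, because the position lives in the node rather than in the layout of the code.
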

Mere is not conventional, it’s very simple under the hood, and it cheats…
When Mere initially tokenizes the original source code it adds internal
labels for the original source lines. The same program as initially
tokenized:

     1  ·L∙1: trace true ;
     2  ·L∙2: call test 3 ;
     3  ·L∙3: dump ;
     4  ·L∙4: ;
     5  ·L∙5: ;
     6  ·L∙6: test: func limit ;
     7  ·L∙7: for x = 1 ;
     8  x < = limit ;
     9  x + + ;
    10  ·L∙8: if x = = 1 ;
    11  ·L∙9: println "one" ;
    12  ·L∙10: elif x = = 2 ;
    13  ·L∙11: println "two" ;
    14  ·L∙12: else ;
    15  ·L∙13: println "many" ;
    16  ·L∙14: fi ;
    17  ·L∙15: next ;
    18  ·L∙16: endfunc ;

There are two extra lines, 8 and 9 in the above, where the for-next loop has
been split into its three parts. Mere is happy to cope with multiple labels
for the same line, as in line 6 where we have “·L∙6:” and “test:”.

The nice thing about tagging the lines using labels is they move around with
the code. Here is the ‘Rewrite2’ phase again, this time with the additional
labels:

     1  ·L∙1: … true trace ;
     2  ·L∙2: … test 3 call ;
     3  ·L∙3: dump ;
     4  ·L∙4: ;
     5  ·L∙5: ;
     6  guard∙0 goto ;
     7  ·L∙6: test: … ·test·limit func ;
     8  ·L∙7: x 1 = ;
     9  loop0∙cond goto ;
    10  test·loop0∙start: ·L∙8: x 1 == ! is ;
    11  if1∙0 goto ;
    12  ·L∙9: … "one" println ;
    13  ·L∙10: if∙1 goto ;
    14  test·if1∙0: x 2 == ! is ;
    15  if1∙1 goto ;
    16  ·L∙11: … "two" println ;
    17  ·L∙12: if∙1 goto ;
    18  test·if1∙1: ;
    19  ·L∙13: … "many" println ;
    20  ·L∙14: test·if1∙2: test·if∙1: ;
    21  ·L∙15: x ++ ;
    22  test·loop0∙cond: x limit <= is ;
    23  loop0∙start goto ;
    24  ·L∙16: … endfunc ;
    25  guard∙0: ;

The labels can now be used to identify the original source lines. Don’t
worry, your executable isn’t going to be littered with labels.
There is an ‘Unlabel’ phase which cleans up and optimizes internal labels :)

If we run the program, the dump produced by the original line 3 in the code
looks like this:

    ============================== State Dump ==============================
    SP: 9 inst: 67/0 stacks: 1 vids: 0/3 xref: 15
    -- Global Storage ------------------------------------------------------
    0x0007:L test 13
    -- Stack Storage -------------------------------------------------------
    0: 9
    -- Code ----------------------------------------------------------------
     1   0 | O[77:…] b[true] F[21:trace] O[78:;]
     2   4 | O[77:…] V[0x0007:L "test"] i[3] F[22:call] O[78:;]
     3   9 | O[3:dump] O[78:;]
     4  11 | L[77] O[5:goto] O[78:;]
    0x0007 > test
     6  14 | O[77:…] V[0x001b:? "test·limit"] F[24:func] O[78:;]
     7  18 | V[0x001c:? "test·x"] i[1] O[58:=] O[78:;]
     7  22 | L[65] O[5:goto] O[78:;]
     8  25 | V[0x001c:? "test·x"] i[1] O[50:==] O[28:!] L[34] O[57:is] O[78:;]
     8  32 | L[41] O[5:goto] O[78:;]
     9  35 | O[77:…] s["one"] F[17:println] O[78:;]
    10  39 | L[62] O[5:goto] O[78:;]
    10  42 | V[0x001c:? "test·x"] i[2] O[50:==] O[28:!] L[51] O[57:is] O[78:;]
    10  49 | L[58] O[5:goto] O[78:;]
    11  52 | O[77:…] s["two"] F[17:println] O[78:;]
    12  56 | L[62] O[5:goto] O[78:;]
    13  59 | O[77:…] s["many"] F[17:println] O[78:;]
    14  63 | V[0x001c:? "test·x"] O[29:++] O[78:;]
    14  66 | V[0x001c:? "test·x"] V[0x001b:? "test·limit"] O[47:<=] L[74] O[57:is] O[78:;]
    14  72 | L[24] O[5:goto] O[78:;]
    16  75 | O[77:…] F[25:endfunc] O[78:;]
    18  78 |
    =============================== End Dump ===============================

The first column of numbers is the original source line number and the
second column is Mere’s instruction number.

That’s not all. Line numbers are now provided in traces, as produced by the
original line 1 in the code:

     1   3 ;        | b[false]
     2   4 …        |
     2   7 call     | O[77:…] L[0x0007 test] i[3]
     6  14 …        | i[3]
     6  16 func     | i[3] O[77:…] V[0x001b:? test·limit]
     6  17 ;        |
     7  20 =i       | V[0x001c:? test·x] i[1]
     7  21 ;        | i[1]
     7  23 gotoL    | L[65]
    14  68 <=ii     | V[0x001c:i test·x] V[0x001b:i test·limit]
    14  70 isLb     | b[true] L[74]
    14  71 ;        |
    14  73 gotoL    | L[24]
     8  27 ==ii     | V[0x001c:i test·x] i[1]
     8  28 !b       | b[true]
     8  30 isLb     | b[false] L[34]
     9  35 …        |
     9  37 println  | O[77:…] s["one"]
    one
     9  38 ;        |
    10  40 gotoL    | L[62]
    14  64 ++i      | V[0x001c:i test·x]
    14  65 ;        | i[2]
    14  68 <=ii     | V[0x001c:i test·x] V[0x001b:i test·limit]
    14  70 isLb     | b[true] L[74]
    14  71 ;        |
    14  73 gotoL    | L[24]
     8  27 ==ii     | V[0x001c:i test·x] i[1]
     8  28 !b       | b[false]
     8  30 isLb     | b[true] L[34]
     8  31 ;        |
     8  33 gotoL    | L[41]
    10  44 ==ii     | V[0x001c:i test·x] i[2]
    10  45 !b       | b[true]
    10  47 isLb     | b[false] L[51]
    11  52 …        |
    11  54 println  | O[77:…] s["two"]
    two
    11  55 ;        |
    12  57 gotoL    | L[62]
    14  64 ++i      | V[0x001c:i test·x]
    14  65 ;        | i[3]
    14  68 <=ii     | V[0x001c:i test·x] V[0x001b:i test·limit]
    14  70 isLb     | b[true] L[74]
    14  71 ;        |
    14  73 gotoL    | L[24]
     8  27 ==ii     | V[0x001c:i test·x] i[1]
     8  28 !b       | b[false]
     8  30 isLb     | b[true] L[34]
     8  31 ;        |
     8  33 gotoL    | L[41]
    10  44 ==ii     | V[0x001c:i test·x] i[2]
    10  45 !b       | b[false]
    10  47 isLb     | b[true] L[51]
    10  48 ;        |
    10  50 gotoL    | L[58]
    13  59 …        |
    13  61 println  | O[77:…] s["many"]
    many
    13  62 ;        |
    14  64 ++i      | V[0x001c:i test·x]
    14  65 ;        | i[4]
    14  68 <=ii     | V[0x001c:i test·x] V[0x001b:i test·limit]
    14  70 isLb     | b[false] L[74]
    16  75 …        |
    16  76 endfunc  | O[77:…]
     2   8 ;        |
     3   9 dump     |
     3  10 ;        |
     4  12 gotoL    | L[77]

As in the dump, the first column of numbers is the original source line
number and the second column is Mere’s instruction number.

Now that we can match Mere’s instruction number to an original source line,
we have the information needed to write some nice error handling and error
messages. I’ve also been working on updating Mere ICE to show line numbers
next to the code in the editor.
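
The label trick can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions, not how Mere is implemented: the function names `tag`, `rewrite`, `unlabel` and `where` are hypothetical, tokenizing is a plain whitespace split, the "compile phase" just adds one unlabelled line after each `if`, and every token counts as one instruction.

```python
import re

# A line label as used in the listings: ·L∙N:
LABEL = re.compile(r"^·L∙(\d+):$")


def tag(source):
    """Split each line on whitespace and prefix it with a ·L∙N: label."""
    return [[f"·L∙{n}:"] + text.split()
            for n, text in enumerate(source.splitlines(), start=1)]


def rewrite(lines):
    """Stand-in for a compile phase: add an unlabelled line after each 'if'."""
    out = []
    for toks in lines:
        out.append(toks)
        if "if" in toks:
            out.append(["skip", "goto"])  # new code, no label of its own
    return out


def unlabel(lines):
    """Strip the line labels.

    Returns (code, line_of), where line_of[i] is the source line of
    instruction i. Unlabelled code inherits the most recent label.
    """
    code, line_of, current = [], [], 0
    for toks in lines:
        for tok in toks:
            m = LABEL.match(tok)
            if m:
                current = int(m.group(1))
            else:
                code.append(tok)
                line_of.append(current)
    return code, line_of


def where(inst, line_of):
    """Format an error location from an instruction number."""
    return f"error at instruction {inst} (source line {line_of[inst]})"


source = 'if x == 1\nprintln "one"\nfi'
code, line_of = unlabel(rewrite(tag(source)))
print(where(6, line_of))
```

The labels are ordinary tokens, so the rewrite phase never has to know about them. Only the final pass that strips them needs to remember which label it saw last.
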

And yes, when you’ve been looking at all of this for hours on end it does
start to look like[1] something out of “The Matrix” :)

    é π Ÿ ѝ ž ј Ŕ ж ω Ï ī Ū Œ ï Ǻ ţ ι η Ş Œ Ä Ѕ Ĥ Ό Â Ŧ ŀ ř ё Ї Ћ Щ ъ Õ Ī Ţ
    љ æ Ч Ш Υ ю π Ћ ş Đ З т ч ī Џ И ς Ц Á є Ѓ Ѕ Ċ ĸ ĸ ẅ Њ å ń û Û ø ż Ѐ É Ò
    Ģ ǿ ω Ч Ô Р Ẅ ý ί њ β Ŷ į Ú ē Ĵ ŷ Ï ћ ŵ Ẁ É д Ι У ц ł Ò Ï љ Ā Ŏ ij Σ Б Ň
    П Ť І Ο á ş ŀ Ŧ ς Ŵ â σ Α š Щ Π Ö Ă Ų Ś ċ ά Д м Ω ї Ğ đ Ė Ī Η к ј χ ς Ņ
    Ỳ Ж ѐ Ŵ К ÿ Љ ќ ω û м α η ď Џ ĕ й ā Œ а ā Œ ô Č Ó ŧ Ϋ ķ Ŗ Ļ ı ẅ ѝ Ε Ņ Ђ
    ђ ú ϊ с Ò ν ī Ы Ĕ ë ъ Ц ј È Џ ј ř Ó Ż ß ŭ ς ij є ũ Ţ Т ũ ī х Į я Θ Υ Ѓ ñ
    α ί Þ φ С ą з Ў э Ý č Ж ō Ț ţ î ξ ł Ů Н ё ō š û ш Ѐ й Γ Λ Ẃ љ ы ν Ź Ô č
    Ț È ĵ ф з П Ŗ ϋ п Ī Ή Ф ē ч Ч ζ İ Ÿ ш Ţ Ŗ ĩ й О Ε ŏ ѐ ŋ ʼn Ț Ĩ ў Ы į Ħ ĺ
    ü ϊ ĥ Α ĸ Έ Õ Щ π Ġ ѓ α Ш ī л ý Ö ό Ŭ ſ ʼn ỳ Ǻ ŧ ё ξ Ê Ń ŝ Ь ò ť â з Ĥ þ
    Ù Ε ř Ή σ β Ъ Ā é ќ ε æ ē ё È Ω ø ι Í Ħ Ѕ ī ș Σ ă Ε Β К Ş φ Ð Ϊ Ð Ź Ǻ ť
    Ú ï υ ή û Ŷ Ď Β Х ĥ Ŋ õ ç Ħ IJ Ÿ š Х ĵ Ť ō ů ß α Ǻ Н Η ň Ā ģ Ω ζ к Ί Н ú
    í Щ ø Ђ Ļ ѐ Њ п δ Ǻ з ĩ ж ċ ħ Э Ђ Ç ў Ώ ή χ ß ύ É Ğ φ Ċ Ǽ ĩ я П й ş Ŕ Ѓ
    О ώ Ý Î г ǽ ψ Ќ Ǻ ή Ύ Η Ý Є Ų ů γ Ş Ú Φ Т ά П

--
Diddymus

[1] If you want your own “Matrix” generator, here’s my quick 5 minute hack:

    chars  = "ÀÁÂÃÄÅÆÇÈÉÊËÌÍÎÏÐÑÒÓÔÕÖרÙÚÛÜÝÞßàáâãäåæçèéêëìíîïðñòóôõöøùúûüýþÿ"
    chars += "ĀāĂăĄąĆćĈĉĊċČčĎďĐđĒēĔĕĖėĘęĚěĜĝĞğĠġĢģĤĥĦħĨĩĪīĬĭĮįİıIJijĴĵĶķĸĹĺĻļĽľ"
    chars += "ĿŀŁłŃńŅņŇňʼnŊŋŌōŎŏŐőŒœŔŕŖŗŘřŚśŜŝŞşŠšŢţŤťŦŧŨũŪūŬŭŮůŰűŲųŴŵŶŷŸŹźŻżŽ"
    chars += "žſƒǺǻǼǽǾǿȘșȚțΆΈΉΊΌΎΏΐΑΒΓΔΕΖΗΘΙΚΛΜΝΞΟΠΡΣΤΥΦΧΨΩΪΫάέήίΰαβγδεζηθικλ"
    chars += "μνξοπρςστυφχψωϊϋόύώЀЁЂЃЄЅІЇЈЉЊЋЌЍЎЏАБВГДЕЖЗИЙКЛМНОПРСТУФХЦЧШЩЪЫ"
    chars += "ЬЭЮЯабвгдежзийклмнопрстуфхцчшщъыьэюяѐёђѓєѕіїјљњћќѝўџҐґẀẁẂẃẄẅỲỳ"
    lc = len chars

    // build the first row: 39 two-character cells, about 1 in 5 filled
    line = ""
    for x = 0; x < 39; x++
      if rnd 100 > 80
        line += chars[rnd lc -1] + " "
      else
        line += "  "
      fi
    next

    // print 24 rows, mutating the row in place between prints
    for x = 0; x < 24; x++
      range k; v; line
        // odd positions are the spaces between cells, leave them alone
        is k % 2 == 1; continue
        if v != " "
          // a filled cell usually changes character, sometimes goes blank
          if rnd 100 > 10
            line[k] = chars[rnd lc -1]
          else
            line[k] = " "
          fi
        elif rnd 100 > 85
          // a blank cell occasionally starts a new column
          line[k] = chars[rnd lc -1]
        fi
      next
      println line
    next