JOURNAL FOR SATURDAY 26TH AUGUST, 2023
______________________________________________________________________________

SUBJECT: Too magical for some?
   DATE: Sat 26 Aug 22:16:41 BST 2023

After the last journal entry there were some comments on the “regexp string”
feature. I originally said:

  A “regexp string” is only found after the regexp keyword, the string must
  be quoted with backticks, the string is only interpreted once, at compile
  time.

People felt it was a little too magical, in the sense that a normal string
used in a particular way had special meaning, and it could catch people out.
I was also told I was trying to be too clever: what if regexp was followed by
a concatenation such as `string` + `string`? Was it still a special case?

The “regexp string” example I gave was:

  re = regexp `
    ^           # anchor at beginning
    0x          # hexadecimal prefix
    [a-fA-F0-9] # a hexadecimal digit
    +           # one or more times
    $           # anchor at end
  `
  println literal re

When the code ran it would produce:

  regexp(`^0x[a-fA-F0-9]+$`)

What to do? I still wanted the feature. I’ve had a play with various ideas,
trying to come up with something that was simple, explicit and non-magical.

My solution is to add another regular expression specific operator, ~x, for
extended strings. The ~x operator takes a string, removes embedded comments
and white-space, and returns the resulting string. Let me demonstrate using
the above example:

  re = regexp ~x `
    ^           # anchor at beginning
    0x          # hexadecimal prefix
    [a-fA-F0-9] # a hexadecimal digit
    +           # one or more times
    $           # anchor at end
  `
  println literal re

The ~x operator works on a string and returns a string. Any string operations
work as expected, with no magic.

As regexp and strings are closely related, I did have more magic that let you
use them interchangeably. For example, the comparison operators let you
compare regexp and strings.
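As an aside on ~x itself, here is a minimal sketch of the "remove comments
and white-space, return a string" idea described above. It is written in
Python rather than Mere, and strip_extended is a hypothetical name, not
Mere's implementation. It assumes every '#' starts a comment that runs to the
end of the line and that all white-space is dropped; how the real ~x treats
an escaped '#' or white-space inside a character class is not covered here.

```python
import re


def strip_extended(pattern: str) -> str:
    """Naive sketch: drop '#' comments and all white-space from a pattern."""
    # Assumption: '#' always starts a comment that runs to the end of the line.
    without_comments = re.sub(r"#[^\n]*", "", pattern)
    # Remove all remaining white-space, including the newlines.
    return re.sub(r"\s+", "", without_comments)


extended = """
  ^           # anchor at beginning
  0x          # hexadecimal prefix
  [a-fA-F0-9] # a hexadecimal digit
  +           # one or more times
  $           # anchor at end
"""

compact = strip_extended(extended)
print(compact)  # ^0x[a-fA-F0-9]+$
```

The point of the sketch is that the result is an ordinary string, so it can
be concatenated or passed around like any other before being compiled.
Python's own re.VERBOSE flag offers a similar convenience, but applies it
inside the regexp compiler rather than as a separate string operation.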
For now I’ve dropped those changes, as there are no other instances in the
language where types behave like that. If you want to compare a regexp and a
string you can use a string conversion:

  re = regexp `[0-9]`
  println string re == "[0-9]"

I guess in the long run this makes sense: a regexp should never be equal to a
string, as they are different types.

After all the tweaking and poking, can you still do crazy shit?

  >cat crazy.mr
  text = `"abc", "def,ghi", "jkl"`
  range ; v; text ~s (~x `
    ^"|"$ # starts/ends with literal "
  `) "" ~c ~x `
    "   # literal "
    \s* # optional white-space
    ,   # literal ,
    \s* # optional white-space
    "   # literal "
  `
    println v
  next
  >mere crazy.mr
  abc
  def,ghi
  jkl
  >

Guess that depends on your definition of crazy shit…

--
Diddymus