Up to Main Index Up to Journal for April, 2017 JOURNAL FOR FRIDAY 21ST APRIL, 2017 ______________________________________________________________________________ SUBJECT: Info, tips & updated on debugging WolfMUD DATE: Fri 21 Apr 23:58:54 BST 2017 With the recent journal entries last month on debugging, performance and sorting out deallocation of objects for garbage collection I’ve had a lot of questions asking how I actually debug WolfMUD. Hopefully this post, and some recent changes on the public dev branch, will help with debugging WolfMUD and provide ideas for debugging Go programs in general. My main tool is the obvious one, to sprinkle Printf lines throughout the code: log.Printf("[DEBUG] %p %[1]T %s: %s", t, t.UID(), FindName(t).Name("?")) This will write messages to the log something like: 2017/04/21 20:04:22.702800 thing.go:61: [DEBUG] 0xc42010fda0 *attr.Thing #UID-2Q: "the tavern door" For temporary debugging I always put "[DEBUG]" in the messages so that the lines are easy to find and take out again once I’ve finished debugging. If a particular message is very useful for debugging I’ll drop the "[DEBUG]" and condition the message with a configuration option. I also use the #DUMP command a lot to inspect the state of Things. The #DUMP command is a player command that can be used from within WolfMUD if enabled in the server configuration[1]. An example of the output from ‘#DUMP CHEST’: 0xc4200eb770 *attr.Thing #UID-3B, 7 attributes: 0xc4201249e0 *attr.Description "This is a very stout wooden chest about 2 feet wide and 1 foot deep. Thick metal bands bind it." 0xc420124a00 *attr.Name "an iron bound chest" 0xc420124a40 *attr.Alias 2 aliases: "CHEST", "#UID-3B" 0xc4200eb530 *attr.Reset After: 5m0s Jitter: 1m0s Spawn: false 0x0 event.Cancel 0xc4200eb560 *attr.Cleanup After: 5m0s Jitter: 1m0s 0x0 event.Cancel 0xc4200f4b90 *attr.Inventory Lock ID: 83, 1 items (split: 0): 0xc4200eb740 *attr.Thing #UID-3A, 7 attributes: 0xc4200f4be0 *attr.Inventory Lock ID: 84, 1 items (split: 0): 0xc4200eb650 *attr.Thing #UID-39, 6 attributes: 0xc420124a80 *attr.Alias 2 aliases: "BALL", "#UID-39" 0xc4200eb5c0 *attr.Reset After: 0s Jitter: 2m0s Spawn: true 0x0 event.Cancel 0xc4200eb5f0 *attr.Cleanup After: 1s Jitter: 0s 0x0 event.Cancel 0xc420124aa0 *attr.Description "This is a small, red, rubber ball." 0xc420124ac0 *attr.Name "a small red ball" 0xc4200eb620 *attr.Locate -> Origin: 0xc4200f4be0 a small leather pouch, Where: 0xc4200f4be0 a small leather pouch 0xc420124b00 *attr.Description "This is a small pouch made of soft dark leather." 0xc420124b20 *attr.Name "a small leather pouch" 0xc420124b60 *attr.Alias 2 aliases: "POUCH", "#UID-3A" 0xc4200eb6b0 *attr.Reset After: 1m0s Jitter: 1m0s Spawn: false 0x0 event.Cancel 0xc4200eb6e0 *attr.Cleanup After: 10m0s Jitter: 10m0s 0x0 event.Cancel 0xc4200eb710 *attr.Locate -> Origin: 0xc4200f4b90 an iron bound chest, Where: 0xc4200f4b90 an iron bound chest 0xc4200eb7a0 *attr.Locate -> Origin: 0xc4200e3b80 Fireplace, Where: 0xc4200e3b80 Fireplace The #DUMP command provides a lot of information about an item, player, location or any other Thing and will recursively display information for Inventory content. So using #DUMP on a location will list all of it’s content including items, players and their content — the main reason why I add aliases to locations even though they are not really needed. When debugging clean ups, resets and respawning I needed to be able to check the status of items that were not in play. One disadvantage of #DUMP is that it only works on aliases for a Thing in the current location. To be able to #DUMP an arbitrary Thing I used the unsafe package to dump any Thing by its address. For example ‘#DUMP 0xc4200eb650’: 0xc4200eb650 *attr.Thing #UID-39, 6 attributes: 0xc420124a80 *attr.Alias 2 aliases: "BALL", "#UID-39" 0xc4200eb5c0 *attr.Reset After: 0s Jitter: 2m0s Spawn: true 0x0 event.Cancel 0xc4200eb5f0 *attr.Cleanup After: 1s Jitter: 0s 0x0 event.Cancel 0xc420124aa0 *attr.Description "This is a small, red, rubber ball." 0xc420124ac0 *attr.Name "a small red ball" 0xc4200eb620 *attr.Locate -> Origin: 0xc4200f4be0 a small leather pouch, Where: 0xc4200f4be0 a small leather pouch A common technique I used was to #DUMP a Thing to find it’s address. Then junk the Thing / wait for reset or respawn and then #DUMP the address to examine the state of the Thing. Since discovering debug.SetPanicOnFault the #DUMP command no longer crashes the server if invalid addresses are given or the address is not that of a Thing ;) Also while working on clean ups, resets and respawning I needed to make sure items were being properly garbage collected. For this I added some logging to the NewThing and Free methods as well as a finalizer to Thing so I knew when it was garbage collected. This appears in the log as: NewThing: 0xc420014300 *attr.Thing #UID-6K: "a small red ball" Disposing: 0xc420014300 *attr.Thing #UID-6K: "a small red ball" Finalizing: 0xc420014300 *attr.Thing #UID-6K Note, I’ve dropped the date and time from these log lines to keep the line lengths short. Also note, this works best if Stats.Rate is set low — I usually set it to 10s — and Stats.GC is set to true. The changes for logging Thing information has not been released until now as they make the log very noisy. The format of the finalizer for Thing is quite simple and generic, I often reuse variations of it with a "[DUMP]" prefix when debugging allocations and garbage collection issues: runtime.SetFinalizer(t, func(t has.Thing) { log.Printf("Finalizing: %s", t) }) This change is now in the public dev branch along with some configuration changes which I will go into shortly. For general performance, allocation and memory debugging I use the standard pprof tool along with a #DEBUG command. The #DEBUG command allows players to start and stop different profiles from within WolfMUD. For example: >#debug memprof start Memory profiling turned on. // wait a while... >#debug memprof stop Memory profiling turned off. This generates a file called memprof in the server’s current directory that we can examine with Go’s pprof tool: >go tool pprof ./server memprof File: server Type: inuse_space Time: Apr 21, 2017 at 9:36pm (UTC) Entering interactive mode (type "help" for commands, "o" for options) (pprof) top Showing nodes accounting for 16094.02kB, 69.27% of 23234.47kB total Dropped 69 nodes (cum <= 116.17kB) Showing top 10 nodes out of 72 flat flat% sum% cum cum% 3749.62kB 16.14% 16.14% 3749.62kB 16.14% runtime.malg src/runtime/proc.go 2187.28kB 9.41% 25.55% 2187.28kB 9.41% attr.NewAlias attr/alias.go 2031.05kB 8.74% 34.29% 2031.05kB 8.74% attr.NewInventory internal/brl.go 1562.34kB 6.72% 41.02% 3005.62kB 12.94% comms.newClient comms/client.go 1251.75kB 5.39% 46.41% 5095.88kB 21.93% runtime.systemstack asm_amd64.s 1249.88kB 5.38% 51.78% 1249.88kB 5.38% attr.(*Thing).Add attr/thing.go 1249.88kB 5.38% 57.16% 2812.22kB 12.10% net.(*netFD).accept fd_unix.go 1093.64kB 4.71% 61.87% 1093.69kB 4.71% frontend.New frontend/frontend.go 937.41kB 4.03% 65.91% 937.41kB 4.03% net.sockaddrToTCP tcpsock_posix.go 781.17kB 3.36% 69.27% 781.17kB 3.36% attr.NewInventory attr/inventory.go (pprof) Note, the names on the right above have been badly mangled to keep the line lengths short for this post only. If the figures look high that’s because there were a lot of ‘players’ on the server when the example was made. The corresponding status line from the logs: U[110Mb +46kb] O[357548 +869] T[10184 +0] G[10020 +0] P 10000/10000 That’s 110Mb of memory used 357,548 objects 10,184 Things 10,020 goroutines for 10,000 players ;) #DEBUG can be used for CPU, memory and block profiling and for making heap dumps. It can also force a client goroutine to panic — for testing error handling. The #DEBUG command has not been released before and was only for my testing and debugging but it is now on the public dev branch and behind a new configuration option. So what are these new configuration options I’ve been mentioning? Debugging creates a lot of noise in the log. Some of the debugging commands are also not suitable for players on a ‘public’ server — even if it’s just run for a few friends. These reasons are why I’ve held back for so long on committing the changes. To deal with this I have improved the debug configuration for the server and finally added all of my debugging code and commands for others to use. The new configuration section looks like: // // Debug configuration // Debug.LongLog: false Debug.Panic: false Debug.AllowDump: false Debug.AllowDebug: false Debug.Events: false Debug.Things: false The default configuration in data/config.wrj has been updated and the new options are detailed in docs/configuration-file.txt but briefly: LongLog: If true adds millisecond precision and file/line number of where log message was written to log lines. Panic: If true causes client and event goroutine to panic and terminate the server instead of recovering gracefully. AllowDump: If true enables the #DUMP command. AllowDebug: If true enables the #DEBUG command. Events: If true writes log messages when an event is queued, cancelled or delivered. Things: If true writes log messages when a Thing is created, disposed of or finalized. The old Server.Debug setting has been removed. The new #DEBUG command takes the following options — detailed in cmd/debug.go: DEBUG MEMPROF ( START | STOP ) DEBUG CPUPROF ( START | STOP ) DEBUG BLOCKPROF ( START | STOP ) DEBUG HEAPDUMP DEBUG PANIC You can also use ON and OFF instead of START or STOP. And that is the whirlwind tour of debugging WolfMUD. There are also a few other updates on the public dev branch. The $ACTION and $CLEANUP commands now work with a passed alias. A count of the number of live Things has been added to the log status lines. The frog in the gardens has been added as has the tavern cat. Lease acquire and release log messages have been dropped as they were actually misleading and confusing under heavy loads. Anything else I wanted to say? I now have a Raspberry Pi running Raspbian for testing WolfMUD and a spare backup. Both are Model B rev 2 boards running at 900MHz. Currently the Pi can handle about 4,000 players. At some point I’ll probably upgrade to a Raspberry Pi 3. Maybe get a cheap, dinky Pi Zero W for WolfMUD as well. I’ve also switched over to the Go 1.8.1 release for WolfMUD without issues. -- Diddymus [1] Commands starting with a ‘#’ like #DUMP and #DEBUG are supposed to be privileged commands but I haven’t implemented access levels yet. Up to Main Index Up to Journal for April, 2017