Up to Main Index                          Up to Journal for December, 2015

                   JOURNAL FOR SATURDAY 12TH DECEMBER, 2015
______________________________________________________________________________

SUBJECT: Minor dev update, networking and memory leaks
   DATE: Sat 12 Dec 18:35:30 GMT 2015

I'm currently still working on the networking code. The code itself is quite
simple. As the WolfMUD server can run for long periods of time, memory leaks
are a big concern, and memory leaks are what I have been trying to track down
this week.

Before I go on about the memory leaks there is something else I want to cover
first. Adding networking means that there are many more goroutines running, so
I've also been testing the code with Go's race detector. In doing so I found a
data race bug in the BRL - Big Room Lock. This is now fixed and the code has
been pushed out to the public git dev branch. It's a little odd, as I used
nearly the same code as before - which wasn't detected as being racy. Maybe
the race detector has improved, or there was a race and I simply didn't
trigger it before.

In doing the network and race testing I threw up to 20,480 simultaneous
clients and 1,000,000 consecutive clients at the server. Testing takes a long
time and is very unglamorous and laborious.

For testing I expanded the number of locations in the world to ten. However,
there is only a single entry point. With only a single entry point into the
world there is a lot of contention for the BRL when all clients initially
connect. As the clients move to other locations and spread out, things become
responsive and playable again. Note that I have not yet implemented any
optimisations, such as crowd control[1].

For testing I've been running the server a lot with debugging turned on:


  GODEBUG=schedtrace=10000,scheddetail=1 GOTRACEBACK=2 ./server


I noticed that a lot of goroutines were dead and not being reused or reclaimed
by the garbage collector. They just lingered around and showed up as:


  G10: status=6() m=-1 lockedm=-1
  G38: status=6() m=-1 lockedm=-1
  :
  : (repeated a lot of times...)
  :


Under Go 1.5.2, runtime/runtime2.go defines status 6 as Gdead. I've had a
similar issue before[2]. It looks like they are being pinned by a reference
somehow. I've made sure I'm closing down the network connections and checking
for any and all errors. I've even added code to make sure channels are drained
and closed and that references to other structures are set to nil. Still
there is a small leak, and dead goroutines keep piling up.

More debugging lies ahead :( I'll just publish this and then get on with it...

--
Diddymus

  [1] Crowd control, see: ../../2012/12/14.html

  [2] Memory leaks, see: ../../2012/4/29.html
                         ../../2012/5/1.html
                         ../../2012/5/22.html

