Up to Main Index                              Up to Journal for July, 2018

                     JOURNAL FOR SUNDAY 29TH JULY, 2018
______________________________________________________________________________

SUBJECT: A thought on scheduling
   DATE: Sun 29 Jul 22:53:41 BST 2018

The hot summer continues. Temperatures in my study have been over 30°C this
week. Nearly killed one of my Raspberry Pi (model B). The Pi was sat on top of
a network switch and both became rather hot. I didn’t notice until the Pi
stopped responding on the network. Once the Pi had been moved and left to cool
for a bit it rebooted just fine and started working normally again.

This week I have stopped fiddling around with Raspberry Pi[1] and returned to
working on WolfMUD properly. I’ve mostly been concentrating on the writing of
recordjar files — coding tests and making some minor improvements to the code
in the recordjar package.

Previously, when I was load testing the Pi Zero, I noticed that scheduling of
goroutines was very unfair (random) under heavy load. This caused pauses when
processing commands, and you would start to see things like:


  >sneeze
  You see Diddymus enter.
  >
  You see Diddymus go north.
  >
  You sneeze. Aaahhhccchhhooo!
  >


In this example you type the command ‘sneeze’ and press enter. While you are
waiting for the sneeze to acquire the necessary location locks Diddymus enters
and then leaves again. Entering and leaving are two separate commands to your
one. This can arise because goroutine scheduling is random. Both you and
Diddymus need location locks. Diddymus gets the locks first and enters, then
the locks are released. Because of other players on the server someone else
then gets the location locks while you and Diddymus are waiting. Again
Diddymus gets the location locks before you and leaves. Finally, you get
the location locks and sneeze.
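
As a rough illustration (a standalone Go sketch rather than WolfMUD code, and
the command names are made up), a handful of goroutines contending for a
single sync.Mutex are not guaranteed the lock in the order they were started:

  // A standalone sketch, not WolfMUD code: each goroutine stands in for one
  // player command wanting the same location lock. The order the commands
  // are processed in depends on the scheduler, not on the order the
  // goroutines were started.
  package main

  import (
    "fmt"
    "sync"
  )

  func main() {
    var (
      location sync.Mutex     // stands in for a location's lock
      wg       sync.WaitGroup
    )

    commands := []string{"sneeze", "enter", "go north", "look", "inventory"}

    for _, cmd := range commands {
      wg.Add(1)
      go func(cmd string) {
        defer wg.Done()
        location.Lock()
        fmt.Println("processing:", cmd) // order varies from run to run
        location.Unlock()
      }(cmd)
    }

    wg.Wait()
  }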

I don’t want to globally serialise access to locations. Otherwise a player
wanting locks for one location would end up waiting on a player in an
independent location. On multicore machines, where we can handle multiple
players at once, performance would suffer. Serialising locks per location
would require adding a channel to every location, and each channel would have
to have a capacity equal to the maximum number of players allowed on the
server. The channels need to be that large, otherwise players would block
writing to the channel and which player gets unblocked is again random,
leaving us back at square one. I could allocate the channels from a pool as
and when required, but I
think that could get overly complicated quickly.
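
A rough sketch of that per-location channel idea (again standalone Go rather
than WolfMUD code, with a made-up player limit): a buffered channel acts as a
FIFO queue which a serving goroutine drains one player at a time, and the
buffer is sized so that queuing itself never blocks:

  // A standalone sketch, not WolfMUD code. Each location owns a buffered
  // channel of waiting players, drained in FIFO order by a serving
  // goroutine. The buffer must hold an entry for every player the server
  // allows, otherwise queuing itself would block and the wake-up order
  // would be random again.
  package main

  import "fmt"

  const maxPlayers = 8 // made-up server-wide player limit, tiny for the demo

  type location struct {
    queue chan chan struct{} // FIFO of per-player "your turn" channels
  }

  func newLocation() *location {
    l := &location{queue: make(chan chan struct{}, maxPlayers)}
    go l.serve()
    return l
  }

  // serve grants the location to one queued player at a time, in arrival
  // order, waiting for each player to hand the location back before moving
  // on to the next.
  func (l *location) serve() {
    for turn := range l.queue {
      turn <- struct{}{} // grant the location
      <-turn             // wait for the player to hand it back
    }
  }

  // acquire queues the caller and blocks until it is granted the location.
  // It returns a release function the caller must call when finished. The
  // send on l.queue never blocks while the buffer covers every player.
  func (l *location) acquire() (release func()) {
    turn := make(chan struct{})
    l.queue <- turn
    <-turn
    return func() { turn <- struct{}{} }
  }

  func main() {
    loc := newLocation()
    done := make(chan struct{})

    for i := 1; i <= 3; i++ {
      go func(id int) {
        release := loc.acquire()
        fmt.Printf("player %d holds the location\n", id)
        release()
        done <- struct{}{}
      }(i)
    }

    for i := 0; i < 3; i++ {
      <-done
    }
  }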

The answer may be a hybrid of what we have now and queuing. Instead of using a
mutex for location locking we switch to channels. The advantage of using a
channel is that we can use a select statement to try and get a lock and do
something else if the lock is not immediately available. The something else
would be to release any locks we have acquired and then queue. Queued players
are then allowed to acquire locks one after the other in FIFO order. This
could also get overly complicated quickly.

For now I’m just mulling over ideas, and making a few notes for later. I don’t
want to get distracted from writing those pesky, but essential, tests.

--
Diddymus

  [1] The Raspberry Pi are now set up "just right" for how I like to work.


  Up to Main Index                              Up to Journal for July, 2018