Day 13: Packet Scanners

Notice: If you have coding experience, I highly recommend to try and solve the puzzle in a language you already know. It's a lot of fun and you'll be able to focus on Go more later on.

Was one of your previous solutions really really slow? We’re going to fix that today. Go read the problem description, it’s a fun one. Go on, I’ll wait here until you’re ready.

Good, you’re back! Let’s get started. On the agenda for today: more on interfaces and benchmarking.

Part A: interfaces

You may have noticed it through the previous posts, but I like to write the solution as a close model of the problem. I would rather create a a bunch of structs, rather than to use a mathematical solution. Sometimes it works out alright, sometimes it doesn’t. In this case, I’ll start with a Firewall struct that is a slice of layers. It has a method called Passthrough that calculates the summed severity of passing through all the layers without delay.

type Firewall []Layer

func (f Firewall) Passthrough() int {
	sev := 0

	// Pass through the firewall
	for _, l := range f {
		// Check if caught at this layer
		if l.Caught() {
			sev += l.Severity()
		}
		// Move the scanners on all layers forward
		for _, s := range f {
			s.Move()
		}
	}
	return sev
}

Okay, that looks simple enough. Looking at the program description though, there is no continuous list of layers. Some layers are empty and don’t have a Scanners. So there are two different behaviours, but the Firewall only consists of Layer objects. Maybe we should make Layer an interface. To get the Passthrough function to work, the interface will need three methods: Caught, Severity and Move.

type Layer interface{
	// Check if the scanner is in a position to catch us.
	Caught() bool	
	// Calculate the severity of the layer.
	Severity() int
	// Move the scanner one position.
	Move()
}

To implement this interface, the types only need to have these methods. There is no explicit reference to the Layer type. The compiler looks at the methods of the struct and determines on its own if this type matches the functions in the interface. To avoid the hard parts first, here’s the interface implementation for layers without scanner:

type ZeroLayer struct{}

func (l ZeroLayer) Caught() bool {
	return false
}

func (l ZeroLayer) Severity() int {
	return 0
}

func (l ZeroLayer) Move() {
	// noop
}

Really, that’s all there is to it. The packet is never caught, severity is always 0 and there is no scanner to move. And because the ZeroLayer struct has the three methods, it implements the Layer interface. The implementation for layers with a scanner, is a bit more complex as it contains actual logic:

type SecurityLayer struct {
	Depth int
	Range int

	Scanner int
	Dir     bool
}

func (l SecurityLayer) Caught() bool {
	return l.Scanner == 0
}

func (l SecurityLayer) Severity() int {
	return l.Depth * l.Range
}

func (l *SecurityLayer) Move() {
	if l.Dir {
		l.Scanner++
		// if scanner is at the end, we move back down again
		l.Dir = (l.Scanner < l.Range-1)
	} else {
		l.Scanner--
		// if scanner is 0, we move back up again
		l.Dir = (l.Scanner == 0)
	}
}

Part B: Benchmarking

Putting all these things together, gives us the solution for part A. The nice thing is, with these methods, we can also create a solution for part B. In the main function, increase a delay counter to find the delay where you don’t get caught. Pass the delay to a method of the firewall:

func (f Firewall) Caught(delay int) bool {
	// Move the scanners forward for `delay` picoseconds.
	for i := 0; i < delay; i++ {
		for _, l := range f {
			l.Move()
		}
	}

	// Pass through the firewall
	for _, l := range f {
		// Check if caught
		if l.Caught() {
			return true
		}
		// Move the scanners forward
		for _, s := range f {
			s.Move()
		}
	}
	return false
}

If you try it this way, you’ll be waiting a long time. A really long time. If you look at the structure of that function, there are a number of nested loops. In my input file, there are 98 layers in the firewall. This means that, with a delay of 100 picoseconds, there would be 98*(100+98)=19404 calls to move scanners. But you also had to check the 99 delays before 100 picoseconds: 98*(100+98) + 98*(99+98) + 98*(98+98) + ... + 98*(0+98). At just a delay of 100, that already results in a huge amount of calls to move scanners. And I can tell you, the solution delay is a lot higher! But you know, you don’t have to trust my math. Trust my benchmarking. The test framework that comes bundled with Go also has a very powerful benchmarking feature. The benchmarking feature executes all functions with a Benchmark prefix, located in files with a _test.go suffix. The function below runs the Caught method with a delay of 10000. The number of times the function is executed, b.N, is determined by the test framework. The Reset function resets the position of the scanners for each of the layers, to avoid creating new objects all the time.

func BenchmarkPartB(b *testing.B) {
	fw := Firewall([]Layer{
		&SecurityLayer{0, 4, 0, true},
		&SecurityLayer{1, 2, 0, true},
		&SecurityLayer{2, 3, 0, true},
		&ZeroLayer{},
		&SecurityLayer{4, 5, 0, true},
		&ZeroLayer{},
		&SecurityLayer{6, 8, 0, true},
		&ZeroLayer{},
		&SecurityLayer{8, 6, 0, true},
		&ZeroLayer{},
	})

	for i := 0; i < b.N; i++ {
		fw.Reset()
		fw.Caught(10000)
	}
}

Running the benchmark is done by adding the -bench commandline argument to the go test command.

thomas@local:~$ go test -bench=. github.com/blackskad/adventofcode/2017/day13/
goos: darwin
goarch: amd64
pkg: github.com/blackskad/adventofcode/2017/day13
BenchmarkPartA-4   	    2000	    552338 ns/op
PASS
ok  	github.com/blackskad/adventofcode/2017/day13	1.175s

As you can see, it doesn’t really give you a lot of information by default. Only that the loop ran 2000 times, it took 550000 ns per iteration and the whole benchmark ran for 1.175s. Right, not a lot of inforamtion to find the hotspots in our code at first sight. If you look at your filesystem, there will be a binary file with more information that you can look at with a tool called pprof. Fortunately, there are other tools that are more visual than pprof. One of them is go-torch to generate flamegraphs of your program. The installation is quite easy, just make sure the git clone is part of your $PATH. I’ve put it in my current working directory.

thomas@local:~$ go get github.com/uber/go-torch
thomas@local:~$ git clone https://github.com/brendangregg/FlameGraph.git

To create the flamegraph, we’ll need to pass in the cpuprofile commandline flag to the go test command. With the resulting files, we can call the go-torch to create the actual flamegraph.

thomas@local:~$ go test -bench=. -cpuprofile=day13.prof github.com/blackskad/adventofcode/2017/day13/
goos: darwin
goarch: amd64
pkg: github.com/blackskad/adventofcode/2017/day13
BenchmarkPartA-4   	    3000	    562426 ns/op
PASS
ok  	github.com/blackskad/adventofcode/2017/day13	1.924s

thomas@local:~$ $GOPATH/bin/go-torch day13.test day13.prof
INFO[16:44:46] Run pprof command: go tool pprof -raw -seconds 30 day13.test day13.prof
INFO[16:44:47] Writing svg to torch.svg

The image below is the flamegraph for this very slow implementation. It’s a little small here, but open it in a new tab, and you’ll see the real magic. The image allows you to drill down in your callstack. The width of the bar represents the number of samples for that function. The wider the bar, the more time is spend in the function. As you can see, we’re spending nearly 54% of the samples in the Move function.

Slowest flamegraph

There are a number of ways to optimize this and reduce the number of Move calls. The obvious first place is the loop to skip the delay. Instead of doing delay single move operations, can’t we figure out a way to do a single jump operation? Of course we can, with a bit of math:

func (l *SecurityLayer) Jump(delay int) {
	moves := delay % (2 * (l.Range - 1))
	if moves < l.Range {
		l.Scanner = moves
		l.Dir = (l.Scanner < l.Range-1)
	} else {
		l.Scanner = 2*(l.Range-1) - moves
		l.Dir = (l.Scanner == 0)
	}
}

The new Caught method now looks like this:

func (f Firewall) Caught(delay int) bool {
	// Move the scanners forward for `delay` picoseconds.
	for _, l := range f {
		l.Jump(delay)
	}

	// Pass through the firewall
	for _, l := range f {
		// Check if caught
		if l.Caught() {
			return true
		}
		// Move the scanners forward
		for _, s := range f {
			s.Move()
		}
	}
	return false
}

The benchmark looks a lot better now. The time for a single iteration went down to 500ns and we were able to run 3 million iterations.

thomas@localhost:~$ go test -bench=. -cpuprofile=day13.prof github.com/blackskad/adventofcode/2017/day13/
goos: darwin
goarch: amd64
pkg: github.com/blackskad/adventofcode/2017/day13
BenchmarkPartA-4   	 3000000	       536 ns/op
PASS
ok  	github.com/blackskad/adventofcode/2017/day13	2.323s

The flamegraph also looks a lot better. The Move function was reduced to a little over 30% of the samples. The new Jump function takes about 14% and now the Caught function also shows up at around 6%.

Flamegraph using jump

But surely we can do even better, no? Ofcourse we can. We’re calculating all the positions of the scanners at all times. But we’re not interested in the position of scanner in layer N at time (delay+N-1) or (delay+N+1). We only want to know where the scanner in layer N at time delay+N. So let’s simply loop through the layers and immediately jump to the position that we need to evaluate.

func (f Firewall) Caught(delay int) bool {
	// Pass through the firewall
	for i, l := range f {
		// Move the scanner forward for `delay + i` picoseconds.
		l.Jump(delay + i)
		// Check if caught
		if l.Caught() {
			return true
		}
	}
	return false
}

Oh hey, look, no more calls to Move! And the time per iteration dropped to 180 ns, allowing for 10 million iterations.

thomas@localhost:~$ go test -bench=. -cpuprofile=day13.prof github.com/blackskad/adventofcode/2017/day13/
goos: darwin
goarch: amd64
pkg: github.com/blackskad/adventofcode/2017/day13
BenchmarkPartA-4   	10000000	       184 ns/op
PASS
ok  	github.com/blackskad/adventofcode/2017/day13	2.233s

Flamegraph only using Jump

But now another function shows up on the flamegraph: the Reset function takes 13% of the samples, but doesn’t actually contribute anything to the solution. We should be able to get rid of it. There is no reason to keep track of the position and direction of the scanner anymore. So let’s create a new function PositionAt that returns the position of the scanner at time N. If that function returns 0, the packet is caught.

func (l *SecurityLayer) PositionAt(delay int) int {
	moves := delay % (2 * (l.Range - 1))
	if moves < l.Range {
		return moves
	}
	return 2*(l.Range-1) - moves
}

func (f Firewall) Caught(delay int) bool {
	// Pass through the firewall
	for i, l := range f {
		// Calculate the position and check if caught
		if l.PositionAt(delay+i) == 0 {
			return true
		}
	}
	return false
}

Well look at that, we’ve got it down to 117ns per iteration at 20 million iterations! And the flamegraph only contains useful things!

thomas@localhost:~$ go test -bench=. -cpuprofile=day13.prof github.com/blackskad/adventofcode/2017/day13/
goos: darwin
goarch: amd64
pkg: github.com/blackskad/adventofcode/2017/day13
BenchmarkPartA-4   	20000000	       117 ns/op
PASS
ok  	github.com/blackskad/adventofcode/2017/day13	2.624s

Flamegraph using PositionAt

So here we are. I can finally calculate my part B solution in a very fast way. The delay I was looking for was 3830344 by the way. The full solution is available in my github repository. See you all tomorrow!