Day 13: Packet Scanners
# 13 Dec 2017Notice: If you have coding experience, I highly recommend to try and solve the puzzle in a language you already know. It's a lot of fun and you'll be able to focus on Go more later on.
Was one of your previous solutions really really slow? We’re going to fix that today. Go read the problem description, it’s a fun one. Go on, I’ll wait here until you’re ready.
Good, you’re back! Let’s get started. On the agenda for today: more on interfaces and benchmarking.
Part A: interfaces
You may have noticed it through the previous posts, but I like to write the solution as a close model of the problem. I would rather create a a bunch of structs, rather than to use a mathematical solution. Sometimes it works out alright, sometimes it doesn’t. In this case, I’ll start with a Firewall
struct that is a slice of layers. It has a method called Passthrough
that calculates the summed severity of passing through all the layers without delay.
type Firewall []Layer
func (f Firewall) Passthrough() int {
sev := 0
// Pass through the firewall
for _, l := range f {
// Check if caught at this layer
if l.Caught() {
sev += l.Severity()
}
// Move the scanners on all layers forward
for _, s := range f {
s.Move()
}
}
return sev
}
Okay, that looks simple enough. Looking at the program description though, there is no continuous list of layers. Some layers are empty and don’t have a Scanners. So there are two different behaviours, but the Firewall only consists of Layer
objects. Maybe we should make Layer
an interface. To get the Passthrough
function to work, the interface will need three methods: Caught
, Severity
and Move
.
type Layer interface{
// Check if the scanner is in a position to catch us.
Caught() bool
// Calculate the severity of the layer.
Severity() int
// Move the scanner one position.
Move()
}
To implement this interface, the types only need to have these methods. There is no explicit reference to the Layer type. The compiler looks at the methods of the struct and determines on its own if this type matches the functions in the interface. To avoid the hard parts first, here’s the interface implementation for layers without scanner:
type ZeroLayer struct{}
func (l ZeroLayer) Caught() bool {
return false
}
func (l ZeroLayer) Severity() int {
return 0
}
func (l ZeroLayer) Move() {
// noop
}
Really, that’s all there is to it. The packet is never caught, severity is always 0 and there is no scanner to move. And because the ZeroLayer struct has the three methods, it implements the Layer interface. The implementation for layers with a scanner, is a bit more complex as it contains actual logic:
type SecurityLayer struct {
Depth int
Range int
Scanner int
Dir bool
}
func (l SecurityLayer) Caught() bool {
return l.Scanner == 0
}
func (l SecurityLayer) Severity() int {
return l.Depth * l.Range
}
func (l *SecurityLayer) Move() {
if l.Dir {
l.Scanner++
// if scanner is at the end, we move back down again
l.Dir = (l.Scanner < l.Range-1)
} else {
l.Scanner--
// if scanner is 0, we move back up again
l.Dir = (l.Scanner == 0)
}
}
Part B: Benchmarking
Putting all these things together, gives us the solution for part A. The nice thing is, with these methods, we can also create a solution for part B. In the main function, increase a delay counter to find the delay where you don’t get caught. Pass the delay to a method of the firewall:
func (f Firewall) Caught(delay int) bool {
// Move the scanners forward for `delay` picoseconds.
for i := 0; i < delay; i++ {
for _, l := range f {
l.Move()
}
}
// Pass through the firewall
for _, l := range f {
// Check if caught
if l.Caught() {
return true
}
// Move the scanners forward
for _, s := range f {
s.Move()
}
}
return false
}
If you try it this way, you’ll be waiting a long time. A really long time. If you look at the structure of that function, there are a number of nested loops. In my input file, there are 98 layers in the firewall. This means that, with a delay of 100 picoseconds, there would be 98*(100+98)=19404
calls to move scanners. But you also had to check the 99 delays before 100 picoseconds: 98*(100+98) + 98*(99+98) + 98*(98+98) + ... + 98*(0+98)
. At just a delay of 100, that already results in a huge amount of calls to move scanners. And I can tell you, the solution delay is a lot higher! But you know, you don’t have to trust my math. Trust my benchmarking. The test framework that comes bundled with Go also has a very powerful benchmarking feature. The benchmarking feature executes all functions with a Benchmark
prefix, located in files with a _test.go
suffix. The function below runs the Caught method with a delay of 10000. The number of times the function is executed, b.N
, is determined by the test framework. The Reset function resets the position of the scanners for each of the layers, to avoid creating new objects all the time.
func BenchmarkPartB(b *testing.B) {
fw := Firewall([]Layer{
&SecurityLayer{0, 4, 0, true},
&SecurityLayer{1, 2, 0, true},
&SecurityLayer{2, 3, 0, true},
&ZeroLayer{},
&SecurityLayer{4, 5, 0, true},
&ZeroLayer{},
&SecurityLayer{6, 8, 0, true},
&ZeroLayer{},
&SecurityLayer{8, 6, 0, true},
&ZeroLayer{},
})
for i := 0; i < b.N; i++ {
fw.Reset()
fw.Caught(10000)
}
}
Running the benchmark is done by adding the -bench
commandline argument to the go test
command.
thomas@local:~$ go test -bench=. github.com/blackskad/adventofcode/2017/day13/
goos: darwin
goarch: amd64
pkg: github.com/blackskad/adventofcode/2017/day13
BenchmarkPartA-4 2000 552338 ns/op
PASS
ok github.com/blackskad/adventofcode/2017/day13 1.175s
As you can see, it doesn’t really give you a lot of information by default. Only that the loop ran 2000 times, it took 550000 ns per iteration and the whole benchmark ran for 1.175s. Right, not a lot of inforamtion to find the hotspots in our code at first sight. If you look at your filesystem, there will be a binary file with more information that you can look at with a tool called pprof. Fortunately, there are other tools that are more visual than pprof. One of them is go-torch to generate flamegraphs of your program. The installation is quite easy, just make sure the git clone is part of your $PATH
. I’ve put it in my current working directory.
thomas@local:~$ go get github.com/uber/go-torch
thomas@local:~$ git clone https://github.com/brendangregg/FlameGraph.git
To create the flamegraph, we’ll need to pass in the cpuprofile commandline flag to the go test
command. With the resulting files, we can call the go-torch to create the actual flamegraph.
thomas@local:~$ go test -bench=. -cpuprofile=day13.prof github.com/blackskad/adventofcode/2017/day13/
goos: darwin
goarch: amd64
pkg: github.com/blackskad/adventofcode/2017/day13
BenchmarkPartA-4 3000 562426 ns/op
PASS
ok github.com/blackskad/adventofcode/2017/day13 1.924s
thomas@local:~$ $GOPATH/bin/go-torch day13.test day13.prof
INFO[16:44:46] Run pprof command: go tool pprof -raw -seconds 30 day13.test day13.prof
INFO[16:44:47] Writing svg to torch.svg
The image below is the flamegraph for this very slow implementation. It’s a little small here, but open it in a new tab, and you’ll see the real magic. The image allows you to drill down in your callstack. The width of the bar represents the number of samples for that function. The wider the bar, the more time is spend in the function. As you can see, we’re spending nearly 54% of the samples in the Move function.
There are a number of ways to optimize this and reduce the number of Move calls. The obvious first place is the loop to skip the delay. Instead of doing delay
single move operations, can’t we figure out a way to do a single jump operation? Of course we can, with a bit of math:
func (l *SecurityLayer) Jump(delay int) {
moves := delay % (2 * (l.Range - 1))
if moves < l.Range {
l.Scanner = moves
l.Dir = (l.Scanner < l.Range-1)
} else {
l.Scanner = 2*(l.Range-1) - moves
l.Dir = (l.Scanner == 0)
}
}
The new Caught
method now looks like this:
func (f Firewall) Caught(delay int) bool {
// Move the scanners forward for `delay` picoseconds.
for _, l := range f {
l.Jump(delay)
}
// Pass through the firewall
for _, l := range f {
// Check if caught
if l.Caught() {
return true
}
// Move the scanners forward
for _, s := range f {
s.Move()
}
}
return false
}
The benchmark looks a lot better now. The time for a single iteration went down to 500ns and we were able to run 3 million iterations.
thomas@localhost:~$ go test -bench=. -cpuprofile=day13.prof github.com/blackskad/adventofcode/2017/day13/
goos: darwin
goarch: amd64
pkg: github.com/blackskad/adventofcode/2017/day13
BenchmarkPartA-4 3000000 536 ns/op
PASS
ok github.com/blackskad/adventofcode/2017/day13 2.323s
The flamegraph also looks a lot better. The Move function was reduced to a little over 30% of the samples. The new Jump function takes about 14% and now the Caught function also shows up at around 6%.
But surely we can do even better, no? Ofcourse we can. We’re calculating all the positions of the scanners at all times. But we’re not interested in the position of scanner in layer N at time (delay+N-1)
or (delay+N+1)
. We only want to know where the scanner in layer N at time delay+N
. So let’s simply loop through the layers and immediately jump to the position that we need to evaluate.
func (f Firewall) Caught(delay int) bool {
// Pass through the firewall
for i, l := range f {
// Move the scanner forward for `delay + i` picoseconds.
l.Jump(delay + i)
// Check if caught
if l.Caught() {
return true
}
}
return false
}
Oh hey, look, no more calls to Move! And the time per iteration dropped to 180 ns, allowing for 10 million iterations.
thomas@localhost:~$ go test -bench=. -cpuprofile=day13.prof github.com/blackskad/adventofcode/2017/day13/
goos: darwin
goarch: amd64
pkg: github.com/blackskad/adventofcode/2017/day13
BenchmarkPartA-4 10000000 184 ns/op
PASS
ok github.com/blackskad/adventofcode/2017/day13 2.233s
But now another function shows up on the flamegraph: the Reset function takes 13% of the samples, but doesn’t actually contribute anything to the solution. We should be able to get rid of it. There is no reason to keep track of the position and direction of the scanner anymore. So let’s create a new function PositionAt
that returns the position of the scanner at time N. If that function returns 0, the packet is caught.
func (l *SecurityLayer) PositionAt(delay int) int {
moves := delay % (2 * (l.Range - 1))
if moves < l.Range {
return moves
}
return 2*(l.Range-1) - moves
}
func (f Firewall) Caught(delay int) bool {
// Pass through the firewall
for i, l := range f {
// Calculate the position and check if caught
if l.PositionAt(delay+i) == 0 {
return true
}
}
return false
}
Well look at that, we’ve got it down to 117ns per iteration at 20 million iterations! And the flamegraph only contains useful things!
thomas@localhost:~$ go test -bench=. -cpuprofile=day13.prof github.com/blackskad/adventofcode/2017/day13/
goos: darwin
goarch: amd64
pkg: github.com/blackskad/adventofcode/2017/day13
BenchmarkPartA-4 20000000 117 ns/op
PASS
ok github.com/blackskad/adventofcode/2017/day13 2.624s
So here we are. I can finally calculate my part B solution in a very fast way. The delay I was looking for was 3830344 by the way. The full solution is available in my github repository. See you all tomorrow!