Google's FooBar Challenge

Lachlan Gibson

24 May 2023, updated 8 June 2023

Recently, I completed Google's FooBar challenge in April 2023. This involved solving a series of mathematical and coding problems using Python 2.7 or Java. The challenge presented five levels of increasing difficulty, with 1-3 tasks per level. In this document, I summarise the tasks I undertook and explain how I resolved each one. For further insight into the challenge, see articles by Turing and Medium.

The document is structured into five sections, each pertaining to a different challenge level. Every level contains a section for each task. Within each task section, there are three main components. The first is a simplified task description, adapted from the original text. The second part consists of an explanation of how I approached and solved the problem. The final part features my Python code, which can be revealed by clicking on the corresponding box. The constraints for coding in Python or Java are delineated below.

Should you wish to attempt a task yourself, I suggest you firstly peruse the task description and attempt the problem without reading my solution. Feel free to search online for any pertinent mathematical or coding principles, but refrain from seeking a direct solution to the problem. If you find yourself stuck for an extended period, then consider reading my solution explanation, but do not yet reveal my code. I recommend revealing and examining my code only if you are absolutely stuck, or if you wish to compare your solution with mine. Remember, some of the challenges are complex and may take several hours or even several days to solve. Enjoy the process!

Challenge 1 Challenge 2 Challenge 3 Challenge 4 Challenge 5 Bonus

Python Constraints
==================
Your code will run inside a Python 2.7.13 sandbox. All tests will be run by calling the solution() function. Standard libraries are supported except for bz2, crypt, fcntl, mmap, pwd, pyexpat, select, signal, termios, thread, time, unicodedata, zipimport, zlib. Input/output operations are not allowed. Your solution must be under 32000 characters in length including new lines and other non-printing characters.

Java Constraints
================
Your code will be compiled using standard Java 8. All tests will be run by calling the solution() method inside the Solution class. Execution time is limited. Wildcard imports and some specific classes are restricted (e.g. java.lang.ClassLoader). You will receive an error when you verify your solution if you have used a restricted class. Third-party libraries, input/output operations, spawning threads or processes and changes to the execution environment are not allowed. Your solution must be under 32000 characters in length including new lines and other non-printing characters.

Challenge 1

The first level contained only a single task, titled Re-ID, which involved efficiently generating prime numbers. A task summary, adapted from the original challenge instructions, is given below.

Re-ID
=====

Write a function solution(n) which takes in the starting index n of a string of all primes, "2357111317192329...", and returns the next five digits in the string. For example, if n is 3 then their ID number will be "71113". The value of n will always be between 0 and 10000.

-- Test cases --
Input : solution(0)
Output: 23571

Input : solution(3)
Output: 71113

If the string of prime numbers is known, then the solution is trivially given by returning the five digits starting at index $n$ . Therefore, I solved this problem by defining a function that firstly generated this string of primes. This function works by checking the primality of all integers from $2$ to $m=20+\lfloor n \ln 10 \rfloor$ . This choice of $m$ ensures that the resulting string of primes is at least $n+5$ characters long, without too much excessive computation.

This is a result of the prime number theorem which gives an asymptotic expression for the prime counting function,

\pi(m)\sim\frac{m}{\ln m},

which counts the number of prime numbers $\leqslant m$ . The maximum number of digits each prime can have is $\lceil \log m \rceil$ (base 10). Therefore, having $n+5 < \pi(m)\lceil \log m \rceil$ should ensure the resulting string is long enough when $n$ is large. This condition is satisfied by

m=20+\lfloor n \ln 10 \rfloor,

which I checked empirically worked for when $n$ is low.

Each integer is considered prime if no previously identified prime numbers divide it. This involves checking all previously identified prime numbers less than or equal to the square-root of the current candidate number (there is no point checking factors higher than the square-root because they all pair with divisors less than the square-root). Once the list of prime numbers is generated then the integers are converted to strings before being joined into a single string.

--- Reveal Code ---

Challenge 2

The second level contained two tasks, titled Please Pass the Coded Messages and Bunny Worker Locations.

Please Pass the Coded Messages
==============================

You have L, a list containing some digits (0 to 9). Write a function solution(L) which finds the largest number that can be made from some or all of these digits and is divisible by 3. If it is not possible to make such a number, return 0 as the solution. L will contain anywhere from 1 to 9 digits. The same digit may appear multiple times in the list, but each element in the list may only be used once.

-- Test cases --
Input : solution([3, 1, 4, 1])
Output: 4311

Input : solution([3, 1, 4, 1, 5, 9])
Output: 94311

To solve this problem, let's first consider the simpler problem of finding the largest number containing all the digits in the list L. Obviously, this is achieved by sorting the list with the largest on the left to the smallest on the right, so that the larger digits correspond to the higher decimal positions. Removing any digit will strictly decrease the size of the number (except when all digits are zero). Therefore, this result forms an upper bound to the original problem of finding the largest number that is also divisible by three.

Now we can make use of the property of decimal numbers that they are divisible by 3 if and only if the sum of their digits is also divisible by three. Or, more generally, that the number is congruent modulo 3 to its digit sum. Since the order of a summation does not effect the result, we can use this rule to check the modularity of a list of numbers without knowing the order. Next we note that removing any two digits never lowers the value by less than only removing any one digit. Also note that removing a larger digit will lower the value by more than removing a lower digit. Therefore, we can solve the problem by firstly minimising the number of digits that need to be removed, and then minimising the values of the removed digits.

The sum of digits can be congruent to 0, 1 or 2 (mod 3). In the first case, any permutation of all digits is divisible by 3 without removing any digits. Therefore, the solution returns the number containing all the digits in descending order. If the sum is congruent to 1 or 2, then there are three cases. Firstly, one of the digits is congruent to the sum, in which case, we just remove the smallest such digit. Secondly, two of the digits are congruent to twice the sum, in which case we remove the two smallest such digits. Thirdly, no combination of digits is divisible by 3, so return 0. After removing any digits, the solution is the number containing all the remaining digits in descending order. Note, that if there are no remaining digits, then return 0.

--- Reveal Code ---

Bunny Worker Locations
======================

In a triangular grid of values each cell can be represented as points (x, y), with x being the distance from the vertical wall, and y being the height from the ground. For example, the value at (1, 1) is 1, the value at (3, 2) is 9, and the value at (2,3) is 8. This pattern of numbering continues indefinitely.

| 7
| 4 8
| 2 5 9
| 1 3 6 10

Write a function solution(x, y) which returns the grid value at the location (x, y). Both x and y will be at least 1 and no greater than 100,000. Since the values can be very large, return your solution as a string representation of the number.

-- Test cases --
Input : solution(5, 10)
Output: 96

Input : solution(3, 2)
Output: 9

The grid changes by 1 each step along a diagonal, which gives the following recursive relationships

f(x,y)=f(x+1,y-1)-1,

f(x,y)=f(x+k,y-k)-k,

where $k$ can be any integer satisfying $0 < k < y$ . The grid coordinates can be as large as 100000 so generating the entire grid would probably be much slower than directly calculating the values from a closed form expression for $f(x,y)$ . To find one, I first noticed that the first row is given by the triangular numbers. This makes sense since the values are given by the size of each diagonal, which grow by 1 each time. Therefore,

f(x,1)=\sum_{k=1}^x k=\frac{x(x+1)}{2}.

So, we can get a closed form expression by choosing $k=y-1$ in the recursive relationship and then using the result for $f(x,1)$ by substituting $x\rightarrow x+y-1$

f(x,y)=\frac{(x+y-1)(x+y)}{2}-y+1.

--- Reveal Code ---

Challenge 3

The third level contained three tasks, titled Prepare the Bunnies' Escape, The Grandest Staircase Of Them All and Fuel Injection Perfection.

Prepare the Bunnies' Escape
===========================

Write a function solution(map) that generates the length of the shortest path along a rectangular grid from (0,0) to (w-1,h-1), where w is the grid width, h is the grid height and the map is represented as a matrix of 0s and 1s, where 0s are passable space and 1s are impassable walls. The path may remove up to one wall. The path length is the total number of nodes you pass through, counting both the entrance and exit nodes. The starting and ending positions are always passable (0). The map will always be solvable, though you may or may not need to remove a wall. The height and width of the map can be from 2 to 20. Moves can only be made in cardinal directions; no diagonal moves are allowed.

-- Test cases --
Input : solution([[0, 1, 1, 0], [0, 0, 0, 1], [1, 1, 0, 0], [1, 1, 1, 0]])
Output: 7

Input : solution([[0, 0, 0, 0, 0, 0], [1, 1, 1, 1, 1, 0], [0, 0, 0, 0, 0, 0], [0, 1, 1, 1, 1, 1], [0, 1, 1, 1, 1, 1], [0, 0, 0, 0, 0, 0]])
Output: 11

To solve this problem I represent maps as graphs where nodes correspond to passable spaces and edges correspond to adjacent passable spaces. Then I use the A* algorithm to find the shortest path length from the start and end nodes. To improve the efficiency of the algorithm I use the Manhattan distance as a lower bound to the remaining distance between any node and the end node. After implementing the A* algorithm I use it to find the shortest path lengths of all possible maps with exactly 1 or 0 walls removed. The function then returns the smallest of these path lengths.

--- Reveal Code ---

The Grandest Staircase Of Them All
==================================

Write a function called solution(n) that takes a positive integer n and returns the number of different staircases that can be built from exactly n bricks. n will always be between 3 and 200 (inclusive). Each type of staircase should consist of 2 or more steps with no two steps being at the same height. All steps must contain at least one brick. A step's height is classified as the total amount of bricks that make up that step.

For example, when n = 3, you have only 1 choice of how to build the staircase, with the first step having a height of 2 and the second step having a height of 1: (# indicates a brick)

#
##
21

When n = 4, you still only have 1 staircase choice:

#
#
##
31

But when n = 5, there are two ways you can build a staircase from the given bricks. The two staircases can have heights (4, 1) or (3, 2), as shown below:

#
#
#
##
41

#
##
##
32

-- Test cases --
Input : solution(200)
Output: 487067745

Input : solution(3)
Output: 1

To solve this problem I used the principles of dynamic programming to write a recursive function which relied on memoisation to run efficiently. Consider the function $w(n,m,s)$ which calculates the number of valid staircases that can be built with $n$ bricks, with each stair containing a maximum of $m$ bricks and at least $s$ more steps. We can express $w$ as a recursive function. Specifically, the number of staircases that can be built is the sum of the number of staircases that can be built given the size of the next step. For example, if there are 5 bricks, then the number of possible staircases is the number of staircases that end with a height of 1, plus the number that end with a height of 2, plus a height of 3 and plus a height of 4.

There are two base cases to this recursion. The first occurs when there are too many bricks. This occurs when the number of bricks is more than a maximal staircase with a constant step size of 1, $n > m(m+1)/2$ . In this case, the number of valid staircases is 0. The second base case occurs when there are no bricks $n=0$ , and the minimum number of additional steps is less than zero ( $s\leqslant 0$ ). In this case, the number of valid staircases is 1. Therefore,

w(n,m,s)=\begin{cases} 0,\quad n > m(m+1)/2\\[0.5em] 1,\quad n=0 \text{ and } s\leqslant 0 \\[0.5em] \sum_{k=1}^{\min(n,m)} w(n-k,k-1,s-1) \end{cases}

with the solution is given by $w(n,n,2)$ .

--- Reveal Code ---

Fuel Injection Perfection
=========================

Write a function called solution(n) which takes a positive integer as a string and returns the minimum number of operations needed to transform n to 1. Quantities will not exceed 309 digits long. The three allowed operations are +1, -1 and /2 (if the quantity is even).

For example:
n = 4 requires two operations: 4 -> 2 -> 1
n = 15 requires five operations: 15 -> 16 -> 8 -> 4 -> 2 -> 1

-- Test cases --
Input : solution('15')
Output: 5

Input : solution('4')
Output: 2

To solve the problem we can identify several principles for minimising the number of operations to reach 1. Firstly, no shortest trajectory would contain a $+1$ operation immediately followed by a $-1$ operation, or vice versa, because any trajectory containing any such consecutive operations could be shortened by 2 steps by removing them. Secondly, even numbers should always be divided by 2. This is because a trajectory that repeats the other operations followed by a division can be shortened by dividing first. For example,

\begin{aligned} &2k \rightarrow 2k\pm 1 \rightarrow 2k\pm 2 \rightarrow k\pm 1, \\ &2k \rightarrow k \hphantom{2 + 1} \rightarrow k\pm 1, \end{aligned}

where $k$ is any natural number. Basically, the number of consecutive addition or subtraction operations can be cut in half (or half plus 1) by dividing first. Note that this assumes that eventually there will be another halving operation. Except in the cases of $n=2$ and $n=3$ where both halving and subtracting 1 will work, in every other case halving requires fewer operations than subtracting 1. Specifically, continuously subtracting 1 requires exactly $n-1$ operations, while halving every even and subtracting 1 off every odd number will require no more than $2\lceil \log_2 (n+1) \rceil -2$ (when $n$ is one less than a power of 2) operations and no less than $\lceil \log_2 n \rceil$ (when $n$ is a power of 2) operations. Obviously, repeated additions can never reach 1 since they strictly increase the number. Therefore, for all $n>3$ there exists a shortest sequence of operations whereby every even value is halved, and the number of even values is at least one.

To decide on a rule for odd values, let us consider the two possible cases where $n=4k+1$ and $n=4k+3$ for any natural number, $k$ . In the first case we can draw the following tree.

flow diagram
In this case adding 1 inevitably leads to either $k$ or $k+1$ , both of which can be reached in fewer or equal number of operations by subtracting 1. Therefore, when $n=4k+1$ subtracting 1 is weakly faster than adding 1. Given this strategy, we can make an analogous decision tree for the $n=4k+3$ case.

flow diagram
Now choosing to subtract 1 will inevitably lead to either $k$ or $k+1$ , both of which can be reached in fewer or equal number of operations by adding 1. Therefore, when $n=4k+3$ adding 1 is weakly faster than subtracting 1.

Finally, let us consider the simple small cases of which cannot be expressed as $n=4k+r$ . When $n=1$ we do not need any operations so return 0, when $n=2$ follow the halving even rule to get to 1 needs only 1 operation, and when $n=3$ subtracting 1, rather than adding 1 is faster. Therefore, we can define the following recursive function to solve the problem,

f(n)=\begin{cases} 0, & n = 1,\\[0.5em] 1 + f(n/2), & n \equiv 0 \mod{2},\\[0.5em] 1 + f(n-1), & n \equiv 1 \mod{4}, \text{ or } n=3, \\[0.5em] 1 + f(n+1), & \text{otherwise.} \end{cases}

This solution can be implemented using a recursive function or a while loop.

--- Reveal Code ---

Challenge 4

The fourth level contained two tasks, titled Bringing a Gun to a Trainer Fight and Free the Bunny Workers.

Bringing a Gun to a Trainer Fight
=================================

Write a function solution(dimensions, your_position, trainer_position, distance) that gives an array of 2 integers of the width and height of the room, an array of 2 integers of your x and y coordinates in the room, an array of 2 integers of the trainer's x and y coordinates in the room, and returns an integer of the number of distinct directions that you can fire to hit the elite trainer, given the maximum distance that the beam can travel.

The room has integer dimensions [1 < x_dim <= 1250, 1 < y_dim <= 1250]. You and the elite trainer are both positioned on the integer lattice at different distinct positions (x, y) inside the room such that [0 < x < x_dim, 0 < y < y_dim]. Finally, the maximum distance that the beam can travel before becoming harmless will be given as an integer 1 < distance <= 10000.

For example, if you and the elite trainer were positioned in a room with dimensions [3, 2], your_position [1, 1], trainer_position [2, 1], and a maximum shot distance of 4, you could shoot in seven different directions to hit the elite trainer (given as vector bearings from your location): [1, 0], [1, 2], [1, -2], [3, 2], [3, -2], [-3, 2], and [-3, -2]. As specific examples, the shot at bearing [1, 0] is the straight line horizontal shot of distance 1, the shot at bearing [-3, -2] bounces off the left wall and then the bottom wall before hitting the elite trainer with a total shot distance of sqrt(13), and the shot at bearing [1, 2] bounces off just the top wall before hitting the elite trainer with a total shot distance of sqrt(5).

-- Test cases --
Input : solution([3,2], [1,1], [2,1], 4)
Output: 7

Input : solution([300,275], [150,150], [185,100], 500)
Output: 9

The key insight which helped me solve this problem was to view reflections as additional rooms (but mirrored). Therefore, the reflections form an infinite grid of repeated mirror copies of the room. This concept is illustrated below using the example in the task description. Therefore, I solved this problem by following these steps. Firstly, I generated all coordinates corresponding to all positions (both yours and the targets, both real and reflected) in a square shape with side length larger than the allowed distance. Then I filtered these coordinates to only allow points within the allowed distance to you. From these coordinates, I identified a set of directions, which were the vectors from you to the coordinate, scaled by the greatest common divisor of both dimensions. For each direction the shortest distance was saved. Next I removed any directions to hit the target that would hit yourself first (directions that appear in both sets in which the distance to hit yourself was shorter.) The final result is the size of the set of directions to hit the target.

7	Y	T	\|	T	Y	\|	Y	T	\|	T	Y	\|	Y
6	—	—	+	—	—	+	—	—	+	—	—	+	—
5	Y	T	\|	T	Y	\|	Y	T	\|	T	Y	\|	Y
4	—	—	+	—	—	+	—	—	+	—	—	+	—
3	Y	T	\|	T	Y	\|	Y	T	\|	T	Y	\|	Y
2	—	—	+	—	—	+	—	—	+	—	—	+	—
1	Y	T	\|	T	Y	\|	Y	T	\|	T	Y	\|	Y
0	—	—	+	—	—	+	—	—	+	—	—	+	—
-1	Y	T	\|	T	Y	\|	Y	T	\|	T	Y	\|	Y
-2	—	—	+	—	—	+	—	—	+	—	—	+	—
-3	Y	T	\|	T	Y	\|	Y	T	\|	T	Y	\|	Y
-4	—	—	+	—	—	+	—	—	+	—	—	+	—
-5	Y	T	\|	T	Y	\|	Y	T	\|	T	Y	\|	Y
	-5	-4	-3	-2	-1	0	1	2	3	4	5	6	7

Illustrated example from the task description. The shaded region represents cells in range of you. Cells shaded in green denote the directions which would hit the target.

--- Reveal Code ---

Free the Bunny Workers
======================

Given the number of bunnies available and the number of locks required to open a work room, write a function solution(num_buns, num_required) which returns a specification of how to distribute the keys such that any num_required bunnies can open the locks, but no group of (num_required - 1) bunnies can.

Each lock is numbered starting from 0. The keys are numbered the same as the lock they open (so for a duplicate key, the number will repeat, since it opens the same lock). For a given bunny, the keys they get is represented as a sorted list of the numbers for the keys. To cover all of the bunnies, the final solution is represented by a sorted list of each individual bunny's list of keys. Find the lexicographically least such key distribution - that is, the first bunny should have keys sequentially starting from 0.

num_buns will always be between 1 and 9, and num_required will always be between 0 and 9 (both inclusive). For example, if you had 3 bunnies and required only 1 of them to open the cell, you would give each bunny the same key such that any of the 3 of them would be able to open it, like so:

[
[0],
[0],
[0],
]

If you had 2 bunnies and required both of them to open the cell, they would receive different keys (otherwise they wouldn't both actually be required), and your solution would be as follows:

[
[0],
[1],
]

Finally, if you had 3 bunnies and required 2 of them to open the cell, then any 2 of the 3 bunnies should have all of the keys necessary to open the cell, but no single bunny would be able to do it. Thus, the solution would be:

[
[0, 1],
[0, 2],
[1, 2],
]

-- Test cases --
Input : solution(4, 4)
Output: [[0], [1], [2], [3]]

Input : solution(5, 3)
Output: [[0, 1, 2, 3, 4, 5], [0, 1, 2, 6, 7, 8], [0, 3, 4, 6, 7, 9], [1, 3, 5, 6, 8, 9], [2, 4, 5, 7, 8, 9]]

Input : solution(2, 1)
Output: [[0], [0]]

We know that no combination of num_required - 1 bunnies will collectively have all the required keys, they must be missing at least one key for a required lock. However, including any other additional bunny will complete the set of keys. Therefore, for every combination of num_required - 1 bunnies, every other bunny holds the missing keys. Therefore, we can minimally issue keys by assigning a set of keys for each lock to a unique combination of num_buns - num_required + 1. In this way, every possible combination of num_required will hold all required keys, and every possible combination of num_required - 1 will not. We know this minimises the total number of keys because removing any key will result in a combination of bunnies that cannot open that lock, regardless of how the keys are assigned.

--- Reveal Code ---

Challenge 5

The fifth and last level contained only a single task, titled Dodge the Lasers!

Dodge the Lasers!
=================

Write a function solution(str_n) which, given the string representation of an integer n, returns the sum of (floor(1*sqrt(2)) + floor(2*sqrt(2)) + ... + floor(n*sqrt(2))) as a string. That is, for every number i in the range 1 to n, it adds up all of the integer portions of i*sqrt(2).

For example, if str_n was "5", the solution would be calculated as
floor(1*sqrt(2)) +
floor(2*sqrt(2)) +
floor(3*sqrt(2)) +
floor(4*sqrt(2)) +
floor(5*sqrt(2))
= 1+2+4+5+7 = 19
so the function would return "19".

str_n will be a positive integer between 1 and 10^100, inclusive. Since n can be very large (up to 101 digits!), using just sqrt(2) and a loop won't work. Sometimes, it's easier to take a step back and concentrate not on what you have in front of you, but on what you don't.

-- Test cases --
Input : solution('77')
Output: 4208

Input : solution('5')
Output: 19

The task is essentially to code a function that can evaluate

f(n) =\sum_{i=1}^n \lfloor i \sqrt{2} \rfloor

for any integer $1 \leqslant n \leqslant 10^{100}$ . In my first attempt to solve this problem I used the Fourier series to derive an approximate solution, and then guess and checked to "hack" the verification process to tweak the result to pass the checks. The fractional part of a number $x$ , the difference between the number and its floor, forms a sawtooth function with a known Fourier series,

\lbrace x\rbrace = x - \lfloor x \rfloor = \frac{1}{2} - \frac{1}{\pi} \sum_{k=1}^\infty\frac{\sin(2\pi k x)} {k}.

Therefore, we can express the function $f(n)$ as an infinite series,

\begin{aligned} f(n) &= \sum_{i=1}^n \left\lfloor i \sqrt{2} \right\rfloor, \\ &= \sum_{i=1}^n \left[ i \sqrt{2} -\frac{1}{2} + \frac{1}{\pi} \sum_{k=1}^\infty\frac{\sin(2\pi k i \sqrt{2})}{k} \right], \\ &= \frac{n(n+1)}{\sqrt{2}} -\frac{n}{2} + \frac{1}{\pi} \sum_{k=1}^\infty \frac{\sin(\sqrt{2}\pi k n) \sin(\sqrt{2}\pi k (n+1))}{k \sin(\sqrt{2}\pi k)}. \end{aligned}

The problem with this approach is that the infinite series sometimes converged very slowly, requiring too many terms to stabilise. Furthermore, passing the checks via guess and check did not leave me with a satisfying ending to an otherwise rewarding challenge. Therefore, I continued working on the problem, and during some research I discovered a breakthrough. The key insight into solving this problem is to identify the summand as a Beatty sequence. The trick is to exploit the complementary Beatty sequence to create a recursive formula to compute the series. Let the largest term be $m=\lfloor n\sqrt{2} \rfloor$ ,

\begin{aligned} f(n) &= \sum_{k=1}^n \left\lfloor k\sqrt{2} \right\rfloor, \\ &= \sum_{k=1}^m k - \sum_{k=1}^{m-n} \left\lfloor k(2+\sqrt{2}) \right\rfloor, \\ &= \sum_{k=1}^m k - 2\sum_{k=1}^{m-n} k - \sum_{k=1}^{m-n} \left\lfloor k\sqrt{2} \right\rfloor, \\ &= \frac{m(m+1)}{2} - (m-n)(m-n+1) - f(m-n). \end{aligned}

The size of the function argument is reduced by a factor of at least $\sqrt{2}-1 < 0.5$ . Therefore, the depth of the recursion is bounded by $\log_2 n$ for large enough $n$ , and is approximately $-\log_2 n / \log_2(\sqrt{2}-1)$ .

Note that the code must be able to compute $m=\lfloor n\sqrt{2}\rfloor$ . The naive approach would require $\sqrt{2}$ to be known to very high precision. Instead, I accomplished this by computing $m=\lfloor\sqrt{2n^2} \rfloor$ using a form of Heron's method that was adapted for integer operations only.

--- Reveal Code ---

Bonus Tasks

After completing all five levels of the challenge, I found that I could still request new tasks. I believe these bonus tasks are alternative or old level 5 tasks. These extra tasks were titled Expanding Nebula, Disorderly Escape and Escape Pods.

Expanding Nebula
================
The gas of the steadily expanding nebula can be modelled as a 2D grid. You find that the current existence of gas in a cell of the grid is determined exactly by its 4 nearby cells, specifically, (1) that cell, (2) the cell below it, (3) the cell to the right of it, and (4) the cell below and to the right of it. If, in the current state, exactly 1 of those 4 cells in the 2x2 block has gas, then it will also have gas in the next state. Otherwise, the cell will be empty in the next state.

For example, let's say the previous state of the grid (p) was:
.O..
..O.
...O
O...

To see how this grid will change to become the current grid (c) over the next time step, consider the 2x2 blocks of cells around each cell. Of the 2x2 block of [p[0][0], p[0][1], p[1][0], p[1][1]], only p[0][1] has gas in it, which means this 2x2 block would become cell c[0][0] with gas in the next time step:
.O -> O
..

Likewise, in the next 2x2 block to the right consisting of [p[0][1], p[0][2], p[1][1], p[1][2]], two of the containing cells have gas, so in the next state of the grid, c[0][1] will NOT have gas:
O. -> .
.O

Following this pattern to its conclusion, from the previous state p, the current state of the grid c will be:
O.O
.O.
O.O

Note that the resulting output will have 1 fewer row and column, since the bottom and rightmost cells do not have a cell below and to the right of them, respectively.

Write a function solution(g) where g is an array of array of bools saying whether there is gas in each cell (the current scan of the nebula), and return an int with the number of possible previous states that could have resulted in that grid after 1 time step. For instance, if the function were given the current state c above, it would deduce that the possible previous states were p (given above) as well as its horizontal and vertical reflections, and would return 4. The width of the grid will be between 3 and 50 inclusive, and the height of the grid will be between 3 and 9 inclusive. The solution will always be less than one billion (10^9).

-- Test cases --
Input: solution([[True, True, False, True, False, True, False, True, True, False], [True, True, False, False, False, False, True, True, True, False], [True, True, False, False, False, False, False, False, False, True], [False, True, False, False, False, False, True, True, False, False]])
Output: 11567

Input: solution([[True, False, True], [False, True, False], [True, False, True]])
Output: 4

Input: solution([[True, False, True, False, False, True, True, True], [True, False, True, False, False, False, True, False], [True, True, True, False, False, False, True, False], [True, False, True, False, False, False, True, False], [True, False, True, False, False, True, True, True]])
Output: 254

The key to solving this problem is to use bitwise operations to significantly improve computational efficiency. Essentially, lists of booleans can be interpreted as a binary representation of an integer. For example, [True, False, True] would be 101 in binary which is equivalent to 5 in decimal. Bitwise operations allows us to compare two lists in a single operation. My solution works as follows.

g is a boolean array (list of lists of bools) that is an outcome of a previous state, s, such that for all rows and cols g[row][col] is True if and only if exactly one of the following is True: s[row][col], s[row+1][col], s[row][col+1], s[row+1][col+1]. solution(g) returns the number of unique previous states s that can produce g. It does this efficiently in four steps:

Rows of g and s are represented as the decimal integers that are equal to a binary integer where each digit is given by the column of that row. Representing the rows as integers allows efficient boolean computations using the bitwise operations &, |, <<, >>.
Every combination of row pairs of s is simulated to generate a possible row in g. If such a row exists in g, then these potential row pairs are recorded in a dictionary called potentials.
For every row in g, all possible values of the corresponding row in g are tracked as well as the number of unique ways to reach that combination, given all the previous rows.
The total number of unique states, s, is given by adding up the number of ways the last row of s is possible for each possible value of that last row.

--- Reveal Code ---

Disorderly Escape
=================
Write a function solution(w, h, s) that takes 3 integers and returns the number of unique, non-equivalent configurations that can be found on a star grid w blocks wide and h blocks tall where each celestial body has s possible states. Equivalency is defined as above: any two star grids with each celestial body in the same state where the actual order of the rows and columns do not matter (and can thus be freely swapped around). Star grid standardization means that the width and height of the grid will always be between 1 and 12, inclusive. And while there are a variety of celestial bodies in each grid, the number of states of those bodies is between 2 and 20, inclusive. The solution can be over 20 digits long, so return it as a decimal string. The intermediate values can also be large, so you will likely need to use at least 64-bit integers.

For example, consider w=2, h=2, s=2. We have a 2x2 grid where each celestial body is either in state 0 (for instance, silent) or state 1 (for instance, noisy). We can examine which grids are equivalent by swapping rows and columns.

00
00

In the above configuration, all celestial bodies are "silent" - that is, they have a state of 0 - so any swap of row or column would keep it in the same state.

00 00 01 10
01 10 00 00

1 celestial body is emitting noise - that is, has a state of 1 - so swapping rows and columns can put it in any of the 4 positions. All four of the above configurations are equivalent.

00 11
11 00

2 celestial bodies are emitting noise side-by-side. Swapping columns leaves them unchanged, and swapping rows simply moves them between the top and bottom. In both, the *groupings* are the same: one row with two bodies in state 0, one row with two bodies in state 1, and two columns with one of each state.

01 10
01 10

2 noisy celestial bodies adjacent vertically. This is symmetric to the side-by-side case, but it is different because there's no way to transpose the grid.

01 10
10 01

2 noisy celestial bodies diagonally. Both have 2 rows and 2 columns that have one of each state, so they are equivalent to each other.

01 10 11 11
11 11 01 10

3 noisy celestial bodies, similar to the case where only one of four is noisy.

11
11

4 noisy celestial bodies.

There are 7 distinct, non-equivalent grids in total, so solution(2, 2, 2) would return 7.

-- Test cases --
Input: solution(2, 3, 4)
Output: 430

Input: solution(2, 2, 2)
Output: 7

To solve this problem I had to brush-up on my group theory. I found the Group Theory course on YouTube by Fields Medalist Professor Richard Borcherds a helpful resource. The symmetries of grids form a group. Specifically, the direct product of two finite symmetric groups $S_w \times S_h$ . This is because grids are symmetric under row permutations and column permutations independently. A symmetric group of degree $n$ has an order of $n!$ , so the order of $S_w \times S_h$ is $w!h!$ .

The number of configurations is equivalent to the number of orbits in the group $S_w \times S_h$ . This can be computed using Burnside's counting theorem.

N(w,h,s)= \frac{1}{w!h!} \sum_{g_w\in S_w } \sum_{g_h\in S_h}s^{C(g_w,g_h)}

where $C(g_w,g_h)$ is the number of cycles in $g_h \times g_w$ . Many permutations contain the same combination of cycles, so we can group these terms to significantly reduce the size of the sum.

For example, the permutation, $g_w = (2, 1, 4,5,3)$ ,contains two cycles: one of length 2 and one of length 3. The total length of the permutation is the sum of the cycle lengths, $5 = 2+3$ . Let $Q_w$ and $Q_h$ be the sets of unique combinations of cycle lengths in $S_w$ and $S_h$ respectively. For example,

Q_4=\{(4), (3, 1), (2, 2), (2, 1, 1), (1, 1, 1, 1)\},

Q_5=\{(5), (4, 1), (3, 2), (3, 1, 1), (2, 2, 1), (2, 1, 1, 1), (1, 1, 1, 1, 1)\}.

When combining the row and column permutations, every cycle in the row permutation is paired with every cycle in the column permutation. The number of cycles in the array formed by a pairing, is the greatest common divisor (gcd). Therefore, the total number of configurations is

N(w,h,s)= \frac{1}{w!h!} \sum_{q_w\in Q_w } \sum_{q_h\in Q_h } \frac{w!}{\prod_{i=1}^w i^{m_i} m_i!} \frac{h!}{\prod_{j=1}^h j^{m_j} m_j!} s^{\sum_{a \in q_w}\sum_{b \in q_h}\gcd(a,b)}

where $m_i$ and $m_j$ are the number of cycles of lengths $i$ and $j$ in $q_w$ and $q_h$ respectively.

--- Reveal Code ---

Escape Pods
===========
Write a function solution(entrances, exits, path) that takes an array of integers indexing where the entrances are, an array of integers indexing where the exits are, and an array of an array of integers of the corridors, returning the maximum number of bunnies that can get through at each time step as an int. The entrances and exits are disjoint and thus will never overlap. The path element path[A][B] = C describes that the corridor going from A to B can fit C bunnies at each time step. There are at most 50 rooms connected by the corridors and at most 2000000 bunnies that will fit at a time.

For example, if you have:
entrances = [0, 1]
exits = [4, 5]
path = [
[0, 0, 4, 6, 0, 0], # Room 0: Bunnies
[0, 0, 5, 2, 0, 0], # Room 1: Bunnies
[0, 0, 0, 0, 4, 4], # Room 2: Intermediate room
[0, 0, 0, 0, 6, 6], # Room 3: Intermediate room
[0, 0, 0, 0, 0, 0], # Room 4: Escape pods
[0, 0, 0, 0, 0, 0], # Room 5: Escape pods
]

Then in each time step, the following might happen:
0 sends 4/4 bunnies to 2 and 6/6 bunnies to 3
1 sends 4/5 bunnies to 2 and 2/2 bunnies to 3
2 sends 4/4 bunnies to 4 and 4/4 bunnies to 5
3 sends 4/6 bunnies to 4 and 4/6 bunnies to 5

So, in total, 16 bunnies could make it to the escape pods at 4 and 5 at each time step. (Note that in this example, room 3 could have sent any variation of 8 bunnies to 4 and 5, such as 2/6 and 6/6, but the final solution remains the same.)

-- Test cases --
Input: solution([0], [3], [[0, 7, 0, 0], [0, 0, 6, 0], [0, 0, 0, 8], [9, 0, 0, 0]])
Output: 6

Input: solution([0, 1], [4, 5], [[0, 0, 4, 6, 0, 0], [0, 0, 5, 2, 0, 0], [0, 0, 0, 0, 4, 4], [0, 0, 0, 0, 6, 6], [0, 0, 0, 0, 0, 0], [0, 0, 0, 0, 0, 0]])
Output: 16

My first idea to solve this problem was to use integer linear programming. However, after some more thought and research I realised the problem could be modelled as a flow network, where the maximum flow can be efficiently computed using the push-relabel algorithm. I implimented the version taught in the lecture from A Second Course in Algorithms (Stanford CS261, Winter 2016) found on YouTube.

--- Reveal Code ---

7	Y	T	\|	T	Y	\|	Y	T	\|	T	Y	\|	Y
6	—	—	+	—	—	+	—	—	+	—	—	+	—
5	Y	T	\|	T	Y	\|	Y	T	\|	T	Y	\|	Y
4	—	—	+	—	—	+	—	—	+	—	—	+	—
3	Y	T	\|	T	Y	\|	Y	T	\|	T	Y	\|	Y
2	—	—	+	—	—	+	—	—	+	—	—	+	—
1	Y	T	\|	T	Y	\|	Y	T	\|	T	Y	\|	Y
0	—	—	+	—	—	+	—	—	+	—	—	+	—
-1	Y	T	\|	T	Y	\|	Y	T	\|	T	Y	\|	Y
-2	—	—	+	—	—	+	—	—	+	—	—	+	—
-3	Y	T	\|	T	Y	\|	Y	T	\|	T	Y	\|	Y
-4	—	—	+	—	—	+	—	—	+	—	—	+	—
-5	Y	T	\|	T	Y	\|	Y	T	\|	T	Y	\|	Y
	-5	-4	-3	-2	-1	0	1	2	3	4	5	6	7

7	Y	T	\|	T	Y	\|	Y	T	\|	T	Y	\|	Y
6	—	—	+	—	—	+	—	—	+	—	—	+	—
5	Y	T	\|	T	Y	\|	Y	T	\|	T	Y	\|	Y
4	—	—	+	—	—	+	—	—	+	—	—	+	—
3	Y	T	\|	T	Y	\|	Y	T	\|	T	Y	\|	Y
2	—	—	+	—	—	+	—	—	+	—	—	+	—
1	Y	T	\|	T	Y	\|	Y	T	\|	T	Y	\|	Y
0	—	—	+	—	—	+	—	—	+	—	—	+	—
-1	Y	T	\|	T	Y	\|	Y	T	\|	T	Y	\|	Y
-2	—	—	+	—	—	+	—	—	+	—	—	+	—
-3	Y	T	\|	T	Y	\|	Y	T	\|	T	Y	\|	Y
-4	—	—	+	—	—	+	—	—	+	—	—	+	—
-5	Y	T	\|	T	Y	\|	Y	T	\|	T	Y	\|	Y
	-5	-4	-3	-2	-1	0	1	2	3	4	5	6	7

7	Y	T	\|	T	Y	\|	Y	T	\|	T	Y	\|	Y
6	—	—	+	—	—	+	—	—	+	—	—	+	—
5	Y	T	\|	T	Y	\|	Y	T	\|	T	Y	\|	Y
4	—	—	+	—	—	+	—	—	+	—	—	+	—
3	Y	T	\|	T	Y	\|	Y	T	\|	T	Y	\|	Y
2	—	—	+	—	—	+	—	—	+	—	—	+	—
1	Y	T	\|	T	Y	\|	Y	T	\|	T	Y	\|	Y
0	—	—	+	—	—	+	—	—	+	—	—	+	—
-1	Y	T	\|	T	Y	\|	Y	T	\|	T	Y	\|	Y
-2	—	—	+	—	—	+	—	—	+	—	—	+	—
-3	Y	T	\|	T	Y	\|	Y	T	\|	T	Y	\|	Y
-4	—	—	+	—	—	+	—	—	+	—	—	+	—
-5	Y	T	\|	T	Y	\|	Y	T	\|	T	Y	\|	Y
	-5	-4	-3	-2	-1	0	1	2	3	4	5	6	7