# Google's FooBar Challenge

Recently, I completed Google's FooBar challenge in April 2023. This involved solving a series of mathematical and coding problems using Python 2.7 or Java. The challenge presented five levels of increasing difficulty, with 1-3 tasks per level. In this document, I summarise the tasks I undertook and explain how I resolved each one. For further insight into the challenge, see articles by Turing and Medium.

The document is structured into five sections, each pertaining to a different challenge level. Every level contains a section for each task. Within each task section, there are three main components. The first is a simplified task description, adapted from the original text. The second part consists of an explanation of how I approached and solved the problem. The final part features my Python code, which can be revealed by clicking on the corresponding box. The constraints for coding in Python or Java are delineated below.

Should you wish to attempt a task yourself, I suggest you firstly peruse the task description and attempt the problem without reading my solution. Feel free to search online for any pertinent mathematical or coding principles, but refrain from seeking a direct solution to the problem. If you find yourself stuck for an extended period, then consider reading my solution explanation, but do not yet reveal my code. I recommend revealing and examining my code only if you are absolutely stuck, or if you wish to compare your solution with mine. Remember, some of the challenges are complex and may take several hours or even several days to solve. Enjoy the process!

## Challenge 1

The first level contained only a single task, titled Re-ID, which involved efficiently generating prime numbers. A task summary, adapted from the original challenge instructions, is given below.

If the string of prime numbers is known, then the solution is trivially given by returning the five digits starting at index $n$. Therefore, I solved this problem by defining a function that firstly generated this string of primes. This function works by checking the primality of all integers from $2$ to $m=20+\lfloor n \ln 10 \rfloor$. This choice of $m$ ensures that the resulting string of primes is at least $n+5$ characters long, without too much excessive computation.

This is a result of the prime number theorem which gives an asymptotic expression for the prime counting function,

which counts the number of prime numbers $\leqslant m$. The maximum number of digits each prime can have is $\lceil \log m \rceil$ (base 10). Therefore, having $n+5 < \pi(m)\lceil \log m \rceil$ should ensure the resulting string is long enough when $n$ is large. This condition is satisfied by

which I checked empirically worked for when $n$ is low.

Each integer is considered prime if no previously identified prime numbers divide it. This involves checking all previously identified prime numbers less than or equal to the square-root of the current candidate number (there is no point checking factors higher than the square-root because they all pair with divisors less than the square-root). Once the list of prime numbers is generated then the integers are converted to strings before being joined into a single string.

`--- Reveal Code ---`

## Challenge 2

The second level contained two tasks, titled Please Pass the Coded Messages and Bunny Worker Locations.

To solve this problem, let's first consider the simpler problem of finding the largest number containing all the digits in the list L. Obviously, this is achieved by sorting the list with the largest on the left to the smallest on the right, so that the larger digits correspond to the higher decimal positions. Removing any digit will strictly decrease the size of the number (except when all digits are zero). Therefore, this result forms an upper bound to the original problem of finding the largest number that is also divisible by three.

Now we can make use of the property of decimal numbers that they are divisible by 3 if and only if the sum of their digits is also divisible by three. Or, more generally, that the number is congruent modulo 3 to its digit sum. Since the order of a summation does not effect the result, we can use this rule to check the modularity of a list of numbers without knowing the order. Next we note that removing any two digits never lowers the value by less than only removing any one digit. Also note that removing a larger digit will lower the value by more than removing a lower digit. Therefore, we can solve the problem by firstly minimising the number of digits that need to be removed, and then minimising the values of the removed digits.

The sum of digits can be congruent to 0, 1 or 2 (mod 3). In the first case, any permutation of all digits is divisible by 3 without removing any digits. Therefore, the solution returns the number containing all the digits in descending order. If the sum is congruent to 1 or 2, then there are three cases. Firstly, one of the digits is congruent to the sum, in which case, we just remove the smallest such digit. Secondly, two of the digits are congruent to twice the sum, in which case we remove the two smallest such digits. Thirdly, no combination of digits is divisible by 3, so return 0. After removing any digits, the solution is the number containing all the remaining digits in descending order. Note, that if there are no remaining digits, then return 0.

`--- Reveal Code ---`

The grid changes by 1 each step along a diagonal, which gives the following recursive relationships

where $k$ can be any integer satisfying $0 < k < y$. The grid coordinates can be as large as 100000 so generating the entire grid would probably be much slower than directly calculating the values from a closed form expression for $f(x,y)$. To find one, I first noticed that the first row is given by the triangular numbers. This makes sense since the values are given by the size of each diagonal, which grow by 1 each time. Therefore,

So, we can get a closed form expression by choosing $k=y-1$ in the recursive relationship and then using the result for $f(x,1)$ by substituting $x\rightarrow x+y-1$

`--- Reveal Code ---`

## Challenge 3

The third level contained three tasks, titled Prepare the Bunnies' Escape, The Grandest Staircase Of Them All and Fuel Injection Perfection.

To solve this problem I represent maps as graphs where nodes correspond to passable spaces and edges correspond to adjacent passable spaces. Then I use the A* algorithm to find the shortest path length from the start and end nodes. To improve the efficiency of the algorithm I use the Manhattan distance as a lower bound to the remaining distance between any node and the end node. After implementing the A* algorithm I use it to find the shortest path lengths of all possible maps with exactly 1 or 0 walls removed. The function then returns the smallest of these path lengths.

`--- Reveal Code ---`

To solve this problem I used the principles of dynamic programming to write a recursive function which relied on memoisation to run efficiently. Consider the function $w(n,m,s)$ which calculates the number of valid staircases that can be built with $n$ bricks, with each stair containing a maximum of $m$ bricks and at least $s$ more steps. We can express $w$ as a recursive function. Specifically, the number of staircases that can be built is the sum of the number of staircases that can be built given the size of the next step. For example, if there are 5 bricks, then the number of possible staircases is the number of staircases that end with a height of 1, plus the number that end with a height of 2, plus a height of 3 and plus a height of 4.

There are two base cases to this recursion. The first occurs when there are too many bricks. This occurs when the number of bricks is more than a maximal staircase with a constant step size of 1, $n > m(m+1)/2$. In this case, the number of valid staircases is 0. The second base case occurs when there are no bricks $n=0$, and the minimum number of additional steps is less than zero ($s\leqslant 0$). In this case, the number of valid staircases is 1. Therefore,

with the solution is given by $w(n,n,2)$.

`--- Reveal Code ---`

To solve the problem we can identify several principles for minimising the number of operations to reach 1. Firstly, no shortest trajectory would contain a $+1$ operation immediately followed by a $-1$ operation, or vice versa, because any trajectory containing any such consecutive operations could be shortened by 2 steps by removing them. Secondly, even numbers should always be divided by 2. This is because a trajectory that repeats the other operations followed by a division can be shortened by dividing first. For example,

where $k$ is any natural number. Basically, the number of consecutive addition or subtraction operations can be cut in half (or half plus 1) by dividing first. Note that this assumes that eventually there will be another halving operation. Except in the cases of $n=2$ and $n=3$ where both halving and subtracting 1 will work, in every other case halving requires fewer operations than subtracting 1. Specifically, continuously subtracting 1 requires exactly $n-1$ operations, while halving every even and subtracting 1 off every odd number will require no more than $2\lceil \log_2 (n+1) \rceil -2$ (when $n$ is one less than a power of 2) operations and no less than $\lceil \log_2 n \rceil$ (when $n$ is a power of 2) operations. Obviously, repeated additions can never reach 1 since they strictly increase the number. Therefore, for all $n>3$ there exists a shortest sequence of operations whereby every even value is halved, and the number of even values is at least one.

To decide on a rule for odd values, let us consider the two possible cases where $n=4k+1$ and $n=4k+3$ for any natural number, $k$. In the first case we can draw the following tree.

In this case adding 1 inevitably leads to either $k$ or $k+1$, both of which can be reached in fewer or equal number of operations by subtracting 1. Therefore, when $n=4k+1$ subtracting 1 is weakly faster than adding 1. Given this strategy, we can make an analogous decision tree for the $n=4k+3$ case.

Now choosing to subtract 1 will inevitably lead to either $k$ or $k+1$, both of which can be reached in fewer or equal number of operations by adding 1. Therefore, when $n=4k+3$ adding 1 is weakly faster than subtracting 1.

Finally, let us consider the simple small cases of which cannot be expressed as $n=4k+r$. When $n=1$ we do not need any operations so return 0, when $n=2$ follow the halving even rule to get to 1 needs only 1 operation, and when $n=3$ subtracting 1, rather than adding 1 is faster. Therefore, we can define the following recursive function to solve the problem,

This solution can be implemented using a recursive function or a while loop.

`--- Reveal Code ---`

## Challenge 4

The fourth level contained two tasks, titled Bringing a Gun to a Trainer Fight and Free the Bunny Workers.

The key insight which helped me solve this problem was to view reflections as additional rooms (but mirrored). Therefore, the reflections form an infinite grid of repeated mirror copies of the room. This concept is illustrated below using the example in the task description. Therefore, I solved this problem by following these steps. Firstly, I generated all coordinates corresponding to all positions (both yours and the targets, both real and reflected) in a square shape with side length larger than the allowed distance. Then I filtered these coordinates to only allow points within the allowed distance to you. From these coordinates, I identified a set of directions, which were the vectors from you to the coordinate, scaled by the greatest common divisor of both dimensions. For each direction the shortest distance was saved. Next I removed any directions to hit the target that would hit yourself first (directions that appear in both sets in which the distance to hit yourself was shorter.) The final result is the size of the set of directions to hit the target.

7 | Y | T | | | T | Y | | | Y | T | | | T | Y | | | Y |

6 | — | — | + | — | — | + | — | — | + | — | — | + | — |

5 | Y | T | | | T | Y | | | Y | T | | | T | Y | | | Y |

4 | — | — | + | — | — | + | — | — | + | — | — | + | — |

3 | Y | T | | | T | Y | | | Y | T | | | T | Y | | | Y |

2 | — | — | + | — | — | + | — | — | + | — | — | + | — |

1 | Y | T | | | T | Y | | | Y | T | | | T | Y | | | Y |

0 | — | — | + | — | — | + | — | — | + | — | — | + | — |

-1 | Y | T | | | T | Y | | | Y | T | | | T | Y | | | Y |

-2 | — | — | + | — | — | + | — | — | + | — | — | + | — |

-3 | Y | T | | | T | Y | | | Y | T | | | T | Y | | | Y |

-4 | — | — | + | — | — | + | — | — | + | — | — | + | — |

-5 | Y | T | | | T | Y | | | Y | T | | | T | Y | | | Y |

-5 | -4 | -3 | -2 | -1 | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 |

`--- Reveal Code ---`

We know that no combination of num_required - 1 bunnies will collectively have all the required keys, they must be missing at least one key for a required lock. However, including any other additional bunny will complete the set of keys. Therefore, for every combination of num_required - 1 bunnies, every other bunny holds the missing keys. Therefore, we can minimally issue keys by assigning a set of keys for each lock to a unique combination of num_buns - num_required + 1. In this way, every possible combination of num_required will hold all required keys, and every possible combination of num_required - 1 will not. We know this minimises the total number of keys because removing any key will result in a combination of bunnies that cannot open that lock, regardless of how the keys are assigned.

`--- Reveal Code ---`

## Challenge 5

The fifth and last level contained only a single task, titled Dodge the Lasers!

The task is essentially to code a function that can evaluate

for any integer $1 \leqslant n \leqslant 10^{100}$. In my first attempt to solve this problem I used the Fourier series to derive an approximate solution, and then guess and checked to "hack" the verification process to tweak the result to pass the checks. The fractional part of a number $x$, the difference between the number and its floor, forms a sawtooth function with a known Fourier series,

Therefore, we can express the function $f(n)$ as an infinite series,

The problem with this approach is that the infinite series sometimes converged very slowly, requiring too many terms to stabilise. Furthermore, passing the checks via guess and check did not leave me with a satisfying ending to an otherwise rewarding challenge. Therefore, I continued working on the problem, and during some research I discovered a breakthrough. The key insight into solving this problem is to identify the summand as a Beatty sequence. The trick is to exploit the complementary Beatty sequence to create a recursive formula to compute the series. Let the largest term be $m=\lfloor n\sqrt{2} \rfloor$,

The size of the function argument is reduced by a factor of at least$\sqrt{2}-1 < 0.5$. Therefore, the depth of the recursion is bounded by $\log_2 n$ for large enough $n$, and is approximately $-\log_2 n / \log_2(\sqrt{2}-1)$.

Note that the code must be able to compute $m=\lfloor n\sqrt{2}\rfloor$. The naive approach would require $\sqrt{2}$ to be known to very high precision. Instead, I accomplished this by computing $m=\lfloor\sqrt{2n^2} \rfloor$ using a form of Heron's method that was adapted for integer operations only.

`--- Reveal Code ---`

## Bonus Tasks

After completing all five levels of the challenge, I found that I could still request new tasks. I believe these bonus tasks are alternative or old level 5 tasks. These extra tasks were titled Expanding Nebula, Disorderly Escape and Escape Pods.

The key to solving this problem is to use bitwise operations to significantly improve computational efficiency. Essentially, lists of booleans can be interpreted as a binary representation of an integer. For example, [True, False, True] would be 101 in binary which is equivalent to 5 in decimal. Bitwise operations allows us to compare two lists in a single operation. My solution works as follows.

g is a boolean array (list of lists of bools) that is an outcome of a previous state, s, such that for all rows and cols g[row][col] is True if and only if exactly one of the following is True: s[row][col], s[row+1][col], s[row][col+1], s[row+1][col+1]. solution(g) returns the number of unique previous states s that can produce g. It does this efficiently in four steps:

`--- Reveal Code ---`

To solve this problem I had to brush-up on my group theory. I found the Group Theory course on YouTube by Fields Medalist Professor Richard Borcherds a helpful resource. The symmetries of grids form a group. Specifically, the direct product of two finite symmetric groups $S_w \times S_h$. This is because grids are symmetric under row permutations and column permutations independently. A symmetric group of degree $n$ has an order of $n!$, so the order of $S_w \times S_h$ is $w!h!$.

The number of configurations is equivalent to the number of orbits in the group $S_w \times S_h$. This can be computed using Burnside's counting theorem.

where $C(g_w,g_h)$ is the number of cycles in $g_h \times g_w$. Many permutations contain the same combination of cycles, so we can group these terms to significantly reduce the size of the sum.

For example, the permutation, $g_w = (2, 1, 4,5,3)$,contains two cycles: one of length 2 and one of length 3. The total length of the permutation is the sum of the cycle lengths, $5 = 2+3$. Let $Q_w$ and $Q_h$ be the sets of unique combinations of cycle lengths in $S_w$ and $S_h$ respectively. For example,

When combining the row and column permutations, every cycle in the row permutation is paired with every cycle in the column permutation. The number of cycles in the array formed by a pairing, is the greatest common divisor (gcd). Therefore, the total number of configurations is

where $m_i$ and $m_j$ are the number of cycles of lengths $i$ and $j$ in $q_w$ and $q_h$ respectively.

`--- Reveal Code ---`

My first idea to solve this problem was to use integer linear programming. However, after some more thought and research I realised the problem could be modelled as a flow network, where the maximum flow can be efficiently computed using the push-relabel algorithm. I implimented the version taught in the lecture from A Second Course in Algorithms (Stanford CS261, Winter 2016) found on YouTube.

`--- Reveal Code ---`