A foray into objects and recursion in Python

Andre Yu Tiamco
Published in Geek Culture · Jun 21, 2021


It has been quite some time since I last studied and experimented with Python. I initially taught myself via YouTube videos and our lord and savior Stack Overflow at the age of around 13. Back then, there were no goals other than to feed my own curiosity and to create a few small projects of which I could be genuinely proud. Additionally, I was fascinated by the idea of artificial intelligence, specifically the kind behind the “computer-controlled” characters in various games. I both hated getting my ass kicked by the A.I. in turn-based games and was intrigued by how it worked.

Enter the minimax algorithm. I’m not exactly sure how I initially encountered the concept, but it is likely that I stumbled across it whilst learning about recursion and recursive functions.

In computer science, and coding specifically, a function is called recursive when it calls itself in order to calculate or create its output. As a result, the function recurs repeatedly until a certain termination point is reached (usually identifiable by the inputs given to the function in that instance), wherein some predetermined (or easily determined) output is returned, allowing all of the nested functions to finally return their values, starting from the inside and moving out.

Perhaps the most famous example of this is a function that spits out the nth number in the Fibonacci sequence. This list of numbers is defined by the following rules:

  1. The 1st Fibonacci number is 0.
  2. The 2nd Fibonacci number is 1.
  3. The nth Fibonacci number is equal to the sum of the (n-2)th and (n-1)th Fibonacci numbers.

As is clear from the above, in order to determine the value of any Fibonacci number beyond the first two, one must first calculate the two that precede it. In order to calculate those, the numbers in the sequence that precede them must also be determined, and so on and so forth. Thus, in order to calculate any Fibonacci number, it would appear that one must start from the beginning.

That is exactly what a recursive function does, in essence. It calls itself, diving deep into its own ‘nesting’ before beginning to return result after result, working toward the end goal: the result for the initial input. Take the following function, which takes n as input and returns the nth Fibonacci number:

fibonacci(n) returns the nth Fibonacci number via recursion.
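A minimal version of that function, consistent with the rules above, might look something like this:

    def fibonacci(n):
        # Terminal cases: the first two Fibonacci numbers are known outright.
        if n == 0:
            return 0
        if n == 1:
            return 1
        # Every other Fibonacci number is the sum of the two before it.
        return fibonacci(n - 2) + fibonacci(n - 1)

    print(fibonacci(10))  # 55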

Here, fibonacci(n) returns the sum of the two prior Fibonacci numbers (found by calling itself) unless a terminal point with a known or well-defined result is reached (in this case, if n == 0 or if n == 1).

The minimax algorithm is similar, and it is a method of creating A.I. for simple, turn-based games. Here, the input is all the information the A.I. needs to know (or knows) about the game, such as the board state, whose turn it is, etc. The output is the move choice (this can be a column in, say, Connect 4, or a spot on the grid in Tic-Tac-Toe) that results in the maximum score. The kicker is this: the resulting score for each move choice is calculated as the minimum score that the opposing player could allow in any future continuation. The scores for each of those continuations (that is, the moves after the next move to be played) are in turn determined by the maximum score that the A.I. could reach in the continuations of the continuations.

Essentially, the minimax algorithm looks deep into the future, checking each of the branching paths. Only at the terminal nodes (a win or a loss, which are each assigned some value) can it begin to work its way back into the present, picking the minimum value among the future options on turns where the opponent would be in control, and picking the maximum on turns where the A.I. itself would be in control. Refer to the following diagram.

The minimax algorithm alternates between picking the maximum and the minimum of the options it sees.

Say the orange nodes are the A.I.’s turns, and the blue nodes are those of its opponent. Each player takes turns selecting from the two options that come next on the tree. The A.I. seeks to land on the highest number possible, and the opponent seeks to land on the lowest.

The A.I. essentially delves down to the terminal nodes first, finding their values. Then, since the turn before these terminal nodes is orange, it picks the maximum among the blue choices for each orange node. Thus, D is assigned 4, E is assigned 6, F is assigned -3, and G is assigned 7. Now it looks one step backward. As these nodes are blue, it picks the minimum (as the opponent is working toward the A.I.’s loss). So, B is assigned 4, and C is assigned -3. Finally, it looks at the present (in an actual implementation, many more layers of the future than just three would be considered) and picks the highest among its options. So, it chooses option B.
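To make that alternation concrete, here is a bare-bones sketch that walks a hand-built tree like the one in the diagram. The leaf values under D, E, F, and G are illustrative, chosen only so that those nodes come out to the values in the walkthrough:

    def minimax(node, maximizing):
        # A terminal node is just a score; an internal node is a list of children.
        if isinstance(node, (int, float)):
            return node
        # Score every option one move deeper, with the roles swapped.
        scores = [minimax(child, not maximizing) for child in node]
        # The A.I. picks the best score for itself; the opponent picks the worst.
        return max(scores) if maximizing else min(scores)

    # Illustrative tree: the A.I. (maximizing) chooses between B and C.
    B = [[1, 4], [6, 2]]    # D and E -> D = 4, E = 6, so B = min(4, 6) = 4
    C = [[-3, -5], [7, 0]]  # F and G -> F = -3, G = 7, so C = min(-3, 7) = -3
    print(minimax([B, C], True))  # 4, i.e. option B

In a real game, the top-level call would also remember which child produced the chosen score, since that child is the move to actually play.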

These are the basics of the minimax algorithm. In my youth, I was fascinated by what seemed to be such a complex and (dare I say) futuristic concept. With a simple set of steps and rules, an A.I. could be taught to play nearly any turn-based game with some level of proficiency (theoretically, perfection). In spite of my understanding of the algorithm and my determination to see it come to fruition, however, my several hours-long attempts to implement it as a perfect Connect 4 A.I. repeatedly failed, and I dropped the idea altogether.

Looking back, I chalk my failure up to two incompetencies: a fundamental misunderstanding of objects and copies, and a lack of creativity regarding optimization. So, now that I’ve internalized both areas a little more thoroughly, I figured I’d retry after 8 years. In the time since then, I’ve armed myself with some important tools.

The first of these is a stronger understanding of objects. In my attempts from 8 years ago, I wrote a class called column in order to create multiple column objects. I figured that as the moves made in the game came down to selecting a column, this made sense. Trouble arose, however, when the minimax algorithm began its recursive loop.

In order to access and edit each column, I wrote global colname at the top of minimax() (it wasn’t pretty). I would then assign the needed column to a variable name, say column_played, add a piece to the column, and then continue recurring. I didn’t realize that assigning the column object to a new variable didn’t actually create a new object. So, any edits made to the new, hypothetical column variable traced themselves back into the real, present game.

A demonstration of this quirk is seen below:

Assigning an object to a variable doesn’t copy the object. Thus, changes made affect any variable assigned to the given object.
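The same quirk can be reproduced with a bare-bones stand-in class (this Column is purely illustrative, not the original implementation):

    class Column:
        def __init__(self):
            self.pieces = []

    original = Column()
    column_played = original          # no new object is created here
    column_played.pieces.append("X")  # a "hypothetical" move...

    print(original.pieces)            # ['X'] -- ...lands in the real column
    print(column_played is original)  # True: both names point to the same object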

That’s not all, however. I also compiled each column into a list called board. Thus, even if I had made a proper copy of the board list, the new list would still refer to the same old column objects. What I needed was a deep copy, demonstrated below:

Changes to a deep copy leave the original list of objects entirely untouched.
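With Python’s built-in copy module (reusing the illustrative Column class from above), the difference between a shallow and a deep copy looks like this:

    import copy

    board = [Column(), Column(), Column()]

    shallow = copy.copy(board)      # new list, but the same Column objects inside
    deep = copy.deepcopy(board)     # new list and brand-new Column objects

    deep[0].pieces.append("X")
    print(board[0].pieces)          # [] -- the original columns are untouched
    print(shallow[0] is board[0])   # True
    print(deep[0] is board[0])      # False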

Knowing this now, I set out to write Connect 4, determined to make it work this time. Additionally, I ditched the column class, instead opting to use NumPy arrays for the simplicity of indexing and built-in functions.

I quickly wrote a simple Connect 4 game designed to run in the terminal (nothing fancy). After seeing that it worked properly for 2 human players, I began to implement the minimax algorithm…

and immediately ran into the first of many roadblocks:

That’s a lot of calls!

As it turns out, Python limits how deeply a function can recurse into itself; the default recursion limit is 1000 nested calls. And with 4,531,985,219,092 possible board states on a standard 6x7 Connect 4 grid, my minimax algorithm was blowing well past that limit.
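That limit lives in the sys module, where it can be inspected and (cautiously) raised:

    import sys

    print(sys.getrecursionlimit())  # 1000 by default
    sys.setrecursionlimit(5000)     # raising it is possible, but deep recursion can still exhaust the stack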

So, I set a limit on the depth to which the algorithm would look. The furthest I found I could go without an unreasonably long wait between moves was 5 moves deep. After reaching a depth of 5, I told the program to simply make a random move and move on.
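In terms of the earlier sketch, one way to express that cutoff is an extra depth parameter that returns a neutral score when it reaches zero; choosing randomly among equally scored moves then gives roughly the behaviour described here:

    def minimax(node, maximizing, depth):
        if isinstance(node, (int, float)):  # terminal node, as in the earlier sketch
            return node
        if depth == 0:
            return 0  # cutoff: no guaranteed win or loss in sight, treat as neutral
        scores = [minimax(child, not maximizing, depth - 1) for child in node]
        return max(scores) if maximizing else min(scores)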

With this change, the recursion limit was successfully avoided. Now, however, the A.I. was little more than a random generator. Unless a guaranteed win or loss (acting as a terminal node) could be spotted 5 moves into the future, the moves it made were entirely random. So, the A.I. would play an intelligent game of Connect 4 only when the game was nearing its end, at which point it had been playing so poorly that its loss was inevitable. I needed to find an answer.

Enter memoization, a technique commonly used to speed up recursive functions by avoiding repeated calls. I know what you’re thinking. But Andre, I thought the whole point was repeated calls! All will be explained.

Consider fibonacci() above. Each time the function is called, for the majority of numbers, the function must call itself two more times. If only it could just remember the resulting outputs for a given input…

What a brilliant idea. Let’s implement it.

The memo serves to store the results of inputs that the function has “seen” before.
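A memoized version of fibonacci() along those lines might look like this:

    memo = {}  # maps an input n to the Fibonacci number already computed for it

    def fibonacci(n):
        if n in memo:
            return memo[n]  # already answered: return the stored result
        if n == 0:
            result = 0
        elif n == 1:
            result = 1
        else:
            result = fibonacci(n - 2) + fibonacci(n - 1)
        memo[n] = result    # write the answer onto the "cheat sheet"
        return result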

Here, each time fibonacci() receives a new input, it checks whether n is present among the keys of the dictionary memo, which starts out empty. If it isn’t, the function proceeds as normal, except that it adds the result of the calculation to memo with the input acting as its key. If it is, the function simply returns the stored result via memo[n].

In addition to helping avoid the recursion limit, this technique can also significantly speed up the function, as it no longer has to calculate the answers to the questions that it has already answered. It effectively writes its own cheat sheet in real-time! Compare the runtimes of the function with and without memoization:

A hefty 259 microseconds without memoization…
164 microseconds!? Boy howdy!

Nearly 100 microseconds are shaved! If I were to run this code 2,100,000 times, I would save enough time to cook an entire bag of popcorn!
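The exact figures will vary by machine and by input, but a comparison along these lines can be reproduced with the standard timeit module (fib_plain and fib_memo are illustrative names for the plain and memoized versions above):

    import timeit

    print(timeit.timeit("fib_plain(20)", globals=globals(), number=1000))
    print(timeit.timeit("fib_memo(20)", globals=globals(), number=1000))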


So, I set the A.I. to play me on a loop. Each time the algorithm came up with a move that was not just a random selection (meaning it found a win, loss, tie, or a board state already saved in the memo), it would store that move in a dictionary whose key was a tuple of the board state, the letter it was playing, etc. In addition, I realized that many of the board states seen in Connect 4 are equivalent to each other. Any mirror image (flipped left to right) is equivalent, requiring only the mirrored move choice. Likewise, any state is equivalent to the one obtained by flipping the letter (X or O) of each piece and playing as though you were controlling that new letter. So, I wrote a function called flip_letters() that took in a board and spat out a properly flipped copy, and I used numpy.fliplr() to create the mirrored version of the board array. I added both of these variants to the memo as well, along with the result of applying both transformations.

The result of the above equivalencies is that the algorithm could fill the memo around 4 times faster than it otherwise would have!
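A sketch of how all four equivalent positions might be written into the memo at once, assuming flip_letters() as described above and a hypothetical mirror_move() that maps a column to its left-right mirror:

    import numpy as np

    def remember(memo, board, letter, move, width=7):
        # One hashable way to key on a NumPy board is its raw bytes, paired with
        # the letter being played, as described above. flip_letters() is the
        # helper from the text; mirror_move() below is a hypothetical stand-in.
        def mirror_move(col):
            return width - 1 - col  # a column's left-right mirror image

        other = "O" if letter == "X" else "X"
        swapped = flip_letters(board)  # X <-> O, so the other letter is "to move"
        memo[(board.tobytes(), letter)] = move
        memo[(np.fliplr(board).tobytes(), letter)] = mirror_move(move)
        memo[(swapped.tobytes(), other)] = move
        memo[(np.fliplr(swapped).tobytes(), other)] = mirror_move(move)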

After some 10 games, I could see the algorithm starting to work faster and faster. It occurred to me, however, that I didn’t have the time to play tens or hundreds of games against the A.I. just for it to update its own cheat sheet and reach a decent level of gameplay.

After doing some research, I came across a solution. I could simply save the memo into a file on my computer. This way, the A.I. would remember all of its previous “training games” even after I stopped and reran the program.
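Since pickle and os are among the libraries mentioned at the end, a minimal save-and-load of the memo could look like this (the filename is illustrative):

    import os
    import pickle

    MEMO_FILE = "connect4_memo.pickle"  # illustrative filename

    # Load any previously saved memo, falling back to an empty one on the first run...
    if os.path.exists(MEMO_FILE):
        with open(MEMO_FILE, "rb") as f:
            memo = pickle.load(f)
    else:
        memo = {}

    # ...and after a training session, write it back to disk.
    with open(MEMO_FILE, "wb") as f:
        pickle.dump(memo, f)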

With the A.I. having been granted the gift of perfect and eternal memory by its benevolent god, yours truly, I eagerly started playing game after game with it. Admittedly, this got boring very fast, and the A.I. was learning very, very slowly. This is when my friend Jerry offered some input. “Hey, why not just make the A.I. play itself a few hundred times instead? Isn’t that what AlphaZero does?” he said, referencing the famous chess A.I. that defeats champions and fellow A.I. alike.

So that’s exactly what I did. After setting the A.I. to play itself for 200 games and after waiting a couple of hours, I came back to check its progress.

Firstly, it was now playing much, much faster. Long gone were the 30-second waits in between moves. The A.I. now took an average of around 10 seconds to think of a move by my estimate, and when the games neared their ends (i.e. when there were fewer “futures” to consider), it would play at blistering speed.

Secondly, the file to which I saved the turbo-engine dictionary was absolutely massive. As of writing this, it weighs in at a considerable 305 MB on my hard drive, and it is far from finished growing. In fact, it currently only contains a measly 1.58 million board states. If you ask me, this big-boy file is but a small sacrifice to make in the name of speedy gameplay.

All in all, I’m very satisfied with the project. The process of writing, debugging, and problem-solving this simple game of single-player Connect 4 introduced me to three new libraries (pickle, copy, and os) whilst allowing me to re-familiarize myself with one more (numpy). Additionally, I was able to test and apply my newly gained knowledge of the quirks of Python’s objects and copies thereof. The end result is a program that I can be proud of, as it both redeems my preteen self and flexes my minimax muscles.
