How I Used OPA to Help Me Solve Wordle
Several months ago my daughters started talking about this word game they were playing. It was from The New York Times and called Wordle. The premise was that you had six chances to guess a random five letter word. They soon started dropping hints that they wanted me to try it:
I soon gave in and started playing daily. The results soon started showing up in our family chat and most of the time they were not in my favor:
I soon realized that I needed to cheat, er, use my engineering skills to help me out. So, this is my journey to “outsmart” my kids using Open Policy Agent (OPA).
The first question that you might have is, “Why use OPA?” — wouldn’t a general purpose language be better for this? Sure, OPA isn’t designed for this type of problem, but at the same time, it provides a REST interface and a nice way to deploy bundled data and rules. And when you break it down, all Wordle is is rules over a fixed set of data.
If you want to follow me on my journey, I’ll be using Styra DAS, for which you can sign up for free, here. Also, all the code examples I use can be found here.
Using OPA to create an unfair competitive advantage over my family in Wordle
First, let’s get a better understanding of the game itself. You have six chances to guess the daily five-letter Wordle. After you guess a word, the game will tell you whether each individual letter is used in the word and, if so, whether it is in the correct position. Letters that don’t show up in the daily Wordle are indicated as wrong. Wordle actually uses two different data sets: the first contains words that could actually be the daily Wordle, and the second contains other valid words, in total just under 13,000 words. The non-used words are actually a much larger part of the data set and are very valuable as they might give you information about individual letters.
As for those sets of words, I have some here. And before you start thinking you can just go into the data and find the daily Wordle by looking up the index, these words do not correspond to the Wordle number.
So let’s talk about the interfaces. Since I’m going to use the REST interface of OPA, I will need to define some JSON structures for input and output. The input should be fairly simple; I just need to know the word number I am guessing, the guesses themselves and if I want some help (let’s call this Jarvis).
For the output, how about an indicator for correctness, a mapping that shows how individual letters did (whether they’re in the Wordle and in the right position) and messages to the user (I’m going to separate the “normal” messages from the Jarvis messages).
(As for the guesses, a letter indicates correct letter & position, ‘*’ indicates correct letter & wrong position and ‘-’ indicates incorrect letter).
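To make the two structures concrete, here is a rough sketch of what a request and response could look like, written as Python dictionaries that serialize to JSON. The field names (`word`, `guesses`, `jarvis`, `correct`, `formatted_results`, `messages`, `jarvis_messages`) are my assumptions for illustration; the text above only describes what each field needs to carry.

```python
import json

# Hypothetical input document: field names are assumptions, not the post's schema.
example_input = {
    "word": 42,                      # which word number is being guessed
    "guesses": ["crane", "heads"],   # guesses so far, oldest first
    "jarvis": False,                 # set to True to ask for help
}

# Hypothetical output document using the marking convention described above:
# a letter = right letter and position, '*' = right letter, wrong position,
# '-' = letter not in the word.
example_output = {
    "correct": False,
    "formatted_results": ["--a-e", "**ad*"],
    "messages": ["Not quite, keep guessing."],
    "jarvis_messages": [],
}

print(json.dumps(example_input))
print(json.dumps(example_output, indent=2))
```

Each entry in `formatted_results` lines up position by position with the guess at the same index, which keeps the output easy to read in a terminal.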
So now I’m going to create a Custom system in Styra DAS and add a data source named “words” for the words. A data source in DAS represents data to be included in the OPA policy bundle. Even though this is a git repository, I’m going to use an HTTPS data source and use the raw URL. Creating a Custom system automatically creates a default data source named dataset, but I’m going to delete that so as to not confuse myself later.
The default policy file in the system needs a little update. For now, I just want to validate the input and start setting up the output structure. So, I’m going to update it to look like this:
I’m just doing some simple validation here and setting myself up for more of the logic later (like adding just the default correct for now). I like using Incremental Definitions for my main rule; it lets me easily add to the result from other locations (more on that later). The same goes for incrementally creating the message set. I’m hard coding the formatted_results here just to see how it looks. I likely could have gone much further with the validation, but I’m only trying to one-up my kids here.
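For readers who don’t speak Rego, the validation idea can be sketched in Python. This is not the policy itself, just an approximation of the same pattern: each check independently adds to a set of messages, much like incremental definitions do. The field names and the specific checks are assumptions on my part.

```python
import re

def validate(inp):
    """Collect validation messages; an empty set means the input looks usable."""
    messages = set()

    # Each check contributes on its own, similar to an incrementally defined rule.
    word = inp.get("word")
    if not isinstance(word, int) or isinstance(word, bool):
        messages.add("word must be an integer")

    guesses = inp.get("guesses")
    if not isinstance(guesses, list):
        messages.add("guesses must be a list")
    else:
        if len(guesses) > 6:
            messages.add("at most six guesses are allowed")
        for guess in guesses:
            if not (isinstance(guess, str) and re.fullmatch(r"[a-z]{5}", guess)):
                messages.add(f"guess {guess!r} must be five lowercase letters")

    return messages

print(validate({"word": 42, "guesses": ["crane"]}))
print(validate({"word": "x", "guesses": ["toolong"]}))
```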
Now that I have this base policy, I can publish it and test it out. Publishing is as easy as pushing the publish button in the editor; as part of the system settings, the install instructions give me the steps to run this policy locally. The Custom system in DAS will set the default rule to main, so I can now issue the following cURL and get the results:
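If you would rather script the call than type it, the same kind of request can be built in Python. This sketch uses OPA’s Data API (`POST /v1/data/<path>` with the input wrapped in an `"input"` key). The host, port, policy path `main`, and input field names are assumptions; check your own install instructions for the real values.

```python
import json
import urllib.request

def build_query(base_url, policy_path, input_doc):
    """Build (but do not send) a POST request for OPA's Data API."""
    url = f"{base_url.rstrip('/')}/v1/data/{policy_path.strip('/')}"
    body = json.dumps({"input": input_doc}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )

# Assumed local OPA address and policy path, for illustration only.
req = build_query(
    "http://localhost:8181",
    "main",
    {"word": 42, "guesses": ["crane"], "jarvis": False},
)
print(req.get_method(), req.full_url)

# To actually send it against a running OPA:
# with urllib.request.urlopen(req) as resp:
#     result = json.load(resp).get("result")
```

The Data API wraps the policy decision in a `result` key, so that is the part of the response worth printing.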
Now that I have the initial interface working, it’s time to get the game working. This is the remainder of the rego that I need (with the hard coded formatted_results removed):
The invalid_guess and correct rules are just simple lookups. I also added some new conditions/messaging around them. The formatted_result, and supporting rules/functions, is where the fun is at. The get_mark function gets the correct “mark” based on the individual letters in guesses. The most challenging “mark” is the star (*) as this is where all the exceptional conditions come into play. If there are any bugs left in the code, they will almost certainly be found in this condition. As an aside, I do have unit tests around this that you can find in the full source code.
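To show why the star is the tricky one, here is a small Python sketch of the marking logic. It is not a translation of get_mark; it uses the common two-pass approach (exact matches first, then stars limited by how many copies of a letter are still unaccounted for), which I am assuming matches the behavior described here. The function name `mark_guess` is made up for this example.

```python
from collections import Counter

def mark_guess(guess, answer):
    """Return one mark per letter: the letter itself, '*', or '-'."""
    marks = ["-"] * len(guess)

    # Answer letters not already claimed by an exact match.
    remaining = Counter(a for g, a in zip(guess, answer) if g != a)

    # Pass 1: exact matches keep their letter.
    for i, (g, a) in enumerate(zip(guess, answer)):
        if g == a:
            marks[i] = g

    # Pass 2: a star is only awarded while unclaimed copies remain.
    for i, (g, a) in enumerate(zip(guess, answer)):
        if g != a and remaining[g] > 0:
            marks[i] = "*"
            remaining[g] -= 1

    return "".join(marks)

print(mark_guess("crane", "shade"))  # --a-e
print(mark_guess("speed", "abide"))  # --*-*
```

The second call is the interesting case: “speed” has two ‘e’s but “abide” has only one, so only the first ‘e’ earns a star and the second is marked as unused.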
This now gives me a working Wordle clone:
This is helpful, as I can now use this to practice random words. But I think I need more than just practice to beat my kids. So it’s onto Jarvis mode.
My idea with Jarvis mode is to assign some sort of score to the words, eliminate words based on what I know about previous guesses, then take the highest remaining scored word. With this approach, there’s going to be multiple array iterations. Even though I’m not overly concerned with performance, I know that the initial score will be static, so I will precompute the word score and put it into the data source.
For the initial score, my thought is to get stats on both the position of letters and the overall use of letters in the used words. Then, use those stats to place a score on all the words, giving a penalty to words with duplicate letters. I’ll then place the scores into a new “scores” property in the data source. I’ll use the transformation feature of the data source in DAS to apply this transformation.
When I add this to the data source configuration, I will call the data.transform.out rule. Now the data source will have the additional scores property.
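The scoring idea itself is easy to sketch outside of Rego. The Python below is only an approximation of what is described above: it counts how often each letter appears per position and overall in the used words, sums those counts for every word, and halves the score of words with repeated letters. The equal weighting and the 0.5 penalty are my assumptions, as is the function name `score_words`.

```python
from collections import Counter

def score_words(all_words, used_words, duplicate_penalty=0.5):
    """Score every word from letter statistics gathered over the used words."""
    # How often each letter appears in each of the five positions.
    positional = [Counter(w[i] for w in used_words) for i in range(5)]
    # How often each letter appears anywhere.
    overall = Counter(ch for w in used_words for ch in w)

    scores = {}
    for word in all_words:
        score = sum(positional[i][ch] for i, ch in enumerate(word))
        score += sum(overall[ch] for ch in set(word))
        if len(set(word)) < len(word):
            score *= duplicate_penalty  # repeated letters reveal less
        scores[word] = score
    return scores

used = ["crane", "slate", "shade"]
scores = score_words(used + ["eerie"], used)
print(sorted(scores.items(), key=lambda kv: -kv[1]))
```

Because the scores depend only on the word lists, they can be computed once and stored alongside the words rather than recalculated on every query.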
The actual logic for Jarvis is a little tricky, as you have to take into account all the guesses and sometimes the information you get from them can be overridden by another guess. For example, if my first guess contained an ‘e’ in the wrong position and the second guess contained an ‘e’ in the correct position, then I know I have a correct letter, but I don’t know anything else about ‘e’. But if I have two ‘e’s in the first word, one that’s in the wrong position and another that is unused, then after the second word (with one correct ‘e’), I know there should be no more ‘e’s in the word. This logic will be found in the best_word and supporting rules/functions.
I also need to add in a rule to activate Jarvis and some snarky messages.
The logic in best_word just turns out to be a fancy filter: first filtering out any word that doesn’t have letters that are known to be in the exact position, then words that don’t have enough of the letters that I know should be somewhere; next, words where a letter shows up too many times; and finally words with letters I know shouldn’t exist in the unknown positions (this is how I choose to deal with the double ‘e’ example from above: I add an ‘e’ to the unused letters and just check the unknown positions).
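Here is a Python sketch of that kind of filter. It follows the same general idea (fixed positions, minimum letter counts, maximum letter counts, banned positions) but is not a line-by-line port of best_word, and it handles the double ‘e’ case through a per-letter maximum count rather than the unused-letters trick. The helper names `constraints` and `is_possible` are made up for this example.

```python
from collections import Counter

def constraints(guess, marks):
    """Turn one marked guess into (fixed, banned, min_count, max_count)."""
    fixed, banned = {}, set()
    hits, misses = Counter(), set()
    for i, (g, m) in enumerate(zip(guess, marks)):
        if m == "-":
            misses.add(g)
            banned.add((i, g))
        elif m == "*":
            hits[g] += 1
            banned.add((i, g))   # right letter, but not in this position
        else:
            hits[g] += 1
            fixed[i] = g
    # A '-' on a letter caps its count at the number of hits for that letter.
    max_count = {g: hits[g] for g in misses}
    return fixed, banned, hits, max_count

def is_possible(word, history):
    """Check a candidate against every (guess, marks) pair seen so far."""
    counts = Counter(word)
    for guess, marks in history:
        fixed, banned, min_count, max_count = constraints(guess, marks)
        if any(word[i] != ch for i, ch in fixed.items()):
            return False
        if any(word[i] == ch for i, ch in banned):
            return False
        if any(counts[ch] < n for ch, n in min_count.items()):
            return False
        if any(counts[ch] > n for ch, n in max_count.items()):
            return False
    return True

def best_word(scores, history):
    """Highest-scoring word still consistent with the history, or None."""
    remaining = [w for w in scores if is_possible(w, history)]
    return max(remaining, key=scores.get, default=None)

history = [("crane", "--a-e"), ("heads", "**ad*")]
scores = {"shade": 20, "slate": 20, "crane": 18, "abide": 5}
print(best_word(scores, history))  # shade
```

Checking each guess independently keeps the sketch short; later guesses simply add more constraints, so information from an earlier guess is never lost.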
Now I have a working cheat, er, feat of engineering. I can activate it by setting the Jarvis property to true:
So, this was my journey to household dominance at Wordle. Go ahead and see if you can create a better algorithm or maybe add more snarky comments. This also got me thinking of what to try next. I think a Zork-type game would work well, so maybe I’ll give that a try next.
BTW, as I mentioned, the word numbers do not correspond to the actual game, so I’ve never actually used this for our daily challenge. And yes, the kids still beat me most days.