The World of Wordle, Part 1: Linguistics Strategy

Dearest gentle reader, if you are anything like me, you like starting your daily Wordle guess with a consonant-heavy approach, as opposed to famously tried and tested vowel-first ones like Audio, Adieu, and Crane. You and I both know this isn’t because it’s a more effective strategy. Dare I say, it’s because we like the risk and challenge of starting with something new every day.

So… if you must know, I’m writing this from the comfort of my bed, with my little kitten Kuchi sleeping but a few centimeters away from me as I sneeze and cough my way through what was supposed to be a workday (but I took sick leave). I spent the better half of today playing word games and analyzing over 1,700 daily Wordle answers since mid-2021.

There will be three parts to my Wordle series: this one, which takes a linguistics approach to understanding Wordle answers and how they’ve developed over time; the next one, which looks at it from an entropy / information theory point of view; and the final one, where I’ll look at it from a society / thought / behavior / semantic perspective.

Also, I have to admit that I hadn’t played Wordle in a very long time before today. I am more of a Quordle person. Not tooting my own horn, but once you have four of these bad boys, it’s hard to go back to one. TALK ABOUT HAVING OPTIONS! JESUS!


So! To understand the current state of Wordle, we must first acknowledge the volatility of our bookends. My dataset begins with a partial view of 2021 (approximately 6 months) and ends with a snapshot of 2026 (only 2 months). I like to think that the 2021 data represents the common, easy words that game creators usually burn through first. The 2026 data, while small, represents the long tail: the obscure, structurally complex words left over after years of play. By the way, I have a tendency to be incredibly wrong, so please don’t take my word for it. I am just describing popular sentiment. Let’s take a look at the data, shall we? (Also, you can probably tell I finished watching the latest Bridgerton series just yesterday.)


The Vowel Economy

I can’t tell you how many times my friend and I have had completely opposite first-word strategies. He’s all in on buying vowel positions until you can see the word (and he has a point), whereas I love spinning my wheel of fortune to land trigrams and consonants first. With 64.9% of words containing exactly two vowels, and the average word containing 1.95 vowels, I believe we can derive a structural blueprint for the perfect Wordle answer.

Figure 8: The Vowel Economy

So if almost every word has two vowels, they would rarely sit next to each other (unless it's a double vowel like OO or EE). They usually act as anchors separated by consonants. And from what we can see above, E (929) and A (749) make up 50% of the vowel economy. But this doesn’t really help you or me. I know you’re practically begging me to tell you not what the most popular vowels are, but where they usually sit, so that you can understand the distance between them (deep…) and help bridge the gap (profound…).
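If you want to run the vowel numbers on your own list, here is a minimal sketch of how a count like this could be done. It assumes Y is not a vowel (I don't state that above, so treat it as an assumption), and the function name `vowel_profile` and the four-word sample are purely illustrative, not my actual dataset or code.

```python
from collections import Counter

VOWELS = set("AEIOU")  # assumption: Y is not counted as a vowel


def vowel_profile(words):
    """Return (vowels-per-word histogram, per-vowel totals, mean vowels per word)."""
    per_word = Counter()
    per_vowel = Counter()
    for word in words:
        hits = [ch for ch in word.upper() if ch in VOWELS]
        per_word[len(hits)] += 1   # how many words have 0, 1, 2, ... vowels
        per_vowel.update(hits)     # how often each vowel shows up overall
    mean = sum(k * n for k, n in per_word.items()) / len(words)
    return per_word, per_vowel, mean


# Toy sample only; swap in the full answer list to get real percentages.
sample = ["CIGAR", "SQUAD", "LINEN", "AWAKE"]
per_word, per_vowel, mean = vowel_profile(sample)
```

Dividing `per_word[2]` by the number of words gives the "exactly two vowels" share, and `per_vowel` gives the raw totals behind a chart like the one above.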

What Position Do “I” Prefer?

In other words, when you see a particular letter, be it a consonant or a vowel, it would be delightful to know, historically, what position it preferred the most. So lemme tell you.

Figure 1: Position Frequency Heatmap (ALL TIME!)

Let’s start with the letters that, once they show up YELLOW on your grid, tell you with extremely high certainty which position will turn them GREEN:

  • Y has a very very high chance (>70%) of being the last letter, and Q and J have a very very high chance (>70%) of being the first. Quite an apt representation of all the five-letter English words we have, and before you say “oh Nalini, you could’ve just analyzed a 5-letter-words dictionary instead of a wordle-specific dataset.” NO! THIS IS MY LIFE! I WILL DO AS I PLEASE! Anyway, as I was saying… think of words like “quiet”, “sorry”, and “jaunt”.

  • B and F are also strong ones, with a 60-ish % chance of being the first letter.

I know what you’re thinking. This is great to know from an insight POV, but in the real world you have but 6 little shots at this game, and knowing what consonant appears where the most hardly helps. To that I will say… true. So let’s look at the vowels and a few other letters before I actually drop one of the coolest analyses of all time.

  • E and A are pretty frequent vowels. While A is a bit… capricious and likes a taste of all positions, E is vanilla and you’ll find it chilling at the end of the word half of the time.
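For the curious, a heatmap like this boils down to one small table: for each letter, what share of its appearances falls in each of the five slots. Below is a minimal sketch of that idea. The function name `position_shares` and the sample words are illustrative assumptions, not the exact code behind the figure.

```python
from collections import Counter, defaultdict


def position_shares(words):
    """Map each letter to a 5-slot list: share of its occurrences at each position."""
    counts = defaultdict(Counter)
    for word in words:
        for pos, ch in enumerate(word.upper()):
            counts[ch][pos] += 1
    shares = {}
    for ch, by_pos in counts.items():
        total = sum(by_pos.values())
        # Positions are 0-indexed: 0 is the first letter, 4 is the last.
        shares[ch] = [by_pos[p] / total for p in range(5)]
    return shares


# Toy sample only, not the real answer list.
sample = ["QUIET", "SORRY", "JAUNT", "STOVE", "GROVE"]
shares = position_shares(sample)
```

With the full list, `shares["Y"][4]` is the number to look at for "how often is Y the last letter", and `shares["E"]` shows how E spreads across the slots.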

Figure 3: Top 10 Starting and Ending Letters

"S" is the most common starting letter, yet it is virtually non-existent at the end of Wordle answers. I suppose the linguistic reason is that in English, 5-letter words ending in ‘s’ are mostly plurals, and Wordle removes plurals from the solution list. Woo. This probably forces it to migrate to the front. I also learned that "S" is a "fricative" consonant, and it blends easily with the consonant that follows it (think ST, SP, SL), which makes it the ultimate prefix builder. In other words, it’s like my romantic self: it can stand alone but also wants to hold hands.

Okay, but how does this even help!!! Looking at letters individually is such a robotic, useless, narrow view of how things should be. So, my dearest cutest loveliest dreamiest hungriest reader, let me show you the most common three-letter combinations that show up in Wordle.

The Aforementioned “coolest analyses of all time”

That’s what I’m talking about. If you ever find yourself with any two letters from the above pool, it would be wise to remember the third wheel letter and plug it in there for a solid green.

Figure 2: Top 3-letter combos

Figure 6: Top 5 Trigrams by Year
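If you'd like to build your own pool of three-letter combos, here is a minimal sketch. It assumes a "trigram" means three adjacent letters inside a word (so each 5-letter word contributes three of them); the function name `top_trigrams` and the sample words are illustrative only.

```python
from collections import Counter


def top_trigrams(words, n=5):
    """Count every run of three adjacent letters and return the n most common."""
    counts = Counter()
    for word in words:
        w = word.upper()
        for i in range(len(w) - 2):   # a 5-letter word has windows 0-2, 1-3, 2-4
            counts[w[i:i + 3]] += 1
    return counts.most_common(n)


# Toy sample only, not the real answer list.
sample = ["STOVE", "DROVE", "GROVE", "BRING", "THING", "STING"]
top_two = top_trigrams(sample, n=2)
```

To get a per-year view, the same function can be called on each year's words separately.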

The frequency data paints a very very interesting picture of this game’s structural evolution, but given such a small sample size, it’s very hard to comment on the year-over-year “evolution”. It could just be super random, and I could have run some simple tests to determine whether this change is random or not, but I chose not to. Anyway, after initially gifting us the predictability of super Latinate patterns, I feel like the “randomness” shifted toward slightly unexpected Germanic rigidity. It seems like there was a reliance on standard silent-e endings and clean alternations that probably defined the rage-baity era of the rhyme chain. If you don’t know what this is, it’s basically where one ending fits a dozen different words (like stove, drove, grove), so knowing four letters still leaves you guessing. While super maddening, at least common root words like bring or thing kind of anchor the puzzle in somewhat colloquial territory.

But!!!! The transition from the vowel-heavy LEA of 2023 to the consonant-dense clusters of STA in 2024 and the double-letter endings of ILL and ULL in 2025-2026 suggests the poor editor is exhausting simple vocabulary and moving toward, dare I say, crunchier, yummier word forms. The game effectively traded the polite predictability of vowel teams for the brute force of consonant blends and double letters, leaving me, Nalini, poor colloquial speaker of English, gagging at words like spill / still / shill… instead of bring, thing, sting, and honestly, as I type this, I take back what I said earlier. Both are rage baity.

Personality Cards for the Top Starting Letters

Figure 7: The Top Letters and Their Personality Cards (P.S. the yellow one is “F” sorry)

I think this is probably my greatest gift to you all. I don’t have much to comment on it. It’s yours to keep! :)

One thing I find super interesting is that the data shows a split personality in ending letters. Words starting with the "Latin" group (C, B, P, T) seem to prefer ending in E; I am guessing these are often words of Latin/French origin (like TRACE, BRACE, PRICE). Words starting with the "Germanic" group (F, M, D) seem to prefer ending in Y, often descriptive adjectives or older English roots like FLAKY, MUDDY, HANDY. THIS IS NOT SOUND ADVICE, but I suppose if your word starts with a hard consonant like D or M, you could try to bet on the "Y". If it starts with a "soft" or blendable consonant like C or P, you are likely in Latin territory, so go for the E?
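A rough way to check this start-letter / end-letter pairing yourself is to tally, for every starting letter, which letters its words end in. The sketch below does just that. The name `ending_by_start` and the sample words are illustrative, and it says nothing about actual word origins; it only counts letters.

```python
from collections import Counter, defaultdict


def ending_by_start(words):
    """For each starting letter, count the last letters of the words that begin with it."""
    table = defaultdict(Counter)
    for word in words:
        w = word.upper()
        table[w[0]][w[-1]] += 1
    return table


# Toy sample only, not the real answer list.
sample = ["TRACE", "BRACE", "PRICE", "FLAKY", "MUDDY", "TRULY"]
table = ending_by_start(sample)
favourite_ending_for_m = table["M"].most_common(1)[0][0]
```

On the full list, `table["C"].most_common(3)` would show whether E really is the favourite ending for words that start with C.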

Has Wordle Actually Become Harder Over Time?

In 2021, the average Scrabble score was low, and double letters appeared in only 34% of words. By early 2026, double letters appear in 39.1% of words (granted, there have only been 60-ish words in 2026), and the Scrabble average has jumped to nearly 10.

Figure 9: Average Scrabble Score
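In case "Scrabble score" sounds hand-wavy: it is just the sum of the standard English Scrabble tile values for the five letters. Here is a minimal sketch of scoring words and averaging by year. The helper names and the `(date, word)` row layout are assumptions for illustration, and the four rows are a toy sample built from words and dates mentioned in this post, not the full dataset.

```python
from collections import defaultdict

# Standard English Scrabble tile values.
_TILE_GROUPS = [
    ("AEILNORSTU", 1), ("DG", 2), ("BCMP", 3), ("FHVWY", 4),
    ("K", 5), ("JX", 8), ("QZ", 10),
]
SCRABBLE = {ch: pts for letters, pts in _TILE_GROUPS for ch in letters}


def scrabble_score(word):
    """Sum of tile values, ignoring board bonuses."""
    return sum(SCRABBLE[ch] for ch in word.upper())


def average_score_by_year(rows):
    """rows: iterable of (ISO date string, word). Returns {year: mean score}."""
    scores = defaultdict(list)
    for date, word in rows:
        scores[date[:4]].append(scrabble_score(word))
    return {year: sum(vals) / len(vals) for year, vals in scores.items()}


# Toy sample only; two words per year is far too few to conclude anything.
rows = [
    ("2021-06-19", "CIGAR"), ("2021-08-09", "LINEN"),
    ("2026-02-17", "SQUAD"), ("2026-02-21", "AWAKE"),
]
by_year = average_score_by_year(rows)
```

A single Q or Z moves a five-letter word's score a lot, which is why a small year like 2026 can swing the average so easily.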

The Theory of Entropy: A finite list of 5-letter words exists. The game editor likely front-loaded the smooth words to make the game accessible in 2021/2022. As these words are exhausted, we’re forced to dip into the "crunchy" words category, such as those with high-value consonants (J, X, Q, Z) and kind of awkward double letters.

The 2026 dataset is small but the fact that it contains words like LINEN and MAMMA suggests that we have indeed entered the dregs of the dictionary. The Vowel Economy remains the same (still ~2 vowels), but the consonants surrounding them have become hostile. Much like the world around us.

Figure 5: Double Letters

I think that the dip in the middle represents the Golden Era of accessible Wordle. From 2022 to 2024 the frequency of double letters dropped to a low of 27.3%. I’m guessing that they either hired an intern, or the editor was burning through the most common, intuitive five-letter words that were clearly composed of five unique letters (like STARE, AUDIO, or LUNCH).

Aligned with what I said above, the sharp spike in 2025 and 2026 suggests vocabulary exhaustion. It means that nearly 2 out of every 5 words will have a double letter. Wow. How times and strategies change. In 2024, you could safely assume a word had five unique letters and be right 73% of the time. In 2026, assuming unique letters is a liability. What this means is that it’s likely that we will need to burn guesses specifically to check for duplicates (e.g., guessing SHEEP instead of SHAPE just to check the double E). Actually now that I think about it, letter repetition is very Germanic indeed.
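For reference, here is a minimal sketch of how a double-letter rate per year could be computed. It assumes "double letter" means any letter that appears more than once in the word, adjacent or not (so LINEN counts), which matches the "five unique letters" framing above. The function names and the four-row sample are illustrative only.

```python
from collections import defaultdict


def has_repeated_letter(word):
    """True if any letter occurs more than once (adjacent or not)."""
    return len(set(word.upper())) < len(word)


def repeat_rate_by_year(rows):
    """rows: iterable of (ISO date string, word). Returns {year: share with a repeat}."""
    totals = defaultdict(int)
    repeats = defaultdict(int)
    for date, word in rows:
        year = date[:4]
        totals[year] += 1
        repeats[year] += has_repeated_letter(word)  # True counts as 1
    return {year: repeats[year] / totals[year] for year in totals}


# Toy sample only, using words and dates mentioned in this post.
rows = [
    ("2021-06-19", "CIGAR"), ("2021-06-23", "AWAKE"),
    ("2021-08-09", "LINEN"), ("2022-04-11", "SQUAD"),
]
rate = repeat_rate_by_year(rows)
```

The same helper explains the SHEEP versus SHAPE gamble: `has_repeated_letter("SHEEP")` is true and `has_repeated_letter("SHAPE")` is false.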

A New Era?

I know what you’re wondering. Has Wordle started repeating words now? Well. I found 4 words that appeared more than once.

  1. CIGAR: appeared 2 times (2026-02-02, 2021-06-19)

  2. SQUAD: appeared 2 times (2026-02-17, 2022-04-11)

  3. LINEN: appeared 2 times (2026-03-03, 2021-08-09)

  4. AWAKE: appeared 2 times (2026-02-21, 2021-06-23)

Wordle is famously not supposed to repeat words. But I see a lot of repeats happening this year… Hmm… Are we actually entering a new era?
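Spotting repeats like the four above is a simple grouping exercise: collect the dates for each word and keep the words with more than one date. A minimal sketch follows, assuming the data is a list of `(date, word)` pairs. The name `find_repeats` is illustrative, and the sample rows reuse dates from the list above plus one more word from this post with a placeholder date.

```python
from collections import defaultdict


def find_repeats(rows):
    """rows: iterable of (ISO date string, word). Returns {word: sorted dates} for repeats."""
    seen = defaultdict(list)
    for date, word in rows:
        seen[word.upper()].append(date)
    return {word: sorted(dates) for word, dates in seen.items() if len(dates) > 1}


rows = [
    ("2021-06-19", "CIGAR"),
    ("2022-04-11", "SQUAD"),
    ("2023-01-01", "STOVE"),   # placeholder date, for illustration only
    ("2026-02-02", "CIGAR"),
    ("2026-02-17", "SQUAD"),
]
repeats = find_repeats(rows)
```

ISO dates sort correctly as plain strings, so the first date in each list is the original appearance and the last is the rerun.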

That’s all from me today. I hope you enjoyed this little EDA! While exploring the dataset I really felt like it could be a great case study for Entropy, but I didn’t want all this exploration and analysis to go to waste so I decided to write about it. Well, hope you had fun, and as always, thanks for sticking around!!! :)

Appendix
