Monday, 28 July 2025

Playing Scrabble for keeps

So I've been hooked on Word Play, the Scrabble-based roguelite.

I play some video games here and there, and when I find one I really enjoy, I try to squeeze out all of its challenge juice (technical term). That usually means at least getting all the achievements.

As a result, when it comes to Word Play I have been putting far too many hours into beating Ultramarathon mode specifically: 20 rounds at the most difficult scoring.

Here's how I finally beat it.

(Roguelite players will be completely unsurprised to hear that this did not involve me being particularly good at Scrabble.)

 

Word Play screenshot.

Cracking open this game like an egg

It's all about synergies, of course, which means you need luck plus strategy. This run had the 'more rare and legendary modifiers' modifier, and I just doubled down on the first synergy I saw, which revolved around Upgrades.

My engine is made of modifiers:

➡️ I get a random common Upgrade when I play a word with 8+ tiles.

➡️ Each Upgrade gets +1 use.

➡️ +1 bonus point each time an Upgrade is used.

➡️ I gain a refresh when an Upgrade is used up completely. (This was switched out near the end of the run.)

So if I play exclusively long words, I get a bunch of Upgrades, and I get more bonus points on every subsequent word. I also get refreshes (to help me get long words), but I don't use them, because so many of the common Upgrades let you refresh selectively. I quickly build up 40 refreshes and a hundred bonus points per play.

Not crucial to the engine, but I also get a modifier for x2 score with 3+ unplayed special tiles. All the Upgrades are turning my entire bag into a mess of special tiles, so this doubles all my scores without effort.

Finally, I get my first potion tile, an 'M' worth 3 points. Potion tiles give you plays equal to their score, but break when played. I hoard this until I luck out and get my final modifier: if you play a four-tile word, add a copy of the first tile to the letter bag.

So now, with my huge stock of refreshes, I can in principle just refresh until I get my potion M, play it at the start of a four-letter word, and have it break but add a copy to the tile bag, for a net +2 plays. This is huge when you start the run with 20 plays and only get a few more per round. So: rinse and repeat, interspersing with long words (to get more Upgrades (to get more refreshes)).
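If you like seeing the arithmetic spelled out, here's a toy sketch of that play economy in Python. The function, the numbers, and the rule details are my own illustration of how I understand the mechanics, not anything from the game itself:

# Toy model of the potion-tile play economy, under my reading of the rules:
# playing a word costs 1 play, a potion tile grants plays equal to its score
# when it breaks, and the copy modifier puts the tile straight back in the bag.
def plays_after_cycles(start_plays: int, potion_score: int, cycles: int) -> int:
    plays = start_plays
    for _ in range(cycles):
        plays -= 1              # spend a play on a four-letter word
        plays += potion_score   # the potion tile breaks, granting its score in plays
    return plays                # net gain per cycle: potion_score - 1

print(plays_after_cycles(20, 3, 10))   # the 3-point M: 20 -> 40 (net +2 per cycle)
print(plays_after_cycles(20, 99, 16))  # the maxed 99-point M: 20 -> 1588 (net +98)

The net gain per cycle is just the tile's score minus one, which is why cranking the score matters so much later on.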

In practice, though, that's slow and unreliable. I didn't end up spending many refreshes getting the potion M out there. Instead, I was careful to have an Upgrade slot open at the end of each round, and at about round 12 I got exactly what I was hoping for: the uncommon Upgrade which adds your refresh count to a tile's score.

You can see where this is going. I used it on my potion M three times, discovering in the process that a tile's score maxes out at 99. Now I gain 99 plays each time I put the potion M at the start of a four-letter word. Over the course of a round I gain more plays than I will ever need.

That's why the little number in the bottom right of the screenshot says "1538", not the "15" or so that you would normally expect.

 

Descent into absurdism

At this point the run is essentially won, so I rejoice, but it will clearly be a slog. Even with most of my tiles turned emerald or golden with the bounty of Upgrades, I only get something like 800 points per play with a long word, so I'm going to need to play 100 good words in the final couple of rounds.

Aware of this, I have been burning my essentially-limitless plays rerolling modifiers, and it pays off at the end of round 17. I get the 'multiply final score by number of special tiles' modifier, one of a couple that would reliably boost my scores even further.

So I wave goodbye to 'gain a refresh when an Upgrade is used up': you made all this possible. Now if I spell a word like LAVENDERS, it scores 5936 points. I can and do cruise to the finish in a handful of plays per round.

Word Play screenshot.

And that is how I got the hardest achievement in this damn spelling game.

 

Your mileage may vary

None of this strategy is reliably reproducible, of course, due to randomness. But I think it's interesting that it worked, because it was the first Upgrades-based build I'd tried. Part of that is luck in the early rounds, naturally.

Builds that I tried and failed with, for the record:

➡️ All gold tiles

➡️ Fast-growing diamond tiles

➡️ All the emerald synergies

➡️ Double length points, board expanders, and lots of plus tiles

➡️ Dozens of attempts that never got the smallest synergy.

 

So that's most of the challenge juice squeezed out of Word Play! I recommend this game if you're a Scrabblehead. It's on Steam.

Update a few days later: I translated my run into "whoops, all wildcards". 300+ tiles of golden and dotted 99-point wildcards took me to round 50.

Word Play game screenshot. Round 50. A board full of golden 99-point asterisks. I have just received 3238590 points.

 

I spent scores, maybe hundreds, of rerolls trying to get the "dotted tiles multiplier increases with each play" modifier, which would have given me desperately-needed multiplier scaling and could have taken me even further. I never got it, though, so I called it at round 50, where winning just meant typing "******************" over and over again and waiting for the scoring to finish.

 

Word Play game screenshot. Ending the game.

Monday, 14 July 2025

Trying not to be a Gell-Mann Amnesiac

I sometimes wonder how much Gell-Mann Amnesia people experience. Paraphrasing Crichton, when you're a domain expert, you'll sometimes read an article that gets every aspect of your field completely and absurdly wrong, have a little laugh about it... then keep on reading and trusting articles that are about other fields, even from the same publication or writer.

As if they're some pure spring of wisdom which only coughed out a lump of mud when it came to the thing you happen to know about.

It's just an idea from a novelist, not a cognitive bias backed by real-world studies as far as I know, but you have to admit it has a kind of... truthiness to it.

Stack this up with Dunning-Kruger and it's easy to become cynical. You might decide that actually, all the loudest voices are talking complete nonsense, all of the time. That might be too far. But I do think it pays to put deliberate hard effort into distinguishing domain experts from overconfident bullshitting pundits.

Now, anyone with their ear to the ground and a weather eye out for Gell-Mann Amnesia should have arrived at the obvious conclusion about generative AI. To wit: in its current state, the technology is an overconfident bullshitter.

On being a piece of software and being confidently wrong

The case studies are easy to find, and the ones from domain experts sound pretty different from the ones from the tech industry and from the reporters too busy and/or demoralised to do more than repackage industry press releases as articles.

➡️ I am not a historian. The historians I've read say genAI gets softball history questions mostly right and deep ones mostly wrong. Sometimes subtly, sometimes dramatically. It just makes things up when the evidence is scarce. It makes errors of commission and omission, misplaces its focus, and draws weird conclusions from its premises.

➡️ I am not an artist. The artists I listen to say genAI art looks bland and awful and organic because it doesn't understand composition or anatomy or separate objects (because it doesn't 'understand' anything). It can't make an image that isn't well-represented in the training data, like a camel and a steampunk automaton jousting from the backs of sumo wrestlers. Same in other kinds of media: filmmakers say genAI can't do film because it can't take direction or keep track of characters or have a consistent shot.

➡️ I am not a Wikipedia editor (except incidentally). Earlier this year there was a wretched moment when the Wikipedia editors were going to have genAI article summaries foisted on them, although I think that's turned around now. The skilled editors pointed out that the LLM summaries generally ranged from 'bad' to 'worthless' by Wiki standards: they didn't meet the tone requirements, left out key details or included incidental ones, injected "information" that wasn't in the article, and so on.

➡️ I am not a manager. The managers say genAI can't even collate timesheets reliably.

➡️ I am not a novelist. The novelists say a genAI book reads like a statistical summary of all creative writing anyone has ever done, including all the embarrassing teenage fanfiction. It sucks at originality. And because it doesn't have an internal model or understanding of its outputs, it can't keep track of things and make a coherent satisfying story. Things are vague, tropey, or contradictory.

➡️ I am not a lawyer. The lawyers are, um, well, by the sound of it a lot of them are being sanctioned for using generative AI to cite completely nonexistent caselaw. (☉__☉”)

➡️ I am not a public policy wonk. The bureaucratic wonks note that genAI can't summarise text. It shortens it and fills in the gaps with median seems-plausible-to-me pablum. The kind you get when you average out everything anyone has ever written on the internet. If you try to have an LLM summarise or draw conclusions from a study, it will usually do a bad job, fabricating statements more along the lines of what an average person would guess if they'd only read the study's title.

➡️ I am not a software engineer. The software engineers seem to have mixed opinions. They say that genAI works as code autocomplete (something that has existed for fifty years, but this new kind has pretty sophisticated lookahead, neat). At least some are saying it can't do principled software engineering, it introduces security flaws, its performance drops off for obscure languages, it overconfidently generates bad code, it plagiarises from code repositories that it doesn't have the rights to...

I could go on.

I'm no longer a domain expert in anything, this many years after my stint in academia. I think I'm halfway to being an expert in a few different areas, though. I deliberately concocted some thoughtful questions at the intersection of those areas, just to see.

For example, I asked about the (obvious) mapping of choose-your-path text adventure books onto mathematical graph structures, which the LLM chatbot identified. I followed up with technical questions about the features of those graphs in context: what would the game be like if they weren't digraphs? Would you expect cyclic or acyclic graphs? Would a finite state machine be a more appropriate model, and if so why? And so on.
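For the curious, here's roughly the shape of the thing I was asking about, in Python. The four-section "book" is invented for illustration; it's not from any real gamebook:

# A choose-your-path book as a directed graph: each numbered section
# points to the sections its choices can send you to.
book = {
    1: [2, 3],   # "turn to 2" or "turn to 3"
    2: [4],
    3: [4, 1],   # this choice loops back to section 1, making the graph cyclic
    4: [],       # an ending: no outgoing edges
}

def has_cycle(graph):
    # Depth-first search; revisiting a section that's already on the
    # current path means the reader can go around in circles forever.
    on_path, finished = set(), set()

    def dfs(node):
        if node in on_path:
            return True
        if node in finished:
            return False
        on_path.add(node)
        if any(dfs(nxt) for nxt in graph.get(node, [])):
            return True
        on_path.remove(node)
        finished.add(node)
        return False

    return any(dfs(section) for section in graph)

print(has_cycle(book))  # True, thanks to the 3 -> 1 edge

The edges have to be directed because "turn to section 80" is strictly one-way, and an acyclic digraph guarantees every read-through eventually hits an ending, while a cycle means you can loop. That's the kind of reasoning I wanted the chatbot to do.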

And lo, the generative AI output was absurdly, hopelessly, and confidently wrong when given questions that needed expertise.

A lot of people with a lot of money would like you to think that genAI chatbots are going to fundamentally change the world by being brilliant at everything. From the sidelines, it doesn't feel like that's going to work out.

Sometimes I read posts from experts along the lines of

"I've noticed it's almost worthless at [my field], but it sounds like it's pretty useful for [other thing]."

But less so lately, maybe?

So I'm left wondering: are people experiencing massive Gell-Mann Amnesia about these chatbots? Or does everybody know that the emperor has no clothes?

(But oh no, we've invested so, so, so very much money into the emperor's finery, and all the wealthiest people at the imperial court agree: pleeeease could you keep squinting to see this amazing new clothing?)

 
