The Writing Community  ·  AI & The Craft  ·  Issue No. 1
AI & the Craft — A Series for Writers

AI Doesn't Read.
It Eats.

Your words are not sentences to a machine. They're a meal — and the chef cuts them in ways that will genuinely surprise you.

A Beginner's Guide to Tokenization
Part 1 of the Series

Before you hand your prose to an AI — for feedback, for brainstorming, for anything at all — there's one thing you need to know. Something that will change how you think about every word you choose, every strange compound noun, every piece of slang you've been nervously workshopping for three drafts. Here it is:

An AI does not read your writing the way you do.

It doesn't even see words, really. Not in any way that maps to how we learned them in first grade. Instead, it sees something called tokens — and understanding this tiny, weird, overlooked concept is your first key to understanding how these tools actually work.

To an AI, the word "unbelievable" is not one thing. It might be three.

First: Forget What You Know About Words

When you read the word cat, your brain does something remarkable. In an instant, you conjure fur, whiskers, that particular disdainful stare. The word is a container, and you're unpacking it without even trying.

When an AI encounters the word cat, it processes a token. In this case, a whole-word token. But here's where it gets interesting: not every word gets that luxury.

A token is simply a chunk of text. It might be a whole word. It might be half a word. It might be just a punctuation mark or a single letter. Before the language model (the AI's brain) ever sees your text, a companion program called a tokenizer chops it into these bite-sized pieces. Think of it as the machine's way of chewing before it swallows.

Watch It Happen, Token by Token
The quick brown fox jumped over the lazy dog.
That sentence contains 11 tokens, not the 9 words you counted. Notice "jumped" splits into "jump" + "ed", and the period gets a token all its own.

See that? Jumped became two pieces. The model separated the root from the suffix. It does this constantly, automatically, invisibly. Your polished sentence, parsed into puzzle pieces.

Why Does It Chop Things Up Like This?

Here's the brilliant, counterintuitive logic: if an AI tried to memorize every single word that exists — in every language, every technical field, every slang dictionary, every username ever typed — it would need a vocabulary of millions upon millions of entries. It would be impossibly bloated.

Tokens are the solution. By breaking words into recurring chunks — common roots, suffixes, syllables — a model can handle almost any word it encounters, even ones it's never seen before, by assembling them from familiar parts.
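The chunk-assembly idea can be sketched in a few lines of Python. This is a deliberately tiny stand-in: the vocabulary below is invented for illustration, and greedy longest-match is a simplification of the byte-pair-encoding procedure real tokenizers learn from billions of words.

```python
# A toy subword tokenizer using greedy longest-match. The vocabulary is
# invented; real models hold tens of thousands of learned chunks, not a dozen.
VOCAB = {"the", "quick", "brown", "fox", "jump", "ed", "over", "lazy",
         "dog", "un", "believ", "able"}

def tokenize(word, vocab=VOCAB):
    """Split one word into subword chunks by greedy longest-match."""
    tokens = []
    while word:
        # Take the longest prefix of the remaining text found in the vocab...
        for end in range(len(word), 0, -1):
            if word[:end] in vocab:
                tokens.append(word[:end])
                word = word[end:]
                break
        else:
            # ...or fall back to a single character if no known chunk fits.
            tokens.append(word[0])
            word = word[1:]
    return tokens

print(tokenize("jumped"))        # ['jump', 'ed']
print(tokenize("unbelievable"))  # ['un', 'believ', 'able']
```

Notice that "unbelievable" never has to appear in the vocabulary: it gets assembled from three familiar parts, exactly the way the article describes.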

Think of it like a master chef who doesn't need a recipe for every dish in the world. They understand salt, acid, fat, heat. From those fundamentals, they can improvise anything.

How You Read a New Word

You encounter "verisimilitude." You might not know it, but you can guess — it sounds like "verify" and "similar." Your ear finds the Latin roots. You make an educated inference from the whole shape of it.

How AI Processes a New Word

It sees: ver + isim + ilit + ude. Each piece has statistical relationships to thousands of other words. The meaning emerges from the pattern of those relationships.

✦ ✦ ✦

The Part That Will Genuinely Fascinate You

Here's where it gets weird — and this is the part writers in particular need to know about.

Common words get whole tokens. Rare words get chopped up.

A plain, everyday word like "the" or "said" is so common that the model has its own dedicated token for it. Fast, clean, no fuss. But an unusual word — technical jargon, a neologism, a very long compound noun — might get shredded into four, five, even six pieces.

Token Cost — Common vs. Rare Words
    "the"                                    1 token
    "writing"                                1 token
    "unforgettable"                          2–3 tokens
    "neurodivergent"                         3–4 tokens
    "hippopotomonstrosesquipedaliophobia"    ~8 tokens

This has a real, practical consequence: the AI does not have the same intuitive feel for your rare, unusual, or invented words that it has for common ones. It's not that it can't handle them — it can piece them together — but they require more processing, more assembly. The experience is different from reading "dog."

Writers who use highly unusual vocabulary, invented terms, or non-English words should know: the AI is working harder to understand those words, and it may be slightly less confident in how it handles them.

The Spelling Mystery — Finally Solved

Have you ever asked an AI how many letters are in a word — say, "strawberry" — and it got it wrong? There's a famous example where models confidently miscount the R's in that very word. Writers find this baffling: how can something so powerful not count to ten?

Tokens explain everything.

    strawberry  →  "straw" + "berry"

The AI sees two tokens, not ten letters.

The model doesn't look at individual letters; it sees tokens. When you ask it to count letters, it has to reconstruct the original spelling from the token chunks and then count. It's doing something slightly unnatural, like asking a brilliant literary critic to count the brushstrokes in a painting: they can do it, but it's not their native mode.

This is also why AI can occasionally struggle with precise wordplay, acrostics, or spelling-dependent puzzles. It isn't "reading" the way you are.
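The reconstruct-then-count step can be made concrete in a few lines. The two-token split below is illustrative; real token boundaries vary from model to model.

```python
# Illustrative: suppose the tokenizer splits "strawberry" in two.
tokens = ["straw", "berry"]          # what the model "sees" (really, numeric IDs)

# The model works with those chunks, not letters. To count letters,
# the original spelling must first be reconstructed:
word = "".join(tokens)
print(word, "contains", word.count("r"), "r's")
```

The join-then-count step is trivial for a program and effortless for a human eye, but it is exactly the step a token-based model has to perform indirectly, which is where the famous miscounts creep in.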

What This Means For You, The Writer

You don't need to become a computer scientist. You just need one mental image to carry with you:

When you give your writing to an AI, it doesn't see your sentences. It sees a river of tokens — chunks of language, flowing in sequence, each one in statistical conversation with all the others.

Your beautiful prose arrives and gets immediately, lovingly, efficiently dismantled at the door. Then reassembled, understood, and responded to. The meaning survives the process — remarkably well, most of the time. But the journey is stranger than you imagined.

Here are three things this changes for the practicing writer:

1. Invented words and neologisms deserve extra clarity. If you're writing speculative fiction with fabricated terminology, don't assume the AI will intuit your logic. Define your terms. Give it context. It's working from pieces, not wholes.

2. Highly phonetic or sound-dependent work needs a different lens. The AI experiences language statistically, not acoustically. When you ask for feedback on the sound of a sentence, you're asking it to do something somewhat outside its native strengths. It can do it — but calibrate your expectations.

3. Spelling, letter-counting, and anagram tasks need careful prompting. Ask explicitly, step by step. "List each letter individually, then count them." You're helping it do something that requires looking beneath the token level.
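The "list each letter individually" technique in point 3 amounts to the following procedure, shown here in Python as a sketch of what a good prompt asks the model to do:

```python
# Spell the word out one letter at a time, then count: once the word is
# treated as letters rather than tokens, the question becomes trivial.
word = "strawberry"
for position, letter in enumerate(word, start=1):
    print(position, letter)
print("Total letters:", len(word))
print("R's:", word.count("r"))
```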

The machine is brilliant with meaning. It is mortal with letters.

Coming Up Next

Now that you know how AI chews language, the next post will explore what it does once it swallows — how a language model actually predicts what comes next, and what that means for why it sometimes sounds eerily perfect and sometimes goes completely off the rails.

Because once you understand tokens, you're ready for the real conversation about how these things think.

Or — don't think. Depending on who you ask.

END OF ISSUE NO. 1 — AI & THE CRAFT