

That’s when you get into more of the nuance with tokenization. It’s not a simple lookup table, and the AI does not have access to the original definitions of the tokens. Also, tokens do not map 1:1 onto words; a single word might be broken into several tokens. For example, “There’s” might be broken into “There” + “'s”, and “strawberry” might be broken into “straw” + “berry”.
The reason we often simplify it as token = word is that this holds for most common words.
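As a rough illustration only: real tokenizers (e.g. BPE-based ones) learn their vocabulary from data, but a greedy longest-match split over a tiny made-up vocabulary shows how a word can end up as several tokens:

```python
# Toy greedy longest-match tokenizer. The vocabulary below is entirely
# made up for illustration; real tokenizer vocabularies are learned.
VOCAB = {"There", "'s", "straw", "berry"}

def tokenize(text):
    tokens = []
    i = 0
    while i < len(text):
        # Take the longest vocabulary entry that matches at position i.
        match = None
        for j in range(len(text), i, -1):
            if text[i:j] in VOCAB:
                match = text[i:j]
                break
        if match is None:      # unknown character: emit it as its own token
            match = text[i]
        tokens.append(match)
        i += len(match)
    return tokens

print(tokenize("There's"))     # ['There', "'s"]
print(tokenize("strawberry"))  # ['straw', 'berry']
```

Note that the model only ever sees the token IDs, not the spelled-out strings, which is part of why letter-level questions (like counting the r’s in “strawberry”) are awkward for it.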
Emily the Engineer - https://youtube.com/@emilytheengineer - Does fun projects with 3D printing
Evan and Katelyn - https://youtube.com/@evanandkatelyn - Does a lot of DIY/arts&crafts
Nerdforge - https://youtube.com/@nerdforge - Maker/arts&crafts, does a lot of fantasy-based stuff
Physics Girl - https://youtube.com/@physicsgirl - Physics stuff; she has struggled with health issues the last couple of years, but her old stuff is still very good
Laura Kampf - https://youtube.com/@laurakampf - Maker, does a lot of woodworking and upcycling