Sometimes you’re mad about LLMs being trained on the creative work of thousands of non-consenting humans for reasons like, “It’s demeaning to our creativity, and it means that any ‘good idea’ that the software ‘came up with’ was likely just copied straight from a human artist, who is getting no credit for it and may even end up being accused of stealing it from the software”
And other times, you’re mad about LLMs being trained on the creative work of thousands of non-consenting humans for reasons like, “those thousands of non-consenting humans ALL FUCKING SUCKED AT SPELLING”
…can dictionary-based spellcheckers make a comeback please. They weren’t perfect but holy god anything is better than this.
(turns autocorrect off for the 1000000th time and chants a futile incantation for it to maybe actually stay off)
Yeah definitely partly this.
But also: looking at my own past usage of Google Docs, which seems to be the basis of most of this spellchecker training? Most of it wasn’t even the type of writing that’s expected to follow typical spelling and grammar rules.
It included things like: Shared shopping lists. Outlines for things I was maybe going to write someday. Hastily-written descriptions of things that had happened, intended mostly just as reminders to myself. Brainstorming for made-up languages. Scenes of fictional dialogue, written by other people and shared for editing purposes, in which multiple contributors argued about whether a character would talk that way (none of it focused on the story’s spelling and grammar, and the comments themselves certainly not proofread, because that wasn’t the point).
And yeah – none of this was filtered out. All of it was treated as equally valid source material for teaching the spellchecker “what writing is supposed to look like.” With the assumption that it’s all supposed to look the same.
Like training a cook on everything including restaurant filet mignon, Grandma’s apple pie, a desperate mom’s attempt to cook meatloaf for five kids who all borderline hate meatloaf, eight kinds of microwave ramen, whatever your stoned cousin comes up with to satisfy the munchies in the middle of the night, and the contents of the kitchen compost bin. And then saying, “that’s all Food. Now go cook some Food.”
It’s fucking ludicrous and I am astonished that this mess was ever sold as an actual finished software product – let alone that it became the default one used by every application that can use a spellchecker at all.