Training AI and my embarrassing naïveté

The Verge wrote an incredibly depressing article on the gig-economy workers teaching genAI models.

Laid-off lawyers, history PhDs, and scientists are now part of a miserable gig economy in which they’re teaching AI how to do their old jobs – from The Verge

This echoes an article of theirs from three years ago, in response to which I asked, “Does the generation of AI have to be so grim?”

Yeah, this problem isn’t new and is just growing as white-collar professionals get caught up in the dragnet as these LLMs hoover up information and context.

Didn’t see it coming
Oh, I am so naïve. I am a glass half-full, most folks are nice, kinda guy. So I tend to not see nefarious alternatives (except when it comes to cybersecurity, tho – there, I see trouble everywhere).

For many years, I’ve been thinking about how we provide context to the materials on the web. Back in 2004-2009, when I was at Nokia, I saw how the Web 1.0 of static info put up by publishers was mixing with the Web 2.0 of humans tagging everything. The challenge that arose was the concept of the Semantic Web and how we could easily provide context to the whole of the web – for better search, better learning, better insight.

What I said back then, seeing the tools we had at the time, was that the semantics could be created thru a mix of machines adding context, people adding context, and machines watching humans do their thing.

In particular, I was not a fan of all the “librarian” work of humans manually providing ontologies and context to data (not the same as folks tagging things as they go about their posting and such). The process was so easy to get lost in. I’d seen it first-hand, and should have expected where things would go.

Benign benefits?
As I mentioned in my earlier post from 2023:*

In 2016, I was working with a company that was using AI to annotate medical notes. I started thinking again about annotation in general and was wondering if there was some way we could employ folks to annotate. What’s more, realizing that, for so much annotation, you really just needed a human, regardless of education level, I had envisioned a benign system where the annotation was educational and beneficial to the annotator.

I kept thinking of places like West Virginia, where there were deep shifts in society as the coal industry fell apart. What if we could not only tap into all these unemployed, who would have more than enough skills to annotate, being human and all, and, through the annotation process, educate them, provide them with skills to help them into their next non-coal job?

Oh, how things turned out so differently for the industry. And, oh, I’m so embarrassed, how frakkin’ naïve of me. And, oh, how things have really spun out of control.

Power, desperation, and exploitation
Companies like Mercor, which run gig-economy platforms to collect contextual info for genAI, are as close to an exploitation economy as we can get. Indeed, I remember looking at sites like Guru and Fiverr when I was a consultant: I realized they were a bid-race to the bottom, and I can’t bring myself to use them as a buyer (let alone a seller).

The issue is that optimizing for the wrong thing leads to the wrong emergent behavior. And if you are not watching your ethics and manners, you’ll optimize for the wrong thing and build a Hunger Games negative-sum environment (and that’s not even mentioning the psychological damage from the environment and the materials being reviewed).

In the gig economy, the only one who wins is the buyer. The gig worker is abused, and the gig company uses fake Monopoly money to fund everything, as the business model they’ve chosen isn’t sustainable.

But is it unsustainable overall? Is it unsustainable due to the ethics? Can someone profitably and ethically run a gig business?

More krap
We all use genAI daily (thank you, Google 🙄). So much pushback on genAI has been about copyright, energy, water, RAM, and so forth.

We can add sweatshops to the list.

Ugh.

*Now that I think of it, even back then, the job of medical notes annotation was mostly done by underpaid women working at businesses with razor-thin margins. Hallmarks of an exploitative industry?

Image from Wikipedia

Thoughts?