this post was submitted on 01 Jul 2023
1015 points (96.4% liked)

Technology

Elon Musk said verified accounts would be limited to reading 6,000 posts per day while unverified users will be limited to 600.

[–] elghoto@lemmy.world 81 points 2 years ago (5 children)

I didn't read the article. But could it be that every platform is trying to limit LLMs to be trained on their data?

[–] JohnDClay@sh.itjust.works 69 points 2 years ago (2 children)

Definitely one reason, but this is also to push premium

[–] Pexistralxinz@lemm.ee 74 points 2 years ago (6 children)

Seems like the free internet as we knew it is dead. Any site with free, user-generated content to monetize is about to try to suck every last dime from it.

[–] wtfeweguys@lemmy.whynotdrs.org 60 points 2 years ago

The internet as we knew it wasn’t free. We were the product. Here’s hoping their drive to force payment sends us on to decentralized, open source infrastructure.

[–] Zak@lemmy.world 30 points 2 years ago

Fortunately, we have the user-owned distributed internet to move to.

[–] MysteriousSophon21@lemmy.world 11 points 2 years ago* (last edited 2 years ago)

Yep, capitalist greed ruins everything. Which is why distributed networks run by the community are our best hope for the future.

Those fuckers would try to ruin this too, by bot attacks, by trying to cut deals with some of the admins, or by running their own versions of Lemmy/Mastodon.

We as a community will have to handle whatever comes next.

[–] Marxine@lemmy.world 5 points 2 years ago

We now have the federated internet though, and I think it's got a way brighter future.

[–] Silviecat44@vlemmy.net 1 points 2 years ago

Honestly a paid internet is better. Just look at the Fediverse. Internet was never profitable. Now the data collection just needs to stop

[–] spaceribs@lemmy.world 1 points 2 years ago

They were always going to do that; the squeeze is basically required if you're planning on making a public offering and becoming beholden to investors.

[–] TheWorstNL@lemmy.world 26 points 2 years ago

Also they haven't paid Google for using their Cloud so they are moving their data.

[–] cort@lemm.ee 23 points 2 years ago (2 children)

No, you misunderstand: they desperately want them to be trained with their data. They just want them to pay hundreds of thousands to millions of dollars to do so. Twitter is not buckling under the weight of data scraping; Elon is just pissed that companies are data scraping instead of paying his exorbitant API fees.

[–] DrakeRichards@lemmy.world 9 points 2 years ago (1 children)

They just want them to pay hundreds of thousands to millions of dollars to do so.

This is the hilarious part to me: some companies might pay these fees, but there will be many more who won’t and will instead use actual web scrapers to get their data anyways. As the number of individuals training LLMs increases in the next couple of years, this will create a much more significant traffic load compared to API calls.

[–] cort@lemm.ee 4 points 2 years ago* (last edited 2 years ago)

Yeah, he doesn’t seem to understand he’s not selling the data; the data is public. He’s selling convenience. And if the convenience isn’t worth the price you’ve set, people will just take the extra effort and avoid the expense.

[–] itsJoelleScott@lemmy.world 4 points 2 years ago

Exactly. I do Selenium scripting as my main task for work, and as soon as I heard about how high the API rates were, my first thought was "Jesus, it might be slower than straight API calls, and the dynamic XPaths might suck, but I could write a script that scrapes the website for cheaper." Twitter is hurting for cash right now, and I imagine raising funds is the end goal here. He instituted the API policy, learned about another side effect, and continues to go with the most extreme, devoid-of-nuance response each time.

All "in my opinion," of course.

[–] 21racecar12@lemmy.world 15 points 2 years ago (1 children)

Hmm. Sounds a lot like something /u/spez said. I wouldn’t expect Twitter to be a good LLM source with its current state anyway…Reddit would be a lot better contextually. The reality is Reddit and Twitter are bleeding cash and they’ve got brain-rotted CEOs that don’t pay their bills or have unrealistic plans and timelines for profitability.

[–] Marxine@lemmy.world 2 points 2 years ago

Most of Twitter's (and soon Reddit's) data to be fed to LLMs will be porn sharing bots at this point.

[–] fuzzzerd@kbin.social 3 points 2 years ago

Seems like a strange way to enforce it, at the user level vs the API client level, unless they're trying to guard against screen scraper types.

[–] linearchaos@lemmy.world 3 points 2 years ago

It's all fun and games till they train the AIs to make a million small accounts.