this post was submitted on 17 Aug 2025
698 points (99.7% liked)

Technology

Comments

Source.

[–] gressen@lemmy.zip 82 points 2 days ago (3 children)

Write TOS that state that crawlers automatically accept a service fee and then send invoices to every crawler owner.

[–] BodilessGaze@sh.itjust.works 42 points 2 days ago (2 children)

Huawei is Chinese. There's literally zero chance a European company like Codeberg is going to successfully collect from a company in China over a TOS violation.

[–] wischi@programming.dev 15 points 1 day ago

It's not even a company. It's a non-profit "eingetragener Verein" (registered association). They have very limited resources, especially money, because they live purely on membership fees and donations.

[–] Lumisal@lemmy.world 6 points 2 days ago (1 children)

True, but it can help limit the European AI scrapers too.

[–] BodilessGaze@sh.itjust.works 8 points 2 days ago* (last edited 2 days ago) (1 children)

I really doubt it. Lawsuits are expensive, and proving responsibility is difficult, since plausible deniability is easy. All scrapers need to do is use shared IPs (e.g. cloud providers), preferably owned by a company in a different legal jurisdiction. That could be the case here: a European company could be using Huawei Cloud to mask the source of their traffic.
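To make the shared-IP point concrete, here is a minimal Python sketch of what an operator can actually learn from a request's source address. Everything in it is an assumption for illustration: the provider names are made up, and the ranges are reserved documentation networks rather than any real provider's published list. The lookup stops at the network owner; nothing in the address identifies the customer who rented the machine.

```python
import ipaddress
from typing import Optional

# Hypothetical provider ranges, using reserved documentation networks.
# Real providers publish their own lists; these are placeholders.
PROVIDER_RANGES = {
    "ExampleCloud A": [ipaddress.ip_network("198.51.100.0/24")],
    "ExampleCloud B": [ipaddress.ip_network("203.0.113.0/24")],
}


def provider_for(ip: str) -> Optional[str]:
    """Return the name of the provider whose range contains ip, or None."""
    addr = ipaddress.ip_address(ip)
    for name, networks in PROVIDER_RANGES.items():
        if any(addr in net for net in networks):
            return name
    return None


# The best case is a provider name. The tenant behind the address
# is known only to the provider itself.
print(provider_for("198.51.100.7"))
print(provider_for("192.0.2.1"))
```

Getting from the provider to the actual tenant is the step that needs the provider's cooperation or a court order.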

[–] veniasilente@lemmy.dbzer0.com 5 points 2 days ago (1 children)

"All scrapers need to do is use shared IPs (e.g. cloud providers),"

Simple: just charge the cloud provider.

Once that gets strong enough they'll start placing terms against scraping in their TOS.

[–] wischi@programming.dev 5 points 1 day ago* (last edited 1 day ago)

And then they just throw it in the bin because there was never a contract between you and them. What do you do then? Sue Microsoft, Amazon and Google?

I'm sure Codeberg, a German non-profit Verein, has time and money to do that 🤣.

[–] wischi@programming.dev 38 points 2 days ago (1 children)

They typically don't include a billing address in the User Agent when crawling 🤣

[–] gressen@lemmy.zip 9 points 2 days ago (1 children)

That's a technicality. The billing address can be discovered for a nominal fee as well.

[–] wischi@programming.dev 7 points 1 day ago* (last edited 1 day ago)

I'm sure it can't, especially for foreign IP addresses, VPNs, and a ton of other situations. Even if you connect directly to the internet via your ISP, many countries in Europe (I don't know about the US) have laws that would require you to have very good reasons and a court order to get the info you need from the ISP, and that's for a single(!) case.

If it were possible to simply get the addresses of all digital visitors, we wouldn't have to develop all this anti-scraping tech and could just sue them.